wants <- c("car")
has   <- wants %in% rownames(installed.packages())
if(any(!has)) install.packages(wants[!has])
Non-orthogonal factorial between-subjects designs typically result from non-proportional unequal cell sizes. Unequal cell sizes are proportional when \(n_{jk}/n_{jk'} = n_{j'k}/n_{j'k'}\) and \(n_{jk}/n_{j'k} = n_{jk'}/n_{j'k'}\) hold for all \(j, j', k, k'\).
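Proportionality of a table of cell sizes can also be checked numerically. The following is a minimal sketch that relies on the equivalent condition \(n_{jk} = n_{j+} \cdot n_{+k} / N\) for all cells. The helper name `isProportional` and both example tables are made up for illustration.

```r
# Hypothetical helper: TRUE if every cell size equals
# (row total * column total) / grand total, i.e., cell sizes are proportional
isProportional <- function(tab, tol = 1e-8) {
    expected <- outer(rowSums(tab), colSums(tab)) / sum(tab)
    all(abs(tab - expected) < tol)
}

# illustrative cell sizes (rows: levels of A, columns: levels of B)
nProp    <- matrix(c(2, 4, 6,
                     3, 6, 9), nrow = 2, byrow = TRUE)
nNonProp <- matrix(c(3, 5, 7,
                     5, 6, 4,
                     5, 4, 6), nrow = 3, byrow = TRUE)

resProp    <- isProportional(nProp)     # unequal but proportional cell sizes
resNonProp <- isProportional(nNonProp)  # non-proportional cell sizes
```

Only the second kind of table leads to a non-orthogonal design in the sense discussed here.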
In the non-proportional case, so-called type I, II, or III sums of squares can give different results in an ANOVA for all tests but the highest-order interaction effect. “Types of SS” is a misnomer: the SS of an effect is always the sum of squared differences between the predictions from the least-squares fit of a restricted model and the predictions from the least-squares fit of a more general model. What differs between the “types of SS” is the choice of the restricted and the more general model when testing an effect. Orthogonal designs obscure these differences because all SS types are then equal.
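The definition of an effect SS as a sum of squared differences between predictions can be verified directly. The sketch below uses the built-in `mtcars` data as an unbalanced stand-in (an assumption made only to keep the example self-contained); the object names are illustrative.

```r
# stand-in data: two factors with unequal cell sizes
dat <- transform(mtcars, A = factor(cyl), B = factor(am))

fitR <- lm(mpg ~ A,     data = dat)   # restricted model
fitG <- lm(mpg ~ A + B, data = dat)   # more general model

# SS of effect B given A: squared differences between both sets of predictions
ssPred  <- sum((fitted(fitG) - fitted(fitR))^2)

# same SS from the model comparison (difference of residual SS)
ssAnova <- anova(fitR, fitG)[2, "Sum of Sq"]

all.equal(ssPred, ssAnova)
```

Both routes agree because the two models are nested, so the difference in residual SS equals the squared distance between the two vectors of fitted values.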
SAS and SPSS use SS type III as their default, while functions that ship with base R use type I. This can lead to different results when analyzing the same data with different statistics packages.
A thorough resource on the topic is chapter 7 of Maxwell & Delaney (2004). Designing Experiments and Analyzing Data. A Model Comparison Perspective. Mahwah, NJ: Lawrence Erlbaum.
Hypotheses for main effects as tested by different SS types in a non-proportional unbalanced two-factorial between-subjects design (IV A with \(P\) groups, B with \(Q\) groups).
Null-Hypotheses SS type I (A entered before B):
A: \(\sum_{k=1}^{Q} \frac{n_{1k}}{n_{1+}} \cdot \mu_{1k} = \ldots = \sum_{k=1}^{Q} \frac{n_{jk}}{n_{j+}} \cdot \mu_{jk} = \ldots = \sum_{k=1}^{Q} \frac{n_{Pk}}{n_{P+}} \cdot \mu_{Pk}\)
B: \(\sum_{j=1}^{P}(n_{jk} - \frac{n^{2}_{jk}}{n_{j+}}) \cdot \mu_{jk} = \sum_{k' \neq k} \sum_{j=1}^{P}(\frac{n_{jk} \cdot n_{jk'}}{n_{j+}}) \cdot \mu_{jk'} \qquad \forall \, k = 1, \ldots, Q-1\)
Null-Hypotheses SS type II:
A: \(\sum_{k=1}^{Q}(n_{jk} - \frac{n^{2}_{jk}}{n_{+k}}) \cdot \mu_{jk} = \sum_{j' \neq j} \sum_{k=1}^{Q}(\frac{n_{jk} \cdot n_{j'k}}{n_{+k}}) \cdot \mu_{j'k} \qquad \forall \, j = 1, \ldots, P-1\)
B: \(\sum_{j=1}^{P}(n_{jk} - \frac{n^{2}_{jk}}{n_{j+}}) \cdot \mu_{jk} = \sum_{k' \neq k} \sum_{j=1}^{P}(\frac{n_{jk} \cdot n_{jk'}}{n_{j+}}) \cdot \mu_{jk'} \qquad \forall \,k = 1, \ldots, Q-1\)
Null-Hypotheses SS type III (with effect- or orthogonal coding method):
A: \(\mu_{1.} = \ldots = \mu_{j.} = \ldots = \mu_{P.}\)
B: \(\mu_{.1} = \ldots = \mu_{.k} = \ldots = \mu_{.Q}\)
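The type I hypothesis for A compares marginal means weighted by cell sizes, whereas the type III hypothesis compares unweighted marginal means. A minimal sketch of the two kinds of marginal means follows; it uses the built-in `mtcars` data as an unbalanced stand-in, and all object names are illustrative.

```r
# stand-in data: two factors with unequal cell sizes
dat <- transform(mtcars, A = factor(cyl), B = factor(am))

cellM <- with(dat, tapply(mpg, list(A, B), mean))   # cell means
cellN <- unclass(with(dat, table(A, B)))            # cell sizes

# weighted marginal means of A: sum_k (n_jk / n_j+) * M_jk
wMeans <- rowSums(cellN * cellM) / rowSums(cellN)

# unweighted marginal means of A: plain average of the cell means in each row
uMeans <- rowMeans(cellM)

rbind(weighted = wMeans, unweighted = uMeans)
```

The weighted marginal means coincide with the ordinary group means of A that ignore B. With unequal cell sizes, they generally differ from the unweighted ones.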
P   <- 3
Q   <- 3
g11 <- c(41, 43, 50)
g12 <- c(51, 43, 53, 54, 46)
g13 <- c(45, 55, 56, 60, 58, 62, 62)
g21 <- c(56, 47, 45, 46, 49)
g22 <- c(58, 54, 49, 61, 52, 62)
g23 <- c(59, 55, 68, 63)
g31 <- c(43, 56, 48, 46, 47)
g32 <- c(59, 46, 58, 54)
g33 <- c(55, 69, 63, 56, 62, 67)
dfMD <- data.frame(IV1=factor(rep(1:P, c(3+5+7, 5+6+4, 5+4+6))),
                   IV2=factor(rep(rep(1:Q, P), c(3,5,7, 5,6,4, 5,4,6))),
                   DV =c(g11, g12, g13, g21, g22, g23, g31, g32, g33))
xtabs(~ IV1 + IV2, data=dfMD)
   IV2
IV1 1 2 3
  1 3 5 7
  2 5 6 4
  3 5 4 6
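Type I sums of squares are sequential: each term is tested against a model that contains only the terms entered before it. The results for main effects therefore depend on the order of terms, while the individual effect SS add up to the total effect SS.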
anova(lm(DV ~ IV1 + IV2 + IV1:IV2, data=dfMD))
Analysis of Variance Table
Response: DV
          Df  Sum Sq Mean Sq F value    Pr(>F)
IV1        2  101.11   50.56  1.8102    0.1782
IV2        2 1253.19  626.59 22.4357 4.711e-07 ***
IV1:IV2    4   14.19    3.55  0.1270    0.9717
Residuals 36 1005.42   27.93
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
anova(lm(DV ~ IV2 + IV1 + IV1:IV2, data=dfMD))
Analysis of Variance Table
Response: DV
          Df  Sum Sq Mean Sq F value    Pr(>F)
IV2        2 1115.82  557.91 19.9764 1.458e-06 ***
IV1        2  238.48  119.24  4.2695   0.02168 *
IV2:IV1    4   14.19    3.55  0.1270   0.97170
Residuals 36 1005.42   27.93
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# each effect: model with all preceding terms vs. model that adds the effect
SS.I1 <- anova(lm(DV ~ 1,       data=dfMD),
               lm(DV ~ IV1,     data=dfMD))
SS.I2 <- anova(lm(DV ~ IV1,     data=dfMD),
               lm(DV ~ IV1+IV2, data=dfMD))
SS.Ii <- anova(lm(DV ~ IV1+IV2, data=dfMD),
               lm(DV ~ IV1+IV2 + IV1:IV2, data=dfMD))
SS.I1[2, "Sum of Sq"]
[1] 101.1111
SS.I2[2, "Sum of Sq"]
[1] 1253.189
SS.Ii[2, "Sum of Sq"]
[1] 14.18714
# total effect SS: null model vs. full model
SST <- anova(lm(DV ~ 1,       data=dfMD),
             lm(DV ~ IV1*IV2, data=dfMD))
SST[2, "Sum of Sq"]
[1] 1368.487
SS.I1[2, "Sum of Sq"] + SS.I2[2, "Sum of Sq"] + SS.Ii[2, "Sum of Sq"]
[1] 1368.487
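Type II sums of squares test each main effect against the model that contains all other main effects but no interaction, so they do not depend on the order of terms. The individual effect SS do not add up to the total effect SS.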
Anova() from package car
library(car)
Anova(lm(DV ~ IV1*IV2, data=dfMD), type="II")
Anova Table (Type II tests)
Response: DV
           Sum Sq Df F value    Pr(>F)
IV1        238.48  2  4.2695   0.02168 *
IV2       1253.19  2 22.4357 4.711e-07 ***
IV1:IV2     14.19  4  0.1270   0.97170
Residuals 1005.42 36
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# main effects: model with the other main effect vs. model with both main effects
SS.II1 <- anova(lm(DV ~     IV2, data=dfMD),
                lm(DV ~ IV1+IV2, data=dfMD))
SS.II2 <- anova(lm(DV ~ IV1,     data=dfMD),
                lm(DV ~ IV1+IV2, data=dfMD))
SS.IIi <- anova(lm(DV ~ IV1+IV2, data=dfMD),
                lm(DV ~ IV1+IV2+IV1:IV2, data=dfMD))
SS.II1[2, "Sum of Sq"]
[1] 238.4826
SS.II2[2, "Sum of Sq"]
[1] 1253.189
SS.IIi[2, "Sum of Sq"]
[1] 14.18714
# total effect SS: null model vs. full model
SST <- anova(lm(DV ~ 1,       data=dfMD),
             lm(DV ~ IV1*IV2, data=dfMD))
SST[2, "Sum of Sq"]
[1] 1368.487
SS.II1[2, "Sum of Sq"] + SS.II2[2, "Sum of Sq"] + SS.IIi[2, "Sum of Sq"]
[1] 1505.859
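The type II SS for main effects can also be obtained with base R alone by applying drop1() to the model without the interaction. The sketch below illustrates this with the built-in `mtcars` data as an unbalanced stand-in (an assumption made only to keep the example self-contained); the object names are illustrative.

```r
# stand-in data: two factors with unequal cell sizes
dat <- transform(mtcars, A = factor(cyl), B = factor(am))

# additive model: dropping each main effect in turn gives its type II SS
fitAdd <- lm(mpg ~ A + B, data = dat)
ssII   <- drop1(fitAdd, test = "F")

# the same SS from explicit model comparisons
ssA <- anova(lm(mpg ~ B, data = dat), fitAdd)[2, "Sum of Sq"]
ssB <- anova(lm(mpg ~ A, data = dat), fitAdd)[2, "Sum of Sq"]

cbind(drop1 = ssII[c("A", "B"), "Sum of Sq"], anova = c(ssA, ssB))
```

Note that the F-tests from this drop1() call use the residual variance of the additive model. They therefore differ from F-tests that use the residual variance of the full model, such as those in the Anova() table above, even though the SS are the same.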
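Type III sums of squares test each effect against the full model from which only that effect is removed, even if this violates marginality. They do not depend on the order of terms, but they require effect coding or orthogonal coding for the factors.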
Anova() from package car
# options(contrasts=c(unordered="contr.sum", ordered="contr.poly"))
# options(contrasts=c(unordered="contr.treatment", ordered="contr.poly"))
# set effect coding for both factors in this fit only
fitIII <- lm(DV ~ IV1 + IV2 + IV1:IV2, data=dfMD,
             contrasts=list(IV1=contr.sum, IV2=contr.sum))
library(car)
Anova(fitIII, type="III")
Anova Table (Type III tests)
Response: DV
            Sum Sq Df   F value    Pr(>F)
(Intercept) 121174  1 4338.7178 < 2.2e-16 ***
IV1            205  2    3.6658   0.03556 *
IV2           1181  2   21.1452 8.447e-07 ***
IV1:IV2         14  4    0.1270   0.97170
Residuals     1005 36
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
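Why the coding scheme matters for SS type III can be illustrated by fitting the same model twice with different contrasts and requesting single-term deletions with drop1(). This is a minimal sketch using the built-in `mtcars` data as an unbalanced stand-in (an assumption made only to keep the example self-contained); the object names are illustrative.

```r
# stand-in data: two factors with unequal cell sizes
dat <- transform(mtcars, A = factor(cyl), B = factor(am))

# same model, once with effect coding and once with treatment coding
fitSum   <- lm(mpg ~ A*B, data = dat,
               contrasts = list(A = contr.sum, B = contr.sum))
fitTreat <- lm(mpg ~ A*B, data = dat,
               contrasts = list(A = contr.treatment, B = contr.treatment))

# drop every term in turn, including main effects
ssSum   <- drop1(fitSum,   ~ ., test = "F")
ssTreat <- drop1(fitTreat, ~ ., test = "F")

cbind(sum = ssSum[-1, "Sum of Sq"], treatment = ssTreat[-1, "Sum of Sq"])
```

The interaction SS is identical under both codings. The main-effect SS differ: with treatment coding, dropping the columns of a main effect tests that factor at the reference level of the other factor rather than comparing unweighted marginal means.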
drop1()
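The model comparisons for main effects with SS type III cannot be done using anova() because the restricted models violate the marginality principle: when a formula contains an interaction but omits one of the corresponding main effects, R codes the interaction such that the omitted lower-order term is implicitly included. The restricted model then fits exactly as well as the full model. These would be the comparisons: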
# A: lm(DV ~ IV2 + IV1:IV2) vs. lm(DV ~ IV1 + IV2 + IV1:IV2)
# B: lm(DV ~ IV1 + IV1:IV2) vs. lm(DV ~ IV1 + IV2 + IV1:IV2)
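That such a restricted model is not actually restricted can be checked by comparing its fit with that of the full model. The sketch below does this with the built-in `mtcars` data as an unbalanced stand-in (an assumption made only to keep the example self-contained); the object names are illustrative.

```r
# stand-in data: two factors with unequal cell sizes, no empty cells
dat <- transform(mtcars, A = factor(cyl), B = factor(am))

fitFull <- lm(mpg ~ A + B + A:B, data = dat)
fitNoA  <- lm(mpg ~     B + A:B, data = dat)   # main effect A omitted

# both model matrices have the same number of columns ...
nColFull <- ncol(model.matrix(fitFull))
nColNoA  <- ncol(model.matrix(fitNoA))

# ... and both models leave the same residual SS and residual df
rssFull <- deviance(fitFull)
rssNoA  <- deviance(fitNoA)
c(nColFull = nColFull, nColNoA = nColNoA, rssFull = rssFull, rssNoA = rssNoA)
```

Since both fits are equivalent, a model comparison between them has 0 degrees of freedom and cannot serve as a test of the main effect.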
In contrast, drop1() with the scope ~ . drops each term in turn from the model matrix even if marginality is violated, so it gives SS type III when the factors use effect coding or orthogonal coding.
drop1(fitIII, ~ ., test="F")
Single term deletions
Model:
DV ~ IV1 + IV2 + IV1:IV2
        Df Sum of Sq    RSS    AIC F value    Pr(>F)
<none>               1005.4 157.79
IV1      2    204.76 1210.2 162.13  3.6658   0.03556 *
IV2      2   1181.11 2186.5 188.75 21.1452 8.447e-07 ***
IV1:IV2  4     14.19 1019.6 150.42  0.1270   0.97170
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
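The SS reported by drop1() can be reproduced by hand: remove the columns that belong to one term from the effect-coded model matrix and refit. This is a minimal sketch using the built-in `mtcars` data as an unbalanced stand-in (an assumption made only to keep the example self-contained); the object names are illustrative.

```r
# stand-in data: two factors with unequal cell sizes
dat <- transform(mtcars, A = factor(cyl), B = factor(am))

fit <- lm(mpg ~ A + B + A:B, data = dat,
          contrasts = list(A = contr.sum, B = contr.sum))

X    <- model.matrix(fit)
asgn <- attr(X, "assign")   # 0: intercept, 1: A, 2: B, 3: A:B

# restricted fit: keep all columns except those coding main effect A
Xr   <- X[, asgn != 1, drop = FALSE]
rssR <- sum(lm.fit(Xr, dat$mpg)$residuals^2)

# SS type III for A: increase in residual SS relative to the full model
ssIIIA  <- rssR - deviance(fit)
ssDrop1 <- drop1(fit, ~ ., test = "F")["A", "Sum of Sq"]

all.equal(ssIIIA, ssDrop1)
```

Working on the columns of the model matrix sidesteps the formula interface, which is why the restricted model here really lacks the main effect.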
try(detach(package:car))