1 Balanced 3-Way Factorial Design

Use the following commands to initialize R before answering the questions in this section.

options(contrasts=c("contr.sum","contr.poly") )  # set definition of contrasts
load(url("http://pnb.mcmaster.ca/bennett/psy710/datasets/3-way-data-6.rda") )

An experiment examined how performance in three tasks was affected by divided attention in three age groups of subjects using a 3 (group) x 3 (task) x 2 (attention) crossed-factorial design. The data are stored in the data frame df1. The independent variables are group, task, and attention, and the dependent variable is Y.

  1. Confirm that the data are balanced.
summary(df1)
##     subject    group   task     attention        Y       
##  s1     :  1   g1:60   t1:60   focus :90   Min.   :43.9  
##  s2     :  1   g2:60   t2:60   divide:90   1st Qu.:67.8  
##  s3     :  1   g3:60   t3:60               Median :74.8  
##  s4     :  1                               Mean   :75.0  
##  s5     :  1                               3rd Qu.:82.3  
##  s6     :  1                               Max.   :98.2  
##  (Other):174
xtabs(~group+task+attention,df1)
## , , attention = focus
## 
##      task
## group t1 t2 t3
##    g1 10 10 10
##    g2 10 10 10
##    g3 10 10 10
## 
## , , attention = divide
## 
##      task
## group t1 t2 t3
##    g1 10 10 10
##    g2 10 10 10
##    g3 10 10 10
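
Balance can also be checked programmatically rather than by reading the table. The sketch below is a minimal illustration on a small made-up data frame (toy and toy.unbal are hypothetical, not the course data); the same check should carry over to df1 by swapping in its name and factors.

```r
# hypothetical toy data: 2 x 3 design with 4 observations per cell
toy <- expand.grid(rep = 1:4, A = c("a1", "a2"), B = c("b1", "b2", "b3"))
cell.n <- xtabs(~ A + B, data = toy)        # number of observations per cell
is.balanced <- length(unique(as.vector(cell.n))) == 1
is.balanced                                 # TRUE only when every cell has the same n

# dropping two rows from one cell makes the design unbalanced
toy.unbal <- toy[-(1:2), ]
cell.n.unbal <- xtabs(~ A + B, data = toy.unbal)
is.balanced.unbal <- length(unique(as.vector(cell.n.unbal))) == 1
is.balanced.unbal
```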
  2. Use ANOVA to evaluate all of the main effects and interactions. List the ANOVA table.
aov.01 <- aov(Y~group*task*attention,data=df1)
summary(aov.01)
##                       Df Sum Sq Mean Sq F value Pr(>F)   
## group                  2   1142     571    6.06 0.0029 **
## task                   2    587     294    3.12 0.0468 * 
## attention              1    513     513    5.44 0.0209 * 
## group:task             4    197      49    0.52 0.7195   
## group:attention        2      0       0    0.00 0.9979   
## task:attention         2     61      31    0.32 0.7230   
## group:task:attention   4   1646     411    4.37 0.0022 **
## Residuals            162  15250      94                  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
library(afex)
aov.01.car <- aov_car(Y~group*task*attention+Error(subject),data=df1)
summary(aov.01.car)
  3. The 3-way interaction is significant. What null hypothesis is evaluated by this \(F\) test?
# The F test for the 3-way interaction evaluates the null hypothesis that
# all of the 3-way interaction effects are zero. Equivalently:
# i) the group x task interaction is the same at both levels of attention;
# ii) the attention x task interaction is the same in all groups;
# iii) the group x attention interaction is the same in all tasks.
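
The same null hypothesis can be written in terms of population cell means. The notation is introduced here for illustration and is not part of the exercise: \(\mu_{ijk}\) is the mean for group \(i\), task \(j\), and attention level \(k\), and a dot marks a subscript that has been averaged over.

```latex
\[
H_0:\quad
\mu_{ijk}
- \bar{\mu}_{ij\cdot} - \bar{\mu}_{i\cdot k} - \bar{\mu}_{\cdot jk}
+ \bar{\mu}_{i\cdot\cdot} + \bar{\mu}_{\cdot j\cdot} + \bar{\mu}_{\cdot\cdot k}
- \bar{\mu}_{\cdot\cdot\cdot} = 0
\quad \text{for all } i, j, k .
\]
```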
  4. Evaluate the simple task x attention interaction in each group. If a simple interaction is significant, then analyze the simple simple main effect of task at each level of attention. Finally, if the simple simple main effect is significant, evaluate the pairwise differences among tasks with Tukey HSD.

levels(df1$group)
## [1] "g1" "g2" "g3"
# simple task x attention at g1:
aov.task.x.attention.g1 <- aov(Y~task*attention,data=subset(df1,group=="g1"))
summary(aov.task.x.attention.g1)
##                Df Sum Sq Mean Sq F value Pr(>F)   
## task            2    448     224    2.17 0.1242   
## attention       1    162     162    1.57 0.2161   
## task:attention  2   1095     547    5.30 0.0079 **
## Residuals      54   5582     103                  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
levels(df1$attention)
## [1] "focus"  "divide"
# simple simple main effect of task at "attention==focus" & group==g1
aov.task.focus.g1 <- aov(Y~task,data=subset(df1,attention=="focus"&group=="g1"))
summary(aov.task.focus.g1)
##             Df Sum Sq Mean Sq F value Pr(>F)   
## task         2   1437     719    7.86  0.002 **
## Residuals   27   2469      91                  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# analyze sub-effect with TukeyHSD
TukeyHSD(aov.task.focus.g1,which="task")
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = Y ~ task, data = subset(df1, attention == "focus" & group == "g1"))
## 
## $task
##          diff     lwr    upr  p adj
## t2-t1   9.499  -1.104 20.102 0.0856
## t3-t1  -7.413 -18.016  3.190 0.2114
## t3-t2 -16.912 -27.515 -6.308 0.0014
# simple simple main effect of task at "attention==divide" & group==g1
aov.task.divide.g1 <- aov(Y~task,data=subset(df1,attention=="divide"&group=="g1"))
summary(aov.task.divide.g1)
##             Df Sum Sq Mean Sq F value Pr(>F)
## task         2    106      53    0.46   0.64
## Residuals   27   3113     115

# simple task x attention at g2:
aov.task.x.attention.g2 <- aov(Y~task*attention,data=subset(df1,group=="g2"))
summary(aov.task.x.attention.g2)
##                Df Sum Sq Mean Sq F value Pr(>F)
## task            2    233   116.3    1.49   0.24
## attention       1    184   184.2    2.36   0.13
## task:attention  2    131    65.7    0.84   0.44
## Residuals      54   4223    78.2

# simple task x attention at g3:
aov.task.x.attention.g3 <- aov(Y~task*attention,data=subset(df1,group=="g3"))
summary(aov.task.x.attention.g3)
##                Df Sum Sq Mean Sq F value Pr(>F)
## task            2    103    51.5    0.51    0.6
## attention       1    167   166.8    1.65    0.2
## task:attention  2    481   240.4    2.39    0.1
## Residuals      54   5444   100.8
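
In a balanced design the simple task x attention sums of squares, added across groups, equal SS(task:attention) + SS(group:task:attention) from the full model; in the tables above, 1095 + 131 + 481 = 1707 and 61 + 1646 = 1707. The sketch below checks that identity on simulated data (the data frame toy and its values are made up). It also shows one alternative that is not used above: testing each simple interaction against the pooled error term from the full model rather than the error term from the subset.

```r
set.seed(1)
# hypothetical balanced 3 x 3 x 2 design, 5 observations per cell
toy <- expand.grid(rep = 1:5,
                   group = c("g1", "g2", "g3"),
                   task = c("t1", "t2", "t3"),
                   attention = c("focus", "divide"))
toy$Y <- rnorm(nrow(toy), mean = 75, sd = 10)

full.tab <- anova(aov(Y ~ group * task * attention, data = toy))

# simple task x attention SS within each group (separate fits, as above)
simple.ss <- sapply(levels(toy$group), function(g) {
  fit <- aov(Y ~ task * attention, data = toy[toy$group == g, ])
  anova(fit)["task:attention", "Sum Sq"]
})

# the simple SS add up to the 2-way plus the 3-way SS of the full model
ss.simple.total <- sum(simple.ss)
ss.full.total <- full.tab["task:attention", "Sum Sq"] +
  full.tab["group:task:attention", "Sum Sq"]
c(ss.simple.total, ss.full.total)

# alternative: F tests that use the pooled error term from the full model
mse.pooled <- full.tab["Residuals", "Mean Sq"]
df.pooled <- full.tab["Residuals", "Df"]
F.pooled <- (simple.ss / 2) / mse.pooled    # each simple interaction has 2 df
p.pooled <- pf(F.pooled, 2, df.pooled, lower.tail = FALSE)
round(cbind(F = F.pooled, p = p.pooled), 3)
```

Whether to pool the error term is a judgment call: pooling gives more error degrees of freedom but relies on equal error variance across groups.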
  5. Analyze the data with a linear contrast that tests the null hypothesis that the difference between tasks 1 and 2 (t1 & t2) is the same in groups 1 and 3 (g1 & g3).
# using aov:
levels(df1$task)
## [1] "t1" "t2" "t3"
t1.vs.t2 <- c(-1,1,0)
levels(df1$group)
## [1] "g1" "g2" "g3"
g1.vs.g3 <- c(-1,0,1)
contrasts(df1$task) <- cbind(t1.vs.t2)
contrasts(df1$group) <- cbind(g1.vs.g3)
aov.10 <- aov(Y~group*task*attention,data=df1)
summary(aov.10,split=list(group=list(g1.vs.g3=1,other=2),task=list(t1.vs.t2=1,other=2)))
##                                            Df Sum Sq Mean Sq F value  Pr(>F)    
## group                                       2   1142     571    6.06 0.00289 ** 
##   group: g1.vs.g3                           1   1039    1039   11.03 0.00111 ** 
##   group: other                              1    103     103    1.09 0.29697    
## task                                        2    587     294    3.12 0.04684 *  
##   task: t1.vs.t2                            1     39      39    0.41 0.52186    
##   task: other                               1    549     549    5.83 0.01690 *  
## attention                                   1    513     513    5.44 0.02085 *  
## group:task                                  4    197      49    0.52 0.71955    
##   group:task: g1.vs.g3.t1.vs.t2             1     99      99    1.05 0.30595    
##   group:task: other.t1.vs.t2                1     34      34    0.36 0.54938    
##   group:task: g1.vs.g3.other                1     63      63    0.66 0.41603    
##   group:task: other.other                   1      1       1    0.01 0.92380    
## group:attention                             2      0       0    0.00 0.99791    
##   group:attention: g1.vs.g3                 1      0       0    0.00 0.98913    
##   group:attention: other                    1      0       0    0.00 0.94964    
## task:attention                              2     61      31    0.32 0.72305    
##   task:attention: t1.vs.t2                  1     59      59    0.63 0.42831    
##   task:attention: other                     1      2       2    0.02 0.88966    
## group:task:attention                        4   1646     411    4.37 0.00220 ** 
##   group:task:attention: g1.vs.g3.t1.vs.t2   1    176     176    1.87 0.17328    
##   group:task:attention: other.t1.vs.t2      1    378     378    4.02 0.04665 *  
##   group:task:attention: g1.vs.g3.other      1   1088    1088   11.55 0.00085 ***
##   group:task:attention: other.other         1      4       4    0.04 0.83667    
## Residuals                                 162  15250      94                    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# the (g1.vs.g3) X (t1.vs.t2) interaction  is NOT significant (F=1.055, p=0.306)
# also, the value of that contrast does not differ significantly across attention (F=1.871, p=0.173)
# Here is the same interaction contrast computed with emmeans
library(emmeans)
emm.10 <- emmeans(aov.10,specs=~task+group+attention)
emm.10 # note the order of the conditions!
##  task group attention emmean   SE  df lower.CL upper.CL
##  t1   g1    focus       78.3 3.07 162     72.3     84.4
##  t2   g1    focus       87.8 3.07 162     81.8     93.9
##  t3   g1    focus       70.9 3.07 162     64.8     77.0
##  t1   g2    focus       79.4 3.07 162     73.3     85.4
##  t2   g2    focus       78.4 3.07 162     72.3     84.5
##  t3   g2    focus       75.6 3.07 162     69.6     81.7
##  t1   g3    focus       72.4 3.07 162     66.3     78.4
##  t2   g3    focus       71.5 3.07 162     65.4     77.5
##  t3   g3    focus       75.6 3.07 162     69.5     81.7
##  t1   g1    divide      77.4 3.07 162     71.3     83.4
##  t2   g1    divide      73.1 3.07 162     67.0     79.2
##  t3   g1    divide      76.7 3.07 162     70.7     82.8
##  t1   g2    divide      72.4 3.07 162     66.3     78.5
##  t2   g2    divide      78.7 3.07 162     72.6     84.7
##  t3   g2    divide      71.8 3.07 162     65.7     77.9
##  t1   g3    divide      74.0 3.07 162     67.9     80.0
##  t2   g3    divide      71.2 3.07 162     65.1     77.2
##  t3   g3    divide      64.3 3.07 162     58.3     70.4
## 
## Confidence level used: 0.95
# create contrasts that compare conditions:
( t1.vs.t2 <- rep(c(-1,1,0),times=6) ) # task 1 vs task 2
##  [1] -1  1  0 -1  1  0 -1  1  0 -1  1  0 -1  1  0 -1  1  0
( g1.vs.g3 <- rep(c(-1,0,1),each=3,times=2) ) # group 1 vs group 3
##  [1] -1 -1 -1  0  0  0  1  1  1 -1 -1 -1  0  0  0  1  1  1
( focus.vs.divide <- rep(c(1,-1),each=9)) # focus vs divide
##  [1]  1  1  1  1  1  1  1  1  1 -1 -1 -1 -1 -1 -1 -1 -1 -1
( int.contrast <- g1.vs.g3 *  t1.vs.t2 ) # g1.vs.g3 x t1.vs.t2 (ignores attention)
##  [1]  1 -1  0  0  0  0 -1  1  0  1 -1  0  0  0  0 -1  1  0
( int.contrast.2 <- g1.vs.g3 *  t1.vs.t2 * focus.vs.divide ) # g1.vs.g3 x t1.vs.t2 x  attention
##  [1]  1 -1  0  0  0  0 -1  1  0 -1  1  0  0  0  0  1 -1  0
# p values match corresponding values in ANOVA table:
contrast(emm.10,
         method=list(g1.vs.g3.x.t1.vs.t2=int.contrast,
                     g1.vs.g3.x.t1.vs.t2.x.attn=int.contrast.2))
##  contrast                   estimate   SE  df t.ratio p.value
##  g1.vs.g3.x.t1.vs.t2           -8.91 8.68 162  -1.027  0.3060
##  g1.vs.g3.x.t1.vs.t2.x.attn   -11.87 8.68 162  -1.368  0.1733
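
The emmeans estimates can be reproduced by hand, which makes explicit what the contrast weights do. The sketch below copies the rounded cell means from the emm.10 table and the residual SS and df from the ANOVA table, so it agrees with the output above only up to rounding. The names cell.means, w, w2, and the other variables are introduced here for illustration.

```r
# cell means copied from the emm.10 table (task varies fastest, then group, then attention)
cell.means <- c(78.3, 87.8, 70.9, 79.4, 78.4, 75.6, 72.4, 71.5, 75.6,   # focus
                77.4, 73.1, 76.7, 72.4, 78.7, 71.8, 74.0, 71.2, 64.3)   # divide
mse <- 15250 / 162     # residual SS / residual df from the ANOVA table
df.err <- 162
n <- 10                # observations per cell

# contrast weights, built the same way as above
task.w <- rep(c(-1, 1, 0), times = 6)            # t1 vs t2
group.w <- rep(c(-1, 0, 1), each = 3, times = 2) # g1 vs g3
attn.w <- rep(c(1, -1), each = 9)                # focus vs divide
w <- group.w * task.w                            # 2-way interaction contrast
w2 <- group.w * task.w * attn.w                  # 3-way interaction contrast

# estimate = weighted sum of cell means; SE = sqrt(MSE * sum(w^2) / n)
est <- c(sum(w * cell.means), sum(w2 * cell.means))
se <- sqrt(mse * c(sum(w^2), sum(w2^2)) / n)
t.ratio <- est / se
p.value <- 2 * pt(-abs(t.ratio), df = df.err)
data.frame(estimate = est, SE = se, t.ratio = t.ratio,
           F = t.ratio^2, p.value = p.value)
```

Squaring each t ratio gives the F value for the corresponding single-df row of the split ANOVA table, which is why the p values agree.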
# Here is another emmeans method for evaluating slightly different hypotheses
task.contrast <- c(-1,1,0)
con1 <- contrast(emm.10,method=list(task=task.contrast),by=c("group","attention"))
# following table doesn't include the *specific* group contrast used above
# So 1st & 3rd lines are closer to an omnibus F of variation across groups
# 1st line asks if task contrast differs IN ANY WAY across groups
# 2nd line asks if task contrast differs between levels of attention (this >>is<< the same as the anova result)
# 3rd line asks if task-contrast variation across groups DEPENDS on attention
joint_tests(con1)

2 Unbalanced Factorial Design

Use the following commands to initialize R before answering the questions in this section.

options(contrasts=c("contr.sum","contr.poly") )  # set definition of contrasts
load(url("http://pnb.mcmaster.ca/bennett/psy710/datasets/DfCD-2.rda") )

An experiment was done to measure the effects of treatment C and treatment D on a dependent variable, y, using a crossed-factorial design. Six subjects were assigned randomly to each condition; however, the data from two subjects in one of the conditions were lost. The data are stored in the data frame Df.CD.

  1. Confirm that the data are unbalanced.
# with(Df.CD,tapply(y,list(C,D),length))
xtabs(~C+D,data=Df.CD)
##     D
## C    b1 b2 b3
##   a1  6  6  6
##   a2  6  6  4
  2. Verify that the results of the two-way ANOVA depend on the order of the terms in the full linear model. Calculate the Type I and Type II sums of squares for C and D.
options(contrasts=c("contr.sum","contr.poly"))
cd.aov.01 <- aov(y~C+D+C:D, data=Df.CD)
cd.aov.02 <- aov(y~D+C+D:C, data=Df.CD)
summary(cd.aov.01)
##             Df Sum Sq Mean Sq F value  Pr(>F)    
## C            1   83.0    83.0    14.5 0.00069 ***
## D            2  275.0   137.5    24.1 8.2e-07 ***
## C:D          2   26.3    13.1     2.3 0.11853    
## Residuals   28  159.8     5.7                    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
summary(cd.aov.02)
##             Df Sum Sq Mean Sq F value  Pr(>F)    
## D            2  299.6   149.8    26.2 3.8e-07 ***
## C            1   58.4    58.4    10.2  0.0034 ** 
## D:C          2   26.3    13.1     2.3  0.1185    
## Residuals   28  159.8     5.7                    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
C_SS_1 <- 83.0 # C: Type I SS (C entered first, ignoring D)
C_SS_2 <- 58.4 # C: Type II SS (C adjusted for D)
D_SS_1 <- 299.6 # D: Type I SS (D entered first, ignoring C)
D_SS_2 <- 275.0 # D: Type II SS (D adjusted for C)
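
These values can be understood as model comparisons: the Type I SS for the first term ignores the other factor, whereas the Type II SS for a main effect is the drop in residual SS when that term is added to a model that already contains the other main effect. A minimal sketch on simulated unbalanced data follows; the data frame toy, the helper rss, and all values are hypothetical and are not Df.CD.

```r
set.seed(2)
# hypothetical 2 x 3 design with 6 per cell, then 2 observations removed from one cell
toy <- expand.grid(rep = 1:6, C = c("a1", "a2"), D = c("b1", "b2", "b3"))
toy <- toy[!(toy$C == "a2" & toy$D == "b3" & toy$rep > 4), ]
toy$y <- rnorm(nrow(toy), mean = 10 + 2 * (toy$C == "a2") + as.integer(toy$D), sd = 2)

# residual sum of squares for a given model formula
rss <- function(f) deviance(lm(f, data = toy))

ss.C.typeI  <- rss(y ~ 1) - rss(y ~ C)       # C entered first, ignoring D
ss.C.typeII <- rss(y ~ D) - rss(y ~ C + D)   # C adjusted for D
ss.D.typeII <- rss(y ~ C) - rss(y ~ C + D)   # D adjusted for C

# sequential (Type I) tables for the two term orders
seq.CD <- anova(lm(y ~ C + D + C:D, data = toy))
seq.DC <- anova(lm(y ~ D + C + D:C, data = toy))

# the second main effect in each sequential table is its Type II SS
c(ss.C.typeI, seq.CD["C", "Sum Sq"])
c(ss.C.typeII, seq.DC["C", "Sum Sq"])
c(ss.D.typeII, seq.CD["D", "Sum Sq"])
```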
  3. Evaluate the main effects of C and D and the C x D interaction using Type III sums of squares. What null hypotheses are evaluated by these \(F\) tests?
# using ANOVA:
library(car) # provides Anova()
Anova(cd.aov.01,type="3")
Anova(cd.aov.02,type="3") # order independent
# using drop1:
drop1(cd.aov.01,.~.,test="F")
# type III SS using aov_car (afex requires a subject identifier):
library(afex)
N <- dim(Df.CD)[1]
Df.CD$id <- factor(x=1:N,labels="s",ordered=FALSE)
acar.t3.01 <- aov_car(y~C*D+Error(id),data=Df.CD,type="3")
summary(acar.t3.01)
# type II SS using aov_car:
acar.t2.01 <- aov_car(y~C*D+Error(id),data=Df.CD,type="2")
summary(acar.t2.01) # same as Anova(cd.aov.01,type="2")
# compare to Anova:
Anova(cd.aov.01,type="2")
# type III SS using aov_ez:
aez.t3.01 <- aov_ez(id="id",dv="y",between=c("C","D"),data=Df.CD,type="3")
summary(aez.t3.01) # same as Anova(cd.aov.01,type="3")
# type 2 sums of squares with aov_ez:
aez.t2.01 <- aov_ez(id="id",dv="y",between=c("C","D"),data=Df.CD,type="2")
summary(aez.t2.01) # same as Anova(cd.aov.01,type="2")

Answer: The hypotheses for the main effects are that the unweighted marginal means are equal, where each unweighted marginal mean is the simple average of the relevant cell means regardless of cell size. The hypothesis for the interaction is that the effect of C does not vary across the levels of D (or, equivalently, that the effect of D does not vary across the levels of C).
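
To make the distinction concrete, the sketch below computes unweighted and weighted marginal means for C on simulated unbalanced data (the data frame toy and its values are hypothetical, not Df.CD). The two kinds of mean coincide for the level of C whose cells all have the same n and differ for the level with the short cell.

```r
set.seed(3)
# hypothetical 2 x 3 design: 6 per cell, except 4 in cell (a2, b3)
toy <- expand.grid(rep = 1:6, C = c("a1", "a2"), D = c("b1", "b2", "b3"))
toy <- toy[!(toy$C == "a2" & toy$D == "b3" & toy$rep > 4), ]
toy$y <- rnorm(nrow(toy), mean = 10 + 3 * as.integer(toy$D), sd = 2)

cell.means <- with(toy, tapply(y, list(C, D), mean))  # 2 x 3 table of cell means
unweighted <- rowMeans(cell.means)          # every cell counts equally
weighted <- with(toy, tapply(y, C, mean))   # every observation counts equally
cbind(unweighted, weighted)
```

With Type III sums of squares the main-effect test for C compares the values in the unweighted column, so unequal cell sizes do not change which population means are being compared.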