1 Initialize R

Use the following commands to initialize R. The commands set R to define effects using the sum-to-zero constraint and load two data frames, df0 and df1, that we’ll analyze in the following sections.

options(contrasts=c("contr.sum","contr.poly") )  # set definition of contrasts
load(url("http://pnb.mcmaster.ca/bennett/psy710/datasets/RLab06b.rda") )
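# optional sanity check: confirm that load() created both data frames
# (the names df0 and df1 are taken from the text above)
exists("df0") && exists("df1")  # should return TRUE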

2 Three-Way ANOVA

An experiment used a factorial design to analyze the effects of three factors A, B, and C on a dependent variable, Y. The data are stored in the data frame df0. Use df0 to answer the following questions. When analyzing simple interaction effects or simple main effects, you do not need to recalculate \(F\) and \(p\) values using MS-error and df-error from the original ANOVA.

  1. Confirm that the data are balanced.
summary(df0)
##   A       B       C            Y        
##  a1:54   b1:54   c1:54   Min.   : 70.7  
##  a2:54   b2:54   c2:54   1st Qu.: 93.6  
##  a3:54   b3:54   c3:54   Median :102.0  
##                          Mean   :101.2  
##                          3rd Qu.:108.5  
##                          Max.   :132.1
xtabs(~A+B+C,df0)
## , , C = c1
## 
##     B
## A    b1 b2 b3
##   a1  6  6  6
##   a2  6  6  6
##   a3  6  6  6
## 
## , , C = c2
## 
##     B
## A    b1 b2 b3
##   a1  6  6  6
##   a2  6  6  6
##   a3  6  6  6
## 
## , , C = c3
## 
##     B
## A    b1 b2 b3
##   a1  6  6  6
##   a2  6  6  6
##   a3  6  6  6
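# every A x B x C cell contains n = 6 observations (27 cells, 162 scores in total),
# so the design is balanced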
  2. Use ANOVA to evaluate all of the main effects and interactions. List the ANOVA table.
aov.01 <- aov(Y~A*B*C,data=df0)
summary(aov.01)
##              Df Sum Sq Mean Sq F value  Pr(>F)    
## A             2    762     381    3.33 0.03872 *  
## B             2   1669     834    7.29 0.00099 ***
## C             2    304     152    1.33 0.26812    
## A:B           4   1536     384    3.36 0.01182 *  
## A:C           4    352      88    0.77 0.54768    
## B:C           4   1267     317    2.77 0.02997 *  
## A:B:C         8   2352     294    2.57 0.01221 *  
## Residuals   135  15452     114                    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
  3. The 3-way interaction is significant. What null hypothesis is evaluated by this \(F\) test?
# The F test for the 3-way interaction evaluates the null hypothesis that, equivalently:
# i) the AxB interaction is the same at all levels of C;
# ii) the AxC interaction is the same at all levels of B; and
# iii) the BxC interaction is the same at all levels of A.
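# Equivalently, in terms of the model parameters, H0 states that all of the
# three-way interaction terms are zero: \((\alpha\beta\gamma)_{jkl} = 0\)
# for all j, k, and l (standard notation, assumed here).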
  4. Does it make sense to analyze the sub-effects of the main effects of A and B, or to decompose/analyze the A:B interaction? Why or why not?
# It probably does not make much sense to analyze these effects, because the
# significant 3-way interaction indicates that the effects of one factor (e.g., A)
# depend on the levels of the other factors (B & C), and that the 2-way interactions
# (e.g., AxB) depend on the level of the third factor (e.g., C). So we should start by
# decomposing the 3-way interaction.
  5. Plot the 3-way interaction and come up with an idea about why it is significant.
library(ggplot2)

myTheme <- theme(
  plot.title=element_text(family="Helvetica",face="bold",size=(24),hjust=0.5),
  strip.text=element_text(family="Helvetica",face="bold",size=(18),hjust=0.5),
  legend.title=element_text(family="Helvetica",face="bold",size=(18),hjust=0.5),
  legend.text=element_text(family="Helvetica",size=(18)),
  axis.title=element_text(family="Helvetica",size=(24)),
  axis.text=element_text(family="Helvetica",size=(18)))

library(emmeans)
aov.emm <- emmeans(aov.01,~A*B*C);
emmip(aov.emm, C~A|B, dotarg=list(size=4)) + myTheme + ylim(60,130) # look at b3

# To me, it looks like the A x C interaction is stronger
# when B=b3 than when B=b1 or B=b2. One way of seeing whether this
# idea is reasonable is to remove b3 and see if the AxBxC interaction
# changes...
aov.01.nob3 <- aov(Y~A*B*C,data=subset(df0,B!="b3")) # remove b3
summary(aov.01.nob3) # the AxBxC interaction is no longer significant
##             Df Sum Sq Mean Sq F value  Pr(>F)    
## A            2   1095     548    4.33 0.01602 *  
## B            1   1628    1628   12.87 0.00054 ***
## C            2     78      39    0.31 0.73614    
## A:B          2   1005     503    3.97 0.02222 *  
## A:C          4    187      47    0.37 0.82946    
## B:C          2    214     107    0.85 0.43266    
## A:B:C        4    212      53    0.42 0.79524    
## Residuals   90  11386     127                    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
  6. Evaluate the simple A:C interaction at each level of B. For each analysis, calculate a measure of effect size or association strength for AxC. If the simple interaction is significant, evaluate the simple-simple main effect of C at each level of A. If the simple interaction is not significant, evaluate the simple main effects of A and C at that level of B.
# A x C at b1:
aov.01.b1 <- aov(Y~A*C,data=subset(df0,B=="b1"))
summary(aov.01.b1) # no significant effects
##             Df Sum Sq Mean Sq F value Pr(>F)
## A            2      2     0.9    0.01   0.99
## C            2    243   121.4    1.01   0.37
## A:C          4    179    44.8    0.37   0.83
## Residuals   45   5396   119.9
library(effectsize)
cohens_f(aov.01.b1)$Cohens_f_partial[3] # Cohen's f for A:C
## [1] 0.1823
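# the simple A:C interaction at b1 is not significant, so (per the question)
# we examine the simple main effects of A and C at b1: neither is significant
# in the table above (A: p = 0.99; C: p = 0.37)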
# A x C at b2:
aov.01.b2 <- aov(Y~A*C,data=subset(df0,B=="b2")) 
summary(aov.01.b2)  # significant simple main effect of A (at b2)
##             Df Sum Sq Mean Sq F value Pr(>F)   
## A            2   2099    1049    7.88 0.0012 **
## C            2     49      24    0.18 0.8327   
## A:C          4    219      55    0.41 0.7988   
## Residuals   45   5989     133                  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
cohens_f(aov.01.b2)$Cohens_f_partial[3] # Cohen's f for A:C
## [1] 0.1914
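# the simple A:C interaction at b2 also is not significant; the simple main
# effect of A is significant (p = 0.0012) but the simple main effect of C is
# not (p = 0.83), so only A is followed up with Tukey's HSD below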
TukeyHSD(aov.01.b2,which = "A") # marginal means of a1 & a3 differ
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = Y ~ A * C, data = subset(df0, B == "b2"))
## 
## $A
##         diff    lwr   upr  p adj
## a2-a1  7.781 -1.539 17.10 0.1182
## a3-a1 15.270  5.950 24.59 0.0007
## a3-a2  7.489 -1.832 16.81 0.1374
library(gplots)
plotmeans(Y~A,
        data=subset(df0,B=="b2"),
        main="Simple Main Effect of A at B=b2") # plot of simple main effect of A at b2

# A x C at b3:
aov.01.b3 <- aov(Y~A*C,data=subset(df0,B=="b3"))
summary(aov.01.b3)
##             Df Sum Sq Mean Sq F value  Pr(>F)    
## A            2    198      99    1.10 0.34236    
## C            2   1279     640    7.08 0.00212 ** 
## A:C          4   2305     576    6.38 0.00037 ***
## Residuals   45   4066      90                    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
cohens_f(aov.01.b3)$Cohens_f_partial[3] # Cohen's f for A:C
## [1] 0.7528
# simple-simple main effect of C at a1 & b3:
aov.01.b3.a1 <- aov(Y~C,data=subset(df0,B=="b3"&A=="a1"))
summary(aov.01.b3.a1) # not significant
##             Df Sum Sq Mean Sq F value Pr(>F)
## C            2     45    22.7     0.3   0.75
## Residuals   15   1138    75.9

# simple-simple effect of C at a2 & b3:
aov.01.b3.a2 <- aov(Y~C,data=subset(df0,B=="b3"&A=="a2"))
summary(aov.01.b3.a2) # not significant
##             Df Sum Sq Mean Sq F value Pr(>F)
## C            2     33    16.3    0.17   0.84
## Residuals   15   1414    94.3

# simple-simple effect of C at a3 & b3:
aov.01.b3.a3 <- aov(Y~C,data=subset(df0,B=="b3"&A=="a3"))
summary(aov.01.b3.a3) # significant [look at these 3 means in my figure]
##             Df Sum Sq Mean Sq F value  Pr(>F)    
## C            2   3506    1753    17.4 0.00012 ***
## Residuals   15   1514     101                    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
TukeyHSD(aov.01.b3.a3) # c3 different from c1 & c2
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = Y ~ C, data = subset(df0, B == "b3" & A == "a3"))
## 
## $C
##         diff    lwr   upr  p adj
## c2-c1 -3.021 -18.09 12.04 0.8624
## c3-c1 27.978  12.91 43.04 0.0006
## c3-c2 30.999  15.93 46.07 0.0002
library(gplots)
plotmeans(Y~C,
        data=subset(df0,B=="b3"&A=="a3"),
        main="Simple-Simple Main Effect of C at A=a3 & B=b3")

3 Two-Way ANOVA

An experiment used a factorial design to evaluate the effects of two drugs, A and B, on the dependent variable Y. The levels of A and B were drug absent (a0 & b0) and drug present (a1 & b1); condition a0b0 is the baseline (placebo) condition. The data are stored in the data frame df1.

  7. Use ANOVA to evaluate the effects of A and B. List the ANOVA table.
aov.20 <- aov(Y~A*B,data=df1)
summary(aov.20)
##             Df Sum Sq Mean Sq F value  Pr(>F)    
## A            1    669     669    34.0 1.2e-06 ***
## B            1    238     238    12.1  0.0013 ** 
## A:B          1    554     554    28.2 5.9e-06 ***
## Residuals   36    708      20                    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
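# as in the previous section, we could also compute partial Cohen's f for
# each effect (using the effectsize package loaded earlier):
cohens_f(aov.20)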
  8. Evaluate the homogeneity of variance assumption.
df1$condition <- with(df1,interaction(A,B))
levels(df1$condition)
## [1] "a0.b0" "a1.b0" "a0.b1" "a1.b1"
bartlett.test(Y~condition,df1) # we do not reject H0 that variance is equal across conditions
## 
##  Bartlett test of homogeneity of variances
## 
## data:  Y by condition
## Bartlett's K-squared = 2.7, df = 3, p-value = 0.4
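# Levene's test is a common, more robust alternative to Bartlett's test
# (this sketch assumes the car package is installed):
car::leveneTest(Y ~ condition, data = df1)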
  9. Evaluate the normality assumption.
shapiro.test(residuals(aov.20)) # we do not reject H0 that the residuals are normal
## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(aov.20)
## W = 0.98, p-value = 0.8
qqnorm(residuals(aov.20))
qqline(residuals(aov.20))

  10. The experimenters hypothesized that both drugs had to be present to have an effect on Y. Perform an analysis that follows up your ANOVA to evaluate this hypothesis.

aov.20.emm <- emmeans(aov.20,~A*B);
emmip(aov.20.emm,A~B,dotarg=list(size=4))+myTheme + ylim(90,130)

# There are at least two ways you could answer this question. One would be to evaluate
# all pairwise differences in your design. The prediction is that a1b1 should differ from the
# other 3 conditions (which should not differ from each other).
TukeyHSD(aov.20,which="A:B")
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = Y ~ A * B, data = df1)
## 
## $`A:B`
##                diff    lwr    upr  p adj
## a1:b0-a0:b0  0.7409 -4.599  6.081 0.9819
## a0:b1-a0:b0 -2.5635 -7.903  2.777 0.5733
## a1:b1-a0:b0 13.0577  7.718 18.398 0.0000
## a0:b1-a1:b0 -3.3044 -8.644  2.036 0.3558
## a1:b1-a1:b0 12.3167  6.977 17.657 0.0000
## a1:b1-a0:b1 15.6211 10.281 20.961 0.0000
# another Tukey test:
aov.21 <- aov(Y~condition,data=df1)
TukeyHSD(aov.21) # same as above
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = Y ~ condition, data = df1)
## 
## $condition
##                diff    lwr    upr  p adj
## a1.b0-a0.b0  0.7409 -4.599  6.081 0.9819
## a0.b1-a0.b0 -2.5635 -7.903  2.777 0.5733
## a1.b1-a0.b0 13.0577  7.718 18.398 0.0000
## a0.b1-a1.b0 -3.3044 -8.644  2.036 0.3558
## a1.b1-a1.b0 12.3167  6.977 17.657 0.0000
## a1.b1-a0.b1 15.6211 10.281 20.961 0.0000
# A second way would be to evaluate a linear contrast that compares a1b1 to the other conditions.
# This contrast accounts for nearly all (96%) of the variation across conditions.
levels(df1$condition)
## [1] "a0.b0" "a1.b0" "a0.b1" "a1.b1"
a1b1 <- c(-1/3,-1/3,-1/3,1)
contrasts(df1$condition) <- cbind(a1b1)
aov.22  <- aov(Y~condition,data=df1)
summary(aov.22,split=list(condition=list(a1b1=1,others=2:3)))
##                     Df Sum Sq Mean Sq F value  Pr(>F)    
## condition            3   1461     487   24.77 7.2e-09 ***
##   condition: a1b1    1   1401    1401   71.25 4.7e-10 ***
##   condition: others  2     60      30    1.53    0.23    
## Residuals           36    708      20                    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

# contrast accounts for 96% of variation across conditions
(R2.alerting <- 1401/1461)
## [1] 0.9589
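
# The same contrast can also be tested with emmeans (a sketch using the
# one-way model aov.21 fit above; the weights follow the level order of
# df1$condition shown earlier):
contrast(emmeans(aov.21, ~condition),
         method = list(a1b1 = c(-1/3, -1/3, -1/3, 1)))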