1 Initialize R

Initialize R with the following commands. The load command loads three data frames – L4.df01, visRT, and blood – though we will not use blood in this lab.

options(digits=4,contrasts=c("contr.sum","contr.poly")) # IMPORTANT! [Use sum-to-zero constraint for effects]
load(file=url("http://pnb.mcmaster.ca/bennett/psy710/datasets/rlab4-data-2023.rda"))
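
You can confirm that the data frames were loaded into your workspace (an optional check; if your workspace already contained other objects, ls() will list those as well):

ls() # should include "L4.df01", "blood", and "visRT"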

2 One-way ANOVA

In this section we will analyze data in the L4.df01 data frame, which contains the data from an experiment that measured scores in four groups of subjects. There are 12 observations per group.

with(L4.df01,tapply(score,group,length))
##  1  2  3  4 
## 12 12 12 12
summary(L4.df01)
##      group          score      
##  Min.   :1.00   Min.   : 64.1  
##  1st Qu.:1.75   1st Qu.: 88.8  
##  Median :2.50   Median :100.2  
##  Mean   :2.50   Mean   :101.1  
##  3rd Qu.:3.25   3rd Qu.:112.8  
##  Max.   :4.00   Max.   :132.4
  1. Use the variables group and score in L4.df01 to evaluate the null hypothesis that the four group means are equal. Based on the ANOVA table, what is your decision regarding the null hypothesis?
aov.01 <- aov(score~group,data=L4.df01)
summary(aov.01)
##             Df Sum Sq Mean Sq F value Pr(>F)  
## group        1   1384    1384    6.39  0.015 *
## Residuals   46   9961     217                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# I reject the null hypothesis that the means are equal (or, equivalently, that all of the effects are zero)
  1. What is wrong with the previous analysis? Hint: Look at the degrees of freedom.
# Note the degrees of freedom for group is 1, which is NOT correct.
# There are 4 groups so there should be 3 degrees of freedom.
# R has treated group as a numeric variable, not a factor
sapply(L4.df01,class)
##     group     score 
## "numeric" "numeric"

The following code i) transforms group into a factor and ii) creates another factor, condition, and adds it to our data frame.

L4.df01$group <- factor(x=L4.df01$group,labels="g")
n <- 12
a <- 4
gnumber <- rep(x=1:a,each=n)
L4.df01$condition <- factor(x=gnumber,labels="c",ordered=FALSE)
class(L4.df01$condition)
## [1] "factor"
levels(L4.df01$condition)
## [1] "c1" "c2" "c3" "c4"
summary(L4.df01)
##  group       score       condition
##  g1:12   Min.   : 64.1   c1:12    
##  g2:12   1st Qu.: 88.8   c2:12    
##  g3:12   Median :100.2   c3:12    
##  g4:12   Mean   :101.1   c4:12    
##          3rd Qu.:112.8            
##          Max.   :132.4
  1. Visualize the data by plotting box plots of score for each condition.
boxplot(score~condition,data=L4.df01,main="data frame L4.df01",cex.axis=1.25,cex.lab=1.5)

  1. Now use aov to evaluate the null hypothesis that all of the group means are equal. How does the result differ from the previous ANOVA?
L4.aov.01 <- aov(score~group,data=L4.df01) 
# L4.aov.02 <- aov(score~condition,data=L4.df01) # using condition gives same result
# anova(L4.aov.01)
summary(L4.aov.01)
##             Df Sum Sq Mean Sq F value Pr(>F)  
## group        3   1515     505    2.26  0.095 .
## Residuals   44   9830     223                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# Unlike before, I do not reject the null hypothesis that all of the group means are equal.
  1. Check the normality assumption.
# I do not reject the null hypothesis that the residuals are distributed normally
shapiro.test(residuals(L4.aov.01)) 
## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(L4.aov.01)
## W = 0.98, p-value = 0.8
qqnorm(residuals(L4.aov.01),main="L4.aov.01");qqline(residuals(L4.aov.01))

  1. Evaluate the null hypothesis of equal group means using an analysis that assumes the residuals are distributed normally but does not assume that the variances are equal across groups.
oneway.test(score~condition,data=L4.df01)
## 
##  One-way analysis of means (not assuming equal variances)
## 
## data:  score and condition
## F = 2.1, num df = 3, denom df = 24, p-value = 0.1

# I do not reject the null hypothesis of equal means.

2.1 Pairwise Tests

The following code shows how to perform all pairwise tests and adjust the \(p\) values using the Tukey method of adjustment. The confidence intervals listed by TukeyHSD are adjusted, or simultaneous, confidence intervals. An individual 95% CI includes the true population value 95% of the time (in the long run). The adjusted 95% CIs, on the other hand, are constructed so that all of them simultaneously contain their respective true population values 95% of the time. In other words, if we repeated this experiment many times, then in 95% of the replications every one of the adjusted CIs would contain its true value.

# Tukey HSD -----------
TukeyHSD(L4.aov.01,"group") # assumes equal variance
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = score ~ group, data = L4.df01)
## 
## $group
##         diff     lwr   upr  p adj
## g2-g1  1.851 -14.441 18.14 0.9902
## g3-g1 11.091  -5.201 27.38 0.2789
## g4-g1 12.930  -3.362 29.22 0.1630
## g3-g2  9.240  -7.052 25.53 0.4378
## g4-g2 11.079  -5.213 27.37 0.2798
## g4-g3  1.839 -14.453 18.13 0.9904

# using emmeans -------
library(emmeans)
lab4.emm <- emmeans(L4.aov.01,specs="group")
contrast(lab4.emm,method="pairwise",adjust="tukey")
##  contrast estimate  SE df t.ratio p.value
##  g1 - g2     -1.85 6.1 44  -0.303  0.9902
##  g1 - g3    -11.09 6.1 44  -1.818  0.2789
##  g1 - g4    -12.93 6.1 44  -2.119  0.1630
##  g2 - g3     -9.24 6.1 44  -1.514  0.4378
##  g2 - g4    -11.08 6.1 44  -1.816  0.2798
##  g3 - g4     -1.84 6.1 44  -0.301  0.9904
## 
## P value adjustment: tukey method for comparing a family of 4 estimates
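
For intuition, the half-width of each Tukey interval can be reconstructed from a studentized-range quantile. A minimal check, using the MSE and group size from the ANOVA above:

# Tukey HSD half-width: q(0.95; nmeans, df) * sqrt(MSE/n)
MSE <- 223.4 # Mean Sq for Residuals in the ANOVA table
n <- 12      # observations per group
qtukey(p=0.95,nmeans=4,df=44)*sqrt(MSE/n) # about 16.3, matching the intervals above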

Tukey HSD assumes equal variance across conditions, but Dunnett’s T3 test does not.

# Dunnett T3 Test (unequal variances) -------
library(PMCMRplus)
dunnettT3Test(score~group,data=L4.df01) # doesn't assume equal variance
##    g1   g2   g3  
## g2 1.00 -    -   
## g3 0.38 0.58 -   
## g4 0.23 0.39 1.00

Dunnett’s test is useful when you are comparing several groups to a single control group.

## Dunnett pairwise tests (to control group) ----------
# install.packages("DescTools")
library(DescTools)
levels(L4.df01$group)
## [1] "g1" "g2" "g3" "g4"
DunnettTest(score~group,data=L4.df01,control="g4")
## 
##   Dunnett's test for comparing several treatments with a control :  
##     95% family-wise confidence level
## 
## $g4
##          diff lwr.ci upr.ci   pval    
## g1-g4 -12.930 -27.78  1.916 0.1000 .  
## g2-g4 -11.079 -25.92  3.767 0.1829    
## g3-g4  -1.839 -16.68 13.007 0.9814    
## 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
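
The same comparisons with a control group can be obtained with emmeans via the trt.vs.ctrl contrast method (a sketch; by default the \(p\) values receive a Dunnett-style adjustment):

# library(emmeans)
contrast(lab4.emm,method="trt.vs.ctrl",ref="g4") # compare g1, g2, g3 with control g4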

2.2 Linear Contrasts

In this section we illustrate how to use linear contrasts/comparisons to analyze the data in L4.df01. First we create three contrasts, demonstrate that they are a complete orthogonal set, and assign them to our factor condition.

# create contrasts ---------
c1 <- c(-1,-1,1,1) 
c2 <- c(-1,1,0,0)
c3 <- c(0,0,-1,1)

# store 3 contrasts in one matrix:
( cMat <- cbind(c1,c2,c3) )
##      c1 c2 c3
## [1,] -1 -1  0
## [2,] -1  1  0
## [3,]  1  0 -1
## [4,]  1  0  1

# make sure contrast weights sum to zero
apply(cMat,MARGIN=2,FUN=sum) # note MARGIN & FUN are capitalized
## c1 c2 c3 
##  0  0  0

# 4 levels; 3 degrees of freedom
levels(L4.df01$condition) 
## [1] "c1" "c2" "c3" "c4"

# complete orthogonal set has 3 mutually-orthogonal contrasts
sum(c1*c2) # orthogonal
## [1] 0
sum(c1*c3) # orthogonal
## [1] 0
sum(c2*c3) # orthogonal
## [1] 0
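
# equivalent one-line check: crossprod(cMat) = t(cMat) %*% cMat
# is diagonal for a mutually orthogonal set of contrasts
crossprod(cMat)
##    c1 c2 c3
## c1  4  0  0
## c2  0  2  0
## c3  0  0  2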

# finally, assign contrasts to our factor
( contrasts(L4.df01$condition) <- cMat ) # assign contrasts to our factor
##      c1 c2 c3
## [1,] -1 -1  0
## [2,] -1  1  0
## [3,]  1  0 -1
## [4,]  1  0  1

Next, we incorporate the contrasts into our analysis of variance using aov and split. Notice that the sum of the SS values for the three contrasts equals the SS for condition (within rounding error), and that the average \(F\) for the three contrasts equals the omnibus \(F\).

L4.aov.02 <- aov(score~condition,data=L4.df01)
options(digits=8)
summary(L4.aov.02,split=list(condition=list(c1=1,c2=2,c3=3)))
##                 Df Sum Sq Mean Sq F value  Pr(>F)  
## condition        3 1515.5  505.15  2.2611 0.09456 .
##   condition: c1  1 1474.6 1474.60  6.6006 0.01366 *
##   condition: c2  1   20.6   20.56  0.0920 0.76303  
##   condition: c3  1   20.3   20.29  0.0908 0.76457  
## Residuals       44 9829.8  223.40                  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
1474.6+20.6+20.3 # SS values
## [1] 1515.5
(6.6006+0.092+0.0908)/3 # F values
## [1] 2.2611333
options(digits=4)

The next chunk of code shows how to use emmeans to perform the same three contrasts.


## with emmeans -------
library(emmeans)
myContrasts <- list(c1,c2,c3)
con.res <- contrast(lab4.emm,method=myContrasts)
summary(con.res,adjust="none")

Finally, we can use linear.comparison to evaluate the contrasts.

# download linear.comparison function:
source(url("http://pnb.mcmaster.ca/bennett/psy710/Rscripts/linear_contrast_v2.R"))
## [1] "loading function linear.comparison"
myC.res <- linear.comparison(y=L4.df01$score,
                             group=L4.df01$group,
                             c.weights=list(c1,c2,c3),
                             var.equal=T)
## [1] "computing linear comparisons assuming equal variances among groups"
## [1] "C 1: F=6.601, t=2.569, p=0.014, psi=22.171, CI=(4.779,39.562), adj.CI= (0.692,43.649)"
## [1] "C 2: F=0.092, t=0.303, p=0.763, psi=1.851, CI=(-10.447,14.149), adj.CI= (-13.336,17.039)"
## [1] "C 3: F=0.091, t=0.301, p=0.765, psi=1.839, CI=(-10.459,14.137), adj.CI= (-13.349,17.026)"
  1. What null hypothesis is evaluated by each contrast?
# ANSWER:
c1 <- c(-1,-1,1,1) # mean(groups 1 & 2) = mean (groups 3 & 4)
c2 <- c(-1,1,0,0) # mean(group 1) = mean(group 2)
c3 <- c(0,0,-1,1) # mean(group 3) = mean(group 4)

2.3 Trend Analysis

Trend analysis uses linear contrasts to decompose the observed effects into orthogonal linear and nonlinear trends across the levels of our grouping variable. Trend analysis can only be applied in cases where the groups are differentiated on a factor that has ordered levels and the differences between levels are meaningful (e.g., grade, age, stimulus duration, etc.). The following code creates an ordered factor, age, that defines four equally-spaced levels on a grouping variable. We then use trend analysis to evaluate the linear and nonlinear trends of score across the levels of age.

a <- 4 # number of groups
n <- 12 # n per group
L4.df01$age <- factor(x=rep(1:a,each=n),
                      labels=c("age6","age8","age10","age12"),
                      ordered=TRUE)
class(L4.df01$age) # type of variable
## [1] "ordered" "factor"
levels(L4.df01$age) # levels
## [1] "age6"  "age8"  "age10" "age12"
# We set up R to use the following coding schemes:
# options(contrasts=c("contr.sum","contr.poly"))
# so our ordered factor already uses polynomial contrasts
contrasts(L4.df01$age) # contrasts for Linear, Quadratic, & Cubic trends
##           .L   .Q      .C
## [1,] -0.6708  0.5 -0.2236
## [2,] -0.2236 -0.5  0.6708
## [3,]  0.2236 -0.5 -0.6708
## [4,]  0.6708  0.5  0.2236

# do anova:
L4.age.aov <- aov(score~age,data=L4.df01)
summary(L4.age.aov,split=list(age=list(Lin=1,Quad=2,Cubic=3)))
##              Df Sum Sq Mean Sq F value Pr(>F)  
## age           3   1515     505    2.26  0.095 .
##   age: Lin    1   1384    1384    6.20  0.017 *
##   age: Quad   1      0       0    0.00  0.999  
##   age: Cubic  1    131     131    0.59  0.447  
## Residuals    44   9830     223                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
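
As a check, the SS for a single contrast can be computed from its estimate: \(SS_{\psi} = n \hat{\psi}^2 / \sum_j w_j^2\). A minimal sketch using the linear trend weights:

w.lin <- contrasts(L4.df01$age)[,".L"] # linear trend weights
ybar <- with(L4.df01,tapply(score,age,mean)) # group means
psi.hat <- sum(w.lin*ybar) # contrast estimate
n*psi.hat^2/sum(w.lin^2) # should match the Lin Sum Sq (about 1384) above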

Although R can be set up to automatically use polynomial contrasts on ordered factors, sometimes it is necessary to assign the weights ourselves. For example, we may have an ordered factor with levels that are not separated by equal intervals. In that case, the automatic trend weights assigned by R are not correct. The following code shows how to use contr.poly to calculate the correct weights and assign them to our factor.

a <- 4 # number of groups
n <- 12 # n per group
# unequal steps in our age factor:
L4.df01$age2 <- factor(x=rep(1:a,each=n),
                      labels=c("age6","age8","age12","age18"),
                      ordered=TRUE)
levels(L4.df01$age2)
## [1] "age6"  "age8"  "age12" "age18"
contrasts(L4.df01$age2) # these weights assume equal intervals!
##           .L   .Q      .C
## [1,] -0.6708  0.5 -0.2236
## [2,] -0.2236 -0.5  0.6708
## [3,]  0.2236 -0.5 -0.6708
## [4,]  0.6708  0.5  0.2236
# these are the correct weights:
(newWeights <- contr.poly(n=4,scores=c(6,8,12,18)))
##           .L      .Q       .C
## [1,] -0.5455  0.5128 -0.43519
## [2,] -0.3273 -0.1709  0.78335
## [3,]  0.1091 -0.7407 -0.43519
## [4,]  0.7638  0.3989  0.08704
contrasts(L4.df01$age2) <- newWeights # assign to our factor
# do anova:
L4.age2.aov <- aov(score~age2,data=L4.df01)
summary(L4.age2.aov,split=list(age2=list(Lin=1,Quad=2,Cubic=3)))
##               Df Sum Sq Mean Sq F value Pr(>F)  
## age2           3   1515     505    2.26  0.095 .
##   age2: Lin    1   1318    1318    5.90  0.019 *
##   age2: Quad   1    137     137    0.61  0.438  
##   age2: Cubic  1     61      61    0.27  0.604  
## Residuals     44   9830     223                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Note that you can use contr.poly to assign trend weights to a regular, non-ordered factor.

class(L4.df01$group)
## [1] "factor"
contrasts(L4.df01$group) <- newWeights # assign to our factor
L4.group.aov <- aov(score~group,data=L4.df01)
summary(L4.group.aov,split=list(group=list(Lin=1,Quad=2,Cubic=3)))
##                Df Sum Sq Mean Sq F value Pr(>F)  
## group           3   1515     505    2.26  0.095 .
##   group: Lin    1   1318    1318    5.90  0.019 *
##   group: Quad   1    137     137    0.61  0.438  
##   group: Cubic  1     61      61    0.27  0.604  
## Residuals      44   9830     223                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Finally, trend analysis can be performed with emmeans. Note that there is no need to assign trend weights to the grouping variable. Instead, the weights are specified by using the poly method in the contrast command. The poly method assumes the levels of the grouping variable are equally spaced.

class(L4.df01$condition)
## [1] "factor"
L4.condition.aov <- aov(score~condition,data=L4.df01)
# library(emmeans)
L4.condition.emm <- emmeans(L4.condition.aov,specs="condition")
contrast(L4.condition.emm,method="poly")
##  contrast  estimate    SE df t.ratio p.value
##  linear       48.03 19.30 44   2.489  0.0167
##  quadratic    -0.01  8.63 44  -0.001  0.9989
##  cubic       -14.79 19.30 44  -0.766  0.4475

When the levels are not equally spaced, then you must provide the weights yourself.

L4.age2.emm <- emmeans(L4.age2.aov,specs="age2")
# newWeights <- contr.poly(n=4,scores=c(6,8,12,18))
wLin <- newWeights[,".L"] # 1st column
wQuad <- newWeights[,".Q"] # 2nd column
wCubic <- newWeights[,".C"] # 3rd column
myTrends <- list("lin"=wLin,"quad"=wQuad,"cub"=wCubic)
contrast(L4.age2.emm,method=myTrends)
##  contrast estimate   SE df t.ratio p.value
##  lin         10.48 4.32 44   2.429  0.0193
##  quad        -3.38 4.32 44  -0.782  0.4383
##  cub         -2.25 4.32 44  -0.522  0.6044

2.4 Multiple Comparisons

This section illustrates how to adjust the \(p\) values for the three contrasts from Section 2.2 with the Holm and FDR procedures. First we use emmeans:

# emmeans -------------
# library(emmeans)
# myContrasts <- list(c1,c2,c3)
# con.res <- contrast(lab4.emm,method=myContrasts)
summary(con.res,adjust="holm") # holm correction
summary(con.res,adjust="fdr") # false discovery rate

Next we use p.adjust:

## p.adjust ---------
summary(L4.aov.02,split=list(condition=list(c1=1,c2=2,c3=3)))
##                 Df Sum Sq Mean Sq F value Pr(>F)  
## condition        3   1515     505    2.26  0.095 .
##   condition: c1  1   1475    1475    6.60  0.014 *
##   condition: c2  1     21      21    0.09  0.763  
##   condition: c3  1     20      20    0.09  0.765  
## Residuals       44   9830     223                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# p values taken from ANOVA table:
p.adjust(c(0.0137,0.7630,0.7646),"holm")
## [1] 0.0411 1.0000 1.0000
p.adjust(c(0.0137,0.7630,0.7646),"fdr")
## [1] 0.0411 0.7646 0.7646
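
Holm's method multiplies the ordered \(p\) values by \(m, m-1, \ldots, 1\) and then enforces monotonicity. A quick check of the adjusted values above:

p <- c(0.0137,0.7630,0.7646)
pmin(1,cummax(sort(p)*(3:1))) # reproduces p.adjust(p,"holm")
## [1] 0.0411 1.0000 1.0000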

Finally, we use the Scheffe method to adjust \(p\) values for post-hoc tests. Note that scheffe.rank must equal the degrees-of-freedom for our grouping factor, regardless of the number of contrasts that we analyze. Also note that you need to use the summary method for the Scheffe test to work properly: including scheffe.rank directly in the contrast command yields an incorrect result.

## scheffe ----------
c1 <- c(-1,-1,1,1)
# next command yields INCORRECT answer (p=0.0137)!
contrast(lab4.emm,method=list(c1),adjust="scheffe",scheffe.rank=3)
##  contrast        estimate   SE df t.ratio p.value
##  c(-1, -1, 1, 1)     22.2 8.63 44   2.569  0.0137
## 
## P value adjustment: scheffe method with rank 1
# incorrect Scheffe.rank!!

# this is how you get the correct answer:
c1.post.hoc <- contrast(lab4.emm,method=list(c1),adjust="scheffe",scheffe.rank=3)
summary(c1.post.hoc,adjust="scheffe",scheffe.rank=3) # p=0.1015  is correct

# alternative method:
# install.packages("DescTools")
library(DescTools)
ScheffeTest(L4.aov.02,which="condition",contrasts=c1)
## 
##   Posthoc multiple comparisons of means: Scheffe Test 
##     95% family-wise confidence level
## 
## $condition
##              diff lwr.ci upr.ci   pval    
## c3,c4-c1,c2 22.17 -2.914  47.25 0.1015    
## 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
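
The Scheffe-adjusted \(p\) value can also be computed by hand: divide the contrast's \(F\) by scheffe.rank and refer the result to an \(F\) distribution with (scheffe.rank, residual df) degrees of freedom:

F.c1 <- 6.6006 # F for contrast c1 from the ANOVA table above
pf(F.c1/3,df1=3,df2=44,lower.tail=FALSE) # about 0.1015, as above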

2.5 Effect Size & Association Strength

Effect size for linear contrasts can be expressed using Cohen's d, which divides the value of \(\Psi\) by an estimate of the standard deviation of the within-group error. The next chunk of code calculates Cohen's \(d\) with eff_size in the emmeans package. Note that emmeans has a bug: specifically, it does not scale the contrast weights correctly. Instead, it assumes that the positive and negative weights within a contrast sum to 1 and -1, respectively. In the following code, c1 and c4 are equivalent contrasts and should lead to the same effect size, but emmeans says that the effect size for c4 is twice the effect size for c1.

# library(emmeans)
c1 <- c(-1/2,-1/2,1/2,1/2)
c2 <- c(-1,1,0,0)
c3 <- c(0,0,-1,1)
(c4 <- 2*c1)
## [1] -1 -1  1  1
myContrasts <- list(c1,c2,c3,c4)
eff_size(lab4.emm,
         sigma=sigma(L4.aov.01),
         edf=df.residual(L4.aov.01),
         method=myContrasts)
##  contrast                effect.size    SE df lower.CL upper.CL
##  c(-0.5, -0.5, 0.5, 0.5)       0.742 0.299 44    0.138    1.345
##  c(-1, 1, 0, 0)                0.124 0.408 44   -0.699    0.947
##  c(0, 0, -1, 1)                0.123 0.408 44   -0.700    0.946
##  c(-1, -1, 1, 1)               1.483 0.599 44    0.277    2.690
## 
## sigma used for effect sizes: 14.95 
## Confidence level used: 0.95

Next we calculate effect size using linear.comparison. It scales the contrast weights correctly so the estimated effect size is the same for c1 and c4.

# source(url("http://pnb.mcmaster.ca/bennett/psy710/Rscripts/linear_contrast_v2.R"))
# closeAllConnections() # close connection

myC.res <- linear.comparison(y=L4.df01$score,
                             group=L4.df01$group,
                             c.weights=list(c1,c2,c3,c4),
                             var.equal=T)
## [1] "computing linear comparisons assuming equal variances among groups"
## [1] "C 1: F=6.601, t=2.569, p=0.014, psi=11.085, CI=(2.389,19.781), adj.CI= (-0.153,22.323)"
## [1] "C 2: F=0.092, t=0.303, p=0.763, psi=1.851, CI=(-10.447,14.149), adj.CI= (-14.042,17.744)"
## [1] "C 3: F=0.091, t=0.301, p=0.765, psi=1.839, CI=(-10.459,14.137), adj.CI= (-14.054,17.732)"
## [1] "C 4: F=6.601, t=2.569, p=0.014, psi=22.171, CI=(4.779,39.562), adj.CI= (-0.305,44.647)"
class(myC.res) # this is a list
## [1] "list"
length(myC.res) # has 4 parts; 1 for each contrast
## [1] 4
# str(myC.res[[1]]) # each part has LOTS of components
c(myC.res[[1]]$d.effect.size,
  myC.res[[2]]$d.effect.size,
  myC.res[[3]]$d.effect.size,
  myC.res[[4]]$d.effect.size)
## [1] 0.7417 0.1239 0.1230 0.7417
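
As a check, Cohen's \(d\) for the first contrast is simply \(\hat{\psi}\) divided by the residual standard deviation:

11.085/sigma(L4.aov.01) # psi for c1 (from the output above) / sigma; about 0.742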

Finally, we can use linear.comparison to calculate various measures of association strength. Measures of association strength also can be calculated directly from values in the ANOVA table.

# Three measures of association strength for the first contrast
myC.res[[1]]$contrast
## [1] -0.5 -0.5  0.5  0.5
c(myC.res[[1]]$R2.alerting,
  myC.res[[1]]$R2.effect.size,
  myC.res[[1]]$R2.contrast)
## [1] 0.9730 0.1300 0.1304

summary(L4.aov.02,split=list(condition=list(c1=1,c2=2,c3=3)))
##                 Df Sum Sq Mean Sq F value Pr(>F)  
## condition        3   1515     505    2.26  0.095 .
##   condition: c1  1   1475    1475    6.60  0.014 *
##   condition: c2  1     21      21    0.09  0.763  
##   condition: c3  1     20      20    0.09  0.765  
## Residuals       44   9830     223                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# values taken from ANOVA table
(c1.R2.alerting = 1475/1515) # proportion of between-group SS
## [1] 0.9736
(c1.R2.effectsize = 1475/(1515+9830) ) # proportion of SS-total
## [1] 0.13
(c1.R2.contrast = 1475/(1475+9830) ) # proportion of (SS-contrast + SS-residual)
## [1] 0.1305
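
R2.contrast also can be computed directly from the contrast's \(F\) statistic, because \(R^2_{contrast} = F/(F + df_{residual})\):

6.6006/(6.6006+44) # F/(F + residual df); about 0.1304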

3 Divided Attention Experiment

An experiment measured the effect of dividing attention on reaction times in a visual detection task. Performance was measured in five groups of 12 participants while they simultaneously completed an auditory detection task. The difficulty of the auditory detection task was varied across the five groups by presenting the auditory target (i.e., a pure tone) in higher levels of background noise. The experimenters hypothesized that performance in the visual task would vary non-monotonically with the level of background noise. In other words, the prediction is that visual performance will be best at an intermediate level of noise. The data are stored in the data frame visRT, which contains the numeric variable, RT, and the ordered factor, noise.

sapply(visRT,class)
## $noise
## [1] "ordered" "factor" 
## 
## $RT
## [1] "numeric"
summary(visRT)
##  noise         RT      
##  n1:12   Min.   :-162  
##  n2:12   1st Qu.: 261  
##  n3:12   Median : 380  
##  n4:12   Mean   : 423  
##  n5:12   3rd Qu.: 597  
##          Max.   :1161

Here is a plot of the data:

y.bar <- with(visRT,tapply(RT,noise,mean))
y.sem <- sqrt( with(visRT,tapply(RT,noise,var))/with(visRT,tapply(RT,noise,length)) )
plot(x=1:5,y=y.bar,
     type="b",
     ylim=c(200,800),
     ylab="RT",
     xlab="Noise")
segments(x0=1:5,x1=1:5,y0=y.bar-y.sem,y1=y.bar+y.sem)

  1. Use ANOVA to evaluate the differences among the means. Is the difference among group means statistically significant?
rt.aov.01 <- aov(RT~noise,data=visRT)
summary(rt.aov.01)
##             Df  Sum Sq Mean Sq F value Pr(>F)  
## noise        4 1026362  256590    3.41  0.015 *
## Residuals   55 4136322   75206                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## The effect of noise is statistically significant, so we reject the null hypothesis of no difference among the group means.
  1. Explain why the ANOVA in the previous question is or is not an appropriate test of the experimenters’ hypothesis about the effect of noise level on visual performance.
## The omnibus F test is not a fair test of the original hypothesis because it looks for ANY difference among means, whereas the experimenters were looking for a specific pattern of differences among means.
  1. What type of variable is noise? What contrasts does R use for that variable?
sapply(visRT,class) # noise is an ordered factor
## $noise
## [1] "ordered" "factor" 
## 
## $RT
## [1] "numeric"
contrasts(visRT$noise) # R is using polynomial contrasts
##              .L      .Q         .C      ^4
## [1,] -6.325e-01  0.5345 -3.162e-01  0.1195
## [2,] -3.162e-01 -0.2673  6.325e-01 -0.4781
## [3,] -3.288e-17 -0.5345  9.637e-17  0.7171
## [4,]  3.162e-01 -0.2673 -6.325e-01 -0.4781
## [5,]  6.325e-01  0.5345  3.162e-01  0.1195
cat("We initialized R so that it uses sum-to-zero coding for plain factors and polynomial contrasts for ordered factors.")
## We initialized R so that it uses sum-to-zero coding for plain factors and polynomial contrasts for ordered factors.
  1. Evaluate the linear and quadratic trends of RT across the levels of noise. (For this analysis, you may assume that the trend analysis was planned, and that the successive levels of noise differ by a constant amount.)
library(emmeans)
rt.em <- emmeans(rt.aov.01,specs="noise")
contrast(rt.em,method="poly",adjust="none")
##  contrast  estimate  SE df t.ratio p.value
##  linear       649.8 250 55   2.596  0.0121
##  quadratic    773.7 296 55   2.612  0.0116
##  cubic        -72.9 250 55  -0.291  0.7719
##  quartic      -30.1 662 55  -0.045  0.9639
summary(rt.aov.01,split=list(noise=list(L=1,Q=2,others=3:4)))
##                 Df  Sum Sq Mean Sq F value Pr(>F)  
## noise            4 1026362  256590    3.41  0.015 *
##   noise: L       1  506726  506726    6.74  0.012 *
##   noise: Q       1  513095  513095    6.82  0.012 *
##   noise: others  2    6540    3270    0.04  0.957  
## Residuals       55 4136322   75206                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## The linear and quadratic trends are statistically significant. The quadratic trend is the most important one, because the hypothesis that performance is best at an intermediate level of noise predicts a quadratic trend. Note that we should plot the data (and/or check the value of the quadratic contrast) to make sure the quadratic trend has the correct sign (i.e., RT is at a minimum, not a maximum, at an intermediate level of noise).
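# quick sign check: with these weights, a positive quadratic estimate means RT
# is elevated at the extreme noise levels (i.e., fastest at intermediate noise)
w.quad <- contrasts(visRT$noise)[,".Q"] # quadratic trend weights
sum(w.quad*with(visRT,tapply(RT,noise,mean))) # positive, as predicted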
  1. Now perform the analysis under the assumption that we decided to analyze the trends only after looking at the data. Explain how the results differ from the ones obtained in the previous question.
cat("We will use a Scheffe test. Here we adjust the p values.")
## We will use a Scheffe test. Here we adjust the p values.
library(emmeans)
rt.em <- emmeans(rt.aov.01,specs="noise") # create emmeans object
con.res <- contrast(rt.em,method="poly")
summary(con.res,adjust="scheffe",scheffe.rank=4)
cat("None of the p values are < 0.05, so none of the trends are significant.")
## None of the p values are < 0.05, so none of the trends are significant.
cat("Here we adjust the critical value of F.")
## Here we adjust the critical value of F.
summary(rt.aov.01,split=list(noise=list(L=1,Q=2,others=3:4)))
##                 Df  Sum Sq Mean Sq F value Pr(>F)  
## noise            4 1026362  256590    3.41  0.015 *
##   noise: L       1  506726  506726    6.74  0.012 *
##   noise: Q       1  513095  513095    6.82  0.012 *
##   noise: others  2    6540    3270    0.04  0.957  
## Residuals       55 4136322   75206                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
df.noise <- 4
alpha <- .05
( F.alpha <- qf(1-alpha,df1=4,df2=55) ) # critical F value (p<.05)
## [1] 2.54
( F.scheffe <- df.noise*F.alpha ) # critical F for Scheffe test
## [1] 10.16
cat("None F tests for our trends exceeds F.scheffe, so none of the trends are significant.")
## None F tests for our trends exceeds F.scheffe, so none of the trends are significant.
## The emmeans test adjusts the p values: none of the adjusted p values are less than 0.05. In the second test we compare the F values for the contrasts to the critical Scheffe F. The observed Fs are less than the Scheffe F (10.16), and therefore the contrasts are not significant. Hence, both tests find that the linear and quadratic trends are not statistically significant.
  1. Create a set of contrast weights to evaluate the hypothesis that mean RT in noise levels 1-4 equals the mean RT in noise level 5. Perform that linear contrast.
# here we use emmeans:
noiseContrast <- c(1/4,1/4,1/4,1/4,-1) # weights
con2 <- contrast(rt.em,method=list(c1=noiseContrast))
summary(con2)
# any multiple of noiseContrast yields the same t and p values (only the estimate is rescaled)
nc3 <- 3 * noiseContrast
con3 <- contrast(rt.em,method=list(c1=nc3))
summary(con3)
# here we use aov/split:
noiseContrast <- c(1/4,1/4,1/4,1/4,-1) # weights
rt.df <- visRT # make copy of data
contrasts(rt.df$noise) <- c(1/4,1/4,1/4,1/4,-1) # assign contrast to noise
aov.rt.02 <- aov(RT~noise,data=rt.df)
# list anova table [only care about 1st contrast]
summary(aov.rt.02,split=list(noise=list(c1=1,others=2:4)))
##                 Df  Sum Sq Mean Sq F value Pr(>F)   
## noise            4 1026362  256590    3.41 0.0146 * 
##   noise: c1      1  812721  812721   10.81 0.0018 **
##   noise: others  3  213641   71214    0.95 0.4244   
## Residuals       55 4136322   75206                  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
  1. Now perform the contrast assuming it was done after looking at the data.
summary(con2,adjust="scheffe",scheffe.rank=4)
alpha <- .05
a <- 5 # 5 groups
n <- 12 # n per group
N <- a*n # total N
(F.scheffe <- (a-1)*qf(1-alpha,df1=a-1,df2=N-a))
## [1] 10.16
F.observed <- 10.807
F.observed > F.scheffe # F is significant
## [1] TRUE
cat("The contrast's F (10.81) is greater than the critical Scheffe F (10.16), so our contrast is significant.")
## The contrast's F (10.81) is greater than the critical Scheffe F (10.16), so our contrast is significant.