Initialize R with the following commands. The load
command loads three data frames – L4.df01
,
visRT
, and blood
– though we will not use
blood
in this lab.
options(digits=4,contrasts=c("contr.sum","contr.poly")) # IMPORTANT! [Use sum-to-zero constraint for effects]
load(file=url("http://pnb.mcmaster.ca/bennett/psy710/datasets/rlab4-data-2023.rda"))
In this section we will analyze data in the L4.df01
data
frame, which contains the data from an experiment that measured scores
in for groups of subjects. There are 12 observations per group.
with(L4.df01,tapply(score,group,length))
## 1 2 3 4
## 12 12 12 12
summary(L4.df01)
## group score
## Min. :1.00 Min. : 64.1
## 1st Qu.:1.75 1st Qu.: 88.8
## Median :2.50 Median :100.2
## Mean :2.50 Mean :101.1
## 3rd Qu.:3.25 3rd Qu.:112.8
## Max. :4.00 Max. :132.4
group
and score
in
L4.df01
to evaluate the null hypothesis that the four group
means are equal. Based on the ANOVA table, what is your decision
regarding the null hypothesis?aov.01 <- aov(score~group,data=L4.df01)
summary(aov.01)
## Df Sum Sq Mean Sq F value Pr(>F)
## group 1 1384 1384 6.39 0.015 *
## Residuals 46 9961 217
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# I reject the null hypothesis that the means are equal (or, equivalent that all of the effects are zero)
# Note the degrees of freedom for group is 1, which is NOT correct.
# There are 4 groups so there should be 3 degrees of freedom.
# R has treated group as a numeric variable, not a factor
sapply(L4.df01,class)
## group score
## "numeric" "numeric"
The following code i) transforms group
into a factor;
and creates another factor condition
and adds it to our
data frame.
L4.df01$group <- factor(x=L4.df01$group,labels="g")
n <- 12
a <- 4
gnumber <- rep(x=1:a,each=n)
L4.df01$condition <- factor(x=gnumber,labels="c",ordered=FALSE)
class(L4.df01$condition)
## [1] "factor"
levels(L4.df01$condition)
## [1] "c1" "c2" "c3" "c4"
summary(L4.df01)
## group score condition
## g1:12 Min. : 64.1 c1:12
## g2:12 1st Qu.: 88.8 c2:12
## g3:12 Median :100.2 c3:12
## g4:12 Mean :101.1 c4:12
## 3rd Qu.:112.8
## Max. :132.4
score
for
each condition
.boxplot(score~condition,data=L4.df01,main="data frame L4.df01",cex.axis=1.25,cex.lab=1.5)
aov
to evaluate the null hypothesis that all of
the group means are equal. How does the result differ from the previous
ANOVA?L4.aov.01 <- aov(score~group,data=L4.df01)
# L4.aov.02 <- aov(score~condition,data=L4.df01) # using condition gives same result
# anova(L4.aov.01)
summary(L4.aov.01)
## Df Sum Sq Mean Sq F value Pr(>F)
## group 3 1515 505 2.26 0.095 .
## Residuals 44 9830 223
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# Unlike before, I do not reject the null hypothesis that all of the group means are equal.
# I do not reject the null hypothesis that the residuals are distributed normally
shapiro.test(residuals(L4.aov.01))
##
## Shapiro-Wilk normality test
##
## data: residuals(L4.aov.01)
## W = 0.98, p-value = 0.8
qqnorm(residuals(L4.aov.01),main="L4.aov.01");qqline(residuals(L4.aov.01))
oneway.test(score~condition,data=L4.df01)
##
## One-way analysis of means (not assuming equal variances)
##
## data: score and condition
## F = 2.1, num df = 3, denom df = 24, p-value = 0.1
# I do not reject the null hypothesis of equal means.
The following code shows how to perform all pairwise tests and adjust
the \(p\) values using the Tukey method
of adjustment. The confidence intervals listed by TukeyHSD
are adjusted, or simultaneous, confidence intervals. An individual 95%
CI includes the true population value 95% of the time (in the long run).
On the other hand, all of the adjusted 95% CIs will
(in the long run) contain the true population value. In other words, if
we repeated this experiment many times, all of the adjusted CIs would
contain their respective true values 95% of the time.
# Tukey HSD -----------
TukeyHSD(L4.aov.01,"group") # assumes equal variance
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = score ~ group, data = L4.df01)
##
## $group
## diff lwr upr p adj
## g2-g1 1.851 -14.441 18.14 0.9902
## g3-g1 11.091 -5.201 27.38 0.2789
## g4-g1 12.930 -3.362 29.22 0.1630
## g3-g2 9.240 -7.052 25.53 0.4378
## g4-g2 11.079 -5.213 27.37 0.2798
## g4-g3 1.839 -14.453 18.13 0.9904
# using emmeans -------
library(emmeans)
lab4.emm <- emmeans(L4.aov.01,specs="group")
contrast(lab4.emm,method="pairwise",adjust="tukey")
## contrast estimate SE df t.ratio p.value
## g1 - g2 -1.85 6.1 44 -0.303 0.9902
## g1 - g3 -11.09 6.1 44 -1.818 0.2789
## g1 - g4 -12.93 6.1 44 -2.119 0.1630
## g2 - g3 -9.24 6.1 44 -1.514 0.4378
## g2 - g4 -11.08 6.1 44 -1.816 0.2798
## g3 - g4 -1.84 6.1 44 -0.301 0.9904
##
## P value adjustment: tukey method for comparing a family of 4 estimates
Tukey HSD assumes equal variance across conditions, but Dunnett’s T3 test does not.
# Dunnett T3 Test (unequal variances) -------
library(PMCMRplus)
dunnettT3Test(score~group,data=L4.df01) # doesn't assume equal variance
## g1 g2 g3
## g2 1.00 - -
## g3 0.38 0.58 -
## g4 0.23 0.39 1.00
Dunnett’s test is useful when you are comparing several groups to a single control group.
## Dunnett pairwise tests (to control group) ----------
# install.packages("DescTools")
library(DescTools)
levels(L4.df01$group)
## [1] "g1" "g2" "g3" "g4"
DunnettTest(score~group,data=L4.df01,control="g4")
##
## Dunnett's test for comparing several treatments with a control :
## 95% family-wise confidence level
##
## $g4
## diff lwr.ci upr.ci pval
## g1-g4 -12.930 -27.78 1.916 0.1000 .
## g2-g4 -11.079 -25.92 3.767 0.1829
## g3-g4 -1.839 -16.68 13.007 0.9814
##
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
In this section we illustrate how to use linear contrasts/comparisons
to analyze the data in L4.df01
. First we create three
contrasts, demonstrate that they are a complete orthogonal
set, and assign them to our factor condition
.
# create contrasts ---------
c1 <- c(-1,-1,1,1)
c2 <- c(-1,1,0,0)
c3 <- c(0,0,-1,1)
# store 3 contrasts in one matrix:
( cMat <- cbind(c1,c2,c3) )
## c1 c2 c3
## [1,] -1 -1 0
## [2,] -1 1 0
## [3,] 1 0 -1
## [4,] 1 0 1
# make sure contrast weights sum to zero
apply(cMat,MARGIN=2,FUN=sum) # note MARGIN & FUN are capitalized
## c1 c2 c3
## 0 0 0
# 4 levels; 3 degrees of freedom
levels(L4.df01$condition)
## [1] "c1" "c2" "c3" "c4"
# complete orthogonal set has 3 mutually-ortogonal contrasts
sum(c1*c2) # orthogonal
## [1] 0
sum(c1*c3) # orthogonal
## [1] 0
sum(c2*c3) # orthogonal
## [1] 0
# finally, assign contrasts to our factor
( contrasts(L4.df01$condition) <- cMat ) # assign contrasts to our factor
## c1 c2 c3
## [1,] -1 -1 0
## [2,] -1 1 0
## [3,] 1 0 -1
## [4,] 1 0 1
Next, we incorporate the contrasts into our analysis of variance
using aov
and split
. Notice that the sum of
the SS values for the three contrasts equals the SS for condition
(within rounding error), and that average \(F\) for the three contrasts equals the
omnibus \(F\).
L4.aov.02 <- aov(score~condition,data=L4.df01)
options(digits=8)
summary(L4.aov.02,split=list(condition=list(c1=1,c2=2,c3=3)))
## Df Sum Sq Mean Sq F value Pr(>F)
## condition 3 1515.5 505.15 2.2611 0.09456 .
## condition: c1 1 1474.6 1474.60 6.6006 0.01366 *
## condition: c2 1 20.6 20.56 0.0920 0.76303
## condition: c3 1 20.3 20.29 0.0908 0.76457
## Residuals 44 9829.8 223.40
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
1474.6+20.6+20.3 # SS values
## [1] 1515.5
(6.6006+0.092+0.0908)/3 # F values
## [1] 2.2611333
options(digits=4)
The next chunk of code shows how to use emmeans
to
perform the same three contrasts.
## with emmeans -------
library(emmeans)
myContrasts <- list(c1,c2,c3)
con.res <- contrast(lab4.emm,method=myContrasts)
summary(con.res,adjust="none")
Finally, we can use linear.comparison
to evaluate the
contrasts.
# download linear.comparison function:
source(url("http://pnb.mcmaster.ca/bennett/psy710/Rscripts/linear_contrast_v2.R"))
## [1] "loading function linear.comparison"
myC.res <- linear.comparison(y=L4.df01$score,
group=L4.df01$group,
c.weights=list(c1,c2,c3),
var.equal=T)
## [1] "computing linear comparisons assuming equal variances among groups"
## [1] "C 1: F=6.601, t=2.569, p=0.014, psi=22.171, CI=(4.779,39.562), adj.CI= (0.692,43.649)"
## [1] "C 2: F=0.092, t=0.303, p=0.763, psi=1.851, CI=(-10.447,14.149), adj.CI= (-13.336,17.039)"
## [1] "C 3: F=0.091, t=0.301, p=0.765, psi=1.839, CI=(-10.459,14.137), adj.CI= (-13.349,17.026)"
# ANSWER:
c1 <- c(-1,-1,1,1) # mean(groups 1 & 2) = mean (groups 3 & 4)
c2 <- c(-1,1,0,0) # mean(group 1) = mean(group 2)
c3 <- c(0,0,-1,1) # mean(group 3) = mean(group 4)
Trend analysis uses linear contrasts to decompose the observed
effects into orthogonal linear and nonlinear trends across the levels of
our grouping variable. Trend analysis can only be applied in cases where
the groups are differentiated on a factor that has ordered levels and
the differences between levels are meaningful (e.g., grade, age,
stimulus duration, etc.). The following code creates an ordered
factor, age
, that defines four equally-spaced
levels on a grouping variable. We then use trend analysis to evaluate
the linear and nonlinear trends of score
across the levels
of age
.
a <- 4 # number of gruops
n <- 12 # n per group
L4.df01$age <- factor(x=rep(1:a,each=n),
labels=c("age6","age8","age10","age12"),
ordered=TRUE)
class(L4.df01$age) # type of variable
## [1] "ordered" "factor"
levels(L4.df01$age) # levels
## [1] "age6" "age8" "age10" "age12"
# We set-up R to use the following coding schemes:
# options(contrasts=c("contr.sum","contr.poly"))
# so our ordered factor already uses polynomial contrasts
contrasts(L4.df01$age) # contrasts for Linear, Quadratic, & Cubic trends
## .L .Q .C
## [1,] -0.6708 0.5 -0.2236
## [2,] -0.2236 -0.5 0.6708
## [3,] 0.2236 -0.5 -0.6708
## [4,] 0.6708 0.5 0.2236
# do anova:
L4.age.aov <- aov(score~age,data=L4.df01)
summary(L4.age.aov,split=list(age=list(Lin=1,Quad=2,Cubic=3)))
## Df Sum Sq Mean Sq F value Pr(>F)
## age 3 1515 505 2.26 0.095 .
## age: Lin 1 1384 1384 6.20 0.017 *
## age: Quad 1 0 0 0.00 0.999
## age: Cubic 1 131 131 0.59 0.447
## Residuals 44 9830 223
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Although R can be set up to automatically use polynomial contrasts on
ordered factors, sometimes it is necessary to assign the weights
ourselves. For example, we may have an ordered factor with levels that
are not separated by equal intervals. In that case, the
automatic trend weights assigned by R are not correct. The following
code shows how to use contr.poly
to calculate the correct
weights and assign them to our factor.
a <- 4 # number of gruops
n <- 12 # n per group
# unequal steps in our age factor:
L4.df01$age2 <- factor(x=rep(1:a,each=n),
labels=c("age6","age8","age12","age18"),
ordered=TRUE)
levels(L4.df01$age2)
## [1] "age6" "age8" "age12" "age18"
contrasts(L4.df01$age2) # these weights assume equal intervals!
## .L .Q .C
## [1,] -0.6708 0.5 -0.2236
## [2,] -0.2236 -0.5 0.6708
## [3,] 0.2236 -0.5 -0.6708
## [4,] 0.6708 0.5 0.2236
# these are the correct weights:
(newWeights <- contr.poly(n=4,scores=c(6,8,12,18)))
## .L .Q .C
## [1,] -0.5455 0.5128 -0.43519
## [2,] -0.3273 -0.1709 0.78335
## [3,] 0.1091 -0.7407 -0.43519
## [4,] 0.7638 0.3989 0.08704
contrasts(L4.df01$age2) <- newWeights # assign to our factor
# do anova:
L4.age2.aov <- aov(score~age2,data=L4.df01)
summary(L4.age2.aov,split=list(age2=list(Lin=1,Quad=2,Cubic=3)))
## Df Sum Sq Mean Sq F value Pr(>F)
## age2 3 1515 505 2.26 0.095 .
## age2: Lin 1 1318 1318 5.90 0.019 *
## age2: Quad 1 137 137 0.61 0.438
## age2: Cubic 1 61 61 0.27 0.604
## Residuals 44 9830 223
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Note that you can use contr.poly
to assign trend weights
to a regular, non-ordered factor.
class(L4.df01$group)
## [1] "factor"
contrasts(L4.df01$group) <- newWeights # assign to our factor
L4.group.aov <- aov(score~group,data=L4.df01)
summary(L4.group.aov,split=list(group=list(Lin=1,Quad=2,Cubic=3)))
## Df Sum Sq Mean Sq F value Pr(>F)
## group 3 1515 505 2.26 0.095 .
## group: Lin 1 1318 1318 5.90 0.019 *
## group: Quad 1 137 137 0.61 0.438
## group: Cubic 1 61 61 0.27 0.604
## Residuals 44 9830 223
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Finally, trend analysis can be performed with emmeans
.
Note that there is no need assign trend weights to the grouping
variable. Instead, the weights are specified by using the
poly
method of in the contrast
command. The
poly
method assumes the levels of the grouping variable are
equally spaced.
class(L4.df01$condition)
## [1] "factor"
L4.condition.aov <- aov(score~condition,data=L4.df01)
# library(emmeans)
L4.condition.emm <- emmeans(L4.condition.aov,specs="condition")
contrast(L4.condition.emm,method="poly")
## contrast estimate SE df t.ratio p.value
## linear 48.03 19.30 44 2.489 0.0167
## quadratic -0.01 8.63 44 -0.001 0.9989
## cubic -14.79 19.30 44 -0.766 0.4475
When the levels are not equally spaced, then you must provide the weights yourself.
L4.age2.emm <- emmeans(L4.age2.aov,specs="age2")
# newWeights <- contr.poly(n=4,scores=c(6,8,12,18))
wLin <- newWeights[,".L"] # 1st column
wQuad <- newWeights[,".Q"] # 2nd column
wCubic <- newWeights[,".C"] # 3rd column
myTrends <- list("lin"=wLin,"quad"=wQuad,"cub"=wCubic)
contrast(L4.condition.emm,method=myTrends)
## contrast estimate SE df t.ratio p.value
## lin 10.48 4.32 44 2.429 0.0193
## quad -3.38 4.32 44 -0.782 0.4383
## cub -2.25 4.32 44 -0.522 0.6044
This section illustrates how to adjust the \(p\) values for our various trends with the
Holm and FDR procedures. First we use emmeans
:
# emmeans -------------
# library(emmeans)
# myContrasts <- list(c1,c2,c3)
# con.res <- contrast(lab4.emm,method=myContrasts)
summary(con.res,adjust="holm") # holm correction
summary(con.res,adjust="fdr") # false discovery rate
Next we use p.adjust
:
## p.adjust ---------
summary(L4.aov.02,split=list(condition=list(c1=1,c2=2,c3=3)))
## Df Sum Sq Mean Sq F value Pr(>F)
## condition 3 1515 505 2.26 0.095 .
## condition: c1 1 1475 1475 6.60 0.014 *
## condition: c2 1 21 21 0.09 0.763
## condition: c3 1 20 20 0.09 0.765
## Residuals 44 9830 223
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# p values taken from ANOVA table:
p.adjust(c(0.0137,0.7630,0.7646),"holm")
## [1] 0.0411 1.0000 1.0000
p.adjust(c(0.0137,0.7630,0.7646),"fdr")
## [1] 0.0411 0.7646 0.7646
Finally, we use the Scheffe method to adjust \(p\) values for post-hoc
tests. Note that scheffe.rank
must equal the
degrees-of-freedom for our grouping factor, regardless of the number of
contrasts that we analyze. Also note that you need to use the
summary
method for the Scheffe test to work properly:
including scheffe.rank
directly into the
contrast
command yields an incorrect result.
## scheffe ----------
c1 <- c(-1,-1,1,1)
# next command yields INCORRECT answer (p=0.0137)!
contrast(lab4.emm,method=list(c1),adjust="scheffe",scheffe.rank=3)
## contrast estimate SE df t.ratio p.value
## c(-1, -1, 1, 1) 22.2 8.63 44 2.569 0.0137
##
## P value adjustment: scheffe method with rank 1
# incorrect Scheffe.rank!!
# this is how you get the correct answer:
c1.post.hoc <- contrast(lab4.emm,method=list(c1),adjust="scheffe",scheffe.rank=3)
summary(c1.post.hoc,adjust="scheffe",scheffe.rank=3) # p=0.1015 is correct
# alternative method:
# install.packages("DescTools")
library(DescTools)
ScheffeTest(L4.aov.02,which="condition",contrasts=c1)
##
## Posthoc multiple comparisons of means: Scheffe Test
## 95% family-wise confidence level
##
## $condition
## diff lwr.ci upr.ci pval
## c3,c4-c1,c2 22.17 -2.914 47.25 0.1015
##
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Effect size for linear contrasts can be expressed using Cohen’s d,
which divides the value of \(\Psi\) by
an estimate of the standard deviation of the within-group error. The
next chunk of code calculates Cohen’s \(d\) with eff_size
in the
emmeans
package. Note that emmeans
has a bug:
specifically, it does not scale the contrast weights correctly. Instead,
it assumes that the postive and negative weights within a contrast sum
to 1 and -1, respectively. In the following code, c1
and
c4
are equivalent contrasts and should lead to the same
effect size, but emmeans
says that the effect size for
c4
is twice the effect size for c1
.
# library(emmeans)
c1 <- c(-1/2,-1/2,1/2,1/2)
c2 <- c(-1,1,0,0)
c3 <- c(0,0,-1,1)
(c4 <- 2*c1)
## [1] -1 -1 1 1
myContrasts <- list(c1,c2,c3,c4)
eff_size(lab4.emm,
sigma=sigma(L4.aov.01),
edf=df.residual(L4.aov.01),
method=myContrasts)
## contrast effect.size SE df lower.CL upper.CL
## c(-0.5, -0.5, 0.5, 0.5) 0.742 0.299 44 0.138 1.345
## c(-1, 1, 0, 0) 0.124 0.408 44 -0.699 0.947
## c(0, 0, -1, 1) 0.123 0.408 44 -0.700 0.946
## c(-1, -1, 1, 1) 1.483 0.599 44 0.277 2.690
##
## sigma used for effect sizes: 14.95
## Confidence level used: 0.95
Next we calculate effect size using linear.comparison
.
It scales the contrast weights correctly so the estimated effect size is
the same for c1
and c4
.
# source(url("http://pnb.mcmaster.ca/bennett/psy710/Rscripts/linear_contrast_v2.R"))
# closeAllConnections() # close connection
myC.res <- linear.comparison(y=L4.df01$score,
group=L4.df01$group,
c.weights=list(c1,c2,c3,c4),
var.equal=T)
## [1] "computing linear comparisons assuming equal variances among groups"
## [1] "C 1: F=6.601, t=2.569, p=0.014, psi=11.085, CI=(2.389,19.781), adj.CI= (-0.153,22.323)"
## [1] "C 2: F=0.092, t=0.303, p=0.763, psi=1.851, CI=(-10.447,14.149), adj.CI= (-14.042,17.744)"
## [1] "C 3: F=0.091, t=0.301, p=0.765, psi=1.839, CI=(-10.459,14.137), adj.CI= (-14.054,17.732)"
## [1] "C 4: F=6.601, t=2.569, p=0.014, psi=22.171, CI=(4.779,39.562), adj.CI= (-0.305,44.647)"
class(myC.res) # this is a list
## [1] "list"
length(myC.res) # has 4 parts; 1 for each contrast
## [1] 4
# str(myC.res[[1]]) # each part has LOTS of components
c(myC.res[[1]]$d.effect.size,
myC.res[[2]]$d.effect.size,
myC.res[[3]]$d.effect.size,
myC.res[[4]]$d.effect.size)
## [1] 0.7417 0.1239 0.1230 0.7417
Finally, we can use linear.comparison
to calculate
various measures of association strength. Measures of association
strength also can be calculated directly from values in the ANOVA
table.
# Three measures of association strength for the first contrast
myC.res[[1]]$contrast
## [1] -0.5 -0.5 0.5 0.5
c(myC.res[[1]]$R2.alerting,
myC.res[[1]]$R2.effect.size,
myC.res[[1]]$R2.contrast)
## [1] 0.9730 0.1300 0.1304
summary(L4.aov.01,split=list(group=list(c1=1,c2=2,c3=3)))
## Df Sum Sq Mean Sq F value Pr(>F)
## group 3 1515 505 2.26 0.095 .
## group: c1 1 1003 1003 4.49 0.040 *
## group: c2 1 170 170 0.76 0.387
## group: c3 1 342 342 1.53 0.223
## Residuals 44 9830 223
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# values taken from ANOVA table
(c1.R2.alerting = 1475/1515) # proportion of between-group SS
## [1] 0.9736
(c1.R2.effectsize = 1475/(1515+9830) ) # proportion of SS-total
## [1] 0.13
(c1.R2.contrast = 1475/(1475+9830) ) # proportion of (SS-contrast + SS-residual)
## [1] 0.1305
An experiment measured the effect of dividing attention on reaction times in a visual detection task. Performance was measured in five groups of 12 participants while they simultaneously completed an auditory detection task. The difficulty of the auditory detection task was varied across the five groups by presenting the the auditory target (i.e., a pure tone) in higher levels of background noise. The experimenters hypothesized that performance in the visual task would vary non-monotonically with the level of background noise. In other words, the prediction is that visual performance will be best at an intermediate level of noise. The data are stored in the data frame visRT which contains the numeric variable, RT, and the ordered factor, noise.
sapply(visRT,class)
## $noise
## [1] "ordered" "factor"
##
## $RT
## [1] "numeric"
summary(visRT)
## noise RT
## n1:12 Min. :-162
## n2:12 1st Qu.: 261
## n3:12 Median : 380
## n4:12 Mean : 423
## n5:12 3rd Qu.: 597
## Max. :1161
Here is a plot of the data:
y.bar <- with(visRT,tapply(RT,noise,mean))
y.sem <- sqrt( with(visRT,tapply(RT,noise,var))/with(visRT,tapply(RT,noise,length)) )
plot(x=1:5,y=y.bar,
type="b",
ylim=c(200,800),
ylab="RT",
xlab="Noise")
segments(x0=1:5,x1=1:5,y0=y.bar-y.sem,y1=y.bar+y.sem)
rt.aov.01 <- aov(RT~noise,data=visRT)
summary(rt.aov.01)
## Df Sum Sq Mean Sq F value Pr(>F)
## noise 4 1026362 256590 3.41 0.015 *
## Residuals 55 4136322 75206
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## The effect of noise is statistically significant, so we reject the null hypothesis of no difference among the group means.
## The omnibus F test is not a fair test of the original hypothesis because it looks for ANY difference among means, whereas the experimenters were looking for a specific pattern of differences among means.
noise
? What contrasts does R
use for that variable?sapply(visRT,class) # noise is an ordered factor
## $noise
## [1] "ordered" "factor"
##
## $RT
## [1] "numeric"
contrasts(visRT$noise) # R is using polynomial contrasts
## .L .Q .C ^4
## [1,] -6.325e-01 0.5345 -3.162e-01 0.1195
## [2,] -3.162e-01 -0.2673 6.325e-01 -0.4781
## [3,] -3.288e-17 -0.5345 9.637e-17 0.7171
## [4,] 3.162e-01 -0.2673 -6.325e-01 -0.4781
## [5,] 6.325e-01 0.5345 3.162e-01 0.1195
cat("We initialized R so that it uses sum-to-zero coding for plain factors and polynomial contrasts for ordered factors.")
## We initialized R so that it uses sum-to-zero coding for plain factors and polynomial contrasts for ordered factors.
library(emmeans)
rt.em <- emmeans(rt.aov.01,specs="noise")
contrast(rt.em,method="poly",adjust="none")
## contrast estimate SE df t.ratio p.value
## linear 649.8 250 55 2.596 0.0121
## quadratic 773.7 296 55 2.612 0.0116
## cubic -72.9 250 55 -0.291 0.7719
## quartic -30.1 662 55 -0.045 0.9639
summary(rt.aov.01,split=list(noise=list(L=1,Q=2,others=3:4)))
## Df Sum Sq Mean Sq F value Pr(>F)
## noise 4 1026362 256590 3.41 0.015 *
## noise: L 1 506726 506726 6.74 0.012 *
## noise: Q 1 513095 513095 6.82 0.012 *
## noise: others 2 6540 3270 0.04 0.957
## Residuals 55 4136322 75206
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## The linear and quadratic trends are statistically significant. The quadratic trend is the one that is most imporant because the hypothesis that preformance is best at an intermediate level of noise, which should produce a quadratic trend. Note that we should plot the data (and/or check the value of the quadratic contrast) to make sure the quadratic trend has the correct sign (i.e., RT is at a minimum, not maximum, value at an intermediate level of noise.
cat("We will use a Scheffe test. Here we adjust the p values.")
## We will use a Scheffe test. Here we adjust the p values.
library(emmeans)
rt.em <- emmeans(rt.aov.01,specs="noise") # create emmeans object
con.res <- contrast(rt.em,method="poly")
summary(con.res,adjust="scheffe",scheffe.rank=4)
cat("None of the p values are < 0.05, so none of the trends are significant.")
## None of the p values are < 0.05, so none of the trends are significant.
cat("Here we adjust the critical value of F.")
## Here we adjust the critical value of F.
summary(rt.aov.01,split=list(noise=list(L=1,Q=2,others=3:4)))
## Df Sum Sq Mean Sq F value Pr(>F)
## noise 4 1026362 256590 3.41 0.015 *
## noise: L 1 506726 506726 6.74 0.012 *
## noise: Q 1 513095 513095 6.82 0.012 *
## noise: others 2 6540 3270 0.04 0.957
## Residuals 55 4136322 75206
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
df.noise <- 4
alpha <- .05
( F.alpha <- qf(1-alpha,df1=4,df2=55) ) # critical F value (p<.05)
## [1] 2.54
( F.scheffe <- df.noise*F.alpha ) # critical F for Scheffe test
## [1] 10.16
cat("None F tests for our trends exceeds F.scheffe, so none of the trends are significant.")
## None F tests for our trends exceeds F.scheffe, so none of the trends are significant.
## The emmeans test adjusts the p values: none of the adjusted p values are less than 0.05. In the second test we want to compare the F values for the contrasts to the critical, Scheffe F. The observed Fs are less than the Scheffe F (10.1), and therefore the contrasts are not signficant. Hence, both tests find that the linear and quadratic trends are not statistically significant.
# here we use emmeans:
noiseContrast <- c(1/4,1/4,1/4,1/4,-1) # weights
con2 <- contrast(rt.em,method=list(c1=noiseContrast))
summary(con2)
# any multiple of noiseContrast yields similar results
nc3 <- 3 * noiseContrast
con3 <- contrast(rt.em,method=list(c1=nc3))
summary(con3)
# here use use aov/split:
noiseContrast <- c(1/4,1/4,1/4,1/4,-1) # weights
rt.df <- visRT # make copy of data
contrasts(rt.df$noise) <- c(1/4,1/4,1/4,1/4,-1) # assign contrast to noise
aov.rt.02 <- aov(RT~noise,data=rt.df)
# list anova table [only care about 1st contrast]
summary(aov.rt.02,split=list(noise=list(c1=1,others=2:4)))
## Df Sum Sq Mean Sq F value Pr(>F)
## noise 4 1026362 256590 3.41 0.0146 *
## noise: c1 1 812721 812721 10.81 0.0018 **
## noise: others 3 213641 71214 0.95 0.4244
## Residuals 55 4136322 75206
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
summary(con2,adjust="scheffe",scheffe.rank=4)
alpha <- .05
a <- 5 # 5 groups
n <- 12 # n per group
N <- a*n # total N
(F.scheffe <- (a-1)*qf(1-alpha,df1=a-1,df2=N-a))
## [1] 10.16
F.observed <- 10.807
F.observed > F.scheffe # F is significant
## [1] TRUE
cat("The contrast's F (10.81) is greater than the critical Scheffe F (10.16), so our contrast is significant.")
## The contrast's F (10.81) is greater than the critical Scheffe F (10.16), so our contrast is significant.