Initialize R
with the following commands:
options(contrasts=c("contr.sum","contr.poly") )
load(file=url("http://pnb.mcmaster.ca/bennett/psy710/datasets/practice1-2022.rda") )
The options
command sets up R to use sum-to-zero coding
for factors, and polynomial coding for ordered factors. The
load
command loads the data variables that you need to
answer the following questions.
Verbal IQ scores were obtained from 30 children. The data are stored
in the variable iq.dat
, which was placed into R with the
load
command above. Use the data in iq.dat
to
answer the following questions.
1.1. Use a \(t\) test to evaluate the null hypothesis that the data were sampled from a population with a mean of 100. Make sure answer includes the values of \(t\), \(p\), and the degrees of freedom, and a clear description of the null and alternative hypotheses being evaluated.
t.test(iq.dat,mu=100)
##
## One Sample t-test
##
## data: iq.dat
## t = -1.511, df = 29, p-value = 0.1416
## alternative hypothesis: true mean is not equal to 100
## 95 percent confidence interval:
## 90.58561 101.41439
## sample estimates:
## mean of x
## 96
Answer: My null hypothesis is that \(\mu = 100\) and my alternative hypothesis is that \(\mu \ne 100\). The \(t\) test was not significant (\(t=-1.5\), \(\mathrm{df}=29\), \(p=0.1\)). Therefore I choose to not reject the null hypothesis. (N.B. In real life I would examine the data for outliers, deviations from normality, etc.).
1.2. Calculate Cohen’s \(d\).
require(effectsize)
cohens_d(x=iq.dat, mu=100) # effectsize package
## Cohen's d | 95% CI
## -------------------------
## -0.28 | [-0.64, 0.09]
##
## - Deviation from a difference of 100.
require(effsize)
cohen.d(iq.dat,f=NA,mu=100) # effsize package
##
## Cohen's d (single sample)
##
## d estimate: -0.2758621 (small)
## Reference mu: 100
## 95 percent confidence interval:
## lower upper
## -1.0262180 0.4744938
(d <- (mean(iq.dat)-100)/sd(iq.dat)) # from mean and SD
## [1] -0.2758621
1.3. Cohen’s \(d\) is a common measure of effect size. What, exactly, does Cohen’s \(d\) represent? Why is a measure like Cohen’s \(d\) a useful complement to \(p\) values?
Answer: We will discuss this answer in class.
1.4. The IQ test was designed so that, in a population of typically-developed children, the mean score is 100 and the standard deviation is 15 (\(\mu=100\), \(\sigma=15\)). Use an equivalence test to evaluate the hypothesis that our sample was selected from a population with a normal mean, with the assumption that a mean is considered normal if it is within the bounds \(100 \pm \frac{\sigma}{2}\), or \(100 \pm 7.5\). State the null and alternative hypotheses for the equivalence test, and your conclusion regarding that null hypothesis.
Answer: The null hypothesis is that \((\mu \le 92.5)\) OR \((\mu \ge 107.5)\). The alternative hypothesis is that \((\mu > 92.5)\) AND \((\mu < 107.5)\). Here I perform the equivalence test by calculating the 90% confidence interval for the mean. If it lies within the equivalence bounds, then I would reject the null hypothesis (\(p < 0.05\)) that \((\mu \le 92.5)\) OR \((\mu \ge 107.5)\). The 90% CI is [91.5, 100.5], which is inside the equivalence bounds, so I do reject the null hypothesis of non-equivalence. (N.B. I do not the null hypothesis, I simply fail to reject it.)
(equiv.bounds.raw <- c(100-15/2,100+15/2) )
## [1] 92.5 107.5
t.test(x=iq.dat,mu=100,conf.level=0.90)$conf.int
## [1] 91.50186 100.49814
## attr(,"conf.level")
## [1] 0.9
Answer: I can perform the same test using two one-tailed \(t\) tests. If both tests are significant, then I would reject the null hypothesis of non-equivalence in favor of the alternative hypothesis that the mean is equivalent to 100. However, only one test is significant: I reject the null hypothesis \(\mu \ge 107.5\) but not the null hypothesis that \(\mu \le 92.5\). Therefore I do not reject the null hypothesis of non-equivalence.
upper.bound <- 100+15/2
t.test(x=iq.dat,mu=upper.bound,alternative="less")
##
## One Sample t-test
##
## data: iq.dat
## t = -4.344, df = 29, p-value = 7.806e-05
## alternative hypothesis: true mean is less than 107.5
## 95 percent confidence interval:
## -Inf 100.4981
## sample estimates:
## mean of x
## 96
lower.bound <- 100-15/2
t.test(x=iq.dat,mu=lower.bound,alternative="greater") # not significant
##
## One Sample t-test
##
## data: iq.dat
## t = 1.3221, df = 29, p-value = 0.09824
## alternative hypothesis: true mean is greater than 92.5
## 95 percent confidence interval:
## 91.50186 Inf
## sample estimates:
## mean of x
## 96
Answer: Finally, I could use the TOSTER package to compute the two one-tailed \(t\) tests.
library(TOSTER)
upper.bound <- 100 + (15/2)
lower.bound <- 100 - (15/2)
t_TOST(x=iq.dat,
low_eqbound = lower.bound,
high_eqbound = upper.bound)
##
## One Sample t-test
##
## The equivalence test was non-significant, t(29) = 1.32, p = 0.1
## The null hypothesis test was significant, t(29) = 36.263p < 0.01
## NHST: reject null significance hypothesis that the effect is equal to zero
## TOST: don't reject null equivalence hypothesis
##
## TOST Results
## t df p.value
## t-test 36.263 29 < 0.001
## TOST Lower 1.322 29 0.098
## TOST Upper -4.344 29 < 0.001
##
## Effect Sizes
## Estimate SE C.I. Conf. Level
## Raw 96.000 2.6473 [91.5019, 100.4981] 0.9
## Hedges's g 6.448 0.8522 [5.0066, 7.8151] 0.9
## Note: SMD confidence intervals are an approximation. See vignette("SMD_calcs").
A perceptual learning experiment measured visual sensitivity in a
group of 40 young adults before and after practice in visual detection
tasks. The data are contained in the data frame paired.df
:
visual sensitivity measures taken before and after practice are stored
in the variables t1
and t2
.
2.1. Use a \(t\) test to see if the two measures differ significantly. State the null (H0) and alternative (H1) hypotheses, as well as your conclusion about H0.
with(paired.df,t.test(t2,t1,paired=T) )
##
## Paired t-test
##
## data: t2 and t1
## t = 2.1004, df = 49, p-value = 0.04086
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
## 0.1297525 5.8702475
## sample estimates:
## mean difference
## 3
Answer: The null hypothesis (H0) is that mean visual sensitivity before and after practice did not differ (\(\mu_{t2} = \mu_{t1}\)). The alternative hypothesis (H1) is that mean sensitivity did differ (\(\mu_{t2} \ne \mu_{t1}\)). I used a paired \(t\) test to evaluate H0. The difference between means was significant (\(t(49)=2.1\), \(p=0.04\)) and therefore I reject H0 in favor of H1.
2.2. What does the \(p\) value of your \(t\) test mean? In other words, the \(p\) value is a probability, but a probability of what?
Answer: We will discuss this answer in class.
An experiment was conducted to assess the effects of four treatments
on a dependent variable, y
. The experiment measured
y
on 32 subjects assigned randomly to the four treatments
(\(n=8\) per treatment), and the data
are stored in the data frame aov.df
. Use
aov.df
to answer the following questions.
3.1 Calculate the mean and standard deviation of y
for
each treatment group.
names(aov.df)
## [1] "y" "treatment"
with(aov.df,tapply(y,treatment,mean))
## t1 t2 t3 t4
## 86.375 101.375 93.250 100.500
with(aov.df,tapply(y,treatment,sd))
## t1 t2 t3 t4
## 9.840695 8.895223 11.744300 7.559289
3.2 Conduct an analysis of variance to evaluate the effect of
treatment
on y
. Record the results of your
ANOVA (i.e., write the ANOVA table).
lm.mod.01 <- lm(y~treatment,data=aov.df)
anova(lm.mod.01)
## Analysis of Variance Table
##
## Response: y
## Df Sum Sq Mean Sq F value Pr(>F)
## treatment 3 1182.2 394.08 4.2485 0.01357 *
## Residuals 28 2597.2 92.76
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
3.3 Explain the null and alternative hypotheses that are evaluated by your ANOVA.
**Answer: We will discuss this answer in class.
3.4 Our ANOVA makes assumptions about the distribution of the errors: specifically, that the errors are distributed normally and have constant variance across groups. Use statistical (i.e., not graphical) methods to evaluate these assumptions.
**Answer:I will use bartlett.test
to test the assumption
of homogeneity of variance:
bartlett.test(y~treatment,data=aov.df)
##
## Bartlett test of homogeneity of variances
##
## data: y by treatment
## Bartlett's K-squared = 1.3454, df = 3, p-value = 0.7184
The Bartlett test was not significant (\(K^{2}=1.34\), df=3, \(p=0.718\)), so the homogeneity of variance
assumption is reasonable for our data. Next, I will test the normality
assumption by analyzing the residuals of a full linear model with
shapiro.test
:
shapiro.test(residuals(lm.mod.01))
##
## Shapiro-Wilk normality test
##
## data: residuals(lm.mod.01)
## W = 0.97432, p-value = 0.626
The Shapiro-Wilk’s test was not significant (\(W=0.974\), \(p=0.626\)), so the normality assumption is reasonable for our data.
3.5 Evaulate the pairwise differences between all group means.
Answer: I will evaluate pairwise differences using
the Tukey HSD test. First I use the contrast
command in the
emmeans
package:
library(emmeans)
pairwise.em <- emmeans(lm.mod.01,"treatment")
con.pairwise <- contrast(pairwise.em,method="pairwise")
summary(con.pairwise,adjust="tukey")
## contrast estimate SE df t.ratio p.value
## t1 - t2 -15.000 4.82 28 -3.115 0.0207
## t1 - t3 -6.875 4.82 28 -1.428 0.4933
## t1 - t4 -14.125 4.82 28 -2.933 0.0317
## t2 - t3 8.125 4.82 28 1.687 0.3491
## t2 - t4 0.875 4.82 28 0.182 0.9978
## t3 - t4 -7.250 4.82 28 -1.506 0.4477
##
## P value adjustment: tukey method for comparing a family of 4 estimates
Answer: Next I will use the TukeyHSD
command. TukeyHSD
only works with aov
objects,
so I need to create a model using the aov
command.
aov.mod.01 <- aov(y~treatment,data=aov.df) # need to use aov command
TukeyHSD(aov.mod.01,conf.level = 0.95)
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = y ~ treatment, data = aov.df)
##
## $treatment
## diff lwr upr p adj
## t2-t1 15.000 1.8519885 28.148012 0.0207220
## t3-t1 6.875 -6.2730115 20.023012 0.4932943
## t4-t1 14.125 0.9769885 27.273012 0.0316790
## t3-t2 -8.125 -21.2730115 5.023012 0.3490547
## t4-t2 -0.875 -14.0230115 12.273012 0.9978182
## t4-t3 7.250 -5.8980115 20.398012 0.4477413
Answer: Onlythe differences between treatments 1 and 2 (\(p_{\mathit{adj}}=0.0207\)) and treatments 1 and 4 (\(p_{\mathit{adj}}=0.031\)) were significant.
3.6 Explain what is meant by Type I errors and how they relate to your analysis of pairwise differences.
Answer: We will discuss this answer in class.
3.7 What does eta-squared (\(\eta^2\)) represent? Calculate \(\eta^2\) for treatment
.
library(effectsize)
eta_squared(lm.mod.01) # in effectsize package
## # Effect Size for ANOVA
##
## Parameter | Eta2 | 95% CI
## -------------------------------
## treatment | 0.31 | [0.05, 1.00]
##
## - One-sided CIs: upper bound fixed at [1.00].
Answer: \(\eta^2\)
is the proportion of variation in our dependent variable that is
associated with treatment
. It equals the ratio of \(SS_\mathit{group}\) divided by \(SS_\mathit{total}\).
anova(aov.mod.01)
## Analysis of Variance Table
##
## Response: y
## Df Sum Sq Mean Sq F value Pr(>F)
## treatment 3 1182.2 394.08 4.2485 0.01357 *
## Residuals 28 2597.2 92.76
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# values from ANOVA table:
SS.treatment <- 1182.2
SS.residuals <- 2597.2
SS.total <- (SS.treatment + SS.residuals)
(SS.treatment/SS.total) # eta-squared
## [1] 0.312801
The results of an ANOVA are shown below. Refer to the ANOVA table when answering the questions in this section.
## Analysis of Variance Table
##
## Response: y
## Df Sum Sq Mean Sq F value Pr(>F)
## group 3 101.57 33.856 3.2941 0.03136 *
## Residuals 36 370.00 10.278
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
4.1. When the null hypothesis is true, what do the values \(\mathit{MS}_{\mathit{group}}\) and \(\mathit{MS}_{\mathit{residuals}}\) represent?
Answer: We will discuss this answer in class.
4.2. The sums-of-squares for group
is 102. In the
context of nested linear models that were used to construct this table,
what does this value represent?
Answer: We will discuss this answer in class.
4.3. The degrees of freedom for group
is 3, which equals
the number of groups minus 1. The degrees of freedom for group also
corresponds to a particular difference between two nested linear models
that are used to construct the ANOVA table. What is this difference?
Answer: We will discuss this answer in class.
Use aov.df
to answer the following questions.
5.1. Evaluate a linear comparison between the mean of treatment 1 and the mean of the other three treatments. Assume that your comparison is planned.
Answer 1: You could answer this question using
several methods. For example, you could use the
linear.comparison
command:}{}
# method 1:
source(url("http://pnb.mcmaster.ca/bennett/psy710/Rscripts/linear_contrast_v2.R"))
## [1] "loading function linear.comparison"
w <- c(3, -1, -1, -1)
y <- aov.df$y;
g <- aov.df$treatment
my.contrast <- linear.comparison(y, g, c.weights = w)
## [1] "computing linear comparisons assuming equal variances among groups"
## [1] "C 1: F=9.314, t=-3.052, p=0.005, psi=-36.000, CI=(-60.162,-11.838), adj.CI= (-60.162,-11.838)"
my.contrast[[1]]$F
## [1] 9.314467
my.contrast[[1]]$df1
## [1] 1
my.contrast[[1]]$df2
## [1] 28
my.contrast[[1]]$p.2tailed
## [1] 0.004937769
Answer: Or you could use R’s aov
or
lm
commands. First, you associate the contrast weights with
the grouping factor, g
. Then you perform an ANOVA with
aov
, list the ANOVA table, and use the split command to
break the sum-of-squares for g
into separate pieces. The
contrast is significant (\(F(1,28)=9.314\), \(p=0.0049\)).
# method 2:
contrasts(g) <- w
aov.02 <- aov(aov.df$y ~ g)
summary(aov.02,split=list(g=list(myContrast=1,others=2:3)))
## Df Sum Sq Mean Sq F value Pr(>F)
## g 3 1182.3 394.1 4.248 0.01357 *
## g: myContrast 1 864.0 864.0 9.314 0.00494 **
## g: others 2 318.2 159.1 1.715 0.19825
## Residuals 28 2597.2 92.8
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Answer: Instead of listing the ANOVA table, you
could use the contrast
command in the emmeans
package (\(t(28)=-3.052\), \(p=0.0049\)).
library(emmeans) # load emmeans package
contrast.list <- list(g1vsg234 = w) # load w into list
groupMeans <- emmeans(aov.02,"g") # store group means
contrast(groupMeans,contrast.list,adjust="none") # do comparison
## contrast estimate SE df t.ratio p.value
## g1vsg234 -36 11.8 28 -3.052 0.0049
Answer: Finally, you could calculate your contrast directly from the group means:
# method 3:
m <- with(aov.df,tapply(y,treatment,mean)) # group means
n <- with(aov.df,tapply(y,treatment,length)) # group n
MS.w <- 92.76 # from ANOVA table
psi <- sum(w * m) # value of PSI
# from Eq 6 in chapter 4 notes:
SS.contrast <- (psi^2) / sum( (w^2)/n)
( F <- SS.contrast / MS.w )
## [1] 9.31436
N <- sum(n) # total N
a <- length(m) # number of groups
df1 <- 1 # numerator df
( df2 <- N-a ) # denominator df
## [1] 28
(p.value <- 1 - pf(F,df1,df2) ) # our p value [remember to use 1 - pf ]
## [1] 0.004937986
(p.value <- pf(F,df1,df2,lower.tail=FALSE) ) # [or set lower.tail to FALSE]
## [1] 0.004937986
5.2. Perform the same contrast, but now assume that it was done after looking at the data.
alpha.fw <- .05 # type I error rate
a <- length(m) # number of groups
df2 <- N - a; # df for residuals
# Scheffe critical F:
( scheffe.critical.F <- (a-1)*qf(1-alpha.fw,df1=a-1,df2=df2) )
## [1] 8.840056
Answer: I would use the Scheffe method.The observed
value of \(F=9.31\) is greater than the
critical value of \(F_{\mathit{Scheffe}}=8.84\), and therefore
the linear contrast is significant (\(\alpha_{\mathit{FW}}=.05\)). I can also
perform this test using emmeans
. Note how in the following
code I set the parameter scheffe.rank
to 3, which equals
the degrees of freedom for my grouping variable. This setting is
necessary for emmeans
to adjust the p values properly for a
post-hoc comparison
library(emmeans) # load emmeans package
contrast.list <- list(g1vsg234 = w) # load w into list
groupMeans <- emmeans(aov.02,"g") # store group means
con <- contrast(groupMeans,method=contrast.list) # do contrast
summary(con,adjust="scheffe",scheffe.rank=3) # adjust p values & show results
## contrast estimate SE df t.ratio p.value
## g1vsg234 -36 11.8 28 -3.052 0.0425
##
## P value adjustment: scheffe method with rank 3
5.3. List a set of weights that could be used to evaluate a linear contrast that is orthogonal to the contrast tested in the previous two questions. Describe the null hypothesis that is evaluated by your contrast weights.
Answer: Here is my new set of weights:
w2 <- c(0,-2,1,1)
Answer: The weights can be used to evaluate the hypothesis that the mean of group 2 equals the means of groups 3 and 4. To verify that my contrast is orthogonal to the first contrast, I will calculate the sum of the products of the two sets of weights. There are equal \(n\) per group, so the two contrasts are orthogonal if the sum of the products of the weights is zero. The sum of the products of the weights is zero, so the contrasts are orthogonal.}{}
with(aov.df,tapply(y,treatment,length)) # balanced design
## t1 t2 t3 t4
## 8 8 8 8
sum(w*w2)
## [1] 0
An experiment measured detection thresholds for spots of different
sizes. The point of the experiment was to test a prediction that
detection threshold would decline as the spot increased from small to
medium, but then would level off or actually increase as spot size
increased from medium to large. The experiment measured detection
thresholds for spots at six, equally-spaced diameters ranging from small
(condition d1
) to large (condition d16
). There
were 6 subjects per condition. The data are stored in the data frame
spot.dat
which consists of the factor diameter
and the numeric variable threshold
.
6.1. Evaluate the linear and quadratic trends of
threshold
across diameter
.
spot.aov <- aov(threshold~diameter,spot.dat)
summary(spot.aov)
## Df Sum Sq Mean Sq F value Pr(>F)
## diameter 5 2.414 0.4829 5.287 0.00134 **
## Residuals 30 2.740 0.0913
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
contrasts(spot.dat$diameter) # polynomial contrasts
## .L .Q .C ^4 ^5
## [1,] -0.5976143 0.5455447 -0.3726780 0.1889822 -0.06299408
## [2,] -0.3585686 -0.1091089 0.5217492 -0.5669467 0.31497039
## [3,] -0.1195229 -0.4364358 0.2981424 0.3779645 -0.62994079
## [4,] 0.1195229 -0.4364358 -0.2981424 0.3779645 0.62994079
## [5,] 0.3585686 -0.1091089 -0.5217492 -0.5669467 -0.31497039
## [6,] 0.5976143 0.5455447 0.3726780 0.1889822 0.06299408
summary(spot.aov,split=list(diameter=list(Lin=1,Quad=2,Others=3:5)))
## Df Sum Sq Mean Sq F value Pr(>F)
## diameter 5 2.4143 0.4829 5.287 0.00134 **
## diameter: Lin 1 1.0672 1.0672 11.685 0.00183 **
## diameter: Quad 1 1.1945 1.1945 13.079 0.00108 **
## diameter: Others 3 0.1526 0.0509 0.557 0.64755
## Residuals 30 2.7399 0.0913
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# using emmeans
library(emmeans)
spot.emm <- emmeans(spot.aov,specs="diameter")
contrast(spot.emm,method="poly")
## contrast estimate SE df t.ratio p.value
## linear -3.528 1.032 30 -3.418 0.0018
## quadratic 4.089 1.131 30 3.617 0.0011
## cubic 0.135 1.655 30 0.082 0.9355
## quartic -0.718 0.653 30 -1.100 0.2801
## degree 5 -1.320 1.959 30 -0.674 0.5056
6.2. Calculate effect size and association strength for the quadratic trend.
# Calculate Cohen's d:
# rescale so that sum of the subset of + and - weights is ±1
( quadTrend <- contrasts(spot.dat$diameter)[,".Q"] / 1.091089 )
## [1] 0.5000002 -0.1000000 -0.4000002 -0.4000002 -0.1000000 0.5000002
sum(quadTrend[c(1,6)])
## [1] 1
sum(quadTrend[2:5])
## [1] -1
eff_size(spot.emm,
sigma=sigma(spot.aov),
edf=df.residual(spot.aov),
method=list(Quad=quadTrend))
## contrast effect.size SE df lower.CL upper.CL
## Quad 1.35 0.413 30 0.51 2.2
##
## sigma used for effect sizes: 0.3022
## Confidence level used: 0.95
# linear.comparison:
source(url("http://pnb.mcmaster.ca/bennett/psy710/Rscripts/linear_contrast_v2.R"))
## [1] "loading function linear.comparison"
myQuad <- linear.comparison(y=spot.dat$threshold,
group=spot.dat$diameter,
c.weights=list(quadTrend),
var.equal=T)
## [1] "computing linear comparisons assuming equal variances among groups"
## [1] "C 1: F=13.079, t=3.617, p=0.001, psi=0.409, CI=(0.178,0.640), adj.CI= (0.178,0.640)"
myQuad[[1]]$d.effect.size
## [1] 1.353182
# Calculate association strength
c(myQuad[[1]]$R2.alerting,myQuad[[1]]$R2.effect.size,myQuad[[1]]$R2.contrast)
## [1] 0.4947808 0.2317605 0.3036097
# from ANOVA table:
summary(spot.aov,split=list(diameter=list(Lin=1,Quad=2,Others=3:5)))
## Df Sum Sq Mean Sq F value Pr(>F)
## diameter 5 2.4143 0.4829 5.287 0.00134 **
## diameter: Lin 1 1.0672 1.0672 11.685 0.00183 **
## diameter: Quad 1 1.1945 1.1945 13.079 0.00108 **
## diameter: Others 3 0.1526 0.0509 0.557 0.64755
## Residuals 30 2.7399 0.0913
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
( R2.alerting <- 1.1945/2.4143 )
## [1] 0.4947604
( R2.effect.size <- 1.1945/(2.4143+2.7399))
## [1] 0.2317527
( R2.contrast <- 1.1945/(1.1945+2.7399))
## [1] 0.3036041