Initialize R with the following commands.
options(contrasts=c("contr.sum","contr.poly")) # IMPORTANT!!
load(file=url('http://pnb.mcmaster.ca/bennett/psy710/datasets/hw3.rda'))
The options command sets up R to define ANOVA effects using the sum-to-zero constraint. The load command loads the data frames df0, df1, and df3.
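To see what the sum-to-zero constraint means in practice, the following sketch prints the coding matrix that contr.sum produces and compares it with R's default treatment coding. The three-level factor is a hypothetical choice made only for illustration; it is not taken from the homework data.

```r
# Sum-to-zero (deviation) coding for a hypothetical factor with 3 levels.
# contr.sum() and contr.treatment() are part of base R's stats package.
C <- contr.sum(3)
print(C)

# Each column sums to zero across the levels, which is what constrains
# the estimated group effects to sum to zero.
print(colSums(C))

# For comparison: R's default treatment (dummy) coding, in which effects
# are expressed relative to the first level instead.
print(contr.treatment(3))
```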
Answer the questions in Section 3 and submit your answers as a script file on Avenue 2 Learn. Make sure to begin your script file with the following lines:
# PSYCH 710 Lab 3 Homework
# Script File
# SEP-2023
# Your Name: <<Your name here>>
# Student ID: <<Your ID here>>
# Collaborators: <<Names of your collaborators here>>
Also, make sure that text that is not an R command is preceded by a comment symbol (#). For example, you can insert questions or comments among your commands like this:
# The following command doesn't work... not sure why...
# ttest(x=g1,y=g2) # was trying to do a t test
The data frames df0 and df1 each contain two numeric variables, X and Y. Use df0 and df1 to answer questions 1-4. Unless stated otherwise, you may assume that the data are consistent with the assumptions of the t test, and that alpha is 0.05.
Use independent-samples t tests to compare the X and Y means for both df0 and df1. Compute Cohen’s d for each t test.
library(effectsize)
with(df0,t.test(X,Y))
##
## Welch Two Sample t-test
##
## data: X and Y
## t = -1.6, df = 38, p-value = 0.1
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -11.402 1.402
## sample estimates:
## mean of x mean of y
## 100 105
cohens_d(x="X",y="Y",data=df0,paired=FALSE)
## Cohen's d | 95% CI
## -------------------------
## -0.50 | [-1.13, 0.13]
##
## - Estimated using pooled SD.
with(df1,t.test(X,Y))
##
## Welch Two Sample t-test
##
## data: X and Y
## t = -1.6, df = 38, p-value = 0.1
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -11.402 1.402
## sample estimates:
## mean of x mean of y
## 100 105
cohens_d(x="X",y="Y",data=df1,paired=FALSE)
## Cohen's d | 95% CI
## -------------------------
## -0.50 | [-1.13, 0.13]
##
## - Estimated using pooled SD.
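The note "Estimated using pooled SD" in the output suggests how the effect size is standardized. A minimal sketch of that calculation in base R follows. The vectors x and y are simulated for illustration and are not the homework data.

```r
# Simulated data for illustration only; these are NOT df0 or df1.
set.seed(1)
x <- rnorm(20, mean = 100, sd = 10)
y <- rnorm(20, mean = 105, sd = 10)

# Pooled standard deviation for two independent groups.
n1 <- length(x)
n2 <- length(y)
sd_pooled <- sqrt(((n1 - 1) * var(x) + (n2 - 1) * var(y)) / (n1 + n2 - 2))

# Cohen's d: the difference between means in units of the pooled SD.
d <- (mean(x) - mean(y)) / sd_pooled
print(d)
```

Note that t.test defaults to the Welch test, which does not pool variances, whereas this d does; with equal group sizes the pooled variance is simply the average of the two group variances.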
Use paired-samples t tests to compare the X and Y means for both df0 and df1. Compute Cohen’s d for each t test.
with(df0,t.test(X,Y,paired=T))
##
## Paired t-test
##
## data: X and Y
## t = -2.9, df = 19, p-value = 0.009
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
## -8.625 -1.375
## sample estimates:
## mean difference
## -5
cohens_d(x="X",y="Y",data=df0,paired=TRUE)
## Cohen's d | 95% CI
## --------------------------
## -0.65 | [-1.12, -0.16]
with(df1,t.test(X,Y,paired=T))
##
## Paired t-test
##
## data: X and Y
## t = -1.7, df = 19, p-value = 0.1
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
## -11.279 1.279
## sample estimates:
## mean difference
## -5
cohens_d(x="X",y="Y",data=df1,paired=TRUE)
## Cohen's d | 95% CI
## -------------------------
## -0.37 | [-0.82, 0.09]
Was the effect of using a paired-samples t test the same for df0 and df1? Why or why not?
## No, the effect of using a paired-sample t test was much larger for df0 than df1. The effect differed because the correlation between X and Y was much greater in df0 (r=0.7) than df1 (r=0.1). Pairing reduces the variance of the difference scores in proportion to that correlation, so the benefits of pairing are much greater in df0 than df1.
cor(df0$X,df0$Y)
## [1] 0.7
cor(df1$X,df1$Y)
## [1] 0.1
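The role of the correlation can be made concrete with the identity var(X - Y) = var(X) + var(Y) - 2 cov(X, Y). The sketch below assumes that X and Y each have a standard deviation of 10 and that there are 20 pairs. The SD is an inference from the printed output, not something the assignment states, and the correlations are the rounded values shown above. The helper paired_summary is hypothetical.

```r
# Assumptions (not stated in the assignment): SD = 10 for both variables,
# n = 20 pairs, and a mean difference of -5 as printed by t.test().
sd_x <- 10
sd_y <- 10
n <- 20
mean_diff <- -5

# Hypothetical helper: paired-test summary for a given correlation r.
paired_summary <- function(r) {
  # SD of the difference scores shrinks as the correlation grows.
  sd_diff <- sqrt(sd_x^2 + sd_y^2 - 2 * r * sd_x * sd_y)
  c(r = r,
    sd_diff = sd_diff,
    t = mean_diff / (sd_diff / sqrt(n)),  # paired t statistic
    d_z = mean_diff / sd_diff)            # standardized mean difference
}

res <- rbind(df0 = paired_summary(0.7), df1 = paired_summary(0.1))
print(round(res, 2))
```

Under these assumptions the t statistics and standardized differences agree with the paired-test output reported for df0 and df1, which supports the explanation based on correlation.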
Use an equivalence test to evaluate whether the difference between the means of X and Y in df0 is equivalent to zero. For this test, define the equivalence bounds as ±6. Your answer should include a description of the equivalence test’s null and alternative hypotheses.
# equivalence H0: true diff < LOWER.BOUND OR true diff > UPPER.BOUND
# equivalence H1: LOWER.BOUND <= true diff <= UPPER.BOUND
LOWER.BOUND <- -6
UPPER.BOUND <- 6
with(df0,t.test(X,Y,paired=T,mu=0,conf.level=0.9))
##
## Paired t-test
##
## data: X and Y
## t = -2.9, df = 19, p-value = 0.009
## alternative hypothesis: true mean difference is not equal to 0
## 90 percent confidence interval:
## -7.995 -2.005
## sample estimates:
## mean difference
## -5
# are edges of 90% CI in-between lower and upper bound?
-7.995 > LOWER.BOUND # FALSE
## [1] FALSE
-2.005 < UPPER.BOUND # TRUE
## [1] TRUE
# TOST Lower Bound:
with(df0,t.test(X,Y,paired=T,mu=LOWER.BOUND,alternative="greater")) # p = 0.29
##
## Paired t-test
##
## data: X and Y
## t = 0.58, df = 19, p-value = 0.3
## alternative hypothesis: true mean difference is greater than -6
## 95 percent confidence interval:
## -7.995 Inf
## sample estimates:
## mean difference
## -5
# TOST Upper Bound:
with(df0,t.test(X,Y,paired=T,mu=UPPER.BOUND,alternative="less")) # p < 0.001
##
## Paired t-test
##
## data: X and Y
## t = -6.4, df = 19, p-value = 2e-06
## alternative hypothesis: true mean difference is less than 6
## 95 percent confidence interval:
## -Inf -2.005
## sample estimates:
## mean difference
## -5
# the lower-bound test is not significant, so do not reject equivalence H0: (true diff < LOWER.BOUND) OR (true diff > UPPER.BOUND)
# EQUIVALENCE TESTS:
library(TOSTER)
t_TOST(x=df0$X,
y=df0$Y,
paired=TRUE,
low_eqbound=LOWER.BOUND,
high_eqbound=UPPER.BOUND,
eqbound_type = "raw")
##
## Paired t-test
##
## The equivalence test was non-significant, t(19) = 0.58, p = 0.29
## The null hypothesis test was significant, t(19) = -2.89p < 0.01
## NHST: reject null significance hypothesis that the effect is equal to zero
## TOST: don't reject null equivalence hypothesis
##
## TOST Results
## t df p.value
## t-test -2.8868 19 0.009
## TOST Lower 0.5774 19 0.285
## TOST Upper -6.3509 19 < 0.001
##
## Effect Sizes
## Estimate SE C.I. Conf. Level
## Raw -5.0000 1.7321 [-7.9949, -2.0051] 0.9
## Hedges's g(z) -0.6196 0.2441 [-1.0017, -0.223] 0.9
## Note: SMD confidence intervals are an approximation. See vignette("SMD_calcs").
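## The upper-bound test is significant (p < 0.001), but the lower-bound test is not (p = 0.285; alpha = .05). Because both one-sided tests must be significant to conclude equivalence, I do NOT reject the null hypothesis that the true difference between X and Y is less than -6 OR greater than +6.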
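The logic of the two one-sided tests can also be reproduced from summary statistics alone. The following is a minimal sketch that uses the mean difference, standard error, and degrees of freedom printed in the output above. The variable names other than LOWER.BOUND and UPPER.BOUND are hypothetical.

```r
# Summary values copied from the paired t test / t_TOST output.
mean_diff <- -5
se_diff <- 1.7321
df <- 19
LOWER.BOUND <- -6
UPPER.BOUND <- 6

# One-sided test against the lower bound (H0: true diff <= LOWER.BOUND).
t_lower <- (mean_diff - LOWER.BOUND) / se_diff
p_lower <- pt(t_lower, df, lower.tail = FALSE)

# One-sided test against the upper bound (H0: true diff >= UPPER.BOUND).
t_upper <- (mean_diff - UPPER.BOUND) / se_diff
p_upper <- pt(t_upper, df, lower.tail = TRUE)

# Equivalence is concluded only if BOTH tests are significant, so the
# overall TOST p value is the larger of the two one-sided p values.
p_tost <- max(p_lower, p_upper)
print(c(t_lower = t_lower, t_upper = t_upper, p_tost = p_tost))
print(p_tost < 0.05)
```

Taking the maximum of the two p values is equivalent to checking whether the 90% confidence interval lies entirely inside the equivalence bounds.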
The data frame df3 contains the results of a study that measured performance on a cognitive test in children in grades 1 through 5. The data frame contains a factor, grade, and a numeric variable, score. Use df3 to answer questions 5-8. Unless stated otherwise, you may assume that the data are consistent with the assumptions of the F test, and that alpha is 0.05.
Plot mean score as a function of grade.
plot(x=1:5,
y=with(df3,tapply(score,grade,mean)),
type="b",
ylim=c(90,106),
xlab="Grade",
ylab="Score")
aov.01 <- aov(score~grade,data=df3)
summary(aov.01)
## Df Sum Sq Mean Sq F value Pr(>F)
## grade 4 1361 340 2.26 0.072 .
## Residuals 70 10551 151
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## The null hypothesis is that the groups were selected from populations with equal means. The alternative hypothesis is that not all of the group means are equal. Equivalently, we could say that the null hypothesis is that all of the group effects (i.e., the alphas estimated by ANOVA) are zero, and that the alternative is that they are not all equal to zero. Using an alpha of 0.05, the omnibus F test is not significant (F(4,70) = 2.26, p = 0.072), and therefore we would not reject the null hypothesis.
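As a check on how the omnibus F test is assembled, the sketch below recomputes the F ratio and its p value from the sums of squares and degrees of freedom printed in the ANOVA table. The sums of squares are rounded in that table, so small discrepancies from the printed F are expected. The variable names are hypothetical.

```r
# Values copied from the ANOVA table (sums of squares are rounded).
ss_grade <- 1361
df_grade <- 4
ss_resid <- 10551
df_resid <- 70

ms_grade <- ss_grade / df_grade  # mean square between groups
ms_resid <- ss_resid / df_resid  # mean square within groups (error)
F_value <- ms_grade / ms_resid

# The p value is the upper-tail area of the F distribution.
p_value <- pf(F_value, df_grade, df_resid, lower.tail = FALSE)
print(round(c(F = F_value, p = p_value), 3))
```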
bartlett.test(score~grade,data=df3)
##
## Bartlett test of homogeneity of variances
##
## data: score by grade
## Bartlett's K-squared = 6.6, df = 4, p-value = 0.2
# Bartlett's test was not significant (K-squared = 6.6, df = 4, p = 0.2), so we do not reject the null hypothesis of equal group variances. The homogeneity-of-variance assumption therefore appears reasonable in this case.
Evaluate the effect of grade on score using an alternative to ANOVA that does not assume that variance is constant across grades.
oneway.test(score~grade,data=df3)
##
## One-way analysis of means (not assuming equal variances)
##
## data: score and grade
## F = 2.8, num df = 4, denom df = 35, p-value = 0.04
# We reject the null hypothesis of no difference among group means (Welch F(4, 35) = 2.8, p = 0.04).
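The relationship between oneway.test and the classic ANOVA can be illustrated on simulated data. In the sketch below, the data frame sim is hypothetical, generated with unequal group variances; it is not df3. Setting var.equal=TRUE should reproduce the F statistic from aov, whereas the default applies Welch's correction.

```r
# Simulated data for illustration only; these are NOT the df3 data.
set.seed(2)
sim <- data.frame(
  grade = factor(rep(1:5, each = 15)),
  score = rnorm(75,
                mean = rep(c(95, 98, 100, 102, 104), each = 15),
                sd = rep(c(5, 8, 10, 14, 18), each = 15))
)

# Classic one-way ANOVA F statistic (assumes equal variances).
f_aov <- summary(aov(score ~ grade, data = sim))[[1]][["F value"]][1]

# oneway.test() with var.equal = TRUE gives the same classic F test.
f_equal <- oneway.test(score ~ grade, data = sim, var.equal = TRUE)

# The default (var.equal = FALSE) applies Welch's correction, which
# changes the F statistic and reduces the denominator df.
f_welch <- oneway.test(score ~ grade, data = sim)

print(c(aov = f_aov,
        equal = unname(f_equal$statistic),
        welch = unname(f_welch$statistic)))
print(f_welch$parameter)
```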