Initialize R with the following commands:
# Initialize R -------------------------------------
# I set the working directory to be the folder that includes this document
graphics.off() # clear all graphs
options(digits=6,width=80) # R console display parameters
options(contrasts=c("contr.sum","contr.poly") ) # set up coding scheme for factors:
op <- par(no.readonly = TRUE) # save copy of current parameters
# usually a good idea to CLEAR WORKSPACE MEMORY
# use command in RStudio "Session" menu
# load libraries into R:
library(effectsize)
library(pwr)
library(TOSTER) # optional
Load the iq data file.
load(file=url("http://pnb.mcmaster.ca/bennett/psy710/datasets/iq.rda"))
The numeric variable iq contains IQ scores from 20 individuals. Graphically inspect the data: check for outliers, extreme skew, etc.
# Check distributions:
par(mfrow=c(1,2)); # create 2 graphs on 1 page
boxplot(iq); # box plot
qqnorm(iq);qqline(iq) # qq plot
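As a numerical complement to the plots, base R's shapiro.test can quantify evidence against normality. This check is an addition to the lab code; it is shown here on simulated scores, but in practice it would be run on iq after loading the data.

```r
# Shapiro-Wilk normality test (stats package, loaded by default).
# Simulated normal scores stand in for the iq variable here.
set.seed(710)                        # arbitrary seed
x <- rnorm(20, mean = 100, sd = 15)  # stand-in for iq
shapiro.test(x)                      # a small p value indicates evidence of non-normality
```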
The data were analyzed with the following command.
t.test(iq,mu=100) # do t test
# What are the null and alternative hypotheses
# that are being evaluated by this t test?
tres <- t.test(iq,mu=100) # do t test
print(tres$alternative)
## [1] "two.sided"
Answer: The null hypothesis (H0) is that \(\mu=100\). The alternative hypothesis (H1) is \(\mu \ne 100\).
t.test(iq,mu=100)
##
## One Sample t-test
##
## data: iq
## t = -1.874, df = 19, p-value = 0.0764
## alternative hypothesis: true mean is not equal to 100
## 95 percent confidence interval:
## 91.1297 100.4900
## sample estimates:
## mean of x
## 95.8099
Answer: If my alpha is 0.05, then I do not reject H0. But I do NOT accept H0 because the failure to reject H0 could reflect low power.
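As a check on t.test, the t statistic can be computed directly from its definition, \(t = (\bar{x}-\mu_0)/(s/\sqrt{n})\), using the sample statistics reported above (mean = 95.8099, sd = 10, n = 20):

```r
# one-sample t statistic computed by hand from the summary statistics
xbar <- 95.8099  # mean(iq)
s    <- 10       # sd(iq)
n    <- 20
t.stat <- (xbar - 100) / (s / sqrt(n))      # observed t
p.val  <- 2 * pt(-abs(t.stat), df = n - 1)  # two-sided p value
round(c(t = t.stat, p = p.val), 4)          # t approx -1.874, p approx 0.076
```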
Use t.test to calculate the 95% and 90% confidence intervals for the population mean of iq. Which interval is narrower? Why?
# The t.test command lists the confidence interval with all of the other results.
# The next 2 commands list ONLY the confidence intervals:
t.test(iq,mu=100)$conf.int # default is 95%
## [1] 91.1297 100.4900
## attr(,"conf.level")
## [1] 0.95
t.test(iq,mu=100,conf.level=0.90)$conf.int # 90%
## [1] 91.9434 99.6763
## attr(,"conf.level")
## [1] 0.9
Answer: The 90% CI is narrower than the 95% CI, and because it is narrower we expect it to contain the true population mean less often than the 95% CI does. In fact, the 90% CI will (in the long run) contain the true value only 90% of the time instead of 95% of the time. A 90% CI is said to have lower coverage than a 95% CI.
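The intervals can also be computed by hand as \(\bar{x} \pm t_{crit} \times SE\), which makes the source of the difference in width explicit: the 90% interval uses a smaller critical value of t.

```r
# confidence intervals by hand from the summary statistics
xbar <- 95.8099; s <- 10; n <- 20
se   <- s / sqrt(n)
ci95 <- xbar + qt(c(0.025, 0.975), df = n - 1) * se
ci90 <- xbar + qt(c(0.050, 0.950), df = n - 1) * se
round(ci95, 2)           # approx [91.13, 100.49]
round(ci90, 2)           # approx [91.94, 99.68]
diff(ci90) < diff(ci95)  # TRUE: the 90% CI is narrower
```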
Calculate the mean and standard deviation of iq. How many standard deviations is the mean away from 100? Next, read the help page for cohens_d and then use that command to compute Cohen's d, a common measure of effect size, for iq (assuming that the true value of the population mean is 100).
( mean(iq) )
## [1] 95.8099
( sd(iq) )
## [1] 10
( ( mean(iq)-100 ) / sd(iq) )
## [1] -0.419014
# library(effectsize)
# help(package="effectsize")
# ?cohens_d
cohens_d(iq,mu=100) # need to specify the H0 mean
## Cohen's d | 95% CI
## -------------------------
## -0.42 | [-0.87, 0.04]
##
## - Deviation from a difference of 100.
# The effect size d = -0.42
Answer: Our observed mean (95.81) is 0.42 standard deviations below the hypothesized mean (\(\mu=100\)). This value is the same as that returned by cohens_d.
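A useful identity for the one-sample case (an aside, not required by the lab): Cohen's d equals the t statistic divided by \(\sqrt{n}\).

```r
# Cohen's d recovered from the t statistic: d = t / sqrt(n)
t.stat <- -1.874  # from the t.test output above
n <- 20
d <- t.stat / sqrt(n)
round(d, 2)  # -0.42, matching the cohens_d output above
```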
5a. Use power.t.test to calculate the power of our t test to reject the false null hypothesis \(\mu=100\). (You may want to read the help page for power.t.test.)
# ?power.t.test # see help page
n <- 20 # number of subjects
power.t.test(n=n,delta=(100-95),sd=10,type="one.sample",alternative="two.sided")
##
## One-sample t test power calculation
##
## n = 20
## delta = 5
## sd = 10
## sig.level = 0.05
## power = 0.564483
## alternative = two.sided
# power = 0.56
# you can get the same result using Cohen's d:
(100-95)/10 # we are assuming an effect size Cohen's d = 0.5
## [1] 0.5
# recalculate power using Cohen's d:
power.t.test(n=n,delta=0.5,sd=1,type="one.sample",alternative="two.sided")
##
## One-sample t test power calculation
##
## n = 20
## delta = 0.5
## sd = 1
## sig.level = 0.05
## power = 0.564483
## alternative = two.sided
# power is 0.56 assuming Cohen's d is 0.5
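The same power value can be obtained by hand from the noncentral t distribution. The sketch below mirrors what power.t.test computes with its default strict=FALSE, counting only upper-tail rejections (lower-tail rejections are negligible here because the assumed effect is positive):

```r
# power by hand via the noncentral t distribution
n <- 20; d <- 0.5; alpha <- 0.05
df  <- n - 1
ncp <- d * sqrt(n)                # noncentrality parameter under H1
t.crit <- qt(1 - alpha/2, df)     # two-sided critical value
power <- 1 - pt(t.crit, df, ncp)  # P(t > t.crit | H1)
round(power, 4)                   # approx 0.5645, matching power.t.test
```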
5b. Now use power.t.test to calculate the sample size that we would need to attain a power of 0.8.
# specify power=0.8 and set n to NULL to
# calculate the sample size needed for power = 0.8
power.t.test(n=NULL,power=0.8,delta=(100-95),sd=10,type="one.sample",alternative="two.sided")
##
## One-sample t test power calculation
##
## n = 33.3672
## delta = 5
## sd = 10
## sig.level = 0.05
## power = 0.8
## alternative = two.sided
# repeat with Cohen's d
power.t.test(n=NULL,power=0.8,delta=0.5,sd=1,type="one.sample",alternative="two.sided")
##
## One-sample t test power calculation
##
## n = 33.3672
## delta = 0.5
## sd = 1
## sig.level = 0.05
## power = 0.8
## alternative = two.sided
# for power = 0.8, n = 34
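A simulation provides a sanity check on the sample-size calculation. The true mean of 95 and sd of 10 are the values assumed above; the seed is arbitrary.

```r
# simulate many experiments with n = 34 and count rejections of H0: mu = 100
set.seed(1)  # arbitrary seed, for reproducibility
n.sim  <- 34
reject <- replicate(5000,
                    t.test(rnorm(n.sim, mean = 95, sd = 10), mu = 100)$p.value < 0.05)
mean(reject)  # approx 0.80
```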
Next, redo the t test with the one-tailed alternative hypothesis \(\mu < 100\).
# H0: mu >= 100; H1: mu < 100
t.test(iq,mu=100,alternative="less") # NOTE THE CONFIDENCE INTERVAL!
##
## One Sample t-test
##
## data: iq
## t = -1.874, df = 19, p-value = 0.0382
## alternative hypothesis: true mean is less than 100
## 95 percent confidence interval:
## -Inf 99.6763
## sample estimates:
## mean of x
## 95.8099
An equivalence test is used to determine if an effect falls between pre-defined lower and upper bounds. IQ tests are designed to have a population mean of 100 and a standard deviation of 15. Let’s assume that the smallest difference of interest is 7 (i.e., approximately 1/2 of a standard deviation). In other words, we will assume that any population mean between 93 and 107 is essentially equivalent to a mean of 100.
LOWER.BOUND <- 93
UPPER.BOUND <- 107
# t.test(x=iq,mu=100,alternative="two.sided") # not significantly different from 100
t.test(x=iq,mu=UPPER.BOUND,alternative="less") # significantly less than 107
##
## One Sample t-test
##
## data: iq
## t = -5.004, df = 19, p-value = 3.94e-05
## alternative hypothesis: true mean is less than 107
## 95 percent confidence interval:
## -Inf 99.6763
## sample estimates:
## mean of x
## 95.8099
t.test(x=iq,mu=LOWER.BOUND,alternative="greater") # not significantly greater than 93
##
## One Sample t-test
##
## data: iq
## t = 1.257, df = 19, p-value = 0.112
## alternative hypothesis: true mean is greater than 93
## 95 percent confidence interval:
## 91.9434 Inf
## sample estimates:
## mean of x
## 95.8099
Answer: For the equivalence test, the null hypothesis is that \(\mu \le 93\) OR \(\mu \ge 107\). Our 1-tailed \(t\) test for the upper bound of the equivalence zone was significant: we rejected the null hypothesis \(\mu \ge 107\) in favor of the alternative hypothesis \(\mu < 107\). However, the 1-tailed test on the lower bound was not significant: we failed to reject the null hypothesis \(\mu \le 93\). Because only one of the two 1-tailed tests was significant, we do not reject the null hypothesis that the population mean is not equivalent to 100. When reporting the statistical result, it is standard practice to report the results from the test with the larger \(p\) value. In this case, we would say that the equivalence test was not significant, \(t(19) = 1.257\), \(p=0.11\).
We also can perform the TOST procedure using t_TOST in the TOSTER package.
# install.packages("TOSTER")
# library(TOSTER)
# ?t_TOST
# low_eqbound=LOWER.BOUND,
# high_eqbound=UPPER.BOUND,
LOWER.BOUND <- 93
UPPER.BOUND <- 107
t_TOST(x=iq,
eqb=c(LOWER.BOUND,UPPER.BOUND),
mu=100)
##
## One Sample t-test
##
## The equivalence test was non-significant, t(19) = 1.26, p = 0.11
## The null hypothesis test was non-significant, t(19) = -1.87, p = 0.08
## NHST: don't reject null significance hypothesis that the effect is equal to 100
## TOST: don't reject null equivalence hypothesis
##
## TOST Results
## t df p.value
## t-test -1.874 19 0.076
## TOST Lower 1.257 19 0.112
## TOST Upper -5.004 19 < 0.001
##
## Effect Sizes
## Estimate SE C.I. Conf. Level
## Raw -4.1901 2.2361 [91.9434, 99.6763] 0.9
## Hedges's g -0.4022 0.2325 [-0.766, -0.0284] 0.9
## Note: SMD confidence intervals are an approximation. See vignette("SMD_calcs").
Redo the t test, setting conf.level to 0.90 (rather than the default value of 0.95). This parameter sets the width of the confidence interval returned by t.test. What is the value of the confidence interval, and how does it compare to the equivalence test performed in the previous question?
t.test(iq,mu=100,conf.level=0.90)
##
## One Sample t-test
##
## data: iq
## t = -1.874, df = 19, p-value = 0.0764
## alternative hypothesis: true mean is not equal to 100
## 90 percent confidence interval:
## 91.9434 99.6763
## sample estimates:
## mean of x
## 95.8099
Answer: The boundaries of the two-sided 90% CI equal the boundaries of the two one-sided CIs calculated in the previous question. We can use this CI to perform the equivalence test: if the 90% CI lies entirely within the equivalence region, then we reject the null hypothesis of non-equivalence in favour of the alternative hypothesis that the mean is equivalent to 100. In this case the 90% CI, [91.94, 99.68], extends below the lower equivalence bound of 93, so we do not reject the null hypothesis that the population mean is not equivalent to 100, \(p > .05\).
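The two one-sided tests above can be wrapped in a small helper function. The sketch below is our own (the function name is not part of TOSTER); calling it with the iq data and bounds 93 and 107 should reproduce the p = 0.112 result reported above.

```r
# minimal one-sample TOST built from two one-sided t tests
tost_one_sample <- function(x, low, high, alpha = 0.05) {
  p.lo <- t.test(x, mu = low,  alternative = "greater")$p.value
  p.hi <- t.test(x, mu = high, alternative = "less")$p.value
  p.tost <- max(p.lo, p.hi)  # report the larger of the two p values
  list(p.tost = p.tost, equivalent = (p.tost < alpha))
}
```

For example, tost_one_sample(iq, 93, 107) should return p.tost = 0.112 and equivalent = FALSE for these data.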
Load the vitcap data file.
load(file=url("http://pnb.mcmaster.ca/bennett/psy710/datasets/vitcap.rda"))
The data frame vitcap contains two numeric variables, control and exposed, each containing 75 values. The data are from an experiment that measured so-called vital capacity (a measure of lung volume) in liters in a group of workers exposed to cadmium and an age-matched control group. Use box plots to graphically inspect each variable for outliers, extreme skew, etc.
# look for outliers & skew:
with(vitcap,boxplot(control,exposed,names=c("control","exposed")))
# The data were analyzed with the following command.
t.test(x=vitcap$control,y=vitcap$exposed,mu=0) # 2-tailed t test is default
##
## Welch Two Sample t-test
##
## data: vitcap$control and vitcap$exposed
## t = 1.573, df = 131.9, p-value = 0.118
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -0.0621838 0.5452014
## sample estimates:
## mean of x mean of y
## 5.04252 4.80101
UPPER.BOUND <- 0.5
LOWER.BOUND <- -0.5
# is the difference (x-y) less than the upper bound?
t.test(x=vitcap$control,y=vitcap$exposed,mu=UPPER.BOUND,alternative="less")
##
## Welch Two Sample t-test
##
## data: vitcap$control and vitcap$exposed
## t = -1.684, df = 131.9, p-value = 0.0473
## alternative hypothesis: true difference in means is less than 0.5
## 95 percent confidence interval:
## -Inf 0.495823
## sample estimates:
## mean of x mean of y
## 5.04252 4.80101
# is the difference (x-y) greater than the lower bound?
t.test(x=vitcap$control,y=vitcap$exposed,mu=LOWER.BOUND,alternative="greater")
##
## Welch Two Sample t-test
##
## data: vitcap$control and vitcap$exposed
## t = 4.83, df = 131.9, p-value = 1.86e-06
## alternative hypothesis: true difference in means is greater than -0.5
## 95 percent confidence interval:
## -0.0128058 Inf
## sample estimates:
## mean of x mean of y
## 5.04252 4.80101
# alternative method with 90% CI
t.test(x=vitcap$control,y=vitcap$exposed,mu=0,conf.level=0.90) # do t test
##
## Welch Two Sample t-test
##
## data: vitcap$control and vitcap$exposed
## t = 1.573, df = 131.9, p-value = 0.118
## alternative hypothesis: true difference in means is not equal to 0
## 90 percent confidence interval:
## -0.0128058 0.4958233
## sample estimates:
## mean of x mean of y
## 5.04252 4.80101
Answer: Using the TOST procedure, I used two 1-tailed tests to i) reject the null hypothesis \(\mu_d \le -0.5\) in favor of the alternative \(\mu_d > -0.5\); and ii) reject the null hypothesis \(\mu_d \ge 0.5\) in favor of the alternative \(\mu_d < 0.5\). Because both tests were significant, we reject the hypothesis of non-equivalence, \(t(131.9)=-1.684\), \(p=0.047\), in favor of the alternative that the means are equivalent, \(-0.5 < \mu_d < 0.5\). The same result is obtained with the second method: the boundaries of the two-tailed 90% CI, [-0.0128, 0.4958], fall within the equivalence zone, \(\pm 0.5\), so we reject the hypothesis of non-equivalence, \(p<.05\).
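The CI-based check can also be written as a single logical test. The sketch below uses simulated stand-in samples because it is meant only to illustrate the pattern; in the lab itself, control and exposed would be vitcap$control and vitcap$exposed.

```r
# equivalence check via the two-sided 90% CI
set.seed(710)                               # arbitrary seed
control <- rnorm(75, mean = 5.0, sd = 0.9)  # stand-in for vitcap$control
exposed <- rnorm(75, mean = 4.8, sd = 0.9)  # stand-in for vitcap$exposed
ci90 <- t.test(control, exposed, conf.level = 0.90)$conf.int
inside <- all(ci90 > -0.5 & ci90 < 0.5)     # TRUE -> reject non-equivalence at alpha = 0.05
ci90
inside
```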