Question 1

load(url("http://pnb.mcmaster.ca/bennett/psy710/p3/tagalog.rda"))
sapply(tagalog.df,class)
##    subjID      time  strategy    recall 
##  "factor"  "factor"  "factor" "numeric"
  1. A study compared three strategies of memorizing the meaning of foreign words. Participants studied 24 Tagalog (Philippine language) words using three strategies: keyword-generated, keyword-provided, and rote learning. In the keyword-generated condition, participants generated their own keywords to help them remember each word. In the keyword-provided condition, the experimenters provided the keywords to help remember each word. In the rote learning condition, participants were simply instructed to memorize the meaning of the words. Memory for word meanings was tested 5 minutes and 2 days after study. The data are stored in tagalog.df, which contains the factors time, strategy, and subjID, and the numeric dependent variable recall. (N.B. The factor time codes the study-test intervals of 5 minutes and 2 days as short or long.)
  1. Conduct an ANOVA to evaluate the fixed effects of time and strategy on recall.
options(contrasts=c("contr.sum","contr.poly"))
mem.aov.01 <-aov(recall~time*strategy,data=tagalog.df)
anova(mem.aov.01)
## Analysis of Variance Table
## 
## Response: recall
##               Df  Sum Sq Mean Sq F value    Pr(>F)
## time           1  316.01  316.01 16.2342 0.0001372
## strategy       2 1022.26  511.13 26.2577 2.729e-09
## time:strategy  2  194.87   97.44  5.0055 0.0092168
## Residuals     72 1401.54   19.47
  1. Evaluate the simple main effect of strategy at each level of time. (You may use local error terms to evaluate the simple main effects). If a simple main effect is significant, analyze the pairwise differences among different strategies using Tukey HSD.


# simple main effect of strategy at short time
mem.condition.short.01 <- aov(recall~strategy,data=subset(tagalog.df,time=="short"))
summary(mem.condition.short.01)
##             Df Sum Sq Mean Sq F value Pr(>F)
## strategy     2  206.0   103.0   4.976 0.0124
## Residuals   36  745.2    20.7
TukeyHSD(mem.condition.short.01)
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = recall ~ strategy, data = subset(tagalog.df, time == "short"))
## 
## $strategy
##                          diff       lwr        upr     p adj
## provided-generated  0.6153846 -3.746674  4.9774428 0.9366719
## rote-generated     -4.5384615 -8.900520 -0.1764034 0.0399532
## rote-provided      -5.1538462 -9.515904 -0.7917880 0.0174975

# tukey test done with emmeans:
library(emmeans)
mem.emm.short <- emmeans(mem.condition.short.01,specs=~strategy)
pairs(mem.emm.short)
##  contrast             estimate   SE df t.ratio p.value
##  generated - provided   -0.615 1.78 36  -0.345  0.9367
##  generated - rote        4.538 1.78 36   2.543  0.0400
##  provided - rote         5.154 1.78 36   2.888  0.0175
## 
## P value adjustment: tukey method for comparing a family of 3 estimates

# simple main effect of strategy at long time
mem.condition.long.01 <- aov(recall~strategy,data=subset(tagalog.df,time=="long"))
summary(mem.condition.long.01)
##             Df Sum Sq Mean Sq F value   Pr(>F)
## strategy     2 1011.1   505.6   27.73 5.14e-08
## Residuals   36  656.3    18.2
TukeyHSD(mem.condition.long.01)
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = recall ~ strategy, data = subset(tagalog.df, time == "long"))
## 
## $strategy
##                          diff        lwr         upr     p adj
## provided-generated  -4.000000  -8.093547  0.09354728 0.0566077
## rote-generated     -12.230769 -16.324317 -8.13722195 0.0000000
## rote-provided       -8.230769 -12.324317 -4.13722195 0.0000569

# tukey test done with emmeans:
mem.emm.long <- emmeans(mem.condition.long.01,specs=~strategy)
pairs(mem.emm.long)
##  contrast             estimate   SE df t.ratio p.value
##  generated - provided     4.00 1.67 36   2.388  0.0566
##  generated - rote        12.23 1.67 36   7.303  <.0001
##  provided - rote          8.23 1.67 36   4.915  0.0001
## 
## P value adjustment: tukey method for comparing a family of 3 estimates


# similar results with emmeans and GLOBAL error term:
mem.emm <- emmeans(mem.aov.01,specs=~strategy,by="time")
pairs(mem.emm)
## time = short:
##  contrast             estimate   SE df t.ratio p.value
##  generated - provided   -0.615 1.73 72  -0.356  0.9327
##  generated - rote        4.538 1.73 72   2.623  0.0284
##  provided - rote         5.154 1.73 72   2.978  0.0109
## 
## time = long:
##  contrast             estimate   SE df t.ratio p.value
##  generated - provided    4.000 1.73 72   2.311  0.0606
##  generated - rote       12.231 1.73 72   7.068  <.0001
##  provided - rote         8.231 1.73 72   4.756  <.0001
## 
## P value adjustment: tukey method for comparing a family of 3 estimates

Question 2

load(url("http://pnb.mcmaster.ca/bennett/psy710/p3/pLearn.rda"))
summary(pLearn)
##     subject       accuracy         stimuli   day    
##  s1     :  4   Min.   :13.00   blocked :80   d0:60  
##  s2     :  4   1st Qu.:44.00   fixed   :80   d1:60  
##  s3     :  4   Median :53.00   variable:80   d2:60  
##  s4     :  4   Mean   :53.57                 d3:60  
##  s5     :  4   3rd Qu.:63.00                        
##  s6     :  4   Max.   :88.00                        
##  (Other):216
sapply(pLearnWide,class)
##   subject   stimuli        d0        d1        d2        d3 
##  "factor"  "factor" "numeric" "numeric" "numeric" "numeric"
  1. A perceptual learning experiment was conducted to examine the effect of stimulus variability on the effects of practice. Performance in a texture identification task was measured on four consecutive days. Response accuracy was measured with a 1-of-10 identification task: On each trial, an observer was shown a brief presentation of a single texture and then had to select the presented texture from a response screen that showed 10 possible textures. On each day, responses were recorded for 100 trials and the dependent measure was the percentage of correct responses. Participants were assigned randomly to one of three conditions. In the fixed condition, the same set of 10 textures were used throughout the four days. In the blocked conditions, a different set of textures was shown on each day. In the variable condition, a different set of 10 textures was used on every trial.
  1. Conduct an analysis of variance to evaluate the effects of stimuli and day on accuracy. Where necessary, correct p values for deviations from sphericity.
library(afex)
options(contrasts=c("contr.sum","contr.poly"))
pl.aov <- aov_car(accuracy~stimuli*day+Error(subject/day),data=pLearn)
summary(pl.aov)
## 
## Univariate Type III Repeated-Measures ANOVA Assuming Sphericity
## 
##             Sum Sq num Df Error SS den Df   F value  Pr(>F)
## (Intercept) 688760      1    32676     57 1201.4716 < 2e-16
## stimuli        346      2    32676     57    0.3014 0.74099
## day           5352      3     6766    171   45.0840 < 2e-16
## stimuli:day    541      6     6766    171    2.2785 0.03845
## 
## 
## Mauchly Tests for Sphericity
## 
##             Test statistic p-value
## day                0.93805 0.61385
## stimuli:day        0.93805 0.61385
## 
## 
## Greenhouse-Geisser and Huynh-Feldt Corrections
##  for Departure from Sphericity
## 
##              GG eps Pr(>F[GG])
## day         0.96239    < 2e-16
## stimuli:day 0.96239    0.04077
## 
##               HF eps   Pr(>F[HF])
## day         1.019206 1.612194e-21
## stimuli:day 1.019206 3.845046e-02
  1. Do an analysis to evaluate the linear trend \(\times\) stimuli interaction. If the interaction is significant, evaluate differences among the groups using Tukey HSD. If the interaction is not significant, determine if the average trend (i.e., ignoring stimuli) is significant.
# create trend weights:
names(pLearnWide)
## [1] "subject" "stimuli" "d0"      "d1"      "d2"      "d3"
ymat <- as.matrix(pLearnWide[,3:6])
ymat[1:2,]
##      d0 d1 d2 d3
## [1,] 37 36 42 42
## [2,] 49 50 57 57
linW <- contr.poly(n=4)[,1] # linear trend weights
# linear trend:
pLearnWide$LinTrend <- ymat %*% linW
linTrend.aov <- aov(LinTrend~stimuli,data=pLearnWide)

# the linear trend x stimuli interaction is NOT significant
# but the intercept (i.e., grand mean) IS significant
summary(linTrend.aov,intercept=T)
##             Df Sum Sq Mean Sq F value Pr(>F)
## (Intercept)  1   4748    4748 156.002 <2e-16
## stimuli      2     41      20   0.669  0.516
## Residuals   57   1735      30
  1. Do an analysis to evaluate the quadratic trend \(\times\) stimuli interaction. If the interaction is significant, evaluate differences among the groups using Tukey HSD. If the interaction is not significant, determine if the average trend (i.e., ignoring stimuli) is significant.
quadW <- contr.poly(n=4)[,2] # quad trend weights
# quadratic trend:
pLearnWide$QuadTrend <- ymat %*% quadW
quadTrend.aov <- aov(QuadTrend~stimuli,data=pLearnWide)
# the quadratic trend x stimuli interaction IS significant
summary(quadTrend.aov,intercept=T)
##             Df Sum Sq Mean Sq F value   Pr(>F)
## (Intercept)  1  567.3   567.3  13.899 0.000445
## stimuli      2  342.8   171.4   4.199 0.019903
## Residuals   57 2326.6    40.8

# Using a Tukey HSD test to evaluate group differences in the
# quadratic trend
TukeyHSD(quadTrend.aov)
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = QuadTrend ~ stimuli, data = pLearnWide)
## 
## $stimuli
##                    diff        lwr         upr     p adj
## fixed-blocked    -4.900 -9.7618105 -0.03818947 0.0478281
## variable-blocked  0.325 -4.5368105  5.18681053 0.9858389
## variable-fixed    5.225  0.3631895 10.08681053 0.0324277
# here we repeat the analyses with emmeans
library(emmeans)
pl.emm <- emmeans(pl.aov,specs="day",by="stimuli")

# trend weights
lw <- contr.poly(n=4)[,1] # linear trend weights
qw <- contr.poly(n=4)[,2] # quad trend weights

# LINEAR TREND:
lin.con <- contrast(pl.emm,
                    method=list(linear=lw))
joint_tests(lin.con) # lin trend x stimuli
##  model term df1 df2 F.ratio p.value
##  stimuli      2  57   0.669  0.5161

# test of grand mean of linear trend ignoring group:
pl.emm2 <- emmeans(pl.aov,specs="day")
contrast(pl.emm2,method=list(linear=lw))
##  contrast estimate    SE df t.ratio p.value
##  linear        8.9 0.712 57  12.490  <.0001
## 
## Results are averaged over the levels of: stimuli

# QUADRATRIC TREND
quad.con <- contrast(pl.emm,
                     method=list(quad=qw))
joint_tests(quad.con) # quad trend x stimuli
##  model term df1 df2 F.ratio p.value
##  stimuli      2  57   4.199  0.0199

# N.B. I am not sure how to do TUKEY tests here

Question 3

load(url("http://pnb.mcmaster.ca/bennett/psy710/p3/liver.rda"))
summary(liver)
##      condition  rat        sample      measure      glycogen    
##  control  :12   r1:6   s1     : 2   m1     : 1   Min.   :125.0  
##  c217     :12   r2:6   s2     : 2   m2     : 1   1st Qu.:135.8  
##  c217Sugar:12   r3:6   s3     : 2   m3     : 1   Median :141.0  
##                 r4:6   s4     : 2   m4     : 1   Mean   :142.2  
##                 r5:6   s5     : 2   m5     : 1   3rd Qu.:150.0  
##                 r6:6   s6     : 2   m6     : 1   Max.   :162.0  
##                        (Other):24   (Other):30
sapply(liver,class)
## condition       rat    sample   measure  glycogen 
##  "factor"  "factor"  "factor"  "factor" "integer"

Sokal and Rohlf (1995) reported data from an experiment designed to analyze glycogen content of rat livers. Glycogen was measured in rats in three conditions: control, compound 217, and compound 217 plus sugar. Each condition used two randomly selected rats. From each rat, the experimenters took three samples of liver, and from each sample the experimenters took two measures of glycogen. Thus, sample is nested in rat, and rat is nested in condition. The data are stored in the data frame liver, which contains the fixed factor condition, the random factors rat and sample, and the numeric dependent variable glycogen.

  1. Evaluate the statistical significance and effect size (Cohen’s f) of condition on glycogen.
library(effectsize)
library(lmerTest)
# with aov
liver.aov.01 <- aov(glycogen~condition + Error(rat/sample),data=liver)
summary(liver.aov.01) # fixed effect
## 
## Error: rat
##           Df Sum Sq Mean Sq F value Pr(>F)
## condition  2 1557.6   778.8   2.929  0.197
## Residuals  3  797.7   265.9               
## 
## Error: rat:sample
##           Df Sum Sq Mean Sq F value Pr(>F)
## Residuals 12    594    49.5               
## 
## Error: Within
##           Df Sum Sq Mean Sq F value Pr(>F)
## Residuals 18    381   21.17
cohens_f(liver.aov.01)
## # Effect Size for ANOVA (Type I)
## 
## Group | Parameter | Cohen's f (partial) |      95% CI
## -----------------------------------------------------
## rat   | condition |                1.40 | [0.00, Inf]
## 
## - One-sided CIs: upper bound fixed at [Inf].

# with lmer
liver.lme.01 <- lmer(glycogen~condition +(1|rat)+(1|sample),data=liver)
anova(liver.lme.01) # fixed effect
## Type III Analysis of Variance Table with Satterthwaite's method
##           Sum Sq Mean Sq NumDF DenDF F value Pr(>F)
## condition 123.99  61.996     2     3   2.929 0.1971
cohens_f(liver.lme.01)
## # Effect Size for ANOVA (Type III)
## 
## Parameter | Cohen's f (partial) |      95% CI
## ---------------------------------------------
## condition |                1.40 | [0.00, Inf]
## 
## - One-sided CIs: upper bound fixed at [Inf].
  1. Estimate the variance components and association strength for rat and sample.
# using aov
summary(liver.aov.01)
## 
## Error: rat
##           Df Sum Sq Mean Sq F value Pr(>F)
## condition  2 1557.6   778.8   2.929  0.197
## Residuals  3  797.7   265.9               
## 
## Error: rat:sample
##           Df Sum Sq Mean Sq F value Pr(>F)
## Residuals 12    594    49.5               
## 
## Error: Within
##           Df Sum Sq Mean Sq F value Pr(>F)
## Residuals 18    381   21.17
xtabs(~condition+rat,data=liver) # 6 observations per cell
##            rat
## condition   r1 r2 r3 r4 r5 r6
##   control    6  6  0  0  0  0
##   c217       0  0  6  6  0  0
##   c217Sugar  0  0  0  0  6  6
var.error <- 21.17
var.rat <- (265.9-49.5)/6
xtabs(~rat+sample,data=liver) #2 observations per cell
##     sample
## rat  s1 s2 s3 s4 s5 s6 s7 s8 s9 s10 s11 s12 s13 s14 s15 s16 s17 s18
##   r1  2  2  2  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0
##   r2  0  0  0  2  2  2  0  0  0   0   0   0   0   0   0   0   0   0
##   r3  0  0  0  0  0  0  2  2  2   0   0   0   0   0   0   0   0   0
##   r4  0  0  0  0  0  0  0  0  0   2   2   2   0   0   0   0   0   0
##   r5  0  0  0  0  0  0  0  0  0   0   0   0   2   2   2   0   0   0
##   r6  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   2   2   2
var.sample <- (49.5-21.17)/2
( var.components <- c(var.rat,var.sample,var.error) )
## [1] 36.06667 14.16500 21.17000
( icc <- var.components/sum(var.components))
## [1] 0.5051236 0.1983847 0.2964917

# using lmer:
print(VarCorr(liver.lme.01),comp=c("Variance","Std.Dev."))
##  Groups   Name        Variance Std.Dev.
##  sample   (Intercept) 14.167   3.7639  
##  rat      (Intercept) 36.065   6.0054  
##  Residual             21.167   4.6007
library(performance)
icc(liver.lme.01,by_group=T)
## # ICC by Group
## 
## Group  |   ICC
## --------------
## sample | 0.198
## rat    | 0.505

# with VCA
library(VCA)
liver.vca.01 <- anovaMM(glycogen~condition +(rat)+(sample),Data=liver)
print(liver.vca.01)
## 
## 
## ANOVA-Type Estimation of Mixed Model:
## --------------------------------------
## 
##  [Fixed Effects]
## 
##                int      conditionc217 conditionc217Sugar   conditioncontrol 
##         140.500000          10.500000          -5.333333           0.000000 
## 
## 
##  [Variance Components]
## 
##   Name   DF       SS         MS         VC        %Total    SD       CV[%]   
## 1 total  7.458103                       71.398148 100       8.449742 5.941225
## 2 rat    3        797.666667 265.888889 36.064815 50.512255 6.005399 4.222546
## 3 sample 12       594        49.5       14.166667 19.841784 3.763863 2.646466
## 4 error  18       381        21.166667  21.166667 29.64596  4.600725 3.234884
## 
## Mean: 142.2222 (N = 36) 
## 
## Experimental Design: unbalanced  |  Method: ANOVA
  1. Evaluate the statistical significance of the random effects rat and sample.
# with aov:
summary(liver.aov.01)
## 
## Error: rat
##           Df Sum Sq Mean Sq F value Pr(>F)
## condition  2 1557.6   778.8   2.929  0.197
## Residuals  3  797.7   265.9               
## 
## Error: rat:sample
##           Df Sum Sq Mean Sq F value Pr(>F)
## Residuals 12    594    49.5               
## 
## Error: Within
##           Df Sum Sq Mean Sq F value Pr(>F)
## Residuals 18    381   21.17
F.rat <- 265.9/49.5
p.rat <- 1-pf(F.rat,3,12)

F.sample <- 49.5/21.17
p.sample <- 1-pf(F.sample,12,18)
( F.vals <- c(F.rat,F.sample) )
## [1] 5.371717 2.338214
( p.vals <- c(p.rat,p.sample) )
## [1] 0.01410715 0.05032156

# with lmer:
ranova(liver.lme.01)
## ANOVA-like table for random-effects: Single term deletions
## 
## Model:
## glycogen ~ condition + (1 | rat) + (1 | sample)
##              npar  logLik    AIC    LRT Df Pr(>Chisq)
## <none>          6 -110.91 233.82                     
## (1 | rat)       5 -113.10 236.20 4.3802  1    0.03636
## (1 | sample)    5 -112.24 234.49 2.6698  1    0.10227
  1. What null hypotheses are being evaluated by your tests of condition, rat and sample?
## We will discuss this answer in class.

Question 4

load(url("http://pnb.mcmaster.ca/bennett/psy710/p3/trigly.rda"))
summary(trigly)
##       TGly       tech   machine
##  Min.   :107.5   t1:8   m1:8   
##  1st Qu.:135.3   t2:8   m2:8   
##  Median :143.4   t3:8   m3:8   
##  Mean   :141.2   t4:8   m4:8   
##  3rd Qu.:148.6                 
##  Max.   :156.5
xtabs(~tech+machine,trigly)
##     machine
## tech m1 m2 m3 m4
##   t1  2  2  2  2
##   t2  2  2  2  2
##   t3  2  2  2  2
##   t4  2  2  2  2
  1. A manufacturer was developing a new spectrophotometer for medical labs. A critical issue is consistency of measurements across technicians and across different machines. Four machines were randomly selected from the production process, and several technicians were selected randomly from a large population of technicians. Each machine was used by every technician to measure the triglyceride level [mg/dl] of two samples. The data are stored in the data frame trigly, which contains the random factors tech and machine, and the numeric dependent variable TGly.
  1. Evaluate the statistical significance of the main effects of tech and machine and the tech \(\times\) machine interaction.
# using aov
# machine and tech are crossed, random factors

# see Section 10.3.2 in my Chapter 10 notes
options(contrasts=c("contr.sum","contr.poly"))
tg.aov.01 <- aov(TGly ~ tech*machine,data=trigly)
summary(tg.aov.01)
##              Df Sum Sq Mean Sq F value   Pr(>F)
## tech          3 1334.5   444.8   24.86 2.91e-06
## machine       3 1647.3   549.1   30.68 7.19e-07
## tech:machine  9  786.0    87.3    4.88  0.00294
## Residuals    16  286.3    17.9
# recalculate F & p for main effects:
F.tech <- 444.8/87.3
p.tech <- 1-pf(F.tech,3,9)
c(F.tech,p.tech)
## [1] 5.09507446 0.02477665

F.machine <- 549.1/87.3
p.machine <- 1-pf(F.machine,3,9)
c(F.machine,p.machine)
## [1] 6.28980527 0.01370686
# using lmer

library(lmerTest)
tg.lmer <- lmer(TGly ~ (1|tech)+(1|machine)+(1|machine:tech),data=trigly)
ranova(tg.lmer)
## ANOVA-like table for random-effects: Single term deletions
## 
## Model:
## TGly ~ (1 | tech) + (1 | machine) + (1 | machine:tech)
##                    npar  logLik    AIC    LRT Df Pr(>Chisq)
## <none>                5 -107.52 225.04                     
## (1 | tech)            4 -109.31 226.61 3.5730  1   0.058727
## (1 | machine)         4 -109.81 227.63 4.5924  1   0.032113
## (1 | machine:tech)    4 -111.31 230.63 7.5879  1   0.005876
  1. Estimate the variance components and association strength for tech and machine and the tech \(\times\) machine interaction .
# anova variance components:
# see Section 10.3.3 of my Chapter 10 notes
a <- 4; # 4 techs
b <- 4; # 4 machines
n <- 2; # 2 measures per cell
summary(tg.aov.01)
##              Df Sum Sq Mean Sq F value   Pr(>F)
## tech          3 1334.5   444.8   24.86 2.91e-06
## machine       3 1647.3   549.1   30.68 7.19e-07
## tech:machine  9  786.0    87.3    4.88  0.00294
## Residuals    16  286.3    17.9
vc.tech <- (444.8-87.3)/(n*b)
vc.machine <- (549.1-87.3)/(n*a)
vc.TxM <- (87.3-17.9)/n
vc.Resid <- 17.9
( vc.tg <- c(vc.TxM,vc.machine,vc.tech,vc.Resid) ) # var components
## [1] 34.7000 57.7250 44.6875 17.9000
( icc.tg <- vc.tg/sum(vc.tg)) # ICCs
## [1] 0.2238529 0.3723893 0.2882832 0.1154746
# anova variance components estimated with VCA
library(VCA)
tg.vca <- anovaVCA(TGly ~ (tech)+(machine)+(machine:tech),Data=trigly)
print(tg.vca,digits=3)
## 
## 
## Result Variance Component Analysis:
## -----------------------------------
## 
##   Name         DF    SS       MS      VC      %Total SD     CV[%]
## 1 total        9.038                  155.021 100    12.451 8.819
## 2 tech         3     1334.463 444.821 44.685  28.825 6.685  4.735
## 3 machine      3     1647.278 549.093 57.719  37.233 7.597  5.381
## 4 tech:machine 9     786.035  87.337  34.721  22.398 5.892  4.174
## 5 error        16    286.325  17.895  17.895  11.544 4.23   2.996
## 
## Mean: 141.184 (N = 32) 
## 
## Experimental Design: balanced  |  Method: ANOVA
# variance components from lmer
# using lmer:
print(VarCorr(tg.lmer),comp=c("Variance","Std.Dev."))
##  Groups       Name        Variance Std.Dev.
##  machine:tech (Intercept) 34.721   5.8925  
##  machine      (Intercept) 57.718   7.5972  
##  tech         (Intercept) 44.689   6.6849  
##  Residual                 17.895   4.2303
library(performance)
icc(tg.lmer,by_group=T)
## # ICC by Group
## 
## Group        |   ICC
## --------------------
## machine:tech | 0.224
## machine      | 0.372
## tech         | 0.288

Question 5

load(url("http://pnb.mcmaster.ca/bennett/psy710/p3/leprosy.rda"))
summary(leprosy)
##  drug      baseline         after      
##  A:10   Min.   : 3.00   Min.   : 0.00  
##  B:10   1st Qu.: 7.00   1st Qu.: 2.00  
##  C:10   Median :10.50   Median : 7.00  
##         Mean   :10.73   Mean   : 7.90  
##         3rd Qu.:13.75   3rd Qu.:12.75  
##         Max.   :21.00   Max.   :23.00
  1. A study examined the effectiveness of three drugs for the treatment of leprosy. A baseline measure of the amount of leprosy bacilli was taken on all participants, who were then randomly assigned to one of three drug conditions. A second measure of leprosy bacilli was taken after the drug treatment. The data are stored in the data frame leprosy which contains the covariate baseline, the fixed factor drug, and the numeric dependent variable after.
  1. Evaluate group differences in drug with a one-way ANOVA. Calculate Cohen’s f for drug.
library(effectsize)
options(contrasts=c("contr.sum","contr.poly"))
lep.aov.01 <- aov(after ~ drug,data=leprosy)
summary(lep.aov.01)
##             Df Sum Sq Mean Sq F value Pr(>F)
## drug         2  293.6  146.80   3.983 0.0305
## Residuals   27  995.1   36.86
cohens_f(lep.aov.01)
## # Effect Size for ANOVA
## 
## Parameter | Cohen's f |      95% CI
## -----------------------------------
## drug      |      0.54 | [0.12, Inf]
## 
## - One-sided CIs: upper bound fixed at [Inf].
  1. Evaluate group differences in Movement with an ANCOVA, using Baseline as the covariate. Calculate Cohen’s f for Group.
lep.aov.02 <- aov(after ~ baseline + drug,data=leprosy)
summary(lep.aov.02)
##             Df Sum Sq Mean Sq F value   Pr(>F)
## baseline     1  802.9   802.9  50.039 1.64e-07
## drug         2   68.6    34.3   2.136    0.138
## Residuals   26  417.2    16.0
cohens_f(lep.aov.02)
## # Effect Size for ANOVA (Type I)
## 
## Parameter | Cohen's f (partial) |      95% CI
## ---------------------------------------------
## baseline  |                1.39 | [0.93, Inf]
## drug      |                0.41 | [0.00, Inf]
## 
## - One-sided CIs: upper bound fixed at [Inf].
  1. ANCOVA makes the so-called homogeneity of slopes assumption. What is that assumption? Check its validity for the leprosy data.
## We will discuss this answer in class.
# check homogeneity of slopes assumption:
lep.aov.03 <- aov(after ~ baseline + drug + baseline:drug,data=leprosy)
summary(lep.aov.03) # the interaction is not significant
##               Df Sum Sq Mean Sq F value   Pr(>F)
## baseline       1  802.9   802.9  48.473 3.37e-07
## drug           2   68.6    34.3   2.069    0.148
## baseline:drug  2   19.6     9.8   0.593    0.561
## Residuals     24  397.6    16.6
  1. Calculate the group means and the adjusted group means for drug.
with(leprosy,tapply(after,drug,mean)) # group means
##    A    B    C 
##  5.3  6.1 12.3
library(effects)
effect(term="drug",lep.aov.02) # adjusted group means
## 
##  drug effect
## drug
##         A         B         C 
##  6.714963  6.823935 10.161102
# with emmeans:
library(emmeans)
lep.emm <- emmeans(lep.aov.02,specs="drug") # adjusted group means
lep.emm
##  drug emmean   SE df lower.CL upper.CL
##  A      6.71 1.29 26     4.07     9.36
##  B      6.82 1.27 26     4.21     9.44
##  C     10.16 1.32 26     7.46    12.87
## 
## Confidence level used: 0.95
  1. Evaluate the statistical significance of all pairwise differences between adjusted group means
pairs(lep.emm)
##  contrast estimate   SE df t.ratio p.value
##  A - B      -0.109 1.80 26  -0.061  0.9980
##  A - C      -3.446 1.89 26  -1.826  0.1809
##  B - C      -3.337 1.85 26  -1.800  0.1893
## 
## P value adjustment: tukey method for comparing a family of 3 estimates

Question 6

load(url("http://pnb.mcmaster.ca/bennett/psy710/p3/jobSatisfaction.rda"))
summary(jobSatisfaction)
##        id        gender        education   satisfaction   
##  s3     : 1   male  :25   school    :17   Min.   : 4.780  
##  s4     : 1   female:28   college   :16   1st Qu.: 5.940  
##  s5     : 1               university:20   Median : 6.450  
##  s6     : 1                               Mean   : 7.058  
##  s7     : 1                               3rd Qu.: 8.700  
##  s8     : 1                               Max.   :10.000  
##  (Other):47
  1. A survey was used to assess job satisfaction of 58 randomly selected employees of a large corporation. The survey recorded the participant’s gender, level of education, and their job satisfaction score on a standard test. The data are stored in the data frame jobSatisfaction.
  1. Determine if the 2 (gender) \(\times\) 3 (education) design is balanced.
xtabs(~gender+education,jobSatisfaction)
##         education
## gender   school college university
##   male        7       8         10
##   female     10       8         10
  1. Use ANOVA to evaluate the effects of gender, education, and the gender \(\times\) education interaction on satisfaction.
jobs.aov.01 <- aov(satisfaction ~ gender*education,data=jobSatisfaction)
summary(jobs.aov.01)
##                  Df Sum Sq Mean Sq F value  Pr(>F)
## gender            1   1.58    1.58   4.954 0.03087
## education         2 106.90   53.45 167.258 < 2e-16
## gender:education  2   4.11    2.05   6.426 0.00341
## Residuals        47  15.02    0.32
# significant interaction so I'll use Type 3 SSs
library(car)
Anova(jobs.aov.01,type=3)
## Anova Table (Type III tests)
## 
## Response: satisfaction
##                   Sum Sq Df   F value    Pr(>F)
## (Intercept)      2494.64  1 7806.5853 < 2.2e-16
## gender              0.21  1    0.6620  0.419944
## education         108.45  2  169.6946 < 2.2e-16
## gender:education    4.11  2    6.4263  0.003411
## Residuals          15.02 47
  1. Analyze the simple main effect of education for each gender.
jobs.ed.male <- aov(satisfaction ~ education,data=subset(jobSatisfaction,gender=="male"))
jobs.ed.female <- aov(satisfaction ~ education,data=subset(jobSatisfaction,gender=="female"))
# using local error term:
summary(jobs.ed.male)
##             Df Sum Sq Mean Sq F value   Pr(>F)
## education    2  73.26   36.63   249.9 7.48e-16
## Residuals   22   3.22    0.15
summary(jobs.ed.female)
##             Df Sum Sq Mean Sq F value   Pr(>F)
## education    2  37.74  18.871      40 1.62e-08
## Residuals   25  11.79   0.472

# with emmeans (using global error term)
library(emmeans)
jobs.emm <- emmeans(jobs.aov.01,specs=~education, by="gender")
contrast(jobs.emm,method="pairwise",by="gender")
## gender = male:
##  contrast             estimate    SE df t.ratio p.value
##  school - college       -0.921 0.293 47  -3.148  0.0079
##  school - university    -3.909 0.279 47 -14.032  <.0001
##  college - university   -2.988 0.268 47 -11.144  <.0001
## 
## gender = female:
##  contrast             estimate    SE df t.ratio p.value
##  school - college       -0.708 0.268 47  -2.639  0.0297
##  school - university    -2.665 0.253 47 -10.542  <.0001
##  college - university   -1.957 0.268 47  -7.299  <.0001
## 
## P value adjustment: tukey method for comparing a family of 3 estimates
joint_tests(jobs.emm,by="gender")
## gender = male:
##  model term df1 df2 F.ratio p.value
##  education    2  47 114.632  <.0001
## 
## gender = female:
##  model term df1 df2 F.ratio p.value
##  education    2  47  59.053  <.0001

Question 7

load(url("http://pnb.mcmaster.ca/bennett/psy710/p3/catfood.rda"))
summary(catfood)
##        food       protein     
##  discount:10   Min.   :33.26  
##  original:10   1st Qu.:33.88  
##                Median :34.15  
##                Mean   :34.15  
##                3rd Qu.:34.53  
##                Max.   :34.73
sapply(catfood,class)
##      food   protein 
##  "factor" "numeric"
  1. A food science engineer for a pet food company tests a new, less expensive formulation of their popular cat food. The engineer wants to ensure that the protein content of the discount formulation is the same as the protein content of the original food. The engineer measures the amount of protein in 100 gram samples of both formulations of the food to test whether they are equivalent to within ±0.5 grams. The data are contained in the data frame catfood which contains the factor food and the numeric variable protein (measured in grams).
  1. Use a 2-tailed \(t\) test to determine if the original and discount cat food differed in terms of protein.
t.test(protein~food,data=catfood)
## 
##  Welch Two Sample t-test
## 
## data:  protein by food
## t = -0.46504, df = 16.331, p-value = 0.648
## alternative hypothesis: true difference in means between group discount and group original is not equal to 0
## 95 percent confidence interval:
##  -0.4890736  0.3128630
## sample estimates:
## mean in group discount mean in group original 
##               34.10874               34.19685
  1. Use an equivalence test to determine if protein levels in the two foods are equivalent (i.e., within ±0.5 grams).
# using 90% CI method:
t.test(protein~food,data=catfood,conf.level=0.90)
## 
##  Welch Two Sample t-test
## 
## data:  protein by food
## t = -0.46504, df = 16.331, p-value = 0.648
## alternative hypothesis: true difference in means between group discount and group original is not equal to 0
## 90 percent confidence interval:
##  -0.4184628  0.2422522
## sample estimates:
## mean in group discount mean in group original 
##               34.10874               34.19685
# 90% CI is within ±0.5, so two means are equivalent
# using TOST method:
library(TOSTER)
UPPER_BOUND <- 0.5
LOWER_BOUND <- -0.5
p.discount <- catfood$protein[catfood$food=="discount"]
p.original <- catfood$protein[catfood$food=="original"]
t_TOST(x=p.discount,y=p.original,low_eqbound=LOWER_BOUND,high_eqbound=UPPER_BOUND) 
## 
## Welch Two Sample t-test
## 
## The equivalence test was significant, t(16.33) = 2.17, p = 0.02
## The null hypothesis test was non-significant, t(16.33) = -0.465p = 0.65
## NHST: don't reject null significance hypothesis that the effect is equal to zero 
## TOST: reject null equivalence hypothesis
## 
## TOST Results 
##                 t    df p.value
## t-test     -0.465 16.33   0.648
## TOST Lower  2.174 16.33   0.022
## TOST Upper -3.104 16.33   0.003
## 
## Effect Sizes 
##                Estimate     SE              C.I. Conf. Level
## Raw            -0.08811 0.1895 [-0.4185, 0.2423]         0.9
## Hedges's g(av) -0.19825 0.4868 [-0.8987, 0.5082]         0.9
## Note: SMD confidence intervals are an approximation. See vignette("SMD_calcs").

# using 2 one-tailed t tests
UPPER_BOUND <- 0.5
LOWER_BOUND <- -0.5
t.test(protein~food,data=catfood,mu=LOWER_BOUND,alternative="greater") # significantly greater than LOWER_BOUND
## 
##  Welch Two Sample t-test
## 
## data:  protein by food
## t = 2.1741, df = 16.331, p-value = 0.02236
## alternative hypothesis: true difference in means between group discount and group original is greater than -0.5
## 95 percent confidence interval:
##  -0.4184628        Inf
## sample estimates:
## mean in group discount mean in group original 
##               34.10874               34.19685
t.test(protein~food,data=catfood,mu=UPPER_BOUND,alternative="less") # significantly less than UPPER_BOUND
## 
##  Welch Two Sample t-test
## 
## data:  protein by food
## t = -3.1042, df = 16.331, p-value = 0.003345
## alternative hypothesis: true difference in means between group discount and group original is less than 0.5
## 95 percent confidence interval:
##       -Inf 0.2422522
## sample estimates:
## mean in group discount mean in group original 
##               34.10874               34.19685

# reject the null hypothesis of non-equivalence
# favor of the alternative hypothesis that the means are equivalent