Pulp Brightness Study

The first set of data comes from an experiment that measured how the brightness of paper pulp varied across shift operators. The data frame pulp (in the faraway library) contains five measurements for each of four operators.

# install.packages("faraway")
library(faraway)
data(pulp)
# ?pulp # read about the data frame
summary(pulp)
##      bright      operator
##  Min.   :59.80   a:5     
##  1st Qu.:60.00   b:5     
##  Median :60.50   c:5     
##  Mean   :60.40   d:5     
##  3rd Qu.:60.73           
##  Max.   :61.00
sapply(pulp,class)
##    bright  operator 
## "numeric"  "factor"
xtabs(~operator,pulp)
## operator
## a b c d 
## 5 5 5 5

aov

The first ANOVA treats operator as a fixed factor. For this one-way design, the random effect of operator is evaluated by dividing the between-group mean square by the within-group mean square. Therefore the \(F\) and p values in the table are correct, and we would reject the null hypothesis \(\sigma^2_{\alpha} = 0\) in favor of the alternative, \(\sigma^2_{\alpha} > 0\).

# treat operator as fixed
options(contrasts=c("contr.sum","contr.poly"))
pulp.aov.00 <- aov(bright~1+operator,data=pulp)
summary(pulp.aov.00)
##             Df Sum Sq Mean Sq F value Pr(>F)
## operator     3   1.34  0.4467   4.204 0.0226
## Residuals   16   1.70  0.1062

In the next ANOVA, we use Error to explicitly encode operator is as a random factor. The ANOVA table lists the two random components but does not provide an \(F\) test for operator.

pulp.aov.01 <- aov(bright~1+ Error(operator),data=pulp)
summary(pulp.aov.01)
## 
## Error: operator
##           Df Sum Sq Mean Sq F value Pr(>F)
## Residuals  3   1.34  0.4467               
## 
## Error: Within
##           Df Sum Sq Mean Sq F value Pr(>F)
## Residuals 16    1.7  0.1062
(F.operator <- 0.4467/0.1062)
## [1] 4.206215
(p.operator <- 1-pf(F.operator,df1=3,df2=16))
## [1] 0.02256817

The ANOVA estimate of the variance component and association strength for operator can be extracted from the values in the ANOVA table or with the anovaMM command.

( op.vc <- (0.4467-0.1062)/5 ) # variance component
## [1] 0.0681
( op.icc <- (op.vc)/(op.vc + 0.1065) ) # intraclass correlation
## [1] 0.3900344
library(VCA)
pulp.vca <- anovaMM(bright ~ 1 + (operator),Data=pulp)
print(pulp.vca,digits=3)
## 
## 
## Result Variance Component Analysis:
## -----------------------------------
## 
##   Name     DF    SS   MS    VC    %Total SD    CV[%]
## 1 total    9.767            0.174 100    0.418 0.691
## 2 operator 3     1.34 0.447 0.068 39.054 0.261 0.432
## 3 error    16    1.7  0.106 0.106 60.946 0.326 0.54 
## 
## Mean: 60.4 (N = 20) 
## 
## Experimental Design: balanced  |  Method: ANOVA

lmer in lme4

Next, we use lmer to fit a mixed model to the data. An \(F\) test for the fixed effects is provided by anova, and the random effects are evaluated with chi-square tests by ranova.

library(lmerTest)
pulp.lme.01 <- lmer(bright~1+(1|operator),data=pulp)
# summary(pulp.lme.01)
anova(pulp.lme.01)
## Type III Analysis of Variance Table with Satterthwaite's method
##      Sum Sq Mean Sq NumDF DenDF F value Pr(>F)
ranova(pulp.lme.01)
## ANOVA-like table for random-effects: Single term deletions
## 
## Model:
## bright ~ (1 | operator)
##                npar   logLik    AIC    LRT Df Pr(>Chisq)
## <none>            3  -9.3131 24.626                     
## (1 | operator)    2 -11.0482 26.096 3.4701  1    0.06249

Notice that the chi square test for operator was not significant, and therefore we would not reject the null hypothesis \(\sigma^2_\mathit{op}=0\). However, for several reasons the chi square tests reported by ranova tend to be conservative (Bates et al. 2015). Faraway (2006) describes alternative ways of evaluating the null hypothesis that the variance of the random effect is zero.

The variance component and intraclass correlation extracted from the lmer fit are the same as the ANOVA estimates.

pulp.vc.lme <- VarCorr(pulp.lme.01)
print(pulp.vc.lme,comp=c("Variance","Std.Dev."))
##  Groups   Name        Variance Std.Dev.
##  operator (Intercept) 0.068083 0.26093 
##  Residual             0.106250 0.32596
library(performance)
icc(pulp.lme.01) # intraclass correlation
## # Intraclass Correlation Coefficient
## 
##     Adjusted ICC: 0.391
##   Unadjusted ICC: 0.391

lme in nlme

Finally, we fit the data with the lme function in the nlme package.

library(nlme)
pulp.nlme.01 <- lme(bright~1, data=pulp, random=~1|operator)
anova(pulp.nlme.01)
##             numDF denDF  F-value p-value
## (Intercept)     1    16 163349.8  <.0001
VarCorr(pulp.nlme.01)
## operator = pdLogChol(1) 
##             Variance   StdDev   
## (Intercept) 0.06808373 0.2609286
## Residual    0.10624992 0.3259600

Cognitive development

In this section I will analyze the data presented in Table 11.5 in Maxwell and Delaney (2004). The data are from a fictitious experiment that measured cognitive ability in 12 children at 30, 36, 42, and 48 months of age. The data file mw11_5.rda contains the data frames mw115 and mw115L, but only mw115L is used here.

load(url("http://pnb.mcmaster.ca/bennett/psy710/datasets/mw11_5.rda") )
# summary(mw115)
summary(mw115L)
##       subj     age         score       
##  s1     : 4   a30:12   Min.   : 84.00  
##  s2     : 4   a36:12   1st Qu.: 98.25  
##  s3     : 4   a42:12   Median :107.00  
##  s4     : 4   a48:12   Mean   :108.00  
##  s5     : 4            3rd Qu.:117.25  
##  s6     : 4            Max.   :133.00  
##  (Other):24

aov

For this design, age is a fixed variable and subjects is a random variable. The two factors are crossed, and therefore the interaction between subjects and age also is a random factor. The following command uses Error to identify subj as a random effect, and that age is a within-subjects factor.

xtabs(~age+subj,mw115L)
##      subj
## age   s1 s2 s3 s4 s5 s6 s7 s8 s9 s10 s11 s12
##   a30  1  1  1  1  1  1  1  1  1   1   1   1
##   a36  1  1  1  1  1  1  1  1  1   1   1   1
##   a42  1  1  1  1  1  1  1  1  1   1   1   1
##   a48  1  1  1  1  1  1  1  1  1   1   1   1
options(contrasts=c("contr.sum","contr.poly"))
cog.aov.01 <- aov(score~age+Error(subj/age),data=mw115L)
summary(cog.aov.01)
## 
## Error: subj
##           Df Sum Sq Mean Sq F value Pr(>F)
## Residuals 11   6624   602.2               
## 
## Error: subj:age
##           Df Sum Sq Mean Sq F value Pr(>F)
## age        3    552  184.00   3.027 0.0432
## Residuals 33   2006   60.79

The experiment contains only one measure for each subject at each age, and therefore there is no way to distinguish interaction effects from error. In other words, the subject x age interaction is confounded with error, and there is no way to evaluate the null hypothesis that the interaction effects are zero. Nevertheless, the interaction shows up in the ANOVA summary table as the error term that is used to evaluate the fixed, within-subject effect of age. There is no unbiased \(F\) test for the effect of subjects.

afex

In the previous ANOVA, the \(p\) value for age was calculated assuming that variance-covariance matrix of the within-subject effect is spherical. If the sphericity assumption is not valid, then the \(p\) value will be too small. The next code uses the aov_car command in the afex package. The Mauchly test is significant, which suggests that the sphericity assumption is not valid for these data. The last part of the output presents the adjusted \(p\) value using the Greenhouse-Geisser (\(\hat{\epsilon}\)) and Huynh-Feldt (\(\tilde{\epsilon}\)) estimates of sphericity.

library(afex)
cog.aov.02 <- aov_car(score~age+Error(subj/age),data=mw115L)
# identical results produced by following model:
# cog.aov.03 <- aov_car(score~age+Error(subj/age),data=mw115L)
check_sphericity(cog.aov.02) # separate Mauchly test
## Warning: Sphericity violated for: 
##  -  (p = 0.018).
summary(cog.aov.02) # includes Mauchly test
## 
## Univariate Type III Repeated-Measures ANOVA Assuming Sphericity
## 
##             Sum Sq num Df Error SS den Df  F value    Pr(>F)
## (Intercept) 559872      1     6624     11 929.7391 5.586e-12
## age            552      3     2006     33   3.0269   0.04322
## 
## 
## Mauchly Tests for Sphericity
## 
##     Test statistic  p-value
## age        0.24265 0.017718
## 
## 
## Greenhouse-Geisser and Huynh-Feldt Corrections
##  for Departure from Sphericity
## 
##      GG eps Pr(>F[GG])
## age 0.60954    0.07479
## 
##        HF eps Pr(>F[HF])
## age 0.7248502 0.06353773

lmer in lme4

Here we analyze the data using lmer. In the following model, each level of the random grouping variable subj is allowed to have its own random intercept. The fixed effect of age is significant, as is the random effect of subjects.

library(lmerTest)
cog.lmer.01 <- lmer(score~age+(1|subj),data=mw115L)
anova(cog.lmer.01) # assumes sphericity
## Type III Analysis of Variance Table with Satterthwaite's method
##     Sum Sq Mean Sq NumDF DenDF F value  Pr(>F)
## age    552     184     3    33  3.0269 0.04322
ranova(cog.lmer.01)
## ANOVA-like table for random-effects: Single term deletions
## 
## Model:
## score ~ age + (1 | subj)
##            npar  logLik    AIC    LRT Df Pr(>Chisq)
## <none>        6 -171.76 355.53                     
## (1 | subj)    5 -184.92 379.85 26.318  1  2.896e-07

The \(F\) test for age assumes that errors at each level of the within-subject variable are independent, identically distributed random variables with mean zero and constant variance. This assumption is a special form of compound symmetry, where the variances are constant across levels and the covariances (or correlations) are zero. In other words, the \(F\) test for age assumes the variance-covariance satisfies the sphericity constraint. One way of evaluating the effect of age that does not assume sphericity is to remove age from the model and then use a chi square test to evaluate the change in the goodness of fit. The \(p\) value is significant, suggesting that age should not be dropped from the model.

cog.lmer.02 <- lmer(score~1+(1|subj),data=mw115L) # remove age
anova(cog.lmer.02,cog.lmer.01) # evaluate change in deviance
## Data: mw115L
## Models:
## cog.lmer.02: score ~ 1 + (1 | subj)
## cog.lmer.01: score ~ age + (1 | subj)
##             npar    AIC    BIC  logLik deviance Chisq Df Pr(>Chisq)
## cog.lmer.02    3 371.47 377.08 -182.73   365.47                    
## cog.lmer.01    6 368.71 379.94 -178.36   356.71 8.751  3    0.03279

lme in nlme

Another way to evaluate the effect of age is to construct a model that explicitly includes a particular pattern of association between the levels the within subject factor. The following chunk of code uses nlme to construct three mixed models. The first model assumes the errors at each level of age are independent, the second model assumes compound symmetry (i.e., the levels are correlated and that the correlation is constant), and the third model imposes no constraints on the covariances/correlations (though it still assumes the variance is the same at each level).

library(nlme)
# assume independence:
cog.nlme.00 <- lme(score~age,data=mw115L,
                random=~1|subj)
# assume compound symmetry:
cog.nlme.01 <- lme(score~age,data=mw115L,
                random=~1|subj,
                correlation=corCompSymm(value=0.5,form=~1|subj))
# no constraints on between-level correlations:
cog.nlme.02 <- lme(score~age,data=mw115L,
                random=~1|subj,
                correlation=corSymm(value=c(.5,.5,.5,.5,.5,.5),form=~1|subj))

The \(p\) for the age \(F\) test in the first two models are identical. This makes sense because the assumption of independent within-subject errors made in the first model is, as far as the \(F\) test concerned, equivalent to the assumption of compound symmetry in the second model. The results of the \(F\) test differ slightly in the third model, which uses a more complex variance-covariance matrix and provides a significantly better fit to the data than the other two models.

anova(cog.nlme.00) # independent
##             numDF denDF  F-value p-value
## (Intercept)     1    33 929.7391  <.0001
## age             3    33   3.0269  0.0432
anova(cog.nlme.01) # compound symmetry
##             numDF denDF  F-value p-value
## (Intercept)     1    33 929.7364  <.0001
## age             3    33   3.0269  0.0432
anova(cog.nlme.02) # no constraints
##             numDF denDF  F-value p-value
## (Intercept)     1    33 916.3921  <.0001
## age             3    33   2.6743  0.0633
anova(cog.nlme.00,cog.nlme.01,cog.nlme.02) # 3rd model fits better
##             Model df      AIC      BIC    logLik   Test  L.Ratio p-value
## cog.nlme.00     1  6 355.5288 366.2340 -171.7644                        
## cog.nlme.01     2  7 357.5288 370.0182 -171.7644 1 vs 2  0.00000  1.0000
## cog.nlme.02     3 12 352.6203 374.0306 -164.3102 2 vs 3 14.90853  0.0108
summary(cog.nlme.00) # independent
## Linear mixed-effects model fit by REML
##   Data: mw115L 
##        AIC     BIC    logLik
##   355.5288 366.234 -171.7644
## 
## Random effects:
##  Formula: ~1 | subj
##         (Intercept) Residual
## StdDev:    11.63394 7.796658
## 
## Fixed effects:  score ~ age 
##             Value Std.Error DF   t-value p-value
## (Intercept)   108  3.541956 33 30.491624  0.0000
## age1           -5  1.949165 33 -2.565202  0.0150
## age2           -1  1.949165 33 -0.513040  0.6113
## age3            2  1.949165 33  1.026081  0.3123
##  Correlation: 
##      (Intr) age1   age2  
## age1  0.000              
## age2  0.000 -0.333       
## age3  0.000 -0.333 -0.333
## 
## Standardized Within-Group Residuals:
##         Min          Q1         Med          Q3         Max 
## -1.88627256 -0.44709016 -0.09063144  0.53923837  1.70500967 
## 
## Number of Observations: 48
## Number of Groups: 12
summary(cog.nlme.01) # compound symmetry
## Linear mixed-effects model fit by REML
##   Data: mw115L 
##        AIC      BIC    logLik
##   357.5288 370.0182 -171.7644
## 
## Random effects:
##  Formula: ~1 | subj
##         (Intercept) Residual
## StdDev:    7.217929 12.00159
## 
## Correlation Structure: Compound symmetry
##  Formula: ~1 | subj 
##  Parameter estimate(s):
##       Rho 
## 0.5779745 
## Fixed effects:  score ~ age 
##             Value Std.Error DF   t-value p-value
## (Intercept)   108  3.541962 33 30.491579  0.0000
## age1           -5  1.949164 33 -2.565203  0.0150
## age2           -1  1.949164 33 -0.513041  0.6113
## age3            2  1.949164 33  1.026081  0.3123
##  Correlation: 
##      (Intr) age1   age2  
## age1  0.000              
## age2  0.000 -0.333       
## age3  0.000 -0.333 -0.333
## 
## Standardized Within-Group Residuals:
##         Min          Q1         Med          Q3         Max 
## -1.59922897 -0.45106373 -0.02883482  0.61209065  1.61851718 
## 
## Number of Observations: 48
## Number of Groups: 12
summary(cog.nlme.02) # no constraints
## Linear mixed-effects model fit by REML
##   Data: mw115L 
##        AIC      BIC    logLik
##   352.6203 374.0306 -164.3102
## 
## Random effects:
##  Formula: ~1 | subj
##         (Intercept) Residual
## StdDev:    10.53035 9.956904
## 
## Correlation Structure: General
##  Formula: ~1 | subj 
##  Parameter estimate(s):
##  Correlation: 
##   1     2     3    
## 2 0.601            
## 3 0.391 0.579      
## 4 0.226 0.050 0.718
## Fixed effects:  score ~ age 
##             Value Std.Error DF   t-value p-value
## (Intercept)   108  3.735794 33 28.909521  0.0000
## age1           -5  1.952640 33 -2.560636  0.0152
## age2           -1  1.939977 33 -0.515470  0.6097
## age3            2  1.367762 33  1.462243  0.1531
##  Correlation: 
##      (Intr) age1   age2  
## age1 -0.018              
## age2 -0.015  0.131       
## age3  0.164 -0.820 -0.248
## 
## Standardized Within-Group Residuals:
##        Min         Q1        Med         Q3        Max 
## -1.5286280 -0.5366246 -0.1104848  0.6271287  1.5012144 
## 
## Number of Observations: 48
## Number of Groups: 12

variance components

Next we estimate the variance components. The anova estimates can be calculated with anovaMM.

# anova variance components
library(VCA)
cog.aov.vca <- anovaMM(score~age+(subj),Data=mw115L)
print(cog.aov.vca,digits=3)
## 
## 
## ANOVA-Type Estimation of Mixed Model:
## --------------------------------------
## 
##  [Fixed Effects]
## 
##    int agea30 agea36 agea42 agea48 
##    112     -9     -5     -2      0 
## 
## 
##  [Variance Components]
## 
##   Name  DF     SS   MS      VC      %Total SD     CV[%] 
## 1 total 18.117              196.136 100    14.005 12.967
## 2 subj  11     6624 602.182 135.348 69.007 11.634 10.772
## 3 error 33     2006 60.788  60.788  30.993 7.797  7.219 
## 
## Mean: 108 (N = 48) 
## 
## Experimental Design: balanced  |  Method: ANOVA

The estimates from lmer and lme are calculated using VarCorr. Note that the variance components extracted from the two models that assume independent errors across the levels of age are equal to the anova estimates.

# lmer 
cog.lmer.vca <- VarCorr(cog.lmer.01) # independence
print(cog.lmer.vca,comp=c("Variance","Std.Dev."))
##  Groups   Name        Variance Std.Dev.
##  subj     (Intercept) 135.348  11.6339 
##  Residual              60.788   7.7967
# lme
VarCorr(cog.nlme.00) # independence
## subj = pdLogChol(1) 
##             Variance  StdDev   
## (Intercept) 135.34848 11.633937
## Residual     60.78788  7.796658
VarCorr(cog.nlme.01) # compound symmetry
## subj = pdLogChol(1) 
##             Variance  StdDev   
## (Intercept)  52.09849  7.217929
## Residual    144.03827 12.001595
VarCorr(cog.nlme.02) # no constraints
## subj = pdLogChol(1) 
##             Variance  StdDev   
## (Intercept) 110.88829 10.530351
## Residual     99.13995  9.956904

checking residuals

The models assume that the errors/residuals and the random effects of age are distributed normally. We do not have enough levels of age to test the latter assumption, but we can check the within-subject residuals. The residuals look reasonably normal, and there are no obvious outliers (which can strongly influence the estimates of variance components).

shapiro.test(residuals(cog.aov.02))
## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(cog.aov.02)
## W = 0.96965, p-value = 0.2455
shapiro.test(residuals(cog.lmer.01))
## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(cog.lmer.01)
## W = 0.98053, p-value = 0.6008
shapiro.test(residuals(cog.nlme.02))
## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(cog.nlme.02)
## W = 0.97361, p-value = 0.3478
par(mfrow=c(1,3))
qqnorm(residuals(cog.aov.02),main="aov");qqline(residuals(cog.aov.02))
qqnorm(residuals(cog.lmer.01),main="lmer/lme4");qqline(residuals(cog.lmer.01))
qqnorm(residuals(cog.nlme.02),main="lme/nlme");qqline(residuals(cog.nlme.02))

linear contrasts

In this section we illustrate how to perform linear contrasts on the levels of the fixed factor, age.

wLinear <- contr.poly(n=4)[,1]
y.mat <- data.matrix(mw115[,1:4])
linTrends <-  y.mat %*% wLinear
# uses local error estimate
t.test(linTrends)
## 
##  One Sample t-test
## 
## data:  linTrends
## t = 2.2414, df = 11, p-value = 0.04659
## alternative hypothesis: true mean is not equal to 0
## 95 percent confidence interval:
##   0.1208294 13.2955785
## sample estimates:
## mean of x 
##  6.708204
summary(aov(linTrends~1),intercept=T)
##             Df Sum Sq Mean Sq F value Pr(>F)
## (Intercept)  1    540   540.0   5.024 0.0466
## Residuals   11   1182   107.5
# uses global error estimate
summary(cog.aov.01)
## 
## Error: subj
##           Df Sum Sq Mean Sq F value Pr(>F)
## Residuals 11   6624   602.2               
## 
## Error: subj:age
##           Df Sum Sq Mean Sq F value Pr(>F)
## age        3    552  184.00   3.027 0.0432
## Residuals 33   2006   60.79
MS.err <- 60.79
df.err <- 33
( F.global <- 540/MS.err )
## [1] 8.88304
1-pf(F.global,1,df.err)
## [1] 0.005370015
library(emmeans)
cog.emm <- emmeans(cog.aov.01,specs = "age")
# uses global estimate of error:
contrast(cog.emm,method=list(linear=wLinear))
##  contrast estimate   SE df t.ratio p.value
##  linear       6.71 2.25 33   2.980  0.0054
library(emmeans)
cog.emm <- emmeans(cog.lmer.01,specs = "age")
contrast(cog.emm,method=list(linear=wLinear))
##  contrast estimate   SE df t.ratio p.value
##  linear       6.71 2.25 33   2.980  0.0054
## 
## Degrees-of-freedom method: kenward-roger

unbalanced data

This section illustrates one significant advantage that lme and nlme, which use maximum likelihood methods to estimate parameters of the model, have over anova methods. Specifically, lme and nlme work on unbalanced data whereas anova either fails entirely or removes cases with missing data.

mw2 <- mw115L[c(-2,-8,-18),] # make unbalanced data
VCA::isBalanced(score~age+subj,mw115L)
## [1] TRUE
VCA::isBalanced(score~age+subj,mw2)
## [1] FALSE
# following command does NOT work:
cog.aov.11 <- aov(score~age+Error(subj/age),data=mw2)
## Warning in aov(score ~ age + Error(subj/age), data = mw2): Error() model is
## singular
# this command removes subjects with missing data:
cog.aov.12 <- aov_car(score~age+Error(subj/age),data=mw2)
## Warning: Missing values for 3 ID(s), which were removed before analysis:
## s2, s6, s8
## Below the first few rows (in wide format) of the removed cases with missing data.
##     subj a30 a36 a42 a48
## # 2   s2  NA 117 127 133
## # 6   s6 110  NA  96  91
## # 8   s8  NA  84 101 113
anova(cog.aov.12)
## Anova Table (Type 3 tests)
## 
## Response: score
##     num Df den Df    MSE      F      ges  Pr(>F)
## age  1.844 14.752 64.746 2.9725 0.053271 0.08548
# lmer & lme provide reasonable fits
cog.lmer.11 <- lmer(score~age+(1|subj),data=mw2)
cog.nlme.12 <- lme(score~age,data=mw115L,
                random=~1|subj,
                correlation=corSymm(value=c(.3,.3,.3,.3,.3,.3),form=~1|subj))
anova(cog.lmer.11)
## Type III Analysis of Variance Table with Satterthwaite's method
##     Sum Sq Mean Sq NumDF  DenDF F value Pr(>F)
## age 358.82  119.61     3 30.156  2.1094 0.1198
anova(cog.nlme.12)
##             numDF denDF  F-value p-value
## (Intercept)     1    33 916.3946  <.0001
## age             3    33   2.6743  0.0633

Sleep Study (continuous within-subject variable)

In this section we will analyze the data from an experiment measured the effects of sleep deprivation on reaction time (Belenky et al. 2003). Reaction time was measured on ten consecutive days in each of 18 subjects. Days 0-1 were practice sessions and day 2 was considered a baseline measurement. Sleep deprivation started after day 2. The data are in the sleepstudy data frame in the lmerTest package. Because sleep deprivation started after day 2, I am going to subtract 2 from the Days variable so that zero corresponds to baseline, and then extract data from all days \(\ge 0\).

library(lmerTest)
data(sleepstudy)
sleepstudy$Days <- sleepstudy$Days - 2; # zero is baseline
sleepstudy <- subset(sleepstudy,Days >= 0)
sapply(sleepstudy,class)
##  Reaction      Days   Subject 
## "numeric" "numeric"  "factor"
xtabs(~Days+Subject,sleepstudy) # crossed & balanced
##     Subject
## Days 308 309 310 330 331 332 333 334 335 337 349 350 351 352 369 370 371 372
##    0   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1
##    1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1
##    2   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1
##    3   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1
##    4   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1
##    5   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1
##    6   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1
##    7   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1

The data are shown in Figure 1. Reaction time is (to a first approximation) a linear function of day for most subjects, with subject 332 being a notable exception.

library(lattice)
print(xyplot(Reaction ~ Days | Subject, sleepstudy, aspect = "xy",
                    layout = c(9,2), type = c("g", "p", "r"),
                    index.cond = function(x,y) coef(lm(y ~ x))[1],
                    xlab = "Days of sleep deprivation",
                    ylab = "Average reaction time (ms)"))

Examination of the parameters of the best-fitting lines suggest that there is a lot of between-subject variation in the intercept and slope, and that there there is no obvious relation between the two coefficients.

# fit lines to each subject
sleep.lm.list <- lme4::lmList(Reaction ~ Days | Subject, sleepstudy)
# plot the intercepts & slopes
print( plot(confint(sleep.lm.list,pooled=TRUE),order=1) )
Fig 2. 95% confidence intervals.

Fig 2. 95% confidence intervals.

ANOVA

First we will analyze the data with aov, treating Days as a numeric, repeated variable. In this case the within-subject variable is numeric, not a factor, and therefore has a single degree of freedom. Note that this means that the compound symmetry assumption is valid, and there is no need to adjust the \(F\) and \(p\) values. The following model treats Subject and the Subject by Days interaction as random, which corresponds to allowing the intercept and slope of the best-fitting line to vary across subjects. The ANOVA assumes that the two random effects are independent.

# The two ANOVAs are equivalent
# sleep.aov.00 <- aov(Reaction~Days +Error(Subject/Days),data=sleepstudy)
# summary(sleep.aov.00)
sleep.aov.01 <- aov(Reaction~Days +Error(Subject + Subject:Days),data=sleepstudy)
summary(sleep.aov.01)
## 
## Error: Subject
##           Df Sum Sq Mean Sq F value Pr(>F)
## Residuals 17 253102   14888               
## 
## Error: Subject:Days
##           Df Sum Sq Mean Sq F value   Pr(>F)
## Days       1  98861   98861    38.4 9.75e-06
## Residuals 17  43761    2574                 
## 
## Error: Within
##            Df Sum Sq Mean Sq F value Pr(>F)
## Residuals 108  70373   651.6

Next, we compute the ANOVA estimates of the variance components. The variance component for Subject, which corresponds to the intercepts of the regression lines, is nearly twice the within-subject variance, and the interaction between subjects and days accounts for only 2.3% of the total variance. These results are consistent with Figure 2.

library(VCA)
sleep.vca <- anovaMM(Reaction~Days + (Subject)+ (Subject:Days) ,Data=sleepstudy)
print(sleep.vca,digits=3)
## 
## 
## ANOVA-Type Estimation of Mixed Model:
## --------------------------------------
## 
##  [Fixed Effects]
## 
##     int    Days 
## 267.967  11.435 
## 
## 
##  [Variance Components]
## 
##   Name         DF     SS         MS       VC       %Total SD     CV[%] 
## 1 total        15.504                     1916.21  100    43.775 14.213
## 2 Subject      17     253102.113 14888.36 1218.835 63.607 34.912 11.335
## 3 Days:Subject 17     43761.449  2574.203 45.776   2.389  6.766  2.197 
## 4 error        108    70372.608  651.598  651.598  34.005 25.526 8.288 
## 
## Mean: 307.991 (N = 144) 
## 
## Experimental Design: unbalanced  |  Method: ANOVA

LMER

Next the sleepdata are analyzed with lmer. The the following three models differ in terms of the complexity of the random effects. The simplest model, sleep.lme.00, includes a random intercept for each subject. The most complex model, sleep.lme.02, includes a random intercept and slope for each subject, and the two random effects are allowed to be correlated. Model sleep.lme.01 includes uncorrelated intercepts and slopes. The goodness-of-fit of the three models can be compared with the anova command. The difference between sleep.lme.00 and sleep.lme.01 is significant, which suggests that including a random slope coefficient improves the fit significantly. The difference between sleep.lme.01 and sleep.lme.02 is not significant, which suggests that adding a correlation between the random effects does not improve the fit significantly. We might be tempted to use the simpler model (i.e., sleep.lme.01). Although models that include uncorrelated random effects are common, Bates et al. (2015) have noted that they have a subtle limitation: such models are not invariant to additive shifts of the continuous predictor variable (e.g., Days). In other words, treating the baseline day as day 2, as was the case in the original data set, would alter the estimation of the correlation as well as the fit and predictions of the model. Therefore, I will include the correlation and use sleep.lme.02.

library(lmerTest)
sleep.lme.00 <- lmer(Reaction ~ Days +  (1|Subject),data=sleepstudy)
sleep.lme.01 <- lmer(Reaction ~ Days +  (1|Subject) + (0+Days|Subject),data=sleepstudy)
sleep.lme.02 <- lmer(Reaction ~ Days +  (1+Days|Subject),data=sleepstudy)
anova(sleep.lme.00,sleep.lme.01,sleep.lme.02)
## Data: sleepstudy
## Models:
## sleep.lme.00: Reaction ~ Days + (1 | Subject)
## sleep.lme.01: Reaction ~ Days + (1 | Subject) + (0 + Days | Subject)
## sleep.lme.02: Reaction ~ Days + (1 + Days | Subject)
##              npar    AIC    BIC  logLik deviance   Chisq Df Pr(>Chisq)
## sleep.lme.00    4 1446.5 1458.4 -719.25   1438.5                      
## sleep.lme.01    5 1423.5 1438.4 -706.76   1413.5 24.9786  1  5.797e-07
## sleep.lme.02    6 1425.2 1443.0 -706.58   1413.2  0.3535  1     0.5521
vc <- VarCorr(sleep.lme.02)
print(vc,comp=c("Variance"))
##  Groups   Name        Variance Cov    
##  Subject  (Intercept) 958.352         
##           Days         45.778   37.205
##  Residual             651.597
# coef(sleep.lme.02) # list random effects for each subject
ranova(sleep.lme.02)
## ANOVA-like table for random-effects: Single term deletions
## 
## Model:
## Reaction ~ Days + (1 + Days | Subject)
##                              npar  logLik    AIC    LRT Df Pr(>Chisq)
## <none>                          6 -702.05 1416.1                     
## Days in (1 + Days | Subject)    4 -715.01 1438.0 25.926  2  2.346e-06
library(predictmeans)
varcomp(sleep.lme.02)
##                              vcov       SE    2.5 %    97.5 %
## Subject.(Intercept)      958.3517 423.4501 364.7374 2147.0867
## Subject.Days.(Intercept)  37.2046  67.1260 -84.8515  168.8046
## Subject.Days              45.7777  21.1287  16.0635  105.0366
## residual                 651.5973  88.6712 504.7482  861.3903
anova(sleep.lme.02)
## Type III Analysis of Variance Table with Satterthwaite's method
##      Sum Sq Mean Sq NumDF DenDF F value    Pr(>F)
## Days  25024   25024     1    17  38.404 9.751e-06
# residplot(sleep.lme.02)
# varcomp(sleep.aov.01) # doesn't work on aov
# residplot(sleep.aov.01) # doesn't work on aov

References

Bates, Douglas, Martin Mächler, Ben Bolker, and Steve Walker. 2015. “Fitting Linear Mixed-Effects Models Using Lme4.” Journal of Statistical Software 67 (1): 1–48. https://doi.org/10.18637/jss.v067.i01.
Belenky, Gregory, Nancy J Wesensten, David R Thorne, Maria L Thomas, Helen C Sing, Daniel P Redmond, Michael B Russo, and Thomas J Balkin. 2003. “Patterns of Performance Degradation and Restoration During Sleep Restriction and Subsequent Recovery: A Sleep Dose-Response Study.” J Sleep Res 12 (1): 1–12. https://doi.org/10.1046/j.1365-2869.2003.00337.x.
Faraway, Julian James. 2006. Extending the Linear Model with r: Generalized Linear, Mixed Effects and Nonparametric Regression Models. Chapman & Hall/CRC Texts in Statistical Science Series. Boca Raton: Chapman & Hall/CRC. http://www.loc.gov/catdir/enhancements/fy0646/2005054822-d.html.
Maxwell, Scott E, and Harold D Delaney. 2004. Designing Experiments and Analyzing Data: A Model Comparison Perspective. 2nd ed. Mahwah, N.J.: Lawrence Erlbaum Associates. http://www.loc.gov/catdir/enhancements/fy0662/2002015810-d.html.