1 Initialize R

Set R to use the sum-to-zero definition of effects and load the police data frame with the following commands:

options(contrasts=c("contr.sum","contr.poly"))
load(url("http://pnb.mcmaster.ca/bennett/psy710/datasets/police20.rda") )

Answer all of the following questions and submit your answers as a script file on Avenue 2 Learn. Make sure to begin your script file with the following lines:

# PSYCH 710 Lab 5 Homework
# Script File
# OCT-2023
# Your Name: <<Your name here>>
# Student ID: <<Your ID here>>
# Collaborators: <<Names of your collaborators here>>

Also, make sure that text that is not an R command is preceded by a comment symbol (#). For example, you can insert questions or comments among your commands like this:

# The following command doesn't work... not sure why...
# ttest(x=g1,y=g2) # was trying to do a t test

2 Police Data

The data frame police contains data from a hypothetical 3x3 between-subjects, factorial experiment. A police department conducted an experiment to evaluate its human relations course for new officers. The independent variables were the type of beat to which officers were assigned during the course (factor beat) and the length of the course (factor course). Each subject was assigned randomly to a single combination of beat and course. The factor beat has three levels: innercity, middleclass, and upperclass. The (unordered) factor course also has three levels: short, medium, and long. The dependent variable is attitude toward minority groups following the course. The data frame also contains a variable, id, which is an id number assigned to each subject.

The following code creates a new factor, condition, that represents the cells in the police experiment, and then plots the data.

# ?interaction # read the help page
police$condition <- interaction(police$course,police$beat)
levels(police$condition) <- c("short.inner","med.inner","long.inner",
                            "short.mid","med.mid","long.mid","short.up","med.up","long.up")
op <- par(no.readonly = T)
par(mar=c(7,4,4,1)+.1)
boxplot(attitude~condition,data=police,xlab="",ylab="attitude",cex.axis=1.2,cex.lab=1.5,las=2)

3 Questions

Use factorial ANOVA to evaluate the main effects beat and course and the beat x course interaction.

police.aov.01 <- aov(attitude~beat+course+beat:course,data=police)
summary(police.aov.01)
##             Df Sum Sq Mean Sq F value   Pr(>F)    
## beat         2    190    95.0   1.520  0.23242    
## course       2   1543   771.7  12.347 8.26e-05 ***
## beat:course  4   1237   309.2   4.947  0.00278 ** 
## Residuals   36   2250    62.5                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Use bartlett.test to evaluate the hypothesis that the variance is equal across conditions/groups.

bartlett.test(attitude~condition,data=police)
## 
##  Bartlett test of homogeneity of variances
## 
## data:  attitude by condition
## Bartlett's K-squared = 1.3725, df = 8, p-value = 0.9946

# the test is not significant so I do not reject the hypothesis of
# equal variance across conditions

Inspect the data and suggest a reason why the beat \(\times\) course interaction is significant.

Answer: Inspection of the means gives a clue:

with(police,tapply(attitude,list(beat,course),mean))
##             short medium long
## innercity      20     40   52
## middleclass    30     31   36
## upperclass     33     35   38

Answer: We can get a better idea by looking at the data graphically. The following plot suggests that the effect of course is larger for police assigned to an inner-city beat than it is for police assigned to the other two types of beat. Alternatively, we can say that the direction of the effect of beat appears to depend on the length of the course.

# using emmip in emmeans package
require(emmeans)
# next 2 lines are equivalent:
# police.emm <- emmeans(police.aov.01,specs=~beat*course)
police.emm <- emmeans(police.aov.01,specs=~course*beat)
emmip(police.emm,~course|beat) # emmip is in emmeans

Analyze the main effect of course by performing a linear contrast that evaluates the difference between mean attitude in the short and long course conditions. Also, use an interaction contrast to determine if the result of your comparison of short and long courses depends on the level of beat. Your answer should include the results of the statistical tests (e.g., values of F, t, p, etc.), and your conclusion regarding the null hypotheses for the course contrast and the interaction contrast.

levels(police$course)
## [1] "short"  "medium" "long"
w <- c(-1,0,1) # course contrast
contrasts(police$course) <- w
police.aov.02 <- aov(attitude~beat+course+beat:course,data=police)
summary(police.aov.02,split=list(course=list(SvsL=1)))
##                     Df Sum Sq Mean Sq F value   Pr(>F)    
## beat                 2    190    95.0   1.520 0.232417    
## course               2   1543   771.7  12.347 8.26e-05 ***
##   course: SvsL       1   1541  1540.8  24.653 1.67e-05 ***
## beat:course          4   1237   309.2   4.947 0.002781 ** 
##   beat:course: SvsL  2   1172   585.8   9.373 0.000528 ***
## Residuals           36   2250    62.5                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Answer: For the contrast listed under the main effect of course, the contrast evaluates the null hypothesis that the difference between the marginal mean attitudes in the short and long course conditions, averaged across the beat conditions, differs from zero. The contrast listed under the interaction evaluates the null hypothesis that the difference between the short and long courses is the same at all levels of beat. The contrast for the marginal means is significant ( \(F(1,36)=24.6\), \(p<.001\) ), so we reject the null hypothesis that the marginal means in the short and long courses are equal. The contrast under the interaction also is significant ( \(F(2,36)=9.37\), \(p<.001\) ), so we reject that the difference between the means in the short and long courses is equal at all levels of beat.

It is possible for us to use a contrast to evaluate the difference between the means in the short and long courses at each level of the beat variable. Suppose you want to evaluate the hypothesis that the value of short-vs.-long contrast in the innercity condition differed from the average of the short-vs.-long contrast values in the middleclass and upperclass conditions. Design a set of contrast weights that you could use to evaluate this hypothesis and then perform the contrast.

Answer: First, I evaluate the interaction contrast with aov.

# do analysis with aov:
levels(police$beat)
## [1] "innercity"   "middleclass" "upperclass"
wBeat <- c(-1, 1/2, 1/2) # innercity vs other two
levels(police$course)
## [1] "short"  "medium" "long"
wCourse <- c(-1,0,1) # short vs long
contrasts(police$beat) <- cbind(wBeat)
contrasts(police$course) <- cbind(wCourse)
police.aov.03 <- aov(attitude~beat+course+beat:course,data=police)
summary(police.aov.03,split=list(beat=list(SvsL=1),course=list(ICvsMCUC=1)))
##                              Df Sum Sq Mean Sq F value   Pr(>F)    
## beat                          2  190.0    95.0   1.520 0.232417    
##   beat: SvsL                  1  122.5   122.5   1.960 0.170073    
## course                        2 1543.3   771.7  12.347 8.26e-05 ***
##   course: ICvsMCUC            1 1540.8  1540.8  24.653 1.67e-05 ***
## beat:course                   4 1236.7   309.2   4.947 0.002781 ** 
##   beat:course: SvsL.ICvsMCUC  1 1170.4  1170.4  18.727 0.000115 ***
## Residuals                    36 2250.0    62.5                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

# the course contrast is significant (F=24.65, p < .001)
# and the course x beat contrast is significant (F=18.7, p < .001)

Answer: Next, I evaluate the interaction contrast with emmeans.

# using emmeans:
require(emmeans)

# next 2 lines are equivalent:
# police.emm <- emmeans(police.aov.02,specs=~beat*course)
police.emm <- emmeans(police.aov.03,specs=~course*beat)

# list order of conditions in police.emm variable:
police.emm
##  course beat        emmean   SE df lower.CL upper.CL
##  short  innercity       20 3.54 36     12.8     27.2
##  medium innercity       40 3.54 36     32.8     47.2
##  long   innercity       52 3.54 36     44.8     59.2
##  short  middleclass     30 3.54 36     22.8     37.2
##  medium middleclass     31 3.54 36     23.8     38.2
##  long   middleclass     36 3.54 36     28.8     43.2
##  short  upperclass      33 3.54 36     25.8     40.2
##  medium upperclass      35 3.54 36     27.8     42.2
##  long   upperclass      38 3.54 36     30.8     45.2
## 
## Confidence level used: 0.95

# create contrast weights:
# innercity vs upper/middle class:
w_innercity_vs_other_2_beats <- c(1,1,1, -1/2, -1/2, -1/2, -1/2, -1/2, -1/2)
# short vs long:
w_short_vs_long <- c(-1,0,1,-1,0,1,-1,0,1) # N.B. long minus short
# weights for interaction contrast:
w3 <- w_short_vs_long * w_innercity_vs_other_2_beats
sum(w3) # sums to zero
## [1] 0

contrast(police.emm,method=list(myContrast=w3))
##  contrast   estimate   SE df t.ratio p.value
##  myContrast     26.5 6.12 36   4.327  0.0001

# this result is the same as the one listed in the anova table

Answer: My contrast evaluates the null hypothesis that the difference between the short and long courses in the innercity condition is the same as the average difference in the middleclass and upperclass conditions. The contrast is significant (t(36)=4.33, p<.001), and therefore I reject the null hypothesis. Note the t and F tests gave the same results, and that \(t^2=F=18.72\). Finally, the direction of the effect (as indicated by the value of t) indicates that the effect of course length is greater in the innercity condition than the other two conditions. Note that this last statement depends on the sign of the weights! Our course contrast is subtracting mean attitude in the long condition from mean attitude in the shortcondition, so a positive value means that attitude was greater in the long condition. Our interaction contrast is the value of the course contrast in the innercity condition minus the average of the course contrast in the middleclass and upperclass conditions. So a positive value (i.e., a positive \(t\)) means that the course contrast was greater in the innercity condition, which is consistent with our interaction plot (above).

As a bonus, I’ll calculate a measure of association strength.

# question doesn't ask you to do this
# but here is how I would calculate a measure of
# association strength for our interaction contrast

n <- 5 # number per condition
SS.interaction <- 1237 # from ANOVA table
psi <- 26.5 # value of contrast (psi)
( SS.contrast <- n*(psi^2)/sum(w3^2) ) # from course notes
## [1] 1170.417
( R2.alerting <- SS.contrast/SS.interaction )
## [1] 0.9461735

PSYCH 710 Homework 5

Factorial ANOVA Part 1

1 Initialize R

2 Police Data

3 Questions