Litter sizes
A scientist working with animal breading did the following experiment: He took a random sample of 6 boars (male pigs), and a random sample of 5 swine (pig) farms. At each farm 12 sows (females) were randomly picked out, and each boar was mated with 2 sows. The response was number of piglets from each sow. The data may be found in the file litter.Rdata
.
- Assume a random effects model using litter
Size
as a response (\(y\)) andFarm
andBoar
as random effects. Also include the interaction between the random effects in the model. State the model and write up all the assumptions made for the model.
- Assume a random effects model using litter
We assume the random effects model:
\[y_{ijk} = \mu + \alpha_i + \beta_j + \alpha\beta_{ij} + \epsilon_{ijk}\]
with Farm levels \(i=1,...,5\), Boar levels \(j=1,...,b\) and replicates (sows) \(k = 1,2\).
where
\[\alpha_i \sim N(0, \sigma_{\alpha}^2)\]
\[\beta_j \sim N(0, \sigma_{\beta}^2)\]
\[\alpha\beta_{ij} \sim N(0, \sigma_{\alpha\beta}^2)\]
\[\epsilon_{ij} \sim N(0, \sigma^2)\]
Further we assume that all random effects are independent of each other.
- Express the total theoretical variance of \(y_{ijk}\) (where \(i=1,...,6\), \(j=1,...,5\) and \(k=1,2\)) in terms of the variance components of all random effects in the model. It may be shown that the theoretical correlation between any two litter sizes from the same Boar and within the same Farm is \(\sigma_{\alpha}^2 + \sigma_{\beta}^2 + \sigma_{\alpha\beta}^2\) where the variance components belong to the Herd-effect, the Boar-effect and their interaction, respectively. How can we express the correlation between such two observations?
\[\text{var}(y_{ijk}) = \text{var}(\mu + \alpha_i + \beta_j + \alpha\beta_{ij} + \epsilon_{ijk}) = \sigma_{\alpha}^2 + \sigma_{\beta}^2 + \sigma_{\alpha\beta}^2 + \sigma^2\]
This means that the correlation between the litter sizes of two sows (\(k\) and \(l\)) from the same farm and mated with the same boar is \[\text{cor}(y_{ijk}, y_{ijl})=\frac{\text{cov}(y_{ijk}, y_{ijl})}{\sqrt{\text{var}(y_{ijk})}\sqrt{\text{var}(y_{ijl})}} = \frac{\sigma_{\alpha}^2 + \sigma_{\beta}^2 + \sigma_{\alpha\beta}^2}{\sigma_{\alpha}^2 + \sigma_{\beta}^2 + \sigma_{\alpha\beta}^2 + \sigma^2}\]
- Fit the model and estimate the mean litter size and find estimates for all variance components.
mod <- lmer(Size ~ 1 + (1|Boar) + (1|Farm) + (1|Boar:Farm), data = litter)
summary(mod)
Linear mixed model fit by REML ['lmerMod']
Formula: Size ~ 1 + (1 | Boar) + (1 | Farm) + (1 | Boar:Farm)
Data: litter
REML criterion at convergence: 187
Scaled residuals:
Min 1Q Median 3Q Max
-1.550 -0.721 0.122 0.453 1.479
Random effects:
Groups Name Variance Std.Dev.
Boar:Farm (Intercept) 0.411 0.641
Boar (Intercept) 1.553 1.246
Farm (Intercept) 0.822 0.907
Residual 0.600 0.775
Number of obs: 60, groups: Boar:Farm, 30; Boar, 6; Farm, 5
Fixed effects:
Estimate Std. Error t value
(Intercept) 9.167 0.669 13.7
The intercept term gives an estimate of the mean litter size in the population which is 9.167. The variance component estimates are given in the random effects section of above output. We have the following estimates:
Farm variance: \(\hat{\sigma}_{\alpha}^2 = 0.822\)
Boar variance: \(\hat{\sigma}_{\beta}^2 = 1.553\)
Farm:Boar variance: \(\hat{\sigma}_{\alpha\beta}^2 = 0.411\)
Error variance: \(\hat{\sigma}^2 = 0.6\)
- Find the estimated correlation as derived in exercise b.
Plugging these estimates into the formula found in exercise b. gives the estimated correlation:
\[\widehat{\text{cor}(y_{ijk}, y_{ijl})} = \frac{0.822 + 1.553 + 0.411}{0.822 + 1.553 + 0.411 + 0.6} = 0.823\]
- State the hypotheses for testing whether there is an interaction effect between Boar and Farm and perform the test. What is the appropriate test statistic for this test, what is the observed test statistic and what is the p-value? How do you interprete the result in terms of boars and farms?
Model without interaction effect,
mod1 <- update(mod, . ~ . - (1|Boar:Farm))
Compare models with and without interaction using anova
as,
(anv <- anova(mod1, mod))
refitting model(s) with ML (instead of REML)
Data: litter
Models:
mod1: Size ~ (1 | Boar) + (1 | Farm)
mod: Size ~ 1 + (1 | Boar) + (1 | Farm) + (1 | Boar:Farm)
Df AIC BIC logLik deviance Chisq Chi Df Pr(>Chisq)
mod1 4 201 209 -96.3 193
mod 5 198 208 -94.0 188 4.61 1 0.032 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The Anova reports an Chisq based test for the interaction effect with a p-value of 0.032. With a test level of 5% we therefore reject a null-hypothesis \(H_0: \sigma_{\alpha\beta}^2 = 0\) in favour of \(H_1: \sigma_{\alpha\beta}^2 > 0\) which implies that there are differences between the farm:boar interaction levels. As the interaction effect is significant at 5% level, we can say that at this level there is signifant difference between Boars taken from a given random Farm.
- If you were to choose one boar among the six, which boar should be chosen (if any) in a breeding programme with the aim of increasing litter sizes?
We can obtain the coefficients corresponding to each boars as,
coef(mod)[["Boar"]]
(Intercept)
Boar1 7.46
Boar2 8.19
Boar3 9.01
Boar4 9.66
Boar5 10.02
Boar6 10.66
This suggests that Boar6
has largest litter size.
- Explain why boar should be treated as a random effect and not a fixed effect.
The boars are assumed to be selected from a large population of boars. We intend to use this selected group to say if there are differences between boars in general (for the entire population of boars). Therefore we should treat boar as a random effect.