Continuing the unofficial ongoing series of me learning things I should have learned in school (but for realsies this time), I just had a really refreshing dive into the tempestuous ocean of random effect calculations. If you happen upon this blog, I invite you to take a dip!
What is a random effect (and why would we use one)?
As has been famously pointed out by numerous statistical resources (see here for some details), the definition of which covariates are random and which are fixed varies depending upon what dusty tome you’re reading. However, I find the following interpretation intuitive, and therefore, useful:
- If the levels of the covariate cover your entire population, it’s fixed
- If the levels of the covariate are a sample from a possibly infinite population, it’s random
Okay, that’s fine for getting through the easy part of a statistics graduate program qualifying exam, or for the self-satisfying knowledge that you’ve correctly specified your model formula, but what are the actual benefits to your model? I wondered this myself for a while - I knew it was some vague notion of accounting for additional variance that might otherwise end up somewhere you don’t want it, for example, in the estimates of your fixed effects. What sort of got the gears turning for me was this stats.stackexchange post. Another thread in a similar vein is this one. The key theme of these threads discussing the advantages of mixed models was “parsimony” - that is to say, by correctly specifying random effects we make the model simpler and have to estimate fewer parameters. Now how can that be?
In order to answer that question, we need some insight into what is actually happening in your computer when you specify a covariate to be random.
Enter… some math
For the purposes of this journey, let’s restrict our discussion to random effect intercepts.
Let’s define our mixed effects model:

$$y = X\beta + Zb + \epsilon$$

where the distributions of $b$, our random effects, and $\epsilon$, our residual errors, are defined as

$$b \sim N(0, \psi I), \qquad \epsilon \sim N(0, \sigma^2 I)$$
Gelman and Hill is an excellent resource for mixed effects modeling, and they provide the following mathematical formula for the random intercept estimates:

$$\hat{b} = \psi Z^t \Sigma^{-1} (y - X\beta)$$

where the big matrix $\Sigma$ is defined as

$$\Sigma = \sigma^2 I + \psi Z Z^t$$
We can already notice a few things here, without even jumping into R. First of all, the estimator for the random intercepts depends on only these things:
- $\psi$, the variance of the random intercept
- $\sigma^2$, the residual variance
- $(y - X\beta)$, the residuals after only accounting for the fixed effects
This means that after fitting the fixed effect part of the model, we can estimate all the random intercepts using just $\psi$ and $\sigma^2$! Random intercepts are just the residuals of the fixed effects, weighted and normalized by the ratio of the variances of the random effect and the residuals. Imagine if we treated this covariate as fixed: we would have to estimate each of these intercepts directly - particularly problematic if this is a factor with a lot of levels! In addition to shrinking our intercept estimates, the decision to use a random effect has saved us degrees of freedom and simplified our model.
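To make the shrinkage concrete, here is the standard identity this estimator reduces to in the case of a single grouping factor, where group $j$ has $n_j$ observations and $\bar{r}_j$ is the average fixed-effect residual within group $j$:

$$\hat{b}_j = \frac{n_j \psi}{n_j \psi + \sigma^2}\,\bar{r}_j$$

So each random intercept is the group’s mean residual, shrunk toward zero: as $\psi \to 0$ the intercepts collapse to zero, and as $n_j$ grows they approach the raw group means.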
Let’s do a concrete example in R.
The `cake` dataset from lme4 has 3 cake recipes being baked at 6 temperatures, with 15 replicates at each combination of recipe and temperature.
Consider the following mixed effects model with one fixed covariate (temp) and one random covariate (replicate).
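The fit itself is one line of lme4 (the formula here matches the one reported in the summary output):

```r
library(lme4)

# Fixed effect for temp, random intercept for each replicate
m <- lmer(angle ~ temp + (1 | replicate), data = cake)
```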
The summary output of this mixed effect model is displayed below:
```
Linear mixed model fit by REML ['lmerMod']
Formula: angle ~ temp + (1 | replicate)
   Data: cake

REML criterion at convergence: 1671.7

Scaled residuals:
     Min       1Q   Median       3Q      Max
-2.83605 -0.56741 -0.02306  0.54519  2.95841

Random effects:
 Groups    Name        Variance Std.Dev.
 replicate (Intercept) 39.19    6.260
 Residual              23.51    4.849
Number of obs: 270, groups:  replicate, 15

Fixed effects:
            Estimate Std. Error t value
(Intercept)  0.51587    3.82650   0.135
temp         0.15803    0.01728   9.146

Correlation of Fixed Effects:
     (Intr)
temp -0.903
```
If we look at the estimated random intercepts from this model, we would see the following:
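A minimal way to pull them out (refitting the model here so the snippet stands alone):

```r
library(lme4)
m <- lmer(angle ~ temp + (1 | replicate), data = cake)

# Estimated random intercepts (BLUPs), one per replicate
ranef(m)$replicate
```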
Now, let’s try to calculate this step-by-step instead of having `ranef` do all the heavy lifting.
The trickiest part is figuring out how to extract the design matrices corresponding to the random effect: $Z$ and $Z^t$.
Luckily, R has a built-in function to extract components from a `merMod` object called `getME`. This function takes a `merMod` as its first argument, and a string specifying the component to be extracted as its second argument. Two of those components are exactly what we need: “Z” and “Zt”!
To demonstrate that this approach works:
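A quick sanity check (with the model refit so the snippet is self-contained): $Z$ should have one row per observation and one column per replicate, and "Zt" should simply be its transpose.

```r
library(lme4)
m <- lmer(angle ~ temp + (1 | replicate), data = cake)

Z  <- getME(m, "Z")   # sparse 270 x 15 indicator matrix: observation -> replicate
Zt <- getME(m, "Zt")  # its transpose, 15 x 270

dim(Z)
all.equal(as.matrix(Zt), t(as.matrix(Z)), check.attributes = FALSE)
```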
That’s actually the hardest part! Now we just need $\psi$, $\sigma^2$, and $y - X\beta$.
As we saw above in `summary(m)`, there is a table labeled “Random effects” which has the estimated variance for all random effects and the residual error.
From this table, we see that $\psi = 39.19$ and $\sigma^2 = 23.51$.
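Rather than copying numbers off the printout, we can also grab these programmatically via `VarCorr` (model refit here so the snippet runs on its own):

```r
library(lme4)
m <- lmer(angle ~ temp + (1 | replicate), data = cake)

vc     <- as.data.frame(VarCorr(m))
psi    <- vc$vcov[vc$grp == "replicate"]  # variance of the random intercept, ~39.19
sigma2 <- vc$vcov[vc$grp == "Residual"]   # residual variance, ~23.51
```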
Finally, we can calculate $y - X\beta$ directly, or we can use `predict(m, re.form = NA)`. The `re.form = NA` option is needed here to enforce that predictions are calculated using only the fixed effects.
Note that if you just call `predict(m)`, the random intercepts will be included in the prediction calculations, which will lead to results that are the opposite sign of what you would expect.
This gives us the final step!
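Putting all the pieces together, one way to sketch the full calculation of $\hat{b} = \psi Z^t \Sigma^{-1} (y - X\beta)$ by hand (refitting the model so the snippet is self-contained, and converting the sparse matrices to dense ones for simplicity):

```r
library(lme4)
m <- lmer(angle ~ temp + (1 | replicate), data = cake)

# Random effect design matrix and its transpose
Z  <- as.matrix(getME(m, "Z"))
Zt <- as.matrix(getME(m, "Zt"))

# Variance components
vc     <- as.data.frame(VarCorr(m))
psi    <- vc$vcov[vc$grp == "replicate"]
sigma2 <- vc$vcov[vc$grp == "Residual"]

# Sigma = sigma^2 I + psi Z Z^t
Sigma <- sigma2 * diag(nrow(Z)) + psi * Z %*% Zt

# Residuals after accounting for only the fixed effects
r <- cake$angle - predict(m, re.form = NA)

# b-hat = psi Z^t Sigma^{-1} (y - X beta)
b_hat <- psi * Zt %*% solve(Sigma) %*% r

# Compare against lme4's answer
cbind(by_hand = b_hat, ranef = ranef(m)$replicate[["(Intercept)"]])
```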
Matches up well with the results from `ranef`!