Minding the Brain: Three ways to get parameter-specific p-values from lmer

How to get parameter-specific p-values is one of the most commonly asked questions about multilevel regression. The key issue is that the degrees of freedom are not trivial to compute for multilevel regression. Various detailed discussions can be found on the R-wiki and R-help mailing list post by Doug Bates. I have experimented with three methods that I think are reasonable.

1. Use the normal approximation. Since the t distribution converges to the z distribution as degrees of freedom increase, this is like assuming infinite degrees of freedom. This is unambiguously anti-conservative, but for reasonable sample sizes, it appears not to be very anti-conservative (Barr et al., 2013). That is, if we take the p-value to measure the probability of a false positive, this approximation produces a somewhat (but perhaps not alarmingly) higher false positive rate than the nominal 5% at p = 0.05.

Here is an example using proportions of semantic errors in picture naming by different aphasia subtypes (the data file can be found here):

load("Examples.RData")
require(lme4)
# fit the model
m.sem <- lmer(Semantic.error ~ TestTime * Diagnosis + (TestTime | SubjectID),
    data = NamingRecovery, REML = FALSE)
# extract coefficients
coefs <- data.frame(coef(summary(m.sem)))
# use normal distribution to approximate p-value
coefs$p.z <- 2 * (1 - pnorm(abs(coefs$t.value)))
coefs

##                               Estimate Std..Error t.value       p.z
## (Intercept)                   0.045767   0.007776  5.8855 3.970e-09
## TestTime                     -0.008685   0.003524 -2.4644 1.372e-02
## DiagnosisConduction          -0.015149   0.010039 -1.5090 1.313e-01
## DiagnosisWernicke            -0.004899   0.010287 -0.4762 6.339e-01
## TestTime:DiagnosisConduction  0.007308   0.004550  1.6063 1.082e-01
## TestTime:DiagnosisWernicke    0.012854   0.004662  2.7571 5.832e-03

2. Use the Satterthwaite approximation, which is implemented in the lmerTest package. According to the documentation, this is based on SAS proc mixed theory. The lmerTest package overloads the lmer function, so you can just re-fit the model using exactly the same code, but the summary() will now include approximate degrees of freedom and p-values. This implementation is extremely easy to use, but can be a little maddening if you forget whether your model is a an object of type lmerMod or merModLmerTest.

require(lmerTest)

# re-fit model
m.semTest <- lmer(Semantic.error ~ TestTime * Diagnosis + (TestTime | SubjectID),
    data = NamingRecovery, REML = FALSE)
# get Satterthwaite-approximated degrees of freedom
coefs$df.Satt <- coef(summary(m.semTest))[, 3]
# get approximate p-values
coefs$p.Satt <- coef(summary(m.semTest))[, 5]
coefs

##                               Estimate Std..Error t.value       p.z
## (Intercept)                   0.045767   0.007776  5.8855 3.970e-09
## TestTime                     -0.008685   0.003524 -2.4644 1.372e-02
## DiagnosisConduction          -0.015149   0.010039 -1.5090 1.313e-01
## DiagnosisWernicke            -0.004899   0.010287 -0.4762 6.339e-01
## TestTime:DiagnosisConduction  0.007308   0.004550  1.6063 1.082e-01
## TestTime:DiagnosisWernicke    0.012854   0.004662  2.7571 5.832e-03
##                              df.Satt    p.Satt
## (Intercept)                    23.00 5.344e-06
## TestTime                       22.99 2.163e-02
## DiagnosisConduction            23.00 1.449e-01
## DiagnosisWernicke              23.00 6.384e-01
## TestTime:DiagnosisConduction   22.99 1.219e-01
## TestTime:DiagnosisWernicke     22.99 1.122e-02

3. Use the Kenward-Roger approximation to get approximate degrees of freedom and the t-distribution to get p-values, which is implemented in the pbkrtest package.

require(pbkrtest)

# get the KR-approximated degrees of freedom
df.KR <- get_ddf_Lb(m.sem, fixef(m.sem))

# get p-values from the t-distribution using the t-values and approximated
# degrees of freedom
coefs$p.KR <- 2 * (1 - pt(abs(coefs$t.value), df.KR))
coefs

##                               Estimate Std..Error t.value       p.z
## (Intercept)                   0.045767   0.007776  5.8855 3.970e-09
## TestTime                     -0.008685   0.003524 -2.4644 1.372e-02
## DiagnosisConduction          -0.015149   0.010039 -1.5090 1.313e-01
## DiagnosisWernicke            -0.004899   0.010287 -0.4762 6.339e-01
## TestTime:DiagnosisConduction  0.007308   0.004550  1.6063 1.082e-01
## TestTime:DiagnosisWernicke    0.012854   0.004662  2.7571 5.832e-03
##                              df.Satt    p.Satt      p.KR
## (Intercept)                    23.00 5.344e-06 9.049e-06
## TestTime                       22.99 2.163e-02 2.283e-02
## DiagnosisConduction            23.00 1.449e-01 1.468e-01
## DiagnosisWernicke              23.00 6.384e-01 6.390e-01
## TestTime:DiagnosisConduction   22.99 1.219e-01 1.238e-01
## TestTime:DiagnosisWernicke     22.99 1.122e-02 1.210e-02

Not surprisingly, the Satterthwaite and Kenward-Roger approximations produce slightly more conservative p-values than the normal approximation does. However, even in this not-very-large data set (24 participants, 6-9 in each of 3 groups, 5 observations for each participant) the normal approximation is only slightly less conservative than these two options. The approximate degrees of freedom are slightly higher when using the Satterthwaite approximation (23) than the Kenward-Roger approximation (20.1), which makes the latter a bit more conservative. Of course, these are not systematic tests, but it is fairly representative of a few models that I've compared.

15 comments:

JorgeJuly 8, 2014 at 3:52 AM
Thank you very much, Dan! This was incredibly useful.
BurnettBirdEcolNovember 25, 2014 at 9:42 AM
Yes, thanks!
UnknownMarch 4, 2015 at 1:21 PM
You are a god among men, thank you so much!
UnknownJune 23, 2015 at 12:08 PM
Comes very handy ! thanks
UnknownSeptember 1, 2015 at 10:14 AM
Thanks, it is great!
UnknownNovember 6, 2015 at 12:19 PM
Thank you. I need it very much.
UnknownNovember 11, 2015 at 6:26 PM
Hey, thank you for these! I also have used mixed from the afex package, and it sometimes gives rather different results. Do you have any explanation why that may be or which to trust?

Thanks!
JulieSeptember 13, 2016 at 9:56 PM
omg thank you so much for posting this...
Yvan Delgado de la FlorApril 30, 2017 at 6:38 PM
Thank you! I tried the first one and I worked perfectly. I double checked it with a different model where I knew the p-values and the out was identical!
If you want to know why the author didnt include the p-values, here: https://stat.ethz.ch/pipermail/r-help/2006-May/094765.html
UnknownMay 30, 2017 at 5:13 AM
Hi, Thanks for your answer. Can you please tell me, how can I get p-values for random effects in lmer function? I have my model like this:
mod1 = lmer(dep ~ ind1 + (1 + ind2 | ind3), data = dataset)
UnknownJuly 31, 2019 at 10:34 PM
Thanks heaps Dan :)
Your service is appreciated.

Tuesday, February 11, 2014

Three ways to get parameter-specific p-values from lmer

15 comments: