Mixed Models / Hierarchical Models

Benedikt Ehinger
March 2017

Overview

  • Repeated measures
  • Mixed Models

Repeated measures

[Figure]

At first glance it looks like there is no difference between the groups.

But what if we have data from within the same subjects?

Repeated measures

[Figure]

Between-subject variability can be removed by a repeated-measures design.

Never analyze repeated measures data without taking repeated measures into account (Disaggregation, Pseudoreplication)

Why you have to account for repeated measures

[Figure] Black = mean of each subject.

\( n_S=5 \) subjects

Sampling distributions

[Figure]

Multiple trials per subject

[Figure] with \( n_T = 15 \) trials per subject

Sampling distributions

[Figure]

This sampling distribution reflects these exact 5 subjects, but it does not generalize!

Sampling distributions

[Figure]

(pink: aggregated, purple: disaggregated)

There is a strong allure to choose the purple one: it is easier to get significant results!

\[ n_{subject}(aggregate) = df+1 = 5 \] \[ n_{"subject"}(pseudoreplication) = df+1= 5*5 = 25 \]

DON'T DO THIS, YOU KNOW (now) BETTER
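The trap above can be reproduced in a few lines of base R (a minimal sketch with simulated numbers, not the slide's data):

```r
# Pseudoreplication in miniature: 5 subjects, 15 trials each, true effect = 0.
set.seed(1)
n_S <- 5; n_T <- 15
subject <- rep(1:n_S, each = n_T)
y <- rnorm(n_S, sd = 2)[subject] +   # between-subject offsets
     rnorm(n_S * n_T, sd = 1)        # within-subject (trial) noise

# WRONG: treating all 75 trials as independent observations
t_pseudo <- t.test(y)                       # df = 74
# RIGHT: aggregate to one mean per subject first
t_agg <- t.test(tapply(y, subject, mean))   # df = 4

c(pseudo_df = unname(t_pseudo$parameter), agg_df = unname(t_agg$parameter))
```

The pseudoreplicated test pretends to have 75 independent data points; only 5 exist at the subject level.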

Always necessary to aggregate?

[Figure] \[ \sigma_{within} \gg \sigma_{between} \]


Subjects are very similar

Not so necessary to aggregate first

A single parameter summarizes subject-means well

[Figure] \[ \sigma_{within} \ll \sigma_{between} \]

Subjects are very different

Totally necessary to aggregate first

\( n_{subject} \) parameters summarize the subject-means well

Should we trust all subjects equally?

[Figure]

Question: Estimate the mean of the population

Should we trust all subjects equally?

[Figure]

Question: Estimate the mean of the population

Should we trust all subjects equally?

[Figure]

[Figure]

We should take sample size and variance into account in our estimate of the population mean.

We should trust those subjects more whose central tendency we are most sure about.
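One way to formalize "trust" is precision weighting: weight each subject's mean by the inverse of its squared standard error. A base-R sketch with made-up numbers (not the slide's data):

```r
# Five subject means; the last subject is noisy and has few trials.
subj_means <- c(2.1, 1.8, 3.0, 2.4, 5.5)
subj_vars  <- c(1.0, 1.2, 0.9, 1.1, 8.0)  # within-subject variances
n_trials   <- c(20, 18, 25, 22, 4)

w <- n_trials / subj_vars                  # precision = 1 / SE^2
weighted_mu   <- sum(w * subj_means) / sum(w)
unweighted_mu <- mean(subj_means)
# The noisy, small-n subject pulls the unweighted mean up,
# but barely moves the precision-weighted one.
```

This is exactly the intuition mixed models implement automatically.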

Summary

  • Why repeated measures
    • More power to find effects that exist!
  • Don't forget about repeated measures!
    • Otherwise your type I error rate will be very high
  • Should we trust all subjects equally?
    • No, we should penalize small within-subject sample sizes and high variance

Overview

  • An example to guide you through the steps
  • The mixed model
  • Inference
    • REML vs ML
  • Assumption checking
  • Shrinkage
  • Terminology
  • Convergence Problems
  • Simplifying your model

The example

EEG-Experiment: Contrast vs. P100 size

=> Very noisy subjects (babies?), between 4 and 29 trials per subject

The data

[Figure]

one model for all subjects

[Figure] \[ \hat{y} = \beta_0 + \beta_1 * contrast \]

one model for each subject

[Figure] \[ \hat{y_j} = \beta_{0,j} + \beta_{1,j} * contrast \]

two-stage statistics

[Figure]

[Figure] Note: the contrast slope is scaled by a factor of 10 for easier comparison
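Two-stage statistics can be sketched in base R (simulated data with the same structure as the example; the names `p100` and `contrast` follow the slides, the numbers do not):

```r
# Stage 0: simulate 20 subjects with subject-specific intercepts and slopes
set.seed(2)
d <- do.call(rbind, lapply(1:20, function(j) {
  contrast <- runif(15, 0, 100)
  data.frame(subject = j,
             contrast = contrast,
             p100 = (4 + rnorm(1, sd = 3)) +                 # subject intercept
                    (0.1 + rnorm(1, sd = 0.05)) * contrast + # subject slope
                    rnorm(15, sd = 7))                       # trial noise
}))

# Stage 1: one regression per subject, keep only the slope
slopes <- sapply(split(d, d$subject),
                 function(dj) coef(lm(p100 ~ contrast, data = dj))["contrast"])

# Stage 2: one-sample t-test on the 20 slopes
t.test(slopes)
```

Note that stage 2 throws away how uncertain each subject's slope was, which is what the mixed model will fix.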

Overview

  • An example to guide you through the steps
  • The mixed model
  • Inference
    • REML vs ML
  • Assumption checking
  • Shrinkage
  • Terminology
  • Convergence Problems
  • Simplifying your model

One solution

Linear Mixed Model aka Hierarchical Model aka Multilevel Model

We estimate the population distribution directly. Subject means with high uncertainty are less influential on the model fit.

Formal definition

The mixed model in R

\[ \hat{y}_j = \beta_0 + \beta_1 * contrast + u_{0,j} + u_{1,j} * contrast \]

\[ \hat{y}_j = X_j\beta + Z_jb_j \]

Fitting this in R using the package lme4

response ~ beta + (beta|group)

In our example:

response ~ 1 + contrast + (1 | subject) + (0 + contrast | subject)

Mixed models in R

summary(lmer(p100~contrast+(1+contrast||subject),data=d))
Linear mixed model fit by REML ['lmerMod']
Formula: p100 ~ contrast + ((1 | subject) + (0 + contrast | subject))
   Data: d

REML criterion at convergence: 2271.4

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-2.71978 -0.62658 -0.01064  0.65739  2.91110 

Random effects:
 Groups    Name        Variance Std.Dev.
 subject   (Intercept) 10.16741 3.1886  
 subject.1 contrast     0.01353 0.1163  
 Residual              54.45510 7.3794  
Number of obs: 323, groups:  subject, 20

Fixed effects:
            Estimate Std. Error t value
(Intercept)  4.03422    1.10536   3.650
contrast     0.09536    0.03045   3.131

Correlation of Fixed Effects:
         (Intr)
contrast -0.324
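As a side note, the variance components printed above can be turned into an intraclass correlation, i.e. the share of intercept-level variability that is between subjects:

```r
# Variance components from the summary above
var_subject  <- 10.16741   # subject (Intercept)
var_residual <- 54.45510   # Residual
icc <- var_subject / (var_subject + var_residual)
round(icc, 3)   # ~0.157: about 16% of the variability is between subjects
```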

[Figure] gray: one model, color: individual models, black: mixed model

Direct comparison of 2-stage vs mixed model

[Figure]

two stage:

            Estimate Std. Error t value Pr(>|t|)  
(Intercept)  0.06858    0.02474   2.772   0.0122 *

mixed-model:

            Estimate Std. Error t value
contrast     0.09526    0.02990   3.187

true: slope=0.1

Overview

  • The mixed model
  • An example to guide you through the steps
  • Inference
    • REML vs ML
  • Assumption checking
  • Shrinkage
  • Terminology
  • Convergence Problems
  • Simplifying your model

Model-Comparisons aka p-values

Unresolved problem!

Read: how to get p-values, problems with stepwise regression, different ways to get p-values

One way (not the best*): Model comparison for fixed effects:

Be sure to keep the random-effects structure identical

anova(mres0,mres1)

"refitting model(s) with ML (instead of REML)"

Data: d
Models:
mres0: p100 ~ 1 + (1 + contrast | subject)
mres1: p100 ~ 1 + contrast + (1 + contrast | subject)
      Df    AIC    BIC  logLik deviance  Chisq Chi Df Pr(>Chisq)   
mres0  5 2286.5 2305.3 -1138.2   2276.5                            
mres1  6 2280.1 2302.7 -1134.0   2268.1 8.4095      1   0.003733 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
  • currently, bootstrapping or MCMC is recommended

REML vs ML


Intuition:

When estimating the variance parameter with maximum likelihood (ML), \( \sigma \) is biased because \( \mu \) has to be estimated at the same time.

Solution:

Fit all fixed effects \( \mu \), then estimate the variance parameters on the resulting residuals.

This is called restricted maximum likelihood (REML). If \( n \) is large, there is no difference between ML and REML; if \( n \) is small, REML gives better (less biased) variance estimates.


If you do model comparisons in which the fixed effects differ (the usual case), you have to use ML.
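The ML bias is the same phenomenon as dividing by \( n \) instead of \( n-1 \) when estimating a variance; a base-R illustration:

```r
# ML estimates sigma^2 with the estimated mean plugged in -> divide by n,
# biased low. The REML-style correction divides by n - 1 (what var() does).
set.seed(3)
x <- rnorm(10, mean = 5, sd = 2)
var_ml   <- sum((x - mean(x))^2) / length(x)
var_reml <- sum((x - mean(x))^2) / (length(x) - 1)  # identical to var(x)
c(ML = var_ml, REML = var_reml)  # ML is always the smaller of the two
```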

Excursus: Bootstrapping

  • resample from the data with replacement (or parametrically from the \( H_0 \) model) 1000 times
  • calculate statistic of interest
  • observe spread of statistic
b1 <- bootMer(mres1, FUN = function(x) as.numeric(logLik(x)), nsim = nboot,seed = 1)
b2 <- bootMer(mres0, FUN = function(x) as.numeric(logLik(x)), nsim = nboot,seed = 1)
bootLL =  2 * b1$t - 2 * b2$t # grab the values & plot
quantile(bootLL, probs = c(0.025, 0.975))
     2.5%     97.5% 
-7.358778  1.979873 

[Figure]

pbkrtest

Better & easier: likelihood-ratio test using the R package pbkrtest

summary(pbkrtest::PBmodcomp(mres1,mres0,nsim=nboot))
Parametric bootstrap test; time: 112.23 sec; samples: 200 extremes: 0;
large : p100 ~ 1 + contrast + (1 + contrast | subject)
small : p100 ~ 1 + (1 + contrast | subject)
           stat     df    ddf  p.value   
PBtest   8.3917               0.004975 **
Gamma    8.3917               0.005508 **
Bartlett 7.4900 1.0000        0.006204 **
F        8.3917 1.0000 18.614 0.009379 **
LRT      8.3917 1.0000        0.003769 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The main argument against bootstrapping is the high computational cost (use at least 1000 samples, not 200!)

Overview

  • The mixed model
  • An example to guide you through the steps
  • Inference
    • REML vs ML
  • Assumption checking
    • QQ-plots
    • Level 1
    • Level 2
  • Shrinkage
  • Terminology
  • Convergence Problems
  • Simplifying your model

Checking Assumptions: Normality using QQ-plots

[Figure]

Source: Sean Kross

[Figure]

[Figure]

[Figure]

Checking Assumptions Level I

[Figure]

No strong non-linearity is visible in the residuals.

[Figure]

The QQ-plot indicates normality of the residuals.

Checking assumptions Level II

In addition, test whether the normality assumption holds for the random effects: [Figure]

QQ-Plots indicate normality of both random effects.
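A sketch of how these checks can be produced with lme4 (assumes the fitted model `mres1` from above and the lattice package; not runnable on its own):

```r
library(lme4)
library(lattice)

# Level I: residual diagnostics
plot(mres1)                                  # fitted vs. residuals (non-linearity?)
qqnorm(resid(mres1)); qqline(resid(mres1))   # normality of residuals

# Level II: random-effect diagnostics
qqmath(ranef(mres1, condVar = TRUE))         # QQ-plots per random effect
```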

Overview

  • The mixed model
  • An example to guide you through the steps
  • Inference
    • REML vs ML
  • Assumption checking
  • Shrinkage
  • Terminology
  • Convergence Problems
  • Simplifying your model

Shrinkage

Individual estimates with high variance, or far from the population mean, are shrunk towards the population estimate

[Figure] gray: one model, color: individual models, black: mixed model

Shrinkage-plot

[Figure]
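The mechanics behind shrinkage can be sketched in base R with an empirical-Bayes-style formula (illustrative numbers; the variance components are assumed known here, whereas lmer estimates them from the data):

```r
# Each subject mean is pulled toward the grand mean by a reliability factor
# lambda_j = var_between / (var_between + var_within / n_j):
# lambda near 1 -> trust the subject, lambda near 0 -> trust the population.
subj_means  <- c(-4.0, -1.0, 0.5, 2.0, 6.0)
n_trials    <- c(30, 30, 30, 30, 3)   # last subject: only few trials
var_between <- 4
var_within  <- 9

grand  <- mean(subj_means)
lambda <- var_between / (var_between + var_within / n_trials)
shrunk <- grand + lambda * (subj_means - grand)
round(shrunk, 2)  # the low-n subject is shrunk hardest toward the grand mean
```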

Shrinkage over multiple parameters

[Figure] Simulated data: subjects with faster reaction times (intercept) show larger effects of difficulty (slope)

In reverse: if I know somebody is generally slow, they should also have a larger slope!

A multivariate normal