Benedikt Ehinger

March 2017

- Repeated measures
- Mixed Models

At first glance it looks like there is no difference between the groups.

But what if we have data from within the same subjects?

Between-subject variability can be removed by repeated measures.

**Never analyze repeated measures data without taking repeated measures into account** (Disaggregation, Pseudoreplication)

Black = mean of the subject.

\( n_S=5 \) subjects

with \( n_T =15 \) trials

This sampling distribution reflects these exact 5 subjects, but it does not generalize!

(pink for aggregate, purple for disaggregate).

There is a strong allure to choose the purple one: it makes it easier to get significant results!

\[ n_{subject}(\text{aggregate}) = df + 1 = 5 \]

\[ n_{"subject"}(\text{pseudoreplication}) = df + 1 = 5 \cdot 5 = 25 \]

\[ \sigma_{within} >> \sigma_{between} \]

Subjects are very **similar**

**Not so necessary to aggregate first**

A single parameter summarizes subject-means well

\[ \sigma_{within} << \sigma_{between} \]

Subjects are very **different**

**Totally necessary to aggregate first**

\( n_{subject} \) parameters summarize subject-means well

**Question:** Estimate the mean of the population


We should take **sample size** and **variance** into account in our estimate of the population mean

We should trust those subjects more where we are most sure about their central tendency
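One way to formalize "trust the certain subjects more" is an inverse-variance weighted mean (a sketch; the symbols \( \bar{y}_j \), \( n_j \) and \( \sigma_j^2 \) for subject \( j \)'s mean, trial count and within-subject variance are not named on the original slides):

\[ \hat{\mu} = \frac{\sum_j w_j \bar{y}_j}{\sum_j w_j}, \qquad w_j = \frac{1}{SE_j^2} = \frac{n_j}{\sigma_j^2} \]

Subjects with many trials and low within-subject variance receive large weights.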

**Why repeated measures?** More power to find effects that exist!

**Don't forget about repeated measures!** Otherwise your Type I error rate will be very high.

**Should we trust all subjects equally?** No, we should penalize small within-subject sample sizes and high variance.

- **An example to guide you through the steps**
- The mixed model
- Inference
- REML vs ML
- Assumption checking
- Shrinkage
- Terminology
- Convergence Problems
- Simplifying your model

EEG-Experiment: Contrast vs. P100 size

=> Very noisy subjects (babies?), between 4 and 29 trials per subject


\[ \hat{y} = \beta_0 + \beta_1 * contrast \]

\[ \hat{y}_j = \beta_{0,j} + \beta_{1,j} * contrast \]

Note: Contrast-slope scaled by factor 10 for easier comparison

- An example to guide you through the steps
- **The mixed model**
- Inference
- REML vs ML
- Assumption checking
- Shrinkage
- Terminology
- Convergence Problems
- Simplifying your model

**Linear Mixed Model** aka **Hierarchical Model** aka **Multilevel Model**

We estimate the population distribution directly. Subject means with high uncertainty are less influential for the model fit.

\[ \hat{y}_j = \beta_0 + \beta_1 * contrast + u_{0,j} + u_{1,j} * contrast \]

\[ \hat{y}_j = X_j \beta + Z_j b_j \]
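The random effects \( u_{0,j}, u_{1,j} \) (stacked into \( b_j \)) are not free parameters but are assumed to be drawn from a common population distribution; in the standard formulation:

\[ b_j \sim N(0, G), \qquad \varepsilon \sim N(0, \sigma^2 I) \]

where \( G \) holds the random-effect variances (and, if modeled, their correlations) and \( \sigma^2 \) is the residual variance. This normality assumption is what the QQ-plots of the random effects check later.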

Fitting this in R using the package `lme4`

`response ~ beta + (beta|group)`

**In our example**:

`response ~ 1 + contrast + (1 | subject) + (0 + contrast | subject)`

```
summary(lmer(p100 ~ contrast + (1 + contrast || subject), data = d))
```

```
Linear mixed model fit by REML ['lmerMod']
Formula: p100 ~ contrast + ((1 | subject) + (0 + contrast | subject))
   Data: d

REML criterion at convergence: 2271.4

Scaled residuals:
     Min       1Q   Median       3Q      Max
-2.71978 -0.62658 -0.01064  0.65739  2.91110

Random effects:
 Groups    Name        Variance Std.Dev.
 subject   (Intercept) 10.16741 3.1886
 subject.1 contrast     0.01353 0.1163
 Residual              54.45510 7.3794
Number of obs: 323, groups:  subject, 20

Fixed effects:
            Estimate Std. Error t value
(Intercept)  4.03422    1.10536   3.650
contrast     0.09536    0.03045   3.131

Correlation of Fixed Effects:
         (Intr)
contrast -0.324
```

gray: one model, color: individual model, black: mixed model

two stage:

```
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.06858    0.02474   2.772   0.0122 *
```

mixed-model:

```
         Estimate Std. Error t value
contrast  0.09526    0.02990   3.187
```

True value: slope = 0.1

- The mixed model
- An example to guide you through the steps
- **Inference**
- REML vs ML
- Assumption checking
- Shrinkage
- Terminology
- Convergence Problems
- Simplifying your model

Unresolved problem!

Read: how to get p-values, problems with stepwise regression, and different ways to obtain p-values

One way (not the best*): model comparison for the fixed effects:

**Be sure to keep random structure equal**
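For reference, the two nested models compared below can be fit like this (a sketch; it assumes the data frame `d` from the running example and requires the `lme4` package):

```r
library(lme4)

# Null model: no fixed effect of contrast, but the same random structure
mres0 <- lmer(p100 ~ 1 + (1 + contrast | subject), data = d)

# Full model: additionally includes the fixed effect of contrast
mres1 <- lmer(p100 ~ 1 + contrast + (1 + contrast | subject), data = d)
```

Only the fixed-effects part differs; the random structure is identical in both models.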

```
anova(mres0,mres1)
```

`"refitting model(s) with ML (instead of REML)"`

```
Data: d
Models:
mres0: p100 ~ 1 + (1 + contrast | subject)
mres1: p100 ~ 1 + contrast + (1 + contrast | subject)
      Df    AIC    BIC  logLik deviance  Chisq Chi Df Pr(>Chisq)
mres0  5 2286.5 2305.3 -1138.2   2276.5
mres1  6 2280.1 2302.7 -1134.0   2268.1 8.4095      1   0.003733 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```

- (*) currently, bootstrapping or MCMC are recommended

When estimating the variance parameters using maximum likelihood (**ML**), \( \sigma \) is biased because \( \mu \) has to be estimated at the same time.

Instead: fit all fixed effects \( \mu \) first, then estimate the variance parameters on the resulting residuals.

This is called restricted maximum likelihood (**REML**). If \( n \) is large, there is no difference between **ML** and **REML**. If \( n \) is small, **REML** gives better predictive results.

If you do model comparison and the fixed effects differ (the usual case), you have to use **ML**.
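A minimal base-R sketch of the ML bias (not lme4-specific): the ML variance estimate divides by \( n \) and is biased downward because the mean is estimated from the same data; the REML-style correction divides by \( n-1 \):

```r
# ML vs REML-type variance estimate on simulated data
set.seed(1)
n <- 10
x <- rnorm(n, mean = 0, sd = 1)  # true sigma^2 = 1

xbar     <- mean(x)
ml_var   <- sum((x - xbar)^2) / n        # ML estimate: biased downward
reml_var <- sum((x - xbar)^2) / (n - 1)  # REML-type estimate: unbiased

ml_var < reml_var  # TRUE: the ML estimate is always smaller
```

For large \( n \) the two estimates converge, which matches the statement above.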

- resample from the data with replacement (or parametrically from \( H_0 \)-Model) 1000 times
- calculate statistic of interest
- observe spread of statistic
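The recipe above, as a minimal base-R sketch for a simple statistic (the mean, on hypothetical data, not the P100 example):

```r
# Nonparametric bootstrap of a mean
set.seed(1)
x <- rnorm(50, mean = 2)  # hypothetical sample

# resample with replacement 1000 times, recomputing the statistic each time
boot_means <- replicate(1000, mean(sample(x, replace = TRUE)))

# observe the spread of the statistic: 95% bootstrap interval
quantile(boot_means, probs = c(0.025, 0.975))
```

For mixed models the same idea applies, but the resampling has to respect the grouping structure, which is what `bootMer` below does.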

```
b1 <- bootMer(mres1, FUN = function(x) as.numeric(logLik(x)), nsim = nboot, seed = 1)
b2 <- bootMer(mres0, FUN = function(x) as.numeric(logLik(x)), nsim = nboot, seed = 1)
bootLL <- 2 * b1$t - 2 * b2$t  # bootstrapped likelihood-ratio statistic
quantile(bootLL, probs = c(0.025, 0.975))
```

```
     2.5%     97.5%
-7.358778  1.979873
```

Better & easier: likelihood-ratio test via the R package `pbkrtest`

```
summary(pbkrtest::PBmodcomp(mres1,mres0,nsim=nboot))
```

```
Parametric bootstrap test; time: 112.23 sec; samples: 200; extremes: 0;
large : p100 ~ 1 + contrast + (1 + contrast | subject)
small : p100 ~ 1 + (1 + contrast | subject)
           stat     df    ddf  p.value
PBtest   8.3917               0.004975 **
Gamma    8.3917               0.005508 **
Bartlett 7.4900 1.0000        0.006204 **
F        8.3917 1.0000 18.614 0.009379 **
LRT      8.3917 1.0000        0.003769 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```

The main argument against bootstrapping is the high computational cost (use at least **1000** samples, not 200!)

- The mixed model
- An example to guide you through the steps
- Inference
- REML vs ML
- **Assumption checking**
  - QQ-plots
  - Level 1
  - Level 2
- Shrinkage
- Terminology
- Convergence Problems
- Simplifying your model

Source: Sean Kross

No strong non-linearity is visible in the residuals.

The QQ-plot indicates normality of the residuals.

In addition: test whether the normality assumption holds for the random effects:

The QQ-plots indicate normality of both random effects.

- The mixed model
- An example to guide you through the steps
- Inference
- REML vs ML
- Assumption checking
- **Shrinkage**
- Terminology
- Convergence Problems
- Simplifying your model

The individual estimates with high variance / far away from the population mean are *shrunk* inwards

gray: one model, color: individual model, black: mixed model
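How strong is the shrinkage? A common way to express it (a sketch; the variance numbers are loosely rounded from the model output shown earlier): each subject mean is pulled toward the population mean with weight \( \lambda_j = \sigma^2_{between} / (\sigma^2_{between} + \sigma^2_{within}/n_j) \), so subjects with few trials are shrunk more.

```r
# Shrinkage factor per subject: lambda near 1 -> trust the subject,
# lambda near 0 -> pull strongly toward the population mean
sigma2_between <- 10   # between-subject (intercept) variance, approx. from the fit
sigma2_within  <- 54   # residual variance, approx. from the fit
n_j <- c(4, 29)        # trials per subject: the extremes in this dataset

lambda <- sigma2_between / (sigma2_between + sigma2_within / n_j)
round(lambda, 2)  # the 4-trial subject is shrunk much more than the 29-trial one
```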

**Simulated data**: Subjects with faster reaction times (intercept) show larger effects of difficulty (slope)

The other way around: if I know that somebody is generally slow, I already know something about their slope!