---
title: "Mixed Models"
output:
html_document:
toc: true
toc_depth: 2
toc_float: true
number_sections: true
theme: flatly
highlight: zenburn
css: ../rmarkdown_hw.css
---
```{r include=F}
additionalHints = F
showSolution = T
```
# Introduction
## What will I learn
- Mixed Models
- Testing for main effects / interactions in LMM
- Interpretation of the variances of random coefficients
- Testing Assumptions of LMM
## The experiment
This dataset is real data, published by [Ehinger, Kaufhold & König (2016)](https://osf.io/ba2pn/).
![](exp_design.png)
Caption (Ehinger et al.): **A)** Experimental Procedure. Subjects foveate a single bubble for the forced fixation time. [...] The next step removes the bubble and simultaneously displays up to five new bubbles. The dependent variable is the time until saccade initiation. After the selection of one bubble and most often before the saccade ends, the other bubbles are removed. The newly foveated stimulus is shown for a newly sampled forced fixation time. **B)** Example Stimulus Urban/Pink Noise. **C)** The sampling points for bubbles following the pseudo-random Poisson disc sampling algorithm. **D)** A single bubble as seen by the subject.
The following subset of fields are available:
Variables | Description
------------------- | -----------------------------------------------
subject | participant ID (not numbered consecutively)
choicetime | Dependent variable, the time to move to a new fixation location
forcedFixtime | Independent variable, the time the last bubble was shown
NumOfBubbles | Independent variable, the number of bubbles subjects are able to choose from
image | Name of the image that was used as a backdrop
stimulus_type | Image Type (Noise vs. Urban Image)
## The research questions
1. Is there a decreasing or increasing relationship of *choicetime* ($\approx$ reaction time) with the number of possible targets?
2. Is there a decreasing relationship between the *forced fixation time* and the *choicetime*?
# Analysis of Number of Bubbles
## Load the data
```{r,warning=FALSE, message=F}
library(ggplot2)
library(lme4)
library(plyr)
library(magrittr)
d = read.csv('fixdur.csv')
```
```{r eval=F,echo=F}
d = read.csv('/net/store/nbp/users/behinger/projects/fixdur/git/cache/data/all_res.csv')
d = read.csv('C:/Users/behinger/cloud/PhD/fixdur_git/cache/data/all_res.csv')
d = d[d$forcedFixtime<1000 & d$forcedFixtime>1 & d$goodFix,c('forcedFixtime','choicetime','NumOfBubbles','subject','image','stimulus_type')]
write.csv(d,file='fixdur.csv')
```
We start with only a single continuous predictor: Number of bubbles (ranging from 1 to 5)
## Plotting
1. Use `ggplot` to plot `NumOfBubbles` against `choicetime`. Use `stat_summary()`.
```{r eval=F,echo=additionalHints}
ggplot(d,aes(x=XXX,y=XXX))+stat_summary()
```
```{r eval=showSolution,echo=showSolution}
ggplot(d,aes(x=NumOfBubbles,y=choicetime))+stat_summary()
```
2. Why are the standard errors increasing?
*Hint: Calculate the number of trials for each condition.*
```{r eval=showSolution,echo=additionalHints}
summary(factor(d$NumOfBubbles))
```
## Mixed model
1. Build a simple linear mixed model with `NumOfBubbles` as a fixed effect and a random intercept and random slope for the grouping variable `subject`. We need the random coefficients to account for the within-subject correlation (aka repeated measures).
```{r eval=F,echo=additionalHints}
m1 = (lmer(data=XXX,formula=XXX~1+XXX+(XXX+XXX|XXX)))
```
2. Inspect the model. Compared to a model without the fixed coefficient, is the log-likelihood significantly increased?
```{r eval=F,echo=additionalHints}
m0 = (lmer(data=XXX,formula=XXX~1+(XXX+XXX|XXX)))
anova(XXX,XXX)
```
```{r eval=showSolution,echo=showSolution}
m0 = (lmer(data=d,formula=choicetime~1+(NumOfBubbles+1|subject)))
m1 = (lmer(data=d,formula=choicetime~1+NumOfBubbles+(NumOfBubbles+1|subject)))
anova(m0,m1)
```
## Interpretation
Fit the following model `choicetime ~ 1 + NumOfBubbles + (1 + NumOfBubbles | subject)`.
### Fixed coefficients
1. What does the intercept represent here?
1. What is the average increase from `NoB=1` to `NoB=5`?
```{r eval=F,echo=additionalHints}
fixef(m1)[XXX] * 4
```
## Random Coefficients
### Intercept
What is the spread of the intercept over subjects?
*Bonus:* Calculate the values between which 95% of subjects' intercepts would lie (hint: `qnorm` is your friend), assuming the estimated parameters are correct.
```{r eval=showSolution,echo=additionalHints}
m1 = lmer(data=d,formula=choicetime ~ 1 + NumOfBubbles + (1 + NumOfBubbles | subject))
summary(m1)
```
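A sketch of the bonus calculation, assuming the random intercepts are normally distributed and the estimated parameters are the true ones (`attr(…, 'stddev')` extracts the estimated random-effect SD from the `VarCorr` output):

```{r eval=F}
# Sketch: 95% interval of the subject intercepts,
# assuming the estimated parameters are correct.
b0     = fixef(m1)['(Intercept)']                            # fixed-effect intercept
sd.int = attr(VarCorr(m1)$subject, 'stddev')['(Intercept)']  # random-intercept SD
qnorm(c(0.025, 0.975), mean = b0, sd = sd.int)
```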
### Slope
We want a plot with NumOfBubbles on X, predicted choicetime on Y for each subject. We can get $\hat{y}$ using `predict`.
```{r eval=F,echo=additionalHints}
d$predict = predict(XXX)
ggplot(d,aes(x=XXX,y=XXX,group=factor(XXX))) +
stat_summary(geom='line')
```
*Bonus:* You can easily add the individual data and individual fits with ggplot. You could also add the average slope using `aes(group=NULL)`.
```{r echo=showSolution,eval=showSolution}
d$predict = predict(m1)
ggplot(d,aes(x=NumOfBubbles,y=predict,group=factor(subject))) +
stat_summary(geom='line')+
stat_summary(geom='line',size=2,aes(group=NULL))+
stat_summary(aes(y=choicetime),geom='point',color='red')+facet_wrap(~subject)
```
### Plot the 2D gaussian underlying the random coefficients
We currently have three random-effect parameters: the variance of the `Intercept`, the variance of `NumOfBubbles`, and the correlation between the two. We want to visualize them similarly to the lecture.
1. You need the package `mvtnorm` for this exercise; use `install.packages('mvtnorm')` to install it
2. Complete the following code
3. Be sure you understand what `gaus2d.cov` represents! Play around with the numbers and see how they affect the resulting plot.
```{r eval=F}
# Get the fixed coefficients
gaus2d.mean <- fixef(XXX)
# Get the estimated Random Coefficient Covariance Matrix
gaus2d.cov <- VarCorr(XXX)$XXX
# make a grid of points from 100:300 and 0:20 to evaluate the gaussian function on
data.grid <- expand.grid(x1 = seq(100, 300, length.out=200), x2 = seq(0, 20, length.out=200))
# Generate a further column 'prob' with the density of a multivariate norm.
q.samp <- cbind(data.grid, prob = mvtnorm::dmvnorm(data.grid, mean = gaus2d.mean, sigma = gaus2d.cov))
# Plot it!
ggplot(q.samp, aes(x=XXX, y=XXX, z=XXX)) +
geom_contour()+
geom_point(inherit.aes=F,data=data.frame(coef(XXX)$XXX),aes(x=XXX,y=XXX)) # this is a bonus, but worth it, plot the single subject estimates in here as well.
```
```{r eval=showSolution,echo=showSolution}
gaus2d.mean <- fixef(m1)
gaus2d.cov <- VarCorr(m1)$subject
data.grid <- expand.grid(s.1 = seq(100, 300, length.out=200), s.2 = seq(0, 20, length.out=200))
q.samp <- cbind(data.grid, prob = mvtnorm::dmvnorm(data.grid, mean = gaus2d.mean, sigma = gaus2d.cov))
ggplot() +
  geom_tile(data=q.samp,aes(x=s.1, y=s.2, fill=prob))+
  geom_point(data=data.frame(coef(m1)$subject),aes(x=X.Intercept.,y=NumOfBubbles),color="red")
```
# To log or not to log
We want to empirically compare a linear fit with a logarithmic fit (which theoretically makes more sense, as it generalizes better to, say, 25 bubbles). Fit a third model with the log-transformed predictor NumOfBubbles as an additional coefficient and do a model comparison.
1. Is it empirically better to take `log(NumOfBubbles)`?
*Hint*: We might run into convergence issues. One strategy to deal with them is to find the most [parsimonious model](https://arxiv.org/abs/1506.04967) (the biggest model supported by the data). Try removing the correlation between `NumOfBubbles` and `log(NumOfBubbles)`. Does this help? You can continue removing correlation parameters from the model until you are sure it converged.
Note that [Barr et al](https://www.ncbi.nlm.nih.gov/pubmed/24403724) show that not modeling random slopes increases Type-I errors. Maybe it is more useful in this case to choose either `NoB` or `log(NoB)` instead of incorporating both.
```{r eval=F,echo=additionalHints}
m2 = lmer(data=XXX,formula=XXX~XXX+ log(XXX)+(1+XXX+XXX|XXX))
m2a = lmer(data=XXX,formula=XXX~XXX+ log(XXX)+(1+XXX+XXX||XXX))
m2b = lmer(data=XXX,formula=XXX~XXX+ log(XXX)+(1+XXX|XXX)+ (0+XXX|XXX))
anova(XXX,XXX)
```
```{r eval=showSolution,echo=showSolution}
m2 = (lmer(data=d,formula=choicetime~1+log(NumOfBubbles) +NumOfBubbles+(log(NumOfBubbles) + NumOfBubbles+1|subject)))
m2a = (lmer(data=d,formula=choicetime~1+log(NumOfBubbles)+NumOfBubbles+(log(NumOfBubbles) + NumOfBubbles+1||subject)))
m2b = (lmer(data=d,formula=choicetime~1+log(NumOfBubbles)+NumOfBubbles+(log(NumOfBubbles) + 1|subject)+(0+NumOfBubbles||subject)))
anova(m2b,m1)
```
2. Are you allowed to do log-likelihood-ratio (LLR) model comparisons between the models `ct ~ NumOfBubbles` and `ct ~ log(NumOfBubbles)` directly?
3. Do you know other model comparison techniques you could apply here?
```{r eval=F,echo=showSolution}
# No: the two models are not nested, so a likelihood-ratio test is invalid.
# Use an information criterion such as AIC (or BIC) instead.
```
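A sketch of such a comparison between the two non-nested models (refit with `REML=F`, since information criteria should be compared on ML fits when the fixed effects differ):

```{r eval=F}
# Non-nested models: compare AIC instead of a likelihood-ratio test
mLin = lmer(data=d, choicetime ~ 1 + NumOfBubbles      + (1 + NumOfBubbles      | subject), REML=F)
mLog = lmer(data=d, choicetime ~ 1 + log(NumOfBubbles) + (1 + log(NumOfBubbles) | subject), REML=F)
AIC(mLin, mLog) # the smaller AIC indicates the empirically better model
```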
# Checking assumptions
## First Level
1. Plot the residuals of the LMM. Are they normally distributed?
```{r echo=F,eval=additionalHints,results='asis'}
cat('
Hint: Use `qqnorm` and `plot` the `residuals`. It might also help to plot a histogram of the residuals, are they shaped like a normal distribution?
')
```
2. If you find problems, think of transformations that could improve the shortcomings.
3. Now plot the residuals against `NumOfBubbles`. Are there any (nonlinear) dependencies left over?
```{r eval=showSolution,echo=showSolution}
m1 = (lmer(data=d,formula=choicetime~NumOfBubbles+(1+NumOfBubbles|subject)))
qqnorm(residuals(m1))
qqline(residuals(m1))
plot(residuals(m1))
d2 = d
d2$resid = residuals(m1)
ggplot(d2,aes(x=NumOfBubbles,y=resid))+stat_summary()
```
## Second Level
In a mixed model, you estimate an effect size for each subject (whether these count as parameters is up for debate; from a Bayesian point of view they clearly do). If you want nicer plots, you could try the package [`sjp.lmer`](http://www.strengejacke.de/sjPlot/sjp.lmer/).
These estimates are called BLUPs (Best Linear Unbiased Predictors). You can extract them using the following code:
```{r eval=F}
ranef(m1)$subject$`(Intercept)`
```
If you want to add the fixed coefficients, use `coef` instead of `ranef` (if you don't understand what that means: calculate the mean of the BLUPs over subjects once for `coef` and once for `ranef` and compare).
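A minimal sketch of that comparison, using `m1` from above:

```{r eval=F}
# ranef: per-subject deviations from the fixed effect, should average to ~0
mean(ranef(m1)$subject$`(Intercept)`)
# coef: fixed effect + per-subject deviation,
# should average to roughly fixef(m1)['(Intercept)']
mean(coef(m1)$subject$`(Intercept)`)
```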
1. Plot a histogram or `qqnorm` of these BLUP estimated values. Are they following the assumed distribution?
```{r eval=showSolution,echo=additionalHints}
ranef.intercept = ranef(m1,condVar=T)$subject$`(Intercept)`
qqnorm(ranef.intercept)
qqline(ranef.intercept)
```
2. Do the same for the random slope.
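The slope follows the same pattern as the intercept above:

```{r eval=showSolution,echo=additionalHints}
ranef.slope = ranef(m1,condVar=T)$subject$NumOfBubbles
qqnorm(ranef.slope)
qqline(ranef.slope)
```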
**Note:** Based on these values (and because these are reaction times, which cannot be smaller than 0 and are right-skewed) it is clear that the data should be transformed; a `log` transform seems a good candidate. But remember that other factors could heavily influence the residuals; maybe things get better if we include them. Also, for the practical conclusions we expect little difference between transformed and non-transformed data: our effects are quite large, we have lots of data, and the discrepancies only occur in the upper and lower 5% quantiles. Closer to noise level, things might look very different.
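One way to follow up on this note is to refit with a log-transformed response and re-check the residuals (a sketch; whether it actually improves things here is an empirical question):

```{r eval=F}
# Sketch: refit with log-transformed reaction times and re-check normality
m1.log = lmer(data=d, formula=log(choicetime) ~ 1 + NumOfBubbles + (1 + NumOfBubbles | subject))
qqnorm(residuals(m1.log))
qqline(residuals(m1.log))
```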
# Multivariate / forcedFixtime
## The model
1. Fit the model with the continuous predictors `log(NumOfBubbles)` and `log(forcedFixtime)` and their interaction.
```{r eval=F,echo=additionalHints}
m3 = lmer(data=d,choicetime ~ log(XXX) * log(XXX) + (XXX | subject),REML=F)
```
## Is the interaction significant?
```{r eval=showSolution,echo=showSolution}
d2 = d%>%transform(standLogNoB = log(NumOfBubbles)-mean(log(NumOfBubbles)))
d2 = d2%>%transform(standLogFF = log(forcedFixtime)-mean(log(forcedFixtime)))
m3 = lmer(data=d2,choicetime ~ standLogNoB*standLogFF + (1+standLogNoB*standLogFF | subject),REML=F)
m3b = lmer(data=d2,choicetime ~ standLogNoB+standLogFF + (1+standLogNoB*standLogFF | subject),REML=F)
anova(m3,m3b)
```
You will most likely run into convergence problems here (and the model takes quite some time to fit). Try to standardize your predictors: removing the mean should be sufficient, but you can also divide by their respective SDs. In the former case only the intercept changes its interpretation; in the latter, the slopes do as well.
## Interpreting the log-log interaction
A very good idea is to visualize your effects. Use `predict` in combination with `expand.grid` to produce a fully balanced data set. Plot `choicetime` against `forcedFixtime` and split by `factor(NumOfBubbles)`. To get a feeling for the size of the interaction, compare the predicted difference between 100ms and 1000ms at `NumOfBubbles==1` with the difference between 100ms and 1000ms at `NumOfBubbles==5`.
```{r eval=F,echo=additionalHints}
# Generate a fully balanced new data set
predictdata = data.frame(expand.grid(forcedFixtime = seq(XXX,XXX,length.out = 100),
NumOfBubbles=seq(XXX,XXX),
subject=unique(d$subject)))
predictdata$predict = predict(XXX,newdata=XXX) # take the model prediction
ggplot(predictdata,aes(x=XXXX,y=XXXX,color=factor(XXXX)))+
stat_summary(geom="line")+ # we are not interested in different subjects
scale_x_log10() # it is easier to see the interaction with the log-scale.
# but feel free to remove it to understand what's really going on
```
```{r eval=showSolution,echo=showSolution}
# the centering must match the fit: use the same means as for standLogFF/standLogNoB
predictdata = data.frame(expand.grid(standLogFF = log(seq(1,1000,length.out = 100))-mean(log(d$forcedFixtime)),
                                     standLogNoB = log(seq(1,5))-mean(log(d$NumOfBubbles)),
                                     subject=unique(d$subject)))
predictdata$predict = predict(m3,newdata=predictdata)
ggplot(predictdata,aes(x=standLogFF,y=predict,color=factor(standLogNoB)))+
  stat_summary(geom="line") # x is already on the log scale, no scale_x_log10 needed
```
# Multiple Random Effects
We used 128 different images (64 urban, 64 noise).
1. Run a linear mixed model with `log(FF)` as a fixed effect plus random slope grouped by subject, and additionally `image` as a random grouping variable with a random intercept.
2. Use `summary`, `qqnorm` + `qqline` to investigate the new random variable `image` for normality
```{r echo=F,eval=additionalHints,results="asis"}
cat("
*Hint:* You can get the random effect similarly to the code above using ``ranef(model)$image$`(Intercept)` ``
")
```
3. Run a model with `log(FF)` and `stimulus_type` as a categorical factor. Include both appropriate random coefficients and random variables. Is the fixed effect `stimulus_type` still significant, even after controlling for `image`-variability?
```{r echo=F,eval=additionalHints,results="asis"}
cat("
*Hint:* Keep the same random effects structure for a model with and one without the fixed effect `stimulus_type` and do an LRT model comparison using the `anova` function.
")
```
```{r echo=showSolution,eval=showSolution}
m4 = lmer(data=d2,sqrt(choicetime) ~ standLogFF+stimulus_type + (1+standLogFF+stimulus_type | subject)+(1|image),REML=F)
m4b = lmer(data=d2,sqrt(choicetime) ~ standLogFF + (1+standLogFF+stimulus_type | subject)+(1|image),REML=F)
anova(m4,m4b)
```