Intuition: False Discovery Rate (with animations)

I am currently setting up a lecture on multiple comparison correction (for related posts see here or here). In a nutshell: If you apply a statistical test, that allows for 5% of false-positives (a ‘wrong’ significant finding), many many times, you are more or less guaranteed to find a significant effect (because p(at_least_one_positive) = $1 – (1-0.05)^{100} = 0.994 = 99.4$%)

False-Discovery-Rate (FDR) is one way, to try to adapt to this. In this post I will give you a visual intuition behind it, not walk you through the math.

Note that FDR has a different goal to e.g. Bonferroni correction: An FDR corrected set of p-values tries to give you e.g. 5% of false-positives, of all significant ones. Not 5% false-positives of all p-values. Thus in some sense it is a less stringent correction.

Small ex-course: p-value distributions

Imagine you simulate 1000 data-sets without any effect in them, and ran 1000 statistical tests. We take all the p-values and visualize them using a histogram, this will look like this:

each update of the gif is one instance of an simulation. Note how all p-values are equally likely. The colors have no meaning

You can see, that all p-values are uniformly distributed (which means, all p-values are equally probable) if no effect exists.

What happens now, if we introduce an effect? We would expect many more p-values smaller 0.05, but still some that are greater 0.05, just due to chance (and depending on your noise/power).

P-Values given a strong effect. Note the strong bias towards small p-values. This make sense because given we simulated an effect, it should be unlikely under the $H_0$

This is exactly what we see here.

Let’s get back to FDR

Next, what happens if we mix the two? I.e. we get some false and some true positives. This is exactly the situation we set out with! We want to control the number of false-positives where some might underly a true effect and others are member of the H0-Team. What we do in FDR is:

  1. Estimate how many p-values could be attributed to the $H_0$ distribution. This is where the orange p-value set comes into play: Those arguably give us a $H_1$-uncontaminated calibration-set to estimate what the “height” of the uniform $H_0$ p-value distribution should be. In other words: we find out, how many p-values we expect to be smaller than our threshold ($\alpha$) if no effect exists.
  2. Estimate how many p-values could be attributed to the $H_1$ distribution. These are the ones over and beyond what we would expect under the “pure” $H_1$
  3. We allow for a certain amount (usually 5%) of $H_0$-pvalues relative to the $H_1$-pvalues. Because $H_1$ pvalues are expected on the smaller side, we can simply change the threshold until the ratio of these two fits.
The estimated False-Positives are the ratio of the estimated $H_1$-pvalues divided by the estimated $H_0$-pvalues. The orange line is here estimated by p-values>0.5. Note how we need to reduce our $\alpha$-threshold from typical 0.05 to until ~0.003 to reach 5% false-positives in our set of significant p-values.

This is not how FDR is calculated in practice. For this, have a look at this blog-post from Matthew Brett. But at least it allowed me to visualize it in a way I could better intuit.

Modelling Circular Effects using Splines

ERP effect of absolute angle between two saccades. Dimigen & Ehinger 2021

Note: This blog is just explaining the basis sets, not how to actually fit models / get parameters etc.

Recently, we used the unfoldtoolbox (Matlab or Julia; access it from python!) to analyze some fixation-related ERPs. The approach we used (multiple regression with deconvolution) allowed us to include this circular-predictor: absolute saccade-angle.

Why can’t we model saccade-angle using a linear predictor? The issue is straightforward: Look at this plot.

Predicting circular data with a straight line is difficult.

Ok, why can’t we use a standard non-linear spline regression?

Wait – what even is a standard non-linear spline regression? Great that you asked. Instead of fitting a straight line with parameters slope & intercept (think y = m*x + c), we can split up the x-axis regressor in multiple “local” regressors.

Taken from Ehinger & Dimigen 2019

But: The angle starts at x=0, and goes to x=360 – but we all know, x=0 and x=360 are actually identical! The orange line added to the plot would have 0 & 360 at different values, except for a slope of 0.

How do we fix this, how do we “wrap around” the predictor axis?

Circular Splines!

We use a basis function that wraps around 0 / 360°

Instead of having basis functions that have bounds at 0 / 360°, we are using a basis function set that wraps the circular space. If the idea didn’t become clear, here is an alternative visualization:

“default” splines, spanning 0 to 1 or 0 to 360°
Circular splines – spanning 0 – 1 in a circular fashion. Note the first spline (second column) that is “active” both at 0 and at 1 (aka 0° and 360°).

Hopefully, these visualizations help some of you to understand circular splines better!

Disclaimer: Thanks to Judith Schepers for discussing this with me; sorry for the typos etc. I am a bit in a rush

When the H0 distribution of TFCE is not uniform

I wrote about Threshold-Free-Cluster-Enhancement (TFCE) before, this time I stumbled upon a weirdly looking H0 diagram. Let me explain: If you simulate data without any effect, you expect that the P(data|H0) distribution is uniform, that is, all p-values are equally likely. Here, I define the p-value as the minimal p-value over time that I get from one whole simulation (1000 permutations per simulated dataset) – I simulated only cluster in time not space (find the notebook here, raw-jl here). When I did this for 100 repetitions, each applying permutation TFCE and calculating the min-p, I got the following histogram of 100 p-values:

TFCE H0 distribution with integration step of 0.4

That does not look uniform at all! What is going on? It turns out, that you can get this kind of “clustering” if your integration step-size is too large. Indeed, I change the integration step from 0.4 to 0.1

TFCE H0 distribution with integration step 0.1

Now it looks much more uniform; I should probably use more repetitions (indeed in full simulations I use 10x as many) – but this already took 500s and I am not prepared to wait longer 😉

Thanks @Olivier Renauld for this explanation!

Estimating travelling speed using only your eyes

Here is a fun trick (I think) invented by Martin Rolfs and Casimir Ludwig.

You are in a train and would like to know the speed of the train – but no phone, GPS or speedometer – here is how you do it.

Voyager GIF - Voyager GIFs
Person pondering how fast she is going –

Here is how to do it:

  1. Stretch out both arms, thumbs up.
  2. Make eye-movements from one thumb to the other, focus on the eye movements going in direction of the train
  3. Slowly increase/decrease the distance of the thumbs, effectively changing (in a controlled manner) how large your eye movements are.
  4. Notice the rail sleepers of the nearby track – at some point of (3) your eye-velocity will perfectly match the angular speed of the train, and you will be able to see the rails crisp as day – during an eye movement (take that saccadic suppression! – but seriously, check out that Wikipedia article, it explains step 3 in more words)
  5. Now measure (somehow) how many thumbs-width’ your two thumbs hands are apart and take this times two (1 thumb-width are approximately 2° visual angle). For our example, let’s say we measured 7°.
  6. Gauge the distance to the neighboring track, let’s say 2m.

7. The final ingredient is the log-log relationship between eye-movement size & eye-movement speed: the Main Sequence of Eye movements. A 7° saccade is ~130° / s fast.

Collewijn et al., 1988

8. Let’s solve for the train’s speed: $ v=\frac{130°/s}{2*\pi} *2m=20\frac{m}{s} = 72\frac{km}{h} $

Now remember or print out the main sequence from Collewijn & you will never not know how fast the train is going 🙂

Vision-Demo: Purkinjes Blue Arc Phenomenon

Purkinjes Blue Arc is a cool and not well known visual effect. It consists of illusionary blue arcs, emanating from a (typically) red stimulus. It has been rediscovered at least half a dozen times in the last 200 years and goes back to Purkinje. The exact physiological reason for the Blue Arcs is still not now. A detailed write-up of the demonstration with more tipps can be found in Moreland 1967.

Modern screen technology make it much easier for you, to experience this effect. Just follow these simple instructions!

Purkinjes Blue Arc Recipe
Display me in fullscreen on an OLED screen in a! Look at the red dot with the right eye only.
  1. You need an OLED screen – ~50% of modern mobile phone screens are OLED. To check, go to a dark room and open a complete black image. Is the display pitch black (OLED), or is there backlight coming through (LCD)?
  2. Display this gif in fullscreen – you probably need to download it. No other UI-elements should be visible
  3. Go to a reasonable dark room
  4. Close the left eye and stare at the red dot
  5. Purkinjes Blue Arcs should appear

It is hard to see the blue arcs if you do not know what to look for, therefore I added a small visualization

A rough approximation of the Purkinje Blue Arcs

I can’t see it?!

  • Maybe you flipped your phone (or closed the wrong eye ;)) – be sure that the straight line points to the right
  • Maybe the room is still too bright. Also be sure that the red-color of the stimulus doesnt brighten up the background of the room
  • I don’t have an OLED screen: The demonstration should also work with just a red dot – maybe you have one at your stereo? Any bright red LED should work, the effect is smaller but still there. Your mileage on an LCD screen will vary…

What’s going on?

The nerve fibers from the photoreceptors are bundled and leave the retina through the optic discs. These nervebundles are called the Raphe. As visible in the next screen, Purkinjes Blue Arc follow the raphe perfect.y

The Raphe – all pointing towards the optical disc (Blind Spot)

Why exactly such a red bright stimulus activates the blue ganglio cells is currently not know.

Visualization of deconvolution with pluto.jl

I just started dabbling with Pluto.jl and very quickly it allows to give very insightful notebooks.

For example, take this signal:

X in samples, y in “µV”, blue = 1 EEG channel, orange= event onsets

Clearly, the simulated event-responses (the event-related potentials) overlap in time (e.g. at ~sample 350). We could do a “naive” regression on all timepoints relative to the event-onset, ignoring any overlap – or we could use linear deconvolution aka. overlap-correction to correct for the overlap (as the name says ;).

What follows is the beauty of Pluto.jl – simple reactive/interactive notebooks. As shown in the following gif, it is very easy to show the dependency of deconvolution-success on window-size and noise:

Mass Univariate analysis on the left, deconvolution on the right

Looks pretty robust for this simulation! Cool!

If you want to try for yourself: here is the notebook and here the link to the Unfold.jl toolbox

Why Robust Statistics?

For my new EEG course in Stuttgart I spend some time to make this gif – I couldn’t find a version online. It shows a simple fact: If you calculate the mean, the breakdown point of 0%. That is, every datapoint counts whether it is an outlier or not.
Trimmed or winsorized means instead calculate the mean based on the inner X % (e.g. inner 80% for trimmed mean of 20%, removing the top and bottom 10% of datapoints) – or in case of winsorizing the mean with the 20% extreme values not removed, but changed to the now new remaining limits). Therefore they have breaking points of X% too – making them robust to outliers.

Fun fact: a 100% trimmed mean is just the median!

As you can see, increasing the amount of outliers has a clear influence on the mean but not the 20% trimmed mean.

One important point: While sometimes outlier removal and robust statistics are very important, and arguable a better default (compared to mean) – you should also always try to understand where the outliers you remove actually come from.

The source code to generate the animation is here:

using Plots
using Random
using StatsBase

anim = @animate  for i ∈ [range(3,20,step=0.5)...,range(20,3,step=-0.5)...]

    x = randn(50);
    append!(x,randn(5) .+ i); #add the offset


    #vline!([mean(winsor(x,prop=0.2))])# same as trimmean in this example

gif(anim, "outlier_animation.gif", fps = 4)

New lab in Stuttgart

I will be starting a new lab on Computational Cognitive Science, next month at University of Stuttgart. I will be working on the connection of EEG and Eye-Tracking, Statistics and methods development. The group is attached to the SimTech and the VIS Stuttgart

SimTech Building, three brains (not to scale) are arranged in the ponzo illusion
SimTech Building in Stuttgart

EEG recording chamber

I recently asked on twitter whether people can recommend recording chambers to seat the subject in psychological experiments. I had a tough time googling it, terms that could be helpful in case you are in search for the same thing: Testing chamber, subject booth, audiology.

I got a lot of answers and for the sake of “google-ability” will summarize them here:

  • Steve Luck recommends a separated chamber, but highlights importance of air-conditioning due to sweating artefacts
  • Aina Puce recommends no chamber, but to sit 2-3m behind the subject and use white noise generators

Regarding actual chambers several commercial vendors were thrown in the ring:

  • Studiobricks*
  • Whisper Rooms*
  • Desone
  • Eckel
  • IAC Acoustics

* no Farraday cage directly available as far as I know. But check this tweet for a custom solution

I haven’t asked all vendors for a price estimate, but as far as I can tell, with climate control & lighting a ~4m² room costs around 8.000€ – 12.000€ without a Farraday Cage. With a cage I would guesstimate +10.00-15.000€ but I actually don’t really know.

Comparing Manual and Atlas-based retinotopies; my journey through fmri-surface-land

PS: For this project I moved from EEG to fMRI, and in this post I will sometimes explain terms that might be very basic to fMRI people, but maybe not for EEG people.

I want to investigate cortical area V1. But I don’t want to spend time on retinotopy during my recording session. Thus I looked a bit into automatic methods to estimate it from segmented (segment = split up in WhiteMatter/GrayMatter+extract 3D-surfaces from voxel-MRI and also inflate them) brains. I used the freesurfer/label/lh.V1 labels and the neurophythy/Benson et al tools . The manual retinotopy was performed by Sam Lawrence using MrVista. And here the trouble begins:

The manual retinotopy was available only as a volume (voxel-file, maybe due to my completly lacking mrVista skills. I should look into whether I can extract the mrVista mesh-files somehow), while the other outputs I have as freesurfer vertex values, ready to be plotted against the different surfaces freesurfer calculated (e.g. white matter, pial (gray matter), inflated). Thus I had to map the volume to surface. Sounds easy – something that is straight forward – or so I thought.

After a lot of trial&error and bugging colleagues at the Donders, I settled for the nipype call to mri_vol2surf from freesurfer. But it took me a long time to figure out what the options actually mean. This answer by Doug Greve was helpful (the answer is 12 years old, nobody added it to the help :() (see also this answer):

It should be in the help (reprinted below). Smaller delta is better
but takes longer. With big functional voxels, I would not agonize too
much over making delta real small as you'll just hit the same voxel
multiple times. .25 is probably sufficient.


   --projfrac-avg min max delta
   --projdist-avg min max delta

     Same idea as --projfrac and --projdist, but sample at each of the
     points    between min and max at a spacing of delta. The samples are then
     averaged    together. The idea here is to average along the normal.

The problem is that you have to map each vertex to a voxel. So in this approach you take the normal vector of the surface (e.g. from white matter surface), check where it hits the gray matter, sample ‘delta’ steps between WM (min) and GM (max), and check which voxels are closest to these steps. The average value of the voxels is then assigned to this vertex.

I will first show a ‘successful subject before I dive into some troubles along the way.

red=freesurfer label, orange = benson label restricted to <10deg visual angle, purple = manual based on 10deg retinotopy data

Overall a good match I would say, generally benson & freesurfer have a good alignment (reasonable), the manual retinotopy is larger in most subjects. This might also be due to the projection method (see below)

Initially I tried projection withour smoothing, see the results below. I then changed to a smooth of 5mm kernel with subsequent thresholding (for sure there is probably a smarter way).

Without smoothing
With 5mm smoothing (red=freesurfer label, orange = benson label, purple = manual)

It is pretty clear that in this example the fit of manual with automatic tools is not very good. My trouble is now that I don’t know if this is because of actual difference or because of the projection.

Next steps would be to double check everything in voxel land, i.e. project the surface-labels back to voxels and investigate the voxel-by-voxel ROIs.

1 of 5