Simulating EEG/ERP Data with SEREEGA & multiple comparison corrections

2. April 2019

For the $i^2 c^2 s^3$ summer school I simulated quite a bit of data and analyzed them with several common multiple comparison methods. I used the SEREEGA toolbox for the simulation. All the MatLab-code can be found at the end of this post. In a follow-up blogpost I will extend the toolbox to continuous data that we can analyze with the unfold toolbox.

First, I simulated data based on three effects: Two early dipoles representing the P100, one right lateralized for the N170 and a deep one for the P300.

Two symmetric sources for the P1 in the occipital cortex, one lateralized to the right cortex (third panel right most) for the N170 and a deep one, central for the P300.

I added brown and white noise to the simulated epochs. An exemplary eegplot shows that it kind of looks like EEG data.

5 trials in the default pop_eegplot of eeglab

I generate two conditions with two different condition-differences, one on the N170 and one on the P300 . The same data are depicted in the following three plots:

Data in butterfly plot & time-aligned points, plotted using the eegvis toolbox

In the first row we see a butterfly plot of activation in all channels (the colored channels are depicted in the next plot). In the first topoplot-row we have the average activation and in the last row we have the condition differences.

The three types of components: a short but relatively strong (P100, red), a similar effect, short but strong (N170, condition difference) and a long effect (P300, with weak condition effect). Highlighted gray areas are approximate regions where the components showed condition differences. The noise was a mix between brown and white noise.

The red line is the P100, occipital effect. No difference between conditions. The green one is the N170, temporal effect. Only visible in one condition. The blue one is a P300 like, deep effect. Difference in amplitude between conditions.

A third way to look at the same data. Color indicates activation (blue => negative µV, yellow=> positive µV). Each row is one channel, each column one time point. Using such a plot, one looses spatial relations between the channels , but gains an nice overview of all the data. Note that the colorscales are different between the two panels.

Now we are ready to do some statistical testing and multiple comparison correction. I choose to compare uncorrected, bonferroni-holms, FDR (Benjamini-Hochberg), clustermass cluster permutation testing (LIMO toolbox) and threshold free cluster enhancement (EPT-TFCE toolbox).

I addapted a display from fMRI depicting t-value vs. significance based on the fmri slice display toolbox. Plot can be made using the eegvis toolbox

In this instance, uncorrected p-value did not so bad, Bonferroni-Holms is, as expected, quite conservative. FDR hat troubles with the elongated cluster and TFCE/Cluster-permutation with the short one.

Note: This is only a single simulation. In order to move from these anectotal findings to proper statements, one would need to repeat the simulation 1000 times and see how often which samples were deemed significant (see Groppe 2011).

Thanks to Anna Lisa Gert for help with writing this post.

This is the code I used, it is a bit messy because it was programmed in a tutorial-style and I kept my debug-statements in there.

rng(1)
%% Adding toolboxes
addpath('lib\eeglab\')
if ~exist('pop_loadset')
    eeglab
end
addpath(genpath('lib\sereega'))
addpath(genpath('lib\eegvis'))
% Threshold Free cluster enhancement
addpath(genpath('lib\unfold\lib\ept_TFCE'))
addpath(genpath('lib\limo_eeg'))
%% SEREEGA Data Generation
cfg = struct('debug',0)
%% load headmodel 
lf = lf_generate_fromnyhead('montage', 'S64');

if cfg.debug
    plot_headmodel(lf);
end

%% set seed

%% noise
noise_brown = struct( ...
    'type', 'noise', ...
    'color', 'brown', ...
    'amplitude', 2);
noise_brown = utl_check_class(noise_brown);

noise_white = struct( ...
    'type', 'noise', ...
    'color', 'white', ...
    'amplitude', 2);
noise_white = utl_check_class(noise_white);


%% visual Cortex N170
vis = [];

vis.signal{1} = struct();
vis.signal{1}.peakLatency = 170;      % in ms, starting at the start of the epoch
vis.signal{1}.peakWidth = 100;        % in ms
vis.signal{1}.peakAmplitude = -5;      % in microvolt
vis.signal{1} = utl_check_class(vis.signal{1}, 'type', 'erp');
vis.signal{2} = noise_brown;


% right visual cortex
vis.source= lf_get_source_nearest(lf, [50 -40 -25]);
if cfg.debug
    plot_source_location(vis.source, lf, 'mode', '3d','shrink',0);
end

% vis.orientation = utl_get_orientation_pseudotangential(vis.source,lf); 
vis.orientationDv = [0 0 0]; % orientation does not change between trials
vis.orientation = [0.5,-0.46,1]; % I looked at the maps and found that this prientation most closely looked like a N170

if cfg.debug
    %% This Code I used to generate random orientations and inspect the leadfield projections
    for k = 1:10
        vis.orientation = utl_get_orientation_random(1)
        plot_source_projection(vis.source, lf, 'orientation', vis.orientation,'orientedonly',1);
        %     title(vis.orientation)
    end
end
%% Generate P100, take from SEREEGA Toolbox
p100 = utl_get_component_fromtemplate('visual_p100_erp',lf);
if cfg.debug
    % Visualize the response 
    plot_signal_fromclass(vis.signal{1}, epochs);
end

%% Generate P300
p3 = [];

p3.signal{1} = struct();
p3.signal{1}.peakLatency = 350;      % in ms, starting at the start of the epoch
p3.signal{1}.peakWidth = 400;        % in ms
p3.signal{1}.peakAmplitude = -3;      % in microvolt
p3.signal{1} = utl_check_class(p3.signal{1}, 'type', 'erp');
p3.signal{2} = noise_brown;


% somewhere deep in the brain
p3.source= lf_get_source_nearest(lf, [0 -40 -25]);
if cfg.debug
    %% visualize it using SEREEGA
    plot_source_location(p3.source, lf, 'mode', '3d');
end

p3.orientation = utl_get_orientation_pseudotangential(p3.source,lf);
p3.orientationDv = [0 0 0];

p3.orientation = [0.03,-0.16,1];
if cfg.debug
    %% again find an orientation that somehow matches P300topography
    for k = 1:10
        p3.orientation = utl_get_orientation_random(1)
        plot_source_projection(p3.source, lf, 'orientation', p3.orientation,'orientedonly',1);
    end
end


%% generate 25 random noise sources and combine all components
sources = lf_get_source_spaced(lf, 10, 25);
noise_brown.amplitude = 5;
comps = utl_create_component(sources, noise_brown, lf);
comps = utl_add_signal_tocomponent(noise_white,comps);
[comps1, comps2] = deal(comps);

% add P100 to noise components
comps1(end+1:end+2) = p100; %bilateral
comps2(end+1:end+2) = p100;

% N170 with condition difference
vis.signal{1}.peakAmplitude = -3;
comps1(end+1) = vis;
vis.signal{1}.peakAmplitude = -1;
comps2(end+1) = vis;

% broad, weak P300, with condition difference
p3.signal{1}.peakAmplitude = -10;
comps1(end+1) = p3;
p3.signal{1}.peakAmplitude = -12;
comps2(end+1) = p3;
%% Generate The EEG Data
epochs = struct();
epochs.n = 50;             % the number of epochs to simulate
epochs.srate = 250;        % their sampling rate in Hz
epochs.length = 500;

data1 = generate_scalpdata(comps1, lf, epochs);
data2 = generate_scalpdata(comps2, lf, epochs);
EEG1 = utl_create_eeglabdataset(data1, epochs, lf, 'marker', 'event1');
EEG2 = utl_create_eeglabdataset(data2, epochs, lf, 'marker', 'event2');
EEG = utl_reorder_eeglabdataset(pop_mergeset(EEG1, EEG2));

%% First inspection, this should match the intended topographies
pop_topoplot(EEG, 1, [0 100 170,300], '', [1 8]);
%% ERP-Like Figure
figure,
findtype = @(type)cellfun(@(x)strcmp(type,x),{EEG.event.type});

ploterp = @(chan)plot(EEG.times,...
    [mean(squeeze(EEG.data(chan,:,findtype('event1'))),2),...
    mean(squeeze(EEG.data(chan,:,findtype('event2'))),2)]);
figure,ploterp(44),title(EEG.chanlocs(44).labels)
hold on,ploterp(63),title(EEG.chanlocs(63).labels)
hold on,ploterp(30),title(EEG.chanlocs(30).labels)

%% Within-Subject T-Test using LIMO

[m,dfe,ci,sd,n,t,p] = limo_ttest(2,EEG.data(:,:,findtype('event1')),EEG.data(:,:,findtype('event2')),0.05);
%% simple imagesc plots
figure,subplot(2,1,1),imagesc(EEG.times,1:64,mean(EEG.data,3));box off
subplot(2,1,2),imagesc(EEG.times,1:64,m),box off
% Show data
figure
imagesc(m)

figure
imagesc(t)
%% Prepare statistics


%%
ept_tfce_nb = ept_ChN2(EEG.chanlocs,0); % the 1 to plot
tfce_res = ept_TFCE(...
    permute(EEG.data(:,:,findtype('event1')),[3 1 2]),...
    permute(EEG.data(:,:,findtype('event2')),[3,1,2]),EEG.chanlocs,'nperm',800,'flag_save',0,'ChN',ept_tfce_nb)

%%
t_h0 = nan([size(EEG.data,1),size(EEG.data,2),800]);
for b = 1:800
    fprintf('%i/%i\n',b,800)
    r_perm      = randperm(epochs.n*2); % Consider using Shuffle mex here (50-85% faster)...
    
    nData       = EEG.data(:,:,r_perm);
    sData{1}    = nData(:,:,1:epochs.n);
    sData{2}    = nData(:,:,(epochs.n+1):(epochs.n*2));
    
    t_h0(:,:,b) = (mean(sData{1},3)-mean(sData{2},3))./sqrt(var(sData{1},[],3)/epochs.n+var(sData{2},[],3)/epochs.n);
    
    
end
p_h0 = tpdf(t_h0,199);
%% Generate Cluster Permutation
limo_nb = limo_neighbourdist(EEG,50);
[limomask,cluster_p,max_th] = limo_clustering(t.^2,p,t_h0.^2,p_h0,struct('data',struct('chanlocs',EEG.chanlocs','neighbouring_matrix',limo_nb)),2,0.05,0);
cluster_p(isnan(cluster_p(:))) = 1;
% F_tfce = limo_tfce(2,t.^2,limo_nb);

%% show significance methods
figure,
for k = 0:4
    switch k
        case 0
            plot_p = p;
            mask = plot_p<0.05;
        case 1
            plot_p = bonf_holm(double(p),0.05);
            mask = plot_p<0.05;
        case 2
            [~,~,~,plot_p] = fdr_bh(p,0.05,'pdep','no'); % see groppe 2011 why pdep is fine
            mask = plot_p<0.05;
            
        case 3
            plot_p = cluster_p;
            mask = plot_p<0.05;
            
        case 4
            plot_p = tfce_res.P_Values;
            mask = plot_p<0.05;
            
            
    end
    plot_data= plot_p;
    plot_data = log10(plot_data);
    subplot(5,1,(k)+1)
    eegvis_imagesc(m,t,'chanlocs',EEG.chanlocs,'times',EEG.times,...
        'contour',1,'clustermask',mask,'figure',0,'xlabel',0,'colorbar',k == 4)
%     caxis([-7,0])
%     if k ~=4
%         axis off
%     end
%     box off
    plot_data(~mask) = 1;
    box off
end

%%
hA = plot_topobutter(mean(EEG.data,3),EEG.times,EEG.chanlocs,'colormap',{'div'},'quality',40,'n_topos',12);

%%
figure
hA = plot_topobutter(cat(3,mean(EEG.data,3),m),EEG.times,EEG.chanlocs,'quality',32,'individualcolorscale','row','highlighted_channel',[44,63]);

%% Additional Blog Figures
plot_source_location([p100.source,vis.source,p3.source], lf, 'mode', '2d');

2 Comments

Prahlad Rao · 4. September 2019 Reply

Good!
Intuition: False Discovery Rate (with animations) - Science · 14. January 2023 Reply

[…] am currently setting up a lecture on multiple comparison correction (for related posts see here or here). In a nutshell: If you apply a statistical test, that allows for 5% of false-positives (a […]

2 Comments

Leave a Reply