Makoto's useful EEGLAB code

From SCCN
Jump to: navigation, search

I hope to help those who are eager to code to use EEGLAB more efficiently. Actually, the real purpose of building EEGLAB was to drive users to learn coding... Scott told me so. The following examples show how to deal with EEGLAB data structure using index operations wherever possible (i.e. minimal use of for loops). If you find an error or improvement, please share it on EEGLAB mailing list.


Contents

How to search a keyword in the entire EEGLAB mailing list archive (07/19/2021 added)

Assuming you use Google as a search engine, use the option site:sccn.ucsd.edu/pipermail/eeglablist/ in addition to your query words to limit the Google search to EEGLAB mailing list. For example, if you want to search for the key word rank deficient in the the archived mailing list, try

rank deficient site:sccn.ucsd.edu/pipermail/eeglablist/

I found the same solution here, which seems to work for EEGLAB mailing list.

How to export Matlab figures in publication quality (05/05/2021 updated)

This is for Mohamed Mounir Tellache. Although we always need publication-quality figures at the end of the day to write a paper, it is often hard to find good solutions. The last resort is to take a screenshot, which I sometimes need to do too, but image quality is limited (unless you are using 8k resolution monitor).

Personally, I do not want to use any other application than Matlab in doing research (except for Word, which is rather a general tool). But here I have to admit one exception, which is Adobe Illustrator. Combined with a figure exported from Matlab in the scalable vector graphics (.svg) format, I can take control over all the details of the figure. Free software like GIMP will also do. Let me share how I process the Matlab-generated figures for publications.

The following code exports the 'current figure' (i.e. which is the figure currently selected on Matlab; click the figure and usually you can make it current, unless the figure is set to 'unable to become current' explicitly in figure property setting) to the user-specified full path. The first option '-dsvg' specifies scalable vector graphic, which is the best choice if you use Adobe Illustrator or GIMP. The second option '-painters' to force Matlab to use the vector graphics mode (otherwise a figure with more than certain level of complexity will be automatically 'rasterized' i.e. not saved as a scalable vector graphics but just a collection of pixels regardless of '-dsvg' option used--this is almost a bug in Matlab.)


print('/data/project/makoto/project/saveFigureName', '-dsvg', '-painters')

Note that by default the figure background is white. If you want to reflect the actual background color, you need to run the additional line.

set(gcf, 'InvertHardcopy', 'off')

The rest of the process depends on your skill in using Illustrator/GIMP which you can learn from Youtube videos etc. Yes, processing these figures with Illustrator/GIMP is very time consuming, but writing code to finish is even more so.

Alternatively, if you want to save the figure as high-resolution jpeg (or any other lossy or lossless formats), try the following.

print('/data/project/makoto/project/saveFigureName', '-djpeg95', '-r300')

'-djpeg95' is an option to use jpeg with 'Quality 95' compression. If you want to suppress the file size, use smaller values such as 80 or 70. '-r300' is an option to specify output resolution, and '-r300' specifies 300 dpi (dots per inch).


Note also that the figure window size on your screen affects the size of the exported figure. This may or may not make sense to you, but generally speaking the best way to see how it affects is to see the exported figure. So you need to repeat try and fix several times to finalize the parameters. Again, if '-dsvg' works, it is not a problem as you can change the scale and aspect ratio of each graphical objects on the Illustrator or Gimp.


How to export PowerPoint figures in publication quality (12/23/2020 added)

This is not a Matlab trick but I certainly ran into the need that figures for publications are on PowerPoint. The default export parameter to jpeg (or .png) is only 96 dpi, but you want to have 300 dpi. See this Microsoft page for the instruction.


How to create an inset plot (05/22/21 added)

Sometimes we want to show a subplot in an inset in a plot. Once can use the example code below for specifing the position of the inset plot with Matlab's relative position syntax (the second line).

subplotPosition = get(gca,'position');
insetRelativePosition = [0.6 0.2 0.35 0.2]; % SouthEast. 
deltax = subplotPosition(3)*insetRelativePosition(1);
deltay = subplotPosition(4)*insetRelativePosition(2);
insetPosition = [subplotPosition(1:2) 0 0] + [deltax deltay subplotPosition(3:4).*insetRelativePosition(3:4)];
axes('Position',insetPosition)

How to obtain executed code with input parameters by operating graphical user interface (GUI)

Type eegh in the Matlab command line and you'll see it.

How to plot multiple channel ERPs in one plot (07/17/2020 updated)

This is written upon the request by Mohammad Reza Khaleghi.

figure
plotElecIdx = [3 7 11]; % Select the indices of the electrodes whose ERPs you want to plot.
for elecIdx = 1:length(plotElecIdx)
    hold on
    plot(EEG.times, mean(EEG.data(plotElecIdx,:,:),3)); % The last '3' means 'Take mean across the third dimension (i.e. trials)' so do not change.
    elecLabels{elecIdx} = EEG.chanlocs(plotElecIdx(elecIdx)).labels;
end
legend(elecLabels) % This gives you annotation of which color is which electrode.

How to add a channel (02/18/2021)

This is for Ilaria Berteletti. If you have already tried to do it using EEGLAB, you must have found that you can't do it with EEGLAB. But apparently I need to do it now. Maybe the right way to do it was to change the initial reference to 'Cz' then apply average reference while 'recovering the original reference channel' in the upstream of the preprocessing but it is too late. This is the hack to do it. When using it, do not forget to change the paths to valid ones. In this example, 64-channel montage but Cz was used as an initial reference, which we want to recover. Note that after applying this change, ALL ICA RESULTS BECOMES UNUSABLE (but if ICs are already rejected for artifact correction, it will be valid--valid means the scalp signal remains rank-deficient). So the right way to do it is still to add the channel first, then ICA second (See this article). This solution is more like for emergency situation for which you only need scalp electrode analysis.

allSets = dir('/data/mobi/ilaria/p0100_imported/*.set');
 
for setIdx = 1:length(allSets)
 
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    %%% Create the new montage template. %%%
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    % Use the first dataset for the new template.
    if setIdx == 1
        loadName = allSets(setIdx).name;
        EEG = pop_loadset('filename', loadName, 'filepath', '/data/mobi/ilaria/p0100_imported');
 
        % Remove ICA-related info (if present) and insert Cz at ch64.
        EEG.nbchan      = 64;
        EEG.data(64,:)  = zeros(1,EEG.pnts);
        EEG.icaact      = [];
        EEG.icawinv     = [];
        EEG.icasphere   = [];
        EEG.icaweights  = [];
        EEG.icachansind = [];
        EEG.chanlocs(64).label = 'Cz';
        EEG = pop_chanedit(EEG, 'changefield',{64 'labels' 'Cz'},'lookup','/data/projects/makoto/Tools/eeglab14_1_2b/plugins/dipfit3.3/standard_BEM/elec/standard_1005.elc', ...
                           'eval','chans = pop_chancenter( chans, [],[]);');
 
        % Store it as EEG2.            
        EEG2 = pop_select(EEG,'time',[0 1]);
    end
 
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    %%% Batch-apply the new montage template. %%%
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    % Load a .set file.
    loadName = allSets(setIdx).name;
    EEG = pop_loadset('filename', loadName, 'filepath', '/data/mobi/ilaria/p0100_imported');
 
    % Just in case, re-apply the MNI template locations with chancenter. 
    EEG = pop_chanedit(EEG, 'lookup','/data/projects/makoto/Tools/eeglab14_1_2b/plugins/dipfit3.3/standard_BEM/elec/standard_1005.elc', ...
                       'eval','chans = pop_chancenter( chans, [],[]);');
 
    % Apply the new montage template.               
    EEG = pop_interp(EEG, EEG2.chanlocs, 'spherical');
 
    % Remove ICA-related results.
    EEG.icaact      = [];
    EEG.icawinv     = [];
    EEG.icasphere   = [];
    EEG.icaweights  = [];
    EEG.icachansind = [];
    EEG.dipfit      = [];
 
    % Save the data.
    pop_saveset(EEG, 'filename', loadName, 'filepath', '/data/mobi/ilaria/p0110_recoverCz');
end

How to subsample electrodes evenly (05/12/2021 added)

There is an EEGLAB function loc_subsets() written by Nima Bigdely-Shamlo located under the 'miscfunc' folder. I use it only once in a few years, so I forget the name of the function when I need it (thus I have been bothering Nima every few years). It happened to me today, so I put the answer here once and for all. See below for the part of the help document.

%  loc_subsets() - Separate channels into maximally evenly-spaced subsets. 
%                  This is achieved by exchanging channels between subsets so as to
%                  increase the sum of average of distances within each channel subset.
%  Usage:
%        >> subset = loc_subsets(chanlocs, nchans); % select an evenly spaced nchans
%        >> [subsets subidx pos] = loc_subsets(chanlocs, nchans, plotobj, plotchans, keepchans);

How to add the initial reference electrode back by average referencing (03/10/2021)

This is for Ilaria Berteletti. This example shows 64-channel recording data with the initial reference at Cz. The data is imported as 63-channel data, then Cz is added as the initial reference channel, which is 'recovered' when re-referencing to average. Note that this approach causes effective rank deficiency, so this process needs to be done AFTER clean_rawdata(). pop_runica() has its own rank checker so no problem.

[EEG.chanlocs.ref] = deal('Cz');
EEG = pop_chanedit(EEG, 'append',63,'changefield',{64 'labels' 'Cz'},'lookup','eeglabroot/plugins/dipfit3.3/standard_BEM/elec/standard_1005.elc',...
                   'eval','chans = pop_chancenter( chans, [],[]);','changefield',{64 'type' 'REF'});
EEG = pop_reref( EEG, [],'refloc',struct('labels',{'Cz'},'type',{'REF'},'theta',{-94.2457},'radius',{0.023785},'X',{-0.54445},'Y',{7.3339},'Z',{98.2349},...
                    'sph_theta',{94.2457},'sph_phi',{85.7187},'sph_radius',{98.5098},'urchan',{64},'ref',{''},'datachan',{0}));

How to extract subjects and independent components from STUDY structure

Please read this page. Note that it says 'This apparently complex scheme allows to use...'

How to extract EEG power of frequency bands (06/06/2020 updated)

This is one of the most frequently asked questions in the EEGLAB mailing list. By the way, I wrote erratum to correct code on 02/06/2016--thank you Jerry Zhu for the correction!

 % This example Matlab code shows how to compute power spectrum of epoched data, channel 2.
 [spectra,freqs] = spectopo(EEG.data(2,:,:), 0, EEG.srate);
 
 % Set the following frequency bands: delta=1-4, theta=4-8, alpha=8-13, beta=13-30, gamma=30-80.
 deltaIdx = find(freqs>1 & freqs<4);
 thetaIdx = find(freqs>4 & freqs<8);
 alphaIdx = find(freqs>8 & freqs<13);
 betaIdx  = find(freqs>13 & freqs<30);
 gammaIdx = find(freqs>30 & freqs<80);
 
 % Compute absolute power.
 deltaPower = mean(10.^(spectra(deltaIdx)/10));
 thetaPower = mean(10.^(spectra(thetaIdx)/10));
 alphaPower = mean(10.^(spectra(alphaIdx)/10));
 betaPower  = mean(10.^(spectra(betaIdx)/10));
 gammaPower = mean(10.^(spectra(gammaIdx)/10));


Validation (06/06/2020 added)

For John Johnson, I wrote following code to confirm how power values are averaged.

% Calculate mean power (uV^2).
power1 = 11;
power2 = 22;
power3 = 33;
meanPower = mean([power1 power2 power3]) % 22
 
% Convert power (uV^2) to 10*log10 (dB).
power1_dB = 10*log10(power1);
power2_dB = 10*log10(power2);
power3_dB = 10*log10(power3);
 
% Average dB values then convert to uV^2.
meanPower_dB1 = mean([power1_dB power2_dB power3_dB]) % 13.0078
10^(meanPower_dB1/10)                                 % 19.98
 
% Convert dB values to uV^2 then average.
meanPower_dB2 = mean([10^(power1_dB/10) 10^(power2_dB/10) 10^(power3_dB/10)]) % 22

FAQ

For Tim Göcking, let me add some explanations.

  • Q. What is unit for this power?
    • A. The unit is uV^2/Hz.
  • Q. What does the part '/Hz' mean in this unit?
    • A. This means 'per unit frequency bandwidth' i.e., frequency resolution being 1Hz. I found this page helpful to understand it (Thank you Nir Ofir for letting me know that the page was moved.)
  • Q. What does 10.^ do to the values?
    • A. The output from spectopo() is converted to dB as 10*log10([value_in_uV^2/Hz]) in dB (the first '10*' part is to convert the unit from Bell to deci-Bell.) Because log10 is used for the dB conversion, '10.^' is used to convert it back to uV^2/Hz.
  • Q. Why dividing through 10?
    • A. This is to convert back the deci-Bell's 'deci' part.
  • Q. When do we switch bewtween 10*log10(X) and 20*log10(X)?
    • A. 10*log10(X) is used to describe power (i.e., amplitude^2), while 20*log10(X) is used to describe amplitude.
dB_power = 10*log10(power/reference_power);
         = 10*log10(amplitude^2/reference_amplitude^2)
         = 20*log10(amplitude/reference_amplitude)
Often approximated as 2x = 6dB, 10x = 20dB.

dB_amplitude = 10*log10(amplitude/reference_amplitude)
             = dB_power/2.
Often approximated as 2x = 3dB, 10x = 10dB.


Comparison between spectopo() and pwelch() (04/27/2021 added)

This is for Hamed Taheri Gorji. EEGLAB function spectopo() uses Matlab's pwelch() function to calculate power spectral density (PSD). In theory, in the input parameters are the same, they should produce the same results. Let's test it with the following code.

% Create a dummy data with a simulated signal.
EEG = eeg_emptyset;
EEG.data  = randn(1,10000);
EEG.data = EEG.data - mean(EEG.data);
EEG.srate = 256;
inputData = EEG.data;
 
% Use EEGLAB's spectopo() function.
[spectra1,freqs1] = spectopo(inputData, 0, EEG.srate, 'winsize', EEG.srate, 'nfft', EEG.srate, 'plot', 'off'); % For channel 2 (C4)
spectra1 = 10.^(spectra1/10);
 
% Use Matlab pwelch() function.
x  = inputData;
Fs = EEG.srate;
L  = EEG.srate;
NFFT = EEG.srate;
NOVERLAP = 0;
WINDOW = EEG.srate;
[spectra2,freqs2] = pwelch(x,WINDOW,NOVERLAP,NFFT,EEG.srate);
 
figure
subplot(1,2,1)
plot(freqs1, spectra1); hold on
plot(freqs2, spectra2')
legend({'spectopo()', 'pwelch'})
title('PSD comparison')
xlim([0 128])
 
subplot(1,2,2)
plot(freqs1, spectra1-spectra2')
xlabel('Frequency (Hz)')
ylabel('Power (uV^2/Hz)')
title('Difference')
xlim([0 128])

PsdComparison.jpg


In the left plot, the two PSD plots are completely overlapped and visually undistinguishable. In the right plot, the difference wave is shown. The difference across overall frequency bins are within 10^-17 uV^2/Hz, while the data range of the original PSD plot was 10^-2 uV^2/Hz. Compared with the original signal scale, the error between the two plots are virtually zero. Without reading the spectopo() function, I don't know why the difference is non-zero, but for the practical purpose the error is negligible.


Hamed contacted me to point out that my old code I wrote in "[Eeglablist] Power in spectopo vs. fft" (posted to EEGLAB mailing list on Thu May 8 11:48:54 PDT 2014) contains an error--when I defined NFFT, I entered the entire length of the continuous data (as if I'm dealing with the epoched data), with which one should not be able to reproduce the result. I would like to correct my old post here. Thank you Hamed!

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% EEGLAB spectopo function
[spectra,freqs] = spectopo(EEG.data(2,:), 0, EEG.srate, 'nfft', 1024); %
For channel 2 (C4)
 
thetaIdx   = find(freqs>=4 & freqs<=8);                    % theta=4-8
thetaPower = 10^(mean(spectra(thetaIdx))/10);              % mean theta
power
alphaIdx   = find(freqs>=8 & freqs<=12);                   % alpha=8-12
alphaPower = 10^(mean(spectra(alphaIdx))/10);              % mean alpha
power
eeglabThetaAlphaRatio = thetaPower/alphaPower;
 
figure; subplot(1,2,1); plot(freqs,10.^(spectra/10))
 
% Matlab pwelch function
x  = EEG.data(2, :);      % For channel 2 (C4)
Fs = EEG.srate;              % Sampling frequency
L  = EEG.pnts;               % Length of signal
NFFT = 2^nextpow2(L);        % Next power of 2 from length of x  <-This line is wrong! (04/27/2021 Makoto)
% X    = fft(x,NFFT)/L;
% freqs   = Fs/2*linspace(0,1,NFFT/2+1);
% spectra = abs(X(1:NFFT/2+1));
 
NOVERLAP = 0;
WINDOW = 512;
[spectra,freqs] = pwelch(x,WINDOW,NOVERLAP,NFFT,EEG.srate);
 
thetaIdx   = find(freqs>=4 & freqs<=8);            % theta=4-8
thetaPower = mean(spectra(thetaIdx));              % mean theta power
alphaIdx   = find(freqs>=8 & freqs<=12);           % alpha=8-12
alphaPower = mean(spectra(alphaIdx));              % mean alpha power
matlabThetaAlphaRatio = thetaPower/alphaPower;
 
subplot(1,2,2); plot(freqs,spectra)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

How to batch-process all-channel PSDs in both dB and uV^2/Hz (03/17/2020 updated)

For Sayaka Fujita, I made this section separated for her reference. Thank you John Iversen for correction!

% This example code compares PSD in dB (left) vs. uV^2/Hz (right) rendered as scalp topography (setfile must be loaded.)
lowerFreq  = 4; % Hz
higherFreq = 8; % Hz
meanPowerDb     = zeros(EEG.nbchan,1);
meanPowerMicroV = zeros(EEG.nbchan,1);
for channelIdx = 1:EEG.nbchan
    [psdOutDb(channelIdx,:), freq] = spectopo(EEG.data(channelIdx, :), 0, EEG.srate, 'plot', 'off');
    lowerFreqIdx    = find(freq==lowerFreq);
    higherFreqIdx   = find(freq==higherFreq);
    meanPowerDb(channelIdx) = mean(psdOutDb(channelIdx, lowerFreqIdx:higherFreqIdx));
    meanPowerMicroV(channelIdx) = mean(10.^((psdOutDb(channelIdx, lowerFreqIdx:higherFreqIdx))/10), 2);
end
 
figure
subplot(1,2,1)
topoplot(meanPowerDb, EEG.chanlocs)
title('Theta band (4-8Hz) power distribution')
cbarHandle = colorbar;
set(get(cbarHandle, 'title'), 'string', '(dB)')
 
subplot(1,2,2)
topoplot(meanPowerMicroV, EEG.chanlocs)
title('Theta band (4-8Hz) power distribution')
cbarHandle = colorbar;
set(get(cbarHandle, 'title'), 'string', '(uV^2/Hz)')

Comparison.png

How to build EEG structure (07/13/2018 updated)

% Create empty EEG structure.
EEG = eeg_emptyset;
 
% Define basic items of EEG structure.
EEG        = eeg_emptyset();
EEG.data   = timeSeriesDataYouHave;
EEG.times  = timeSeriesDataLatencyYouHave;
EEG.xmin   = EEG.times(1);
EEG.xmax   = EEG.times(end);
EEG.srate  = round(1/((EEG.xmax-EEG.xmin)/length(EEG.times))); % Rounded actual sampling rate. Note that the unit of the time must be in second.
EEG.nbchan = size(EEG.data,1);
EEG.pnts   = size(EEG.data,2);
 
% Define event information.
eventStructure = struct('type', [], 'latency', []);
latencyValues  = num2cell(eventLatencyInSecYouHave*EEG.srate);
[eventStructure(1:length(eventLatencyInSecYouHave)).latency] = latencyValues{:};
eventTypes     = eventTypesYouHave;
[eventStructure(1:length(eventLatencyInSecYouHave)).type] = eventTypes{:};
EEG.event = eventStructure;
EEG = eeg_checkset(EEG, 'eventconsistency');
EEG = eeg_checkset(EEG, 'makeur');


How to load EEGLAB .set and .fdt files without using EEGLAB (05/09/2020 updated)

We can load a .set file, which contains header information, by using Matlab function load() with '-mat' option. However, how can we load .fdt file that is the matrix of EEG time-series data? See below for the example of loading a dataset that is saved as a pair of XXXXX.set and XXXXX.fdt files.

% Load header info.
headerInfo = load('XXXXX.set', '-mat');
 
% Load EEG time-series matrix (taken from eeg_getdatact.m centered at line 233)
fid = fopen('XXXXX.fdt', 'r', 'ieee-le');
for trialIdx = 1:headerInfo.EEG.trials % In case the saved data are epoched, loop the process for each epoch. Thanks Ramesh Srinivasan!
    currentTrialData = fread(fid, [headerInfo.EEG.nbchan headerInfo.EEG.pnts], 'float32');
    data(:,:,trialIdx) = currentTrialData; % Data dimentions are: electrodes, time points, and trials (the last one is for epoched data)                  
end
fclose(fid);

How to extract code embedded within file names (07/13/2018 updated)

Notice how to deal with Matlab cell format using cellfun() in two different ways.

allSubj      = dir('*.set'); % Obtain all the .set files under the current folder.
allSubjNames = {allSubj.name}'; % Obtain the file names. Here, suppose the naming rule is as follows:
 
     allSubjNames =
 
     'subj01_group1'
     'subj02_group2'
     'subj03_group2'
     'subj04_group1'
     'subj05_group2'
     'subj06_group1'
 
group1List = strfind(allSubjNames, '_group1'); % Choose those subjects who has '_group1' in the file name. Note that the output is still in the cell format.
 
     groupList =
 
         [8]
         []
         []
         [8]
         []
         [8]
 
group1Idx = find(~cellfun('isempty', group1List)); % Obtain the index number of the '_group1' subjects.
 
     group1Idx = 
 
         1
         4
         6
 
group1Subjects = allSubjNames(group1Idx); % Obtain the full file names of '_group1' subjects.
 
     group1Subjects = 
 
     'subj01_group1'
     'subj04_group1'
     'subj06_group1'
 
group1SubjectId = cellfun(@(x) x(1:6), group1Subjects, 'Uniformoutput', false); % Extract 'subjxx' part from the file names (i.e., delete '_group1' from the file names).
 
     group1SubjectId = 
 
     'subj01'
     'subj04'
     'subj06'

How to build a new event structure from scratch (01/08/2021 added)

Here I show an example code to build the minimum-requirement structure of EEG.event, which includes type and latency. You may achieve the same goal using EEGLAB GUI but the manual process is labor intensive to the level it seems impossible, and the EEGLAB function for this job is difficult to use (in my opinion). I recommend the following code which is infinitely faster than the function because no forloop or check function is included and maximally clear to see what is done in each line.

% Define dummy event labels and latencies.
eventLabels  = {'a' 'b' 'c' 'd' 'e'}; % Example event labels.
eventLatency = [3 4 5 6 7]; % Example event latencies.
 
% Build a minimal but valid EEG.event from scratch (i.e. EEG.event = [])
[EEG.event(1:length(eventLabels)).type] = eventLabels{:};
latencyInCell = num2cell(eventLatency);
[EEG.event(1:length(eventLabels)).latency] = latencyInCell{:};
eeglab redraw % Optional. This runs eeg_checkset() and refreshes GUI.

How to modify event types

How to obtain unique event types

 % Obtain all unique event types
 uniqueEventTypes = unique({EEG.event.type}');

How to obtain event indices whose event type exactly matches a keyword

 % Obtain event indices whose 'type' is 'Probe'
  allEventTypes = {EEG.event.type}';
  keywordIdx = find(strcmp(allEventTypes, 'Probe'));

How to obtain event indices whose event types contain a part of the keyword (12/13/2016 updated)

During SCCN tea time today, Hiroyuki Kambara told me this solution using cellfun(). How smart!

 % Obtain event indices whose 'type' contains letter strings 'Probe' (e.g. will obtain all indices for 'Probe1', 'Probe3', and 'Probe5'
 allEventTypes = {EEG.event.type}';
 strFindResult = strfind(allEventTypes, 'Probe');
 probeIdx      = find(cellfun(@isempty, strFindResult));

How to change event type names

 % Rename event type name 'example1' into 'example2'
 allEvents = {EEG.event.type}';
 example1Idx = strcmp(allEvents, 'example1');
 [EEG.event(example1Idx).type] = deal('example2');

How to change numerical event type names to strings (05/17/2020 added)

 allEventTypes = [EEG.event.type]'; % The output is numeric with same digits.
 allEventTypeString = cellstr(num2str(allEventTypes)); % Converts numerics into strings in cell.
 [EEG.event.type] = allEventTypeString{:}; % I can't explain how this syntax works!

How to modify event latency

How to change event latency

This is weird but works (and the only way that works probably.)

 % Change event latency into 1:length(EEG.event)
 dummyLatency = num2cell(1:length(EEG.event));
 [EEG.event(1:length(EEG.event)).latency] = dummyLatency{:};
 EEG = eeg_checkset(EEG,'eventconsistency');

How to adjust event latency by adding or subtracting values

 % Add 400ms to event 'example'
 allEvent = {EEG.event.type}';
 exampleTypeIdx  = find(strcmp(allEvent, 'exampleType'));
 exampleTypeLatencyInFrame  = [EEG.event(exampleTypeIdx).latency]';
 changedLatency = num2cell(bsxfun(@plus, exampleTypeLatencyInFrame, 400), 2);
 [EEG.event(exampleTypeIdx).latency] = changedLatency{:};
 EEG = eeg_checkset(EEG,'eventconsistency');

How to recover the lost events by window rejection by clean_rawdata (05/22/2021 added)

After applying clean_rawdata, or any other manual or automated window-rejection method for data cleaning, you may find some event markers are rejected as a result. If those are the markers for the onset and/or offset of a block, for example, then you can't perform the block-design analysis. I developed a prototype of a solution to recover the lost event markers for the case that clean_rawdata() log 'clean_sample_mask' and EEGLAB log EEG.urevent are BOTH available. It replaces the 'boundary' with the lost event marker. Note that the recovered event marker is not valid to show the onset/offset of the latency of the event onset/offset, so this method cannot be straightforwardly used for the event-related study design (unless you implement additional solution to detect epochs with non-canonical epoch length, although this also has a problem if the epoch contains variable SOA... but for the fixed-length epochs it should work).

    % Recover the lost events.
    urEventType         = {EEG.urevent.type}';
    urEventLatencyFrame = round([EEG.urevent.latency]);
    cleanSampleMask     = EEG.etc.clean_sample_mask;
    isEventPresent      = cleanSampleMask(urEventLatencyFrame);
    boundaryIdx = find(contains({EEG.event.type}, 'boundary'));
    if any(isEventPresent==false)
        lostEventIdx = find(isEventPresent==0);
        for lostEventIdxIdx = 1:length(lostEventIdx)
            lostEventUrlatencyFrame = urEventLatencyFrame(lostEventIdx(lostEventIdxIdx));
            lostEventCurrentPosition = sum(cleanSampleMask(1:lostEventUrlatencyFrame));
            boundaryLatency = round([EEG.event(boundaryIdx).latency]);
            [differenceInFrame, selectedBoundaryIdx] = min(abs(boundaryLatency-lostEventCurrentPosition));
            if differenceInFrame > 3
                error('Fail to recover the lost event.')
            end
            EEG.event(selectedBoundaryIdx).type = urEventType{lostEventIdx(lostEventIdxIdx)};
        end
    end

How to delete non-time-locking events from epoched data (08/16/2020 added)

Motivation: Imagine you have mismatch negativity data sets in which a stimulus was presented every 0.5 s. In order to perform our standard time-frequency analysis, which uses 1116 ms Morlet wavelet kernel with the lowest 3 Hz with 3 cycles (but this does NOT mean EEGLAB performs a pure wavelet analysis--see this page), we need much longer epoch than one trial. If we apply our standard recommendation of -1 to 2 s epoching, it will inflate the data to 6 times larger (3.0 s contains 6 0.5 s). This not only inflate data size, but also inflate event structure. This redundant event structure causes many problems, although all we need in this situation is just those events whose within-epoch latency is 0. The following code addresses this issue so that EEG.event and EEG.epoch have only one event that has latency zero.

% Delete non-time-locking events from epoched data (08/16/2020 updated)
for epochIdx = 1:length(EEG.epoch)
    allEventIdx    = 1:length(EEG.epoch(epochIdx).event);
    if length(allEventIdx) == 1 % If there is only 1 event, it must have latency zero.
        continue
    else
        zeroLatencyIdx = find(cell2mat(EEG.epoch(epochIdx).eventlatency) == 0);
        EEG.epoch(epochIdx).event         = EEG.epoch(epochIdx).event(zeroLatencyIdx);
        EEG.epoch(epochIdx).eventtype     = EEG.epoch(epochIdx).eventtype(zeroLatencyIdx);
        EEG.epoch(epochIdx).eventlatency  = {0};
        EEG.epoch(epochIdx).eventurevent  = EEG.epoch(epochIdx).eventurevent(zeroLatencyIdx);
        EEG.epoch(epochIdx).eventduration = 0;
    end
end
validEventIdx  = [EEG.epoch.event];
deleteEventIdx = setdiff(1:length(EEG.event), validEventIdx);
EEG.event(deleteEventIdx) = [];
for epochIdx = 1:length(EEG.epoch)
    EEG.epoch(epochIdx).event = epochIdx;
end

See what I it does below: Before (left) and after (right) the process. Notice the non-time-locking event markers (i.e., whose within-epoch latency is non-zero) are cleaned up on the right.

CleanEvents.png

How to extract all-IC/channel ERPs for each condition (07/24/2018 added)

If you want to process a lot of heavy, epoched datasets (such as mismatch negativity data with 0.5 s SOA with 5,000 trials but epoched to -1 to 2 sec allowing HUGE overlap... this is the situation where EEGLAB's 'epoch' scheme becomes inefficient), a bare-bone script like the one I show below works more efficient than EEGLAB functions. Taken from my std_selectICsByCluster() plugin.

 % Extract event indices and names that have latency zero in an epoch.
 allEpochEventLatency = cell2mat({EEG.epoch.eventlatency}');
 latencyZeroEventIdx  = find(allEpochEventLatency == 0);
 latencyZeroEventType = {EEG.event(latencyZeroEventIdx).type}';
 
 % Convert latencyZeroEventType to string in case they are numaric. 
 allEventTypes = cellfun(@(x) num2str(x), latencyZeroEventType, 'uniformoutput', false);
 
 % Extract and store mean EEG.icaact across trials for each condition.
 uniqueConditions = unique(latencyZeroEventType);
 allConditionIcaactErp = zeros(size(EEG.icaact,1), size(EEG.icaact,2), length(uniqueConditions));
 for conditionIdx = 1:length(uniqueConditions)
     currentConditionLabel  = num2str(uniqueConditions{conditionIdx});
     currentConditionIdx    = find(strcmp(allEventTypes, currentConditionLabel));
     currentConditionIcaact = mean(EEG.icaact(:,:,currentConditionIdx),3);
     allConditionIcaactErp(:,:,conditionIdx) = currentConditionIcaact;
 end

If necessary, project them back to the channels.

 % Backproject Condition 1-3 to channels if necessary.
 scalpChannelErpCondition1 = EEG.icawinv*allConditionIcaactErp(:,:,1);
 scalpChannelErpCondition2 = EEG.icawinv*allConditionIcaactErp(:,:,2);
 scalpChannelErpCondition3 = EEG.icawinv*allConditionIcaactErp(:,:,3);

How to obtain theoretical dipole projection when using pop_multifit() (10/13/2020 updated)

% Comment out pop_multifit() line 189.
% This is because dipfit_reject deletes items other than pos, mom, and rv.
% This function should not be used here anyway.
%EEG.dipfit.model  = dipfit_reject(EEG.dipfit.model, g.threshold/100);


% Add the following line in dipfit_nonlinear() line 128.
EEG.dipfit.model(cfg.component).theoreticalProj = source.dip.pot;


% Then try this code.
icIdx = 23;
d1 = EEG.dipfit.model(icIdx).datapot;
d2 = EEG.dipfit.model(icIdx).theoreticalProj;
rv = sum((d1-d2).^2) ./ sum(d1.^2);
EEG.dipfit.model(icIdx).rv; % Should be identical as rv obtained above.
 
figure
subplot(1,2,1)
topoplot(d1, EEG.chanlocs)
subplot(1,2,2)
topoplot(d2, EEG.chanlocs)

EEG data cleaning

How to perform epoch rejection using single-trial PSD (02/20/2020 updated)

For my own project, I wrote the following solution for bad trial rejection for pre-selected ICs. This uses rectangular window short-term Fourier Transform (STFT) to convert 1 epoch into 1 PSD time point. Then, it takes sum of power for the user-specified frequency band. Then they are converted into z-score, and finally threshold it with SD. The example code rejects epochs that have > 3SD power between 15-50 Hz.

% Example:
%           badTrialIdx = rejectTrialSTFT(EEG.icaact, 3, EEG.pnts, EEG.srate, [15 50]); % Use 15-50 Hz freq band to evaluate PSD.
%           EEG = pop_select(EEG, 'notrial', badTrialIdx);
%
% 02/04/2020 Makoto. Modified.
% 01/31/2020 Makoto. Used with modification.
% 11/08/2019 Makoto. Created. Changed from Hamming to Rectangular window (Thanks Masaki!)
 
function badTrialIdx = rejectTrialSTFT(icaact, SD, pnts, srate, freqEdges)
 
data2D = icaact(:,:);
 
totalExclusionIdx = [];
for icIdx = 1:size(data2D,1);
    %[S,F,T,P,Fc,Tc] = spectrogram(data2D(icIdx,:), pnts, 0, [], srate);         % This is by default Hamming.
    [S,F,T,P,Fc,Tc] = spectrogram(data2D(icIdx,:), rectwin(pnts), 0, [], srate); % This is rectangular window.
 
    log10P = 10*log10(P);
 
%     figure; imagesc(T, F, log10P); axis xy; colormap('jet')
 
    medianP = median(log10P,2);
 
%     figure; plot(F, medianP)
 
    errorP = bsxfun(@minus, log10P, medianP);
 
%     figure; imagesc(T, F, errorP, [-20 20]); axis xy; colormap('jet')
 
    zscoreHighFreqPower = zscore(sum(errorP(freqEdges(1):freqEdges(2),:))); % Targetting EMG.
 
 
%     figure; hist(zscoreHighFreqPower, 100)
 
    exclusionIdx = find(zscoreHighFreqPower>=SD);
 
    %{
 
    figure
    subplot(2,1,1)
    imagesc(T, F, errorP, [-20 20]); axis xy; colormap('jet')
    title('Error from median PSD')
    xlabel('Frames (s)')
    ylabel('10*log10(Power)/Hz')
 
    subplot(2,1,2)
    barHandle = bar(zscoreHighFreqPower, 1);
    xlim([0.5 length(zscoreHighFreqPower)-0.5])
    hold on
    line([0.5 length(zscoreHighFreqPower)-0.5], [SD SD], 'color', [1 0 0])
    title(sprintf('Z-scored error between %.0f and %.0f Hz.', freqEdges(1), freqEdges(2)))
    xlabel('Frames (s)')
    ylabel('Z-score')
 
    %}
 
    totalExclusionIdx = [totalExclusionIdx exclusionIdx];
end
 
badTrialIdx = unique(totalExclusionIdx);

Example use of code.

badTrialIdx = rejectTrialSTFT(EEG.icaact, 3, EEG.pnts, EEG.srate, [15 50]);
EEG = pop_select(EEG, 'notrial', badTrialIdx);

How to perform component selection using ICLabel() and dipole information (11/04/2020 updated)

 
    % Perform IC rejection using ICLabel scores and r.v. from dipole fitting.
    EEG = iclabel(EEG, 'default'); % Original output from eegh has a bug: the second input ('default') is without ''--this is fixed here.
 
    % Obtain the most dominant class label and its label probability.
    [~, mostDominantClassLabelVector] = max(EEG.etc.ic_classification.ICLabel.classifications, [], 2);
    mostDominantClassLabelProbVector = zeros(length(mostDominantClassLabelVector),1);
    for icIdx = 1:length(mostDominantClassLabelVector)
             mostDominantClassLabelProbVector(icIdx)  = EEG.etc.ic_classification.ICLabel.classifications(icIdx, mostDominantClassLabelVector(icIdx));
    end
 
         %% Option 1: Identify artifact ICs. The order of the classes are {'Brain'  'Muscle'  'Eye'  'Heart'  'Line Noise'  'Channel Noise'  'Other'}.
         %artifactLabelProbThresh = 0; % [0-1]
         %artifactIcIdx = find((mostDominantClassLabelVector==2 & mostDominantClassLabelProbVector>=artifactLabelProbThresh)|...
         %                (mostDominantClassLabelVector==3 & mostDominantClassLabelProbVector>=artifactLabelProbThresh)|...
         %                (mostDominantClassLabelVector==4 & mostDominantClassLabelProbVector>=artifactLabelProbThresh)|...
         %                (mostDominantClassLabelVector==5 & mostDominantClassLabelProbVector>=artifactLabelProbThresh)|...
         %                (mostDominantClassLabelVector==6 & mostDominantClassLabelProbVector>=artifactLabelProbThresh));
         %nonartifactIcIdx = setdiff(1:size(EEG.icaweights,1), artifactIcIdx);
         %nonartifactGoodDipIcIdx = intersect(insideBrainAndGoodRvIdx, nonartifactIcIdx);
         %EEG.etc.icStack.nonartifactIcIdx = nonartifactGoodDipIcIdx;
 
         % Option 2: Identify brain ICs. The order of the classes are {'Brain'  'Muscle'  'Eye'  'Heart'  'Line Noise'  'Channel Noise'  'Other'}.
         brainLabelProbThresh  = 0; % [0-1]
         brainIdx = find((mostDominantClassLabelVector==1 & mostDominantClassLabelProbVector>=brainLabelProbThresh));
 
    % Perform IC rejection using residual variance of the IC scalp maps.
    rvList    = [EEG.dipfit.model.rv];
    goodRvIdx = find(rvList < 0.15)'; % < 15% residual variance == good ICs.
 
    % Perform IC rejection using inside brain criterion.
    load(EEG.dipfit.hdmfile); % This returns 'vol'.
    dipoleXyz = zeros(length(EEG.dipfit.model),3);
    for icIdx = 1:length(EEG.dipfit.model)
        dipoleXyz(icIdx,:) = EEG.dipfit.model(icIdx).posxyz(1,:);
    end
    depth = ft_sourcedepth(dipoleXyz, vol);
    depthThreshold = 1;
    insideBrainIdx = find(depth<=depthThreshold);
 
    % Take AND across the three criteria.
    goodIcIdx = intersect(brainIdx, goodRvIdx);
    goodIcIdx = intersect(goodIcIdx, insideBrainIdx);
 
    % Perform IC rejection.
    EEG = pop_subcomp(EEG, goodIcIdx, 0, 1);
 
    % Post-process to update ICLabel data structure.
    EEG.etc.ic_classification.ICLabel.classifications = EEG.etc.ic_classification.ICLabel.classifications(goodIcIdx,:);
 
    % Post-process to update EEG.icaact.
    EEG.icaact = [];
    EEG = eeg_checkset(EEG, 'ica');


How to compute a robust (i.e. median-version of) standard deviation (07/26/2021 updated)

You know median is robust mean-equivalent and nicely neglect crazy values. But median +/- n*standard deviation is not a reasonable way to determine outliers because standard deviation is affected by the outliers. Thus you need a robust standard deviation-equivalent. It is called median absolute deviation which does use median instead of mean in the calculation. See this Wiki page. https://en.wikipedia.org/wiki/Median_absolute_deviation#Relation_to_standard_deviation In Matlab, mad() is the function to calculate this value. You also need to multiply the obtained value with the constant 1.4826 if Gaussian distribution is assumed (we usually do). An example of the use of robust outlier detection can be found in clean_channels() line 104 which is included in the clean_rawdata() plugin.

How to avoid the effect of rank-deficiency in applying ICA (03/31/2021 added)

This is for Zahra Fotovatnia and Ilaria Berteletti. Intuitively, if two lines in the 3-D space are aligned in parallel, these lines are lineary dependent and not counted as two (i.e. rank deficient). In terms of 0 or 1 decision, any non-parallel lines are technically 'linear independent'. Then, what happens when two lines are not exactly but very close to parallel? This can be measured as a condition number, which is where is the maximum singular value of , while is the minimum singular value of . Notice the denominator--if becomes very small, the condition number becomes very large. Of course, large condition number indicates unstability.


Now, pop_runica() line 531 hard-cords the variable 'rankTolerance = 1e-7'. This sets the smallest acceptable for runica() process. For example, if you have 32 channels but 2 of are smaller than 1e-7, runica() automatically removes these dimensions and will produce 30 ICs for your 32-channel data input. This is because even if the input data is technically full-ranked, if some of are very small, practically it is the same as rank deficient (i.e. ICA cannot appreciate the difference between parallel and very-close to parallel) and ICA most likely start to create ghost ICs or complex weight matrix etc, all of which indicate failure. Thus, the purpose of having this 'rankTolerance = 1e-7' is to protect the results from possible pathologically small . For example, if you interpolate one electrode using spline interpolation, the interpolated electrode has non-linearly synthesized time-series data that is not exactly linearly dependent but very close to linear dependency.


Following the heuristic 'rank tolerance' in the pop_runica(), it is a good protective approach to determine the effective rank using

dataRank = sum(eig(cov(double(EEG.data'))) > 1E-7);

instead of

dataRank = rank(double(EEG.data'));

Note that the threshold value 1E-7 was most likely determined heuristically. By using larger values, such as 1E-6, it may help to make it slightly more stable. If you use too large value here, however, it will start to reduce the number of ICs eventually. If you are in doubt with the behavior of ICA and/or its decomposition quality, you may tweak this value for test.

How to turn on the automatic protection against rank-deficient ICA in pop_runica (3/31/2021 added)

Near the end of pop_runica(), you find the following line.

tmprank2 = max(tmprank, tmprank2);

If you want to turn on this protection against rank-deficient ICA (which I recommend), change the code as follows.

%tmprank2 = max(tmprank, tmprank2);
tmprank2 = min(tmprank, tmprank2);

This change means that if the effective rank of the data, which is heuristically defined as the smallest eigenvalue of 10^-7, is smaller than the regular results of rank(), then the algorithm determines to use the effective data rank instead of true data rank. This is because true data rank including eigenvalue < 10^-7 is effectively rank deficient for ICA.

How to speed up ICA (10/22/2020 updated)

By downsampling (09/13/2021 modified)

ICA has a known bias toward high amplitude. If the data length were infinite, it would not have this bias--my former colleague told me so. Due to 1/f property of scalp-recorded EEG signal, which is predicted from the cable equation (see 'Electric Field of the Brain' p.174- by Nunez and Srinivasan, 2006), signal power in the higher frequency range is very small. This simply means that ICA has less things to learn from high frequency (by the way, ASR applies inverse EEG-PSD filter so that signals in those high frequency ranges, which are unlikely to be dominated by natural EEG, is enhanced so that it is detected for correction). If that's the case, why not cutting the high frequency from the beginning, at least for ICA purpose--that's the rational for this process. If you doubt it, you can anytime verify it comparing ICA results obtained from using 1000-Hz sampled data with that from using 100-Hz downsampled data. It is even possible that due to the band-limiting effect (100-Hz downsampled data is 50-Hz low-pass filtered because of anti-aliasing), ICA result could be even better. The favorable effect of low-pass filtering before ICA has been told in the past EEGLAB workshops by Scott himself, I believe ('Applying 100-Hz low-pass filter before ICA is also important...') At least, it will speed up ICA.

% Run ICA.
EEG_forICA = pop_resample(EEG, 100);
EEG_forICA = pop_runica(EEG_forICA, 'extended',1,'interupt','off');
EEG.icaweights = EEG_forICA.icaweights;
EEG.icasphere  = EEG_forICA.icasphere;
EEG = eeg_checkset(EEG, 'ica');

How does it affect the recommended data length? (10/22/2020 added)

This is for Syanah Wynn--If you downsample the data from 250Hz to 100Hz, you have only 40% of data length. Does it negatively impact ICA's performance, since there is much less data available? This is something a simulation study can answer empirically, but the empirical answer is no. In fact, I don't recommend to use SCCN's rule of thumb literally--I would rather add an additional condition as follows: ICA requires data length of ((ch)^2)*30 if data were sampled at 250 Hz. I have been applying this downsampling-to-100Hz approach as ICA preprocessing over several thousands of datasets, in some of which the data length became shorter than the above recommendation but I did not see any impact. So, the length in time in the real world counts rather than frames. For example, 5-min data can be 30,000 frames when sampled at 100 Hz, and 300,000 frames when sampled at 1000Hz. Does the latter help ICA decomposition quality? It does not. It depends on how much information scalp-recorded EEG contains above 100Hz compared with below 100Hz.


This view is supported by the following description taken from Nunez and Srinivasan (2006) p.308: "EEG (and ECoG) power in mammals tends to fall off sharply above 50 or 100 Hz, apparently reflecting cortical cell time constants in the 10 ms range as indicated in chapter 4. That is, we expect minimal changes in the mesosource function P(r,t) over mesoscopic "relaxation times" in the range of 10 ms or perhaps several tens of milliseconds."


By modifying runica.m (09/13/2021 added)

This is for Pål. During the 30th EEGLAB workshop, one of the workshop attendees Tjerk Gutteling told us that by commenting out the lines for 'drawnow' in runamica() speeds up ICA by several times (yes, several times, not several percent). I immediately tested it and confirmed that was true. Kudos to Tjerk! What a hacker. This discovery is mentioned in EEGLAB github and a post to the EEGLAB mailing list by Curtis Bingham. This hack will be integrated into an official EEGLAB update and released in future. Meanwhile, if you are still using EEGLAB 2021 or older, it is worth trying this trick.

  1. Open runica() by typing 'edit runica' in the Matlab command line
  2. Search for 'drawnow;'. There are 5 of them in the code.
  3. Replace them with 'drawnow;' with 'drawnow limitrate;' (or alternatively comment out the line as '%drawnow;' if you don't need to interrupt the process)

runica's drawnow conflicts with PowPowCAT? (09/13/2021 added)

Relatedly, I also got a report that under certain condition PowPowCAT, one of the EEGLAB plugins I wrote to calculate comodugram/comodulogram, and runica() can cause conflict (I could not replicate the issue). The error message is as follows (thanks Pål Gunnar Larsson for reporting this error)

matlab.graphics.internal.figfile.FigFile/read>@(hObject,eventdata)PowPowCAT('PowPowCATFigure_WindowButtonMotionFcn',hObject,eventdata,guidata(hObject))
Error using runica (line 872)
Error while evaluating Figure WindowButtonMotionFcn.

Even though I don't exactly know how the conflict happens, to avoid this issue, most likely one can simply comment out 'drownow;' lines as suggested above.

How to export data to an Excel file (08/09/2020 added, prayer for Nagasaki)

This is an example to write ERP (i.e., data needs to be epoched) with channel labels and latency for Zahra Fotovatnia and Zoran Josipovic.

% Obtain electrode labels.
chLabels = {EEG.chanlocs.labels}';
 
% If the electrode labels are just numbers, you have to add non-numeric
% characters before that, otherwise writetable() fails. Very Dumb.
for chIdx = 1:length(chLabels)
    chLabels{chIdx} = ['Ch' chLabels{chIdx}];
end
 
% Obtain ERP latency.
latency  = cellstr(num2str(EEG.times'));
 
% Obtain all-electrode ERP.
ERP = mean(EEG.data,3)';
 
% Prepare output table.
outputTable = array2table(ERP, 'VariableNames', chLabels, 'RowNames', latency);
 
% Write the output table. Using '.xlsx' automatically specifies Excel
% format so no need to use 'FileType', 'spreadsheet' option.
writetable(outputTable, 'savePath/test.xlsx',...
           'WriteVariableNames', true, 'WriteRowNames', true);

How to send email from Matlab to report progress (07/26/2019 added)

% Set up Matlab preferences.
setpref('Internet','SMTP_Server','XXXX.XXXX.XXX');
setpref('Internet','E_mail','XXXXXXXXXXX@XXXX.XXX');
 
allSets = dir('/data/projects/example/p0100_imported/*.set');
ticTocList = nan(length(allSets),1);
for setIdx = 1:length(allSets)
 
    ticStart = tic;
 
 
    %%%%%%%%%%%%%%%%%%%%%%%%%%
    %%% Your process here. %%%
    %%%%%%%%%%%%%%%%%%%%%%%%%%
 
 
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    %%% Calculate time lapse. %%%
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    ticTocList(setIdx) = toc(ticStart);
 
    if mod(setIdx, 100) == 0
 
        meanProcessingDurationInSec = nanmean(ticTocList);
        stdProcessingDurationInSec  = nanstd(ticTocList);
        meanRemainingProcessingDurationInDay = (length(allSets)-setIdx)*meanProcessingDurationInSec/(60*60*24);
        stdRemainingProcessingDurationInDay  = (length(allSets)-setIdx)*stdProcessingDurationInSec/(60*60*24);
 
        %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
        %%% Send email for update. %%%
        %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
        sendmail({'XXXXXXXX@XXXX.XXX', 'XXXXXXXXX@XXXX.XXX'},...
            sprintf('From Matlab running batch: %.0f/%.0f done', setIdx, length(allSets)),...
            sprintf('Estimated remaining time: %.1f days (SD %.1f)', meanRemainingProcessingDurationInDay, stdRemainingProcessingDurationInDay));
    end
end


How to perform time-frequency analysis with variable epoch length (10/13/2020 added)

This solution is for Steve Wu, and also could have been for Kelly Michaelis, although I took a different approach in 2019--but seems apparent that this solution was more straightforward than hacking the STUDY structure. I feel I tried to be too loyal to EEGLAB and made a wrong decision at that time (whichever works anyway... it's a matter of efficiency and integrity.)


Here in this example, ERSP for 'latencyZeroEvent', which are 'DIN6' and 'DIN60', is calculated for which baseline is calculated for 'baselineOnsetEvent'. The whole point is that you can calculate ERSP with flexibly specified baseline period even if there are duration-varying events between the two conditions. Use another solution 'How to obtain the optimum number of clusters' together and you can perform your own IC clustering with ERSP.

% Define events of interest/non-interest.
latencyZeroEvent    = {'DIN6', 'DIN60'};
epochWinEdges       = [-2 1.5];
baselineOnsetEvent  = 'READY'; % Baseline window is 2-s.
baselineWinEdges    = [0 1.5];
epochExclusionEvent = 'boundary';
windowStepSize      = 0.02; % s
windowLengthInSteps = (epochWinEdges(2)-epochWinEdges(1))/windowStepSize;
windowLengthInFrame = 557; % For 1.5-Hz at 3 cycles. 
                           % This must be obtained from the command line log
                           % shown after running a newtimefreq() with above
                           % parameters.
 
allSets = dir('[loadingDataFullPath]/*.set');
for setIdx = 1:length(allSets)
 
    % Load file.
    loadName = allSets(setIdx).name;
    EEG = pop_loadset('filename', loadName, 'filepath', '[loadingDataFullPath]/');
 
    % Loop for ICs.
    icErspStack = zeros(100, windowLengthInSteps, size(EEG.icaact,1));
    for icIdx = 1:size(EEG.icaact,1)
 
        %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
        %%% Compute EEGLAB's pseudo-wavelet (i.e. STFT-hybrid). %%%
        %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
        timesoutParam = round((EEG.xmax-windowLengthInFrame*1/EEG.srate)/windowStepSize); % Temporal resolution is set to 20 ms.
 
        % Perform time-frequency decomposition. Do not calculate baseline at this point.
        [~, ~, ~, waveletTimesS, freqs, ~, ~, coeffs] = ...
            newtimef(EEG.icaact(icIdx,:), EEG.pnts, [EEG.xmin EEG.xmax], EEG.srate, [3 10], ...
            'baseline', NaN,'nfreqs', 100, 'freqs', [1.5 55], 'freqscale', 'log', ...
            'timesout', timesoutParam, 'plotersp', 'off', 'plotitc', 'off');
 
        % Find critical events.
        allEvents = {EEG.event.type}';
        latencyZeroIdx     = find(strcmp(allEvents, latencyZeroEvent{1}) | strcmp(allEvents, latencyZeroEvent{2}));
        baselineOnsetIdx   = find(strcmp(allEvents, baselineOnsetEvent));
        epochExclusionIdx1 = find(strcmp(allEvents, epochExclusionEvent));
        latencyZeroLatencyS    = [EEG.event(latencyZeroIdx).latency]*1/EEG.srate; % s.
        baselineOnsetLatencyS  = [EEG.event(baselineOnsetIdx).latency]*1/EEG.srate; % s.
        epochExclusionLatencyS = [EEG.event(epochExclusionIdx1).latency]*1/EEG.srate; % s.
 
        %%%%%%%%%%%%%%%%%%%%%%%%%%
        %%% Obtain the epochs. %%%
        %%%%%%%%%%%%%%%%%%%%%%%%%%
        % Obtain raw epoch edges.
        epochEdges = [[latencyZeroLatencyS + epochWinEdges(1)]' [latencyZeroLatencyS + epochWinEdges(2)]'];
 
        % Exclude epochs that contains epochExclusionLatencyS.
        epochExclusionIdx2 = [];
        for exclusionIdx = 1:length(epochExclusionLatencyS)
            currentExclusionEventS = epochExclusionLatencyS(exclusionIdx);
            for epochIdx = 1:size(epochEdges,1)
                if epochEdges(epochIdx,1)<=currentExclusionEventS & epochEdges(epochIdx,2)>=currentExclusionEventS
                    epochExclusionIdx2 = sort([epochExclusionIdx2; epochIdx]);
                end
            end
        end
        epochEdgesRej = epochEdges;
        epochEdgesRej(epochExclusionIdx2,:) = [];
 
        coeffEpochTensor = zeros(length(freqs), windowLengthInSteps, size(epochEdgesRej,1));
        for epochIdx = 1:size(epochEdgesRej,1)
            epochFrameIdx = find(waveletTimesS > epochEdgesRej(epochIdx,1) & waveletTimesS < epochEdgesRej(epochIdx,2));
            if     length(epochFrameIdx) < windowLengthInSteps
                epochFrameIdx(end+1) = epochFrameIdx(end)+1;
            elseif length(epochFrameIdx) > windowLengthInSteps
                epochFrameIdx(end+1) = epochFrameIdx(end)-1;
            end
            coeffEpochTensor(:,:,epochIdx) = coeffs(:,epochFrameIdx);
        end
 
        %%%%%%%%%%%%%%%%%%%%%%%%%%%%
        %%% Obtain the baseline. %%%
        %%%%%%%%%%%%%%%%%%%%%%%%%%%%
        % Obtain raw epoch edges.
        baselineEdges = [[baselineOnsetLatencyS + baselineWinEdges(1)]' [baselineOnsetLatencyS + baselineWinEdges(2)]'];
 
        % Exclude epochs that contains epochExclusionLatencyS.
        baselineExclusionIdx = [];
        for exclusionIdx = 1:length(epochExclusionLatencyS)
            currentExclusionEventS = epochExclusionLatencyS(exclusionIdx);
            for baselineIdx = 1:size(baselineEdges,1)
                if baselineEdges(baselineIdx,1)<=currentExclusionEventS & baselineEdges(baselineIdx,2)>=currentExclusionEventS
                    baselineExclusionIdx = sort([baselineExclusionIdx; baselineIdx]);
                end
            end
        end
        baselineEdgesRej = baselineEdges;
        baselineEdgesRej(baselineExclusionIdx,:) = [];
 
        % Extract epochs. Length of 75 was determined from data.
        coeffBaselineTensor = zeros(length(freqs), 75, size(baselineEdgesRej,1));
        for baselineIdx = 1:size(baselineEdgesRej,1)
            baselineFrameIdx = find(waveletTimesS > baselineEdgesRej(baselineIdx,1) & waveletTimesS < baselineEdgesRej(baselineIdx,2));
            if     length(baselineFrameIdx) < 75
                baselineFrameIdx(end+1) = baselineFrameIdx(end)+1;
            elseif length(baselineFrameIdx) > 75
                baselineFrameIdx(end+1) = baselineFrameIdx(end)-1;
            end
            coeffBaselineTensor(:,:,baselineIdx) = coeffs(:,baselineFrameIdx);
        end
 
        %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
        %%% Perform baseline correction. %%%
        %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
        epochErspRaw        = coeffEpochTensor.*conj(coeffEpochTensor);
        epochErspNoBaseline = 10*log10(mean(epochErspRaw,3));
        baselineErspRaw = coeffBaselineTensor.*conj(coeffBaselineTensor);
        baselinePower   = mean(10*log10(mean(baselineErspRaw,3)),2);
        epochErsp       = bsxfun(@minus, epochErspNoBaseline, baselinePower);
 
        % Store the result.
        icErspStack(:,:,icIdx) = epochErsp;
    end
 
    % Store the result.
    EEG.etc.customErsp.icErspStack         = icErspStack;
    EEG.etc.customErsp.windowStepSize      = windowStepSize;
    EEG.etc.customErsp.freqs               = freqs;
    EEG.etc.customErsp.latencyZeroEvent    = latencyZeroEvent;
    EEG.etc.customErsp.epochWinEdges       = epochWinEdges;
    EEG.etc.customErsp.baselineOnsetEvent  = baselineOnsetEvent;
    EEG.etc.customErsp.baselineWinEdges    = baselineWinEdges;
    EEG.etc.customErsp.epochExclusionEvent = epochExclusionEvent;    
 
    % Save the data.
    pop_saveset(EEG, 'filename', loadName, 'filepath', '[savingDataFullPath]')
end


How to perform time-frequency transformation faster (11/29/2020 added)

Calculation of of the (pseudo-)wavelet transform is calculated in the function timefreq(). It takes place inside the double for-loops, the first one for time points ('index') and the second one frequency bins ('freqind'). For a standard time-frequency analysis with 200 time points ('timesout') and 50 frequencies ('nfreqs'), it executes the calculation (tmpX = sum(wav .* tmpX);) 10,000 times. If you want to decompose 30-min continuous data for every 100 ms, there will be 1800/0.1*50 = 900,000 iteration of this calculation per channel/IC. If you have 64-channel data, there will be 5,760,000 iterations. Even if each iteration takes 1 ms, it will take 1.6 hours to finish the calculation. In theory, if we can use indices instead of for-loops, we can speed up the process. For proof of concept, I modified the existing timefreq() to replace the existing for-loops blocks with the following one.

        % In timefreq(), replace the segment after 'else' at around line 420 with the following segment.
 
        %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
        %%% Modified code. (11/29/2020 Makoto) %%%
        %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
        % Zero-pad g.win so that wavelet daughters have the same length. This allows matrix operation. (11/29/2020 Makoto).
        longestWindow = length(g.win{1});
        waveletDaughters = zeros(longestWindow, length(g.win));
        for freqIdx = 1:length(g.win)
            currentDaughter = g.win{freqIdx};
            onsetFrameToCenter = (size(waveletDaughters,1)-length(currentDaughter))/2+1;
            centeredFrameIdx = onsetFrameToCenter:onsetFrameToCenter+length(currentDaughter)-1;
            waveletDaughters(centeredFrameIdx,freqIdx) = currentDaughter;
        end
 
        % Obtain time points to which the wavelet daughters are applied.
        tfFrameList     = g.indexout;
        timeIdxTemplate = -(size(waveletDaughters,1)-1)/2:(size(waveletDaughters,1)-1)/2;
        timeIdx2D       = tfFrameList+timeIdxTemplate';
        timeIdx1D       = timeIdx2D(:);
 
        % Obtain input data that are ready to be wavelet-transformed.
        tmpX2D = double(data(timeIdx1D,:)); % Wavelet width x data frames.
        tmpX3D = reshape(tmpX2D, [size(timeIdx2D,1) size(timeIdx2D,2) size(tmpX2D,2)]);
        tmpX3D_demeaned = bsxfun(@minus, tmpX3D, mean(tmpX3D,1));
        tmpX1D = tmpX3D_demeaned(:);
 
        repetitionCounts = length(tmpX1D)/size(waveletDaughters,1);
        waveletDaughtersFullLength = repmat(waveletDaughters, [repetitionCounts 1]);
 
        % Perform wavelet transform.
        waveletCoeff = bsxfun(@times, waveletDaughtersFullLength, tmpX1D);
        waveletCoeff4D = reshape(waveletCoeff, [size(timeIdx2D,1) size(timeIdx2D,2) size(tmpX2D,2) size(waveletCoeff,2)]);
        if size(waveletCoeff4D,3)==1 % For continuous data.
            waveletCoeff3D = permute(squeeze(mean(waveletCoeff4D,1)), [2 1 3]);
        else                         % For epoched data.
            waveletCoeff3D = permute(squeeze(sum(waveletCoeff4D,1)), [3 1 2]); % freqs, times, epochs.
        end
        tmpall = waveletCoeff3D;

I confirmed 150-200% of speed increase depending on how many time points and frequency bins there are. Generally, the difference becomes larger if more iteration is necessary in the original code. Noticeably, when I processed continuous data (1700 s, 50 freq bins, 0.1-s resolution), the speed increase recorded 450%.


I would like to share this modified code for the beta test. Please download it from the link below and overwrite the existing timefreq.m under the folder [eeglabroot]/functions/timefreqfunc. Do not forget to take a back up of timefreq.m before you overwrite it.

[Download zipped timefreq.m]

My current concern is that this implementation is RAM-instensive, particularly for continuous data transformation. Make sure you have a few GB of RAM avaialble. We can still push further to save RAM by optimizing code (changing variables to single, overwrite variables instead of creating others with different names etc).

STUDY (Before EEGLAB14)

How to obtain the optimum number of clusters (01/14/2020 updated)

This solution consists 1) Obtain xyz coordinates of estimated equivalent dipoles from all .set files loaded and stored in ALLEEG streucture, 2) Run Matlab evalclusters() to perform optimization between the range of 10-30 using three algorithms (Calinski-Harabasz, Silhouette, Davies-Bouldin). Note that I am only using dipole location for the clustering criterion (if you wonder why, see this page.) Accordingly, my code below is also separated into two sections. Last time I got number 22, then 13-16. In this case, I took 13. I prefer to select the smallest number available across everything so that I could get maximum number of unique subjects per cluster.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%% Obtain all dipole xyz coordinates as a list. %%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
dipXyz = [];
for subjIdx = 1:length(ALLEEG)
 
    % Obtain xyz, dip moment, maxProj channel xyz.   
    xyz = zeros(length(ALLEEG(subjIdx).dipfit.model),3);
    for modelIdx = 1:length(ALLEEG(subjIdx).dipfit.model)
 
        % Choose the larger dipole if symmetrical.
        currentXyz = ALLEEG(subjIdx).dipfit.model(modelIdx).posxyz;
        currentMom = ALLEEG(subjIdx).dipfit.model(modelIdx).momxyz; % nAmm.
        if size(currentMom,1) == 2
            [~,largerOneIdx] = max([norm(currentMom(1,:)) norm(currentMom(2,:))]);
            currentXyz = ALLEEG(subjIdx).dipfit.model(modelIdx).posxyz(largerOneIdx,:);
        end
        xyz(modelIdx,:) = currentXyz;
    end
    dipXyz = [dipXyz; xyz];
end
 
 
 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%% Optimize the number of clusters between the range 10-30. %%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Use Matlab function evalclusters().
kmeansClusters = [];
for clustIdx = 10:30
    kmeansClusters(:,clustIdx) = kmeans(dipXyz, clustIdx, 'emptyaction', 'singleton', 'maxiter', 10000, 'replicate', 100);
end
 
kmeansClusters = kmeansClusters(:,10:30);
eva1 = evalclusters(dipXyz, kmeansClusters, 'CalinskiHarabasz');
eva2 = evalclusters(dipXyz, kmeansClusters, 'Silhouette');
%eva3 = evalclusters(dipXyz, kmeansClusters, 'gap', ); % Slow and not consistent value.
eva4 = evalclusters(dipXyz, kmeansClusters, 'DaviesBouldin');
 
figure
subplot(1,3,1)
plot(eva1); title('CalinskiHarabasz');
subplot(1,3,2)
plot(eva2); title('Silhouette');
subplot(1,3,3)
plot(eva4); title('DaviesBouldin');

How to transfer precomputed STUDY (10/11/2018 updated)

I found that just copying all the files from one folder to another does not copy the precomputed results. After investigation, I realized that this is because when STUDY is pre-computed (i.e., ERP, PSD, ERSP, etc), the generated files are stored in the HDD and linked with STUDY.design.filepath and STUDY.design.cell.filebase. If these are the all the things that needs to be updated, we can do it using the following code. After loading the copied STUDY, run the code, and save the current STUDY with a different name (just in case). Load the saved STUDY and everything should be there. At least it worked for me.

for designIdx = 1:length(STUDY.design)
 
    % Obtain the STUDY.design filepath with which the original data were precomputed.
    currentDesignFilepath = STUDY.design(designIdx).filepath;
 
    % Obtain the STUDY.design filepath which is current.
    newDesignFilepath     = '/data/mobi/Christiane/p01_study';
 
    % Replace the original path with the current path.
    STUDY.design(designIdx).filepath = newDesignFilepath;
 
 
    for cellIdx = 1:length(STUDY.design(designIdx).cell)
 
        % Obtain the STUDY.design.cell filepath with which the original data were precomputed.
        currentCellFilepath = STUDY.design(designIdx).cell(cellIdx).filebase;
 
        if strncmp(currentCellFilepath, currentDesignFilepath, length(currentDesignFilepath));
 
            % Obtain the STUDY.design filepath which is current.
            oldFilepathRemoved            = currentCellFilepath(length(currentDesignFilepath)+1:end);
            newDesignFilepathConcatenated = [newDesignFilepath oldFilepathRemoved];
 
            % Replace the original path with the current path.
            STUDY.design(designIdx).cell(cellIdx).filebase = newDesignFilepathConcatenated;
        else
            error('Oops, the original filepath did not match.')
        end
    end
end

How to load .set files to create STUDY (02/27/2020 updated)

Typically, you need to enter subject names. Sometimes, group names as well. Rarely in may case, condition names. Entering these info one by one on GUI is horrible, particularly if you have many data sets. Here is code to do it automatically. It is not only efficient but it also eliminates human errors YOU will make. After loading all the .set files, use EEGLAB GUI 'Using all loaded dataset' to create a STUDY. This code is also a good example for the basic use of 'for loop' for batch processing.

% Obtain all .set file under /data/makoto/exampleProject/.
% In this example, suppose all set files have names like subj123_group2.set
 
targetFolder = '/data/makoto/exampleProject';
allSetFiles = dir([targetFolder filesep '*.set']); % filesep inserts / or \ depending on your OS.
 
% Start the loop.
for setIdx = 1:length(allSetFiles)
 
    % Obtain the file names for loading.
    loadName = allSetFiles(setIdx).name; % subj123_group2.set
    dataName = loadName(1:end-4);        % subj123_group2
 
    % Load data. Note that 'loadmode', 'info' is to avoid loading .fdt file to save time and RAM.
    EEG = pop_loadset('filename', loadName, 'filepath', targetFolder, 'loadmode', 'info');
 
    % Enter EEG.subjuct.
    EEG.subject = dataName(1:7); % subj123
 
    % Enter EEG.group.
    EEG.group = dataName(9:14); % group2
 
    % Store the current EEG to ALLEEG.
    [ALLEEG, EEG, CURRENTSET] = eeg_store(ALLEEG, EEG, 0);
end
eeglab redraw % This is to update EEGLAB GUI so that you can build STUDY from GUI menu.


How to obtain more robust clustering result (10/24/2020 added)

This is for Juan Molina--The EEGLAB's default parameter setting for kmeans() may allow noticable variability in performing IC clustering in multiple times (except for the order of the clusters--it varies inevitably, but I'm talking about robustness of the IC-clustering pattern). We can actually do better than the default parameters. In pop_clust() line 184, there is a line of code that performs kmeans which I suggest to change as follows.

%[IDX,C,sumd,D] = kmeans(clustdata,clus_num,'replicates',10,'emptyaction','drop');
[IDX,C,sumd,D] = kmeans(clustdata,clus_num,'replicates',100,'emptyaction','drop','MaxIter',10000);

The above change only costs one-second longer processing time. If you want, you may increase the number for 'replicates' more generously.

Example of batch code to preprocess multiple subjects (03/02/2021 updated)

See this page for detail of the processes. Note that this is NOT an official SCCN opinion. Data are not epoched here.

%% Example of batch code to preprocess multiple subjects
 
% Step 1: Change the option to use double precision.
 
cd('/data/project/example')
rawDataFiles = dir('*.bdf');
for subjID = 1:length(rawData) 
    loadName = rawDataFiles(subjID).name;
    dataName = loadName(1:end-4);
 
    % Step2: Import data.
    EEG = pop_biosig(loadName);
    EEG.setname = dataName;
 
    % Step 3: Downsample the data.
    EEG = pop_resample(EEG, 250);
    %EEG = pop_resample(EEG, 250, 0.8, 0.4); % For SIFT to ease antialiasing filter slope.
 
    % Step 4: High-pass filter the data at 1-Hz. Note that EEGLAB uses pass-band edge, therefore 1/2 = 0.5 Hz.
    EEG = pop_eegfiltnew(EEG, 1, 0, 1650, 0, [], 0);
 
    % Step 5: Import channel info
    EEG = pop_chanedit(EEG, 'lookup','[EEGLABroot]/eeglab/plugins/dipfit2.3/standard_BEM/elec/standard_1005.elc','eval','chans = pop_chancenter( chans, [],[]);');
 
    % Step 6: Remove line noise using CleanLine
    %EEG = pop_cleanline(EEG, 'bandwidth', 2,'chanlist', [1:EEG.nbchan], 'computepower', 0, 'linefreqs', [50 100 150 200 250],...
    %    'normSpectrum', 0, 'p', 0.01, 'pad', 2, 'plotfigures', 0, 'scanforlines', 1, 'sigtype', 'Channels', 'tau', 100,...
    %    'verb', 1, 'winsize', 4, 'winstep', 4);
 
    % Step 6: Run cleanLineNoise (requires PREP Pipeline toolbox) The original CleanLine was replaced (04/25/2020).
    signal      = struct('data', EEG.data, 'srate', EEG.srate);
    lineNoiseIn = struct('lineNoiseMethod', 'clean', ...
                         'lineNoiseChannels', 1:EEG.nbchan,...
                         'Fs', EEG.srate, ...
                         'lineFrequencies', [60 120 180 240],...
                         'p', 0.01, ...
                         'fScanBandWidth', 2, ...
                         'taperBandWidth', 2, ...
                         'taperWindowSize', 4, ...
                         'taperWindowStep', 1, ...
                         'tau', 100, ...
                         'pad', 2, ...
                         'fPassBand', [0 EEG.srate/2], ...
                         'maximumIterations', 10);
    [clnOutput, lineNoiseOut] = cleanLineNoise(signal, lineNoiseIn);
    EEG.data = clnOutput.data;
 
    % Step 7: Apply clean_rawdata() to reject bad channels and correct continuous data using Artifact Subspace Reconstruction (ASR). 
    % Note 'availableRAM_GB' is for clean_rawdata1.10. For any newer version, it will cause error.
    originalEEG = EEG;
    EEG = clean_rawdata(EEG, 5, -1, 0.85, 4, 20, 0.25, 'availableRAM_GB', 8);
 
    % Step 8: Interpolate all the removed channels
    EEG = pop_interp(EEG, originalEEG.chanlocs, 'spherical');
 
    % Step 9: Re-reference the data to average
    EEG.nbchan = EEG.nbchan+1;
    EEG.data(end+1,:) = zeros(1, EEG.pnts);
    EEG.chanlocs(1,EEG.nbchan).labels = 'initialReference';
    EEG = pop_reref(EEG, []);
    EEG = pop_select( EEG,'nochannel',{'initialReference'});
 
    % Step 10: Run AMICA using calculated data rank with 'pcakeep' option
    dataRank = sum(eig(cov(double(EEG.data'))) > 1E-6); % 1E-6 follows pop_runica() line 531, changed from 1E-7.
    runamica15(EEG.data, 'num_chans', EEG.nbchan,...
        'outdir', ['/data/projects/example/amicaResults/' dataName],...
        'pcakeep', dataRank, 'num_models', 1,...
        'do_reject', 1, 'numrej', 15, 'rejsig', 3, 'rejint', 1);
    EEG.etc.amica  = loadmodout15(['/data/projects/example/amicaResults/' dataName]);
    EEG.etc.amica.S = EEG.etc.amica.S(1:EEG.etc.amica.num_pcs, :); % Weirdly, I saw size(S,1) be larger than rank. This process does not hurt anyway.
    EEG.icaweights = EEG.etc.amica.W;
    EEG.icasphere  = EEG.etc.amica.S;
    EEG = eeg_checkset(EEG, 'ica');
 
    % Step 11: Estimate single equivalent current dipoles
    % Note: if ft_datatype_raw() complains about channel numbers, comment out (i.e. put % letter in the line top) line 88 as follows
    % assert(size(data.trial{i},1)==length(data.label), 'inconsistent number of channels in trial %d', i);
 
    % CASE1: If you are using template channel locations for all the subjects:
    %        -> Perform 'Head model and settings' from GUI on one of the
    %           subjects, then type 'eegh' to obtain 9 parameters called 'coord_transform'
    %           For example, my 32ch data has [0.68403 -17.0438 0.14956 1.3757e-07 1.0376e-08 -1.5708 1 1 1], therefore
    %
    %           coordinateTransformParameters = [0.68403 -17.0438 0.14956 1.3757e-07 1.0376e-08 -1.5708 1 1 1];
    %
    % CASE2: If you are using digitized (i.e. measured) channel locations that have 10-20 system names (Fz, Cz, ...) 
    %        -> Calculate the invidivualized transform parameters usign all scalp channels in this way
    %          
    %           [~,coordinateTransformParameters] = coregister(EEG.chanlocs, '[EEGLABroot]/eeglab/plugins/dipfit2.3/standard_BEM/elec/standard_1005.elc', 'warp', 'auto', 'manual', 'off')
    %
    % CASE3: If you are using digitized channel locations that do NOT have 10-20 system names
    %        -> Identify several channels that are compatible with 10-20 channel locations, rename them with 10-20 system labels, perform CASE2, then rename them back.
    %           Alternatively, rename the fiducial channels under EEG.chaninfo.nodatchans into 'Nz', 'LPA', 'RPA' accordingly, then use these fiducial channels to perform CASE2 in this way
    %
    %           [~,coordinateTransformParameters] = coregister(EEG.chaninfo.nodatchans, '[EEGLABroot]/eeglab/plugins/dipfit2.3/standard_BEM/elec/standard_1005.elc', 'warp', 'auto', 'manual', 'off')
 
    templateChannelFilePath = '[EEGLABroot]/eeglab/plugins/dipfit2.3/standard_BEM/elec/standard_1005.elc';
    hdmFilePath             = '[EEGLABroot]/eeglab/plugins/dipfit2.3/standard_BEM/standard_vol.mat';
    EEG = pop_dipfit_settings( EEG, 'hdmfile', hdmFilePath, 'coordformat', 'MNI',...
        'mrifile', '[EEGLABroot]/eeglab/plugins/dipfit2.3/standard_BEM/standard_mri.mat',...
        'chanfile', templateChannelFilePath, 'coord_transform', coordinateTransformParameters,...
        'chansel', 1:EEG.nbchan);
    EEG = pop_multifit(EEG, 1:EEG.nbchan,'threshold', 100, 'dipplot','off','plotopt',{'normlen' 'on'});
 
    % Step 12: Search for and estimate symmetrically constrained bilateral dipoles
    EEG = fitTwoDipoles(EEG, 'LRR', 35);
 
    % Step 13: Run ICLabel (Pion-Tonachini et al., 2019)
    EEG = iclabel(EEG, 'default');
 
    % Save the dataset
    EEG = pop_saveset(EEG, 'filename', dataName, 'filepath', '/data/projects/example');
end

What do those AMICA parameters mean? (04/06/2018 updated)

  • 'do_reject', 1

This means 'Reject the outliers for the AMICA model being computed'.

  • 'numrej', 15

This means 'Do the rejection for 15 iterations' Note that these datapoint rejection is only valid for AMICA computation, so it does not change the data length. If you want to know which data points were rejected, you can check EEG.etc.amica.Lht. If any datapoint shows 0, it means the datapoints were rejected by AMICA. Note also that you might want to use this info for further data cleaning if you don't mind rejecting randomly distributing single datapoints.

  • 'rejsig', 3

This means 'Use rejection criterion as SD==3' Note that the outlier criterion and datapoints are updated for every iteration.

  • 'rejint', 1

This means 'Use the rejection interval == 1 (i.e., for the first 15 continuous iterations, perform this outlier rejection.'

The reason for choosing these parameters are

  1. In most cases, the above parameters rejects total of 10% of data points.
  2. This set of parameters and options were recommended by my colleague Jason, the developer of AMICA.


Recommended preprocessing steps and order (05/11/2020 updated)

I suggest one go through the processes explained in this page in the following order:

  1. Example of batch code to preprocess multiple subjects
  2. How to perform component selection using ICLabel() and dipole information
  3. How to perform epoch rejection using single-trial PSD

This is because ICLabel() has dependency on IC-scalp maps (but not dipole information, thanks Luca!), and the epoch rejection function works on single trials of ALL ICs. This means that even if one of the all ICs shows a bad behavior in an epoch, the entire epoch will be rejected. This is why you want to pre-select brain ICs only, otherwise this epoch rejection may reject too many epochs.