Makoto's useful EEGLAB code

From SCCN
Jump to: navigation, search

I hope to help those who are eager to code to use EEGLAB more efficiently. Actually, the real purpose of building EEGLAB was to drive users to learn coding... Scott told me so. The following examples show how to deal with EEGLAB data structure efficiently without using 'for loops'. If you find an error or improvement, please share it on EEGLAB mailing list.


Contents

How to export Matlab figures in publication quality (09/26/2020 added)

This is for Mohamed Mounir Tellache. Although we always need publication-quality figures in the end when we write a paper, convenient solutions are often lacking. The last resort is to take a screenshot, which I sometimes need to do too, but image quality is severely limited (unless the figure is displayed in a full-screen mode on an 8k monitor).

I typically use anything other than Matlab, but the only exception is Adobe Illustrator. Without it, I may not have a complete control over the figure details. I think free softs such as Gimp will do the same. My recommendation below is based on the use of these addiitonal applications.

If your Matlab figure is not too complicated, such as simple line plots or colorful time-frequency plots, following code exports the 'current figure' (i.e. which is the figure currently selected on Matlab; click the figure and usually you can make it current, unless the figure is set to 'unable to become current' explicitly in figure property setting) to the user-specified full path. The option '-dsvg' specifies scalable vector graphic, which is the best choice if you use Adobe Illustrator or Gimp. I always use '-dsvg' as the first option.

print('/data/project/makoto/project/saveFigureName', '-dsvg')

Note that by default the figure background is white. If you want to reflect the actual background color, you need to run the additional line.

set(gcf, 'InvertHardcopy', 'off')

On the other hand, if the figure is too complicated (i.e., dipole plot on a MRI image), '-dsvg' fails to output vector graphics and instead save something fake with no apology. I need to talk to parents of the engineer who wrote this part of code in Mathworks. If this is unfortunately the case, you may want to export it as high-resolution jpeg (or any other lossy or lossless format as you like) as follows.

print('/data/project/makoto/project/saveFigureName', '-djpeg95', '-r300')

'-djpeg95' is an option to use jpeg with 'Quality 95' compression. If you want to suppress the file size, use smaller values such as 80 or 70. '-r300' is an option to specify output resolution, and '-r300' specifies 300 dpi (dots per inch).

Note also that the figure window size on your screen affects the size of the exported figure. This may or may not make sense to you, but generally speaking the best way to see how it affects is to see the exported figure. So you need to repeat try and fix several times to finalize the parameters. Again, if '-dsvg' works, it is not a problem as you can change the scale and aspect ratio of each graphical objects on the Illustrator or Gimp.

If you know the above three tricks, most of your needs will be met in handling figures.


How to obtain executed code with input parameters by operating graphical user interface (GUI)

Type eegh in the Matlab command line and you'll see it.

How to plot multiple channel ERPs in one plot (07/17/2020 updated)

This is written upon the request by Mohammad Reza Khaleghi.

figure
plotElecIdx = [3 7 11]; % Select the indices of the electrodes whose ERPs you want to plot.
for elecIdx = 1:length(plotElecIdx)
    hold on
    plot(EEG.times, mean(EEG.data(plotElecIdx,:,:),3)); % The last '3' means 'Take mean across the third dimension (i.e. trials)' so do not change.
    elecLabels{elecIdx} = EEG.chanlocs(plotElecIdx(elecIdx)).labels;
end
legend(elecLabels) % This gives you annotation of which color is which electrode.

How to extract subjects and independent components from STUDY structure

Please read this page. Note that it says 'This apparently complex scheme allows to use...'

How to extract EEG power of frequency bands (06/06/2020 updated)

This is one of the most frequently asked questions in the EEGLAB mailing list. By the way, I wrote erratum to correct code on 02/06/2016--thank you Jerry Zhu for the correction!

 % This example Matlab code shows how to compute power spectrum of epoched data, channel 2.
 [spectra,freqs] = spectopo(EEG.data(2,:,:), 0, EEG.srate);
 
 % Set the following frequency bands: delta=1-4, theta=4-8, alpha=8-13, beta=13-30, gamma=30-80.
 deltaIdx = find(freqs>1 & freqs<4);
 thetaIdx = find(freqs>4 & freqs<8);
 alphaIdx = find(freqs>8 & freqs<13);
 betaIdx  = find(freqs>13 & freqs<30);
 gammaIdx = find(freqs>30 & freqs<80);
 
 % Compute absolute power.
 deltaPower = mean(10.^(spectra(deltaIdx)/10));
 thetaPower = mean(10.^(spectra(thetaIdx)/10));
 alphaPower = mean(10.^(spectra(alphaIdx)/10));
 betaPower  = mean(10.^(spectra(betaIdx)/10));
 gammaPower = mean(10.^(spectra(gammaIdx)/10));


Validation (06/06/2020 added)

For John Johnson, I wrote following code to confirm how power values are averaged.

% Calculate mean power (uV^2).
power1 = 11;
power2 = 22;
power3 = 33;
meanPower = mean([power1 power2 power3]) % 22
 
% Convert power (uV^2) to 10*log10 (dB).
power1_dB = 10*log10(power1);
power2_dB = 10*log10(power2);
power3_dB = 10*log10(power3);
 
% Average dB values then convert to uV^2.
meanPower_dB1 = mean([power1_dB power2_dB power3_dB]) % 13.0078
10^(meanPower_dB1/10)                                 % 19.98
 
% Convert dB values to uV^2 then average.
meanPower_dB2 = mean([10^(power1_dB/10) 10^(power2_dB/10) 10^(power3_dB/10)]) % 22

FAQ

For Tim Göcking, let me add some explanations.

  • Q. What is unit for this power?
    • A. The unit is uV^2/Hz.
  • Q. What does the part '/Hz' mean in this unit?
    • A. This means 'per unit frequency bandwidth' i.e., frequency resolution being 1Hz. I found this page helpful to understand it (Thank you Nir Ofir for letting me know that the page was moved.)
  • Q. What does 10.^ do to the values?
    • A. The output from spectopo() is converted to dB as 10*log10([value_in_uV^2/Hz]) in dB (the first '10*' part is to convert the unit from Bell to deci-Bell.) Because log10 is used for the dB conversion, '10.^' is used to convert it back to uV^2/Hz.
  • Q. Why dividing through 10?
    • A. This is to convert back the deci-Bell's 'deci' part.
  • Q. When do we switch bewtween 10*log10(X) and 20*log10(X)?
    • A. 10*log10(X) is used to describe power (i.e., amplitude^2), while 20*log10(X) is used to describe amplitude.
dB_power = 10*log10(power/reference_power);
         = 10*log10(amplitude^2/reference_amplitude^2)
         = 20*log10(amplitude/reference_amplitude)
Often approximated as 2x = 6dB, 10x = 20dB.

dB_amplitude = 10*log10(amplitude/reference_amplitude)
             = dB_power/2.
Often approximated as 2x = 3dB, 10x = 10dB.

Batch-process all-channel PSDs in both dB and uV^2/Hz (03/17/2020 updated)

For Sayaka Fujita, I made this section separated for her reference. Thank you John Iversen for correction!

% This example code compares PSD in dB (left) vs. uV^2/Hz (right) rendered as scalp topography (setfile must be loaded.)
lowerFreq  = 4; % Hz
higherFreq = 8; % Hz
meanPowerDb     = zeros(EEG.nbchan,1);
meanPowerMicroV = zeros(EEG.nbchan,1);
for channelIdx = 1:EEG.nbchan
    [psdOutDb(channelIdx,:), freq] = spectopo(EEG.data(channelIdx, :), 0, EEG.srate, 'plot', 'off');
    lowerFreqIdx    = find(freq==lowerFreq);
    higherFreqIdx   = find(freq==higherFreq);
    meanPowerDb(channelIdx) = mean(psdOutDb(channelIdx, lowerFreqIdx:higherFreqIdx));
    meanPowerMicroV(channelIdx) = mean(10.^((psdOutDb(channelIdx, lowerFreqIdx:higherFreqIdx))/10), 2);
end
 
figure
subplot(1,2,1)
topoplot(meanPowerDb, EEG.chanlocs)
title('Theta band (4-8Hz) power distribution')
cbarHandle = colorbar;
set(get(cbarHandle, 'title'), 'string', '(dB)')
 
subplot(1,2,2)
topoplot(meanPowerMicroV, EEG.chanlocs)
title('Theta band (4-8Hz) power distribution')
cbarHandle = colorbar;
set(get(cbarHandle, 'title'), 'string', '(uV^2/Hz)')

Comparison.png

How to build EEG structure (07/13/2018 updated)

% Create empty EEG structure.
EEG = eeg_emptyset;
 
% Define basic items of EEG structure.
EEG        = eeg_emptyset();
EEG.data   = timeSeriesDataYouHave;
EEG.times  = timeSeriesDataLatencyYouHave;
EEG.xmin   = EEG.times(1);
EEG.xmax   = EEG.times(end);
EEG.srate  = round(1/((EEG.xmax-EEG.xmin)/length(EEG.times))); % Rounded actual sampling rate. Note that the unit of the time must be in second.
EEG.nbchan = size(EEG.data,1);
EEG.pnts   = size(EEG.data,2);
 
% Define event information.
eventStructure = struct('type', [], 'latency', []);
latencyValues  = num2cell(eventLatencyInSecYouHave*EEG.srate);
[eventStructure(1:length(eventLatencyInSecYouHave)).latency] = latencyValues{:};
eventTypes     = eventTypesYouHave;
[eventStructure(1:length(eventLatencyInSecYouHave)).type] = eventTypes{:};
EEG.event = eventStructure;
EEG = eeg_checkset(EEG, 'eventconsistency');
EEG = eeg_checkset(EEG, 'makeur');


How to load EEGLAB .set and .fdt files without using EEGLAB (05/09/2020 updated)

We can load a .set file, which contains header information, by using Matlab function load() with '-mat' option. However, how can we load .fdt file that is the matrix of EEG time-series data? See below for the example of loading a dataset that is saved as a pair of XXXXX.set and XXXXX.fdt files.

% Load header info.
headerInfo = load('XXXXX.set', '-mat');
 
% Load EEG time-series matrix (taken from eeg_getdatact.m centered at line 233)
fid = fopen('XXXXX.fdt', 'r', 'ieee-le');
for trialIdx = 1:headerInfo.EEG.trials % In case the saved data are epoched, loop the process for each epoch. Thanks Ramesh Srinivasan!
    currentTrialData = fread(fid, [headerInfo.EEG.nbchan headerInfo.EEG.pnts], 'float32');
    data(:,:,trialIdx) = currentTrialData; % Data dimentions are: electrodes, time points, and trials (the last one is for epoched data)                  
end
fclose(fid);

How to extract code embedded within file names (07/13/2018 updated)

Notice how to deal with Matlab cell format using cellfun() in two different ways.

allSubj      = dir('*.set'); % Obtain all the .set files under the current folder.
allSubjNames = {allSubj.name}'; % Obtain the file names. Here, suppose the naming rule is as follows:
 
     allSubjNames =
 
     'subj01_group1'
     'subj02_group2'
     'subj03_group2'
     'subj04_group1'
     'subj05_group2'
     'subj06_group1'
 
group1List = strfind(allSubjNames, '_group1'); % Choose those subjects who has '_group1' in the file name. Note that the output is still in the cell format.
 
     groupList =
 
         [8]
         []
         []
         [8]
         []
         [8]
 
group1Idx = find(~cellfun('isempty', group1List)); % Obtain the index number of the '_group1' subjects.
 
     group1Idx = 
 
         1
         4
         6
 
group1Subjects = allSubjNames(group1Idx); % Obtain the full file names of '_group1' subjects.
 
     group1Subjects = 
 
     'subj01_group1'
     'subj04_group1'
     'subj06_group1'
 
group1SubjectId = cellfun(@(x) x(1:6), group1Subjects, 'Uniformoutput', false); % Extract 'subjxx' part from the file names (i.e., delete '_group1' from the file names).
 
     group1SubjectId = 
 
     'subj01'
     'subj04'
     'subj06'

Event types

How to obtain unique event types

 % Obtain all unique event types
 uniqueEventTypes = unique({EEG.event.type}');

How to obtain event indices whose event type exactly matches a keyword

 % Obtain event indices whose 'type' is 'Probe'
  allEventTypes = {EEG.event.type}';
  keywordIdx = find(strcmp(allEventTypes, 'Probe'));

How to obtain event indices whose event types contain a part of the keyword (12/13/2016 updated)

During SCCN tea time today, Hiroyuki Kambara told me this solution using cellfun(). How smart!

 % Obtain event indices whose 'type' contains letter strings 'Probe' (e.g. will obtain all indices for 'Probe1', 'Probe3', and 'Probe5'
 allEventTypes = {EEG.event.type}';
 strFindResult = strfind(allEventTypes, 'Probe');
 probeIdx      = find(cellfun(@isempty, strFindResult));

How to change event type names

 % Rename event type name 'example1' into 'example2'
 allEvents = {EEG.event.type}';
 example1Idx = strcmp(allEvents, 'example1');
 [EEG.event(example1Idx).type] = deal('example2');

How to change numerical event type names to strings (05/17/2020 added)

 allEventTypes = [EEG.event.type]'; % The output is numeric with same digits.
 allEventTypeString = cellstr(num2str(allEventTypes)); % Converts numerics into strings in cell.
 [EEG.event.type] = allEventTypeString{:}; % I can't explain how this syntax works!

Event latency

How to change event latency

This is weird but works (and the only way that works probably.)

 % Change event latency into 1:length(EEG.event)
 dummyLatency = num2cell(1:length(EEG.event));
 [EEG.event(1:length(EEG.event)).latency] = dummyLatency{:};
 EEG = eeg_checkset(EEG,'eventconsistency');

How to adjust event latency by adding or subtracting values

 % Add 400ms to event 'example'
 allEvent = {EEG.event.type}';
 exampleTypeIdx  = find(strcmp(allEvent, 'exampleType'));
 exampleTypeLatencyInFrame  = [EEG.event(exampleTypeIdx).latency]';
 changedLatency = num2cell(bsxfun(@plus, exampleTypeLatencyInFrame, 400), 2);
 [EEG.event(exampleTypeIdx).latency] = changedLatency{:};
 EEG = eeg_checkset(EEG,'eventconsistency');

How to delete non-time-locking events from epoched data (08/16/2020 added)

Motivation: Imagine you have mismatch negativity data sets in which a stimulus was presented every 0.5 s. In order to perform our standard time-frequency analysis, which uses 1116 ms Morlet wavelet kernel with the lowest 3 Hz with 3 cycles (but this does NOT mean EEGLAB performs a pure wavelet analysis--see this page), we need much longer epoch than one trial. If we apply our standard recommendation of -1 to 2 s epoching, it will inflate the data to 6 times larger (3.0 s contains 6 0.5 s). This not only inflate data size, but also inflate event structure. This redundant event structure causes many problems, although all we need in this situation is just those events whose within-epoch latency is 0. The following code addresses this issue so that EEG.event and EEG.epoch have only one event that has latency zero.

% Delete non-time-locking events from epoched data (08/16/2020 updated)
for epochIdx = 1:length(EEG.epoch)
    allEventIdx    = 1:length(EEG.epoch(epochIdx).event);
    if length(allEventIdx) == 1 % If there is only 1 event, it must have latency zero.
        continue
    else
        zeroLatencyIdx = find(cell2mat(EEG.epoch(epochIdx).eventlatency) == 0);
        EEG.epoch(epochIdx).event         = EEG.epoch(epochIdx).event(zeroLatencyIdx);
        EEG.epoch(epochIdx).eventtype     = EEG.epoch(epochIdx).eventtype(zeroLatencyIdx);
        EEG.epoch(epochIdx).eventlatency  = {0};
        EEG.epoch(epochIdx).eventurevent  = EEG.epoch(epochIdx).eventurevent(zeroLatencyIdx);
        EEG.epoch(epochIdx).eventduration = 0;
    end
end
validEventIdx  = [EEG.epoch.event];
deleteEventIdx = setdiff(1:length(EEG.event), validEventIdx);
EEG.event(deleteEventIdx) = [];
for epochIdx = 1:length(EEG.epoch)
    EEG.epoch(epochIdx).event = epochIdx;
end

See what I it does below: Before (left) and after (right) the process. Notice the non-time-locking event markers (i.e., whose within-epoch latency is non-zero) are cleaned up on the right.

CleanEvents.png

How to extract all-IC/channel ERPs for each condition (07/24/2018 added)

If you want to process a lot of heavy, epoched datasets (such as mismatch negativity data with 0.5 s SOA with 5,000 trials but epoched to -1 to 2 sec allowing HUGE overlap... this is the situation where EEGLAB's 'epoch' scheme becomes inefficient), a bare-bone script like the one I show below works more efficient than EEGLAB functions. Taken from my std_selectICsByCluster() plugin.

 % Extract event indices and names that have latency zero in an epoch.
 allEpochEventLatency = cell2mat({EEG.epoch.eventlatency}');
 latencyZeroEventIdx  = find(allEpochEventLatency == 0);
 latencyZeroEventType = {EEG.event(latencyZeroEventIdx).type}';
 
 % Convert latencyZeroEventType to string in case they are numaric. 
 allEventTypes = cellfun(@(x) num2str(x), latencyZeroEventType, 'uniformoutput', false);
 
 % Extract and store mean EEG.icaact across trials for each condition.
 uniqueConditions = unique(latencyZeroEventType);
 allConditionIcaactErp = zeros(size(EEG.icaact,1), size(EEG.icaact,2), length(uniqueConditions));
 for conditionIdx = 1:length(uniqueConditions)
     currentConditionLabel  = num2str(uniqueConditions{conditionIdx});
     currentConditionIdx    = find(strcmp(allEventTypes, currentConditionLabel));
     currentConditionIcaact = mean(EEG.icaact(:,:,currentConditionIdx),3);
     allConditionIcaactErp(:,:,conditionIdx) = currentConditionIcaact;
 end

If necessary, project them back to the channels.

 % Backproject Condition 1-3 to channels if necessary.
 scalpChannelErpCondition1 = EEG.icawinv*allConditionIcaactErp(:,:,1);
 scalpChannelErpCondition2 = EEG.icawinv*allConditionIcaactErp(:,:,2);
 scalpChannelErpCondition3 = EEG.icawinv*allConditionIcaactErp(:,:,3);

How to obtain theoretical dipole projection when using pop_multifit() (10/13/2020 updated)

% Comment out pop_multifit() line 189.
% This is because dipfit_reject deletes items other than pos, mom, and rv.
% This function should not be used here anyway.
%EEG.dipfit.model  = dipfit_reject(EEG.dipfit.model, g.threshold/100);


% Add the following line in dipfit_nonlinear() line 128.
EEG.dipfit.model(cfg.component).theoreticalProj = source.dip.pot;


% Then try this code.
icIdx = 23;
d1 = EEG.dipfit.model(icIdx).datapot;
d2 = EEG.dipfit.model(icIdx).theoreticalProj;
rv = sum((d1-d2).^2) ./ sum(d1.^2);
EEG.dipfit.model(icIdx).rv; % Should be identical as rv obtained above.
 
figure
subplot(1,2,1)
topoplot(d1, EEG.chanlocs)
subplot(1,2,2)
topoplot(d2, EEG.chanlocs)

EEG data cleaning

How to perform epoch rejection using single-trial PSD (02/20/2020 updated)

For my own project, I wrote the following solution for bad trial rejection for pre-selected ICs. This uses rectangular window short-term Fourier Transform (STFT) to convert 1 epoch into 1 PSD time point. Then, it takes sum of power for the user-specified frequency band. Then they are converted into z-score, and finally threshold it with SD. The example code rejects epochs that have > 3SD power between 15-50 Hz.

% Example:
%           badTrialIdx = rejectTrialSTFT(EEG.icaact, 3, EEG.pnts, EEG.srate, [15 50]); % Use 15-50 Hz freq band to evaluate PSD.
%           EEG = pop_select(EEG, 'notrial', badTrialIdx);
%
% 02/04/2020 Makoto. Modified.
% 01/31/2020 Makoto. Used with modification.
% 11/08/2019 Makoto. Created. Changed from Hamming to Rectangular window (Thanks Masaki!)
 
function badTrialIdx = rejectTrialSTFT(icaact, SD, pnts, srate, freqEdges)
 
data2D = icaact(:,:);
 
totalExclusionIdx = [];
for icIdx = 1:size(data2D,1);
    %[S,F,T,P,Fc,Tc] = spectrogram(data2D(icIdx,:), pnts, 0, [], srate);         % This is by default Hamming.
    [S,F,T,P,Fc,Tc] = spectrogram(data2D(icIdx,:), rectwin(pnts), 0, [], srate); % This is rectangular window.
 
    log10P = 10*log10(P);
 
%     figure; imagesc(T, F, log10P); axis xy; colormap('jet')
 
    medianP = median(log10P,2);
 
%     figure; plot(F, medianP)
 
    errorP = bsxfun(@minus, log10P, medianP);
 
%     figure; imagesc(T, F, errorP, [-20 20]); axis xy; colormap('jet')
 
    zscoreHighFreqPower = zscore(sum(errorP(freqEdges(1):freqEdges(2),:))); % Targetting EMG.
 
 
%     figure; hist(zscoreHighFreqPower, 100)
 
    exclusionIdx = find(zscoreHighFreqPower>=SD);
 
    %{
 
    figure
    subplot(2,1,1)
    imagesc(T, F, errorP, [-20 20]); axis xy; colormap('jet')
    title('Error from median PSD')
    xlabel('Frames (s)')
    ylabel('10*log10(Power)/Hz')
 
    subplot(2,1,2)
    barHandle = bar(zscoreHighFreqPower, 1);
    xlim([0.5 length(zscoreHighFreqPower)-0.5])
    hold on
    line([0.5 length(zscoreHighFreqPower)-0.5], [SD SD], 'color', [1 0 0])
    title(sprintf('Z-scored error between %.0f and %.0f Hz.', freqEdges(1), freqEdges(2)))
    xlabel('Frames (s)')
    ylabel('Z-score')
 
    %}
 
    totalExclusionIdx = [totalExclusionIdx exclusionIdx];
end
 
badTrialIdx = unique(totalExclusionIdx);

Example use of code.

badTrialIdx = rejectTrialSTFT(EEG.icaact, 3, EEG.pnts, EEG.srate, [15 50]);
EEG = pop_select(EEG, 'notrial', badTrialIdx);

How to perform component selection using ICLabel() and dipole information (02/20/2020 updated)

 
    % Perform IC rejection using ICLabel scores and r.v. from dipole fitting.
    EEG = iclabel(EEG, 'default'); % Original output from eegh has a bug: the second input ('default') is without ''--this is fixed here.
    brainIdx  = find(EEG.etc.ic_classification.ICLabel.classifications(:,1) >= 0.7); % > 70% brain == good ICs.
 
    % Perform IC rejection using residual variance of the IC scalp maps.
    rvList    = [EEG.dipfit.model.rv];
    goodRvIdx = find(rvList < 0.15)'; % < 15% residual variance == good ICs.
 
    % Perform IC rejection using inside brain criterion.
    load(EEG.dipfit.hdmfile); % This returns 'vol'.
    dipoleXyz = zeros(length(EEG.dipfit.model),3);
    for icIdx = 1:length(EEG.dipfit.model)
        dipoleXyz(icIdx,:) = EEG.dipfit.model(icIdx).posxyz(1,:);
    end
    depth = ft_sourcedepth(dipoleXyz, vol);
    depthThreshold = 1;
    insideBrainIdx = find(depth<=depthThreshold);
 
    % Take AND across the three criteria.
    goodIcIdx = intersect(brainIdx, goodRvIdx);
    goodIcIdx = intersect(goodIcIdx, insideBrainIdx);
 
    % Perform IC rejection.
    EEG = pop_subcomp(EEG, goodIcIdx, 0, 1);
 
    % Post-process to update ICLabel data structure.
    EEG.etc.ic_classification.ICLabel.classifications = EEG.etc.ic_classification.ICLabel.classifications(goodIcIdx,:);
 
    % Post-process to update EEG.icaact.
    EEG.icaact = [];
    EEG = eeg_checkset(EEG, 'ica');

How to speed up ICA (10/22/2020 updated)

ICA has a known bias toward high amplitude. If the data length were infinite, it would not have this bias--my former colleague told me so. Due to 1/f property of scalp-recorded EEG signal, which is predicted from the cable equation (see 'Electric Field of the Brain' p.174- by Nunez and Srinivasan, 2006), signal power in the higher frequency range is very small. This simply means that ICA has less things to learn from high frequency (by the way, ASR applies inverse EEG-PSD filter so that signals in those high frequency ranges, which are unlikely to be dominated by natural EEG, is enhanced so that it is detected for correction). If that's the case, why not cutting the high frequency from the beginning, at least for ICA purpose--that's the rational for this process. If you doubt it, you can anytime verify it comparing ICA results obtained from using 1000-Hz sampled data with that from using 100-Hz downsampled data. It is even possible that due to the band-limiting effect (100-Hz downsampled data is 50-Hz low-pass filtered because of anti-aliasing), ICA result could be even better. The favorable effect of low-pass filtering before ICA has been told in the past EEGLAB workshops by Scott himself, I believe ('Applying 100-Hz low-pass filter before ICA is also important...') At least, it will speed up ICA.

% Run ICA.
EEG_forICA = pop_resample(EEG, 100);
EEG_forICA = pop_runica(EEG_forICA, 'extended',1,'interupt','off');
EEG.icaweights = EEG_forICA.icaweights;
EEG.icasphere  = EEG_forICA.icasphere;
EEG = eeg_checkset(EEG, 'ica');

How does it affect the recommended data length? (10/22/2020 added)

This is for Shanah Wynn--If you downsample the data from 250Hz to 100Hz, you have only 40% of data length. Does it negatively impact ICA's performance, since there is much less data available? This is something a simulation study can answer empirically, but the empirical answer is no. In fact, I don't recommend to use SCCN's rule of thumb literally--I would rather add an additional condition as follows: ICA requires data length of ((ch)^2)*30 if data were sampled at 250 Hz. I have been applying this downsampling-to-100Hz approach as ICA preprocessing over several thousands of datasets, in some of which the data length became shorter than the above recommendation but I did not see any impact. So, the length in time in the real world counts rather than frames. For example, 5-min data can be 30,000 frames when sampled at 100 Hz, and 300,000 frames when sampled at 1000Hz. Does the latter help ICA decomposition quality? It does not. It depends on how much information scalp-recorded EEG contains above 100Hz compared with below 100Hz.


This view is supported by the following description taken from Nunez and Srinivasan (2006) p.308: "EEG (and ECoG) power in mammals tends to fall off sharply above 50 or 100 Hz, apparently reflecting cortical cell time constants in the 10 ms range as indicated in chapter 4. That is, we expect minimal changes in the mesosource function P(r,t) over mesoscopic "relaxation times" in the range of 10 ms or perhaps several tens of milliseconds."

How to export data to an Excel file (08/09/2020 added, prayer for Nagasaki)

This is an example to write ERP (i.e., data needs to be epoched) with channel labels and latency for Zahra Fotovatnia and Zoran Josipovic.

% Obtain electrode labels.
chLabels = {EEG.chanlocs.labels}';
 
% If the electrode labels are just numbers, you have to add non-numeric
% characters before that, otherwise writetable() fails. Very Dumb.
for chIdx = 1:length(chLabels)
    chLabels{chIdx} = ['Ch' chLabels{chIdx}];
end
 
% Obtain ERP latency.
latency  = cellstr(num2str(EEG.times'));
 
% Obtain all-electrode ERP.
ERP = mean(EEG.data,3)';
 
% Prepare output table.
outputTable = array2table(ERP, 'VariableNames', chLabels, 'RowNames', latency);
 
% Write the output table. Using '.xlsx' automatically specifies Excel
% format so no need to use 'FileType', 'spreadsheet' option.
writetable(outputTable, 'savePath/test.xlsx',...
           'WriteVariableNames', true, 'WriteRowNames', true);

How to send email from Matlab to report progress (07/26/2019 added)

% Set up Matlab preferences.
setpref('Internet','SMTP_Server','XXXX.XXXX.XXX');
setpref('Internet','E_mail','XXXXXXXXXXX@XXXX.XXX');
 
allSets = dir('/data/projects/example/p0100_imported/*.set');
ticTocList = nan(length(allSets),1);
for setIdx = 1:length(allSets)
 
    ticStart = tic;
 
 
    %%%%%%%%%%%%%%%%%%%%%%%%%%
    %%% Your process here. %%%
    %%%%%%%%%%%%%%%%%%%%%%%%%%
 
 
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    %%% Calculate time lapse. %%%
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    ticTocList(setIdx) = toc(ticStart);
 
    if mod(setIdx, 100) == 0
 
        meanProcessingDurationInSec = nanmean(ticTocList);
        stdProcessingDurationInSec  = nanstd(ticTocList);
        meanRemainingProcessingDurationInDay = (length(allSets)-setIdx)*meanProcessingDurationInSec/(60*60*24);
        stdRemainingProcessingDurationInDay  = (length(allSets)-setIdx)*stdProcessingDurationInSec/(60*60*24);
 
        %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
        %%% Send email for update. %%%
        %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
        sendmail({'XXXXXXXX@XXXX.XXX', 'XXXXXXXXX@XXXX.XXX'},...
            sprintf('From Matlab running batch: %.0f/%.0f done', setIdx, length(allSets)),...
            sprintf('Estimated remaining time: %.1f days (SD %.1f)', meanRemainingProcessingDurationInDay, stdRemainingProcessingDurationInDay));
    end
end


How to perform time-frequency analysis with variable epoch length (10/13/2020 added)

This solution is for Steve Wu, and also could have been for Kelly Michaelis, although I took a different approach in 2019--but seems apparent that this solution was more straightforward than hacking the STUDY structure. I feel I tried to be too loyal to EEGLAB and made a wrong decision at that time (whichever works anyway... it's a matter of efficiency and integrity.)


Here in this example, ERSP for 'latencyZeroEvent', which are 'DIN6' and 'DIN60', is calculated for which baseline is calculated for 'baselineOnsetEvent'. The whole point is that you can calculate ERSP with flexibly specified baseline period even if there are duration-varying events between the two conditions. Use another solution 'How to obtain the optimum number of clusters' together and you can perform your own IC clustering with ERSP.

% Define events of interest/non-interest.
latencyZeroEvent    = {'DIN6', 'DIN60'};
epochWinEdges       = [-2 1.5];
baselineOnsetEvent  = 'READY'; % Baseline window is 2-s.
baselineWinEdges    = [0 1.5];
epochExclusionEvent = 'boundary';
windowStepSize      = 0.02; % s
windowLengthInSteps = (epochWinEdges(2)-epochWinEdges(1))/windowStepSize;
windowLengthInFrame = 557; % For 1.5-Hz at 3 cycles. 
                           % This must be obtained from the command line log
                           % shown after running a newtimefreq() with above
                           % parameters.
 
allSets = dir('[loadingDataFullPath]/*.set');
for setIdx = 1:length(allSets)
 
    % Load file.
    loadName = allSets(setIdx).name;
    EEG = pop_loadset('filename', loadName, 'filepath', '[loadingDataFullPath]/');
 
    % Loop for ICs.
    icErspStack = zeros(100, windowLengthInSteps, size(EEG.icaact,1));
    for icIdx = 1:size(EEG.icaact,1)
 
        %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
        %%% Compute EEGLAB's pseudo-wavelet (i.e. STFT-hybrid). %%%
        %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
        timesoutParam = round((EEG.xmax-windowLengthInFrame*1/EEG.srate)/windowStepSize); % Temporal resolution is set to 20 ms.
 
        % Perform time-frequency decomposition. Do not calculate baseline at this point.
        [~, ~, ~, waveletTimesS, freqs, ~, ~, coeffs] = ...
            newtimef(EEG.icaact(icIdx,:), EEG.pnts, [EEG.xmin EEG.xmax], EEG.srate, [3 10], ...
            'baseline', NaN,'nfreqs', 100, 'freqs', [1.5 55], 'freqscale', 'log', ...
            'timesout', timesoutParam, 'plotersp', 'off', 'plotitc', 'off');
 
        % Find critical events.
        allEvents = {EEG.event.type}';
        latencyZeroIdx     = find(strcmp(allEvents, latencyZeroEvent{1}) | strcmp(allEvents, latencyZeroEvent{2}));
        baselineOnsetIdx   = find(strcmp(allEvents, baselineOnsetEvent));
        epochExclusionIdx1 = find(strcmp(allEvents, epochExclusionEvent));
        latencyZeroLatencyS    = [EEG.event(latencyZeroIdx).latency]*1/EEG.srate; % s.
        baselineOnsetLatencyS  = [EEG.event(baselineOnsetIdx).latency]*1/EEG.srate; % s.
        epochExclusionLatencyS = [EEG.event(epochExclusionIdx1).latency]*1/EEG.srate; % s.
 
        %%%%%%%%%%%%%%%%%%%%%%%%%%
        %%% Obtain the epochs. %%%
        %%%%%%%%%%%%%%%%%%%%%%%%%%
        % Obtain raw epoch edges.
        epochEdges = [[latencyZeroLatencyS + epochWinEdges(1)]' [latencyZeroLatencyS + epochWinEdges(2)]'];
 
        % Exclude epochs that contains epochExclusionLatencyS.
        epochExclusionIdx2 = [];
        for exclusionIdx = 1:length(epochExclusionLatencyS)
            currentExclusionEventS = epochExclusionLatencyS(exclusionIdx);
            for epochIdx = 1:size(epochEdges,1)
                if epochEdges(epochIdx,1)<=currentExclusionEventS & epochEdges(epochIdx,2)>=currentExclusionEventS
                    epochExclusionIdx2 = sort([epochExclusionIdx2; epochIdx]);
                end
            end
        end
        epochEdgesRej = epochEdges;
        epochEdgesRej(epochExclusionIdx2,:) = [];
 
        coeffEpochTensor = zeros(length(freqs), windowLengthInSteps, size(epochEdgesRej,1));
        for epochIdx = 1:size(epochEdgesRej,1)
            epochFrameIdx = find(waveletTimesS > epochEdgesRej(epochIdx,1) & waveletTimesS < epochEdgesRej(epochIdx,2));
            if     length(epochFrameIdx) < windowLengthInSteps
                epochFrameIdx(end+1) = epochFrameIdx(end)+1;
            elseif length(epochFrameIdx) > windowLengthInSteps
                epochFrameIdx(end+1) = epochFrameIdx(end)-1;
            end
            coeffEpochTensor(:,:,epochIdx) = coeffs(:,epochFrameIdx);
        end
 
        %%%%%%%%%%%%%%%%%%%%%%%%%%%%
        %%% Obtain the baseline. %%%
        %%%%%%%%%%%%%%%%%%%%%%%%%%%%
        % Obtain raw epoch edges.
        baselineEdges = [[baselineOnsetLatencyS + baselineWinEdges(1)]' [baselineOnsetLatencyS + baselineWinEdges(2)]'];
 
        % Exclude epochs that contains epochExclusionLatencyS.
        baselineExclusionIdx = [];
        for exclusionIdx = 1:length(epochExclusionLatencyS)
            currentExclusionEventS = epochExclusionLatencyS(exclusionIdx);
            for baselineIdx = 1:size(baselineEdges,1)
                if baselineEdges(baselineIdx,1)<=currentExclusionEventS & baselineEdges(baselineIdx,2)>=currentExclusionEventS
                    baselineExclusionIdx = sort([baselineExclusionIdx; baselineIdx]);
                end
            end
        end
        baselineEdgesRej = baselineEdges;
        baselineEdgesRej(baselineExclusionIdx,:) = [];
 
        % Extract epochs. Length of 75 was determined from data.
        coeffBaselineTensor = zeros(length(freqs), 75, size(baselineEdgesRej,1));
        for baselineIdx = 1:size(baselineEdgesRej,1)
            baselineFrameIdx = find(waveletTimesS > baselineEdgesRej(baselineIdx,1) & waveletTimesS < baselineEdgesRej(baselineIdx,2));
            if     length(baselineFrameIdx) < 75
                baselineFrameIdx(end+1) = baselineFrameIdx(end)+1;
            elseif length(baselineFrameIdx) > 75
                baselineFrameIdx(end+1) = baselineFrameIdx(end)-1;
            end
            coeffBaselineTensor(:,:,baselineIdx) = coeffs(:,baselineFrameIdx);
        end
 
        %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
        %%% Perform baseline correction. %%%
        %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
        epochErspRaw        = coeffEpochTensor.*conj(coeffEpochTensor);
        epochErspNoBaseline = 10*log10(mean(epochErspRaw,3));
        baselineErspRaw = coeffBaselineTensor.*conj(coeffBaselineTensor);
        baselinePower   = mean(10*log10(mean(baselineErspRaw,3)),2);
        epochErsp       = bsxfun(@minus, epochErspNoBaseline, baselinePower);
 
        % Store the result.
        icErspStack(:,:,icIdx) = epochErsp;
    end
 
    % Store the result.
    EEG.etc.customErsp.icErspStack         = icErspStack;
    EEG.etc.customErsp.windowStepSize      = windowStepSize;
    EEG.etc.customErsp.freqs               = freqs;
    EEG.etc.customErsp.latencyZeroEvent    = latencyZeroEvent;
    EEG.etc.customErsp.epochWinEdges       = epochWinEdges;
    EEG.etc.customErsp.baselineOnsetEvent  = baselineOnsetEvent;
    EEG.etc.customErsp.baselineWinEdges    = baselineWinEdges;
    EEG.etc.customErsp.epochExclusionEvent = epochExclusionEvent;    
 
    % Save the data.
    pop_saveset(EEG, 'filename', loadName, 'filepath', '[savingDataFullPath]')
end

STUDY (Before EEGLAB14)

How to obtain the optimum number of clusters (01/14/2020 updated)

This solution consists 1) Obtain xyz coordinates of estimated equivalent dipoles from all .set files loaded and stored in ALLEEG streucture, 2) Run Matlab evalclusters() to perform optimization between the range of 10-30 using three algorithms (Calinski-Harabasz, Silhouette, Davies-Bouldin). Note that I am only using dipole location for the clustering criterion (if you wonder why, see this page.) Accordingly, my code below is also separated into two sections. Last time I got number 22, then 13-16. In this case, I took 13. I prefer to select the smallest number available across everything so that I could get maximum number of unique subjects per cluster.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%% Obtain all dipole xyz coordinates as a list. %%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
dipXyz = [];
for subjIdx = 1:length(ALLEEG)
 
    % Obtain xyz, dip moment, maxProj channel xyz.   
    xyz = zeros(length(ALLEEG(subjIdx).dipfit.model),3);
    for modelIdx = 1:length(ALLEEG(subjIdx).dipfit.model)
 
        % Choose the larger dipole if symmetrical.
        currentXyz = ALLEEG(subjIdx).dipfit.model(modelIdx).posxyz;
        currentMom = ALLEEG(subjIdx).dipfit.model(modelIdx).momxyz; % nAmm.
        if size(currentMom,1) == 2
            [~,largerOneIdx] = max([norm(currentMom(1,:)) norm(currentMom(2,:))]);
            currentXyz = ALLEEG(subjIdx).dipfit.model(modelIdx).posxyz(largerOneIdx,:);
        end
        xyz(modelIdx,:) = currentXyz;
    end
    dipXyz = [dipXyz; xyz];
end
 
 
 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%% Optimize the number of clusters between the range 10-30. %%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Use Matlab function evalclusters().
kmeansClusters = [];
for clustIdx = 10:30
    kmeansClusters(:,clustIdx) = kmeans(dipXyz, clustIdx, 'emptyaction', 'singleton', 'maxiter', 10000, 'replicate', 100);
end
 
kmeansClusters = kmeansClusters(:,10:30);
eva1 = evalclusters(dipXyz, kmeansClusters, 'CalinskiHarabasz');
eva2 = evalclusters(dipXyz, kmeansClusters, 'Silhouette');
%eva3 = evalclusters(dipXyz, kmeansClusters, 'gap', ); % Slow and not consistent value.
eva4 = evalclusters(dipXyz, kmeansClusters, 'DaviesBouldin');
 
figure
subplot(1,3,1)
plot(eva1); title('CalinskiHarabasz');
subplot(1,3,2)
plot(eva2); title('Silhouette');
subplot(1,3,3)
plot(eva4); title('DaviesBouldin');

How to transfer precomputed STUDY (10/11/2018 updated)

I found that just copying all the files from one folder to another does not copy the precomputed results. After investigation, I realized that this is because when STUDY is pre-computed (i.e., ERP, PSD, ERSP, etc), the generated files are stored in the HDD and linked with STUDY.design.filepath and STUDY.design.cell.filebase. If these are the all the things that needs to be updated, we can do it using the following code. After loading the copied STUDY, run the code, and save the current STUDY with a different name (just in case). Load the saved STUDY and everything should be there. At least it worked for me.

for designIdx = 1:length(STUDY.design)
 
    % Obtain the STUDY.design filepath with which the original data were precomputed.
    currentDesignFilepath = STUDY.design(designIdx).filepath;
 
    % Obtain the STUDY.design filepath which is current.
    newDesignFilepath     = '/data/mobi/Christiane/p01_study';
 
    % Replace the original path with the current path.
    STUDY.design(designIdx).filepath = newDesignFilepath;
 
 
    for cellIdx = 1:length(STUDY.design(designIdx).cell)
 
        % Obtain the STUDY.design.cell filepath with which the original data were precomputed.
        currentCellFilepath = STUDY.design(designIdx).cell(cellIdx).filebase;
 
        if strncmp(currentCellFilepath, currentDesignFilepath, length(currentDesignFilepath));
 
            % Obtain the STUDY.design filepath which is current.
            oldFilepathRemoved            = currentCellFilepath(length(currentDesignFilepath)+1:end);
            newDesignFilepathConcatenated = [newDesignFilepath oldFilepathRemoved];
 
            % Replace the original path with the current path.
            STUDY.design(designIdx).cell(cellIdx).filebase = newDesignFilepathConcatenated;
        else
            error('Oops, the original filepath did not match.')
        end
    end
end

How to load .set files to create STUDY (02/27/2020 updated)

Typically, you need to enter subject names. Sometimes, group names as well. Rarely in may case, condition names. Entering these info one by one on GUI is horrible, particularly if you have many data sets. Here is code to do it automatically. It is not only efficient but it also eliminates human errors YOU will make. After loading all the .set files, use EEGLAB GUI 'Using all loaded dataset' to create a STUDY. This code is also a good example for the basic use of 'for loop' for batch processing.

% Obtain all .set file under /data/makoto/exampleProject/.
% In this example, suppose all set files have names like subj123_group2.set
 
targetFolder = '/data/makoto/exampleProject';
allSetFiles = dir([targetFolder filesep '*.set']); % filesep inserts / or \ depending on your OS.
 
% Start the loop.
for setIdx = 1:length(allSetFiles)
 
    % Obtain the file names for loading.
    loadName = allSetFiles(setIdx).name; % subj123_group2.set
    dataName = loadName(1:end-4);        % subj123_group2
 
    % Load data. Note that 'loadmode', 'info' is to avoid loading .fdt file to save time and RAM.
    EEG = pop_loadset('filename', loadName, 'filepath', targetFolder, 'loadmode', 'info');
 
    % Enter EEG.subjuct.
    EEG.subject = dataName(1:7); % subj123
 
    % Enter EEG.group.
    EEG.group = dataName(9:14); % group2
 
    % Store the current EEG to ALLEEG.
    [ALLEEG, EEG, CURRENTSET] = eeg_store(ALLEEG, EEG, 0);
end
eeglab redraw % This is to update EEGLAB GUI so that you can build STUDY from GUI menu.


How to obtain more robust clustering result (10/24/2020 added)

This is for Juan Molina--The EEGLAB's default parameter setting for kmeans() may allow noticable variability in performing IC clustering in multiple times (except for the order of the clusters--it varies inevitably, but I'm talking about robustness of the IC-clustering pattern). We can actually do better than the default parameters. In pop_clust() line 184, there is a line of code that performs kmeans which I suggest to change as follows.

%[IDX,C,sumd,D] = kmeans(clustdata,clus_num,'replicates',10,'emptyaction','drop');
[IDX,C,sumd,D] = kmeans(clustdata,clus_num,'replicates',100,'emptyaction','drop','MaxIter',10000);

The above change only costs one-second longer processing time. If you want, you may increase the number for 'replicates' more generously.

Example of batch code to preprocess multiple subjects (08/08/2020 updated)

See this page for detail of the processes. Note that this is NOT an official SCCN opinion. Data are not epoched here.

%% Example of batch code to preprocess multiple subjects
 
% Step 1: Change the option to use double precision.
 
cd('/data/project/example')
rawDataFiles = dir('*.bdf');
for subjID = 1:length(rawData) 
    loadName = rawDataFiles(subjID).name;
    dataName = loadName(1:end-4);
 
    % Step2: Import data.
    EEG = pop_biosig(loadName);
    EEG.setname = dataName;
 
    % Step 3: Downsample the data.
    EEG = pop_resample(EEG, 250);
    %EEG = pop_resample(EEG, 250, 0.8, 0.4); % For SIFT to ease antialiasing filter slope.
 
    % Step 4: High-pass filter the data at 1-Hz. Note that EEGLAB uses pass-band edge, therefore 1/2 = 0.5 Hz.
    EEG = pop_eegfiltnew(EEG, 1, 0, 1650, 0, [], 0);
 
    % Step 5: Import channel info
    EEG = pop_chanedit(EEG, 'lookup','[EEGLABroot]/eeglab/plugins/dipfit2.3/standard_BEM/elec/standard_1005.elc','eval','chans = pop_chancenter( chans, [],[]);');
 
    % Step 6: Remove line noise using CleanLine
    %EEG = pop_cleanline(EEG, 'bandwidth', 2,'chanlist', [1:EEG.nbchan], 'computepower', 0, 'linefreqs', [50 100 150 200 250],...
    %    'normSpectrum', 0, 'p', 0.01, 'pad', 2, 'plotfigures', 0, 'scanforlines', 1, 'sigtype', 'Channels', 'tau', 100,...
    %    'verb', 1, 'winsize', 4, 'winstep', 4);
 
    % Step 6: Run cleanLineNoise (requires PREP Pipeline toolbox) The original CleanLine was replaced (04/25/2020).
    signal      = struct('data', EEG.data, 'srate', EEG.srate);
    lineNoiseIn = struct('lineNoiseMethod', 'clean', ...
                         'lineNoiseChannels', 1:EEG.nbchan,...
                         'Fs', EEG.srate, ...
                         'lineFrequencies', [60 120 180 240],...
                         'p', 0.01, ...
                         'fScanBandWidth', 2, ...
                         'taperBandWidth', 2, ...
                         'taperWindowSize', 4, ...
                         'taperWindowStep', 1, ...
                         'tau', 100, ...
                         'pad', 2, ...
                         'fPassBand', [0 EEG.srate/2], ...
                         'maximumIterations', 10);
    [clnOutput, lineNoiseOut] = cleanLineNoise(signal, lineNoiseIn);
    EEG.data = clnOutput.data;
 
    % Step 7: Apply clean_rawdata() to reject bad channels and correct continuous data using Artifact Subspace Reconstruction (ASR). 
    % Note 'availableRAM_GB' is for clean_rawdata1.10. For any newer version, it will cause error.
    originalEEG = EEG;
    EEG = clean_rawdata(EEG, 5, -1, 0.85, 4, 20, 0.25, 'availableRAM_GB', 8);
 
    % Step 8: Interpolate all the removed channels
    EEG = pop_interp(EEG, originalEEG.chanlocs, 'spherical');
 
    % Step 9: Re-reference the data to average
    EEG.nbchan = EEG.nbchan+1;
    EEG.data(end+1,:) = zeros(1, EEG.pnts);
    EEG.chanlocs(1,EEG.nbchan).labels = 'initialReference';
    EEG = pop_reref(EEG, []);
    EEG = pop_select( EEG,'nochannel',{'initialReference'});
 
    % Step 10: Run AMICA using calculated data rank with 'pcakeep' option
    if isfield(EEG.etc, 'clean_channel_mask')
        dataRank = min([rank(double(EEG.data')) sum(EEG.etc.clean_channel_mask)]);
    else
        dataRank = rank(double(EEG.data'));
    end
    runamica15(EEG.data, 'num_chans', EEG.nbchan,...
        'outdir', ['/data/projects/example/amicaResults/' dataName],...
        'pcakeep', dataRank, 'num_models', 1,...
        'do_reject', 1, 'numrej', 15, 'rejsig', 3, 'rejint', 1);
    EEG.etc.amica  = loadmodout15(['/data/projects/example/amicaResults/' dataName]);
    EEG.etc.amica.S = EEG.etc.amica.S(1:EEG.etc.amica.num_pcs, :); % Weirdly, I saw size(S,1) be larger than rank. This process does not hurt anyway.
    EEG.icaweights = EEG.etc.amica.W;
    EEG.icasphere  = EEG.etc.amica.S;
    EEG = eeg_checkset(EEG, 'ica');
 
    % Step 11: Estimate single equivalent current dipoles
    % Note: if ft_datatype_raw() complains about channel numbers, comment out (i.e. put % letter in the line top) line 88 as follows
    % assert(size(data.trial{i},1)==length(data.label), 'inconsistent number of channels in trial %d', i);
 
    % CASE1: If you are using template channel locations for all the subjects:
    %        -> Perform 'Head model and settings' from GUI on one of the
    %           subjects, then type 'eegh' to obtain 9 parameters called 'coord_transform'
    %           For example, my 32ch data has [0.68403 -17.0438 0.14956 1.3757e-07 1.0376e-08 -1.5708 1 1 1], therefore
    %
    %           coordinateTransformParameters = [0.68403 -17.0438 0.14956 1.3757e-07 1.0376e-08 -1.5708 1 1 1];
    %
    % CASE2: If you are using digitized (i.e. measured) channel locations that have 10-20 system names (Fz, Cz, ...) 
    %        -> Calculate the invidivualized transform parameters usign all scalp channels in this way
    %          
    %           [~,coordinateTransformParameters] = coregister(EEG.chanlocs, '[EEGLABroot]/eeglab/plugins/dipfit2.3/standard_BEM/elec/standard_1005.elc', 'warp', 'auto', 'manual', 'off')
    %
    % CASE3: If you are using digitized channel locations that do NOT have 10-20 system names
    %        -> Identify several channels that are compatible with 10-20 channel locations, rename them with 10-20 system labels, perform CASE2, then rename them back.
    %           Alternatively, rename the fiducial channels under EEG.chaninfo.nodatchans into 'Nz', 'LPA', 'RPA' accordingly, then use these fiducial channels to perform CASE2 in this way
    %
    %           [~,coordinateTransformParameters] = coregister(EEG.chaninfo.nodatchans, '[EEGLABroot]/eeglab/plugins/dipfit2.3/standard_BEM/elec/standard_1005.elc', 'warp', 'auto', 'manual', 'off')
 
    templateChannelFilePath = '[EEGLABroot]/eeglab/plugins/dipfit2.3/standard_BEM/elec/standard_1005.elc';
    hdmFilePath             = '[EEGLABroot]/eeglab/plugins/dipfit2.3/standard_BEM/standard_vol.mat';
    EEG = pop_dipfit_settings( EEG, 'hdmfile', hdmFilePath, 'coordformat', 'MNI',...
        'mrifile', '[EEGLABroot]/eeglab/plugins/dipfit2.3/standard_BEM/standard_mri.mat',...
        'chanfile', templateChannelFilePath, 'coord_transform', coordinateTransformParameters,...
        'chansel', 1:EEG.nbchan);
    EEG = pop_multifit(EEG, 1:EEG.nbchan,'threshold', 100, 'dipplot','off','plotopt',{'normlen' 'on'});
 
    % Step 12: Search for and estimate symmetrically constrained bilateral dipoles
    EEG = fitTwoDipoles(EEG, 'LRR', 35);
 
    % Step 13: Run ICLabel (Pion-Tonachini et al., 2019)
    EEG = iclabel(EEG, 'default');
 
    % Save the dataset
    EEG = pop_saveset(EEG, 'filename', dataName, 'filepath', '/data/projects/example');
end

What do those AMICA parameters mean? (04/06/2018 updated)

  • 'do_reject', 1

This means 'Reject the outliers for the AMICA model being computed'.

  • 'numrej', 15

This means 'Do the rejection for 15 iterations' Note that these datapoint rejection is only valid for AMICA computation, so it does not change the data length. If you want to know which data points were rejected, you can check EEG.etc.amica.Lht. If any datapoint shows 0, it means the datapoints were rejected by AMICA. Note also that you might want to use this info for further data cleaning if you don't mind rejecting randomly distributing single datapoints.

  • 'rejsig', 3

This means 'Use rejection criterion as SD==3' Note that the outlier criterion and datapoints are updated for every iteration.

  • 'rejint', 1

This means 'Use the rejection interval == 1 (i.e., for the first 15 continuous iterations, perform this outlier rejection.'

The reason for choosing these parameters are

  1. In most cases, the above parameters rejects total of 10% of data points.
  2. This set of parameters and options were recommended by my colleague Jason, the developer of AMICA.


Recommended preprocessing steps and order (05/11/2020 updated)

I suggest one go through the processes explained in this page in the following order:

  1. Example of batch code to preprocess multiple subjects
  2. How to perform component selection using ICLabel() and dipole information
  3. How to perform epoch rejection using single-trial PSD

This is because ICLabel() has dependency on IC-scalp maps (but not dipole information, thanks Luca!), and the epoch rejection function works on single trials of ALL ICs. This means that even if one of the all ICs shows a bad behavior in an epoch, the entire epoch will be rejected. This is why you want to pre-select brain ICs only, otherwise this epoch rejection may reject too many epochs.