[Eeglablist] Automated EEG processing pipeline

Mon Oct 7 21:42:12 PDT 2024

Hi Ugo,

For three different ERPs, we compared high-pass filtering at 0.5 Hz with no baseline against high-pass filtering at 0.01 Hz with a standard 200 ms ERP baseline. The results were clear: in all cases, the second method (0.01 Hz plus baseline) left significantly fewer electrodes reaching significance (see Supplementary Figure 11 of this paper: https://www.nature.com/articles/s41598-023-27528-0 ). However, I agree that the choice depends on your specific interests. This is why you can temporarily apply a 0.5 Hz high-pass filter to run clean_rawdata and ICA, and then apply the resulting rejections to your original data.
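
To make the last point concrete, here is a minimal sketch in plain Python, not EEGLAB code. It assumes a first-order high-pass filter as a stand-in for the real filter and a simple peak-amplitude threshold as a stand-in for clean_rawdata; the signal, window length, and threshold are all made up for illustration. Bad windows are detected on a filtered copy and then removed from the original, so the retained data keeps its slow content.

```python
import math

def highpass(x, cutoff_hz, fs):
    """First-order RC high-pass filter (a simple stand-in, not EEGLAB's filter)."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    alpha = rc / (rc + 1.0 / fs)
    y = [0.0] * len(x)
    for n in range(1, len(x)):
        y[n] = alpha * (y[n - 1] + x[n] - x[n - 1])
    return y

def bad_windows(x, win, threshold):
    """Indices of windows whose peak absolute amplitude exceeds the threshold."""
    return [w for w in range(len(x) // win)
            if max(abs(v) for v in x[w * win:(w + 1) * win]) > threshold]

def drop_windows(x, win, bad):
    """Remove every sample that belongs to one of the bad windows."""
    bad = set(bad)
    return [v for i, v in enumerate(x) if i // win not in bad]

# Synthetic single-channel signal (illustrative values only):
# slow drift + 10 Hz oscillation + a large burst in samples 400-449.
fs, n_samples, win, threshold = 100.0, 1000, 50, 10.0
original = []
for i in range(n_samples):
    t = i / fs
    burst = 40.0 * math.sin(2 * math.pi * 10 * t) if 400 <= i < 450 else 0.0
    original.append(5.0 * t + math.sin(2 * math.pi * 10 * t) + burst)

# Detect on the temporarily filtered copy, reject on the original.
filtered_copy = highpass(original, 0.5, fs)
bad = bad_windows(filtered_copy, win, threshold)
cleaned = drop_windows(original, win, bad)
```

Running the same threshold directly on `original` would flag most windows because of the drift, which is the reason for detecting on the filtered copy.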

Arno

> Hi Arno,
> 
> Wouldn't a 0.5 Hz high-pass filter remove important ERPs? I've seen a 0.2 Hz high-pass completely remove an LPP from a dataset, which is why it is traditional to high-pass at 0.1 Hz when doing ERP analysis.
> 
> Cheers,
> 
> Ugo
> 
> From: eeglablist <eeglablist-bounces at sccn.ucsd.edu> on behalf of Arnaud Delorme via eeglablist <eeglablist at sccn.ucsd.edu>
> Sent: Friday, October 4, 2024 5:12 PM
> To: eeglab list <eeglablist at sccn.ucsd.edu>
> Subject: [Eeglablist] Automated EEG processing pipeline
> 
> Dear all,
> 
> One of my colleagues asked me to comment on his automated EEG processing pipeline. I thought my comments might be of interest to some of you. Please feel free to react if you do not agree with some of my comments below.
> 
> > 1. Load raw data file and channel location.
> > 
> > 2. Filter data with a band pass filter (0.1 to 50 Hz, 60 Hz notch).
> 
> Makes sense. I would use linear filters, which are more stable. Also, in theory, the notch at 60 Hz is not needed if you low-pass at 50 Hz (depending on the filter roll-off). Additionally, regarding the high-pass filter at 0.1 Hz, unless you want to examine low frequencies, you might consider using a high-pass filter at 0.5 Hz (see also points 4 and 6). Note that very low frequencies below 0.5 Hz are often contaminated by skin conductance changes due to sweating.
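
The dependence on roll-off can be checked numerically. The sketch below is plain Python, not EEGLAB's filter design: it assumes a Hamming-windowed sinc low-pass filter, a 250 Hz sampling rate, and two arbitrary filter lengths, and it compares how much of a 60 Hz component survives a 50 Hz low-pass in each case.

```python
import cmath
import math

def lowpass_fir(num_taps, cutoff_hz, fs):
    """Hamming-windowed sinc low-pass filter, normalized to unit gain at DC."""
    fc = cutoff_hz / fs                      # cutoff in cycles per sample
    mid = (num_taps - 1) / 2.0
    taps = []
    for n in range(num_taps):
        k = n - mid
        ideal = 2.0 * fc if k == 0 else math.sin(2.0 * math.pi * fc * k) / (math.pi * k)
        window = 0.54 - 0.46 * math.cos(2.0 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]

def gain(taps, freq_hz, fs):
    """Magnitude of the filter's frequency response at freq_hz."""
    w = 2.0 * math.pi * freq_hz / fs
    return abs(sum(t * cmath.exp(-1j * w * n) for n, t in enumerate(taps)))

fs = 250.0                                   # assumed sampling rate
short_filter = lowpass_fir(21, 50.0, fs)     # gentle roll-off
long_filter = lowpass_fir(201, 50.0, fs)     # steep roll-off

gain_short_60 = gain(short_filter, 60.0, fs)
gain_long_60 = gain(long_filter, 60.0, fs)
print(f"60 Hz gain, 21 taps:  {gain_short_60:.4f}")
print(f"60 Hz gain, 201 taps: {gain_long_60:.4f}")
```

With the short filter, 60 Hz still falls inside the transition band, so a noticeable fraction of line noise passes; with the long filter it sits in the stopband. The same check applies to whatever filter order you actually use.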
> 
> Low-pass filtering below 50 Hz is probably acceptable, but most people would want to retain the higher frequencies. Many processes can take advantage of the additional information (like ICA, for example). If your data is very noisy, it is probably a reasonable approach, though.
> 
> > 3. Use EEGLAB function clean_line with default settings.
> 
> I would skip this step since you remove line noise in step 2. 
> 
> > 4. Use EEGLAB function clean_rawdata with default settings on 19 channels dry electrode system.
> 
> ASR (in clean_rawdata) works down to 4 channels, according to its author, Christian Kothe. So 19 channels is fine. Due to the large amount of noise and the significant distance between channels, you might want to change the threshold for rejecting bad channels (by default, 0.8 correlation with neighboring channels); otherwise, too many channels might be rejected.
> 
> Additionally, this function requires the data to be high-pass filtered at 0.5 Hz. If you do not select the option to filter in clean_rawdata and do not high-pass filter at 0.5 Hz in step 2, the results will be unpredictable. I have used clean_rawdata (and the filter) and reapplied the result (the rejected data portion) to the original unfiltered data (for example, the data high-pass filtered at 0.1 Hz), so that’s a possibility.
> 
> For practical purposes, you should high-pass filter at 0.5 Hz in step 2 and ignore the clean_rawdata filtering option.
> 
> > 5. Use EEGLAB interpolate function to interpolate missing channels that are removed by above process.
> 
> You need to interpolate channels after ICA, so I would move this step after 7. If you interpolate channels before ICA, it will fail to converge. Additionally, if you use EEGLAB STUDY after point 7, the interpolation will be performed at that time.
> 
> > 6. Use EEGLAB ICA and ICLabel to correct for blinks in data (90% threshold) and muscle (90% threshold) as in the default ICLabel menu.
> 
> I think it should be fine. For ICA, you could use the Picard method, which is Infomax ICA on steroids compared with the default runica function (it has a lower threshold for convergence, so in theory, it is better; it also uses the Newton method, as AMICA does, for optimizing components). If you have the patience, you could also use AMICA.
> 
> Additionally, irrespective of the ICA algorithm, you need the data to be high-pass filtered at 0.5 Hz (sometimes higher), so it is a good argument for doing it at step 2. If you do not high-pass filter before ICA, the components are often meaningless. As for step 4, there is the option of filtering the data just for ICA, running ICA, and reusing the components on the unfiltered data. ICA components can be seen as spatial filters, so this approach can make sense in some cases.
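
The spatial-filter view can be illustrated with a toy two-channel example in plain Python. The mixing matrix and sources below are invented, and the unmixing matrix is simply its inverse, standing in for weights that would have been learned on a filtered copy. In EEGLAB terms, the unmixing matrix corresponds to EEG.icaweights*EEG.icasphere and the mixing matrix to EEG.icawinv.

```python
def matmul(a, b):
    """Multiply matrix a (n x k) by matrix b (k x m), both given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def inverse_2x2(m):
    """Inverse of a 2 x 2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

# Hypothetical mixing matrix: column 0 is a "brain" scalp map, column 1 a "blink" map.
mixing = [[1.0, 0.8], [0.5, -0.3]]
unmixing = inverse_2x2(mixing)       # stand-in for weights learned on filtered data

# Unfiltered data: the brain source carries a slow drift that we want to keep.
brain = [0.1 * t for t in range(6)]
blink = [0.0, 0.0, 5.0, 5.0, 0.0, 0.0]
data = matmul(mixing, [brain, blink])            # channels x samples

# Apply the spatial filters to the unfiltered data, zero the blink component,
# and project back to the channels.
activations = matmul(unmixing, data)
kept = [activations[0], [0.0] * len(activations[1])]
cleaned = matmul(mixing, kept)
```

Because the weights only combine channels at each time point, they do not care whether the data were filtered in time; the slow drift in the brain source survives the blink removal.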
> 
> The default rejection thresholds for ICA components in ICLabel are good from my perspective (since I set them myself in the plugin and also ran some tests in this paper: https://www.nature.com/articles/s41598-023-27528-0 ). Some people would disagree, though. I know Makoto is much more aggressive in rejecting ICA components.
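
For readers who want to see what the 90% thresholds mean in practice, here is a small plain-Python sketch. It assumes ICLabel's seven classes in their usual order and a table of class probabilities per component; the probability values are invented for illustration and the function name is hypothetical.

```python
# ICLabel class order
CLASSES = ["Brain", "Muscle", "Eye", "Heart", "Line Noise", "Channel Noise", "Other"]

# Thresholds from step 6: reject if P(muscle) >= 0.9 or P(eye) >= 0.9
THRESHOLDS = {"Muscle": 0.9, "Eye": 0.9}

def flag_components(probabilities, thresholds=THRESHOLDS):
    """Return indices of components whose probability meets any class threshold."""
    flagged = []
    for index, row in enumerate(probabilities):
        for name, minimum in thresholds.items():
            if row[CLASSES.index(name)] >= minimum:
                flagged.append(index)
                break
    return flagged

# Invented probabilities, one row per component (each row sums to 1).
example = [
    [0.95, 0.01, 0.01, 0.01, 0.00, 0.01, 0.01],  # clearly brain: kept
    [0.02, 0.01, 0.93, 0.01, 0.01, 0.01, 0.01],  # eye above 0.9: flagged
    [0.30, 0.60, 0.02, 0.02, 0.02, 0.02, 0.02],  # muscle below 0.9: kept
    [0.03, 0.92, 0.01, 0.01, 0.01, 0.01, 0.01],  # muscle above 0.9: flagged
]
flagged = flag_components(example)
```

A more aggressive policy amounts to lowering the thresholds or adding entries for other classes to the dictionary.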
> 
> > 7. Reference to Average reference. 
> 
> ICLabel will also average reference the data internally to find matching components. I would probably perform average referencing before ICA (so move to step 5). Anecdotal evidence seems to indicate that ICA works better on average-referenced data. Also, average reference is not required unless you want to do source localization (DIPFIT in EEGLAB will automatically average-reference data and components before performing source localization).
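
As a reminder of what the operation does, average referencing subtracts the mean across channels from every channel at each time point. A minimal plain-Python sketch with made-up values (this is not EEGLAB code):

```python
def average_reference(data):
    """Subtract the across-channel mean from every channel at each sample.

    data is a list of channels, each a list of samples of equal length.
    """
    n_channels = len(data)
    n_samples = len(data[0])
    means = [sum(ch[t] for ch in data) / n_channels for t in range(n_samples)]
    return [[ch[t] - means[t] for t in range(n_samples)] for ch in data]

# Three channels, four samples (illustrative values only).
data = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 2.0, 2.0, 2.0],
    [6.0, 5.0, 4.0, 3.0],
]
rereferenced = average_reference(data)
```

After the operation, the channels sum to zero at every sample.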
> 
> It is important that you evaluate some files to see if this approach works. Once the pipeline is fixed with the comments above, it looks good on paper, but real data is unpredictable, especially with a dry sensor headset. Across a number of datasets, you could run some statistics on the number of channels and ICA components rejected to make sure they are in a reasonable range (a good measure of quality is also to count the number of brain components, as seen in Figure 2 of this paper: https://ieeexplore.ieee.org/document/9441399 ). Also, nothing replaces looking at the raw data.
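
A check of this kind can be very simple. The plain-Python sketch below uses invented file names and rejection counts, and the 25% limit is a hypothetical choice, not a recommendation; it lists the files whose number of rejected channels looks out of range and reports the median as a summary.

```python
from statistics import median

N_CHANNELS = 19                 # the 19-channel system discussed in step 4
MAX_REJECTED_FRACTION = 0.25    # hypothetical limit, adjust to your data

# Made-up counts of rejected channels per file, for illustration only.
rejected_channels = {"sub-01": 2, "sub-02": 0, "sub-03": 9, "sub-04": 3}

def files_to_review(rejected, n_channels, max_fraction):
    """Names of files where the rejected fraction exceeds max_fraction."""
    return sorted(name for name, count in rejected.items()
                  if count / n_channels > max_fraction)

to_review = files_to_review(rejected_channels, N_CHANNELS, MAX_REJECTED_FRACTION)
typical = median(rejected_channels.values())
print("Files to inspect by eye:", to_review)
print("Median rejected channels:", typical)
```

The same function works for counts of rejected ICA components or of brain components, with the denominator changed accordingly.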
> 
> Note that there are similar automated pipelines published on the EEGLAB website: https://eeglab.org/tutorials/11_Scripting/automated_pipeline.html
> 
> Arno