[Eeglablist] Several Questions on Preprocessing Steps of a Neural Entrainment Study

Mon Feb 24 03:57:01 PST 2025

Dear EEGLAB List,

I am currently trying to analyze the EEG data that I collected for my master's thesis using EEGLAB, and I have A LOT of rookie questions regarding each step of the way. If you could guide me in any of the steps, I would be so grateful.

Before I start the questions: my sampling rate is 2048 Hz, and I used a 64-channel BioSemi cap (Active - 10/20).
What I want to do after preprocessing is run an FFT and look at the power at the frequencies of interest.
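
For illustration, a minimal sketch of reading out the amplitude and power at one frequency of interest. It uses a single-bin DFT in plain Python rather than any EEGLAB function, the toy signal and the name `amplitude_at` are made up for this example, and it assumes the analysed window contains a whole number of cycles of the target frequency so that the frequency falls exactly on a bin.

```python
import cmath
import math

def amplitude_at(signal, fs, freq):
    """Amplitude of the sinusoidal component at `freq` (Hz) via a single-bin DFT."""
    n = len(signal)
    # Correlate the signal with a complex exponential at the target frequency.
    acc = sum(x * cmath.exp(-2j * math.pi * freq * i / fs) for i, x in enumerate(signal))
    # The factor 2/n converts the DFT magnitude into a sinusoid amplitude
    # (valid for frequencies strictly between 0 and fs/2).
    return 2.0 * abs(acc) / n

# Toy data (made up): 2 s of a 2 Hz sine with amplitude 3, sampled at 64 Hz.
fs = 64
signal = [3.0 * math.sin(2 * math.pi * 2.0 * i / fs) for i in range(2 * fs)]

amp_target = amplitude_at(signal, fs, 2.0)  # frequency of interest
amp_other = amplitude_at(signal, fs, 5.0)   # a frequency with no signal in it
power_target = amp_target ** 2              # power as squared amplitude
```

Here `amp_target` recovers the amplitude of the toy sine and `amp_other` is essentially zero. With real EEG the value at the target bin would normally be compared against neighbouring bins.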

1. IMPORT --> I have BioSemi data (.bdf format). When importing the raw data, EEGLAB warns that I should provide a reference channel (otherwise 40 dB of unnecessary noise will remain in the data), so I provided Cz as the reference channel, planning to re-reference to the average later.

Question: Is this what I should be doing?
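
As a small sanity check on the "Cz first, average later" plan, the sketch below shows on made-up numbers that average-referencing data already referenced to a single channel gives the same result as average-referencing the original data. It assumes the initial reference channel stays in the dataset (as an all-zero channel) when the average is computed. The function names are hypothetical, and how EEGLAB's importer treats the reference channel is not checked here.

```python
def rereference(data, ref_idx):
    """Subtract one channel from every channel (the reference channel becomes all zeros)."""
    ref = data[ref_idx]
    return [[x - r for x, r in zip(ch, ref)] for ch in data]

def average_reference(data):
    """Subtract the across-channel mean at every sample."""
    n_ch = len(data)
    means = [sum(col) / n_ch for col in zip(*data)]
    return [[x - m for x, m in zip(ch, means)] for ch in data]

# Toy data (made up): 3 channels x 4 samples; index 2 plays the role of Cz.
raw = [[1.0, 2.0, 3.0, 4.0],
       [0.5, -1.0, 2.0, 0.0],
       [2.0, 2.0, -2.0, 1.0]]

cz_referenced = rereference(raw, 2)
via_cz = average_reference(cz_referenced)
direct = average_reference(raw)

# True when both routes agree sample by sample.
same = all(abs(a - b) < 1e-12
           for row_a, row_b in zip(via_cz, direct)
           for a, b in zip(row_a, row_b))
```

If the reference channel were dropped before averaging, the two routes would no longer match, so that detail matters more than which single channel is chosen at import.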

2. FILTERING --> Most of the papers that I use as references only applied a high-pass filter (0.1 Hz Butterworth, 2nd / 4th order) and don't report any low-pass filtering or other artefact rejection steps before ICA, which I find extremely puzzling.
I want to apply a 20 Hz low-pass filter after the high-pass filter (rather conservative, as I want the data to be as clean as possible and am not interested in frequencies above that). I then downsample to 256 Hz to reduce computation time, considering how long ICA takes.

Question: Any suggestions or other ideas?
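
The arithmetic behind this filter-then-downsample order can be sketched as below. The helper `downsampling_plan` is hypothetical and only checks that the low-pass cutoff sits below the new Nyquist frequency. It says nothing about the filter's transition band, which in practice also needs to stay below that limit.

```python
def downsampling_plan(fs_in, fs_out, lowpass_hz):
    """Return (integer factor, new Nyquist frequency, whether the cutoff is below it)."""
    if fs_in % fs_out != 0:
        raise ValueError("this sketch only handles integer downsampling factors")
    factor = fs_in // fs_out
    nyquist_out = fs_out / 2
    return factor, nyquist_out, lowpass_hz < nyquist_out

# Values from the post: 2048 Hz recording, 256 Hz target, 20 Hz low-pass.
factor, nyquist_out, ok = downsampling_plan(2048, 256, 20)
print(factor, nyquist_out, ok)  # 8 128.0 True
```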

3. PREPARING DATA FOR ICA --> After downsampling comes artefact rejection. For this, I know that there are several approaches:

a.) Run ICA on manually or automatically cleaned continuous data

  * If this is the case, how should I do it? When I use the Clean Raw Data with ASR approach, it removes 60% or more of the data, which confuses me, because in the channel scroll the data doesn't look THAT terrible. And if I try to do it manually, I am afraid I won't do a good job, as this is my first time properly analyzing EEG data, and it won't be replicable.

b.) Cut the data into dummy epochs (e.g. 1-second epochs), set a threshold such as ±100 µV or 3 SDs, reject the epochs that exceed the threshold, and then run ICA. In that case, which of the following approaches is more advisable?

      1.) Run ICA on aggressively filtered (1 Hz high-pass, 30 Hz low-pass) and downsampled 1-second-epoched data, get the ICA weights, project them onto the raw continuous data, reject the components, and then run the preprocessing steps on the cleaned data?

      2.) Run ICA on 0.1 Hz high-pass, 20 Hz low-pass filtered, 1-second-epoched data, get the ICA weights, and project them onto the same preprocessed but not yet ICA'd data? OR

      3.) Concatenate the epoched and cleaned data back into continuous form, and then run ICA?
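
The dummy-epoch rejection described in b.) can be sketched as follows. This is a plain-Python illustration of an absolute-amplitude threshold only, not an EEGLAB routine. The function name, the toy data, and the ±100 µV default are assumptions for the example, and the SD-based criterion is not implemented.

```python
def reject_epochs(data, fs, epoch_sec=1.0, threshold_uv=100.0):
    """Split channels x samples data into fixed-length epochs and drop those over threshold.

    Returns (kept_epochs, rejected_indices). A trailing partial epoch is discarded.
    """
    n = int(round(epoch_sec * fs))   # samples per epoch
    n_epochs = len(data[0]) // n
    kept, rejected = [], []
    for e in range(n_epochs):
        epoch = [ch[e * n:(e + 1) * n] for ch in data]
        # Largest absolute value across all channels within this epoch.
        peak = max(abs(x) for ch in epoch for x in ch)
        if peak > threshold_uv:
            rejected.append(e)
        else:
            kept.append(epoch)
    return kept, rejected

# Toy data (made up): 2 channels, 3 s at 4 Hz, in microvolts.
# The second epoch contains a 250 uV spike on the first channel.
fs = 4
data = [[10, -20, 15, 5, 30, 250, -40, 10, -5, 0, 5, 10],
        [5, 5, -5, 0, 10, 20, -10, 0, 90, -95, 3, 1]]
kept, rejected = reject_epochs(data, fs)
```

In this toy case only the epoch with the spike is rejected. The kept epochs could then be passed to ICA either as epochs or concatenated back together, which is exactly the choice among options 1 to 3 above.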

4. EPOCHING --> I had 12 trials lasting around 2.5 minutes each, and each trial consisted of rhythmic patterns that always followed one another. The issue is that the patterns differ (accented vs. isochronous), so their trigger types differ too (50 & 200 always together in one trial, OR 150 & 250 always together in one trial). Each pattern lasts 4.2 seconds.

I have two questions regarding this:

1.) I know that in neural entrainment studies people often loop the same pattern (for example, a pattern lasting 2.4 seconds looped 25 times) and then cut the data into 60-second epochs. However, as my event types change one right after the other, I have to cut my data into epochs that run from exactly 0 to 4.2 seconds. Is this the right approach? I feel that it then becomes more like an ERP study than a neural entrainment study.

2.) I don't have a baseline! Again, because the patterns follow each other without breaks, I don't have a -200 to 0 ms period that I can use for baseline correction. Should I in this case subtract the mean of the whole epoch as baseline correction? Or should I simply skip baseline correction?
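
Two numerical points behind these questions can be sketched in plain Python (all names and the toy epoch are made up for illustration). First, the spacing between FFT bins is 1 / epoch duration, so a 4.2-second epoch resolves frequencies far more coarsely than a 60-second one. Second, subtracting the whole-epoch mean only changes the 0 Hz bin, so for a frequency that falls exactly on a non-zero bin the amplitude is the same with or without that correction.

```python
import cmath
import math

def frequency_resolution(epoch_sec):
    """Spacing between FFT bins (Hz) for an epoch of the given duration."""
    return 1.0 / epoch_sec

def remove_epoch_mean(epoch):
    """Subtract the mean of the whole epoch from every sample."""
    m = sum(epoch) / len(epoch)
    return [x - m for x in epoch]

def amplitude_at_bin(epoch, k):
    """Amplitude at DFT bin k (0 < k < n/2), i.e. at k / epoch duration Hz."""
    n = len(epoch)
    acc = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(epoch))
    return 2.0 * abs(acc) / n

res_single = frequency_resolution(4.2)   # one 4.2 s pattern, about 0.238 Hz
res_long = frequency_resolution(60.0)    # a 60 s epoch, about 0.0167 Hz

# Toy epoch (made up): constant offset of 50 plus a sine with
# 3 cycles per epoch and amplitude 2.
n = 84
epoch = [50.0 + 2.0 * math.sin(2 * math.pi * 3 * i / n) for i in range(n)]
demeaned = remove_epoch_mean(epoch)

amp_before = amplitude_at_bin(epoch, 3)
amp_after = amplitude_at_bin(demeaned, 3)
```

Both `amp_before` and `amp_after` come out at the amplitude of the toy sine, which suggests that for a pure frequency-domain readout the choice between mean removal and no baseline correction matters much less than it would for ERP waveforms.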

I tried running the steps in different ways for one participant many times, and I always ended up with extremely noisy-looking ERP plots.

I would genuinely appreciate any help!
Thanks already.