[Eeglablist] Seeking your advice on EEG preprocessing
Makoto Miyakoshi
mmiyakoshi at ucsd.edu
Wed Jan 21 14:12:02 PST 2026
Hi Zigmunds,
> And everywhere it seems to be done differently.
Yes, unfortunately that is the case. It is partly because 'preprocessing'
and 'analysis' are more inseparable in EEG than in fMRI, for example.
When you use ASR, start with a cutoff threshold of 20. Do not use 4 or 5,
or you'll lose a lot of data variance. Several ASR papers were written solely
to determine the 'optimal' cutoff threshold; they concluded that 10-30 is a
good range.
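For example, a call that runs only ASR at cutoff 20 could look like this (a
minimal sketch using the clean_rawdata plugin's pop_clean_rawdata()
interface; turning the other criteria 'off' is just for illustration, not a
recommendation):

```matlab
% Minimal sketch: run only ASR (burst correction) at cutoff 20 with the
% clean_rawdata plugin; all other criteria are disabled for illustration.
EEG = pop_clean_rawdata(EEG, ...
    'FlatlineCriterion',  'off', ...  % skip flat-channel detection
    'ChannelCriterion',   'off', ...  % skip bad-channel correlation test
    'LineNoiseCriterion', 'off', ...  % skip line-noise channel test
    'Highpass',           'off', ...  % assume data are already high-passed
    'BurstCriterion',     20, ...     % ASR cutoff: 20 SD (10-30 is the good range)
    'WindowCriterion',    'off', ...  % do not reject residual windows
    'BurstRejection',     'off');     % 'off' = correct bursts rather than remove them
```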
I can do an online consultation for you if you want. When we do so, let's
invite other people so that we can discuss it together. I am on Eastern
Standard Time and available basically anytime.
Makoto
On Wed, Jan 14, 2026 at 4:14 PM Zigmunds Freibergs via eeglablist <
eeglablist at sccn.ucsd.edu> wrote:
> Dear EEGLAB colleagues,
>
>
> I would like to consult with you regarding EEG data analysis, specifically
> the preprocessing pipeline.
>
> I am currently a first-year PhD student at the University of Tartu in
> Estonia, at the Institute of Psychology. As my work focuses on EEG, I have
> come upon some questions that I am struggling to find answers to.
> Considering your experience and expertise, I would like to ask for your
> advice, pose some questions, and share what I have discovered and
> understood so far.
>
>
> What I wanted to ask you about is the preprocessing pipeline. I have read
> Prof. Makoto Miyakoshi's blog about the EEG pipeline, as well as EEGLAB's
> tutorials and many technical articles. I also consulted researchers from
> Estonian, Lithuanian, and Polish universities about their approaches. And
> everywhere it seems to be done differently. What I have understood is that
> there can be minor differences in the preprocessing steps; however, each
> deviation should be strictly justified.
>
> I wanted to ask for your thoughts on the pipeline I have developed (please
> see the attached schematic picture).
> Assuming that channel locations and event (stimulus) information have
> already been imported into the EEG files, the first recommended step is
> usually re-referencing to the average. After that, resampling can be done;
> however, if better temporal resolution is desired, it is preferable to keep
> the higher sampling rate. Then, before filtering, it is recommended to make
> copies of the EEG datasets for training ICA separately. The reason, as I
> understand it, is that the ICA decomposition is biased if the high-pass
> cutoff is set below 1 Hz. Therefore, the copies should be band-pass
> filtered from 1 Hz to 30 Hz (in the case of ERPs), while the original files
> are filtered from 0.1 Hz to 30 Hz. Next, the CleanLine plugin (if EEGLAB is
> used) is suggested for removing line noise, even though the band-pass
> filter only goes up to 30 Hz. Then comes the removal of bad channels and
> bad segments (artifact rejection). Here, for the sake of a standardized
> approach, both A. Delorme and M. Miyakoshi suggest using the
> clean_rawdata() plugin (in EEGLAB). I have tried it - it is quite
> aggressive in removing these segments, especially with the plugin's default
> settings. However, it works more gently if ASR is used to correct those
> segments, removing only the very bad parts. After that, channels should be
> interpolated and re-referencing should be done. Finally, the ICA weights
> should be transferred from the copies I mentioned earlier, and bad
> components should be removed.
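> As a rough sketch, the steps above might look like this in EEGLAB (the
> file name and parameter values are placeholders taken from my description,
> not recommendations):
>
> ```matlab
> % Pipeline sketch: average reference, optional resampling, a separate
> % 1-30 Hz copy for ICA training, 0.1-30 Hz for the ERP data, line-noise
> % removal, artifact rejection, interpolation, and re-referencing.
> EEG = pop_loadset('subject01.set');      % channel locations/events imported
> EEG = pop_reref(EEG, []);                % initial average reference
> EEG = pop_resample(EEG, 250);            % optional downsampling
> ICAEEG = EEG;                            % copy for ICA training
> ICAEEG = pop_eegfiltnew(ICAEEG, 'locutoff', 1,   'hicutoff', 30);
> EEG    = pop_eegfiltnew(EEG,    'locutoff', 0.1, 'hicutoff', 30);
> EEG    = pop_cleanline(EEG, 'linefreqs', 50);           % line-noise removal
> EEG    = pop_clean_rawdata(EEG, 'BurstCriterion', 20);  % channels/segments
> EEG    = pop_interp(EEG, ICAEEG.chanlocs, 'spherical'); % restore channels
> EEG    = pop_reref(EEG, []);             % re-reference after interpolation
> ICAEEG = pop_runica(ICAEEG, 'icatype', 'runica');       % train ICA on copy
> ```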
> One more thing that confuses me: dataset copies are recommended to be made
> either after the downsampling step (if it is done), or after the initial
> re-referencing and before filtering. However, some sources describe the
> CleanLine and clean_rawdata steps, including interpolation, as being
> applied to the ICA copies. In contrast, other sources suggest that channels
> should not be interpolated, as this would again bias the ICA results. Yet
> if the clean_rawdata steps, including bad-channel rejection, are applied
> only to the ICA copies, the resulting ICA weights will not be transferable
> to the original datasets, because the channel sets will no longer match.
>
> Summing up, I am really confused, first, about the automation. On the one
> hand, the clean_rawdata function is recommended for standardized
> preprocessing. On the other hand, it is a bit aggressive. Only when I
> choose ASR correction and set the maximum acceptable 0.5-second-window
> standard deviation to 30 does it retain as much as 70% of the initial data.
> Is this the optimal way to proceed? Or would it be better to use some other
> automated option for cleaning bad data periods? If yes, which one would you
> suggest? Second, regarding the preprocessing steps: in some sources, ICA is
> described as being run on the same files, but this contradicts the
> filtering issue above. On the other hand, if these steps are performed on
> data copies, it becomes confusing to determine exactly which steps are
> necessary. For instance, some sources suggest filtering first and then
> applying ICA. Other sources suggest removing bad channels as well, but then
> recommend interpolating them back so that, when the ICA weights are
> transferred to the original datasets, the number of channels remains the
> same. Still other sources suggest that interpolated channels may bias ICA.
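> For concreteness, the weight transfer I have in mind amounts to copying
> the decomposition fields between datasets (a sketch that assumes both
> datasets end up with the same channels in the same order):
>
> ```matlab
> % Copy ICA results from the 1-Hz high-passed copy (ICAEEG) to the
> % 0.1-Hz high-passed original (EEG); the channel sets must match.
> EEG.icaweights  = ICAEEG.icaweights;   % unmixing weights
> EEG.icasphere   = ICAEEG.icasphere;    % sphering matrix
> EEG.icawinv     = ICAEEG.icawinv;      % mixing matrix (inverse weights)
> EEG.icachansind = ICAEEG.icachansind;  % channels used by ICA
> EEG = eeg_checkset(EEG, 'ica');        % recompute/validate IC activations
> ```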
>
> Could you please share your thoughts on these questions, describe your own
> approach where it differs, and explain why you do it differently than I
> described?
>
> Kind regards,
> Zigmunds Freibergs
>
> PhD student/Junior Research Fellow
> Institute of Psychology
> University of Tartu
>
> _______________________________________________
> To unsubscribe, send an empty email to
> eeglablist-unsubscribe at sccn.ucsd.edu or visit
> https://sccn.ucsd.edu/mailman/listinfo/eeglablist .
>