[Eeglablist] filtering and ica

Fri Dec 17 10:39:11 PST 2004

Dear Natasa

On 16 Dec 2004, Natasa Kovacevic wrote:

> Here, at Rotman Research Institute, we acquire Neuroscan eeg data which
> always has 60Hz line noise. For the types of time-frequency analysis
> that we want to perform, we need to filter this noise out. When we apply
> ica decomposition for artifact removal it is quite clear that many ica
> components pick this noise up to various degrees. So the question is:
> 
> should we first filter this particular noise and then run ica, and if
> so, would notch filtering at 60 Hz of the raw, continuous data be the
> best (as opposed to say filtering epoched, manually pruned data)?

Notch filtering first and then running ICA is the better approach. In 
general, if you provide ICA with clean data, the decomposition into 
independent components is qualitatively better. The same holds if you can 
remove gross artifacts (due to subject movement, electrode movement, etc.) 
before the ICA decomposition. The reason is that each artifact of 
independent origin is likely to take up one degree of freedom (one 
component) in ICA space. If there are too many such artifacts, you are 
likely to be left with no degrees of freedom for the sources of interest.
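
As a rough illustration of the filtering step, here is a minimal Python 
sketch using SciPy's iirnotch and filtfilt on synthetic data. The 
sampling rate (500 Hz), the quality factor (Q = 30), the test signal and 
the helper names notch_filter and amplitude_at are all assumptions made 
for the example; this is not the EEGLAB implementation.

```python
import numpy as np
from scipy.signal import iirnotch, filtfilt


def notch_filter(data, fs, f0=60.0, q=30.0):
    """Zero-phase notch filter applied along the last (time) axis.

    data: array of shape (channels, samples) or (samples,)
    fs:   sampling rate in Hz
    f0:   frequency to remove in Hz (60 Hz line noise here)
    q:    quality factor; a higher q gives a narrower notch (assumed value)
    """
    b, a = iirnotch(f0, q, fs=fs)
    # filtfilt runs the filter forward and backward, so no phase shift
    return filtfilt(b, a, data, axis=-1)


def amplitude_at(signal, fs, freq):
    """Amplitude of the spectral bin closest to freq (1-D signal)."""
    spectrum = 2.0 * np.abs(np.fft.rfft(signal)) / len(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    return spectrum[np.argmin(np.abs(freqs - freq))]


# Synthetic demo: a 10 Hz "signal of interest" plus 60 Hz line noise
fs = 500.0                          # assumed sampling rate in Hz
t = np.arange(0, 20, 1.0 / fs)      # 20 s of continuous data
noisy = np.sin(2 * np.pi * 10 * t) + 0.5 * np.sin(2 * np.pi * 60 * t)
filtered = notch_filter(noisy, fs)

print("60 Hz amplitude before:", amplitude_at(noisy, fs, 60.0))
print("60 Hz amplitude after: ", amplitude_at(filtered, fs, 60.0))
```

The same function can be applied to a (channels, samples) array of 
continuous data before the data are passed to ICA.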

> Also, why is it recommended to run ica on shorter epochs and then apply
> weights to longer epochs - is it simply to reduce computation time?
> Because, it seems that with such approach artifacts that may be present
> in longer epochs only (e.g. eye blinks) will not be removed.

You need to provide ICA with enough (stable) data to estimate the 
coefficients of the weight matrix; ideally, as much as possible.
A proposed rule of thumb is to use at least C * N * N data points, where N 
is the number of channels and C is an integer multiplier. Try C = 3. 
For large N, however, C may need to be increased.
Of course, artifacts not present in the training data will not be 
identified. For the longer data, ICA is likely to be effective only for 
the artifacts that were identically present in the training data 
(identical in the sense of comparable signal-statistical properties).
One case where it is preferable to avoid overly long epochs, besides 
saving computation time, is when data properties change radically over 
time, such that the number of stationary independent processes generating 
the data is likely to exceed N. If this happens, there would again be too 
many independent processes in the data for ICA to model. Remember that ICA 
is constrained by design to output at most N independent sources.
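
To make the C * N * N rule of thumb above concrete, here is a small 
Python sketch. The function names, the 64-channel montage and the 500 Hz 
sampling rate are assumptions chosen only for illustration.

```python
def min_ica_samples(n_channels, c=3):
    """Minimum number of data points suggested by the C * N * N rule."""
    return c * n_channels ** 2


def enough_data(n_samples, n_channels, c=3):
    """True if n_samples meets the rule of thumb for n_channels."""
    return n_samples >= min_ica_samples(n_channels, c)


n_channels = 64      # assumed montage size
fs = 500.0           # assumed sampling rate in Hz

needed = min_ica_samples(n_channels)      # 3 * 64 * 64 = 12288 points
print(needed, "points, i.e.", needed / fs, "s of data at", fs, "Hz")
```

With these assumed values the rule asks for at least 12288 data points, 
or about 24.6 s of data. Because the requirement grows with N squared, 
doubling the number of channels quadruples the amount of data needed, 
even before C is increased for large N.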

Hope this helps.
Luca

________________________________________________
 Luca A. Finelli, Ph.D.

 Computational Neurobiology Laboratory
 The Salk Institute

 Swartz Center for Computational Neuroscience
 UCSD

 La Jolla CA 92037 - USA
 ________________________________________________