[Eeglablist] ICA (number of data points) for artifact rejection
Arnaud Delorme
arno at ucsd.edu
Tue Jun 2 10:27:12 PDT 2009
Dear Eric,
Your decomposition might give you noise, but maybe not. The question
of the number of data points is actually empirical and depends on the
ICA algorithm you use and the quality of your data. What is certain is
that if you have 256 channels and do not perform dimension reduction,
you will be optimizing an ICA weight matrix of 256 x 256 values/
parameters, so you need at least that many data points. We found that
you might actually need many more than that, but have not studied
exactly how many (as you mention below, our current rule of thumb is
20 times 256^2). In general, I have observed that the more data points
you have, the more stable the ICA components are. Interestingly, the most
dipolar components (the most biological ones) also become more stable.
My interpretation is that if you add more data, ICA will not put
much weight on transient artifacts but rather on what is stable and
reliable throughout the whole recording. If the data segment is short,
transient artifacts might dominate the ICA decomposition.
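As a quick back-of-the-envelope check of that rule of thumb (my sketch, not part of EEGLAB; the 500 Hz sampling rate is taken from Eric's description below):

```python
# Turn the "20 x channels^2" heuristic into a required recording length.
n_channels = 256
k = 20          # heuristic multiplier mentioned above
srate = 500     # Hz, assumed sampling rate (matches Eric's setup)

min_samples = k * n_channels ** 2        # minimum data points for full-rank ICA
min_minutes = min_samples / srate / 60   # corresponding recording length

print(f"{min_samples} samples ~= {min_minutes:.1f} minutes at {srate} Hz")
```

This reproduces the ~44 minutes Eric computes below; 8 minutes of data falls well short of it at the full 256-channel rank.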
If you are just interested in artifact removal, you might want to
reduce the dimensionality of the data. I would use 50 in your case for
the sake of speed/efficiency. You do not want to use too few PCA
components, because some data will actually be missing after
dimensionality reduction (if you use PCA to reduce from 256 to 50
dimensions by keeping only the first 50 PCA components). If you wanted to
look at the actual components and did not have enough data, I would use
PCA 150. If you have enough data (and patience), I would do the full
decomposition.
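To make the PCA step concrete, here is a minimal NumPy sketch of what the reduction amounts to (my illustration, not EEGLAB code; in EEGLAB the same thing is done internally when you pass a 'pca' option to runica). The data matrix and dimensions are placeholders:

```python
# Sketch: keep the top 50 principal components of a (channels x samples)
# matrix, so ICA runs on a 50-dimensional signal instead of 256.
import numpy as np

rng = np.random.default_rng(0)
data = rng.standard_normal((256, 10_000))   # placeholder (channels x samples) EEG

# Center each channel, then get principal directions via SVD.
data = data - data.mean(axis=1, keepdims=True)
U, s, Vt = np.linalg.svd(data, full_matrices=False)

n_keep = 50
reduced = U[:, :n_keep].T @ data   # (50 x samples): input for ICA
back_proj = U[:, :n_keep]          # maps results back into 256-channel space

print(reduced.shape)
```

Whatever is outside the first 50 principal components is discarded before ICA ever sees it, which is why choosing too few components loses data.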
Of course, other people might come up with different values.
Hope this helps,
Arno
On May 21, 2009, at 18:55, Eric Landsness wrote:
> In previous eeglablist emails and the literature the number of
> datapoints needed for ICA decomposition has been discussed. See
>
> http://sccn.ucsd.edu/pipermail/eeglablist/2008/002384.html
> http://sccn.ucsd.edu/pipermail/eeglablist/2006/001568.html
>
> My dataset is 8 min of continuous data with 256 channels at 500 Hz
> per subject which is way below the suggested sample size (256^2 * 20
> = 44 minutes).
>
> I am using ICA purely for artifact removal (eye and EMG removal) and
> was wondering: what is the harm in having too little data? How much
> risk do I run of removing "real" data?
>
> I understand that my data will be overlearned (Sarela and Vigario
> 2003), but I feel that the eye movements and EMG are being cleanly
> decomposed into a few components that, when removed, significantly
> improve my data set. Is this a problem? I am not using the
> decomposition as a training set for other data sets; I run a new
> ICA decomposition for each subject.
>
> Thanks for the comments and sorry to bring up this issue again to
> the email list.
> Eric
> _______________________________________________
> Eeglablist page: http://sccn.ucsd.edu/eeglab/eeglabmail.html
> To unsubscribe, send an empty email to eeglablist-unsubscribe at sccn.ucsd.edu
> For digest mode, send an email with the subject "set digest mime" to eeglablist-request at sccn.ucsd.edu