[Eeglablist] Slow ICA
Nike gnanateja
nikegnanateja at gmail.com
Thu Jun 8 12:11:23 PDT 2017
Dear Yair,
I too have closely followed the thread on ICA run times and have also
experimented myself with various settings. Following the discussion thread
helped me a great deal in understanding ICA.
Here are a few points that have been suggested as most important for ICA
run times:
1. 1 Hz highpass filter: This is one of the most important settings, as ICA
needs the data to be stationary.
2. Bad epochs and channels: High amplitude bursts seem to slow down ICA, so
it is important that these are rejected before running ICA.
3. Full rank data: The data should be full rank, or 'pca' should be limited
to a number equal to or lower than the rank of the data.
4. Low pass filtering: Filtering out the higher frequencies which are not of
interest also drastically improved ICA run times in my datasets. I think
this reduces ICA's effort to explain the extra variance in the high
frequencies.
5. Data size: The data can be large because of a high sampling rate, a
long recording, a large number of channels, or any combination of the
three. Downsampling the data helps a great deal. I don't think that there
is anything we can do about the duration of the signals. But if the data
are very short and the number of channels is high, this seems to be a
problem, and the dimensionality of the data has to be reduced using
'pca'. Makoto gives this rule of thumb for the number of data points
needed: channels^2 x k, with k = 20~30 for 30 channels when the sampling
rate is 250 Hz. The constant k should increase exponentially as the number
of channels increases (see slide number 13 in
https://sccn.ucsd.edu/mediawiki/images/7/74/IcaDecompositionOfEegData4.pdf).
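To make the rule of thumb concrete, here is a small Python sketch of the arithmetic only. The function names are my own (hypothetical, not part of EEGLAB), and k is simply whichever constant you pick from the 20~30 range quoted for 30 channels:

```python
def min_samples_for_ica(n_channels, k=20):
    """Rule-of-thumb minimum number of data points: channels^2 x k."""
    return n_channels ** 2 * k


def min_seconds_for_ica(n_channels, srate_hz, k=20):
    """The same minimum expressed as recording time at a given sampling rate."""
    return min_samples_for_ica(n_channels, k) / srate_hz


# The case quoted from the slide: 30 channels sampled at 250 Hz.
for k in (20, 30):
    n = min_samples_for_ica(30, k)
    secs = min_seconds_for_ica(30, 250, k)
    print(f"k={k}: {n} samples, about {secs:.0f} s of data")
```

For 30 channels this works out to 18000 to 27000 samples, i.e. roughly 72 to 108 seconds at 250 Hz. With more channels k should be larger, so treat the result as a lower bound.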
> We've used ICA in our lab successfully in the past but recently came upon a
> problem - some of our subjects's ICA take much longer than others. while
> some take between 30-40 minutes, these may last up to 50+ hours.
> I couldn't find a similar pattern in these subjects - some of them begin
> with fixing rank computation inconsistency (64 to X), some start by
> lowering learning rate and some begin training steps straight away.
>
I suppose you might be using channel interpolation in your pre-processing
steps. I think you should hold off on channel interpolation until after
running ICA. And try to use the same pre-processing steps for all the
subjects.
> Measuring was done using 64 electrodes + 5 external. Pre-process included
> highpass over 0.01, and epochs were divided to [-1.2 1.8] sec windows.
> Data was saved in double precision (.set + .fdt files)
>
Were these five external channels recorded with the same reference as the
other 64 channels? If not, then they should be excluded from ICA, or they
have to be converted somehow so that they have the same reference as the
other 64.
Your high pass filter appears too low for ICA. You should observe
significantly improved ICA run times with a high pass of 1 Hz.
Your epochs seem to be fine. You just need enough epochs to satisfy
Makoto's rule of thumb for data size.
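As a rough way to check whether you have enough epochs, the same rule can be turned into an epoch count. This is only a sketch: `epochs_needed` is a name I made up, the 250 Hz sampling rate is an assumption (you did not state yours), and k = 20 is the low end of the range quoted for 30 channels, so the real requirement for 64 channels is probably higher:

```python
import math


def epochs_needed(n_channels, epoch_start_s, epoch_end_s, srate_hz, k=20):
    """Number of epochs needed to reach channels^2 x k data points."""
    # Samples contributed by one epoch of the given window length.
    samples_per_epoch = round((epoch_end_s - epoch_start_s) * srate_hz)
    required_samples = n_channels ** 2 * k
    return math.ceil(required_samples / samples_per_epoch)


# 64 channels, [-1.2 1.8] s epochs; 250 Hz is an ASSUMED sampling rate.
print(epochs_needed(64, -1.2, 1.8, 250, k=20))
```

Under those assumptions you would want at least 110 clean epochs left after rejection.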
> I tried running on different computers, different versions of MATLAB, and
> different methods (runica & binica) but still no change in speed.
> One other thing, after pre-processing i noticed that when checking for th
> rank (using rank(EEG.data(:,:)) the value is between 1-31, should it be 64?
> Reducing PCA to 63 ('pca',63) dimensions also didn't help.
>
A rank of 1 seems a little too extreme; I don't know how it can become that
low. The rank should be equal to the number of channels, unless you have done
channel interpolation or any kind of re-referencing. The 'pca' value cannot be
any greater than the rank of the data. That is why 'pca' of 63, which is above
the rank of 1-31 you report, is not improving the run times.
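To illustrate why interpolation and re-referencing lower the rank, here is a toy sketch in plain Python. It is not EEGLAB code: `matrix_rank` is a helper written just for this example, and the 4-channel x 6-sample "data" matrix is made up. On real data you would simply use the rank(EEG.data(:,:)) check you already ran:

```python
def matrix_rank(rows, tol=1e-9):
    """Rank of a small dense matrix via Gaussian elimination with pivoting."""
    m = [list(map(float, r)) for r in rows]
    n_rows, n_cols = len(m), len(m[0])
    rank = 0
    for col in range(n_cols):
        if rank == n_rows:
            break
        # Use the remaining row with the largest entry in this column as pivot.
        pivot = max(range(rank, n_rows), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < tol:
            continue  # no usable pivot in this column
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, n_rows):
            factor = m[r][col] / m[rank][col]
            for c in range(col, n_cols):
                m[r][c] -= factor * m[rank][c]
        rank += 1
    return rank


# Made-up data: 4 channels (rows) x 6 samples (columns), full rank.
data = [
    [1, 0, 0, 0, 2, 1],
    [0, 1, 0, 0, 1, 3],
    [0, 0, 1, 0, 4, 1],
    [0, 0, 0, 1, 1, 2],
]

# "Interpolate" channel 4 as the mean of channels 1-3: it adds no new rank.
interp_row = [sum(ch[i] for ch in data[:3]) / 3 for i in range(6)]
interpolated = data[:3] + [interp_row]

# Average reference: subtract the mean across channels at every sample.
col_means = [sum(ch[i] for ch in data) / 4 for i in range(6)]
avg_ref = [[ch[i] - col_means[i] for i in range(6)] for ch in data]

print(matrix_rank(data), matrix_rank(interpolated), matrix_rank(avg_ref))
```

In this toy case the original data have rank 4, while both the interpolated and the average-referenced versions drop to rank 3. The value you pass as 'pca' should not exceed whatever rank your own data report.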
Before, when I didn't follow these steps, ICA on my 25 minute long, 64 channel
data used to take around 1 hour.
Now, after following all these steps (I just ran an ICA), it takes only
around 200-250 SECONDS to run ICA simultaneously on five such datasets
using a parallel for loop (parfor). Arno suggested using a parallel 'for'
loop to decrease ICA run times when you have multiple datasets, and it
really helps.
I hope this helps. Do report back on which steps worked and which
didn't.
Best Wishes,
Nike
--
G Nike Gnanateja
Ph.D Candidate,
Department of Audiology,
All India Institute of Speech
and Hearing Mysore-06