[Eeglablist] How to correctly break down AR runica() in case of huge sets.

Jason Palmer japalmer29 at gmail.com
Mon Dec 13 13:36:40 PST 2010

Hi Mahesh,


Merging the results by simple averaging probably won’t work, since the
components are returned in arbitrary order (even after the variance sorting,
corresponding components won’t necessarily have the same index). Using
matcorr() or a similar component-matching algorithm before averaging is one
possibility.
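
As a rough illustration of the matching step, here is a minimal sketch in
Python/NumPy (not EEGLAB code, and not the actual matcorr() algorithm). It
assumes two decompositions with the same number of components, each given as
a channels-by-components matrix of scalp maps (what EEGLAB stores in
EEG.icawinv), and pairs them greedily by absolute correlation. The function
name match_components and the synthetic data are made up for the example.

```python
import numpy as np

def match_components(winv_a, winv_b):
    """Pair each column of winv_a with one column of winv_b by |correlation|.

    Both inputs are (channels x components) scalp-map matrices with the same
    number of components. Returns (order, signs) such that
    winv_b[:, order] * signs lines up with the columns of winv_a.
    """
    # Center and normalize each map so that a dot product is a correlation.
    a = winv_a - winv_a.mean(axis=0)
    b = winv_b - winv_b.mean(axis=0)
    a = a / np.linalg.norm(a, axis=0)
    b = b / np.linalg.norm(b, axis=0)
    corr = a.T @ b                      # corr[i, j]: map i of A vs map j of B

    # Greedy assignment: repeatedly take the strongest remaining pair.
    remaining = np.abs(corr)
    n = corr.shape[0]
    order = np.zeros(n, dtype=int)
    for _ in range(n):
        i, j = np.unravel_index(np.argmax(remaining), remaining.shape)
        order[i] = j
        remaining[i, :] = -1.0          # row i and column j are now used up
        remaining[:, j] = -1.0
    signs = np.sign(corr[np.arange(n), order])  # undo arbitrary polarity
    return order, signs

# Synthetic check: B is A with shuffled, partly sign-flipped, slightly noisy maps.
rng = np.random.default_rng(0)
maps_a = rng.standard_normal((8, 4))            # 8 channels, 4 components
perm = np.array([2, 0, 3, 1])
flip = np.array([1.0, -1.0, 1.0, -1.0])
maps_b = maps_a[:, perm] * flip + 0.01 * rng.standard_normal((8, 4))

order, signs = match_components(maps_a, maps_b)
aligned_b = maps_b[:, order] * signs            # now comparable column by column
```

The sign correction matters because ICA leaves the polarity of each component
undetermined as well as its order; averaging unaligned maps would partly
cancel matching components.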


But it seems to me that averaging will not improve anything in your
situation. As long as you have enough data in each data block that ICA runs
on, then the components you get should be well determined, allowing you to
remove the artifacts separately, and use the separate unmixing matrices to
decompose the different subsets.


I’m not sure what kind of analysis you’re doing, but for many purposes, you
want to identify brain components of interest and then analyze the
activations and possibly localize them. In this case you only need to match
up the components of interest in the separate decompositions, e.g. a frontal
midline ERN component, and collect all the trials with the activations
produced by the respective ICA unmixing matrices.


Again, as long as you use as much data as you can load (possibly overlapping
data blocks), the decompositions should be good by themselves. Comparing the
components of interest across decompositions will give you an idea of how
stable the components you’re looking at really are in your dataset. You
might also look into characterizing the variance of the component maps in a
bootstrapping sense, using a large number of resampled blocks.
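
One way to put a number on that stability, sketched in Python/NumPy under
the assumption that the decompositions of the (possibly resampled) blocks
have already been computed: for a reference scalp map of the component of
interest, take the best absolute correlation it reaches in each other
decomposition. The helper names and the synthetic maps below are
hypothetical; running ICA on the blocks is not shown.

```python
import numpy as np

def best_match_corr(ref_map, winv):
    """Largest |correlation| between ref_map and any column (map) of winv."""
    ref = ref_map - ref_map.mean()
    cols = winv - winv.mean(axis=0)
    corr = (ref @ cols) / (np.linalg.norm(ref) * np.linalg.norm(cols, axis=0))
    return float(np.max(np.abs(corr)))

def stability(ref_map, winv_list):
    """Best-match |correlation| of ref_map in each decomposition."""
    return np.array([best_match_corr(ref_map, w) for w in winv_list])

# Synthetic stand-in for 20 block decompositions (16 channels, 5 components):
# each contains a noisy copy of the reference map at a random index and sign.
rng = np.random.default_rng(1)
ref_map = rng.standard_normal(16)
winv_list = []
for _ in range(20):
    winv = rng.standard_normal((16, 5))
    col = rng.integers(5)
    sign = rng.choice([-1.0, 1.0])
    winv[:, col] = sign * ref_map + 0.05 * rng.standard_normal(16)
    winv_list.append(winv)

scores = stability(ref_map, winv_list)
print("min / mean best-match correlation:", scores.min(), scores.mean())
```

A component whose scores stay close to 1 across blocks is well determined; a
wide spread suggests it is not reliably separated with that amount of data.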


It would also be possible to modify the ICA algorithm to swap out data from
the disk, but as I said, I doubt using all the data would improve the
results over using as much data as you can load into memory. To me it makes
more sense to verify the stability of the components you’re interested in,
and use the separate ICA unmixing/sphere matrices on their corresponding
data blocks, and separately back-project the components of interest, and
then collect all the trials for the final analysis.
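
The per-block procedure can be sketched as follows, again in Python/NumPy
rather than EEGLAB's MATLAB. It assumes each block is a 2-D
channels-by-samples array (epoched data reshaped accordingly) with its own
weights and sphere matrices and its own list of rejected component indices,
and it follows the EEGLAB convention that activations are
weights * sphere * data and the mixing matrix is the pseudo-inverse of
weights * sphere. The function name clean_block and the test data are made
up for the example.

```python
import numpy as np

def clean_block(data, weights, sphere, reject):
    """Remove the components in `reject` from one block and back-project.

    data: (channels x samples); weights, sphere: that block's own ICA matrices.
    """
    unmix = weights @ sphere                 # full unmixing matrix
    winv = np.linalg.pinv(unmix)             # mixing matrix (scalp maps)
    act = unmix @ data                       # component activations
    keep = np.setdiff1d(np.arange(unmix.shape[0]), reject)
    return winv[:, keep] @ act[keep, :]      # back-project kept components only

# Two synthetic blocks, each with its own (orthogonal) mixing matrix.
rng = np.random.default_rng(2)
n_chan, n_samp = 4, 200
blocks, expected = [], []
for b in range(2):
    mixing, _r = np.linalg.qr(rng.standard_normal((n_chan, n_chan)))
    sources = rng.standard_normal((n_chan, n_samp))
    data = mixing @ sources
    # The inverse of an orthogonal matrix is its transpose; sphere is identity here.
    blocks.append((data, mixing.T, np.eye(n_chan), [0]))
    expected.append(mixing[:, 1:] @ sources[1:, :])   # data without source 0

# Clean each block with its own matrices, then pool the cleaned data.
cleaned = np.concatenate([clean_block(*blk) for blk in blocks], axis=1)
expected = np.concatenate(expected, axis=1)
```

Each block is cleaned with its own matrices, so nothing has to be averaged
across decompositions; only the cleaned data (or the activations of the
matched components of interest) are pooled at the end.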


Hope this is useful.






From: eeglablist-bounces at sccn.ucsd.edu
[mailto:eeglablist-bounces at sccn.ucsd.edu] On Behalf Of Mahesh Casiraghi
Sent: Saturday, December 11, 2010 6:34 PM
To: eeglablist at sccn.ucsd.edu
Subject: [Eeglablist] How to correctly break down AR runica() in case of
huge sets.


Dear more experienced EEGLabbers and ICA experts,



supposing one has to work with quite large datasets (several channels, very
high sample rate, long record lengths) and would therefore be unable to load
several gigs of data into memory all at once:


A) Is it methodologically problematic to run independent ICAs on subgroups
of trials and then separately perform AR (blinks and scalp detected ECG
components rejection) on each of them?


B) Assuming it is not, as I tend to think, a recommendable approach, is
there a methodologically sound way to combine all the obtained - and
presumably heterogeneous - sphere, weights and weights(-1) matrices into 3
single Sph, W, and W(-1) matrices and then use these new ones to backproject
after component rejection?


C) More precisely, let's suppose we have 700 trials and we run 7 independent
ICAs each time on 100 of them. 


a) I would separately pick out (subjective criteria, ADJUST, FASTER or
whatever one may prefer) the to-be-rejected components, independently for
each subgroup of trials.

b) I would then remove subgroup by subgroup the respective w(-1) columns and
EEG.icaact rows according to the discarded components.

c) I would merge the obtained 7 EEG.icasphere, the 7 EEG.icaweights, and the
7 EEG.icawinv, in 3 single matrices of equal dimensions, averaging through
nanmean (given the fact we are likely to pick up a different amount of
components from each of the trial subgroups and we would need consistent
matrix dimensions).  

d) I would finally independently backproject subgroup by subgroup using the
same averaged EEG.icawinv and EEG.icasphere and each time the EEG.icaact of
the current subgroup of trials.


According to my first speculations, following a->b->c->d we should come up
with something analogous to the output of a big global ICA.


Am I wrong?


D) Has anyone among you already tried to run something like that, and would
you perhaps be willing to share some feedback or impressions?










Mahesh M. Casiraghi

PhD candidate - Cognitive Sciences

Roberto Dell'Acqua Lab, University of Padova

Pierre Jolicoeur Lab, Université de Montréal

mahesh.casiraghi at umontreal.ca


I have the conviction that when Physiology will be far enough advanced, the
poet, the philosopher, and the physiologist will all understand each other.

Claude Bernard

