# [Eeglablist] ICA problem - using PCA to reduce number of 'channels' for ICA

Arnaud Delorme arno at ucsd.edu
Sat Jun 12 08:43:09 PDT 2010

```
Dear Keith,

we have tried both (removing channels or using PCA) and both work well. PCA 64 is fine. You may just use 'pca', 64 when calling the "Run ICA" menu. Concerning the size of the weight and sphere matrices: if you use 64 PCA dimensions, your sphere matrix is going to be 133x133 and the weight matrix is going to be 64x133, so this is fine (since ICA_activities = weights x sphere x data). You might wonder why we have two matrices (sphere and weights) instead of a single (weights x sphere) matrix, and to tell the truth, I wonder about that sometimes as well because this confuses users and, in the end, very rarely does anybody look at the sphere and weights matrices separately. But this is the way we originally implemented it and we have stuck with this original implementation for now.

Importantly, always remember that using PCA to decrease the number of dimensions is not a good idea and should be avoided when possible. The reason is that when you apply PCA, you get 133 components and you decide to only consider the first 64 and ignore the others. Although these first 64 components account for most of the data, the remaining components may account for the linear projection of brain sources onto multiple data channels. Therefore using PCA paradoxically introduces some non-linear noise (paradoxically because PCA is a linear algorithm). In practice this does not seem to be critical, but theoretically removing channels is sometimes preferable to using PCA.

Best regards,

A. Delorme

On Jun 9, 2010, at 10:10 PM, Keith Yoder wrote:

> I have a related question:
>
> We are recording using 128-channel BioSemi cap, plus four ocular leads and two mastoid leads.  All of the data are referenced to the right mastoid (leaving EEG.data with 133 rows).  We would like to use ICA, but have found that 133 components is too many (e.g. a CNV which is clear in the electrodes does not fall out as a component, but can be isolated with PCA using 5 components).
>
> Have other users had success using PCA to pre-process the data before passing them to ICA, so as to reduce the number of components identified by the ICA algorithm?
>
> A possible solution that I have been trying to implement would use PCA to identify a more reasonable number of components (e.g. 64) to then pass to ICA (all using runica):
>
> % after using runica with " ,'pca',64 " to define 64 PCAs
> % manually prepare component activations for ICA as if they were
> % data from EEG.data (as laid out in pop_runica)
> >> tmpdata = reshape(EEG.icaact(:,:,:),64,EEG.pnts*EEG.trials);
> >> tmpdata = tmpdata - repmat(mean(tmpdata,2), [1 size(tmpdata,2)]);
> >> [EEG.icaweights, EEG.icasphere] = runica(tmpdata, 'lrate', 0.001, 'interupt','on');
>
> However, this call to runica returns the component weights and sphering for the ICA decomposition of the PCA-defined components, rather than ICA-defined weights and sphering for 64 ICs for our 133 channels. Has anyone had success with a technique along similar lines?
>
> Alternatively, I could simply remove electrodes from EEG.data until I am left with 64 channels, but I am wary of throwing out data entirely.
>
> We are running EEGLAB 7.2.9.20b in 64-bit Matlab 7.10.0 (R2010a) in Mac OS X 10.5.7.
>
> Many thanks,
> Keith
> --
> Keith Yoder
> Research Aide
> Laboratory for the Neuroscience of Autism
> Cornell University
> Ithaca, NY
> 574.215.9678
>
>
> On Mon, Feb 1, 2010 at 5:26 PM, Scott Makeig <smakeig at gmail.com> wrote:
> Joe is correct that ICA will not converge if the rank of the data matrix is less than the number of channels. The runica/binica algorithms are supposed to test the rank of the input data. If two channels are identical, or if some subset of n channels are otherwise interdependent, then the rank will be less than the number of channels and PCA reduction should be applied to remove the redundancy and allow the ICA decomposition to converge.
>
> Arno -- There was a problem with the Matlab rank() function on 64-bit machines, I believe. Has this been solved, and is the auto rank detection -> PCA option currently implemented in runica/binica?  Perhaps we could add a 'toy' rank() function pre-test (e.g. finding the rank of a small full-rank matrix to detect if rank() is working...)? If so, run the rank test; if not, then warn the user or build a work-around rank function that will work properly?
>
> Scott
>
>
> On Mon, Feb 1, 2010 at 6:56 AM, Joseph Dien <jdien07 at mac.com> wrote:
> When you say "so long", how long do you mean?  While ICA is not by its nature a fast procedure, certain datasets can take much longer than usual.  For example, I find that if two channels are perfectly correlated (1 or -1) then an ICA run will take much longer.  This can happen if the data is mean mastoid referenced and both channels are explicitly included in the data because they will have a perfect -1 correlation (see Dien, 1998 for reference issues).  It can also happen if a channel is shorted out during acquisition and the reference channel is explicitly included because then they will have a perfect correlation.  Also if two channels are shorted together during the data acquisition they will be perfectly correlated with each other.  My EP Toolkit (https://sourceforge.net/projects/erppcatoolkit/) has code for dealing with these situations so you might want to look into it.  It implements an automated artifact correction routine that relies on EEGlab's runICA code, among other things.
>
> Cheers!
>
> Joe
>
>
>
> On Jan 29, 2010, at 4:49 AM, peng wang wrote:
>
> > Hi there,
> >
> >    I am using ICA to remove blinks via EEGLab. My dataset has 122 channels, and it takes so long to compute 122 components.
> >    (1) So I tried to use the option "ncomps" (say, 24) to reduce the number of components. However, an error message appears after computing: "Matrix dimensions must agree".
> >
> >    (2) Then I tried fastICA instead, as follows:
> >
> > ==================
> >  sz = size(EEG.data);
> >  nchans = sz(1);
> >  npts = sz(2);
> >  ntrials = sz(3);
> >  clear sz;
> >  nICs = 24;
> >  data = reshape(EEG.data,nchans,npts*ntrials);
> >  [ica,V,W] = fastica(data,'numOfIC',nICs,'approach','symm');
> >  EEG.icasphere = eye(nchans);
> >  EEG.icaact = single(reshape(ica,nICs,npts,ntrials));
> >  EEG.icawinv = V;
> >  EEG.icaweights = W;
> >  EEG = eeg_checkset( EEG );
> >  clear V W ica data;
> >
> >  EEG = pop_saveset( EEG, 'filename','test_raw_ica');
> > ==================
> >
> >    Everything seems fine. But when I reject the blink component via the EEGLAB GUI and load the data again, something strange happens. It seems the amplitude of EEG.data becomes much smaller, in about the -1~1 range.  Thus I wonder whether there was some normalization behind this, and how I can correct it. The problem does not occur if I set the number of components equal to the number of channels, even in fastICA (e.g. change to "nICs = nchans" in the above code).
> >
> >    Thank you for your help.
> >
> > best
> > Peng
> > _______________________________________________
> > Eeglablist page: http://sccn.ucsd.edu/eeglab/eeglabmail.html
> > To unsubscribe, send an empty email to eeglablist-unsubscribe at sccn.ucsd.edu
> > For digest mode, send an email with the subject "set digest mime" to eeglablist-request at sccn.ucsd.edu
>
>
> --------------------------------------------------------------------------------
>
> Joseph Dien,
> Senior Research Scientist
> University of Maryland
> 7005 52nd Avenue
> College Park, MD 20742-0025
>
> E-mail: jdien07 at mac.com
> Phone: 301-226-8848
> Fax: 301-226-8811
> http://homepage.mac.com/jdien07/
>
> --
> Scott Makeig, Research Scientist and Director, Swartz Center for Computational Neuroscience, Institute for Neural Computation, University of California San Diego, La Jolla CA 92093-0961, http://sccn.ucsd.edu/~scott

```
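The matrix sizes quoted in the reply (sphere 133x133, weights 64x133, ICA_activities = weights x sphere x data) can be checked with a small sketch. It is not an actual ICA run: the data and the weight matrix are random stand-ins, and `npnts` is a hypothetical number of samples chosen for illustration. It also shows why back-projecting from 64 components to 133 channels needs a pseudo-inverse, which, as far as I know, is how EEGLAB fills `EEG.icawinv`.

```matlab
% Sketch only: random stand-ins, not a real recording or a real ICA result.
nchans = 133;    % rows of EEG.data in the thread
ncomps = 64;     % number of PCA dimensions kept ('pca', 64)
npnts  = 1000;   % hypothetical number of time points

data    = randn(nchans, npnts);    % stand-in for EEG.data reshaped to 2-D
sphere  = eye(nchans);             % 133 x 133, the size given in the reply
weights = randn(ncomps, nchans);   % 64 x 133 stand-in for the ICA weights

% ICA_activities = weights x sphere x data
icaact = (weights * sphere) * data;    % 64 x 1000

% The unmixing matrix is rectangular (64 x 133), so it has no ordinary
% inverse; the pseudo-inverse maps component activity back to channels.
winv     = pinv(weights * sphere);     % 133 x 64
backproj = winv * icaact;              % 133 x 1000

disp(size(icaact));
disp(size(winv));
disp(size(backproj));
```

Because only 64 components are kept, `backproj` has the same size as `data` but spans at most 64 dimensions. Whatever lies outside that subspace is lost, which is the cost of PCA reduction that the reply warns about.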