[Eeglablist] value of PCA pre-processing before running ICA on EEG data?

arno arno at salk.edu
Thu Aug 3 12:41:45 PDT 2006

Dear Anish,
> [1] How do I know how many dimensions to reduce the data to?  So far, I have 
> been choosing to keep just enough principal components such that ~ 99% of 
> the variance is retained (but I only picked that value arbitrarily), which 
> usually halves the dimension of the data.
There is no accepted rule. If you have 'n' sample points, you should use no 
more than sqrt(n) "channels" (so if you have more than sqrt(n) channels 
in your data, you use PCA to reduce the dimensionality). This is because 
there are number_channel^2 values in the weight matrix, so you need at 
least one value in the data (one time frame) per value in the matrix. In 
our experience, it is good to have number_channel<sqrt(n/20), that is, at 
least 20 time frames per value in the weight matrix.
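
To make the arithmetic above concrete, here is a small Python sketch (not EEGLAB code). The function name max_ica_dims, the parameter frames_per_weight, and the example figure of 128000 time frames are assumptions chosen for illustration only; the factor of 20 is the rule of thumb mentioned above, not a hard limit.

```python
import math


def max_ica_dims(n_frames: int, frames_per_weight: int = 1) -> int:
    """Largest channel count c with frames_per_weight * c**2 <= n_frames.

    frames_per_weight=1 gives the minimal bound c <= sqrt(n);
    frames_per_weight=20 gives the more conservative c <= sqrt(n/20).
    """
    if n_frames < 0 or frames_per_weight < 1:
        raise ValueError("n_frames must be >= 0 and frames_per_weight >= 1")
    # c**2 is an integer, so flooring n_frames / frames_per_weight is exact.
    return math.isqrt(n_frames // frames_per_weight)


# Hypothetical recording with 128000 time frames.
n_frames = 128000
minimal_bound = max_ica_dims(n_frames)            # sqrt(n) bound
conservative_bound = max_ica_dims(n_frames, 20)   # sqrt(n/20) bound

for n_channels in (64, 128):
    if n_channels > conservative_bound:
        print(f"{n_channels} channels: reduce to at most "
              f"{conservative_bound} dimensions with PCA")
    else:
        print(f"{n_channels} channels: no PCA reduction needed")
```

With these assumed numbers, a 64-channel recording falls under the conservative bound, while a 128-channel recording would be reduced with PCA before running ICA.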
> [2] Once I reduce the dimensionality of the data with PCA to 'p' 
> uncorrelated components, how many independent components 'c' do I choose to 
> extract?  Should c=p?
Yes, necessarily, when using the runica() algorithm. There is an option to 
runica() that modifies the ICA algorithm to obtain fewer components than 
channels ('ncomps'), but it should not be used (it has returned strange 
results). If someone is interested in investigating this behavior, they 
are very welcome to. For now, just use the 'pca' option.
> [3] Is there any difference between: [a] running ICA and extracting (say) 60 
> components from the original raw data, and [b] first running PCA to reduce 
> the raw data to the largest 60 principal components, and then running ICA 
> and extracting 60 independent components from the pre-processed data?  If 
> there is a difference, which is the more appropriate method?
There is no difference if you use the 'pca' option of the runica() function.
> If anyone can offer any insight into these questions, it would be greatly 
> appreciated.  So far, I have just been picking arbitrary values for 'c' and 
> 'p' (ie. trial and error) and hoping for things to work out.  I am really 
> stumped about question [3] though... I don't know which method is better, or 
> even if it makes a difference which I choose.
See my comments above,

Best regards,

