[Eeglablist] quick EEGLAB ICA question

Wed Jun 8 16:28:49 PDT 2011

EEGLAB's runica does prewhitening by default I think. The whitening matrix
is returned in the EEG.icasphere variable.

Your code looks correct, but there are actually infinitely many whitening
matrices. In your notation, R*Q would also be a whitening matrix for any
orthogonal matrix R. The whitening matrix EEGLAB uses is actually your Ux*Q.
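
A minimal sketch of this point, assuming synthetic data and a made-up mixing matrix A (neither is from the thread). It builds Q with the same recipe as the quoted code, then checks that Q, R*Q for a random orthogonal R, and the symmetric choice Ux*Q all give an identity covariance:

```matlab
% Sketch: several different matrices whiten the same data.
% A, m and N are illustrative assumptions, not values from the thread.
m = 4;  N = 5000;                        % channels x samples
A = [1 .5 0 0; 0 1 .5 0; 0 0 1 .5; .2 0 0 1];   % hypothetical mixing matrix
x = A * randn(m, N);                     % correlated "channel" data
x = x - mean(x, 2) * ones(1, N);         % remove each channel's mean

Rxx = (x * x') / N;                      % sample covariance
[Ux, Dx, Vx] = svd(Rxx);                 % Rxx = Ux*Dx*Ux' (symmetric case)
Q = diag(1 ./ sqrt(diag(Dx))) * Ux';     % whitening matrix from the quoted code

[R, T] = qr(randn(m));                   % R is a random orthogonal matrix
W1 = Q;                                  % the quoted whitening matrix
W2 = R * Q;                              % rotated version, still whitens
W3 = Ux * Q;                             % symmetric version (R = Ux)

covOf = @(W) (W * x) * (W * x)' / N;     % covariance of the transformed data
err1 = norm(covOf(W1) - eye(m));
err2 = norm(covOf(W2) - eye(m));
err3 = norm(covOf(W3) - eye(m));
fprintf('deviation from identity: %.2e  %.2e  %.2e\n', err1, err2, err3);
```

All three deviations should be at the level of floating-point round-off. The matrices differ only by a rotation, which ICA then has to estimate anyway.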

Jason

-----Original Message-----
From: jia gu [mailto:jia.gu12345 at gmail.com] 
Sent: Wednesday, June 08, 2011 2:56 PM
To: japalmer at ucsd.edu; eeglablist at sccn.ucsd.edu
Subject: Re: quick EEGLAB ICA question

Thank you very much!!! :)  Sorry, but 1 more question ^.^ Do I need to
pre-whiten the data before inputting it into EEGLAB's ICA? The EEGLAB
tutorial didn't mention prewhitening, so I'm not sure if this will
help or hurt the performance...
The code I'm thinking of using for pre-whitening is:
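[m, N] = size(x);                  % x is the data, channels x samples
x = x - mean(x, 2) * ones(1, N);   % remove each channel's mean
Rxx = (x * x') / N;                % sample covariance matrix
[Ux, Dx, Vx] = svd(Rxx);           % Rxx = Ux*Dx*Ux' since Rxx is symmetric
Dx = diag(Dx);                     % singular values as a vector
Q = diag(1 ./ sqrt(Dx)) * Ux';     % whitening matrix
xw = Q * x;                        % xw is the whitened data matrix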

thank you kindly for all of your time and help :) cheers, jia

On Wed, Jun 8, 2011 at 2:01 PM, Jason Palmer <japalmer29 at gmail.com> wrote:
> Hi Jia,
>
> You should be able to use all the data ... it doesn't have to be 
> strictly stationary for ICA to be useful. You mainly need the same 
> components to be present (not necessarily continuously) and not too 
> many components. If the experimental setup is similar in all the 
> recorded data, and you high pass filter around 0.5-1Hz (assuming slow 
> waves are not of interest) and make sure then you shouldn't have to worry
> about having "too much data".
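
As a rough sketch of the high-pass step mentioned above, assuming the Signal Processing Toolbox (or Octave's signal package) is available for `butter` and `filtfilt`. The filter order, the 1 Hz cutoff, and the synthetic two-channel data are illustrative choices, and this is not EEGLAB's own filtering routine:

```matlab
% Sketch: zero-phase high-pass filtering of channels x samples data.
% Requires butter/filtfilt (Signal Processing Toolbox or Octave signal pkg).
fs = 256;                                   % sampling rate in Hz (from the thread)
t  = (0:60*fs - 1) / fs;                    % 60 s of synthetic data (assumption)
x  = [5 + sin(2*pi*10*t);                   % channel 1: DC offset + 10 Hz
      3*sin(2*pi*0.05*t) + sin(2*pi*10*t)]; % channel 2: slow drift + 10 Hz

fc = 1;                                     % cutoff in Hz (illustrative)
[b, a] = butter(2, fc / (fs/2), 'high');    % 2nd-order Butterworth high-pass
y = filtfilt(b, a, x')';                    % filtfilt works along columns,
                                            % so transpose in and out

fprintf('channel means before: %.3f %.3f\n', mean(x, 2));
fprintf('channel means after:  %.3f %.3f\n', mean(y, 2));
```

The offset and slow drift are removed while the 10 Hz activity is left essentially unchanged.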
>
> 13 channels should be sufficient to give some useful results. You can 
> probably use standard channel locations ... you might check the eeglab 
> tutorial and info on the MNI brain.
>
> Best,
> Jason
>
> -----Original Message-----
> From: jia gu [mailto:jia.gu12345 at gmail.com]
> Sent: Wednesday, June 08, 2011 10:57 AM
> To: japalmer at ucsd.edu; eeglablist at sccn.ucsd.edu
> Subject: Re: quick EEGLAB ICA question
>
> Dear Jason and others who'd like to help ^.^
>
> 1 more question ^.^ you mentioned that too many EEG sample points will 
> hurt the performance of ICA due to EEG's non-stationarity. I have 13 
> channels of continuous raw signals, at the sampling frequency of 256 
> Hz. Do you think
> 300 sec of EEG data (76800 sample points) are too much for ICA? or 
> should I reduce the window length to 5 sec? or maybe
> 10 sec?
> Also, If I reduce the window length, say to 10 secs, will that create 
> discontinuities for the reconstructed signals at every 10 sec? The 
> goal is to perform algorithms on the continuous signals, so I would 
> like to avoid any reconstructed discontinuities generated from shorter 
> window length, at the same time want to avoid affecting the 
> performance of ICA due to too many sample points. Any suggestions on what
> I should do?
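
For reference, a small worked calculation of the window lengths asked about here, checked against the "25 x channels squared" rule of thumb from the original question in this thread. The rule is only a heuristic lower bound; the variable names are illustrative:

```matlab
% Sketch: sample counts for the window lengths in the question,
% compared with the 25 * nChan^2 rule of thumb quoted in the thread.
nChan  = 13;                         % number of channels (from the thread)
fs     = 256;                        % sampling rate in Hz (from the thread)
minPts = 25 * nChan^2;               % rule-of-thumb minimum: 4225 points
minSec = minPts / fs;                % about 16.5 s at 256 Hz

winSec    = [5 10 300];              % candidate window lengths in seconds
nPts      = winSec * fs;             % 1280, 2560, 76800 sample points
meetsRule = nPts >= minPts;          % only the 300 s window qualifies

for k = 1:numel(winSec)
    fprintf('%4d s -> %6d points (%.1f per nChan^2), meets rule: %d\n', ...
            winSec(k), nPts(k), nPts(k) / nChan^2, meetsRule(k));
end
```

By this heuristic the 5 s and 10 s windows fall short of the minimum, while 300 s exceeds it by a wide margin.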
>
> thank you all very much for your kind help and time :) cheers jia
>
> On Tue, Jun 7, 2011 at 10:27 PM, jia gu <jia.gu12345 at gmail.com> wrote:
>> Dear Jason :)
>>
>> Thank you very much for your prompt and detailed reply :) Could I bug 
>> you with a few more questions ^.^ ?
>> I want to use infomax to remove mainly the following artifacts: EOG, 
>> EMG, heartbeat, and respiration. I have 13 channels, sampling freq 
>> 256 Hz, but I don't know which channel correspond to which 10-20 
>> location >.<
>> (1) Do you think 13 channels will be enough to produce the desired 
>> results? Or do I need more channels, or maybe fewer? What range 
>> of sample points do you think will produce good results (eg:
>> 13x13x20)? I read that EEG is non-stationary, and can be assumed 
>> "quasi-stationary" when the window length is around 1 sec or 2 sec, 
>> and according to your explanation, it's better to limit the window 
>> length to 2 sec to maintain this "quasi-stationarity" , but that 
>> won't allow enough sampling points -> unless I reduce the number of 
>> channels; or increase the sampling freq to 1KHz -> But, the freq of 
>> EEG is low, and 250Hz should be enough to recover the original signal 
>> according to Shannon's sampling theory, so does that imply 1KHz 
>> sampling freq is just too much/redundant? Or is that a good option 
>> for ICA in this case?
>>
>> (2) I realized that EEGLAB does independent component rejection only 
>> if I input channel location info, which I don't have :( .... Besides 
>> rejecting the IC component by eye using "plot -> component
>> activation(scroll) -> reject component by eye", are there any 
>> automatic algorithms you know of that work well with the w and w' 
>> matrices EEGLAB produces? I saw one website that developed some 
>> algorithms to do that, but not sure if it's good:
>> http://www.cs.tut.fi/~gomezher/projects/eeg/aar.htm
>> thank you very much for your time and help :) cheers jia
>>
>> On Tue, Jun 7, 2011 at 10:16 AM, Jason Palmer <japalmer29 at gmail.com> wrote:
>>> Hi Jia,
>>>
>>> There is some ambiguity in the case of non-stationary environments.
>>> Generally for statistical estimation, more data means better 
>>> estimates (of components, activations, etc.), so you want as much 
>>> data as possible.
>>> However, if the data is non-stationary, then you have different data 
>>> points generated by different statistical systems, and combining the 
>>> data will generally degrade the estimation of either system. So you 
>>> really want as much data as possible generated from the statistical
>>> system of interest.
>>>
>>> EEG data is usually preprocessed with a high-pass filter (remove 
>>> mean and low frequency drift) to increase the stationarity. We are 
>>> essentially trying to remove unimportant sources of non-stationarity 
>>> to sample only from a supposedly long-term stationary system of brain 
>>> and other biological sources.
>>>
>>> Another issue is the number of sources that are present. Basic ICA 
>>> assumes that the number of sources is less than the number of 
>>> sensors/channels. So if you record from longer periods of time, you 
>>> are likely to have more “artifactual”, transient type sources show 
>>> up, which will force ICA to compromise the independence of the 
>>> estimated components.
>>>
>>> If we assume that the same components are present at different 
>>> times, and that we have filtered and removed artifacts sufficiently 
>>> to make the data consist of fewer sources than sensors, then 
>>> generally the more data the better.
>>>
>>> We might also assume that the same components are present in 
>>> different subjects. Again it will be important to try to preprocess 
>>> out unimportant differences (sources of nonstationarity) and try to 
>>> be sure that the number of sources is less than the number of data
>>> dimensions used.
>>>
>>> Hope that’s helpful.
>>>
>>> Best,
>>> Jason
>>>
>>>
>>>
>>> ---------- Forwarded message ----------
>>> From: jia gu <jia.gu12345 at gmail.com>
>>> Date: Mon, Jun 6, 2011 at 9:50 PM
>>> Subject: quick EEGLAB ICA question
>>> To: eeglab at sccn.ucsd.edu
>>>
>>>
>>> To whom it might concern:
>>>
>>> Thank you very much for providing us the EEGLAB!  :) I am trying to 
>>> use ICA to clean some EEG signals, I read that the min # of sample 
>>> points should be at least 25x channel number squared. But there is 
>>> no upper limit. I wonder does the performance of ICA
>>> (infomax) get better with more training points? or does it start to 
>>> degrade after a certain optimal number of sample points, and if so 
>>> what is the best # of sample points?
>>> thank you very much for your time and help cheers jia
>>>
>>>
>>> --
>>> Scott Makeig, Research Scientist and Director, Swartz Center for 
>>> Computational Neuroscience, Institute for Neural Computation & Adj.
>>> Prof. of Neurosciences, University of California San Diego, La Jolla 
>>> CA 92093-0559, http://sccn.ucsd.edu/~scott
>>
>
>