[Eeglablist] Component clustering and PCA reduction

Makoto Miyakoshi mmiyakoshi at ucsd.edu
Fri Jul 1 15:21:18 PDT 2016


Dear Clemens,

This clarification may be of interest for other subscribers too, so let me
get back to the list to continue.

Sorry for the slow response. I've been swamped by a project I've been
working on since January, which was finished (I believe) yesterday.

> Did I understand correctly that, to take ERSPs as an example, the data,
> which are hundreds of time/frequency points, are reduced to principal
> components that explain the most variance? In components that for example
> capture a lot of fm-theta, would this (first) principal component take its
> weight from the time/frequency points in the 4-8 Hz range, high in power,
> and at some latency? In other words, there are hundreds of dimensions; if
> I reduce to five dimensions, am I left with mostly theta, some alpha, some
> noise, etc.?

This is my first time checking this process in the code, so I could be
wrong. For ERSP/ITC, line 362 of std_preclust() reshapes X, which should be
frequency x time, to 1 x numel(X); that is, the time-frequency map is
vectorized. For example, if you have an ERSP/ITC array of 30 (freqs) x 200
(times) x 50 (ICs), runpca() at line 404 of the same function runs on a
6000 x 50 matrix. If you specify an ERSP/ITC dimension of 10, the result
after dimension reduction should be a 10 x 50 matrix, i.e. each IC is
described by 10 values instead of 6000. So it is the vectorized
time-frequency data that are dimension-reduced.
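
To make the shapes concrete, here is a minimal NumPy sketch of the same
idea. It is not the EEGLAB code: the data are random, the function name
reduce_features is made up for illustration, and centering each feature
across ICs before the SVD is my assumption, not something I checked in
runpca().

```python
import numpy as np

def reduce_features(features, n_dims):
    """PCA-reduce a (n_features, n_ics) matrix to (n_dims, n_ics) via SVD.

    Hypothetical helper for illustration only; not part of EEGLAB.
    """
    # Assumption: center each feature (row) across ICs before PCA.
    centered = features - features.mean(axis=1, keepdims=True)
    # Economy-size SVD: columns of u are principal axes in feature space.
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    axes = u[:, :n_dims]                 # (n_features, n_dims)
    scores = axes.T @ centered           # (n_dims, n_ics)
    # Fraction of total variance kept by the first n_dims components.
    explained = float((s[:n_dims] ** 2).sum() / (s ** 2).sum())
    return scores, axes, explained

# Synthetic stand-in for ERSP data: 30 freqs x 200 times x 50 ICs.
rng = np.random.default_rng(0)
n_freqs, n_times, n_ics = 30, 200, 50
ersp = rng.standard_normal((n_freqs, n_times, n_ics))

# Vectorize each IC's time-frequency map: one column of 6000 values per IC.
vectorized = ersp.reshape(n_freqs * n_times, n_ics)

# Reduce the 6000 time-frequency dimensions to 10 per IC.
scores, axes, explained = reduce_features(vectorized, 10)
print(vectorized.shape, scores.shape)

# Each principal axis can be folded back into a freqs x times weight map.
first_pattern = axes[:, 0].reshape(n_freqs, n_times)
```

Each row of scores is one reduced dimension and each column is one IC, so
the clustering step would see 10 numbers per IC. Folding a principal axis
back to 30 x 200, as in first_pattern, shows which time-frequency points
carry its weight; with real data dominated by theta, one would expect the
first map to be heavy in the 4-8 Hz rows, but that depends on the data.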

> I am sorry to bother you, perhaps you can also suggest some literature
> where people have done similar things. (I have read Julie Onton's paper
> from 2005 on the FMT cluster; still, I am not so confident that I get the
> PCA thing.)

I have never seen a publication detailing this part. However, using PCA
this way to reduce data dimensionality is extremely common in engineering,
so it does not need separate validation... that's how I feel.

Makoto



On Thu, Jun 9, 2016 at 2:13 AM, Clemens DICKHUT <clemens.dickhut at uni.lu>
wrote:

> Dear Makoto,
>
> Thanks a lot for your reply. I have read the online manual thoroughly,
> however, the part where it is described how PCA reduces the dimensions of
> the respective measures is a little vague… at least I am not able to grasp
> it.
>
> Did I understand correctly that, to take ERSPs as an example, the data,
> which are hundreds of time/frequency points, are reduced to principal
> components that explain the most variance? In components that for example
> capture a lot of fm-theta, would this (first) principal component take its
> weight from the time/frequency points in the 4-8 Hz range, high in power,
> and at some latency? In other words, there are hundreds of dimensions; if
> I reduce to five dimensions, am I left with mostly theta, some alpha, some
> noise, etc.?
>
> I am sorry to bother you, but this step really is a black box to me…
>
> I am sorry to bother you, perhaps you can also suggest some literature
> where people have done similar things. (I have read Julie Onton's paper
> from 2005 on the FMT cluster; still, I am not so confident that I get the
> PCA thing.)
>
> All the best,
> Clemens
>
> __________________________________
> Clemens Dickhut, M.Sc., (PhD Student)
> Institute for Health and Behaviour
> Research Unit INSIDE
> University of Luxembourg - Campus Belval
> Maison des Sciences Humaines
> 11, Porte des Sciences, R. 04 415
> L-4366 Esch-sur-Alzette
> Tel.: (+352) 46 66 44 9536
>
>
>
> On 07 Jun 2016, at 01:26, Makoto Miyakoshi <mmiyakoshi at ucsd.edu> wrote:
>
> Dear Clemens,
>
> > For example, I want to reduce the dimensions of ERSPs, dipoles, and
> > spectra, given all ICs. Which dimensions would be reduced for each of
> > these measures?
>
> If I remember correctly, the EEGLAB manual says
>
> 1. Set the parameters so that the total number of dimensions across all
> the measures is below 20-30, because k-means clustering performance
> becomes worse above 30.
>
> 2. Do not use the 'final dimension' option.
>
> I think it's a good idea to keep it below 20 to see how it works.
> I personally recommend using just dipoles plus a little bit (weight 1 - 3)
> of spectrum with dimension 10-15. It's a good idea to leave out the
> measure you'll test statistically in the end, to avoid the 'double
> dipping' issue (if you don't know what it is, google 'voodoo
> correlation').
>
> Makoto
>
>
> On Tue, May 24, 2016 at 1:42 AM, Clemens DICKHUT <clemens.dickhut at uni.lu>
> wrote:
>
>> Dear all,
>>
>> I have some trouble understanding the reduction of dimensionality on
>> measures selected for component clustering. (STUDY -> build pre-clustering
>> array)
>>
>> For example, I want to reduce the dimensions of ERSPs, dipoles, and
>> spectra, given all ICs. Which dimensions would be reduced for each of
>> these measures?
>>
>> I am grateful for any input.
>>
>> Best,
>> Clemens
>>
>> __________________________________
>> Clemens Dickhut, M.Sc., (PhD Student)
>> Institute for Health and Behaviour
>> Research Unit INSIDE
>> University of Luxembourg - Campus Belval
>> Maison des Sciences Humaines
>> 11, Porte des Sciences, R. 04 415
>> L-4366 Esch-sur-Alzette
>> Tel.: (+352) 46 66 44 9536
>>
>>
>>
>>
>
>
>
> --
> Makoto Miyakoshi
> Swartz Center for Computational Neuroscience
> Institute for Neural Computation, University of California San Diego
>
>
>


-- 
Makoto Miyakoshi
Swartz Center for Computational Neuroscience
Institute for Neural Computation, University of California San Diego