[Eeglablist] An open experiment on the data rank issue with ICA

Arnaud Delorme adelorme at ucsd.edu
Wed Apr 5 13:57:01 PDT 2023


Hi Makoto,

In your paper, we read your main concerns:

"1. EEGLAB does not account for the initial reference electrode. Therefore, EEGLAB reduces data rank by re-referencing. This violates the first and second properties described previously (Hu et al., 2019)."

This is not correct. You can add back the common reference to the data when you reference the data, with no loss of data rank. There is an entire page in the tutorial on this process (and we pointed you to this page several times). But you are right that maybe this should be the default method when the reference is a scalp channel. The issue is that people need to provide the location of the reference, which might not be available to them.

https://eeglab.org/tutorials/05_Preprocess/rereferencing.html#retaining-the-reference-channel
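
To make the rank argument concrete, here is a minimal sketch on synthetic data using only core MATLAB functions. It is not the EEGLAB implementation: the random matrix X stands in for EEG recorded against a single reference, and dropping the zero-filled reference row afterwards is my assumption for illustration.

```matlab
% Synthetic stand-in for EEG: N channels recorded against one reference, T samples.
N = 8; T = 2000;
X = randn(N, T);

% Plain average reference over the N recorded channels: one dimension is lost.
Xavg = X - repmat(mean(X, 1), N, 1);

% Add the initial reference back as an all-zero channel, then average over N+1.
Xfull = [X; zeros(1, T)];
Xfull = Xfull - repmat(mean(Xfull, 1), N + 1, 1);

% Drop the reference row again so the channel count matches the rank.
Xkeep = Xfull(1:N, :);

rank_plain = rank(Xavg);
rank_keep  = rank(Xkeep);
fprintf('plain average reference: rank %d with %d channels\n', rank_plain, N);
fprintf('reference added back:    rank %d with %d channels\n', rank_keep, N);
```

The second variant stays full rank because subtracting the mean over N+1 channels applies the matrix I - ones(N)/(N+1) to the N recorded channels, and that matrix is invertible.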

"2. Although EEGLAB's implementation of the ICA (pop_runica) includes an effective rank deficiency checker,"

This is also not totally correct. EEGLAB implements 2 methods to check rank, not one, and we were discussing including a 3rd one because of issues with numerical inaccuracies and the fact that computing the rank sometimes returns incorrect results. But this is linked to point 3 below, so I get your point.

"3. Even if the data rank is ensured to be cleanly deficient by one [which can be detected by using the rank () function] through EEGLAB's re-referencing process, EEGLAB calculates λmin, which reintroduces a non-zero small number (typically <10^-10) via numerical error. This non-zero noise forces effectively rank-deficient decomposition."

Interesting. I loaded the tutorial dataset and computed the average reference. Then I used PCA.

[pc,eigvec,sv] = runpca(double(EEG.data));

The last eigenvalue corresponding to the dimension that would be discarded is 

sv(end,end) 

>> 8.4791e-04

Compare this with the second-smallest eigenvalue:

sv(end-1,end-1) 

>> 386.1381

Now it may well be that a difference of roughly 10^6 in scale between eigenvalues (about 4.6 x 10^5 for the two values above) is enough to disrupt ICA decomposition, as claimed in the paper, but this will depend on the number of channels etc.
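
As a rough illustration of why the discarded dimension is small but not exactly zero, the sketch below average-references synthetic double-precision data and compares the smallest singular value with a rank tolerance. The tolerance formula mirrors the default documented for MATLAB's rank(); the data are random, and I make no assumption about how these singular values map onto the values returned by runpca.

```matlab
% Synthetic data, average-referenced in double precision.
N = 32; T = 10000;
X = randn(N, T);
Xavg = X - repmat(mean(X, 1), N, 1);

s = svd(Xavg);                        % singular values, largest first
tol = max(size(Xavg)) * eps(s(1));    % mirrors the documented default of rank()

fprintf('smallest singular value: %g\n', s(end));
fprintf('second smallest:         %g\n', s(end-1));
fprintf('rank tolerance:          %g\n', tol);
fprintf('ratio of the last two:   %g\n', s(end) / s(end-1));
```

The smallest value falls below the tolerance, so rank() reports N-1, yet it is generally a tiny non-zero number rather than an exact zero.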

Also, credit should have been given to this tutorial page, which outlined the problem more than ten years ago.

https://eeglab.org/tutorials/06_RejectArtifacts/RunICA.html#how-to-deal-with-corrupted-ica-decompositions

The most important thing, I think, is to show that these problems do not occur if the matrix is full rank and one does not use an average reference that excludes the original reference.
Also puzzling is that aggressive PCA dimension reduction, as outlined in the page above, seems to partially solve the problem, which appears to contradict the conclusion of the paper.

I have compared the code you contributed (the reref.m function) with the one in EEGLAB, but the two are identical, so I am confused.

Note that this is also linked to a well-known problem: ICA requires high numerical precision. If you run ICA in single precision (32-bit floating point), the results will differ from those obtained in double precision (64-bit floating point). The PCA problem outlined in the paper is also an issue of numerical precision.
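
The effect of precision on the near-zero dimension can be sketched with core MATLAB functions on synthetic data. This is only an illustration under my own assumptions (random data, average reference computed once in double and once in single); I have not checked whether it accounts for the values reported above.

```matlab
N = 32; T = 10000;
X = randn(N, T);

% Average reference computed in double precision.
Xd = X - repmat(mean(X, 1), N, 1);

% Same operation on single-precision data, cast to double only afterwards.
Xs = single(X);
Xs = double(Xs - repmat(mean(Xs, 1), N, 1));

sd = svd(Xd);
ss = svd(Xs);
ratio_double = sd(end) / sd(end-1);   % last vs second-to-last singular value
ratio_single = ss(end) / ss(end-1);

fprintf('double precision ratio: %g\n', ratio_double);
fprintf('single precision ratio: %g\n', ratio_single);
```

With single-precision arithmetic the rounding error in each sample is many orders of magnitude larger, so the dimension that should be empty carries correspondingly more residual signal.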

Proposed action in EEGLAB:
1. Update the documentation (done): https://eeglab.org/tutorials/06_RejectArtifacts/RunICA.html#how-to-deal-with-corrupted-ica-decompositions
2. I think the best strategy is to systematically run PCA before ICA when the rank is reduced, and then check that the eigenvalues for the dimensions to be removed are below 1e-7 (or should we check the ratio of eigenvalues?). If this ratio is larger than 1e-7, issue a warning in red indicating that dimension reduction is not appropriate and that people can expect ghost ICs.
3. Add an additional warning when people re-reference the data, advising them to re-reference after ICA.
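
A minimal sketch of the check in item 2, written with core MATLAB functions rather than EEGLAB code. The variable names n_remove and ratio_tol are hypothetical, the ratio variant (largest removed eigenvalue over smallest retained one) is only one of the two options raised above, and 1e-7 is simply the value floated in that item.

```matlab
% Synthetic data made rank-deficient by one through average referencing.
N = 32; T = 10000;
X = randn(N, T);
Xavg = X - repmat(mean(X, 1), N, 1);

n_remove  = 1;       % number of dimensions PCA would discard (assumed known)
ratio_tol = 1e-7;    % threshold suggested in item 2

% Eigenvalues of the channel covariance, largest first.
C = cov(Xavg');
C = (C + C') / 2;                     % enforce exact symmetry so eig returns real values
ev = sort(eig(C), 'descend');

% Largest eigenvalue to be removed relative to the smallest one retained.
ratio = abs(ev(end - n_remove + 1)) / ev(end - n_remove);

is_clean = ratio < ratio_tol;
if ~is_clean
    warning('Eigenvalue ratio %g exceeds %g: dimension reduction may not be appropriate, expect ghost ICs.', ratio, ratio_tol);
end
fprintf('eigenvalue ratio: %g (clean: %d)\n', ratio, is_clean);
```

Using a ratio rather than an absolute eigenvalue makes the check independent of the data's units and scaling, which is one argument for the alternative mentioned in parentheses in item 2.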

Cheers,

Arno

PS: maybe next time, submit a pull request instead of publishing a paper :-) Still, it is nice to have everything documented.

> On Apr 4, 2023, at 10:09 AM, Makoto Miyakoshi via eeglablist <eeglablist at sccn.ucsd.edu> wrote:
> 
> Dear eeglab mailinglist subscribers,
> 
> On May 7 2021, I announced on this mailing list that I started the open
> experiment on the data rank issue (forwarded below). Just last week, the
> project was published. Please see the paper from the URL below.
> 
> https://www.frontiersin.org/articles/10.3389/frsip.2023.1064138/full
> 
> We did several experimental things in this publication:
> 
>   - We invited Dr. Sven Hoffmann, who kindly reported the issue with the
>   solution (which was proven to be correct in our simulation!) Everybody can
>   see his name in pop_runica() line 668, but the current implementation
>   disables his original idea.
>   - We did an homage project on ICA voice unmixing i.e. the original
>   definition of the cocktail-party effect and effectiveness of ICA on this
>   issue. One of the 'voice actors', TzyyPing Jung, actually performed
>   the same part in the original demo in late 90's. So he played his own role
>   after about 25 years.
>   - We quoted several private communications with permission regarding
>   the issue of the correct way of applying average reference. We
>   communicated with Paul Nunez, Ramesh Srinivasan, Joseph Dien, and Dezhong
>   Yao. In other words, everyone (as far as I know) who published a
>   paper/book on this issue. And Andreas Widmann, who pointed me to this
>   problem on the mailing list.
>   - The second author is a high-school student--He joined us as an intern
>   student. He turned out to be a fluent programmer and did all the analysis
>   beyond my instructions by figuring out the purpose of the simulation by
>   himself.
> 
> It was such a fun project. The rank issue is one of the recurring questions
> in the past EEGLAB workshops. I hope our publication clarifies the
> background of this long-standing problem and provides a solution once and
> for all.
> 
> Makoto
> 
> 
> 
> On Fri, May 7, 2021 at 2:25 PM Makoto Miyakoshi <mmiyakoshi at ucsd.edu> wrote:
> 
>> Dear subscribers,
>> 
>> Recently, there have been multiple independent posts about the data rank issue
>> with ICA. In response, I am thinking about running a simulation experiment
>> with a visiting scholar to SCCN as a small project on this issue for
>> publication. I would appreciate it if you can give me any of the following
>> as an input.
>> 
>>   - Questions (what is puzzling for you? No need to be shy for asking
>>   'dumb questions')
>>   - Requests (if you want to know particularly X and/or Y on this issue,
>>   I may be able to give you the answer based on the simulation test)
>>   - Suggestions (about methods, data type, applications, etc)
>>   - Reports (when ICA failed, what did you see?)
>> 
>> If you are interested in working with me to make a contribution to this
>> small project, please reply or contact me mmiyakoshi at ucsd.edu. If your
>> contribution is substantial, I'll offer you to be a coauthor. Probably we
>> will need as many strange results as possible...?
>> 
>> Probably this is the first attempt to run an open experiment on the EEGLAB
>> mailing list--please join us and let's find out what happens!
>> 
>> Makoto
>> 
>> 