[Eeglablist] EEG data rank deficiency at ASR & ICA by rank deficiency

Tue Jan 11 17:08:29 PST 2022

Dear Tigoum and list subscribers,

Tigoum sent this question to the list, but somehow it did not appear
there. Tigoum later sent it to me directly, so I am forwarding it here with
my answers. This may make the thread harder to follow, so I wrote my
comments after 'Makoto's answer:' to distinguish them from Tigoum's words.

By the way Tigoum, I don't think your question showed up in the EEGLAB
mailing list. Please read the following page again to double check if your
email address is correctly registered.
https://eeglab.org/others/EEGLAB_mailing_lists.html
If you respond, please reply to the EEGLAB mailing list and not to my
personal address so that our discussion stays on the list. Sorry for taking
a long time.

Makoto

%%%%%%%%%%%%%%%%%%%%

> Based on these contents, I would like to apply this topic to the
preprocessing of my newly collected EEG data:
- load raw data (multi subjects)
- HPF @ 0.5Hz
- set the channel location
- ...
-- reject & interpolate bad channels by clean_rawdata(==ASR)
-- CAR
-- de-noising by ASR
-- ICA
- ...
In the process above, I came to have several questions.

> I confirmed in my data that rank deficiency is caused by rejecting and
interpolating bad channels (4 to 6 channels per dataset).
> About this,
> Question A. In the article you refer to,
https://sccn.ucsd.edu/wiki/Makoto's_useful_EEGLAB_code
Here,
# A-1. How to avoid the effect of rank-deficiency in applying ICA
(03/31/2021 added)
And
# A-2. How to turn on the automatic protection against rank-deficient ICA
in pop_runica (3/31/2021 added)

> Contents of question:
I'm not sure about the difference between the two sections above.
Does the former solve rank-deficiency manually and the latter automatically?
Do I have to apply all the code modifications shown in both sections at
once?

Makoto's answer:
Yes. The former (A-1) shows how to address the rank issue manually. The
latter (A-2) lets EEGLAB take care of it automatically using the same idea.
You need only one of the two solutions. If you use AMICA, the latter (A-2)
is not available, so you need the first solution (A-1).
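
To make the idea behind the manual route concrete, here is a minimal
sketch in Python/NumPy on simulated data. It is not EEGLAB code and not the
code from the wiki section: the helper names effective_rank() and
pca_reduce() are hypothetical, and the 1e-6 eigenvalue tolerance is an
assumption. It counts the covariance eigenvalues above the tolerance to
estimate the 'true' rank, then reduces the data to that many dimensions,
which is the role the 'pca' option plays for runica().

```python
import numpy as np

def effective_rank(data, tol=1e-6):
    """Count covariance eigenvalues above tol; data is channels x samples."""
    eigvals = np.linalg.eigvalsh(np.cov(data))
    return int(np.sum(eigvals > tol))

def pca_reduce(data, n_keep):
    """Project channels x samples data onto its n_keep largest principal components."""
    centered = data - data.mean(axis=1, keepdims=True)
    eigvals, eigvecs = np.linalg.eigh(np.cov(centered))
    top = eigvecs[:, np.argsort(eigvals)[::-1][:n_keep]]
    return top.T @ centered

rng = np.random.default_rng(0)
data = rng.standard_normal((8, 5000))      # 8 simulated full-rank channels
data[3] = data[[2, 4]].mean(axis=0)        # "interpolate" channel 3 from two others
data -= data.mean(axis=0, keepdims=True)   # common average reference

n_channels = data.shape[0]
rank = effective_rank(data)                # interpolation and CAR each cost one rank
reduced = pca_reduce(data, rank)           # full-rank input for an ICA routine
print(n_channels, rank, reduced.shape)
```

In this toy case one interpolated channel plus the average reference make
the data rank deficient by 2, so the estimated rank is
number_of_channels - 2. On real recordings the tolerance may need tuning to
the data's scale.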

> Question B. In the article you refer to, (I found your previous posting
document, which is similar to Section A-2 of Question A).

https://sccn.ucsd.edu/wiki/Makoto's_preprocessing_pipeline#Adjust_data_rank_for_ICA_.2805.2F17.2F2019_updated.29
Here,
#Adjust data rank for ICA (05/17/2019 updated):
Thus, it is strongly recommended if your preprocessing makes your data rank
deficient by x in total (i.e., they add up!), you explicitly specify
(number_of_channels - x) as your 'true' data rank in one of the following
ways. Again, this becomes particularly important after channel
interpolation, since pop_runica()'s rank checker does not detect the rank
deficiency and allows to produce 'ghost ICs' (by the way, same happens to
ASR--any PCA/ICA solution would be affected similarly!) To address the
issue, you either

A1. reduce data dimension by using 'pca' or 'pcakeep' option for runica()
and runamica() respectively, to the number of (number_of_channels - x), OR
A2. reject *ANY* channels by x. Interestingly, they do not have to be the
exact same channels that are interpolated. You may want to determine the
new channel location montage by using Nima's loc_subsets() function which
tells you which channels to be rejected to achieve maximally uniform
channel distribution.
...
> The description above says that the rank checker allows "to produce
'ghost ICs' (by the way, same happens to ASR--any PCA/ICA solution would be
affected similarly!)". Meanwhile, from other sentences in the text, I
understood that pop_runica() can solve the ghost IC problem automatically.

> Question:
But I don't know if this is guaranteed in ASR. Is automatic resolution
guaranteed?
If not, should I come up with a solution (manual solution) myself?
If you had to do it manually, what would be the way?
Is the only solution to remove x channels as shown in A2 above?
What if Nima's loc_subsets() function tells me to remove dominant channels
from my data?

Makoto's answer:
No, there is no automatic solution for ASR. ASR absolutely refuses (or
fails with) rank deficiency in the input data. To avoid the issue, simply
do not perform electrode interpolation before running ASR. Technically,
there is a way to let ASR handle rank-deficient data, but it is not
implemented as of now. To make the data full-ranked, the easiest way is to
reject enough channels so that the smallest eigenvalue of the
post-rejection data is > 1E-06, for example. Nima's loc_subsets() does not
determine how many electrodes to reject; that is a parameter the user
passes to loc_subsets().
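
The eigenvalue criterion above can be sketched as follows, again in
Python/NumPy on simulated data rather than as EEGLAB code. Assumptions:
the eigenvalues are taken from the channel covariance matrix, the helper
names and the interpolation weights are hypothetical, and drop_order is
simply a user-supplied list of candidate channels (in practice it could
come from a montage tool such as loc_subsets()).

```python
import numpy as np

def min_cov_eigenvalue(data):
    """Smallest eigenvalue of the channel covariance; data is channels x samples."""
    return float(np.linalg.eigvalsh(np.cov(data))[0])

def reject_until_full_rank(data, drop_order, tol=1e-6):
    """Drop channels in drop_order until the smallest eigenvalue exceeds tol."""
    keep = list(range(data.shape[0]))
    for ch in drop_order:
        if min_cov_eigenvalue(data[keep]) > tol:
            break
        keep.remove(ch)
    return keep

rng = np.random.default_rng(1)
data = rng.standard_normal((6, 4000))             # 6 simulated channels
weights = np.array([0.1, 0.2, 0.3, 0.15, 0.25])   # hypothetical interpolation weights
data[5] = weights @ data[:5]                      # channel 5 "interpolated" from all others

drop_order = [0, 1, 2]                            # hypothetical rejection candidates
kept = reject_until_full_rank(data, drop_order)
print(kept, min_cov_eigenvalue(data[kept]) > 1e-6)
```

Here the channel that gets dropped is channel 0, not the interpolated
channel 5, and the remaining data are still full rank. That works in this
sketch because the interpolated channel is a mixture of all the other
channels.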

Makoto