[Eeglablist] Rank deficiency and ASR+ICA

Fri Sep 27 13:35:03 PDT 2024

Hi Efthymios,

Depends on the analysis (evoked vs continuous data). But I generally use ASR before ICA only to remove very large artifacts (electrode disconnections, high amplitude high-frequency bursts, etc.), it is well known to improve ICA decomposition. But I ensure to preserve eye blinks, using ASR thresholds around 60-110, depending on the data. If you use default thresholds (20 if I recall), it can remove all eye blinks, which leads to removing lots of data (that can be extracted with ICA instead) and lots of discontinuities. This could explain lower performance than running ICA first. Have you looked at what ASR is removing (e.g. with vis_artifacts)? In brief, my experience is that if you use it only to remove large artifacts, it helps ICA a lot. If ASR removes eye artifacts, ICA may not be able to separate their source because there is no longer enough eye activity, and the many discontinuities (boundaries) probably don't help too. 

Then, for remaining more subtle artifacts, if it's ERP I use another method to detect bad epochs after segmentation. 

Not sure your metrics would tell you which method is best. You could have large entropy or SNR in clean data, depending on the brain activity of interest. I would only rely on inspecting the data (time series, topographies, signal and IC domains). 

Note: both ICA and ASR perform best after re-referencing in my experience (and I use ref to infinity generally as it increases spatial resolution relative to average, in my view). 

I don't think you need CleanLine if applying lowpass 30 Hz, it adds a lot of unnecessary computing time. 

Question: what is your analysis on such low frequencies? Is it a prestimulus predictive/anticipatory activity by any chance (e.g. CNV, readiness potential, etc.)? 
If so, make sure to apply a causal (e.g. minimum-phase) filter. 

Cedric

On Friday, September 27th, 2024 at 11:49 AM, Dr. Efthymios Papatzikis <efp331 at mail.harvard.edu> wrote:

> Dear
> 
> Cedric
> 
> 
> ,
> 
> In regards to the ASR and then ICA application in the pipeline you mentioned below, I was wondering if you have any available literature on this, or is it just personal experience you are based on?
> 
> I was fighting the last couple of months with this kind of inquiry on my data (premature infant data) and I decided to run SNR and spectral entropy analysis on both options (ASR-ICA and ICA-ASR) to see which one brings the best signal at the end. Although close enough, first ICA and then ASR seemed to produce cleaner results. (Spectral Entropy was almost similar for both options after full preprocessing.)
> 
> Any thoughts?
> 
> PS.1
> I followed the below steps:
> 1.FIR (0.1Hz / 30Hz)
> 2. Cleanline (50Hz)
> 3. Diagnostics script I’ve written on Signal Quality Index/Zero crossing rate/Power in frequency bands producing an average of these three to find the electrodes to extract and interpolate later on
> 4. ICA/ASR or ASR/ICA
> 5. Interpolate extracted electrodes
> 6. Re reference
> 
> PS 2
> All my data is on premature infants. 8 electrodes. Therefore, I also introduced in the diagnostics steps above, an extra script I wrote to calculate the continuity/discontinuity ratio in all channels and then produced a classifier just before extracting ICA components to see if and how much discontinuity informs any of the ICA components so that I take a more informed decision during ICLabeling and extracting process.
> 
> Looking forward to your reply,
> Best,
> Efthymios
> 
> > On 27 Sep 2024, at 19:48, Cedric Cannard via eeglablist eeglablist at sccn.ucsd.edu wrote:
> > 
> > Hi Francisco,
> > 
> > I agree with Makoto it is not ideal solution.But I haven't found any perfect solution personally.
> > 
> > If you do decide to go with the initial approach, here are the general steps.
> > 
> > Yes, always get the best signal quality possible before ICA (although preserve eye movements as much as possible so they are extracted by ICA and not ASR, which removes relevant signals with the eye artifacts).
> > 
> > A good way for this approach is to make a copy of your dataset before filtering. Filter copy data >1Hz, remove bad channels, run ASR, interpolate bad channels (after ASR to avoid data rank issues as there is no input like for ICA, to my knowledge), run ICA (input effective data rank). Store bad channels, bad segment latencies, ICA latencies during this process.
> > 
> > Go back to the original dataset (<1 hz filter), remove bad channels, interpolate, remove bad segments, and transfer ICA weights (based on stored info). Run EEG = eeg_checkset(EEG); to make sure everything is good. Run ICLabel to flag and remove bad components.
> > 
> > Hope this helps and that I didn't miss something.
> > 
> > Cedric
> > 
> > > On Thursday, September 26th, 2024 at 2:59 PM, Cedric Cannard ccannard at protonmail.com wrote:
> > > 
> > > Hi Francisco,
> > > 
> > > Yes, always get the best signal quality possible before ICA (although preserve eye movements as much as possible so they are extracted by ICA and not ASR, which removes relevant signals with the eye artifacts).
> > > 
> > > A good way for this approach is to make a copy of your dataset before filtering. Filter copy data >1Hz, removebad channels, run ASR, interpolate bad channels (after ASR to avoid data rank issues as there is no input like for ICA, to my knowledge), run ICA (input effective data rank). Store bad channels, bad segment latencies, ICA latencies during this process.
> > > 
> > > Go back to original dataset (<1 hz filter), remove bad channels, interpolate, remove bad segments, and transfer ICA weights (based on stored info). Run EEG = eeg_checkset(EEG); to make sure everything is good. Run ICLabel to flag and remove bad components.
> > > 
> > > Hope this helps and that I didn't miss something.
> > > 
> > > Cedric
> > > 
> > > > On Monday, September 23rd, 2024 at 10:13 AM, Francisco Zambrano via eeglablist eeglablist at sccn.ucsd.edu wrote:
> > > > 
> > > > Hello again,
> > > > 
> > > > Thanks a lot for your insights, the results from
> > > > 
> > > > Cedric
> > > > 
> > > > 's reply gave me an
> > > > answer that sounds more in line with what I've read.
> > > > 
> > > > I've come with another question regarding ICA. If I'm trying to use ICA's
> > > > trick of applying the >1 Hz data's unmixing matrix to the <1 Hz data, but
> > > > 
> > > > I don't know if It's better to do the whole processing pipeline(removing
> > > > bad channels, interpolating, removing bad trials,...) first and then
> > > > applying it back to the <1Hz data that is also been processed with the same
> > > > steps; or if I'm supposed to do it in another way. Could you please help me
> > > > with this?
> > > > 
> > > > Thanks in advance.
> > > > Francisco.
> > > > 
> > > > El mié, 7 ago 2024 a la(s) 3:37 p.m., Makoto Miyakoshi via eeglablist (
> > > > eeglablist at sccn.ucsd.edu) escribió:
> > > > 
> > > > > Hi Pancho,
> > > > > 
> > > > > > rank(EEG.data(:,:)) over the original data It says the rank is 11
> > > > > 
> > > > > Try rank(double(EEG.data(:,:)) and you'll see more realistic values at
> > > > > least. It is counterintuitive if you see it naively that single or double
> > > > > makes such a difference here.
> > > > > 
> > > > > For the complete answer, see Cedric's reply.
> > > > > 
> > > > > Makoto
> > > > > 
> > > > > On Mon, Aug 5, 2024 at 9:40 PM Pancho Zambrano via eeglablist <
> > > > > eeglablist at sccn.ucsd.edu> wrote:
> > > > > 
> > > > > > Hi!,
> > > > > > 
> > > > > > I'm currently working on a script for my thesis on an EEG channel
> > > > > > reconstruction protocol for Universidad de Concepción, and I've found
> > > > > > myself in some trouble regarding rank deficiency. The script it's about
> > > > > > quantification of the positive yzing impact of ASR algorithm from
> > > > > > cleanrawdata() on the data by analyzing the increase of the amount of
> > > > > > dipolar sources present before and after using this default toolbox
> > > > > > function. This is accomplished by comparing the residual variance from
> > > > > > the
> > > > > > dipfit model of the clean and original data. The problem is I use the
> > > > > > matcorr function to find the best match between the original and clean
> > > > > > ICA
> > > > > > mixing matrices, but they have different number of columns: the first one
> > > > > > is 64x64 and the clean one is 64x55. Since I've re-referenced the data
> > > > > > after ASR, I thought this was the problem but I also noticed when I use
> > > > > > rank(EEG.data(:,:)) over the original data It says the rank is 11, so I'm
> > > > > > lost here, please help. I got this method from the article:
> > > > > > 
> > > > > > Evaluation of Artifact Subspace Reconstruction for Automatic Artifact
> > > > > > Components Removal in Multi-Channel EEG Recordings
> > > > > > 
> > > > > > from:
> > > > > 
> > > > > https://urldefense.com/v3/__https://ieeexplore.ieee.org/abstract/document/8768041?casa_token=IsR0oujDOEAAAAAA:T1NPgCRqT15sUYjG8KLtR8a9yW6PqQP7iaKzg5057R93Oef1yCmxDKt6ZtLM4IZk2t_xF4qWJ0I__;!!Mih3wA!ClbUhbvVFcY2DhNRP-wwMym8wBcZTXz2jeCl8yebysZiEmG01Z-cZm3OgV4T4R1GXwiDsWNEwyIFZcOglDZZkbBiLjHP0sFV$
> > > > > 
> > > > > > This is my code:
> > > > > > % Initialize EEGLAB
> > > > > > eeglab;
> > > > > > 
> > > > > > % Load the dataset
> > > > > > EEG = pop_loadset('filename', 'C110_filt_butt_zap2.set');
> > > > > > 
> > > > > > % Perform ICA on the original data
> > > > > > EEG = pop_runica(EEG, 'extended', 1);
> > > > > > 
> > > > > > % Save the original ICA results
> > > > > > ica_weights_orig = EEG.icaweights;
> > > > > > ica_sphere_orig = EEG.icasphere;
> > > > > > ica_act_orig = EEG.icaact;
> > > > > > ica_mix_matrix_orig = pinv(ica_weights_orig * ica_sphere_orig);
> > > > > > 
> > > > > > % ASR dataset
> > > > > > EEG_clean = pop_loadset('filename', 'prueba_1hz_ica.set');
> > > > > > 
> > > > > > % % Perform ICA on the ASR cleaned data
> > > > > > % EEG_clean = pop_runica(EEG_clean, 'extended', 1);
> > > > > > 
> > > > > > % Save the ASR ICA results
> > > > > > ica_weights_clean = EEG_clean.icaweights;
> > > > > > ica_sphere_clean = EEG_clean.icasphere;
> > > > > > ica_act_clean = EEG_clean.icaact;
> > > > > > ica_mix_matrix_clean = pinv(ica_weights_clean * ica_sphere_clean);
> > > > > > 
> > > > > > % Calculate the correlations between the components
> > > > > > % n_components_orig = size(ica_mix_matrix_orig, 2);
> > > > > > % n_components_clean = size(ica_mix_matrix_clean, 2);
> > > > > > %
> > > > > > % corr_matrix = zeros(n_components_orig, n_components_clean);
> > > > > > %
> > > > > > % for i = 1:n_components_orig
> > > > > > % for j = 1:n_components_clean
> > > > > > % corr_matrix(i, j) = corr(ica_mix_matrix_orig(:, i),
> > > > > > ica_mix_matrix_clean(:, j));
> > > > > > % end
> > > > > > % end
> > > > > > 
> > > > > > % Use the matcorr function to find the best matching
> > > > > > rmmean = true; % Adjust as needed (remove mean or not)
> > > > > > method = 0; % Correlation method
> > > > > > weighting = []; % No specific weighting
> > > > > > 
> > > > > > [corr, indx, indy, corrs] = matcorr(ica_mix_matrix_orig,
> > > > > > ica_mix_matrix_clean, rmmean, method, weighting);
> > > > > > 
> > > > > > % Display the matched correlations
> > > > > > best_matches = corrs(sub2ind(size(corrs), indx, indy));
> > > > > > 
> > > > > > disp('Correlations between the best matched components:');
> > > > > > disp(best_matches);
> > > > > > 
> > > > > > % Apply the spatial filter obtained from the ICA of the original data to
> > > > > > the cleaned data
> > > > > > W_orig = ica_weights_orig * ica_sphere_orig;
> > > > > > Y_clean = W_orig * reshape(EEG_clean.data, size(EEG_clean.data, 1), []);
> > > > > > 
> > > > > > % Calculate the activities of the ICs
> > > > > > Y_clean = reshape(Y_clean, size(EEG_clean.icaact));
> > > > > > 
> > > > > > % Calculate the mean power reduction for the IC activities
> > > > > > Y_orig = W_orig * reshape(EEG.data, size(EEG.data, 1), []);
> > > > > > Y_orig = reshape(Y_orig, size(EEG.icaact));
> > > > > > 
> > > > > > power_reduction = mean(var(Y_orig, 0, 2)) - mean(var(Y_clean, 0, 2));
> > > > > > 
> > > > > > disp('Mean power reduction:');
> > > > > > disp(power_reduction);
> > > > > > 
> > > > > > % Classify the ICs using iclabel
> > > > > > EEG = iclabel(EEG, 'default');
> > > > > > EEG_clean = iclabel(EEG_clean, 'default');
> > > > > > 
> > > > > > % Get the IC classification labels
> > > > > > ic_labels_orig = EEG.etc.ic_classification.ICLabel.classifications;
> > > > > > ic_labels_clean =
> > > > > > EEG_clean.etc.ic_classification.ICLabel.classifications;
> > > > > > 
> > > > > > % Set up DIPFIT for the original data
> > > > > > EEG = pop_dipfit_settings(EEG, 'hdmfile',
> > > > > > 'standard_BEM/standard_vol.mat',
> > > > > > ...
> > > > > > 'coordformat', 'MNI', 'mrifile', 'standard_mri.mat', ...
> > > > > > 'chanfile', 'standard_1005.elc', 'coord_transform', [0 0 0 0 0 0 1 1
> > > > > > 1], ...
> > > > > > 'chansel', 1:size(EEG.data, 1));
> > > > > > 
> > > > > > % Fit dipoles to the original ICA components
> > > > > > EEG = pop_multifit(EEG, 1:size(EEG.icaweights, 1), 'threshold', 100);
> > > > > > 
> > > > > > % Get the residual variance of the fitted dipoles in the original data
> > > > > > rv_orig = [EEG.dipfit.model.rv];
> > > > > > 
> > > > > > % Set up DIPFIT for the ASR cleaned data
> > > > > > EEG_clean = pop_dipfit_settings(EEG_clean, 'hdmfile',
> > > > > > 'standard_BEM/standard_vol.mat', ...
> > > > > > 'coordformat', 'MNI', 'mrifile', 'standard_mri.mat', ...
> > > > > > 'chanfile', 'standard_1005.elc', 'coord_transform', [0 0 0 0 0 0 1 1
> > > > > > 1], ...
> > > > > > 'chansel', 1:size(EEG_clean.data, 1));
> > > > > > 
> > > > > > % Fit dipoles to the ASR cleaned ICA components
> > > > > > EEG_clean = pop_multifit(EEG_clean, 1:size(EEG_clean.icaweights, 1),
> > > > > > 'threshold', 100);
> > > > > > 
> > > > > > % Get the residual variance of the fitted dipoles in the ASR cleaned data
> > > > > > rv_clean = [EEG_clean.dipfit.model.rv];
> > > > > > 
> > > > > > % Identify dipolar ICs (residual variance < 5%)
> > > > > > dipolar_ICs_orig = find(rv_orig < 0.05);
> > > > > > dipolar_ICs_clean = find(rv_clean < 0.05);
> > > > > > 
> > > > > > % Compare the number of dipolar ICs before and after ASR
> > > > > > num_dipolar_ICs_orig = length(dipolar_ICs_orig);
> > > > > > num_dipolar_ICs_clean = length(dipolar_ICs_clean);
> > > > > > 
> > > > > > % Display dipolar IC results
> > > > > > fprintf('Number of dipolar ICs in original data: %d\n',
> > > > > > num_dipolar_ICs_orig);
> > > > > > fprintf('Number of dipolar ICs in ASR cleaned data: %d\n',
> > > > > > num_dipolar_ICs_clean);
> > > > > > 
> > > > > > % Save all results
> > > > > > save('ica_evaluation_results.mat', 'ica_weights_orig',
> > > > > > 'ica_weights_clean',
> > > > > > 'ica_sphere_orig', 'ica_sphere_clean', 'ica_act_orig', 'ica_act_clean',
> > > > > > 'indx', 'indy', 'best_matches', 'Y_clean', 'power_reduction',
> > > > > > 'ic_labels_orig', 'ic_labels_clean', 'rv_orig', 'rv_clean',
> > > > > > 'dipolar_ICs_orig', 'dipolar_ICs_clean');
> > > > > > 
> > > > > > Please, help.
> > > > > > 
> > > > > > Francisco Alonso Zambrano Salamanca
> > > > > > Biomedical Engineering
> > > > > > Universidad de Concepción
> > > > > > Chile
> > > > > > _______________________________________________
> > > > > > Eeglablist page: http://sccn.ucsd.edu/eeglab/eeglabmail.html
> > > > > > To unsubscribe, send an empty email to
> > > > > > eeglablist-unsubscribe at sccn.ucsd.edu
> > > > > > For digest mode, send an email with the subject "set digest mime" to
> > > > > > eeglablist-request at sccn.ucsd.edu
> > > > > > _______________________________________________
> > > > > > Eeglablist page: http://sccn.ucsd.edu/eeglab/eeglabmail.html
> > > > > > To unsubscribe, send an empty email to
> > > > > > eeglablist-unsubscribe at sccn.ucsd.edu
> > > > > > For digest mode, send an email with the subject "set digest mime" to
> > > > > > eeglablist-request at sccn.ucsd.edu
> > > > 
> > > > _______________________________________________
> > > > Eeglablist page: http://sccn.ucsd.edu/eeglab/eeglabmail.html
> > > > To unsubscribe, send an empty email to eeglablist-unsubscribe at sccn.ucsd.edu
> > > > For digest mode, send an email with the subject "set digest mime" to eeglablist-request at sccn.ucsd.edu
> > > > _______________________________________________
> > > > Eeglablist page: http://sccn.ucsd.edu/eeglab/eeglabmail.html
> > > > To unsubscribe, send an empty email to eeglablist-unsubscribe at sccn.ucsd.edu
> > > > For digest mode, send an email with the subject "set digest mime" to eeglablist-request at sccn.ucsd.edu