[Eeglablist] Inconsistent results using clean_artifacts

Makoto Miyakoshi mmiyakoshi at ucsd.edu
Tue Mar 22 09:26:56 PDT 2022


Dear Cyril,

Thank you for the detailed suggestion.

From a project-management viewpoint: Hyeonseok and I are currently writing a
technical paper about ASR's calibration stage. Our initial hope was to
quickly try Cristina's solution and, if it worked, report it as a
recommended good practice in the same paper. However, yesterday we decided
to investigate it as an independent issue, separate from the
calibration process.

Hyeonseok and I may be able to write a brief dedicated paper to formally
investigate and address this issue if we can convince John. If you know a
good journal, please let us know.

Makoto

On Tue, Mar 22, 2022 at 1:44 AM Dr Cyril Pernet <wamcyril at gmail.com> wrote:

> Hi Makoto,
>
> my understanding of the issue is different - I believe this comes down
> to the uncontrolled random sampling of channels.
>
> In
> https://github.com/sccn/clean_rawdata/blob/master/clean_channels.m
> the size of the channel subsets to use for robust reconstruction, as a
> fraction of the total number of channels, is set to 0.25. This means
> the function computes N samples, with 25% of the channels being
> interpolated at a time, and then computes correlations between them (to
> get consensus between the N draws).
>
> Cristina had 29 channels. Because the random sampling is
> unconstrained, it could pick 7 channels next to each other, leading to
> wild interpolations, which I think can explain why she needs such a high
> number of resamples.
>
> I haven't found the time to test it, but I'm thinking that changes in
>
> https://github.com/sccn/clean_rawdata/blob/master/clean_channels.m#L177
>
> could help. For instance, for each resample, meshgrid the channel
> locations and remove the 25% selected; if there is only one hole (i.e. all
> selected channels are neighbors), then take another draw. With your 205
> channels, this issue 'never' occurs and thus 50 or 1000 make little
> difference. This can be tested first by downsampling the number of
> channels: if you see a change such that increasing N is more stable, then
> it reproduces Cristina's error, and the proposed constraint on the channel
> sampling might be a solution (and maybe there is a trade-off to find
> between channel number and number of resamples, even when constraining
> the channel sampling).
>
> cyril
>
>
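The constraint Cyril proposes above can be sketched as follows. This is a
minimal illustration in Python (not MATLAB, and not the clean_rawdata
implementation): it draws a random subset of round(0.25 * nchans) channels
and redraws whenever the selected channels form a single connected cluster
of neighbors. The function names, the distance-based definition of
"neighbor", the radius value, the rounding rule, and the toy 29-point grid
are all assumptions made for illustration.

```python
import math
import random


def neighbor_graph(positions, radius):
    """Adjacency sets: channels no farther apart than `radius` are neighbors."""
    n = len(positions)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(positions[i], positions[j]) <= radius:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def is_single_cluster(subset, adj):
    """True if every channel in `subset` is reachable from every other one
    by stepping only through neighboring channels inside the subset."""
    subset = set(subset)
    start = next(iter(subset))
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for nb in adj[node] & subset:
            if nb not in seen:
                seen.add(nb)
                stack.append(nb)
    return seen == subset


def draw_constrained_subset(positions, fraction=0.25, radius=1.5,
                            rng=None, max_tries=1000):
    """Draw a random channel subset, rejecting draws that form one cluster."""
    if rng is None:
        rng = random.Random()
    n = len(positions)
    k = max(1, round(fraction * n))  # e.g. 29 channels -> 7
    adj = neighbor_graph(positions, radius)
    for _ in range(max_tries):
        subset = rng.sample(range(n), k)
        # A one-channel subset is trivially a single cluster; accept it.
        if k == 1 or not is_single_cluster(subset, adj):
            return sorted(subset)
    raise RuntimeError("no acceptable subset found; relax the constraint")


if __name__ == "__main__":
    # Hypothetical layout: 29 channels on a regular 2-D grid.
    grid = [(x, y) for x in range(6) for y in range(5)][:29]
    print(draw_constrained_subset(grid, rng=random.Random(0)))
```

Rejecting only fully connected draws is the weakest form of the constraint;
a stricter variant could also cap the size of the largest cluster, which
may matter more for low channel counts.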
> > Dear Cristina, Daniele, and Arno (cc Hyeonseok),
> >
> > This is a follow-up study. Hyeonseok and I ran a test using empirical
> > datasets. See the summary below.
> >
> > https://sccn.ucsd.edu/wiki/Makoto%27s_preprocessing_pipeline#Channel_rejection_using_RANSAC_in_clean_rawdata.28.29_.2803.2F21.2F2022_added.29
> >
> > Our results did NOT show that increasing 'NumSamples' produces more stable
> > results, given that rng() is NOT fixed. We wish it did!
> > This warrants further investigation.
> >
> > Makoto
> >
> > On Thu, Mar 17, 2022 at 10:57 AM Makoto Miyakoshi <mmiyakoshi at ucsd.edu>
> > wrote:
> >
> >> Dear Cristina,
> >>
> >> Wow, this is such a perfect summary report. I deeply appreciate that you
> >> took so much time and care to make this happen.
> >> You are the best part of the EEGLAB mailing list. Thank you, thank you,
> >> thank you!
> >>
> >>> Second, I would prefer not to discard the RANSAC method to detect bad
> >>> channels if I find a stable solution. I believe that the RANSAC method is
> >>> the core for detecting bad channels in the clean_rawdata function.
> >>
> >> I appreciate you mentioning that. I'll tell you why.
> >> In the early 2010s, when Christian, the developer of ASR, was working on
> >> the offline version of clean_rawdata() at my request, he gave me a
> >> solution, then told me that he wanted to add one more thing as an
> >> update. Within a few days, this RANSAC part was implemented. So this
> >> RANSAC part was one of the final touch-ups he specifically wanted to
> >> implement.
> >>
> >> So I agree, I'd love to use his bad-channel rejection. Your confirmation
> >> is so valuable for me: increasing 'NumSamples' to 1000, for example,
> >> can make the algorithm's behavior more stable. I'll make it my default
> >> and use the channel rejection function again. I still would not use 0.8
> >> for the correlation criterion though; I'd use 0.6-0.7. Christian did
> >> recommend higher values. But the problem with channel rejection is that
> >> a short segment of high-amplitude data always biases the selection. I
> >> quickly checked the code of clean_channels(), and the current process is
> >> not robust against short, high-amplitude bursts.
> >>
> >> https://github.com/sccn/clean_rawdata/blob/master/clean_channels.m
> >>
> >> It seems possible to address this issue. I'll discuss it with
> >> colleagues.
> >>
> >> By the way, I have an update for you ASR enthusiasts which you may be
> >> interested in. Let me forward my recent post to the list below.
> >>
> >> %%%%%%%%%%
> >> Relatedly, Hyeonseok and I have been working on a mod for the
> >> calibration stage of ASR to process our Juggling data collected by
> >> Hiroyuki. We will present the idea at the MoBI meeting 2022.
> >>
> >> https://sites.google.com/ucsd.edu/mobi2022/
> >>
> >> The idea is to use single-frame order statistics across electrodes
> >> rather than the default sliding window for selecting the calibration
> >> data. This way, we can obtain more 'clean' data points without letting
> >> high-amplitude artifacts into the calibration data (there is a default
> >> tolerance value; that is, the default setting allows a small number of
> >> outliers to sneak into the calibration data, up to 7.5% of electrodes;
> >> the proposed method uses 0%). The proposed method makes subsequent PC
> >> distributions more Gaussian, which fits the assumption of ASR. Also, the
> >> proposed method seems to be able to explain, at least partially, why the
> >> conventional, empirically recommended values for the cutoff SD are
> >> unusually high, such as SD == 20. We will show both simulation and
> >> empirical results. Check out the MoBI 2022 conference!
> >> %%%%%%%%%%
> >>
> >> Makoto
> >>
> >>
> >>
> >> On Mon, Mar 14, 2022 at 8:55 AM Gil Avila, Cristina <cristina.gil at tum.de>
> >> wrote:
> >>
> >>> Thank you all for your input.
> >>>
> >>>
> >>>
> >>> First, I have noticed that the set of bad channels differs only
> >>> when I restart EEGLAB (please see the code below; I run the EEGLAB
> >>> command inside the loop over repetitions). Otherwise results are stable
> >>> (@Arno Could this explain why it passed all the tests?).
> >>>
> >>> Second, I would prefer not to discard the RANSAC method to detect bad
> >>> channels if I find a stable solution. I believe that the RANSAC method is
> >>> the core for detecting bad channels in the clean_rawdata function. The
> >>> two other options (cleaning channels based on flat lines and on
> >>> high-frequency activity) seem to me more a preliminary step to the
> >>> RANSAC. Therefore I have tested:
> >>>
> >>>     1. How the ‘ChannelCriterion’ parameter influences the selected bad
> >>>     channels. I have tried the values 0.7, 0.8 (default) and 0.9. The
> >>>     higher the value, the less reproducible the result. This is not a
> >>>     surprise given the definition of the ChannelCriterion parameter: ‘if
> >>>     a channel is correlated at less than this value to an estimate based
> >>>     on other channels it is considered abnormal in the given time
> >>>     window’. Still, even with a lax correlation threshold (0.7) I don’t
> >>>     get reproducible results.
> >>>     2. How the high-pass bandwidth influences the selected bad channels.
> >>>     I have tried a highpass with bandwidth [1 1.5] instead of the
> >>>     default [0.25 0.75], with the ‘ChannelCriterion’ parameter fixed at
> >>>     0.8. This does not seem to increase the reproducibility.
> >>>     3. How the ‘NumSamples’ RANSAC parameter of clean_artifacts()
> >>>     influences the selected bad channels. I have tried 50 (default),
> >>>     100, 500 and 1000 samples with ‘ChannelCriterion’ fixed at 0.8.
> >>>     Increasing this parameter to 1000 makes the output more reliable at
> >>>     the cost of more computation time (~1.5 min per recording).
> >>>
> >>>
> >>>
> >>> Brief comment regarding my data: I am working with eyes-closed
> >>> resting-state recordings, 29 channels, 5 min in duration, sampled at
> >>> 500 Hz (~150000 samples).
> >>>
> >>> For each case I have run 10 repetitions. Along with the code you can
> >>> also find figures of all test cases. The figures show how often each
> >>> channel was marked bad in each recording.
> >>>
> >>> For reproducibility I attach my code and the small dataset I am using.
> >>> I am using the most recent versions of EEGLAB and clean_rawdata from
> >>> GitHub.
> >>>
> >>> Code: https://github.com/crisglav/replication_clean_rawdata/
> >>>
> >>> Dataset: https://syncandshare.lrz.de/getlink/fiX7VwVdbGEsMTf46kqrcvx3/rawBIDS
> >>>
> >>> Note: to test 3) I had to change the clean_artifacts code and add at
> >>> line 186
> >>>
> >>> {'num_samples','NumSamples'}, 50, ... % line 186
> >>>
> >>> and replace line 232 with
> >>>
> >>> [EEG,removed_channels] = clean_channels(EEG,chancorr_crit,line_crit,[],channel_crit_maxbad_time,num_samples);
> >>>
> >>> --
> >>>
> >>> Cristina Gil Ávila – PhD candidate
> >>>
> >>> Department of Neurology
> >>>
> >>> Technische Universität München
> >>>
> >>> Munich, Germany
> >>>
> >>> cristina.gil at tum.de
> >>>
> >>> painlabmunich.de <https://www.painlabmunich.de/>
> >>>
> >>>
> >>>
>
> --
> Dr Cyril Pernet, PhD, OHBM fellow, SSI fellow
> Neurobiology Research Unit
> Copenhagen University Hospital, Rigshospitalet
> Building 8057, Blegdamsvej 9
> DK-2100 Copenhagen, Denmark
>
> wamcyril at gmail.com
>
> https://cpernet.github.io/
>
> https://orcid.org/0000-0003-4010-4632
>
>


