[Eeglablist] Inconsistent results using clean_artifacts

Makoto Miyakoshi mmiyakoshi at ucsd.edu
Thu Mar 31 18:39:17 PDT 2022


Dear Velu,

> Did you use the correlation threshold of 0.8 or something lower?

Yes, we used 0.8.

Our team has a plan to investigate this issue systematically but we can't
do it now. I'll update you when we get back to this issue.

Makoto

On Tue, Mar 22, 2022 at 11:34 AM Velu Prabhakar Kumaravel <
velu.kumaravel at unitn.it> wrote:

> Thanks, Makoto. Did you use the correlation threshold of 0.8 or something
> lower? From Cristina's report, this parameter affects the stability of the
> results, as well.
> Also, I second Cyril's point. Each resampling with Cristina's data
> consists of 7 channels, while your data would pick around 50 channels, and
> as such *numSamples* has no effect.
>
> It would be cool to have a technical report for ASR calibration!
>
> Best,
>
> Velu Prabhakar Kumaravel, PhD Student
> Center for Mind/Brain Sciences,
> University of Trento, Italy
>
>
> On Tue, 22 Mar 2022 at 17:33, Makoto Miyakoshi via eeglablist <
> eeglablist at sccn.ucsd.edu> wrote:
>
>> Dear Cyril,
>>
>> Thank you for the detailed suggestion.
>>
>> From the viewpoint of project management, Hyeonseok and I are now writing
>> a technical paper about ASR's calibration stage. Our initial hope was to
>> quickly try Cristina's solution and, if it worked, report it as a
>> recommended good practice in the same paper. However, yesterday we decided
>> to investigate it as an independent issue, separate from the calibration
>> process.
>>
>> Hyeonseok and I may be able to write a brief dedicated paper to formally
>> investigate and address this issue if we can convince John. If you know a
>> good journal, please let us know.
>>
>> Makoto
>>
>> On Tue, Mar 22, 2022 at 1:44 AM Dr Cyril Pernet <wamcyril at gmail.com>
>> wrote:
>>
>> > Hi Makoto,
>> >
>> > my understanding of the issue is different - I believe this comes down
>> > to the uncontrolled random sampling of channels.
>> >
>> > In
>> > https://github.com/sccn/clean_rawdata/blob/master/clean_channels.m
>> > the size of the channel subsets to use for robust reconstruction, as a
>> > fraction of the total number of channels, is set at 0.25 -- this means
>> > the function computes N samples with 25% of the channels at a time
>> > being interpolated and then computes correlations between them (to get
>> > consensus between N draws).
>> >
>> > Cristina had 29 channels - because the random sampling is
>> > unconstrained, it could pick 7 channels next to each other, leading to
>> > wild interpolations, which I think can explain why she needs such a
>> > high number of resamples.
>> >
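To make the numbers in the paragraph above concrete, here is a small Python sketch of the subset-size arithmetic. It assumes the subset size is simply 0.25 times the channel count, rounded to the nearest integer; the exact rounding used in clean_channels.m is not confirmed here, and the function name `ransac_subset_size` is made up for illustration.

```python
def ransac_subset_size(n_channels: int, fraction: float = 0.25) -> int:
    """Channels drawn per RANSAC sample (assumed: fraction * count, rounded)."""
    return round(fraction * n_channels)


# Channel counts mentioned in the thread: Cristina's 29 and Makoto's 205.
for n in (29, 205):
    print(n, "channels ->", ransac_subset_size(n), "channels per draw")
```

Under that assumption, 29 channels give 7 channels per draw and 205 channels give about 50, which matches the figures quoted in the thread.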
>> > I haven't found the time to test it, but I'm thinking that changes in
>> >
>> > https://github.com/sccn/clean_rawdata/blob/master/clean_channels.m#L177
>> >
>> > could help. For instance, for each resample, meshgrid the channel
>> > locations and remove the 25% selected; if there is only one hole (i.e.,
>> > all selected channels are neighbors), then take another draw. With your
>> > 205 channels, this issue 'never' occurs and thus 50 or 1000 make little
>> > difference. This can be tested first by downsampling the number of
>> > channels - if you see a change such that increasing N is more stable,
>> > then it reproduces Cristina's error, and the proposed constraint on the
>> > channel sampling might be a solution (and maybe there is a trade-off
>> > between channel number and number of resamples to find, even when
>> > constraining the channel sampling).
>> >
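One possible reading of the constraint suggested above, sketched in Python rather than as a patch to clean_channels.m. Everything here is an assumption made for illustration: the toy 6 x 5 grid layout, the neighbor rule (grid coordinates differing by at most 1), and the helper names `is_single_clump` and `draw_constrained_subset`. A draw is rejected and repeated when all selected channels form one connected patch.

```python
import random
from collections import deque


def is_single_clump(subset, positions):
    """True if the chosen channels form one connected patch of neighbors.

    Two channels count as neighbors when their grid coordinates differ by at
    most 1 in each direction (an assumption for this toy layout).
    """
    chosen = list(subset)
    seen = {chosen[0]}
    queue = deque([chosen[0]])
    while queue:
        cx, cy = positions[queue.popleft()]
        for other in chosen:
            ox, oy = positions[other]
            if other not in seen and abs(cx - ox) <= 1 and abs(cy - oy) <= 1:
                seen.add(other)
                queue.append(other)
    return len(seen) == len(chosen)


def draw_constrained_subset(positions, k, rng, max_tries=1000):
    """Draw k channels at random, redrawing while they form a single clump."""
    channels = sorted(positions)
    for _ in range(max_tries):
        subset = rng.sample(channels, k)
        if not is_single_clump(subset, positions):
            return subset
    raise RuntimeError("no acceptable subset found within max_tries")


# Hypothetical 6 x 5 grid standing in for a ~29-channel montage.
positions = {f"ch{i}": (i % 6, i // 6) for i in range(30)}
rng = random.Random(0)
subset = draw_constrained_subset(positions, k=7, rng=rng)
print(subset)
```

With real data the neighbor rule would have to come from the actual electrode locations, and whether rejecting clumped draws stabilizes the result is exactly the open question raised in this thread.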
>> > cyril
>> >
>> >
>> > > Dear Cristina, Daniele, and Arno (cc Hyeonseok),
>> > >
>> > > This is a follow-up study. Hyeonseok and I ran a test using empirical
>> > > datasets. See the summary below.
>> > >
>> > > https://sccn.ucsd.edu/wiki/Makoto%27s_preprocessing_pipeline#Channel_rejection_using_RANSAC_in_clean_rawdata.28.29_.2803.2F21.2F2022_added.29
>> > >
>> > > Our results did NOT show that increasing 'NumSamples' produces more stable
>> > > results when rng() is NOT fixed. We wish it did!
>> > > This warrants further investigation.
>> > >
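As a generic illustration of what "rng() is NOT fixed" means for random channel subsets, the Python sketch below draws subsets with and without a fixed seed. It says nothing about how clean_channels itself handles its random stream; the function name `draw_subsets` and the seed value are assumptions for the example.

```python
import random


def draw_subsets(n_channels, k, n_samples, seed=None):
    """Draw n_samples random channel subsets of size k.

    seed=None leaves the generator unseeded, so each call differs.
    """
    rng = random.Random(seed)
    return [sorted(rng.sample(range(n_channels), k)) for _ in range(n_samples)]


fixed_a = draw_subsets(29, 7, 50, seed=42)
fixed_b = draw_subsets(29, 7, 50, seed=42)
loose_a = draw_subsets(29, 7, 50)
loose_b = draw_subsets(29, 7, 50)

print("same seed, same subsets:", fixed_a == fixed_b)
print("no seed, same subsets:", loose_a == loose_b)
```

With a fixed seed the two runs use identical subsets, so any downstream decision is repeatable; without one, the subsets differ from run to run and the stability of the outcome depends on how sensitive the method is to the particular draws.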
>> > > Makoto
>> > >
>> > > On Thu, Mar 17, 2022 at 10:57 AM Makoto Miyakoshi <mmiyakoshi at ucsd.edu>
>> > > wrote:
>> > >
>> > >> Dear Cristina,
>> > >>
>> > >> Wow, this is such a perfect summary report. I deeply appreciate you took
>> > >> so much time and care to make this happen.
>> > >> You are the best part of the EEGLAB mailing list. Thank you, thank you,
>> > >> thank you!
>> > >>
>> > >>> Second, I would prefer not to discard the RANSAC method to detect bad
>> > >>> channels if I find a stable solution. I believe that the RANSAC method is
>> > >>> the core for detecting bad channels in the clean_rawdata function.
>> > >>
>> > >> I appreciate you mentioning that. I'll tell you why.
>> > >> In the early 2010s, when Christian, the developer of ASR, was working on
>> > >> the offline version of clean_rawdata() upon my request, he gave me a
>> > >> solution once, then told me that he wanted to add one more thing as an
>> > >> update. Within a few days, this RANSAC part was implemented. So this RANSAC
>> > >> part was one of the final touch-ups he specifically wanted to implement.
>> > >>
>> > >> So I agree, I'd love to use his bad-channel rejection. Your confirmation
>> > >> is so valuable for me--increasing the 'NumSamples' to 1000, for example,
>> > >> can make the algorithm's behavior more stable. I'll make it my default and
>> > >> use the channel rejection function again. I still would not use 0.8 for the
>> > >> correlation criterion though, I'd use 0.6-0.7. Christian did recommend
>> > >> higher values. But the problem with channel rejection is that a short segment
>> > >> of high-amplitude data always biases the selection. I quickly checked the
>> > >> code of clean_channels(), and the current process is not robustified
>> > >> against short, high-amplitude bursts.
>> > >>
>> > >> https://github.com/sccn/clean_rawdata/blob/master/clean_channels.m
>> > >>
>> > >> It seems possible to address this issue. I'll discuss it with colleagues.
>> > >>
>> > >> By the way, I have an update for you ASR enthusiasts which you may be
>> > >> interested in. Let me forward my recent post to the list below.
>> > >>
>> > >> %%%%%%%%%%
>> > >> Relatedly, Hyeonseok and I have been working on a mod for the calibration
>> > >> stage of ASR to process our Juggling data collected by Hiroyuki.
>> > >> We will present the idea at the MoBI meeting 2022.
>> > >>
>> > >> https://sites.google.com/ucsd.edu/mobi2022/
>> > >>
>> > >> The idea is to use single-frame order statistics across electrodes rather
>> > >> than the default sliding window for selecting the calibration data. This
>> > >> way, we can obtain more 'clean' data points without letting high-amplitude
>> > >> artifacts into the calibration data. (There is a default tolerance
>> > >> value: the default setting allows a small number of outliers to sneak
>> > >> into the calibration data, up to 7.5% of electrodes; the proposed
>> > >> method uses 0%.) The proposed method makes subsequent PC distributions more
>> > >> Gaussian, which fits the assumption of ASR. Also, the proposed method seems
>> > >> to be able to explain, at least partially, why the conventional
>> > >> empirically recommended values for the cutoff SD are unusually
>> > >> high, such as SD == 20. We will show both simulation and empirical results.
>> > >> Check out the MoBI 2022 conference!
>> > >> %%%%%%%%%%
>> > >>
>> > >> Makoto
>> > >>
>> > >>
>> > >>
>> > >> On Mon, Mar 14, 2022 at 8:55 AM Gil Avila, Cristina <cristina.gil at tum.de>
>> > >> wrote:
>> > >>
>> > >>> Thank you all for your input.
>> > >>>
>> > >>>
>> > >>>
>> > >>> First, I have noticed that the set of bad channels only changes
>> > >>> when I restart EEGLAB (please see the code below; I run the EEGLAB
>> > >>> command inside the loop over repetitions). Otherwise the results are
>> > >>> stable (@Arno Could this explain why it passed all the tests?).
>> > >>>
>> > >>>
>> > >>>
>> > >>> Second, I would prefer not to discard the RANSAC method to detect bad
>> > >>> channels if I find a stable solution. I believe that the RANSAC method is
>> > >>> the core for detecting bad channels in the clean_rawdata function. The two
>> > >>> other options (clean channels based on flat line and on the high frequency
>> > >>> activity) seem to me more a preliminary step to the RANSAC. Therefore I
>> > >>> have tested:
>> > >>>
>> > >>>     1. How the ‘ChannelCriterion’ parameter influences the selected bad
>> > >>>     channels. I have tried the values 0.7, 0.8 (default) and 0.9. The
>> > >>>     higher the value, the less reproducible the result. This was no
>> > >>>     surprise given the definition of the ChannelCriterion parameter: ‘if a
>> > >>>     channel is correlated at less than this value to an estimate based on
>> > >>>     other channels it is considered abnormal in the given time window’.
>> > >>>     Still, even with a lax correlation threshold (0.7) I don’t get
>> > >>>     reproducible results.
>> > >>>     2. How the high-pass bandwidth influences the selected bad channels.
>> > >>>     I have tried a high-pass with bandwidth [1 1.5] instead of the default
>> > >>>     [0.25 0.75], with the ‘ChannelCriterion’ parameter fixed at 0.8. This
>> > >>>     does not seem to increase the reproducibility.
>> > >>>     3. How the ‘NumSamples’ RANSAC parameter of clean_artifacts()
>> > >>>     influences the selected bad channels. I have tried 50 (default), 100,
>> > >>>     500 and 1000 samples with ‘ChannelCriterion’ fixed at 0.8. Increasing
>> > >>>     this parameter to 1000 makes the output more reliable at the cost of
>> > >>>     more computation time (~1.5 min per recording).
>> > >>>
>> > >>>
>> > >>>
>> > >>> Brief comment regarding my data: I am working with eyes-closed
>> > >>> resting-state recordings, 29 channels, 5 min long, sampled at 500 Hz
>> > >>> (~150000 samples).
>> > >>>
>> > >>> For each case I have run 10 repetitions. Along with the code you can also
>> > >>> find figures for all test cases. The figures show how often each
>> > >>> channel was marked bad in each recording.
>> > >>>
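The per-channel summary described above (how often each channel was marked bad across repetitions) can be sketched as follows. This is a Python illustration, not the code from the linked repository; the function name `bad_channel_frequency` and the example channel labels are invented, and the three runs are made-up data.

```python
from collections import Counter


def bad_channel_frequency(runs):
    """Fraction of repetitions in which each channel was flagged as bad."""
    counts = Counter(ch for run in runs for ch in set(run))
    return {ch: counts[ch] / len(runs) for ch in sorted(counts)}


# Made-up output of three repetitions on one recording (not real data).
runs = [["Fp1", "T7"], ["Fp1"], ["Fp1", "O2"]]
freq = bad_channel_frequency(runs)
print(freq)
```

A channel with a frequency of 1.0 is rejected in every repetition; values strictly between 0 and 1 indicate the run-to-run instability discussed in this thread.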
>> > >>>
>> > >>>
>> > >>> For reproducibility I attach my code and the small dataset I am using. I
>> > >>> am using the most recent versions of EEGLAB and clean_rawdata from GitHub.
>> > >>>
>> > >>> Code: https://github.com/crisglav/replication_clean_rawdata/
>> > >>>
>> > >>> Dataset:
>> > >>> https://syncandshare.lrz.de/getlink/fiX7VwVdbGEsMTf46kqrcvx3/rawBIDS
>> > >>>
>> > >>>
>> > >>> Note: to test 3) I had to change the clean_artifacts code, adding at
>> > >>> line 186
>> > >>>
>> > >>> {'num_samples','NumSamples'}, 50, ... % line 186
>> > >>>
>> > >>> and replacing line 232 with
>> > >>>
>> > >>> [EEG,removed_channels] = clean_channels(EEG,chancorr_crit,line_crit,[],channel_crit_maxbad_time,num_samples);
>> > >>>
>> > >>> --
>> > >>> Cristina Gil Ávila – PhD candidate
>> > >>> Department of Neurology
>> > >>> Technische Universität München
>> > >>> Munich, Germany
>> > >>> cristina.gil at tum.de
>> > >>> painlabmunich.de
>> > >>>
>> > > _______________________________________________
>> > > Eeglablist page: http://sccn.ucsd.edu/eeglab/eeglabmail.html
>> > > To unsubscribe, send an empty email to
>> > eeglablist-unsubscribe at sccn.ucsd.edu
>> > > For digest mode, send an email with the subject "set digest mime" to
>> > eeglablist-request at sccn.ucsd.edu
>> >
>> > --
>> > Dr Cyril Pernet, PhD, OHBM fellow, SSI fellow
>> > Neurobiology Research Unit
>> > Copenhagen University Hospital, Rigshospitalet
>> > Building 8057, Blegdamsvej 9
>> > DK-2100 Copenhagen, Denmark
>> >
>> > wamcyril at gmail.com
>> > https://cpernet.github.io/
>> > https://orcid.org/0000-0003-4010-4632
>> >
>
>


