[Eeglablist] SIFT resampling surrogate distributions with 1 trial

Winslow Strong winslow.strong at gmail.com
Thu Aug 25 17:59:33 PDT 2016


Thanks for that reference, Tim, and the suggestions.

Since I'm only interested in within-condition means of SIFT metrics
compared across conditions, I made SIFT's time windows non-overlapping, so
at least that blatant source of autocorrelation is avoided.  Of course,
there remains the autocorrelation from the EEG itself.

Leaving gaps between windows, with gap sizes dictated by the
autocorrelation function, should get closer to quasi-independence and
improve on the adjacent non-overlapping window scheme.
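
A minimal sketch of that scheme (all names hypothetical; autocorr needs
the Econometrics Toolbox), where C holds one connectivity value per
adjacent non-overlapping window:

    [acf, lags, bounds] = autocorr(C, 20);             % ACF out to 20 lags (arbitrary)
    K = find(abs(acf(2:end)) > bounds(1), 1, 'last');  % last lag outside the bounds
    if isempty(K), K = 0; end                          % no significant autocorrelation
    Csub = C(1:K+1:end);                               % quasi-independent subsample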

Even though SIFT's connectivity metrics won't be Gaussian, if we have
enough quasi-independent samples and are only interested in their sample
mean, the CLT should still apply.  Good advice on the log transform,
though: anything that makes the distribution more Gaussian should help.
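
A minimal sketch of that comparison (hypothetical names; CsubA and CsubB
are the quasi-independent window values from two conditions, as above):

    % unpaired t-test on the logs; note this compares means on the log
    % scale, i.e. geometric means of the raw connectivity values
    [h, p] = ttest2(log(CsubA), log(CsubB));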

Thanks Tim!

On Thu, Aug 25, 2016 at 5:19 PM, Tim Mullen <mullen.tim at gmail.com> wrote:

> PS: in Matlab 2016 you can just use the autocorr() function for an
> autocorrelation plot with confidence intervals.
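>
> A hypothetical call (autocorr ships with the Econometrics Toolbox; y is
> the series, 20 lags an arbitrary choice):
>
>     autocorr(y, 20);                        % no outputs: plots ACF with bounds
>     [acf, lags, bounds] = autocorr(y, 20);  % outputs instead of a plot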
>
>
>
>
> On Thu, Aug 25, 2016 5:13 PM, Tim Mullen mullen.tim at gmail.com wrote:
>
>> Ok. Yes, as you've surmised, the key trickiness in this sort of
>> statistical problem is that you need to consider the autocorrelation in
>> each time-series (and they are definitely autocorrelated here, since the
>> causal estimates are obtained from a sliding window). Probably the worst
>> thing you can do is perform a standard unpaired t-test, which has a hugely
>> inflated Type I error rate if samples are significantly autocorrelated.
>>
>> Perhaps you could try a method like this:
>>
>> Performing T-tests to Compare Autocorrelated Time Series Data Collected
>> from Direct-Reading Instruments. J Occup Environ Hyg. 2015;12(11):743-52.
>> doi: 10.1080/15459624.2015.1044603.
>> <http://www.ncbi.nlm.nih.gov/pubmed/26011524>
>>
>>
>> Alternately, one possibility might be to compute the autocorrelation
>> function for each time-series and, if it decays to non-significant
>> amplitude after K lags, then just select every (K+1)-th sample for
>> subsequent analysis. The serial correlation should be minimal at that
>> point (you could run a Durbin-Watson test to confirm). Here is an example
>> <http://www.itl.nist.gov/div898/handbook/eda/section3/autocopl.htm> of
>> how to use an autocorrelation plot for this. One potential issue here
>> (there's always one) is that the analytic confidence bounds for the null
>> hypothesis of zero autocorrelation generally rely on a Gaussianity
>> assumption on the data. The Granger-causal estimates are definitely not
>> Gaussian (probably closer to gamma-distributed). You could try
>> log-transforming them to render them more Gaussian -- a monotonic
>> transform, so rank-based statistics are unaffected (though note that a
>> t-test on the logs then compares geometric rather than arithmetic means).
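>>
>> A sketch of the subsample-then-check step above (hypothetical names;
>> dwtest ships with the Statistics Toolbox and expects regression residuals
>> plus a design matrix, so the subsampled series is regressed on a constant
>> here):
>>
>>     ySub = y(1:K+1:end);           % keep every (K+1)-th sample
>>     r = ySub(:) - mean(ySub);      % residuals of an intercept-only fit
>>     X = ones(numel(ySub), 1);      % design matrix: constant term only
>>     p = dwtest(r, X);              % small p => serial correlation remains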
>>
>> There are other more complex approaches involving fitting ARMA models
>> (and probably some more simple ones I'm not considering at the moment).
>>
>> Tim
>>
>>
>>
>> On Tue, Aug 23, 2016 11:38 PM, Winslow Strong winslow.strong at gmail.com
>> wrote:
>>
>> Only in a difference in means over the entire condition.
>>
>> On Tue, Aug 23, 2016 at 11:00 PM, Tim Mullen <mullen.tim at gmail.com>
>> wrote:
>>
>> Yes, skipping one or more pseudotrials may at least mitigate some of the
>> autocorrelation effects. Are you only interested in whether there is a
>> difference in means over the whole condition, or whether there are
>> differences at specific points in time?
>>
>> On Tue, Aug 23, 2016 at 4:17 PM, Winslow Strong <winslow.strong at gmail.com
>> > wrote:
>>
>> Hi Tim,
>>
>> Yes, I was searching for some approximate test stats and p-vals generated
>> by creating pseudotrials within each trial.  I'll try this out.  I'm
>> thinking it might be wise to leave a gap between the pseudotrials (i.e.
>> not make them contiguous EEG segments) to make them closer to independent.
>> Leaving out every other pseudotrial might be a reasonable tradeoff.  One
>> could get two test stats, or just two sample variances, this way: one from
>> the even pseudotrials and one from the odd ones, as sketched below.
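>>
>> Something like this (hypothetical names), with windowVals holding one
>> value per pseudotrial in temporal order:
>>
>>     oddVals  = windowVals(1:2:end);      % pseudotrials 1, 3, 5, ...
>>     evenVals = windowVals(2:2:end);      % pseudotrials 2, 4, 6, ...
>>     v = [var(oddVals), var(evenVals)];   % one sample variance per subset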
>>
>> This is a bit hacky though, and I wonder if there are canonical methods
>> to deal with the lack of independence.
>>
>> On Mon, Aug 22, 2016 at 12:59 PM, Tim Mullen <mullen.tim at gmail.com>
>> wrote:
>>
>> Winslow, Makoto,
>>
>> As a statistical principle, bootstrapping can only be used when you have
>> multiple independent and identically distributed (i.i.d.) observations
>> available. The observations are resampled with replacement from the
>> original set to construct an empirical probability distribution.
>>
>> It is not possible to use bootstrapping to test for statistical
>> differences between only two observations (i.e. two trials). In general,
>> with any test, your statistical power will be extremely low if you have
>> only one observation per condition.
>>
>> You can try to mitigate this by segmenting your long continuous trials
>> into short 'pseudo-trials' and then testing for differences in the
>> pseudo-trials between conditions. Make sure that you average your causal
>> measure over time within each trial before computing your stats. One
>> concern is that the pseudo-trials may be far from i.i.d. within a
>> condition, so if you use the bootstrap, the bootstrap distribution may not
>> converge to the true distribution of the estimator and your stats will be
>> biased.
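>>
>> A rough sketch of that segmentation (hypothetical names; conn is a
>> [channels x time] causal measure for one long trial, winLen the number of
>> samples per pseudo-trial):
>>
>>     nWin = floor(size(conn, 2) / winLen);
>>     trialMeans = zeros(1, nWin);
>>     for w = 1:nWin
>>         seg = conn(:, (w-1)*winLen + (1:winLen));  % one pseudo-trial
>>         trialMeans(w) = mean(seg(:));              % average over time first
>>     end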
>>
>> Depending on your specific null hypothesis and protocol, however, there
>> may be alternative parametric and nonparametric tests you can apply.
>>
>> Otherwise, try to collect data from more subjects (then you can simply
>> bootstrap across subjects, e.g. using statcond with SIFT matrices) or more
>> trials (run your experiment more than once per condition).
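>>
>> With multiple subjects, a generic version of that bootstrap (hypothetical
>> names, using bootstrp from the Statistics Toolbox; statcond's own options
>> are described in the EEGLAB documentation):
>>
>>     d = condA - condB;                     % one paired difference per subject
>>     bootMeans = bootstrp(2000, @mean, d);  % resample subjects with replacement
>>     p = 2 * min(mean(bootMeans <= 0), ...
>>                 mean(bootMeans >= 0));     % crude two-sided bootstrap p-value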
>>
>> Tim
>>
>> On Aug 18, 2016 11:09 AM, "Makoto Miyakoshi" <mmiyakoshi at ucsd.edu> wrote:
>>
>> Dear Winslow,
>>
>> Yes, unfortunately the bootstrap seems to be designed to work across trials.
>>
>> Makoto
>>
>> On Sat, Aug 13, 2016 at 4:57 PM, Winslow Strong <winslow.strong at gmail.com
>> > wrote:
>>
>> I'd like to use a resampling technique (e.g. bootstrap) to get p-values
>> and test stats for SIFT connectivity metrics for 1 subject across n
>> conditions.
>>
>> This is a steady-state condition study, hence there's only 1 trial per
>> condition.  I'm trying to analyze whether certain connectivity metrics
>> (i.e. their averages over a condition) are statistically significantly
>> different across the conditions.  I was under the impression I could use
>> SIFT's surrogate distribution generator to obtain the surrogate
>> distribution for these calculations, but when I run that from the GUI for
>> bootstrap, I get the error:
>>
>> "Unable to compute bootstrap distributions for a single trial"
>>
>> Is this surrogate function only designed to do bootstrapping over trials?
>> Or is there a way to do it over windows within a condition?
>>
>> _______________________________________________
>> Eeglablist page: http://sccn.ucsd.edu/eeglab/eeglabmail.html
>> To unsubscribe, send an empty email to eeglablist-unsubscribe at
>> sccn.ucsd.edu
>> For digest mode, send an email with the subject "set digest mime" to
>> eeglablist-request at sccn.ucsd.edu
>>
>>
>>
>>
>> --
>> Makoto Miyakoshi
>> Swartz Center for Computational Neuroscience
>> Institute for Neural Computation, University of California San Diego
>>

