[Eeglablist] Nonparametric test for statistical significant difference between two ERPs?
Tim Mullen
mullen.tim at gmail.com
Wed Aug 31 17:21:50 PDT 2011
Hi Aleksandra,
Yes, a parametric t-test should not be used if the distribution of the data
is not gaussian.
You can also use pop_signalstat() to plot statistics for a channel or
component. This will perform, among other things, a kstest for
gaussianity. More generally, you can test for univariate normality of any
random vector using kstest(), jbtest(), or lillietest() from the Matlab
statistics toolbox. These each have slightly different null hypotheses, so
check the doc file for info and to determine the test most appropriate for
your data. Testing for multivariate normality is trickier, but there are
user-contributed functions available for this in the Matlab FEX (e.g.
Henze-Zirkler's Multivariate Normality Test (HZmvntest.m)).
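As a rough illustration of what a univariate normality test computes, here is a minimal sketch of the Jarque-Bera statistic (the statistic behind jbtest()), written in plain Python rather than MATLAB so that it needs no toolbox. The function name jarque_bera and the simulated data are purely illustrative, and the p-value uses the asymptotic chi-square approximation, so MATLAB's own p-values may differ, particularly for small samples.

```python
import math
import random

def jarque_bera(x):
    """Return (JB statistic, asymptotic p-value) for a sequence of numbers."""
    n = len(x)
    mean = sum(x) / n
    # Central moments (biased estimates, as in the classical JB definition).
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    skew = m3 / m2 ** 1.5
    excess_kurt = m4 / m2 ** 2 - 3.0
    jb = n / 6.0 * (skew ** 2 + excess_kurt ** 2 / 4.0)
    # Under the null hypothesis of normality, JB is asymptotically chi-square
    # with 2 degrees of freedom, whose survival function is exp(-jb / 2).
    return jb, math.exp(-jb / 2.0)

# Simulated data (an assumption for illustration, not real EEG).
rng = random.Random(0)
gaussian_like = [rng.gauss(0.0, 1.0) for _ in range(2000)]
skewed = [rng.expovariate(1.0) for _ in range(2000)]

jb_g, p_g = jarque_bera(gaussian_like)
jb_s, p_s = jarque_bera(skewed)
print("gaussian sample: JB = %.2f, p = %.3g" % (jb_g, p_g))
print("skewed sample:   JB = %.2f, p = %.3g" % (jb_s, p_s))
```

A small p-value is evidence against normality; a large one only means the test found no evidence of non-normality at this sample size.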
Scalp EEG data is not necessarily gaussian distributed (and ICA-derived
component activations are almost certainly not, given ICA's assumption that
all but one IC are non-gaussian). Likewise ERP or ERSP samples may not be
normally distributed. However, there is a caveat: According to the Central
Limit Theorem, as the number of independent trials (i.e. the "ensemble
size") you average over to produce the ERP or ERSP estimator approaches
infinity the distribution of the estimator itself will approach that of a
gaussian. This means that if you have produced your ERP or ERSP by averaging
over a very large number of trials, the distribution of that mean-estimator
may well be gaussian. Unfortunately, determining a priori the number of
samples required to ensure gaussian convergence is not trivial (although, as
I understand it, the gaussian convergence for a finite number of samples is
often only valid near the peak of the normal distribution, not out near the
tails). In general, it is safe to say that if the number of trials being
averaged over is relatively small (as in most EEG experiments), then the
distribution will not have converged to gaussian. And, unfortunately, for a
single subject you have only a single observation of the ensemble-averaged
ERP or ERSP (at a given time/frequency, and channel) and therefore cannot
directly assess the normality of the distribution of the ensemble average --
that is, unless you use a resampling method to obtain an empirical estimate
of the distribution, as discussed below.
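To make the convergence argument concrete, here is a small simulation sketch in plain Python. It assumes single-trial values drawn from a skewed (exponential) distribution, which is only a stand-in for real single-trial EEG, and the helper names are hypothetical. For i.i.d. trials the skewness of the trial average shrinks in proportion to 1/sqrt(n), so with a few tens of trials the average can still be visibly non-gaussian.

```python
import random

def skewness(x):
    """Sample skewness (third standardized central moment)."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    return m3 / m2 ** 1.5

def skew_of_trial_average(n_trials, n_experiments=4000, seed=1):
    """Skewness of the trial-averaged value across many simulated experiments."""
    rng = random.Random(seed)
    averages = []
    for _ in range(n_experiments):
        # Each "trial" is one draw from a skewed (exponential) distribution.
        trials = [rng.expovariate(1.0) for _ in range(n_trials)]
        averages.append(sum(trials) / n_trials)
    return skewness(averages)

# Skewness of the averaged estimator for increasing ensemble sizes.
skew_by_n = {n: skew_of_trial_average(n) for n in (1, 10, 100)}
for n, s in sorted(skew_by_n.items()):
    print("trials averaged = %3d, skewness of the average = %.3f" % (n, s))
```

In a real experiment you cannot repeat the whole experiment thousands of times like this, which is why resampling the trials you do have is the practical substitute.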
For this reason we (EEGLAB/Fieldtrip developers) generally advocate the use
of statistics based on resampling, which are implemented in Fieldtrip and in
Arnaud Delorme's statcond() function, a core part of EEGLAB STUDY stats.
There are a number of possibilities here, and if interested you can check
out David Groppe's video-lecture and slides in our online EEGLAB workshop
(http://tinyurl.com/eeglab-stats). You can also search our past EEGLAB
workshop webpages (http://tinyurl.com/eeglab-workshops) for additional
slides on the topic by Robert Oostenveld, Arnaud Delorme,
and Guillaume Rousselet.
Bootstrap (resampling trials with replacement) or jackknife (leave-one-out
resampling) methods can be used to obtain an empirical estimate of the
distribution of the ERP or ERSP. If you are averaging over many trials (and
you can verify that the bootstrap or jackknife distribution is normal) then
this provides an empirical estimate of the variance of that distribution. If
you have two conditions with normal bootstrap distributions, then a
two-sample t-test can be used to obtain p-values w.r.t the null hypothesis
that the difference in distribution means is zero. Similarly, if you have
just one distribution a t-test can be used to obtain p-values w.r.t the null
hypothesis that an estimated ERP or ERSP sample is zero. If the distribution
is not normal, you can also obtain p-values by determining the empirical
probability that a sample from a bootstrap distribution (or the difference
of two distributions) is greater than zero. Alternately, non-parametric
permutation tests (e.g. "label swapping") can be performed to test for
significant differences between conditions. The latter has the advantage
that you don't need to make any assumptions regarding the shape of the
distribution. The majority of these tests (and other more complicated
ANOVAs, etc.) can be performed using statcond(). The downside to these
non-parametric tests is that they require more CPU time and memory than a
parametric test (e.g. you have to compute the mean ERP or ERSP hundreds or
thousands of times and store the results). However, statcond() is carefully
optimized for speed, and in many cases it does not take much time at all to
perform the non-parametric test (although it may require a good bit of
memory, especially for ERSPs, so make sure you have a few GB available).
Take a look at the help text and examples to get started and give it a
whirl!
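For readers who want to see the mechanics, the two resampling ideas above can be sketched in a few lines of plain Python. This is not statcond()'s implementation; the function names, the simulated single-sample amplitudes, and the choice of 2000 resamples are all assumptions made for illustration. The bootstrap resamples trials with replacement to approximate the distribution of the trial average, and the permutation test shuffles condition labels to build the null distribution of the difference in means.

```python
import random

def mean(x):
    return sum(x) / len(x)

def bootstrap_means(trials, n_boot=2000, rng=None):
    """Empirical distribution of the trial average (resampling with replacement)."""
    if rng is None:
        rng = random.Random(0)
    n = len(trials)
    return [mean([trials[rng.randrange(n)] for _ in range(n)])
            for _ in range(n_boot)]

def permutation_pvalue(cond_a, cond_b, n_perm=2000, rng=None):
    """Two-sided p-value for a difference in means via label swapping."""
    if rng is None:
        rng = random.Random(0)
    observed = abs(mean(cond_a) - mean(cond_b))
    pooled = list(cond_a) + list(cond_b)
    n_a = len(cond_a)
    n_extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # randomly reassign trials to the two conditions
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            n_extreme += 1
    # The +1 counts the observed labelling itself, so p is never exactly 0.
    return (n_extreme + 1) / (n_perm + 1)

# Simulated amplitudes at one channel and latency (arbitrary units).
rng = random.Random(42)
cond_a = [rng.gauss(2.0, 1.0) for _ in range(40)]
cond_b = [rng.gauss(0.0, 1.0) for _ in range(40)]

# 95% percentile interval for the condition-A average from the bootstrap.
boot = sorted(bootstrap_means(cond_a, rng=rng))
ci_low = boot[round(0.025 * len(boot))]
ci_high = boot[round(0.975 * len(boot)) - 1]
p_value = permutation_pvalue(cond_a, cond_b, rng=rng)
print("bootstrap 95%% interval for mean(A): [%.3f, %.3f]" % (ci_low, ci_high))
print("permutation p-value, A vs B: %.4f" % p_value)
```

For real data you would repeat this at every channel and time (or time/frequency) point and then correct for multiple comparisons, which this sketch does not attempt.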
One point to make is that if you are testing for the difference in means
between two populations, in many cases we can assume the population
distributions are normal and thus a parametric t-test is suitable. Again,
you can test for normality using the functions I listed above.
Another interesting consequence of the CLT applies to the distribution of
the scalp EEG itself. Since the signal recorded at the scalp is the sum of
multiple sources inside (and outside) the brain, if, for a given electrode,
we assume the number of contributing (non-zero) sources is very large and
that all these sources are multivariate i.i.d, then by the CLT, the
distribution of the EEG recorded at this electrode should tend toward
gaussianity. Of course the assumption that all contributing sources are
i.i.d is almost certainly false (or only partially true), but certainly the
sensor data is more gaussian than that of any of the sources (this is, in
fact, an implicit assumption of ICA).
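A toy simulation can illustrate this mixing argument. The sketch below (plain Python, hypothetical names) assumes 20 independent, identically distributed Laplace sources summed with equal weights at one simulated electrode, which is far simpler than real volume conduction. Excess kurtosis is used as a crude gaussianity index: it is 0 for a gaussian and 3 for a Laplace distribution.

```python
import random

def excess_kurtosis(x):
    """Sample excess kurtosis (0 for a gaussian distribution)."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    return m4 / m2 ** 2 - 3.0

def laplace(rng):
    # The difference of two unit exponentials is Laplace-distributed,
    # i.e. heavier-tailed (super-gaussian) than a normal distribution.
    return rng.expovariate(1.0) - rng.expovariate(1.0)

rng = random.Random(7)
n_samples, n_sources = 5000, 20
sources = [[laplace(rng) for _ in range(n_samples)] for _ in range(n_sources)]
# One simulated "electrode": an equal-weight sum of all sources at each sample.
sensor = [sum(src[t] for src in sources) for t in range(n_samples)]

kurt_single_source = excess_kurtosis(sources[0])
kurt_sensor = excess_kurtosis(sensor)
print("excess kurtosis, single source: %.3f" % kurt_single_source)
print("excess kurtosis, summed sensor: %.3f" % kurt_sensor)
```

With dependent or unequally weighted sources the sum would approach gaussianity more slowly, which is consistent with the caveat above.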
Also, on the topic of robust multivariate statistics, try Cyril Pernet and
Guillaume Rousselet's EEGLAB-compatible LIMO EEG (LInear MOdelling of EEG)
toolbox: (http://tinyurl.com/limo-stats).
Hope this is helpful,
Tim
On Wed, Aug 31, 2011 at 1:39 AM, Aleksandra Vuckovic <
Aleksandra.Vuckovic at glasgow.ac.uk> wrote:
> Dear all,
> I am comparing two ERP conditions using the pop_comperp function, through
> the EEGLab GUI. It has an option for setting a statistical significance for
> a two-tailed t test. Can somebody tell me how can we
> justify that it is Ok to apply a parametric test, i.e. that the EEG data
> have a normal distribution? Is it possible to apply a non-parametric test
> using the same GUI?
> Many thanks,
> Aleksandra
--
--------- αντίληψη -----------