[Eeglablist] ICA "adds" noise?
Jason Palmer
japalmer29 at gmail.com
Wed Jan 23 17:45:27 PST 2013
Hi Joseph et al.,
I believe that any reference, average or channel, does reduce the rank by
one. This is straightforward to show using linear algebra, e.g., here:
http://sccn.ucsd.edu/wiki/Linear_Representations_and_Basis_Vectors#EEG_Data_Reference_and_Re-referencing
The rank that MATLAB reports depends on the tolerance used to declare
small dimensions zero. E.g., a rank-deficient matrix usually has a smallest
eigenvalue of around 1e-15 to 1e-8 rather than exactly zero, due to
numerical imprecision, particularly with large ill-conditioned matrices.
You should still see a sudden drop-off in the eigenvalue magnitudes after
the theoretical rank.
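
A minimal sketch of that check, using simulated data rather than real EEG.
The sizes and variable names are made up for illustration, and the
tolerance formula is the default documented for MATLAB's rank():

```matlab
% Simulated stand-in for EEG data: 32 channels x 5000 samples (assumed sizes).
nChan = 32; nSamp = 5000;
data = randn(nChan, nSamp);             % full rank (32) with probability 1

% Average reference: subtract the mean across channels at every sample.
ardata = data - repmat(mean(data, 1), nChan, 1);

s = svd(ardata);                        % singular values, largest first
tol = max(size(ardata)) * eps(max(s));  % MATLAB's documented default for rank()

disp(s(end-2:end)');                    % the last value collapses toward zero
fprintf('tol = %g, rank = %d\n', tol, rank(ardata));
fprintf('drop-off ratio s(end-1)/s(end) = %g\n', s(end-1) / s(end));
```

The last singular value should sit many orders of magnitude below its
neighbour, which is the sudden drop-off to look for. Passing an explicit
tolerance, as in rank(ardata, tol), makes the decision visible rather than
implicit.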
Best,
Jason
From: eeglablist-bounces at sccn.ucsd.edu
[mailto:eeglablist-bounces at sccn.ucsd.edu] On Behalf Of Joseph Dien
Sent: Wednesday, January 23, 2013 11:54 AM
To: Matt Craddock
Cc: eeglablist at sccn.ucsd.edu; Kristina Borgström
Subject: Re: [Eeglablist] ICA "adds" noise?
Hmmm... I should know better than to talk off the top of my head like that.
Well, part right, part wrong.
It's easy enough to just try it out and see what Matlab says the rank is of
the data. I took some Cz-referenced 129-channel data and then rereferenced
it to mean mastoid and to average reference.
>> rank(undata)
ans =
128
>> rank(mmdata)
ans =
128
>> rank(ardata)
ans =
128
so the bottom line is that average reference doesn't reduce the rank but
neither does mean mastoid (my error).
As for the wiki page you linked to, it's worded in a confusing way. It's
not that "the average reference reduces the rank of the data" necessarily.
What it should say is that it doesn't increase the rank of the data. So if
you start off with 128 recording channels and a reference channel (n=129)
and rank is 128 (because voltage data is relative by definition so two
channels only give you one waveform), then after average reference, the rank
is still 128 even though it now looks as though you've got 129 channels with
independent waveforms. If you dropped the 129th reference channel and then
computed an average reference channel, you would indeed lose another rank
(as seen below) but that would be because you were doing the procedure
incorrectly and had deleted a channel of meaningful information (even though
it is flat). The flat reference channel should always be included in the
average reference computation. The same goes for computing the mean mastoid
reference (or any other rereference), although unfortunately a lot of
systems throw that information away.
>> rank(ar128data)
ans =
127
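
For anyone who wants to try this without a dataset at hand, here is a
rough sketch on simulated data. It is not the code that produced the
numbers in this message: the random potentials, the sample count, and the
two "mastoid" channel indices are all assumptions for illustration.

```matlab
nRec = 128; nSamp = 2000;               % assumed sizes
n = nRec + 1;                           % 128 recording sites + reference site

% Hypothetical potentials at all 129 sites.
pot = randn(n, nSamp);

% Reference everything to the last site; that row becomes all zeros.
undata = pot - repmat(pot(end, :), n, 1);

% Mean mastoid reference; channels 57 and 100 are placeholder indices.
mast = [57 100];
mmdata = undata - repmat(mean(undata(mast, :), 1), n, 1);

% Average reference over all 129 channels, flat reference included.
ardata = undata - repmat(mean(undata, 1), n, 1);

% Drop the flat reference first, then average-reference the remaining 128.
noref = undata(1:nRec, :);
ar128data = noref - repmat(mean(noref, 1), nRec, 1);

fprintf('undata %d, mmdata %d, ardata %d, ar128data %d\n', ...
    rank(undata), rank(mmdata), rank(ardata), rank(ar128data));
```

With the flat reference channel kept, all three 129-channel versions come
out at rank 128; only the version that drops the reference before
averaging falls to 127.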
I definitely need to look into the effects of bridging more closely,
especially for frequency-based applications. This has been a very helpful
discussion!
Joe
On Jan 23, 2013, at 6:44 AM, Matt Craddock <matt.craddock at uni-leipzig.de>
wrote:
On 18/01/2013 20:39, Joseph Dien wrote:
Another thought occurs to me. I have indeed noticed a tendency for
increased noise to show up in my own ICA-based artifact correction
routine in the EP Toolkit (Tim Curran first reported it to me). I've
never worked out why. I ended up implementing a trial-by-trial
workaround wherein the eyeblink factors are removed from a given
trial only when it reduces the overall variance of the trial. In
other words, when the benefit outweighs the cost. The increased
noise that I see is small enough that it gets averaged out for ERPs
so has not been an issue. Could be an issue for frequency-based
measures though. I need to look into this further. Anyway, what
you're reporting seems more severe than anything I've observed so
perhaps something different.
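
The trial-by-trial rule described above could look roughly like the sketch
below. This is not the EP Toolkit's code: the data are random, and the
names trials, A, W, and blinkIdx are hypothetical stand-ins for the
epoched data, the ICA mixing and unmixing matrices, and the indices of the
eyeblink components.

```matlab
nChan = 8; nSamp = 200; nTrial = 20;    % assumed sizes
trials = randn(nChan, nSamp, nTrial);   % channels x samples x trials
A = randn(nChan, nChan);                % hypothetical mixing matrix
W = inv(A);                             % matching unmixing matrix
blinkIdx = 1;                           % hypothetical eyeblink component(s)

cleaned = trials;
removed = false(1, nTrial);
for t = 1:nTrial
    X = trials(:, :, t);
    % Back-project only the eyeblink components to channel space.
    blink = A(:, blinkIdx) * (W(blinkIdx, :) * X);
    Y = X - blink;
    % Keep the correction only if it lowers the overall trial variance.
    if var(Y(:)) < var(X(:))
        cleaned(:, :, t) = Y;
        removed(t) = true;
    end
end
fprintf('correction applied on %d of %d trials\n', sum(removed), nTrial);
```

By construction no trial ends up with more variance than it started with,
which is the "benefit outweighs the cost" criterion.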
Hi Joe, Kristina, and all,
I'm mostly dealing with frequency analysis; the noise does indeed pose
some problems for frequency-based measures, since it translates into
noise in the gamma band range (>40Hz, mostly). This issue has been
reported previously on this list:
http://sccn.ucsd.edu/pipermail/eeglablist/2011/004316.html
and the conclusion then was that it was down to reduced rank:
http://sccn.ucsd.edu/pipermail/eeglablist/2011/004319.html
Hence why I jumped on that as an explanation when I saw Kristina's
original post. My situation turns out to be a little different from
hers, in that I use average reference rather than linked mastoids, and
don't keep a reference channel in the data, so it didn't seem to be caused
by the duplicate data issue Makoto identified (although sometimes it may
have been - see later; but wouldn't that also be a rank reduction?). In my
case I've found doing PCA first (reducing number of components to the rank,
so usually only to numChannels-1) makes this problem go away, but given that
everybody said avoid doing that first, I also had a closer look at the
datasets where I'd had this problem and found in some cases that there were
*very* high correlations between some channels (.99 in one case!). Removing
one of those channels before running ICA (and *not* doing PCA) also fixed
the problem. I didn't see any major differences in the components between
PCAing first and removing the channels, though of course that's not to say
there aren't any that would emerge if looking at them more systematically!
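
A quick way to screen for such near-duplicate channels before running ICA
might look like the sketch below. The data are simulated, the "bridged"
pair is planted by hand, and the 0.99 cutoff is only an assumed threshold
taken from the example above.

```matlab
nChan = 16; nSamp = 3000;               % assumed sizes
data = randn(nChan, nSamp);
% Plant a near-duplicate: channel 5 is channel 4 plus a little noise.
data(5, :) = data(4, :) + 0.01 * randn(1, nSamp);

R = corrcoef(data');                    % channel-by-channel correlations
% Look only above the diagonal so each pair is reported once.
[rowIdx, colIdx] = find(triu(abs(R), 1) > 0.99);
for k = 1:numel(rowIdx)
    fprintf('channels %d and %d: r = %.4f\n', ...
        rowIdx(k), colIdx(k), R(rowIdx(k), colIdx(k)));
end

% Drop the second channel of each flagged pair, then pass pruned to ICA.
keep = setdiff(1:nChan, colIdx(:)');
pruned = data(keep, :);
```

Which member of a pair to drop is a judgment call; the sketch removes the
higher-numbered channel only for simplicity.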
Average reference doesn't reduce the rank. Basically all it does is
to virtually move the reference site. In the original
vertex-referenced data, there is informational ambiguity as to
whether recorded voltage fluctuations are due to activity at the
reference site or at the recording site (unavoidable since voltages
are by their nature relative and so require a reference site). When
one algebraically rereferences the data to a different single
reference site (including the virtual reference site of average
reference) there is no increase in informational ambiguity. Mean
mastoid reference does increase informational ambiguity
because it introduces a new ambiguity of whether reference site
activity is occurring at the left or right mastoid. In essence, this
is because a subset of the total set of electrodes has been singled
out and mixed together. This increased ambiguity reduces the rank by
one.
Hmm, but this page says that average reference does reduce rank, and that's
been what people have said on this list for quite a while.
http://sccn.ucsd.edu/wiki/Linear_Representations_and_Basis_Vectors
Happy to be corrected, but on the whole I'm left a little puzzled - the
default behaviour of EEGLAB's GUI is to suggest PCA reduction if your data
is reduced rank. Given the consensus seems to be to avoid PCA, does that
need to change, or at least to suggest people be very cautious about using
it and try to find alternative ways of conditioning their data, be that
removing channels or whatever?
Cheers,
Matt
--
Dr. Matt Craddock
Post-doctoral researcher,
Institute of Psychology,
University of Leipzig,
Seeburgstr. 14-20,
04103 Leipzig, Germany
Phone: +49 341 973 95 44
--------------------------------------------------------------------------------
Joseph Dien,
Senior Research Scientist
University of Maryland
E-mail: jdien07 at mac.com
Phone: 301-226-8848
Fax: 301-226-8811
http://joedien.com/