# [Eeglablist] ICA - Minimum # of Data Points

Makoto Miyakoshi mmiyakoshi at ucsd.edu
Fri Jan 18 18:32:04 PST 2019

```Dear Janelle,

You mean (ch^2)*30, right?
This 'rule of thumb' was only proposed empirically and has never been tested.
If you look at another paper by Julie Onton and Scott Makeig from around
2009, they say (ch^2)*20, not 30. My former colleague Jason Palmer, who wrote
AMICA, used to tell us: 100 ch, 1 million data points...

So why don't you follow whichever rule requires the shorter data length?

By the way, I DO believe that it's not the number of datapoints but the
actual data length in time that matters most. For example, I can downsample
30-min data to 100Hz, which makes the number of datapoints less than
(ch^2)*20, but ICA performs well on it. On the other hand, I predict that
increasing the sampling rate to 5kHz for your 7-min recording would not
help. That's why I wrote on my wiki page that (ch^2)*20 is a rule for
sampling rates of 250Hz or lower.

> With 256 sensors, it would be difficult to collect such a large number of
> data points as our acquisition time is about 7 minutes.

Ok, now what do we want to do?
There are two ways to go.

1. Discarding channels so that the (ch^2)*20 rule holds--It is always
do-able but I don't recommend this.
2. Performing dimension reduction by using the 'pca' option in infomax (or
'pcakeep' in AMICA)--I recommend this. Sometimes I hear people
misunderstand the message of Fiorenzo's paper about the use of PCA as
preprocessing for ICA (https://www.ncbi.nlm.nih.gov/pubmed/29526744).
His point was that PCA is NOT a way to save computational time: PCA is a
lossy compression (like converting a .wav file to .mp3... the file size
becomes smaller, but information is irreversibly lost), so don't use it
unless you need to.

Use the 'pca' option with a value somewhere between 60 and 80 to meet the
(ch^2)*20 rule of thumb.
By the way, this PCA is only for ICA purposes. Your sensor data will be
intact. You'll have 'only' 60-80 ICs, which should not be a problem
(because in most data, only 10-20 good ICs do the job, and ALL of the
rest is junk. If you want to know the reason, post another question).

Makoto

On Fri, Jan 18, 2019 at 8:27 AM Janelle Crouch <crouchjh at sunypoly.edu>
wrote:

> Is it reliable to use ICA with fewer data points than called for in the
> *ch*2x30 *formula?  With 256 sensors, it would be difficult to collect
> such a large number of data points as our acquisition time is about 7
> minutes.
> If ICA results from a lower number of data points would not be reliable,
> can anyone suggest their preferred method of artifact detection and removal
> that does not use ICA?
>
> Thank you,
>
> Janelle Crouch
> SUNY Polytechnic Institute

--
Makoto Miyakoshi
Swartz Center for Computational Neuroscience
Institute for Neural Computation, University of California San Diego
```
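The arithmetic behind the (ch^2)*20 rule of thumb and the suggested 60-80 range can be checked with a rough sketch like the one below. It is only an illustration, not part of EEGLAB or AMICA: the function names `required_points` and `max_dimensions` are made up for this example, and the 250Hz sampling rate is an assumption, since the thread does not state the actual rate of the 7-minute recording.

```python
import math


def required_points(n_channels, k=20):
    """Datapoints the rule of thumb asks for: k * (number of channels)^2."""
    return k * n_channels ** 2


def max_dimensions(n_points, k=20):
    """Largest number of dimensions n such that k * n^2 <= n_points."""
    return math.isqrt(n_points // k)


# Recording parameters. The sampling rate is an ASSUMPTION for illustration;
# only the channel count and the 7-minute duration come from the thread.
n_channels = 256
duration_s = 7 * 60
srate_hz = 250

n_points = duration_s * srate_hz
needed = required_points(n_channels)

print(f"available datapoints: {n_points}")
print(f"needed for {n_channels} ch with k=20: {needed} "
      f"({needed / srate_hz / 60:.1f} min at {srate_hz} Hz)")

# Compare the two multipliers mentioned in the thread (20 and 30).
for k in (20, 30):
    print(f"k={k}: at most {max_dimensions(n_points, k)} dimensions")
```

Under these assumptions the sketch gives at most 72 dimensions for k=20 and 59 for k=30, which is roughly consistent with the 60-80 range suggested for the 'pca' option. Changing `srate_hz` changes the count, which is exactly why the rule is stated only for sampling rates of 250Hz or lower.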