<HTML><HEAD>
<META http-equiv=Content-Type content="text/html; charset=iso-8859-15">
<META content="MSHTML 6.00.6000.16809" name=GENERATOR></HEAD>
<BODY style="MARGIN: 4px 4px 1px; FONT: 14pt Arial">
<DIV>One way to choose the number of components is to apply information-theoretic tools that essentially ask, 'How much unique information is contained in my data?' Our colleagues have done this for ICA of fMRI data with some success. Although I've not yet thought systematically about applying it to EEG data, at first blush a comparable approach seems feasible. For example, criteria such as Minimum Description Length (MDL) or the Akaike Information Criterion (AIC) might be useful. Scott or others might have opinions about the pros and cons of using these algorithms with EEG data.</DIV>
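<DIV>As a rough illustration of what an MDL or AIC order estimate can look like, here is a minimal sketch. It assumes the eigenvalue-based formulation of Wax and Kailath (1985), which is one common choice and not necessarily the variant used in the fMRI work mentioned above. The function names are hypothetical, the eigenvalues are assumed to come from the channel covariance matrix, and the penalty term k(2p - k) is the complex-valued parameter count from that formulation. These criteria also assume independent samples, which EEG time series generally are not, so treat the output as a starting point only.</DIV>
<PRE>
```python
import math


def order_criteria(eigenvalues, n_samples):
    """Return (aic, mdl) lists indexed by candidate order k = 0..p-1.

    eigenvalues: positive eigenvalues of the channel covariance matrix.
    n_samples:   number of time points used to estimate that covariance.
    Assumption: Wax-Kailath (1985) form; not taken from the original post.
    """
    lam = sorted(eigenvalues, reverse=True)
    p = len(lam)
    aic, mdl = [], []
    for k in range(p):
        tail = lam[k:]          # the p - k smallest eigenvalues
        m = len(tail)
        # log of (geometric mean / arithmetic mean) of the tail;
        # it is 0 when the tail is flat (pure noise floor).
        log_geo = sum(math.log(v) for v in tail) / m
        log_arith = math.log(sum(tail) / m)
        neg_loglik = -n_samples * m * (log_geo - log_arith)
        n_free = k * (2 * p - k)  # assumed free-parameter count
        aic.append(2.0 * neg_loglik + 2.0 * n_free)
        mdl.append(neg_loglik + 0.5 * n_free * math.log(n_samples))
    return aic, mdl


def estimate_order(eigenvalues, n_samples):
    """Return the orders (k_aic, k_mdl) that minimise each criterion."""
    aic, mdl = order_criteria(eigenvalues, n_samples)
    return aic.index(min(aic)), mdl.index(min(mdl))
```
</PRE>
<DIV>For a toy spectrum such as [10, 8, 1, 1, 1, 1] with 1000 samples, both criteria select 2, because the last four eigenvalues form a flat noise floor. Real EEG spectra rarely show such a clean break, which is part of why the choice remains a judgment call.</DIV>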
<DIV> </DIV>
<DIV>It's also worth pointing out that any choice you make about how many components to estimate is, by definition, arbitrary. Viewed that way, the issue becomes one of trusting the process or methods you used to make the most educated guess, and of being able to defend the reasoning and assumptions that led you to that number.</DIV>
<DIV> </DIV>
<DIV>Hope that's a little useful.</DIV>
<DIV>Mike</DIV>
<DIV> </DIV>
<DIV> </DIV>
<DIV>
<DIV>Michael C. Stevens, Ph.D.<BR><BR>Director, Child and Adolescent Research<BR>The Institute of Living / Hartford Hospital<BR><BR>Director, Clinical Neuroscience & Development Laboratory<BR>Olin Neuropsychiatry Research Center<BR><BR>Assistant Clinical Professor of Psychiatry<BR>Yale University School of Medicine<BR><BR>Contact Information:<BR>200 Retreat Avenue<BR>ONRC, Whitehall Building<BR>Hartford, CT 06106<BR><BR>Tel: (860) 545-7552<BR>Fax: (860) 545-7797</DIV>
<DIV><A href="http://www.nrc-iol.org/onrc_labs_cnd.asp">http://www.nrc-iol.org/onrc_labs_cnd.asp</A></DIV><BR><BR>>>> </DIV>
<DIV style="PADDING-LEFT: 7px; MARGIN: 0px 0px 0px 15px; BORDER-LEFT: #050505 1px solid; BACKGROUND-COLOR: #f3f3f3">
<TABLE style="FONT: 14pt Arial" bgColor=#f3f3f3>
<TBODY>
<TR vAlign=top>
<TD><STRONG>From: </STRONG></TD>
<TD>Scott Makeig &lt;smakeig@gmail.com&gt;</TD></TR>
<TR vAlign=top>
<TD><STRONG>To:</STRONG></TD>
<TD>Dorothy Bishop &lt;Dorothy.Bishop@psy.ox.ac.uk&gt;</TD></TR>
<TR vAlign=top>
<TD><STRONG>Date: </STRONG></TD>
<TD>4/6/2009 5:04 PM</TD></TR>
<TR vAlign=top>
<TD><STRONG>Subject: </STRONG></TD>
<TD>Re: [Eeglablist] questions about N components and high pass filtering</TD></TR>
<TR vAlign=top>
<TD><STRONG>CC:</STRONG></TD>
<TD>&lt;eeglablist@sccn.ucsd.edu&gt;</TD></TR></TBODY></TABLE>Dorothy -<BR><BR>
<DIV class=gmail_quote>On Mon, Apr 6, 2009 at 8:28 AM, Dorothy Bishop <SPAN dir=ltr>&lt;<A href="mailto:Dorothy.Bishop@psy.ox.ac.uk">Dorothy.Bishop@psy.ox.ac.uk</A>&gt;</SPAN> wrote:<BR>
<BLOCKQUOTE class=gmail_quote style="PADDING-LEFT: 1ex; MARGIN: 0pt 0pt 0pt 0.8ex; BORDER-LEFT: rgb(204,204,204) 1px solid">1. If you are doing ICA with a view to removing noise components from a signal, is there an optimal number of components to extract? The manual gives guidance on how to compute the maximum number, but is it more efficient to reduce the data to fewer dimensions? My impression is yes, but I'd be grateful for the views of others, especially if there is some rational means of deciding, rather than relying on trial and error.</BLOCKQUOTE>
<DIV><BR>> For me, the key factor is how much data you have relative to the number of weights ICA must learn (timepoints / channels^2). If this ratio is > 30 (or close to it), we find it preferable to return all possible components, since PCA does a rather poor job of separating sources. How many components to identify as 'noise' depends on your definition and interests. Simple PCA-compatible concepts such as EEG = signal space + noise space are not sufficient here, as ICA separates all sorts of "non-cortical brain EEG source processes" (aka noise) from each other.<BR></DIV>
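<DIV>The rule of thumb above is simple arithmetic, sketched here with hypothetical function names. The threshold of 30 is the value quoted in the reply; the recording parameters in the usage note are made-up examples.</DIV>
<PRE>
```python
def data_per_weight(n_timepoints, n_channels):
    """Time points available per ICA weight (timepoints / channels^2)."""
    return n_timepoints / n_channels ** 2


def enough_for_full_ica(n_timepoints, n_channels, threshold=30.0):
    """True if the ratio exceeds the quoted rule-of-thumb threshold.

    The reply also allows values "near to" the threshold, so a ratio
    slightly below 30 is a judgment call rather than a hard failure.
    """
    return data_per_weight(n_timepoints, n_channels) > threshold


# Hypothetical recording: 10 minutes sampled at 250 Hz.
n_timepoints = 10 * 60 * 250
for n_channels in (32, 64, 128):
    ratio = data_per_weight(n_timepoints, n_channels)
    print(n_channels, round(ratio, 1), enough_for_full_ica(n_timepoints, n_channels))
```
</PRE>
<DIV>For this hypothetical recording, 64 channels give a ratio of about 36.6, which passes, while 128 channels give about 9.2, which suggests either recording more data or reducing the number of dimensions.</DIV>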
<BLOCKQUOTE class=gmail_quote style="PADDING-LEFT: 1ex; MARGIN: 0pt 0pt 0pt 0.8ex; BORDER-LEFT: rgb(204,204,204) 1px solid"><BR>2. It's not uncommon in my area for people to filter the data prior to processing, and 1 Hz is a common value to select for high pass cutoff. However, I'm concerned that if the SOA is around 1 second, then this filter may remove genuine upward or downward trends in the data that are stimulus-related. Have others got views and/or recommendations on this?</BLOCKQUOTE>
<DIV><BR>> This is a difficult question. If the sources of activity below 1 Hz are spatially different from those at higher frequencies (e.g., sweat artifacts), then removing them (or, more accurately, attenuating them) by frequency filtering may make sense; we do this routinely. However, if the low-frequency activity comes from discrete, spatially stationary sources (whether or not they are the same as the sources of higher-frequency EEG), then leaving it in the data for ICA decomposition may well be preferable.<BR><BR>Scott Makeig<BR></DIV>
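<DIV>To get a feel for how much a 1 Hz high-pass filter attenuates slow activity, the sketch below evaluates the textbook magnitude response of an analog Butterworth high-pass filter. This is an assumption made for illustration: the filters actually applied to EEG data are typically digital and may have quite different roll-off, and the function name is hypothetical.</DIV>
<PRE>
```python
import math


def highpass_gain(freq_hz, cutoff_hz=1.0, order=1):
    """Magnitude response of an analog Butterworth high-pass filter.

    Returns the fraction of amplitude that survives at freq_hz.
    Illustrative only; not the filter used by any particular toolbox.
    """
    if freq_hz == 0:
        return 0.0  # a constant (DC) offset is removed entirely
    return 1.0 / math.sqrt(1.0 + (cutoff_hz / freq_hz) ** (2 * order))


# Compare slow activity with a typical alpha-band frequency.
for f in (0.25, 0.5, 1.0, 10.0):
    print(f, round(highpass_gain(f), 3))
```
</PRE>
<DIV>With a first-order filter and a 1 Hz cutoff, a 0.5 Hz component keeps about 45% of its amplitude and a 10 Hz component about 99.5%, so slow stimulus-related trends are reduced rather than eliminated. Steeper filters attenuate them more strongly.</DIV>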
<BLOCKQUOTE class=gmail_quote style="PADDING-LEFT: 1ex; MARGIN: 0pt 0pt 0pt 0.8ex; BORDER-LEFT: rgb(204,204,204) 1px solid"><BR>Many thanks.<BR><BR>Dorothy Bishop<BR>Professor of Developmental Neuropsychology<BR>Department of Experimental Psychology<BR>University of Oxford<BR>OX1 3UD<BR><A href="http://psyweb.psy.ox.ac.uk/oscci/" target=_blank>http://psyweb.psy.ox.ac.uk/oscci/</A><BR><BR>tel: +44 (0)1865 271369<BR>fax: +44 (0)1865 281255<BR><BR>_______________________________________________<BR>Eeglablist page: <A href="http://sccn.ucsd.edu/eeglab/eeglabmail.html" target=_blank>http://sccn.ucsd.edu/eeglab/eeglabmail.html</A><BR>To unsubscribe, send an empty email to <A href="mailto:eeglablist-unsubscribe@sccn.ucsd.edu">eeglablist-unsubscribe@sccn.ucsd.edu</A><BR>For digest mode, send an email with the subject "set digest mime" to <A href="mailto:eeglablist-request@sccn.ucsd.edu">eeglablist-request@sccn.ucsd.edu</A><BR></BLOCKQUOTE></DIV><BR>-- <BR>Scott Makeig, Research Scientist and Director, Swartz Center for Computational Neuroscience, Institute for Neural Computation, University of California San Diego, La Jolla CA 92093-0961, <A href="http://sccn.ucsd.edu/~scott">http://sccn.ucsd.edu/~scott</A><BR></DIV></BODY></HTML>