<div dir="ltr">Dear Jason,<div><br></div><div><span style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14.6666669845581px">> [Makoto, I think he is referring to the num_mix_comps variable which controls the source density mixture model, not the number of ICA models (num_models).]</span><br><div class="gmail_extra"><br></div><div class="gmail_extra">Oops, you are right. I was wrong. Thank you for correcting me, Jason.</div><div class="gmail_extra"><br></div><div class="gmail_extra"><span style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14.6666669845581px">> basically having one density to fit each tail and one to fit the peak.</span><br></div><div class="gmail_extra"><br></div><div class="gmail_extra">That makes sense.</div><div class="gmail_extra"><br></div><div class="gmail_extra"><span style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14.6666669845581px">> The parameter may be automatically determined based on likelihood in the next version, which will have density models other than Generalized Gaussian, including non-symmetric (skew) models.</span><br></div><div class="gmail_extra"><br></div><div class="gmail_extra">That sounds interesting. 
I'm looking forward to seeing how performance changes with support for skewed distributions, which I guess occur often.</div><div class="gmail_extra"><br></div><div class="gmail_extra">Makoto</div><div class="gmail_extra"><br></div><div class="gmail_extra"> </div><div class="gmail_extra"><br><div class="gmail_quote">On Sat, Mar 19, 2016 at 8:46 AM, Jason Palmer <span dir="ltr"><<a href="mailto:japalmer29@gmail.com" target="_blank">japalmer29@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div lang="EN-US" link="blue" vlink="purple"><div><p class="MsoNormal"><span style="font-size:11pt;font-family:Calibri,sans-serif;color:black">Hi Tatu,<u></u><u></u></span></p><p class="MsoNormal"><span style="font-size:11pt;font-family:Calibri,sans-serif;color:black"><u></u> <u></u></span></p><p class="MsoNormal"><span style="font-size:11pt;font-family:Calibri,sans-serif;color:black">Sorry for the delay in responding. [Makoto, I think he is referring to the num_mix_comps variable which controls the source density mixture model, not the number of ICA models (num_models).]<u></u><u></u></span></p><p class="MsoNormal"><span style="font-size:11pt;font-family:Calibri,sans-serif;color:black"><u></u> <u></u></span></p><p class="MsoNormal"><span style="font-size:11pt;font-family:Calibri,sans-serif;color:black">The num_mix_comps setting controls the number of densities used in the Generalized Gaussian mixture model for each source density. The default is 3, which seems to work well, basically having one density to fit each tail and one to fit the peak. Using up to 5 or more should give very similar results but take longer to run. Some very unusual source pdfs might benefit from having a larger number of mixture components, but actual sources seem to be well represented by the 3-density mixture. 
In principle, having too many densities in the mixture could lead to overfitting. With the usual number of samples in EEG data (100,000 or more), however, overfitting the source distributions is unlikely: even with 6 mixture components, the number of degrees of freedom is small relative to the number of samples. In general ICA can work with simple sub- or super-gaussian density models for sources, but the error for a finite number of samples depends on the fidelity of the source density model, with minimum variance using the actual source density.<u></u><u></u></span></p><p class="MsoNormal"><span style="font-size:11pt;font-family:Calibri,sans-serif;color:black"><u></u> <u></u></span></p><p class="MsoNormal"><span style="font-size:11pt;font-family:Calibri,sans-serif;color:black">It is also possible to use Gaussian mixture models instead of Generalized Gaussian, for comparison purposes. With Gaussians, having a higher num_mix_comps should improve the estimation since log-linear tails can be fit better with more Gaussians; the extra components aren’t necessary with the default Generalized Gaussian mixture model.<u></u><u></u></span></p><p class="MsoNormal"><span style="font-size:11pt;font-family:Calibri,sans-serif;color:black"><u></u> <u></u></span></p><p class="MsoNormal"><span style="font-size:11pt;font-family:Calibri,sans-serif;color:black">So, basically the num_mix_comps parameter is not meant to be changed in standard usage. 
The parameter may be automatically determined based on likelihood in the next version, which will have density models other than Generalized Gaussian, including non-symmetric (skew) models.<u></u><u></u></span></p><p class="MsoNormal"><span style="font-size:11pt;font-family:Calibri,sans-serif;color:black"><u></u> <u></u></span></p><p class="MsoNormal"><span style="font-size:11pt;font-family:Calibri,sans-serif;color:black">Best,<u></u><u></u></span></p><p class="MsoNormal"><span style="font-size:11pt;font-family:Calibri,sans-serif;color:black">Jason<u></u><u></u></span></p><p class="MsoNormal"><span style="font-size:11pt;font-family:Calibri,sans-serif;color:black"><u></u> <u></u></span></p><p class="MsoNormal"><b><span style="font-size:10pt;font-family:Tahoma,sans-serif">From:</span></b><span style="font-size:10pt;font-family:Tahoma,sans-serif"> <a href="mailto:eeglablist-bounces@sccn.ucsd.edu" target="_blank">eeglablist-bounces@sccn.ucsd.edu</a> [mailto:<a href="mailto:eeglablist-bounces@sccn.ucsd.edu" target="_blank">eeglablist-bounces@sccn.ucsd.edu</a>] <b>On Behalf Of </b>Makoto Miyakoshi<br><b>Sent:</b> Friday, March 18, 2016 12:36 AM<br><b>To:</b> Tatu Huovilainen<br><b>Cc:</b> EEGLAB List<br><b>Subject:</b> Re: [Eeglablist] AMICA number of mixture components<u></u><u></u></span></p><div><div class="h5"><p class="MsoNormal"><u></u> <u></u></p><div><p class="MsoNormal">Dear Tatu,<u></u><u></u></p><div><p class="MsoNormal"><u></u> <u></u></p></div><div><p class="MsoNormal">That's a good question. I've only heard of a heuristic way to determine it.<u></u><u></u></p></div><div><p class="MsoNormal">Jason once told me to start with 5 or 6 models, and if you find one or two models that do not account for much data (you can check with the AMICA utility tools to see which model explains which part of the data), remove them... 
does that make sense?<u></u><u></u></p></div><div><p class="MsoNormal"><u></u> <u></u></p></div><div><p class="MsoNormal">Makoto<u></u><u></u></p></div><div><p class="MsoNormal"><u></u> <u></u></p></div></div><div><p class="MsoNormal"><u></u> <u></u></p><div><p class="MsoNormal">On Wed, Jan 20, 2016 at 7:35 PM, Tatu Huovilainen <<a href="mailto:Tatu.Huovilainen@helsinki.fi" target="_blank">Tatu.Huovilainen@helsinki.fi</a>> wrote:<u></u><u></u></p><p class="MsoNormal">Hi Makoto, Dr. Palmer & eeglab list,<br><br>I have a few specific questions about AMICA that I could not find answers to in previous discussions. What should I use as a criterion for choosing the 'num_mix_comps' parameter? I understand that increasing the number will result in a better model fit, but with a chance of overfitting. Is there a way to approximate how many mixture components it's ok to estimate, like the k(n_channels)^2 rule for infomax? Will it cause trouble (besides taking much longer), given that I have enough samples to avoid overfitting, if a source is well approximated with 3 densities but I'm using, say, 6? 
Are there other aspects of the data that affect choosing this number, like sensor types or snr?<br><br>Regards,<br>Tatu Huovilainen<br>_______________________________________________<br>Eeglablist page: <a href="http://sccn.ucsd.edu/eeglab/eeglabmail.html" target="_blank">http://sccn.ucsd.edu/eeglab/eeglabmail.html</a><br>To unsubscribe, send an empty email to <a href="mailto:eeglablist-unsubscribe@sccn.ucsd.edu" target="_blank">eeglablist-unsubscribe@sccn.ucsd.edu</a><br>For digest mode, send an email with the subject "set digest mime" to <a href="mailto:eeglablist-request@sccn.ucsd.edu" target="_blank">eeglablist-request@sccn.ucsd.edu</a><u></u><u></u></p></div><p class="MsoNormal"><br><br clear="all"><u></u><u></u></p><div><p class="MsoNormal"><u></u> <u></u></p></div><p class="MsoNormal">-- <u></u><u></u></p><div><div><p class="MsoNormal">Makoto Miyakoshi<br>Swartz Center for Computational Neuroscience<br>Institute for Neural Computation, University of California San Diego<u></u><u></u></p></div></div></div></div></div></div></div></blockquote></div><br><br clear="all"><div><br></div>-- <br><div class="gmail_signature"><div dir="ltr">Makoto Miyakoshi<br>Swartz Center for Computational Neuroscience<br>Institute for Neural Computation, University of California San Diego<br></div></div>
</div></div></div>