[Eeglablist] AMICA number of mixture components

Jason Palmer japalmer29 at gmail.com
Sat Mar 19 08:46:45 PDT 2016


Hi Tatu,

Sorry for the delay in responding. [Makoto, I think he is referring to the num_mix_comps variable which controls the source density mixture model, not the number of ICA models (num_models).]

The num_mix_comps setting controls the number of densities in the Generalized Gaussian mixture model used for each source density. The default is 3, which seems to work well: roughly, one density fits each tail and one fits the peak. Using 5 or more should give very similar results but take longer to run. Some very unusual source pdfs might benefit from a larger number of mixture components, but actual sources seem to be well represented by the 3-density mixture. In principle, having too many densities in the mixture could lead to overfitting, but with the usual number of samples in EEG data (100,000 or more), overfitting the source distributions is unlikely: even with 6 mixture components, the number of degrees of freedom is small compared with the number of samples. In general, ICA can work with simple sub- or super-Gaussian density models for the sources, but the estimation error for a finite number of samples depends on the fidelity of the source density model, with minimum variance achieved when the actual source density is used.
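
To make the degrees-of-freedom argument concrete, here is a minimal sketch in Python (not AMICA code). It assumes a common parametrization of the generalized Gaussian, with a location, a scale, and a shape per mixture component, plus mixture weights that sum to one. The function names are illustrative, and AMICA's internal parametrization may differ.

```python
import math


def gen_gauss_pdf(x, mu, scale, shape):
    """Generalized Gaussian density (assumed parametrization).

    shape = 1 gives a Laplacian, shape = 2 gives a Gaussian-shaped density.
    """
    norm = shape / (2.0 * scale * math.gamma(1.0 / shape))
    return norm * math.exp(-((abs(x - mu) / scale) ** shape))


def mixture_pdf(x, weights, mus, scales, shapes):
    """Weighted sum of generalized Gaussian densities for one source."""
    return sum(
        w * gen_gauss_pdf(x, m, s, r)
        for w, m, s, r in zip(weights, mus, scales, shapes)
    )


def density_params_per_source(num_mix_comps):
    """Free parameters of one source density under the assumptions above.

    (m - 1) free weights + m locations + m scales + m shapes.
    """
    return 4 * num_mix_comps - 1
```

Under these assumptions, 3 components give 11 free density parameters per source and 6 give 23, both very small next to 100,000 samples.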

It is also possible to use Gaussian mixture models instead of Generalized Gaussian, for comparison purposes. With Gaussians, a higher num_mix_comps should improve the estimation, since log-linear tails can be fit better with more Gaussians; the extra components are not necessary with the default Generalized Gaussian mixture model.
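
The point about log-linear tails can be illustrated with a toy experiment. The sketch below is plain Python, not AMICA's algorithm: it draws Laplacian samples (whose log-density is linear in |x|) and fits zero-mean Gaussian mixtures with a simple EM loop. The function names, the zero-mean restriction, and the initial variances are all assumptions made for this illustration.

```python
import math
import random


def laplace_samples(n, seed=0):
    """Unit-scale Laplacian samples: difference of two unit exponentials."""
    rng = random.Random(seed)
    return [rng.expovariate(1.0) - rng.expovariate(1.0) for _ in range(n)]


def normal_pdf(x, var):
    """Zero-mean Gaussian density with variance var."""
    return math.exp(-0.5 * x * x / var) / math.sqrt(2.0 * math.pi * var)


def fit_scale_mixture(xs, k, iters=30):
    """Toy EM for a zero-mean Gaussian mixture with k variances.

    Returns the mean log-likelihood of xs under the fitted mixture.
    """
    weights = [1.0 / k] * k
    variances = [4.0 ** j for j in range(k)]  # spread-out starting variances
    for _ in range(iters):
        sum_r = [0.0] * k
        sum_rx2 = [0.0] * k
        for x in xs:
            # E-step: responsibility of each component for this sample
            p = [w * normal_pdf(x, v) for w, v in zip(weights, variances)]
            total = sum(p)
            for j in range(k):
                r = p[j] / total
                sum_r[j] += r
                sum_rx2[j] += r * x * x
        # M-step: update weights and variances from the responsibilities
        weights = [s / len(xs) for s in sum_r]
        variances = [sx / s for sx, s in zip(sum_rx2, sum_r)]
    loglik = sum(
        math.log(sum(w * normal_pdf(x, v) for w, v in zip(weights, variances)))
        for x in xs
    )
    return loglik / len(xs)
```

Comparing `fit_scale_mixture(xs, 1)` with `fit_scale_mixture(xs, 3)` on the same `xs = laplace_samples(2000)` shows how much the fit to a log-linear tail improves as Gaussians are added. A single generalized Gaussian with shape 1 would match this density exactly.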

So, basically the num_mix_comps parameter is not meant to be changed in standard usage. The parameter may be automatically determined based on likelihood in the next version, which will have density models other than Generalized Gaussian, including non-symmetric (skew) models.

Best,

Jason

From: eeglablist-bounces at sccn.ucsd.edu [mailto:eeglablist-bounces at sccn.ucsd.edu] On Behalf Of Makoto Miyakoshi
Sent: Friday, March 18, 2016 12:36 AM
To: Tatu Huovilainen
Cc: EEGLAB List
Subject: Re: [Eeglablist] AMICA number of mixture components

Dear Tatu,

That's a good question. I've only heard of a heuristic way to determine it.

Jason once told me to start with 5 or 6 models and, if you find one or two models that do not account for much of the data (you can check with the amica utility tools which model explains which part of the data), remove them. Does that make sense?
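
The pruning heuristic above can be sketched as follows. This is a hypothetical Python illustration, not part of the amica utility tools: it assumes you already have, for each sample, the index of the model that best explains it, and the `min_share` threshold is an arbitrary choice that the thread does not specify.

```python
def model_shares(best_model_per_sample, num_models):
    """Fraction of samples for which each model is the best-fitting one."""
    counts = [0] * num_models
    for m in best_model_per_sample:
        counts[m] += 1
    n = len(best_model_per_sample)
    return [c / n for c in counts]


def models_to_keep(shares, min_share):
    """Indices of models that account for at least min_share of the data."""
    return [i for i, s in enumerate(shares) if s >= min_share]
```

For example, with toy labels where model 2 wins only 5% of the samples, `models_to_keep(model_shares(labels, 3), 0.1)` would drop model 2 and suggest rerunning with fewer models.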

Makoto

On Wed, Jan 20, 2016 at 7:35 PM, Tatu Huovilainen <Tatu.Huovilainen at helsinki.fi> wrote:

Hi Makoto, Dr. Palmer & eeglab list,

I have a few specific questions about AMICA that I could not find answers to in previous discussions. What should I use as a criterion for choosing the 'num_mix_comps' parameter? I've understood that increasing the number will result in a better model fit, but with a chance of overfitting. Is there a way to approximate how many mixture components it's ok to estimate, like the k(n_channels)^2 rule for infomax? Will it cause trouble (besides taking much longer), given that I have enough samples to avoid overfitting, if a source is well approximated with 3 densities but I'm using, say, 6? Are there other aspects of the data that affect the choice of this number, like sensor types or SNR?

Regards,
Tatu Huovilainen
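
For reference, the k(n_channels)^2 rule of thumb mentioned above amounts to simple arithmetic, read here as n_samples = k * n_channels^2. The Python sketch below only computes the two directions of that relation; the function names are illustrative, and the value of k to require is not specified in this thread.

```python
def samples_per_weight(n_samples, n_channels):
    """Return k such that n_samples = k * n_channels**2."""
    return n_samples / (n_channels ** 2)


def samples_needed(k, n_channels):
    """Number of samples implied by a chosen k and channel count."""
    return k * n_channels ** 2
```

For instance, 100,000 samples from a 64-channel recording correspond to k of roughly 24.4.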

-- 

Makoto Miyakoshi
Swartz Center for Computational Neuroscience
Institute for Neural Computation, University of California San Diego
