The EEGLAB News #13

Question: I expected some variation in ICA decomposition between repeated runs, but not results that differ to this extent. Am I missing something obvious?

Explained further:

By chance I noticed that the results of the ICA decomposition and the subsequent classification with ICLabel vary considerably when run repeatedly on the same dataset (e.g., from 5 to 11 components classified as eye and/or muscle artifacts with 90% probability for the exact same dataset). This is only the case when the data are FCz-referenced (i.e., the online reference is kept unchanged). If the data are instead re-referenced to the average reference and then repeatedly run through ICA, the resulting decompositions and classifications appear more stable.

Before being sent to ICA decomposition, the data have been:

  1. high-pass filtered (1 Hz) with pop_eegfiltnew
  2. processed with pop_cleanline (default settings)
  3. processed with pop_clean_rawdata (default settings)
  4. channels removed in step 3 interpolated with pop_interp ('spherical')
  5. after these four steps, saved with the online reference FCz as one dataset, and re-referenced to the average reference (with FCz added back) and saved as a second dataset.
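For concreteness, the steps above might be sketched roughly as follows in an EEGLAB session. This is a sketch only: variable names (origChanlocs, fczLoc, etc.) are illustrative, and pop_cleanline and pop_clean_rawdata are shown without arguments to stand for their default settings.

```matlab
% Illustrative sketch of the preprocessing pipeline described above.
EEG = pop_eegfiltnew(EEG, 'locutoff', 1);          % 1 Hz high-pass filter
EEG = pop_cleanline(EEG);                          % line-noise removal (defaults)
origChanlocs = EEG.chanlocs;                       % keep full montage for interpolation
EEG = pop_clean_rawdata(EEG);                      % artifact rejection (defaults)
EEG = pop_interp(EEG, origChanlocs, 'spherical');  % interpolate removed channels

EEG_fcz = EEG;                                     % dataset 1: online FCz reference kept
% fczLoc (assumed here) is a chanlocs struct for the FCz reference electrode:
EEG_avg = pop_reref(EEG, [], 'refloc', fczLoc);    % dataset 2: average ref, FCz added back
```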

These two datasets are then repeatedly run through ICA decomposition using the following code:
>> EEG = pop_runica(EEG, 'extended', 1, 'interrupt', 'on', 'pca', channelRank);  % channelRank = rank of the data
>> EEG = pop_iclabel(EEG, 'default');
>> EEG = pop_icflag(EEG, [NaN NaN; 0.9 1; 0.9 1; NaN NaN; NaN NaN; NaN NaN; NaN NaN]);

I expected there to be some variation in decomposition (and possibly minor changes in which components were classified as artifactual) between repeated runs due to ICA starting with a random weight matrix, but not that the results of the classification would differ to this extent. Am I missing something obvious? Does the PCA before ICA add to the instability of the results? I have repeated this procedure for several different datasets and some appear more stable than others. Could this simply be a question of data quality leading to less stable ICA decompositions? Any form of assistance is appreciated.



Response by Scott Makeig:


Making or missing a 90% hard threshold in ICLabel can make the variability seem stronger than it is. Did you look at the actual distribution of e.g. Eye component likelihood values for 'candidate' Eye components?
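One way to inspect that distribution, sketched here using the fields the ICLabel plug-in stores in the EEG structure (class columns in the order Brain, Muscle, Eye, Heart, Line Noise, Channel Noise, Other):

```matlab
% After pop_iclabel: look at the full Eye-class probability distribution
% rather than only counting components that clear a 0.9 threshold.
probs   = EEG.etc.ic_classification.ICLabel.classifications;  % nComps x 7 matrix
eyeProb = probs(:, 3);                                        % Eye column
figure; histogram(eyeProb, 0:0.05:1);
xlabel('ICLabel Eye probability'); ylabel('number of components');
```

Components sitting just below or just above 0.9 show how much of the apparent run-to-run variability comes from the hard threshold rather than from the decomposition itself.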


Response by Anna:

Dear Dr Makeig,

Thank you very much for your assistance. I have tried lowering the threshold, which somewhat increased the similarity of the components classified as artifactual between repeated runs. The distributions of the eye and muscle classification probabilities were relatively similar between runs, but they still differed, and the ICA decompositions seem to vary more in datasets referenced to FCz than in those with the average reference. These variations further seem to have downstream effects: when I calculate the mean alpha power (8-13 Hz) for each dataset (i.e., the same dataset repeatedly run through ICA and ICLabel), the results vary. Thus, for average-referenced data, the processing pipeline is reproducible: repeated runs with the same dataset return similar results. For FCz-referenced data, however, the results vary between runs with the same settings. Note that I have only repeated the runs a small number of times.
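For reference, one way (of several) to compute such a per-channel mean alpha power estimate with EEGLAB's spectopo is sketched below; the frequency band and variable names are illustrative.

```matlab
% Mean 8-13 Hz log power per channel after component rejection.
% spectopo returns channel spectra in dB; frames = 0 uses the whole data.
[spectra, freqs] = spectopo(EEG.data, 0, EEG.srate, 'plot', 'off');
alphaIdx  = freqs >= 8 & freqs <= 13;
meanAlpha = mean(spectra(:, alphaIdx), 2);   % one dB value per channel
```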

I am wondering what might be causing this problem because I would ideally like to use a hard threshold with ICLabel instead of removing components manually, both for reproducibility reasons and because I have a large dataset.


Response by Scott Makeig:


I welcome this information that the distribution of brain and eye source components returned by ICA decomposition of your data appears to be more stable when the data are first converted to average reference. It might be of use to the field for you to test this observation more exhaustively and publish the results as a research note.

How might conversion to average reference make ICA decomposition in this sense more stable? Average reference is widely used, including in source analysis (e.g., in the FieldTrip-sourced equivalent dipole location procedure dipplot in EEGLAB). Might most or all of the ICLabel training data have been converted to average reference before ICA decomposition? This is quite likely, as this has long been our standard practice at SCCN. If so, this would suggest that the network trained on those data could be expected to be more accurate and stable when applied to novel data that had been converted to average reference.

Note also that Luca and I noted, in our paper introducing ICLabel, that while the percentage values ICLabel reports for the various component source categories can informally be treated as likelihoods, they are not formal probabilities, though they could be expected to map monotonically onto probabilities if these could be correctly computed. Thus, the use of a hard threshold (e.g., likelihood > 0.9) should not be interpreted as use of a formal probability threshold.

ICA decomposition, by infomax or in AMICA, is purposefully stochastic. There are two mechanisms implementing this. First, by default the initial source unmixing matrix is randomly chosen. Second, the order in which the data points are considered in each training step is freshly randomized at each step. These steps are taken to ensure that no part of the data is more dispositive than other parts simply by being considered first, and that the solution is not unduly biased by the insertion of a biased model at the beginning of the decomposition. (Note that in relica/binica, an option is given to use a known matrix when beginning the decomposition; this can be used to examine the progression of the decomposition by stopping and restarting at selected steps in the training process.)
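For debugging run-to-run variability, both sources of stochasticity can in principle be pinned down; the sketch below shows two hedged options. Fixing MATLAB's global random seed should make the random initialization and per-step permutations reproducible (assuming the ICA implementation draws from the global stream), and runica also accepts an explicit starting weight matrix.

```matlab
% Option 1 (assumption: runica draws from MATLAB's global random stream):
% fix the seed immediately before each run so repeated runs are identical.
rng(42, 'twister');
EEG = pop_runica(EEG, 'extended', 1);

% Option 2: supply a known initial unmixing matrix W0 via runica's
% 'weights' option, to restart a decomposition from a chosen point:
% EEG = pop_runica(EEG, 'extended', 1, 'weights', W0);
```

Note that fixing the seed hides, rather than explains, the instability; it is useful mainly for isolating which processing step introduces the variability.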

While classical statistics can be formally derived from the mathematical assumption that the underlying data distribution is gaussian, for ICA no closed form derivation is possible - even the conceptual bases of the decomposition method (entropy, mutual information) cannot be precisely computed from data of any appreciable (but still finite) size. The high degree of stability of infomax and AMICA decomposition when applied to EEG data could not be achieved if the source activity distributions were gaussian - they are, in nearly all cases, demonstrably nongaussian, meaning formal closed-form gaussian statistics computed from them cannot be completely accurate.

Brain activity is complex. Even the assumption that brain sources are temporally independent can be used only as a heuristic, since brain activity is affected by complex interconnections within the brain and to external events. ICA decomposition of scalp-recorded brain electrical activity by infomax or AMICA finds, in essence, maximally independent sources in the data it is applied to. Its effect is to eliminate or greatly reduce the effects of the broad mixing, in each scalp-recorded channel, of spatially - and typically functionally - distinct source signals by volume conduction through brain, CSF, skull, and scalp. The number of appreciable brain (principally cortical) and non-brain sources mixed into recorded scalp data appears to be in fact small enough that ICA decomposition can isolate them into separate components (ICs).

I refer to sources labeled by ICLabel as likely 'brain sources' as 'effective brain sources.' Why? Because they represent, I believe, emergent small areas of relative synchrony in cortical local field potential, in particular its aspect normal to the cortical surface, which should dominate 'far field' (e.g., scalp) recordings.

Appearance of cortical 'islands' or 'patches' or 'ponds' that exhibit relative local field synchrony across (e.g.) the scale of minicolumns (~0.1 mm in diameter) should dominate scalp recordings relative to activity in the much larger number of desynchronized cortical minicolumns, since the latter activities will largely be cancelled in scalp recordings by phase cancellation. Both ECoG recordings (including new, very high resolution recordings) and dynamical cortical models include appearances of such 'effective source' activity patterns. The success of ICA decomposition in isolating (and locating) these 'effective brain source' activities depends on their strong appearance (typically repeated in the data) at the same place in cortex. ICA decompositions show that locations of these effective sources clearly vary with subject state, intent, and experience.

Typically, one might expect that these 'effective source' processes within a given data set should range in total scalp projection amplitude from relatively large to very small - and should number more than the number of components found in the decomposition (i.e., the number of recorded channels). However, if this amplitude distribution includes relatively few large 'effective source' phenomena, ICA decomposition might be expected to isolate these into separate ICs.

Meanwhile, the non-cortical source processes that contribute appreciably to scalp recordings are, under favorable conditions, relatively few (major and minor eye movements, line noise, scalp muscle activities, etc.). But as these also include some degree of single-electrode noise, they must themselves be larger in number than the number of available component degrees of freedom (i.e., the number of scalp channels). The effect of electrode and room electrical noise was made clear to us when we examined EEG data collected in an MRI scanner housed in an electrically shielded chamber: even in simple scrolling presentation, the data appeared strikingly more 'clean' than data recorded in the laboratory under similar subject conditions.

These facts and models mean that the IC decomposition should in effect be attempting to unmix a relatively large number of source processes (mostly with relatively small contributions) into a fixed and relatively small number of ICs. In practice, when applied to well-recorded and not excessively 'noisy' appearing EEG data, IC distributions include at least 3 recognizable subsets, i.e. 'effective brain sources', 'effective non-brain sources', plus uninterpretable sources (typically small in effect) that can be said to constitute the 'noise subspace' of the decomposition. Of course, for most dataset decompositions, some components may not be confidently assigned to one of these three compartments (as ICLabel results also suggest).

I hope these thoughts might suggest to you and others a more nuanced view of the difficulty of the problem addressed by ICA decomposition of EEG (or related) data, and how far ICA may be (and is, in practice) able to succeed in separating the data into temporally independent components.  Its success, like that of any algorithm applied to complex data, rests on the degree to which the recorded (here, scalp) data are in fact principally the mixed output of a relatively small number of nearly independent source processes that make large contributions to the recorded signals relative to the vast number of quite small contributions made by e.g. local cortical and other electrical source processes, whose effects on the scalp data are further reduced by phase cancellation in the scalp channel-recorded source mixtures.

I have gone on a bit here ... Likely I should work this text into a publishable proposal for understanding the success of ICA applied to EEG data -- a contribution that could be made still more useful if it included evidence based on near-ground truth results derived from EEG and ECoG data...


Scott Makeig

Response by Cedric Cannard:


It has been highlighted in the field of frontal alpha asymmetry research that using a Cz reference (as commonly done) under- or over-estimates alpha activity at the target site, leading alpha activity at all other sites to correlate poorly with alpha activity estimated with other reference montages, such as mastoid or average reference. See the great paper by Smith et al. (2017), section "The Impact of Reference and Reference-free Transformations": "averaged mastoids and the average reference montages create a localization difficulty so that power from distant intracranial sources is apparent at an unrepresentative scalp electrode (see Figure 1). Most vexing is alpha 'mirroring' whereby frontal alpha power is contaminated by recording the opposite polarity of an oscillating occipital alpha dipole".

To me, the same would apply to your FCz reference. If ICLabel was trained on average-referenced data, I suspect that what you are observing might be due to frontal alpha asymmetry dynamics being over-estimated by your FCz reference, which sits in the middle of that region.

As you can see in Figure 1 of the paper above, some posterior alpha activity typically smears into frontal regions with an average or mastoid reference. So your topography might over-reflect frontal alpha asymmetry, which can have a topography resembling that of horizontal eye movements (a left/right frontal dipole). I don't know how much ICLabel relies on the power spectrum, or whether that would be enough to avoid this potential bias. But ICLabel might incorrectly tag these alpha asymmetries as eye movements, since they are normally not dominant compared to posterior alpha power.

Furthermore, posterior alpha asymmetries have also been observed (e.g., Davidson, 1992) and can go in the opposite direction to frontal ones. So your FCz reference might affect posterior alpha in some unknown way (e.g., phase cancellation by opposite dipoles in these frequencies?).

Referencing to a single electrode should generally be avoided, since all resulting activity is vulnerable to, and influenced by, large changes at that one site. An FCz reference is especially vulnerable to eye artifacts that may affect all other electrodes, so it could also simply be that you are smearing subtle eye movements into all your electrodes, confusing ICLabel?

Cedric Cannard