Independent Component Clustering Example
Comparing Independent Components Across Subjects
for a volume in the Elsevier Progress in Brain Research book series, Eds. Neuper & Klimesch, 2006, in press.
Do different subjects performing the same task have equivalent independent EEG components (ICs)? In exchange for the benefits that ICA offers to EEG analysis in both spatial and temporal resolution of separable source-level activities, it also introduces a new level of complexity. In traditional scalp channel signal analysis, clustering of event-related EEG phenomena across subjects is straightforward, as results from each scalp electrode are assumed to be comparable with results from equivalently placed electrodes in all other subjects. Comparing ICA results across subjects, on the other hand, requires that ICs from different subjects be grouped, where possible, into clusters of ICs that are functionally equivalent despite differences in their scalp maps.
As we have seen, however, data recorded at a single scalp channel within each subject is heterogeneous, so the idea of grouping channel activity across subjects may actually be a risky proposition. In particular, clear physical differences between subjects in the locations and, especially, the orientations of cortical gyri and sulci mean that even exactly equivalent cortical sources may project, across subjects, with varying relative strengths to any single scalp channel location, no matter how exactly that location is reproduced across subjects. Thus, the basic assumption in nearly all EEG research, that activity at a given scalp location should be equivalent in every subject, is itself questionable. On the other hand, changing the basis of EEG evaluation from scalp channel recordings to IC activities necessitates an extra step compared to channel analysis -- that of combining and/or comparing results across subjects by identifying equivalent IC processes, if any, in their data.
Approaches to IC clustering. The process of identifying sets of equivalent ICs across subjects, or even across sessions from the same subject, can proceed in many ways depending on the measures and experimental questions of interest. An appealing approach to clustering ICs is by their scalp map characteristics. Such clustering can be attempted by eye, by correlation, or by an algorithm that searches for common features of IC scalp maps. The disadvantage of this method is that, as shown in Fig. 2, slight differences across subjects in the orientation of equivalent dipoles for a set of equivalent ICs can produce quite different IC scalp maps.
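Clustering scalp maps by correlation can be sketched as follows. This is a minimal illustration, not a published method: the function name and the 32-channel random maps are hypothetical, and all maps are assumed to share one channel montage. Because ICA leaves the polarity of each scalp map arbitrary, the sketch compares absolute correlations.

```python
import numpy as np

def scalp_map_similarity(maps):
    """Absolute Pearson correlation between every pair of IC scalp maps.

    maps: array of shape (n_ics, n_channels), one shared channel montage assumed.
    """
    maps = np.asarray(maps, dtype=float)
    # Remove each map's mean and scale to unit norm so a dot product is a correlation.
    centered = maps - maps.mean(axis=1, keepdims=True)
    normed = centered / np.linalg.norm(centered, axis=1, keepdims=True)
    # ICA map polarity is arbitrary, so only the magnitude of the correlation is kept.
    return np.abs(normed @ normed.T)

rng = np.random.default_rng(0)
base = rng.standard_normal(32)                 # hypothetical 32-channel scalp map
maps = np.vstack([base, -base, rng.standard_normal(32)])
sim = scalp_map_similarity(maps)               # sim[0, 1] is 1: same map, flipped sign
```

A similarity matrix of this kind only captures map shape; it cannot recognize two maps as equivalent when a small change in dipole orientation has reshaped them, which is the weakness noted above.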
Clustering ICs based on the 3-D locations of their equivalent dipoles may avoid this problem. Using this method, it is possible to describe typical event-related or other activities in cortical areas of interest, or at least in cortical areas with sufficient density of IC equivalent dipoles across subjects or sessions. Common clustering algorithms such as K-means and other distance-based algorithms can be used to cluster ICs based on the 3-D locations of their equivalent dipoles quickly and easily. However, clustering on estimated cortical location alone may introduce confounds similar to those of clustering by scalp channel location, since subjects may have multiple types of IC processes in the same general cortical regions.
For one thing, comparing cortical locations across subjects raises the same spatial normalization questions that arise in functional magnetic resonance imaging (fMRI) analysis. Since brain shapes differ across subjects, true comparison of 3-D equivalent dipole locations should be performed only after spatially normalizing each set of subject IC locations to his or her normalized individual structural magnetic resonance (MR) brain image. This requires that MR images be obtained for each EEG subject, a requirement that may greatly increase the resources required for EEG data acquisition.
A simpler method normalizes the 3-D equivalent dipole locations by normalizing the subject head shape, as learned from the recorded 3-D locations of the scalp electrodes, to a standard head model. When 3-D electrode location information is not available, the functional specificity that can be expected of equivalent dipole clusters based on dipole locations estimated in a standard head model is necessarily lower. In this case, some IC processes estimated to be located in the same cortical area may not express the same functional activities. Despite this drawback, our results show that clustering component dipole locations in a standard spherical head model still allows meaningful conclusions about differences in regional EEG activities across one or more subject groups, if sufficient statistical testing is applied to the data, and if the limitations of the analysis are acknowledged.
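As a rough sketch of location-based clustering, the code below runs a plain K-means (Lloyd's algorithm) on 3-D dipole coordinates. Everything in it is illustrative: the coordinates are synthetic points scattered around three made-up centers, the dipoles are assumed to be already expressed in one common head model, and the farthest-point initialization is simply a deterministic choice made for this example.

```python
import numpy as np

def kmeans(points, k, n_iter=100):
    """Plain Lloyd's k-means with farthest-point initialization.

    Returns (labels, centroids) for points of shape (n_points, n_dims).
    """
    points = np.asarray(points, dtype=float)
    # Deterministic start: first point, then repeatedly the point farthest
    # from all centroids chosen so far.
    chosen = [points[0]]
    for _ in range(k - 1):
        d2 = np.min([((points - c) ** 2).sum(axis=1) for c in chosen], axis=0)
        chosen.append(points[d2.argmax()])
    centroids = np.array(chosen)
    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every centroid.
        d2 = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        new_centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids

# Hypothetical equivalent dipole locations (mm) around three made-up centers.
rng = np.random.default_rng(1)
centers = np.array([[0.0, -20.0, 60.0], [-40.0, -25.0, 50.0], [40.0, -25.0, 50.0]])
dipoles = np.vstack([c + rng.normal(scale=3.0, size=(20, 3)) for c in centers])
labels, centroids = kmeans(dipoles, k=3)
```

With real data the number of clusters is not known in advance, and ICs with different functional roles that happen to lie close together will share a label, which is the confound discussed above.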
Because homogeneity of an IC cluster is most accurately assessed and characterized by the activities of its constituent ICs, a more direct route to obtaining functionally consistent clusters may be to group ICs from experimental event-related studies according to their event-related activity patterns. For example, a recent EEG/fMRI study by Debener et al. (2005) clustered components contributing most strongly to the error-related negativity (ERN) feature of the average ERP time locked to incorrect button presses in a speeded choice manual response task. Remarkably, the authors showed that trial-to-trial variations in the strength of the activity underlying the ERN correlated with changes in the fMRI BOLD signal only in the immediate vicinity of the equivalent dipole source for the component cluster. In some cases, therefore, clustering ICs on similarities in their ERP contributions can be a simple but powerful approach to discovering sources of well-documented ERP peaks.
If the measure of primary interest is not the average ERP but, instead, event-related fluctuations in spectral power of the ongoing EEG across frequencies and latencies, as measured by average ERSPs, then component ERSP characteristics may similarly be used as a basis for IC clustering. Given a small number of subjects and a simple experimental design, it might be possible to group component ERSPs across subjects by eye, though this quickly becomes impractical as the number of subjects and/or task conditions rises. In any case, an objective approach is more desirable.
Figure: Clustering ICs from 29 subjects by common properties of their mean event-related activity time courses can be an efficient method for finding homogeneous groups of independent processes across sessions or subjects. Here, ICs were clustered by similarities in 3-D dipole location as well as features of their mean ERSPs time-locked to auditory performance feedback signals within two task conditions (following Correct and Wrong button presses) and by the significant ERSP difference (if any) between them (significance estimated by non-parametric binomial statistics, p < 1e-5). Colored spheres show the locations of the equivalent dipoles for the clustered components. Colored lines connect these clusters to the respective cluster-mean ERSP and ERSP-difference images. Although IC equivalent dipole locations were only a portion of the data used in the clustering algorithm, the equivalent dipole models for four of the obtained clusters (A-D) are spatially distinct. Two other, spatially intermingled clusters (E-F) illustrate how activity-based clustering can differentiate spatially similar components that would not be separated in clustering based on location alone.
As an example, let us consider data from the same two-back task described earlier and illustrated for one subject. Assume there are 20 subjects, each with a mean of 15 dipolar cortical ICs. To prepare the data, the 2-D (latencies, frequencies) ERSP images of each component for one or more task conditions (correct, incorrect, etc.) must be concatenated and then reshaped into a 1-D vector of size (1, latencies × frequencies × conditions). Thereafter, the vectorized component ERSPs from the 20 subjects can be concatenated to form a large 2-D matrix of size (#ICs, latencies × frequencies × conditions) or, in this case, (300, latencies × frequencies × conditions). A number of options are now available.
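The data preparation just described might look like the sketch below. The image sizes (50 latencies, 20 frequencies, 2 conditions), the random placeholder ERSPs, and the helper name `vectorize_component` are all assumptions made for illustration, and every subject is given exactly 15 ICs for simplicity.

```python
import numpy as np

def vectorize_component(ersps_by_condition):
    """Concatenate one IC's condition ERSPs (each latencies x frequencies) into a 1-D row."""
    return np.concatenate([np.asarray(e, dtype=float).ravel() for e in ersps_by_condition])

# Hypothetical sizes: 20 subjects x 15 ICs, 2 conditions, 50 latencies x 20 frequencies.
n_subjects, n_ics, n_conditions, n_lats, n_freqs = 20, 15, 2, 50, 20
rng = np.random.default_rng(0)

rows, owner = [], []
for subject in range(n_subjects):
    for ic in range(n_ics):
        # Placeholder data standing in for this IC's ERSP image in each condition.
        ersps = rng.standard_normal((n_conditions, n_lats, n_freqs))
        rows.append(vectorize_component(ersps))
        owner.append((subject, ic))      # remember which subject and IC each row came from

ersp_matrix = np.vstack(rows)
print(ersp_matrix.shape)                 # (300, 2000)
```

Keeping the `owner` list alongside the matrix makes it possible to map each clustered row back to its subject and component afterwards.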
A simple approach is to use standard clustering algorithms such as K-means to cluster on Euclidean distances between the rows of the component matrix, whose dimensionality can be made manageable by preliminary PCA reduction. It is also possible to combine dissimilar IC activity and/or location measures in computing component "location" measures from which component-pair "distances" can be derived. The open source EEG analysis toolbox EEGLAB includes a clustering interface that implements this method. The ICACLUST facility enables component clustering across subjects or sessions using a variable set of IC features: ERPs, ERSPs, scalp maps, mean spectra, and/or equivalent dipole locations.
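One way to build such a combined "location" space is sketched below: reduce the ERSP matrix by PCA, rescale each measure so that none dominates merely because of its units, apply a weight, concatenate, and compute pairwise Euclidean distances. This is an illustrative sketch only and is not EEGLAB's implementation; the function names, the choice of 10 principal components, the equal weights, and the random placeholder data are all assumptions.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project the rows of X onto their first principal components (via SVD)."""
    Xc = X - X.mean(axis=0, keepdims=True)
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :n_components] * S[:n_components]

def combine_measures(blocks, weights):
    """Center each measure block, scale it to unit total variance, weight, and concatenate."""
    scaled = []
    for block, w in zip(blocks, weights):
        block = block - block.mean(axis=0, keepdims=True)
        scaled.append(w * block / np.sqrt(block.var(axis=0).sum()))
    return np.hstack(scaled)

def pairwise_distances(F):
    """Euclidean distance between every pair of rows of F."""
    sq = (F ** 2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (F @ F.T)
    return np.sqrt(np.maximum(d2, 0.0))   # clip tiny negative rounding errors

rng = np.random.default_rng(0)
ersp_matrix = rng.standard_normal((300, 2000))        # placeholder vectorized ERSPs
dipole_xyz = rng.normal(scale=30.0, size=(300, 3))    # placeholder dipole locations (mm)

features = combine_measures([pca_reduce(ersp_matrix, 10), dipole_xyz], weights=[1.0, 1.0])
distances = pairwise_distances(features)              # (300, 300) component-pair distances
```

The weights set how much each measure contributes to the distance; raising the weight of the dipole block, for example, pushes the result toward purely location-based clusters.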
The figure above illustrates preliminary clustering results on 368 near-dipolar ICs from 29 subjects performing the two-back task described earlier. In the figure, equivalent dipoles of the same color were clustered by computing a Euclidean ‘distance’ measure between the concatenated average component ERSPs time-locked to auditory feedback tones signaling ‘correct’ and ‘wrong’ responses, the significant ERSP difference between the two, and the 3-D dipole location of each IC. The ERSPs plotted for each cluster represent the means over all the cluster components, after masking spectral perturbations not significant (p < 0.00001) by binomial probability across the set of clustered components.
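The text does not spell out the masking procedure, so the sketch below shows only one possible reading of it. It assumes each component already has its own boolean significance mask with a known single-component false-positive rate (`alpha_single`, a hypothetical parameter), and it keeps a time-frequency point in the cluster mean only if the number of components significant there would be improbable by chance under a binomial model.

```python
import math
import numpy as np

def binomial_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def masked_cluster_mean(ersps, sig_masks, alpha_single=0.01, alpha_cluster=1e-5):
    """Cluster-mean ERSP, set to zero where too few components are significant.

    ersps:     (n_components, n_freqs, n_latencies) component ERSPs
    sig_masks: same shape, True where a component's own ERSP is significant
    Returns (masked_mean, k_min), where k_min is the required component count.
    """
    ersps = np.asarray(ersps, dtype=float)
    n = ersps.shape[0]
    counts = np.asarray(sig_masks, dtype=bool).sum(axis=0)
    # Smallest count whose chance probability falls below alpha_cluster
    # (n + 1 means that no count can reach significance).
    k_min = next(
        (k for k in range(n + 1) if binomial_tail(n, k, alpha_single) < alpha_cluster),
        n + 1,
    )
    return np.where(counts >= k_min, ersps.mean(axis=0), 0.0), k_min

# Hypothetical cluster of 20 components on a tiny 2 x 2 time-frequency grid.
ersps = np.ones((20, 2, 2))
masks = np.zeros((20, 2, 2), dtype=bool)
masks[:5, 0, 0] = True        # five components significant at this point
masks[:4, 0, 1] = True        # only four components significant here
masked_mean, k_min = masked_cluster_mean(ersps, masks)
```

Under these assumed rates, a point needs agreement from several components before it survives masking, so isolated single-component effects do not appear in the cluster mean.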
Note that although IC equivalent dipole locations were here only a portion of the data used in the clustering, ICs with similar event-related activity patterns proved to be naturally associated with distinct cortical regions. From the ‘Difference’ activity of the central midline cluster (A, light blue), it is clear that this cortical area produced a different activity pattern following wrong responses, namely a 400-ms theta band burst that began before the auditory feedback, during the period of the motor response. This result is in line with our previous findings (Luu et al., 2004) and neatly reproduces the recent result of Debener et al. (2005), who used time-domain analysis of simultaneous EEG and functional magnetic resonance imaging (fMRI) data. They showed that trial-to-trial variations in post-error activity of a very similarly located IC cluster were correlated with trial-to-trial variations in the fMRI blood oxygen level-dependent (BOLD) signal only directly below the cortical projection of the component cluster, at a location highly coincident with that of the equivalent dipole cluster in the figure above.
ERSPs and equivalent dipole locations for three other activity-derived but spatially ‘tight’ component clusters are shown in the figure above. IC clusters located in or near left and right hand somatomotor cortex (B, yellow and C, magenta, respectively) exhibited significantly stronger alpha activity for a half-second after receiving Correct feedback than after receiving Wrong feedback, as assessed by the binomial statistics described above. Analysis of responses to matching versus non-matching letters (not shown) revealed the expected dominance of spectral perturbations contralateral to the actual response hand. Finally, event-related perturbations in spectral power in the bilateral occipital (D, green) IC cluster differed little following Correct versus Wrong feedback.
In contrast, the same clustering of ICs on dipole locations, condition ERSPs, and ERSP differences also produced two more spatially diffuse and highly overlapping occipital dipole clusters (red and blue). Although the spatial distributions of these two clusters cannot be distinguished, their difference ERSPs do differ. Much like the left somatomotor (yellow) cluster, components in the blue cluster exhibit increased alpha activity following Correct auditory feedback. Of course, interpretation of the clustered component activities is a problem separate from clustering. However, once successful clustering of independent component activities has been accomplished, meaningful conclusions about brain function may be approached with more confidence.
Julie Onton & Scott Makeig,
SCCN/INC/UCSD, March 2006