RERP

From SCCN
Jump to: navigation, search

rERP is an open source Matlab toolbox for calculating overlapping Event Related Potentials (ERP) by multiple regression (an alternative to averaging). It only works on continuous, un-epoched data.

Contents

Download

You can download the toolbox directly here and unzip it in the /plugins/ folder of EEGLAB (10.2.5.5b or later, otherwise will likely encounter errors). If EEGLAB is already running you may type:

>> eeglab rebuild

to load the toolbox. You should see ‘rERP’ menu under Tools once you have loaded a dataset.

Report an issue or make a suggestion

References

[1] M. D. Burns, N. Bigdely-Shamlo, N. J. Smith, K. Kreutz-Delgado, and S. Makeig, "Comparison of Averaging and Regression Techniques for Estimating Event Related Potentials," presented at the IEEE Engineering in Medicine and Biology Conference, 2013. [ cite ] [ pdf ]

[2] N. J. Smith and M. Kutas, "Regression-based estimation of ERP waveforms: II. Non-linear effects, overlap correction, and practical considerations." [ cite ] [ pdf ]

[3] N. J. Smith. (2011). Scaling up psycholinguistics. [ cite ] [ pdf ]

[4] N. Bidely-Shamlo, K. Kreutz-Delgado, M. Miyakoshi, M. Westerfield, T. Bel-Bahar, C. Kothe, et al., "Hierarchical Event Descriptor (HED) Tags for Analysis of Event-Related EEG Studies," IEEE GlobalSIP, Austin, TX, 2013. [ cite ] [ pdf ]

Quick Start (read first)

  1. Load a continuous (un-epoched) dataset or study in EEGLAB and select the Tools-rERP-Run analysis. This will bring up the rERP setup GUI. If you haven’t used this toolbox before, the Select profiles list should be empty. Select one of the datasets from the list and click “Make profile for dataset” at the bottom.

    Rerp plugin gui.png Rerp setup gui.png

  2. Once that completes, click Edit profile at the bottom right. This will bring you to the profile GUI: where you specify the regression settings.

    Rerp profile gui.png

  3. If you have not computed ICA on that dataset, hit the Components toggle button to switch to Channels. By default, it operates on ICs. If you have less than 8GB of memory available, change the artifact function name from “rerp_reject_samples_robcov” to “rerp_reject_samples_probability”. rerp_reject_samples_robcov is more powerful, but more expensive.

    Profile11.png

  4. Under Included event types you will see a comprehensive list of all event codes extracted from the EEG struct. By default, all event types are included as variables in the regression model. Be sure to remove any redundant or superfluous event types by selecting them and pressing the “Remove >>” button.

    Profile3.png

  5. Press Run at the bottom right corner. If you have Parallel Computing Toolbox, be sure to start a pool first: the program will parallelize over the individual time series.
  6. When it finishes, go back to the EEGLAB GUI and select Tools-rERP-Plot results. This will bring up the plotting GUI.

    Rerp plugin gui.png Rerp result gui.png

  7. Under Results, you see the result of the analysis you just ran.
  8. Under Plot types heading, you see a list of possible plots to perform for the currently selected result; select one of them.
  9. Under Components (R2) or Channels (R2), you see a list of channels or ICs with their average R-Squared value in parentheses. You can choose multiple time series at once and it will plot them all.
  10. Under the Event types heading, you see a list of event types for that result. You can choose multiple event types at once and it will plot them all.
  11. Select Exclude insignificant, to plot only the ERP estimates which resulted in a statistically significant R2. This is off by default.
  12. Press Plot and a new plot window will appear. You can select another result and hit Plot again to plot multiple datasets on the same axes. You will have to hit Clear figure if you want to plot different time series or event types as each plot is simply overlaid on the previous one.
  13. Right click any axis to save that graph as an image.


Description

  • Traditionally, the way of calculating an ERP estimate is to average epochs time-locked to a stimulus class of interest. This technique places severe restrictions on the experimental protocol: only a small number of stimulus categories can be used, stimulus events must be well separated in time and all other cognitive processes must be held constant. Violating the latter conditions will cause the ERP to be estimated sub-optimally.
  • [1 ] demonstrated that a multiple regression model is capable of explaining more experimental data variance than averaging for overlapping ERPs. By adopting a simple regression model over averaging, we make an additional assumption: that the observed EEG signal is a linear combination of ERPs added to white Gaussian noise.
  • Under this new assumption, ERPs combine additively when they overlap in time. This is also known as a Linear Time Invariant (LTI) model. We can illustrate the concept graphically as a convolution of impulsive inputs with the characteristic brain response (ERP):
    LTI system.png

  • We can rewrite the convolution as a sum of matrix-vector products. Here, Xi is the matrix which predicts the occurrence of event type i and hi is the ERP vector for event type i.
    Matrix vector1.png

  • We stack the ERP vectors into a single vector h and consolidate the predictors for the different event types into a single predictor matrix, X. Now, y = Xh and we can solve for h using our favorite technique, such as Penalized Least Squares.
    Matrix vector2.png

The rERP toolbox includes:

  • Routines for constructing sparse predictor matrix X and solving for h
  • GUI interface to EEGLAB
  • Time-frequency analysis (rERSP) analogous to ERSP
  • HED Tag (nested factors) support
  • Automatic artifact rejection using [cleanrawdata] or [probability]
  • Convex regularization (LASSO, Ridge, ElasticNet) with grid search
  • Tools for plotting and publishing results
  • Statistical significance derived from R-squared


GUI


Setup GUI

Rerp setup gui.png

  1. Select datasets to analyze: Select which datasets to analyze
  2. Add/Clear datasets: Add or remove datasets from the list
  3. Make profile for dataset: Create a profile based on the currently selected dataset
  4. Select profiles: Select which profiles to run against the selected datasets
  5. Add/Clear profiles: Add or remove profiles from the list
  6. Edit profile: Launch GUI to make changes to the selected profile
  7. Cancel/Run: Cancel GUI, or run analysis with selected datasets and profiles


  • Start the setup GUI from EEGLAB as described in the Quick Start section above, or enter at the command line
>> rerp_result_study = pop_rerp_study(‘eeg’, EEG); 

or

>> rerp_result_study = pop_rerp_study(‘study’, STUDY);
  • All datasets you select will be analyzed with all profiles you select. Selecting 3 datasets and 2 profiles will yield 6 results total.
  • See Toolbox Structure and Scripting section for more information on how profiles are used in the toolbox.


Profile GUI

Rerp profile gui.png

  • The profile GUI modifies RerpProfile handle objects. You can launch the GUI from the command line by entering
>> rerp_profile_gui(rerp_profile);  


Profile11.png

  1. Auto-save results: toolbox will save results to disk automatically at the path to the right. You can change the path manually or by pressing “Browse path”.
  2. rERSP: Do a time-frequency decomposition on the data, then perform regression. “Number of bins” defines the frequency resolution of the transform. Comparable to ERSP, but using regression instead of averaging.
  3. Include/Exclude components/channels: Select which time series are to be analyzed. You can change the selection method by pressing the toggle buttons to the right.
  4. Category/Continuous epoch boundaries: Choose the epoch window boundary, relative to event onset. See the Description section above for information about Continuous Variables.
  5. Artifact reject function: If you have already removed artifact data samples by some other means, deselect. Otherwise leave it checked. The toolbox will automatically ensure that artifact frames have been identified and excluded from regression.
    1. Once a profile has been updated with artifact indexes, it will display the number of frames identified. Function used to identify artifact indexes. It should follow the prototype artifact_indexes = artifact_function(EEG).
    2. Two functions are included with the toolbox: rerp_reject_samples_robcov and rerp_reject_samples_probability. rerp_reject_samples_robcov is better, but uses more memory.


Profile3.png

  1. Included/Excluded event types: The event types in the “Include” list will be variables in regression. This applies even when doing hierarchical regression with HED tags: only events of the included type are considered. While selecting event types, you can hit enter, which will bring up a window for entering descriptions of the event types. This will be useful later for plotting.
    :Profile4.png
  2. << Add/Remove >>: move selected event types to the Include event types or Exclude event types list respectively.


Profile5.png

  1. Hierarchical regression: If the EEG.event(:).hedTag field has been populated with HED strings, you have the option of using HED tags as regression variables in lieu of event types. See the section About HED Tags for more information about what HED tags are and when you should consider using them.
  2. Enforce HED specification: If selected, the toolbox will throw an error if the HED tags do not conform to the HED specification. Useful for comparing cross-study results.
  3. Display HED hierarchy: show the structure of the HED hierarchy for this profile.
    Hed tree.jpg
  4. Include tags: The hierarchical version of “Include event types”. All tags in this list are treated as variables in regression. Some of these tags will have markings, such as stimulus/instruction in the above screenshot. * stimulus/instruction * indicates that tag has been excluded from regression (because it is redundant with stimulus/instruction/fixate). Similarly, tags enclosed with { } are affected by separator tags and tags enclosed in [ ] are combined into a single Continuous variable.
  5. Exclude tags: are excluded from regression
  6. Separator tags: If we want to separate certain events into different categories, we can add separator tags to their HED strings. For example, if we want to divide the data into two blocks and see how the rERP estimates are different between the blocks. Perhaps the experimental conditions changed between the two blocks. We can add the tag Custom/Block/1 to the HED strings of all events in the first block and Custom/Block/2 to all events in the second block. We add Custom/Block/1 and Custom/Block/2 to the list “Included event types” and add Custom/Block to the Separator tag list. This creates two separate variable spaces for all of the included tags; one for each block. The number of parameters to estimate effectively doubles. If there were three blocks, it would triple, and so on.
    Separator tags.png

  7. Continuous tags: We may believe that the amplitude of the evoked response is related to another continuous variable that we have access to (e.g. intensity, acceleration). If so, we can tag each event with the value of that continuous variable when the event occurs (e.g. Custom/Mag/2.5, Custom/Mag/.22, … ). This models the scaled responses within the regression framework. The relationship between the continuous variable and the amplitude of the response need not be linear. In the example, we add all the continuous tags (there may be many) to the “Include tags” list and add Custom/Mag to the “Continuous tags” list. This creates a single regression variable Custom/Mag with scaled event responses
    Continuous tags.png

  8. Include/Exclude/Separator/Continuous: Add selected tags to the respective list
  • We display a parameter count as a quick check on how many variables the model is generating. It is quite possible to generate highly collinear model, which overtaxes the data with a relatively low parameter count [ 2 ]. Future versions may include Variance Inflation Factor analysis to give a better idea of how hard we are pushing the data.


Profile6.png

  1. Regularization (recommended): Enable penalized regression. Prevents overfitting of data. See regularization wiki for more details.
  2. Grid search (recommended): Find the regularization constants (lambda) by testing a grid of values. Lambda which yields highest R2 is selected (i.e. best linear prediction of the data using cross-validation).
  3. Lambda: If “Grid search” is disabled, we can enter a specific lambda to use. Always enter two values, even if only using one of them.
  4. Penalty function: Recommend L2 norm (Ridge regression). This is the fastest method and it has not been shown that the L1 Norm (Lasso) and Elastic net variants are any better for ERP regression purposes.


Profile7.png

  1. Load profile/Save profile: To try many analyses on a single dataset to see what works, press “Force recompute artifact frames”, then save variants of the profile that you want to run.
  2. Set default profile: When generating new profiles, these settings will be transferred.
  3. Cancel/OK: Cancel will cancel all changes to the profile since you opened the profile GUI. OK keeps the changes.
  4. Set advanced options: Set advanced options for grid search.
    :Profile8.png
    1. Number of grid zoom levels: We can do multiple levels of grid search, zooming in on the local optima that we find. Two is a conservative number. One is just a normal “flat” grid search.
    2. Number of grid points: After the first level search, how many points to generate around the local optimal value in deeper levels.
    3. Number of cross-validation folds: Number of cross-validation folds to use when computing R-squared. Used in grid search and final results regardless of whether regularization is enabled.
    4. First phase lambda: The exact lambda values to use for the first grid search. This determines where the grid search will look for lambda in subsequent levels as well. Default is set to a wide range of positive, exponentially increasing lambda values.
    5. Elastic net quick zoom (experimental): There is preliminary evidence that optimal Elastic net lambda (combined L1 and L2 norm) R2 values are achieved around the optimal separate L1 and L2 norm lambda. This finds the optimal L1 and L2 lambda values first, then does the full grid search around those values.
    6. Save all grid search results: If you want to explore the predictive surface of the grid search, this will save every result that the grid search generates.


Result GUI

Rerp result gui.png

  1. Results: List of results to b plotted. Select one result at a time to be plotted. Selecting multiple results at the same time allows for group level plotting of those results (for Total R2 and Event Type R2).
  2. Plot type: List of available plots will change depending on which result is selected. Possible plots:
    1. Rerp by event type: Make a single plot for each event type/HED tag selected. If multiple time series are selected, they are plotted on the same axis.
      Rerp by event type.png
    2. Rerp by channel/component: Make a single plot for each time series selected. If multiple event types/HED tags are selected, they are plotted on the same axis
      Rerp by component.png
    3. R2 total: Plot R-Squared (coefficient of determination) for each time series taking into account all event types/HED tags. This is essentially the percentage of total data variance predicted by a linear model.
      R2 total.png
    4. R2 by event type: Plot R-Squared (coefficient of determination) for each time series taking into account each event type/HED tag separately. This is essentially the percentage of total data variance accounted for by a linear model with only that event type/HED tag.
      R2 event type.png
    5. Rerp image: Plot original data epochs, modeled data epochs and difference (noise) epochs next to each other.
      Rerpimage.png
    6. Rersp: Analogous to ERSP, but with regression (on db Power), instead of averaging.
      Rersp.png
    7. Grid search: View the predictive surface for the grid search. Shows how performance changes by varying lambda. For multilevel grid search, this produces as many graphs as there are levels.
      Gridsearch1.png Gridsearch2.png Gridsearch3.png

  3. Rerp image length: When Rerp image plot type is selected, this determines the epoch window length.
  4. Channels/Components (R2): List of time-series that were analyzed. It is possible to plot multiple time-series at once.
  5. Sort by R2: Sort the Channels/Components list by total R2. Shows the components which had more variance accounted for at the top.
  6. HED tags/Event types: List of variables that were included in the regression model. It is possible to plot multiple tags/types at once.
  7. Locking/Sorting variable: For Rerp image. Two variables must be selected, locking and sorting. Use this toggle button to cycle between selecting each variable.
  8. Exclude insignificant: Only plot waveforms which had statistically significant R2.
  9. Overplot: Each call to plot will be superimposed on the same axis. Used for comparative plotting, perhaps with different datasets. Only overplots if the plot type is the same.
  10. Constant scale: All subplots will be forced to have the same scale. Scale is chosen so that all graphs are visible.
  11. p < : For plots which identify statistical significance (i.e. Rerp by event type, Rerp by channel/component), determines p-value threshold for significance.
  12. Load results/Save result as: Load multiple results from other locations. By default, the GUI loads all results in the rERP/results folder. Resave a particular result to another location.
  13. Display profile: Display the profile used to compute the currently selected result.
  14. Clear figure: Clear the plotting window of all plots.
  15. Plot: Plot results based on currently selected options in the GUI. You can plot the results of multiple datasets on top of each other by hitting plot in between changing the results (results must be compatible for this to work properly, i.e. be from the same profile). Combined statistics from multiple results will be available in future releases by selecting multiple results; currently on one result may be plotted at a time.


Toolbox Structure and Scripting

  • The best, most up to date reference for specific functions and classes can be found using the Matlab documentation system
>> doc RerpProfile; 
  • Simple examples of scripting the toolbox can be found in rerp_example.m.
  • The basic premise is to consolidate all information needed to specify a regression into one object: a profile (of class RerpProfile). The only information not specified in the profile is the data.
  • A profile is specific to the dataset it was formed from: it contains event timings and artifact indexes, as well as generic settings such as epoch boundaries and penalty functions.
  • The profile and EEG struct (containing the data) are sent to the pop_rerp function which performs checks on the system and ultimately calls the low level rerp function.
  • The pop_rerp function, after analyzing the data according to the profile, returns a result object (of class RerpResult).

Toolbox overview.png

  • The result object (RerpResult) contains the results of the regression, as well as the profile that was used to derive it.


Class structure.png

  • In general, it is easiest to use the GUI for modifying profiles and plotting results.


About HED Tags

HED website

Hierarchical Event Description (HED) tags were proposed in [ 4 ] as a way to standardize EEG experimental event descriptions. One benefit of this approach is that it permits higher level analyses involving multiple studies and experiments. For regression purposes, we use hierarchical event descriptions to define hierarchical regression models. This is best demonstrated with a simple example.

  • Say we have an experiment with two event types: event type 1 is the presentation of an image, event type 2 is the presentation of an image with a target that the subject is searching for. We tag event type 1 as Stimulus/Expected and event type 2 as Stimulus/Expected/Target.
  • The tag Stimulus/Expected/Target has three sub-tags: Stimulus, Stimulus/Expected and Stimulus/Expected/Target. Tagging an experimental event with Stimulus/Expected/Target is equivalent to tagging it with all three sub-tags. Both event types 1 & 2 inherit the tag Stimulus/Expected, but only event type 2 gets Stimulus/Expected/Target.
  • Of course, we can also use tags to express non-hierarchical event descriptions. Tagging event type 1 as Stimulus/1 and event type 2 as Stimulus/2, we are back to using the standard event codes as our regression variables.


Non hierarchical.png Heriarchical overlap.png

  • In the regression framework, we interpret the sub-tags as distinct variables, analogous to using event types as variables in non-hierarchical regression.
  • In our example, we define a variable Stimulus/Expected (any time event types 1 or 2 occur) and Stimulus/Expected/Target (any time event type 2 occurs). When we include both variables in a regression model, we are effectively modeling the ERP of event type 2 as the sum of the responses to the sub-tags.
  • Another way to put it: event type 2 is the response to being shown any image (Stimulus/Expected) plus an additional response due to the fact that the image is a target (Stimulus/Expected/Target).
  • We can interpret the “additional response” to the target image as a recognition event, or P300 response. We have used the overlap modeling capability of regression to decompose an ERP into different layers which combine additively to produce the overall effect.


Hierarchical summation.jpg



  • Another benefit of hierarchical event descriptions is that each experimental event is allowed to be defined in many different ways simultaneously. Each event is tagged with a HED string (e.g. EEG.event( i ).hedTag = ‘Stimulus/Expected; Stimulus/Visual; Custom/Block/1; Custom/Reaction time/2.33’ ) which is composed of many HED tags separated by comma or semicolon.
  • This allows us to capture very specific information about each event, while still retaining the ability to reference variables which are common across many events. The toolbox will compile the HED strings from all experimental events into a master hierarchy. From there, the user can pick out which tags to include in the regression model.

About hed tags.png

Feature Backlog

  • Variance Inflation Factor (VIF) analysis
  • Ability to perform regression in different basis (e.g. spline, Gaussian)
  • Statistics: pointwise statistics for ERP and ERSP estimates (glm-ie integration)


Credits

Developed and Maintained by: Matthew Burns (SCCN, INC, UCSD)
Email: rerptoolbox (at) gmail (dot) com


Significant contributions from:

Supported by a gift from The Swartz Foundation (Old Field, NY).