[Eeglablist] Letter of Support request - Hierarchical Event Descriptors (HED)

Sun Sep 24 14:55:24 PDT 2023

Dear EEGLABlist members -

Some of you may have heard of and even already made use of the system of
Hierarchical Event Descriptors (HED) for describing *What happened?* to the
participant during neuroimaging experiments. The only system for
systematically recording answers to this question in a format both human
readable and machine actionable is HED, first proposed and developed at
SCCN (UCSD) by then-student Nima Bigdely-Shamlo. HED is now endorsed within
the BIDS data formatting standards and is used in online data archives
*NEMAR.org*, *OpenNeuro.org*, and *EEGNet.org*. EEGLAB now includes plug-in
functions to read and write BIDS formatted data, and to search for events
in BIDS-formatted datasets that include HED tag event information.

Following a year of initial support, the U.S. National Institutes of
Health has invited us to apply for 5 years of continued support for
development and outreach. Letters of support from people working in EEG or
other time series neuroimaging modalities will be most useful in
interesting the peer review panel that will evaluate our proposal.

Therefore, we are asking for letters of support for our impending proposal
to BRAIN/NIMH to fund continued development of HED: “Hierarchical Event
Descriptors (HED): Standardizing event reporting to enable analysis-ready
data sharing,” in response to PA-20-185 NIH Research Project Grant.

Co-PIs on this proposal will again be myself and Kay Robbins (UTSA), with
Co-Investigator Arnaud Delorme and active support from Yahya Shirazi and
Dung Truong, who are making important contributions to our ongoing
progress. Our three proposed Specific Aims are attached. Dora Hermes of
Mayo Clinic (Rochester MN) will again be a Co-Investigator working on
description of events in clinical electrophysiology and on stimulus image
description. These and a few others meet regularly as the open HED Working
Group. Write privately for more information if you are interested in taking
part.

    Annotating more precisely the nature of events in neurobehavioral and
other time series data is an essential step in advancing understanding of
brain dynamics using new methods of artificial intelligence and machine
learning. HED is *still* the only system we are aware of that provides a
practical path for machine-actionable annotation of participant experience,
behavior, and neurodynamics during time series neuroimaging experiments.
Yet the work of developing, maintaining, and promoting the use of HED
annotation for storing, sharing, and analyzing neuroimaging time series
data is just beginning. Important technical programming and support tasks
need funding – hence the importance of our submission to continue current
development momentum.

    The HED vocabulary and infrastructure have substantially advanced over
the past three years, as summarized below. I also append for your possible
interest our current HED project draft Specific Aims. Finally, I attach a
template letter of support
<https://docs.google.com/document/d/1pf-QnnycWknGhMQDycTr-qBRYQJK0byQ3Dr1xLalSnQ>
to add to your letterhead and modify in any way you’d like.

We need to receive emailed letters of support by Tuesday, October 3, at the
latest, to include them in our October 5 submission.

     We also welcome, separately, any suggestions or questions about our
proposed Aims. We have a great opportunity to bring HED into widespread and
wide-ranging use over the next 5-6 years under BRAIN/NIMH funding, and
still have some days to (re)consider proposed activities, either technical
or outreach.

     Thank you again for your support, on behalf of the HED Working Group -

            Scott Makeig

Progress Summary: the Hierarchical Event Descriptor system, 2021-23

The third-generation HED standard schema vocabulary and syntax were
officially released and a formal specification document and two journal
articles were published in 2021. HED capabilities now include the ability to
represent events that extend over time and have intermediate points of
interest, in a manner that can provide analysis-ready metadata and event
context information at any point in the experimental timeline. HED tools
can now express condition variables and the experimental design, and can
discover factors in datasets with appropriate HED annotations. In addition, during
the last year we have continued to focus on making human use of HED as
simple and natural as possible, as we know this is key to wider adoption.
In this direction, we have begun promising tests exploiting use of
available large language models for [text description→HED tags] translation.
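
For readers who have not yet seen a HED annotation: it is a comma-separated
string of tags, with parentheses grouping the tags that belong together. The
following is a minimal Python sketch of reading such a string. It is not part
of the HED tools; the function name split_top_level is hypothetical, and the
example string is illustrative only and has not been validated against any
particular schema release.

```python
def split_top_level(hed_string: str) -> list[str]:
    """Split a HED string on commas that are not inside parentheses.

    Hypothetical helper for illustration; real validation and parsing
    should be left to the official HED tools.
    """
    parts: list[str] = []
    current: list[str] = []
    depth = 0  # current parenthesis nesting level
    for ch in hed_string:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                raise ValueError("unbalanced parentheses in HED string")
        if ch == "," and depth == 0:
            # A comma outside all groups ends the current top-level element.
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    if depth != 0:
        raise ValueError("unbalanced parentheses in HED string")
    parts.append("".join(current).strip())
    # Drop empty elements left by stray or trailing commas.
    return [p for p in parts if p]


# Illustrative annotation of a visual stimulus event (assumed example).
example = "Sensory-event, Experimental-stimulus, (Visual-presentation, (Square, Blue))"
top_level = split_top_level(example)
for element in top_level:
    print(element)
```

Keeping the parenthesized group intact is the point of the sketch: the
grouping is what ties "Square" and "Blue" to the visual presentation rather
than to the event as a whole.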

Our work already has practical application: we are adding HED tags, and
thereby analysis value, to increasing numbers of the 230+ EEG, MEG, and iEEG
datasets so far contributed to OpenNeuro and thereby made available for
inspection, download, or direct high-performance computing in our open
data, tools, and compute resource NEMAR.org. We remain eager to work with
data holders on this, as it both increases the amount and range of readily
analyzable shared data, and increases our understanding of challenges data
holders face in adding HED annotations to their datasets. We are also in
close contact with the Canadian EEGNet.org data archiving project at McGill
University, who are using HED to annotate events in electrophysiological
datasets they are sharing, and with the International Neuroinformatics
Coordinating Facility (INCF), with a view to soon submitting HED to be
recognized by them as a neuroinformatics standard. For still more detail on
the current status of HED, see the HED homepage (http://www.hedtags.org)
and the HED resources page (http://www.hed-resources.org).

One of HED’s key advances is the recognition that each research subfield
may need to employ its own specific HED terminology in addition to the
basic HED schema vocabulary. HED now fully supports the development, use
and validation of such library schema extensions. At the beginning of this
year, the first such specialized vocabulary was officially released: an
open-source system for ‘reading’ EEG data collected during clinical
examinations using the internationally accepted SCORE vocabulary for
describing, e.g., alpha bursts, ictal events, etc. This HED-SCORE library
schema effort is led by Dora Hermes Miller at Mayo Clinic, Rochester
MN. We are happy to say that groups are now building HED library schemas
(specialized vocabularies) for experiments using language terms and for
annotating standard sets of natural images and video stimuli, and that
other potential library schemas are under discussion.

Kay Robbins and her small programming team are doing a particularly
valuable job of formalizing the computer science specification,
maintenance, and upgrade path for HED. The last two years have seen a
remarkable convergence of opinion among neuroinformatics leaders (and
funders) that event annotation is an as-yet unfilled gap in annotation of time
series neuroimaging data. However, we see that much further work is
required 1) to extend the HED system to encompass annotation of more
complex stimulus and task paradigms, 2) to make the neuroimaging
community aware of the need for, and the tools for performing, HED annotation of
their data, and 3) to demonstrate the power of HED annotation for advanced
analysis and mega-analysis across as well as within datasets.

*Specific Aims:* Hierarchical Event Descriptors (HED): Standardizing event
reporting to enable sharing of analysis-ready time series data

Relating brain dynamics in recorded behavioral and neuroimaging time series
data to synchronous changes in subject experience, cognition, and
goal-directed actions and interactions is the major goal and challenge for
electrophysiological, hemodynamic, and brain/body imaging studies.
Standardizing the descriptions of in vivo-recorded as well as post
hoc-identified event processes that unfold in time during data recording –
affecting participant experience, cognition, and behavior – is essential to
effective
machine-aided search and systematic, reproducible analysis of time series
data, both within and across data sets, to model systems-level brain
function or perform functional biomarker discovery and validation. In
response to call PA-20-185 (R01) we here propose a five-year project to
advance the Hierarchical Event Descriptors (HED) ecosystem for describing
events of all types occurring in human neuroimaging experiments in
sufficient detail to enable and support comparative analysis of human brain
dynamics both within and across studies. Storing and sharing data absent
this information severely limits the feasibility of its participation in
any innovative and/or large-scale data analyses or meta-/mega-analyses.
Availability of well-annotated, analysis-ready data, plus standardized tools
to readily access and make use of it, is particularly essential to
leveraging the potential of applying powerful new artificial intelligence
(AI) modeling approaches to large, diverse collections of neuroimaging
data.

HED syntax, vocabulary, and accompanying software infrastructure have been
under continuous development for more than a decade (Bigdely-Shamlo 2013;
Makeig & Robbins 2023). To our knowledge HED is the only project filling a
longstanding neuroinformatics information gap – to adequately record
answers to the question, ‘Precisely what happened?’ during neuroimaging
data acquisition. That is, HED was the first and to our knowledge remains the
only system addressing the problem of capturing details of sensory,
behavioral, data-feature, and other events occurring in time series
neuroimaging experiments. The efforts of our open HED Working Group in
recent years, as described in the Research Plan, have significantly
advanced the capabilities, infrastructure, documentation, and supporting
tools for HED. We here propose a project to address many critical next
steps, grouped below into three Specific Aims:

Aim 1. Make HED more broadly applicable. We will continue to expand the
formally defined infrastructure, syntax, vocabulary, tools, and
documentation of the HED annotation system - with a focus on enabling the
storing, sharing and information mining of analysis-ready data. a)
Vocabulary: We will develop HED dictionary extensions (library schemas)
comprising terms used in particular neuroimaging research communities, in
particular (1) in language experiments, (2) to describe human body movements,
and (3) to describe complex stimulus material (e.g., natural view images,
movies and animations, human speech and other sounds, music, V-R and A-R
recordings). b) Data formats: We will expand HED annotation capabilities to
use information contained in derivative datasets (however preprocessed) in
BIDS format and will also work with developers of the NWB data formats to
incorporate use of HED in NWB-formatted data. c) Generality: We will
address identified challenges in capturing the relationships of events to
local and global event context including participant task and
data-authors’ study goals. We will add terms and syntax to specify
task-related linkage between
events and will develop a HED-compatible language for task representations.

Aim 2. Make HED more easily used. a) Immediate coding: We will develop
methods for incorporating HED-coded event information in datasets from the
very beginning of the experiment design, data collection, and data analysis
process by integrating HED into experiment control scripts – first within
the widely used open-source applications PsychoPy and BCI2000 and, as
possible, into other leading experiment control applications – while
maintaining compatibility with BIDS data formatting standards. b) Data
analysis tools: We will develop a suite of automated HED-aware data analysis
tools to perform new and efficient analysis of HED-annotated data, including
comparisons of event-related dynamics across datasets and paradigms, and
will incorporate these tools into analysis pipelines for leading open-source
electrophysiology tool environments EEGLAB, FieldTrip, MNE-Python
(EEG/MEG/iEEG), and FitLins (fMRI).
c) Data repository tools: We will improve HED-based event search mechanisms
and will integrate representations of experimental conditions and
participant tasks into event search and summary and dataset comparison
tools for use in data repositories including NEMAR / OpenNeuro, the
Canadian archive EEGNet (eegnet.org), and any other archives adopting BIDS
formatting. d) Complex stimulus library: We will create and maintain a
public ‘StimZoo’ of well-known HED-annotated complex stimulus sets
including the COCO natural images and often-used movies.

Aim 3. Make HED more widely used. a) User outreach: We will continue to
offer HED tutorial workshops as well as HED tutorial videos and other
teaching materials on how to use HED tools for data search, summary,
quality assessment, and analysis, and will hold online office hours to
assist users in annotating their data. b) Annotation assistants: We will
continue to improve our HED schema review and online annotation tools by
building HED online annotation assistants supported by newly available
large language model (LLM) technology. c) Neuroinformatics community
integrations: We will work with open neuroimaging data repositories, with
identified holders of large data collections, and with other neuroimaging
community stakeholders to increase the amount of available analysis-ready,
HED-annotated data. We will implement a HED governance plan to ensure its
future continuity, and will contact neuroscience journal publishers to
explore how HED annotation may be made more visible in publications.

-- 
Scott Makeig, Research Scientist and Director, Swartz Center for
Computational Neuroscience, Institute for Neural Computation, University of
California San Diego, La Jolla CA 92093-0559, http://sccn.ucsd.edu/~scott