http://nemo.nic.uoregon.edu
NEMO Year 1:
From Theory to Application —
Ontology-based analysis of ERP data
August 20, 2009
Overview Agenda
• ICBO highlights (5 mins)
• Logistics (5 mins)
• ERP pattern analysis methods (20 mins)
• ERP measure generation (10 mins)
• Linking measures to ontology (10 mins)
• Data annotation (deep, ontology-based) (10 mins)
Action items highlighted in lime green!
First International Conference on
Biomedical Ontologies (ICBO’09)
http://precedings.nature.com/collections/icbo-2009
• High-level issues and "best practices" for ontology development
• Tools that may be of use for NEMO
• Potential collaborations
• Practical Questions/Issues to resolve
NEMO “to do” items
• Identify a "point person" at each site who will be responsible for contributing feedback on the NEMO wiki and ontologies, and for uploading data and testing MATLAB-based tools for data markup
– Please provide name & contact info for this person in an email
• Bookmark the NEMO website & explore links under “Collaboration” (more to come next time on how specifically you can contribute)
ERP Pattern Analysis
• An embarrassment of riches
– A wealth of data
– A plethora of methods
• A lack of integration
– How to compare patterns across studies, labs?
– How to do valid meta-analyses in ERP research?
• A need for robust pattern classification
– Bottom-up (data-driven) methods
– Top-down (science-driven) methods
Ontologies for high-level, explicit representation of domain knowledge → theoretical integration
Knowledge → Semantically structured (Taxonomy, CMap, Ontology, …)
Information → Syntactically structured (Tables, XML, RDF, …)
Data → Minimally structured or unstructured
Ontologies to support principled mark-up of data (inc. ERP patterns) → practical integration
NEMO principles that inform our
pattern analysis strategies
• Current Challenges (motivations)
– Tracking what we know
• Ontologies
– Integrating knowledge to achieve high-level
understanding of brain–functional mappings
• Meta-analyses
• Important Considerations (desiderata)
– Stay true to data
• bottom-up (data-driven methods)
– Achieve high-level understanding
• top-down (hypothesis-driven methods)
Top-down vs. Bottom-up
        Top-Down                        Bottom-Up
PROS    • Familiar                      • Formalized
        • Science-driven (integrative)  • Data-driven (robust)
CONS    • Informal                      • Unfamiliar
        • Paradigm-affirming?           • Study-specific results?
Combining Top-Down & Bottom-Up
Traditional approach to bio-ontology development
TOP-DOWN
Encode knowledge of concepts (=> classes, relations, & axioms that involve classes & relations) in a formal ontology (e.g., OWL/RDF)
NEMO OWL ontologies being developed & version-tracked on SourceForge (the main topic of our last meeting)
NEMO top-down approach
TOP-DOWN
NEMO emphasis on pattern rules/descriptions: a way to enforce rigorous definitions of complex concepts (patterns or “components”) that are central to ERP research
Superposition of ERP Patterns
What do we know about ERP patterns?
Observed Pattern = “P100” iff
• Event type is visual stimulus AND
• Peak latency is between 70 and 160 ms AND
• Scalp region of interest (ROI) is occipital AND
• Polarity over ROI is positive (>0)
[Slide labels: FUNCTION ?, TIME, SPACE]
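A pattern rule of this kind maps directly onto a boolean check. The following is a minimal sketch, not NEMO's implementation: the class `ObservedPattern`, its field names, and the string values are hypothetical and chosen only to mirror the four criteria above.

```python
from dataclasses import dataclass


@dataclass
class ObservedPattern:
    """Hypothetical container for the summary measures of one observed pattern."""
    event_type: str            # functional criterion, e.g. "visual stimulus"
    peak_latency_ms: float     # temporal criterion
    roi: str                   # spatial criterion: scalp region of interest
    mean_amplitude_uv: float   # sign over the ROI gives the polarity


def is_p100(p: ObservedPattern) -> bool:
    """Return True when all four criteria of the P100 rule hold."""
    return (
        p.event_type == "visual stimulus"
        and 70 <= p.peak_latency_ms <= 160
        and p.roi == "occipital"
        and p.mean_amplitude_uv > 0
    )
```

Writing the rule as a conjunction keeps each criterion separately inspectable, which matters later when individual criteria are tuned against data.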
Why does it matter?
Robust pattern rules are a good foundation for:
• Development of ERP ontologies
• Labeling of ERP data based on pattern rules
• Cross-experiment, cross-lab meta-analyses
BOTTOM-UP
Two classes of methods for ERP pattern analysis
• Pattern decomposition (focus today: already implemented & almost ready for YOU to test)
– Temporal factor analysis (tPCA, tICA)
– Spatial factor analysis (sPCA, sICA)
• Windowing/segmentation
– Microstate analysis (use global field “maps”; compute “global field dissimilarity” between adjacent maps to determine where there are significant shifts in topography)
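The dissimilarity computation behind microstate segmentation can be sketched as follows. This assumes the commonly used definition (average-reference each map, scale it by its global field power, then take the root-mean-square difference between adjacent maps); the function names and the `(n_timepoints, n_channels)` array layout are illustrative choices, not taken from the NEMO tools.

```python
import numpy as np


def global_field_power(maps):
    """GFP per time point: spatial standard deviation of each map.

    maps: array of shape (n_timepoints, n_channels).
    """
    centered = maps - maps.mean(axis=1, keepdims=True)
    return np.sqrt((centered ** 2).mean(axis=1))


def global_dissimilarity(maps):
    """Dissimilarity between each pair of adjacent maps (length n_timepoints - 1).

    0 means identical topography (regardless of strength); 2 means inverted.
    Maps with zero GFP would cause a division by zero and are not handled here.
    """
    centered = maps - maps.mean(axis=1, keepdims=True)
    gfp = np.sqrt((centered ** 2).mean(axis=1, keepdims=True))
    normed = centered / gfp
    diff = normed[1:] - normed[:-1]
    return np.sqrt((diff ** 2).mean(axis=1))
```

Peaks in the dissimilarity series mark candidate boundaries between segments of stable topography.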
Decomposition approach
PCA, ICA, dipoles etc.: multiple methods for principled separation of patterns using factor-analytic approach
[Figure: latencies 100ms, 170ms, 200ms, 280ms, 400ms, 600ms; pattern labels P100, N100, fP2, P1r/N3, P1r/MFN, P300]
Windowing/segmentation approach
Advantages over factor-analytic/decomposition methods:
• Familiarity: closer to what most ERP researchers do (manually)
• Fewer (or at least different!) concerns regarding misallocation of variance
• Robustness to latency differences across subjects, conditions
[Figure: latencies 100ms, 170ms, 200ms, 280ms, 400ms, 600ms; pattern labels P100, N100, fP2, P1r/N3, P1r/MFN, P300]
Michel et al., 2004; Koenig, 1995; Lehmann & Skrandies, 1985
What we’ve done (to date…)
• Implemented sPCA, tPCA, sICA, & microstate analysis
• Tested & evaluated sPCA, tPCA & sICA (following Dien, Khoe, & Mangun, 2008) using simulated ERP data
• Explored two different approaches to pattern classification & labeling (the step AFTER decomposition)
1. Data preprocessing
  1. filter & segment data
  2. detect & reject artifacts
  3. interpolate bad channels
  4. average across trials w/in subjects
  5. manual detection of bad channels
  6. interpolate bad channels
  7. re-reference montage (PARE)
  8. baseline-correct (200ms)
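Three of these steps (averaging across trials, re-referencing, baseline correction) reduce to simple array operations. The sketch below is illustrative only: it substitutes a plain average reference for the PARE correction named in the list, and the function name, array layout, and `n_baseline_samples` parameter are assumptions. Filtering, artifact rejection, and channel interpolation are not shown.

```python
import numpy as np


def average_rereference_baseline(trials, n_baseline_samples):
    """Average trials, apply an average reference, and baseline-correct.

    trials: array of shape (n_trials, n_channels, n_samples) for one subject
            and condition, with the baseline interval at the start of the epoch.
    n_baseline_samples: number of samples in the baseline interval
            (e.g. 50 samples for 200 ms at a hypothetical 250 Hz sampling rate).
    """
    erp = trials.mean(axis=0)                        # average across trials
    erp = erp - erp.mean(axis=0, keepdims=True)      # subtract mean over channels at each sample
    baseline = erp[:, :n_baseline_samples].mean(axis=1, keepdims=True)
    return erp - baseline                            # per-channel baseline correction
```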
2. Component Analysis
Our current practice (NOT set in stone!)
- Step 1: Apply eigenvalue decomposition method (e.g., tPCA)
- Step 2: Rotate ALL latent factors (unrestricted PCA)
- Step 3: Retain fairly large number of factors based on log of scree
- Step 4: Let ontology-based labeling (next slide) help determine which factors to keep and analyze!
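Steps 1 and 3 can be sketched with a covariance-based temporal PCA. This is a simplified illustration under stated assumptions: the data matrix has one row per observed waveform (for example, channel × subject × condition) and one column per time point, the factor rotation of Step 2 is omitted, and the function names are hypothetical.

```python
import numpy as np


def temporal_pca(data):
    """Unrotated temporal PCA via eigendecomposition of the covariance matrix.

    data: array of shape (n_observations, n_timepoints).
    Returns eigenvalues (descending), loadings (n_timepoints, n_factors),
    and factor scores (n_observations, n_factors).
    """
    centered = data - data.mean(axis=0, keepdims=True)
    cov = np.cov(centered, rowvar=False)          # time-by-time covariance
    eigvals, eigvecs = np.linalg.eigh(cov)        # ascending order
    order = np.argsort(eigvals)[::-1]
    eigvals = np.clip(eigvals[order], 0.0, None)  # guard against tiny negative values
    eigvecs = eigvecs[:, order]
    loadings = eigvecs * np.sqrt(eigvals)         # scale each factor by its standard deviation
    scores = centered @ eigvecs
    return eigvals, loadings, scores


def log_scree(eigvals, eps=1e-12):
    """Log eigenvalues, for choosing how many factors to retain from the scree plot."""
    return np.log(eigvals + eps)
```

Each column of `loadings` is a time course; the corresponding column of `scores` carries the spatial and condition variability that feeds the labeling step.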
3. Component Labeling
NEXT MAJOR CHALLENGE: How to tune pattern rules (particularly TI-max begin and end) to fit each individual dataset. Data mining on results from different component analyses? (Note: mining of tPCA data won’t help to refine temporal criteria.)
4. Meta-analysis (next milestone!!)
• Apply pattern decomposition & labeling to NEMO
consortium datasets
• Identify one experimental contrast for each analysis
• Compute Effect Size (ES) estimates for each study
• Run mixed effects analysis:
– test homogeneity of variance across studies
– if rejected, then test effects of variables that differ across studies, laboratories (e.g., nature of stimuli, task, subjects)
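The homogeneity step can be illustrated with Cochran's Q and the DerSimonian-Laird between-study variance estimate. These are one common choice for this kind of analysis and are an assumption here, not a statement of the method NEMO will use; the function names are hypothetical, and each study is assumed to supply an effect size estimate and its sampling variance.

```python
import numpy as np


def cochran_q(effects, variances):
    """Fixed-effect pooled estimate and Cochran's Q statistic.

    Under homogeneity, Q is approximately chi-square distributed with
    len(effects) - 1 degrees of freedom.
    """
    effects = np.asarray(effects, dtype=float)
    weights = 1.0 / np.asarray(variances, dtype=float)   # inverse-variance weights
    pooled = np.sum(weights * effects) / np.sum(weights)
    q = np.sum(weights * (effects - pooled) ** 2)
    return pooled, q, len(effects) - 1


def dersimonian_laird_tau2(effects, variances):
    """Between-study variance estimate (truncated at zero)."""
    _, q, df = cochran_q(effects, variances)
    weights = 1.0 / np.asarray(variances, dtype=float)
    c = np.sum(weights) - np.sum(weights ** 2) / np.sum(weights)
    return max(0.0, (q - df) / c)
```

A large Q relative to its degrees of freedom suggests that study-level moderators (stimuli, task, subjects) are worth testing.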
ERP Meta-analysis goals
1. Demonstrate working NEMO consortium
2. Demonstrate application of BrainMap-like taxonomy for classification of functional (experimental) contrasts
3. Show that ERP component analysis, measure generation, and component labeling tools can be used on a large scale
4. ** Show that a combination of bottom-up and top-down methods for refining pattern rules can be used to tune rules for detecting target ERP patterns across different datasets
5. ** Show that we can (semi-)automatically identify analogous patterns across datasets (follows from 4), enabling us to carry out statistical meta-analyses
** harder problems to discuss…
A Case Study with real data
(CIN’07 paper)
1. Real 128-channel ERP data
2. Temporal PCA used for pattern analysis
3. Spatial & temporal metrics for labeling of discrete patterns
4. Revision of pattern rules based on mining of labeled data
Example: Rule for “P100”
• For any n, FAn = PT1 iff
– temp criterion #1: 70ms < TI-max (FAn) < 170ms AND
– spat criterion #1: SP-r (FAn, SP(PT1)) > .7 AND
– func criterion #1: EVENT (FAn) = stimon AND
– func criterion #2: MODAL (EV) = visual
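The temporal and spatial criteria of this rule can be computed from a factor's time course and topography. The sketch below makes two assumptions that the slide does not spell out: TI-max is taken as the latency of the largest absolute loading, and SP-r as the Pearson correlation between the factor topography and a template topography for the target pattern. All function and argument names are hypothetical.

```python
import numpy as np


def ti_max(loadings, times_ms):
    """Latency (ms) at which the factor's absolute loading is largest."""
    return float(times_ms[np.argmax(np.abs(loadings))])


def sp_r(topography, template):
    """Pearson correlation between a factor topography and a template topography."""
    return float(np.corrcoef(topography, template)[0, 1])


def matches_p100_rule(loadings, times_ms, topography, template, event, modality):
    """Apply the temporal, spatial, and two functional criteria of the rule."""
    return (
        70 < ti_max(loadings, times_ms) < 170
        and sp_r(topography, template) > 0.7
        and event == "stimon"
        and modality == "visual"
    )
```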
Example of output [1]
Values for summary measures (for one subject, one of six experimental conditions)
Example of output [2]
Matches to spatial, temporal & functional criteria for one subject & one of six experimental conditions
Summary results for Rule #1
A Case Study with simulated ERPs
(HBM’08 talk)
1. Simulated ERP datasets
2. PCA & ICA methods for spatial & temporal pattern analysis
3. Spatial & temporal metrics for labeling of discrete patterns
4. Revision of pattern rules based on mining of labeled data
Simulated ERPs (n=80)
Patterns: P100, N100, N3, MFN, P300 (+ NOISE)
Simulated ERP Datasets (in DipSim)
Dipole Simulator (P. Berg)
Simulated ERP data: Creating individual ERPs
Source #  ROI             Intensity (uv / ma)  Latency (ms)  Location Theta  Location Phi  Orientation Theta  Orientation Phi  Eccentricity
1 (P1)    L-Occipital     3.5 | 45             050 : 150     -090.00°        068.20°       -090.00°           053.62°          0.81
2 (P1)    R-Occipital     4.0 | 45             055 : 155     090.00°         -068.20°      090.00°            -060.39°         0.81
3 (N1)    L-Parietal      -5.0 | -70           120 : 240     -100.02°        045.00°       -090.00°           036.44°          0.57
4 (N1)    R-Parietal      -4.0 | -70           130 : 250     100.02°         -045.00°      090.00°            -036.44°         0.57
5 (N1N2)  L-Temporal      -4.0 | -60           160 : 300     -110.59°        035.72°       -129.57°           019.93°          0.42
6 (N1N2)  R-Temporal      -2.0 | -60           170 : 310     114.00°         -033.23°      125.30°            -026.57°         0.40
7 (P2)    Medial-Frontal  2.5 | -30            210 : 390     056.59°         087.82°       -122.09°           083.111°         0.63
•Random jitter in intensity
•NO temporal jitter
•NO spatial jitter
Patrick Berg’s Dipole Simulator
BOTTOM-UP
Pattern Analysis with PCA & ICA (Decomposition approach)
ERP pattern analysis
• Temporal PCA (tPCA) ✔
– Gives invariant temporal patterns (new bases)
– Spatial variability as input to data mining
• Spatial ICA (sICA) ✔
– Gives invariant spatial patterns (new bases)
– Temporal variability as input to data mining
• Spatial PCA (sPCA) X
Multiple measures used for evaluation (correlation + L1/L2 norms)
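The evaluation measures named here can be computed in a few lines when a recovered pattern is compared against the known simulated one. This is a sketch with hypothetical names; it assumes both patterns are vectors of equal length on a comparable scale, which in practice means resolving the sign and scale ambiguity of PCA/ICA factors beforehand.

```python
import numpy as np


def compare_patterns(recovered, truth):
    """Correlation and L1/L2 distances between a recovered and a true pattern."""
    recovered = np.asarray(recovered, dtype=float)
    truth = np.asarray(truth, dtype=float)
    diff = recovered - truth
    return {
        "r": float(np.corrcoef(recovered, truth)[0, 1]),  # shape similarity
        "l1": float(np.abs(diff).sum()),                  # sum of absolute errors
        "l2": float(np.sqrt((diff ** 2).sum())),          # Euclidean distance
    }
```

Correlation is insensitive to amplitude errors, while the two norms are not, so reporting them together gives a fuller picture of recovery quality.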
New inputs to NEMO
[Figure: SPATIAL and TEMPORAL panels, each labeled TI-max, ROI, IN-mean (ROI)]
What we’ve learned (so far…)
• Bottom-up methods result in validation & refinement of top-down pattern rules
– Validation of expert selection of temporal concepts (peak latency)
– Refinement of expert specification of spatial concepts (± centroids)
• Alternative pattern analysis methods (e.g., tPCA & sICA) provide complementary input to bottom-up (data mining) procedures
BOTTOM-UP
Measure Generation
Vector attributes = Input to data mining (clustering & classification)
Input to data mining: 32 attribute vectors, defined over 80 “individual” ERPs (observations)
[Figure labels: T1, T2, S1, S2, CoN, CoP, ROI, ± Centroids]
BOTTOM-UP
Data mining
• Vectors of spatial & temporal attributes as input
• Clustering observations → patterns (E-M accuracy >97%)
• Attribute selection (“Information gain”)
[Figure labels: ± Centroids, CoN, CoP, Peak Latency ✔]
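Information gain, the attribute-selection criterion named above, measures how much knowing an attribute reduces the entropy of the pattern labels. The sketch below assumes a numeric attribute discretized into equal-frequency bins; the binning scheme, the function names, and the `n_bins` parameter are illustrative choices and not necessarily what the data mining tool used in the study does.

```python
import numpy as np


def entropy(labels):
    """Shannon entropy (bits) of a label array."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log2(p)))


def information_gain(attribute, labels, n_bins=4):
    """Entropy of the labels minus their expected entropy within attribute bins."""
    attribute = np.asarray(attribute, dtype=float)
    labels = np.asarray(labels)
    # Interior quantiles serve as bin edges (equal-frequency binning).
    edges = np.quantile(attribute, np.linspace(0.0, 1.0, n_bins + 1)[1:-1])
    bins = np.digitize(attribute, edges)
    conditional = 0.0
    for b in np.unique(bins):
        mask = bins == b
        conditional += mask.mean() * entropy(labels[mask])
    return entropy(labels) - conditional
```

Ranking attributes such as peak latency or centroid location by this score indicates which ones best separate the clustered patterns.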
Revised Rule for the “P100”
Pattern = P100v iff
• Event type is visual stimulus AND
• Peak latency is between 76 and 155 ms AND
• Positive centroid is right occipital AND
• Negative centroid is left frontal
[Slide labels: SPACE, TIME, FUNCTION]
Simulated ERP Patterns
[Figure panels: “P100”, “N100”, “N3”, “MFN”, “P300”]
Alternative Spatial Metrics
• Scalp “regions-of-interest” (ROI)
• Positive and negative “centroids” (topographic source & sink): CPOS, CNEG
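One way to compute the two centroids is as amplitude-weighted mean electrode locations, taken separately over positive and negative values. That definition is an assumption made for illustration, as are the function name and the array layout; electrode coordinates can be 2-D or 3-D.

```python
import numpy as np


def centroids(topography, positions):
    """Positive and negative centroids of a scalp topography.

    topography: array of shape (n_channels,) with one amplitude per electrode.
    positions:  array of shape (n_channels, n_dims) with electrode coordinates.
    Returns (c_pos, c_neg). A map with no positive (or no negative) values
    would cause a division by zero and is not handled here.
    """
    topography = np.asarray(topography, dtype=float)
    positions = np.asarray(positions, dtype=float)
    pos_w = np.clip(topography, 0.0, None)    # weights from positive amplitudes
    neg_w = np.clip(-topography, 0.0, None)   # weights from negative amplitudes
    c_pos = (pos_w[:, None] * positions).sum(axis=0) / pos_w.sum()
    c_neg = (neg_w[:, None] * positions).sum(axis=0) / neg_w.sum()
    return c_pos, c_neg
```

Unlike a fixed ROI, the centroid pair moves with the data, which is why it can refine an expert-specified spatial criterion.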
BOTTOM-UP
Statistical Measure Generation
• Temporal
– Peak latency
– Duration (cf. spectral measures)
• Spatial (topographic)
– Scalp regions-of-interest (ROI)
– Positive & negative centroids
• Functional (experimental)
– Concepts borrowed from BrainMap (Laird et al.) where possible
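Two of the measures listed above can be sketched as follows: duration and the mean intensity over an ROI. The duration definition used here (width of the contiguous interval around the peak where the absolute amplitude stays at or above half its maximum) is an assumption, as are the function names; the slide does not define either measure.

```python
import numpy as np


def duration_at_half_max(waveform, times_ms):
    """Width (ms) of the interval around the peak at or above half the peak amplitude."""
    amplitude = np.abs(np.asarray(waveform, dtype=float))
    peak = int(np.argmax(amplitude))
    above = amplitude >= amplitude[peak] / 2.0
    start = peak
    while start > 0 and above[start - 1]:
        start -= 1                      # extend left while still above half maximum
    end = peak
    while end < len(amplitude) - 1 and above[end + 1]:
        end += 1                        # extend right while still above half maximum
    return float(times_ms[end] - times_ms[start])


def in_mean_roi(topography, roi_channels):
    """Mean intensity over the electrodes belonging to one ROI."""
    return float(np.mean(np.asarray(topography, dtype=float)[roi_channels]))
```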
Automated ontology-based labeling of ERP data
Pattern Labels = Functional attributes + Temporal attributes + Spatial attributes
Concepts encoded in
NEMO_Data.owl
Robert M. Frank
NEMO Data Ontology:
Where ontology meets epistemology
Ontology for Biological Investigations (OBI) & Information Artifact Ontology (IAO)