http://nemo.nic.uoregon.edu
NEMO Year 1: From Theory to Application — Ontology-based analysis of ERP data
August 20, 2009

Overview: Agenda
• ICBO highlights (5 mins)
• Logistics (5 mins)
• ERP pattern analysis methods (20 mins)
• ERP measure generation (10 mins)
• Linking measures to ontology (10 mins)
• Data annotation (deep, ontology-based) (10 mins)
Action items highlighted in lime green!

First International Conference on Biomedical Ontologies (ICBO'09)
http://precedings.nature.com/collections/icbo-2009
• High-level issues and "best practices" for ontology development
• Tools that may be of use for NEMO
• Potential collaborations
• Practical questions/issues to resolve

NEMO "to do" items
• Identify a "point person" at each site who will be responsible for contributing feedback on the NEMO wiki and ontologies, and for uploading data and testing the MATLAB-based tools for data markup
  – Please provide the name and contact info for this person in an email
• Bookmark the NEMO website and explore the links under "Collaboration" (more to come next time on how, specifically, you can contribute)

ERP Pattern Analysis
• An embarrassment of riches
  – A wealth of data
  – A plethora of methods
• A lack of integration
  – How to compare patterns across studies and labs?
  – How to do valid meta-analyses in ERP research?
• A need for robust pattern classification
  – Bottom-up (data-driven) methods
  – Top-down (science-driven) methods

[Figure: knowledge hierarchy]
• Knowledge: semantically structured (taxonomy, CMap, ontology, …); ontologies provide high-level, explicit representation of domain knowledge (theoretical integration)
• Information: syntactically structured (tables, XML, RDF, …)
• Data: minimally structured or unstructured; ontologies support principled mark-up of data, including ERP patterns (practical integration)

NEMO principles that inform our pattern analysis strategies
• Current challenges (motivations)
  – Tracking what we know: ontologies
  – Integrating knowledge to achieve a high-level understanding of brain–function mappings: meta-analyses
• Important considerations (desiderata)
  – Stay true to the data: bottom-up (data-driven) methods
  – Achieve high-level understanding: top-down (hypothesis-driven) methods

Top-down vs. Bottom-up
          Top-Down                        Bottom-Up
  PROS    Familiar                        Formalized
          Science-driven (integrative)    Data-driven (robust)
  CONS    Informal                        Unfamiliar
          Paradigm-affirming?             Study-specific results?
Combining Top-Down & Bottom-Up

Traditional approach to bio-ontology development (TOP-DOWN)
• Encode knowledge of concepts (classes, relations, and axioms that involve classes and relations) in a formal ontology (e.g., OWL/RDF)
• NEMO OWL ontologies are being developed and version-tracked on SourceForge (the main topic of our last meeting)

NEMO top-down approach (TOP-DOWN)
• NEMO's emphasis on pattern rules/descriptions is a way to enforce rigorous definitions of the complex concepts (patterns or "components") that are central to ERP research

Superposition of ERP Patterns

What do we know about ERP patterns?
Observed pattern = "P100" iff
• Event type is visual stimulus (FUNCTION) AND
• Peak latency is between 70 and 160 ms (TIME) AND
• Scalp region of interest (ROI) is occipital (SPACE) AND
• Polarity over the ROI is positive (> 0)

Why does it matter? Robust pattern rules are a good foundation for:
• Development of ERP ontologies
• Labeling of ERP data based on pattern rules
• Cross-experiment, cross-lab meta-analyses

Two classes of methods for ERP pattern analysis (BOTTOM-UP)
• Pattern decomposition (focus today; already implemented and almost ready for YOU to test)
  – Temporal factor analysis (tPCA, tICA)
  – Spatial factor analysis (sPCA, sICA)
• Windowing/segmentation
  – Microstate analysis (use global field "maps"; compute "global field dissimilarity" between adjacent maps to determine where there are significant shifts in topography)

Decomposition approach
• PCA, ICA, dipole modeling, etc.: multiple methods for principled separation of patterns using a factor-analytic approach
[Figure: example ERP waveform with patterns P100 (100 ms), N100 (170 ms), fP2 (200 ms), P1r/N3 (280 ms), P1r/MFN (400 ms), P300 (600 ms)]

Windowing/segmentation approach
Advantages over factor-analytic/decomposition methods:
• Familiarity: closer to what most ERP researchers do (manually)
• Fewer (or at least different!) concerns regarding misallocation of variance
• Robustness to latency differences across subjects and conditions
(Michel et al., 2004; Koenig, 1995; Lehmann & Skrandies, 1985)

What we've done (to date…)
• Implemented sPCA, tPCA, sICA, and microstate analysis
• Tested and evaluated sPCA, tPCA, and sICA (following Dien, Khoe, & Mangun, 2008) using simulated ERP data
• Explored two different approaches to pattern classification and labeling (the step AFTER decomposition)

1. Data preprocessing
  1. filter and segment data
  2. detect and reject artifacts
  3. interpolate bad channels
  4. average across trials within subjects
  5. manual detection of bad channels
  6. interpolate bad channels
  7. re-reference montage (PARE)
  8. baseline-correct (200 ms)

2. Component analysis: our current practice (NOT set in stone!); a sketch follows below
  – Step 1: Apply an eigenvalue decomposition method (e.g., tPCA)
  – Step 2: Rotate ALL latent factors (unrestricted PCA)
  – Step 3: Retain a fairly large number of factors based on the log of the scree
  – Step 4: Let ontology-based labeling (next slide) help determine which factors to keep and analyze!

3. Component labeling (a rule-as-predicate sketch also follows below)
NEXT MAJOR CHALLENGE: how to tune pattern rules (particularly the TI-max begin and end points) to fit each individual dataset. Data mining on results from different component analyses? (Note: mining of tPCA data won't help to refine temporal criteria.)
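To make the component-analysis recipe above concrete, here is a minimal Python/NumPy sketch of temporal PCA with varimax rotation. It assumes the data have already been reshaped into an observations-by-timepoints matrix (observations = subjects x conditions x channels); the fixed n_factors argument stands in for the "log of the scree" criterion, and for brevity only the retained factors are rotated (the slides rotate all of them). This is an illustration, not the consortium's MATLAB implementation.

    import numpy as np

    def varimax(loadings, gamma=1.0, max_iter=100, tol=1e-6):
        # Kaiser varimax rotation of a (timepoints x factors) loading matrix.
        p, k = loadings.shape
        rotation = np.eye(k)
        criterion = 0.0
        for _ in range(max_iter):
            rotated = loadings @ rotation
            u, s, vt = np.linalg.svd(
                loadings.T @ (rotated ** 3 - (gamma / p) * rotated
                              @ np.diag(np.diag(rotated.T @ rotated))))
            rotation = u @ vt
            new_criterion = s.sum()
            if new_criterion < criterion * (1 + tol):
                break
            criterion = new_criterion
        return loadings @ rotation

    def temporal_pca(X, n_factors=15):
        # X: (observations x timepoints); observations = subjects x conditions x channels.
        Xc = X - X.mean(axis=0)                      # centre each timepoint
        cov = np.cov(Xc, rowvar=False)               # timepoint-by-timepoint covariance
        evals, evecs = np.linalg.eigh(cov)
        order = np.argsort(evals)[::-1]              # largest eigenvalues first
        evals = np.clip(evals[order], 0.0, None)
        evecs = evecs[:, order]
        loadings = evecs[:, :n_factors] * np.sqrt(evals[:n_factors])
        loadings = varimax(loadings)                 # rotate the retained factors
        scores = Xc @ np.linalg.pinv(loadings.T)     # per-observation factor scores
        return loadings, scores, evals               # inspect evals for the scree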
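The labeling step then amounts to evaluating pattern rules such as the "P100" rule above against each factor's summary measures. A minimal sketch, assuming hypothetical key names (event_type, peak_latency_ms, roi, roi_mean_amplitude_uv) rather than the actual NEMO measure names:

    def is_p100(pattern):
        # "P100" rule: visual stimulus, peak latency 70-160 ms,
        # occipital ROI, positive polarity over the ROI.
        return (pattern["event_type"] == "visual stimulus"
                and 70 <= pattern["peak_latency_ms"] <= 160
                and pattern["roi"] == "occipital"
                and pattern["roi_mean_amplitude_uv"] > 0)

    # Example: label a list of candidate factors
    candidates = [{"event_type": "visual stimulus", "peak_latency_ms": 112,
                   "roi": "occipital", "roi_mean_amplitude_uv": 2.4}]
    labels = ["P100" if is_p100(c) else "unlabeled" for c in candidates]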
4. Meta-analysis (next milestone!!)
• Apply pattern decomposition and labeling to NEMO consortium datasets
• Identify one experimental contrast for each analysis
• Compute effect size (ES) estimates for each study
• Run a mixed-effects analysis (a minimal sketch appears at the end of this section):
  – test homogeneity of variance across studies
  – if rejected, then test effects of variables that differ across studies and laboratories (e.g., nature of stimuli, task, subjects)

ERP meta-analysis goals
1. Demonstrate a working NEMO consortium
2. Demonstrate application of a BrainMap-like taxonomy for classification of functional (experimental) contrasts
3. Show that the ERP component analysis, measure generation, and component labeling tools can be used on a large scale
4. ** Show that a combination of bottom-up and top-down methods for refining pattern rules can be used to tune rules for detecting target ERP patterns across different datasets
5. ** Show that we can (semi-)automatically identify analogous patterns across datasets (follows from 4), enabling us to carry out statistical meta-analyses
** harder problems to discuss…

A case study with real data (CIN'07 paper)
1. Real 128-channel ERP data
2. Temporal PCA used for pattern analysis
3. Spatial and temporal metrics for labeling of discrete patterns
4. Revision of pattern rules based on mining of labeled data

Example: rule for "P100"
For any n, FAn = PT1 iff
  – temporal criterion #1: 70 ms < TI-max(FAn) < 170 ms AND
  – spatial criterion #1: SP-r(FAn, SP(PT1)) > .7 AND
  – functional criterion #1: EVENT(FAn) = stimon AND
  – functional criterion #2: MODAL(EV) = visual

[Example of output 1: values for the summary measures, for one subject and one of six experimental conditions]
[Example of output 2: matches to the spatial, temporal, and functional criteria for one subject and one of six experimental conditions]
[Summary results for Rule #1]

A case study with simulated ERPs (HBM'08 talk)
1. Simulated ERP datasets
2. PCA and ICA methods for spatial and temporal pattern analysis
3. Spatial and temporal metrics for labeling of discrete patterns
4. Revision of pattern rules based on mining of labeled data

Simulated ERPs (n = 80): P100 + N100 + N3 + MFN + P300 + noise

Simulated ERP datasets 1–5, built in DipSim with Patrick Berg's Dipole Simulator
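Returning to the mixed-effects step of the meta-analysis plan above: a minimal sketch of effect-size pooling and the homogeneity test, assuming one effect size and sampling variance per consortium dataset. The moderator tests that would follow a rejected homogeneity test are not shown, and the function name and example values are illustrative.

    import numpy as np
    from scipy import stats

    def pooled_effect_and_homogeneity(es, var):
        # es: per-study effect sizes; var: their sampling variances.
        es, var = np.asarray(es, float), np.asarray(var, float)
        w = 1.0 / var                           # inverse-variance weights
        pooled = np.sum(w * es) / np.sum(w)     # fixed-effect pooled estimate
        q = np.sum(w * (es - pooled) ** 2)      # Cochran's Q statistic
        df = len(es) - 1
        p_homogeneity = stats.chi2.sf(q, df)    # small p => studies are heterogeneous
        # DerSimonian-Laird between-study variance, for a random/mixed-effects follow-up
        tau2 = max(0.0, (q - df) / (w.sum() - (w ** 2).sum() / w.sum()))
        return pooled, q, p_homogeneity, tau2

    # Example: effect sizes from four hypothetical consortium studies
    print(pooled_effect_and_homogeneity([0.42, 0.55, 0.31, 0.64],
                                        [0.02, 0.03, 0.02, 0.04]))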
Simulated ERP data: creating individual ERPs

  Source (pattern)   ROI              Intensity (µV / mA)   Latency (ms)   Location (θ, φ)      Orientation (θ, φ)    Eccentricity
  1 (P1)             L-Occipital      3.5 | 45              50 : 150       -90.00°, 68.20°      -90.00°, 90.00°       0.81
  2 (P1)             R-Occipital      4.0 | 45              55 : 155       90.00°, -68.20°      -90.00°, 90.00°       0.81
  3 (N1)             L-Parietal       -5.0 | -70            120 : 240      -100.02°, 45.00°     -129.57°, 125.30°     0.57
  4 (N1)             R-Parietal       -4.0 | -70            130 : 250      100.02°, -45.00°     -122.09°, 53.62°      0.57
  5 (N1N2)           L-Temporal       -4.0 | -60            160 : 300      -110.59°, 35.72°     -60.39°, 36.44°       0.42
  6 (N1N2)           R-Temporal       -2.0 | -60            170 : 310      114.00°, -33.23°     -36.44°, 19.93°       0.40
  7 (P2)             Medial-Frontal   2.5 | -30             210 : 390      56.59°, 87.82°       -26.57°, 83.111°      0.63

• Random jitter in intensity
• NO temporal jitter
• NO spatial jitter
(Sources specified in Patrick Berg's Dipole Simulator)

Pattern analysis with PCA & ICA (decomposition approach; BOTTOM-UP)

ERP pattern analysis
• Temporal PCA (tPCA) ✔
  – Gives invariant temporal patterns (new bases)
  – Spatial variability as input to data mining
• Spatial ICA (sICA) ✔
  – Gives invariant spatial patterns (new bases)
  – Temporal variability as input to data mining
• Spatial PCA (sPCA) ✗
Multiple measures used for evaluation (correlation plus L1/L2 norms)

[Figure: new inputs to NEMO; spatial and temporal measures (TI-max, ROI, IN-mean(ROI)) for each factor]

What we've learned (so far…)
• Bottom-up methods result in validation and refinement of top-down pattern rules
  – Validation of expert selection of temporal concepts (peak latency)
  – Refinement of expert specification of spatial concepts (± centroids)
• Alternative pattern analysis methods (e.g., tPCA and sICA) provide complementary input to bottom-up (data mining) procedures

Measure generation (BOTTOM-UP)
• Vector attributes = input to data mining (clustering and classification)
• Input to data mining: 32 attribute vectors, defined over 80 "individual" ERPs (observations)
[Figure: attribute vectors built from temporal measures (T1, T2), spatial measures (S1, S2), ROI, and the negative/positive centroids (CoN, CoP)]

Data mining (BOTTOM-UP); see the clustering sketch below
• Vectors of spatial and temporal attributes as input
• Clustering of observations into patterns (E-M accuracy > 97%)
• Attribute selection ("information gain"): ± centroids (CoN, CoP) and peak latency selected ✔

Revised rule for the "P100"
Pattern = P100v iff
• Event type is visual stimulus (FUNCTION) AND
• Peak latency is between 76 and 155 ms (TIME) AND
• Positive centroid is right occipital (SPACE) AND
• Negative centroid is left frontal (SPACE)

[Figure: simulated ERP patterns "P100", "N100", "N3", "MFN", "P300"]

Alternative spatial metrics
• Scalp "regions of interest" (ROI)
• Positive and negative "centroids" (CPOS, CNEG): topographic source and sink

Statistical measure generation (BOTTOM-UP)
• Temporal
  – Peak latency
  – Duration (cf. spectral measures)
• Spatial (topographic)
  – Scalp regions of interest (ROI)
  – Positive and negative centroids (a centroid sketch follows below)
• Functional (experimental)
  – Concepts borrowed from BrainMap (Laird et al.) where possible
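As an illustration of the centroid measures listed above: a minimal sketch that computes the positive and negative "centroids" of a pattern's topography as amplitude-weighted mean sensor positions. The amplitude-weighted definition and the argument names are assumptions; the slides only name the measures.

    import numpy as np

    def signed_centroids(topography, sensor_xyz):
        # topography: (n_channels,) mean amplitude over the pattern's time window.
        # sensor_xyz: (n_channels, 3) electrode coordinates.
        pos = topography > 0
        neg = topography < 0
        c_pos = np.average(sensor_xyz[pos], axis=0, weights=topography[pos])
        c_neg = np.average(sensor_xyz[neg], axis=0, weights=-topography[neg])
        return c_pos, c_neg   # CoP and CoN as 3-D coordinates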
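And for the data-mining step described above (E-M clustering of the attribute vectors plus information-gain attribute selection), a sketch using scikit-learn: GaussianMixture stands in for Weka-style E-M clustering, and mutual information is used as an information-gain analogue. The array shape follows the slides (80 observations by 32 attributes); the function and variable names are illustrative.

    import numpy as np
    from sklearn.mixture import GaussianMixture
    from sklearn.feature_selection import mutual_info_classif

    def cluster_and_rank(X, attribute_names, n_patterns=5):
        # X: (80 simulated ERP observations x 32 spatial/temporal attributes).
        gmm = GaussianMixture(n_components=n_patterns, random_state=0).fit(X)
        cluster_labels = gmm.predict(X)                      # E-M clustering
        # Rank attributes by mutual information with the cluster assignments
        # (an information-gain-style criterion).
        mi = mutual_info_classif(X, cluster_labels, random_state=0)
        ranking = sorted(zip(attribute_names, mi), key=lambda pair: -pair[1])
        return cluster_labels, ranking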
Automated ontology-based labeling of ERP data (Robert M. Frank)
• Pattern labels = functional attributes + temporal attributes + spatial attributes
• Concepts encoded in NEMO_Data.owl

NEMO Data Ontology: where ontology meets epistemology
• Ontology for Biomedical Investigations (OBI) and Information Artifact Ontology (IAO)
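To make the "pattern labels = functional + temporal + spatial attributes" composition concrete, a small sketch of assembling one annotation record for a labeled factor. The keys and values are illustrative only and do not reproduce NEMO_Data.owl class names.

    def pattern_annotation(label, functional, temporal, spatial):
        # Flatten the three attribute groups into one annotation record that
        # can later be mapped onto ontology classes.
        record = {"pattern_label": label}
        for group, attrs in (("functional", functional),
                             ("temporal", temporal),
                             ("spatial", spatial)):
            for key, value in attrs.items():
                record[group + "." + key] = value
        return record

    # Example annotation for one factor labeled "P100"
    print(pattern_annotation("P100",
                             {"event": "stimon", "modality": "visual"},
                             {"ti_max_ms": 112},
                             {"roi": "occipital", "in_mean_roi_uv": 2.4}))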