Parkinson’s Disease Ontology

Outline
• Use Case:
  – Parkinson’s Disease
• Seed Ontology
• Design Issues
• Extending the seed ontology
• Next Steps

Use Case: Parkinson’s Disease
• Description of Parkinson’s Disease from different perspectives:
  – Systems Physiology View
  – Cellular and Molecular Biologist View
  – Clinical Researcher View
  – Clinical Guideline Formulator View
  – Clinical Decision Support Implementer View
  – Primary Care Clinician View
  – Neurologist View
• Identify information needs of the stakeholders identified above
• Available at:
  – http://esw.w3.org/topic/HCLS/ParkinsonUseCase
• Developed by:
  – Don Doherty
  – Ken Kawamato

Use Case: Systems Physiology View
What chemicals (neurotransmitters) are used by each circuit element (neuron) to communicate with the next element (neuron)? What responses do they elicit in the neurons?

Use Case: Cellular and Molecular Biologist View
What proteins are implicated in Parkinson's disease? How are protein expression patterns, protein processing, folding, regulation, transport, protein-protein interactions, protein degradation, etc. affected?

Use Case: Clinical Researcher View
Can a certain diagnostic test (e.g., a blood test for a biomarker or an imaging study) provide an approach to diagnosing Parkinson’s disease that is superior to or can complement existing diagnostic approaches?

Use Case: Clinical Guideline Formulator View
What have been the results of clinical trials that have evaluated the benefits and costs associated with diagnostic or therapeutic interventions for Parkinson’s disease?

Use Case: Clinical Decision Support Implementer View
Which clinical guideline(s) should be used as the basis for implementing the CDS functionality?

Use Case: Primary Care Clinician View
If a patient is not currently diagnosed with Parkinson’s disease, do the patient’s current symptoms indicate the need for a referral to a neurologist for further evaluation? If so, what are the referral criteria?

Use Case: Neurologist View
What is the differential diagnosis for this patient given his/her symptoms, signs, and diagnostic test results?

First Phase
• Focus on the Cellular and Molecular Biologist View
• Develop Parkinson’s Disease Ontology based on that View
• Refine it iteratively
• Augment it with other views later

Parkinson’s Disease Revisited
Studies identifying genes involved with Parkinson's disease are rapidly outpacing the cell biological studies which would reveal how these gene products are part of the disease process in Parkinson's disease. The alpha synuclein and Parkin genes are two examples. The discovery that genetic mutations in the alpha synuclein gene could cause Parkinson's disease in families has opened new avenues of research in the Parkinson's disease field. When it was also discovered that synuclein was a major component of Lewy bodies, the pathological hallmark of Parkinson's disease in the brain, it became clear that synuclein may be important in the pathogenesis of sporadic Parkinson's disease as well as rare cases of familial Parkinson's disease. More recently, further evidence for the intrinsic involvement of synuclein in Parkinson's disease pathogenesis was shown by the finding that the synuclein gene may be triplicated or duplicated in familial Parkinson's disease, suggesting that simple overexpression of the wild type protein is sufficient to cause disease. Since the discovery of synuclein, studies of genetic linkages, specific genes, and their associated coded proteins have been ongoing in the Parkinson's disease research field, transforming what had once been thought of as a purely environmental disease into one of the most complex multigenetic diseases of the brain.
Mutations in the Parkin gene cause early onset Parkinson's disease, and the parkin protein has been identified as an E3 ligase, suggesting a role for the proteasomal pathway of protein degradation in Parkinson's disease. DJ-1 and PINK-1 are proteins related to mitochondrial function in neurons, providing an interesting genetic parallel to mitochondrial toxin studies that suggest disruptions in cellular energetics and oxidative metabolism are primarily responsible for Parkinson's disease. Other genes, such as UCHL-1, tau, and the glucocerebrosidase gene, may be genetic risk factors, and their potential role in the sporadic Parkinson's disease population remains unknown. Mutations in LRRK2, which encodes a protein called dardarin, are the most recently discovered genetic cause of Parkinson's disease, and LRRK2 mutations are likely to be the largest cause of familial Parkinson's disease identified thus far. Dardarin is a large complex protein, which has a variety of structural moieties that could be participating in more than a dozen different cellular pathways in neurons. Because the cellular pathways that lead to Parkinson's disease are not fully understood, it is currently unknown how, or if, any of these pathways intersect in Parkinson's disease pathogenesis.

Step 1: Identify concepts and subsumption hierarchies

Step 2: Identify relationships

Step 3: Look at Information Queries
What cell signaling pathways are implicated in the pathogenesis of Parkinson’s disease? In which cells? What proteins are involved in which pathways?

Design Issues: Modeling
• Modeling as relationships vs classes
  – E.g., UCHL-1 transcribed_into Dardarin, vs
  – Define a class called Transcription as follows:
  – Transcription
    • has_gene: UCHL-1
    • has_protein: Dardarin
• Modeling a disease as a dynamic process as opposed to a static class

Design Issues: Instance vs SubClass
• A generic/specific relationship can be modeled using either instance-of or subclass-of, e.g.:
  – Parkinson’s Disease subclass-of Disease vs Parkinson’s Disease instance-of Disease
  – UCHL-1 subclass-of Gene vs UCHL-1 instance-of Gene
  – Synuclein subclass-of Protein vs Synuclein instance-of Protein
• What is the performance impact of these relationships?
  – Instance-of involves ABox reasoning
  – Subclass-of involves TBox reasoning
  – Is one more scalable than the other?
• What is the impact on expressivity?
  – Can “more” knowledge be represented using one over the other?

Design Issue: Granularity
• At what level of specificity should relationships be represented in the ontology?
  – AllelicVariant causes Disease, vs
  – LRRK2Variant causes Parkinson’s Disease
• At what level of genericity should relationships be represented in the ontology?
  – LewyBody hallmark_of Parkinson’s Disease, vs
  – AnatomicalEntity hallmark_of Disease

Design Issue: Uncertainty
• “The discovery that genetic mutations in the alpha synuclein gene could cause Parkinson's disease in families”
• The OWL/RDF metamodels do not directly support expressing this kind of uncertainty (“could cause”).
• What could be ways of expressing it?
  – Using reification in RDF?
  – Introducing new relationships in OWL?
• What impact would this have on:
  – Data Integration?
  – Reasoning?

Design Issue: Domain/Range Polymorphism
• What are the semantics of multiple domains and ranges?
  – Property: associated_with
  – domain: Pathway
  – domain: Protein
  – range: Cell
  – range: Biomarker
• Are RDF/OWL semantics good enough for us?
• Do we need to remodel relationships to avoid this?
• Different types of polymorphic relationships:
  – Sub-type polymorphism
  – Ad-hoc polymorphism

Design Issue: Default Values
• How do we handle default values of OWL properties?
• Example:
  – Default function of the proteasomal pathway is protein degradation
• What is the impact of default values on biomedical data integration? Reasoning?

Design Issue: Ontology Inclusion
• Cross-linking to other ontologies such as GO, NeuroNames, etc.
• If we “link” to a class or property in another ontology:
  – Should we include associated sub classes?
  – Should we include associated properties?
  – Should we include associated axioms?
• What if this leads to inconsistencies?
  – Cycles
  – Contradictions
• How does this impact data integration or reasoning?
  – Can we get by with “shallow” inclusion?

Ontology Modularization
• Mutually disjoint trees with “cross cutting” properties, axioms, etc.
• Proposed by Alan Rector
• Example: Different hierarchies/lattices for
  – Studies (e.g., publication in Pubmed)
  – Biomedical knowledge referenced in those studies (e.g., association between a gene and a disease)

Design Issue: Higher Order Relationships
• Example
  – Association between a Gene and a Disease mentioned in a study

Creation of Best Practices
• Design issues have been the subject of investigation in the Knowledge Engineering and Medical Informatics communities
• Different approaches to resolve these issues will be appropriate in the context of different use cases.
• Goal:
  – Propose various alternatives in the context of use cases proposed in HCLSIG

Extending the Seed Ontology
• Identify concepts and properties to include from:
  – Gene Ontology
  – NeuroNames
• Decide the “level of inclusion”

Extending the Seed Ontology
• Look at statements from research articles to extend the ontology
• Example:
  – Aggresomes formed by alpha-synuclein and synphilin-1 are cytoprotective.
  – Create a new property called formedBy
    • domain(formedBy) = Aggresome
    • range(formedBy) = Protein
  – subClassOf(
      intersectionOf(Aggresome,
        Restriction(formedBy, someValuesFrom(intersectionOf(alpha-synuclein, synphilin-1)))),
      Restriction(function, hasValue(cytoprotective)))

Next Steps
• Apply this ontology to demonstrate Parkinson’s Disease Use Case
• Focus of the BIONT – BIORDF Collaborative F2F
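
Illustration for the Domain/Range Polymorphism design issue: under standard RDFS semantics, multiple domain (or range) declarations on one property are read conjunctively, so every subject of associated_with is inferred to belong to all declared domain classes at once. The sketch below is a minimal pure-Python imitation of the RDFS entailment rules rdfs2 (domain) and rdfs3 (range), not a real reasoner. The instance names pathway_1 and cell_1 and the helper infer_types are hypothetical and appear nowhere in the slides.

```python
# Minimal imitation of RDFS domain/range inference over plain
# (subject, predicate, object) triples. This is a sketch, not a reasoner.
# The instances "pathway_1" and "cell_1" are hypothetical.

RDF_TYPE = "rdf:type"

# Declarations from the slide: two domains and two ranges on one property.
DOMAINS = {"associated_with": {"Pathway", "Protein"}}
RANGES = {"associated_with": {"Cell", "Biomarker"}}


def infer_types(triples):
    """Apply rules rdfs2 (domain) and rdfs3 (range) once to each triple."""
    inferred = set()
    for subject, predicate, obj in triples:
        # rdfs2: the subject gets every declared domain class.
        for cls in DOMAINS.get(predicate, ()):
            inferred.add((subject, RDF_TYPE, cls))
        # rdfs3: the object gets every declared range class.
        for cls in RANGES.get(predicate, ()):
            inferred.add((obj, RDF_TYPE, cls))
    return inferred


facts = [("pathway_1", "associated_with", "cell_1")]
types = infer_types(facts)
for triple in sorted(types):
    print(triple)
```

A single assertion therefore yields four type triples: pathway_1 is inferred to be both a Pathway and a Protein, and cell_1 both a Cell and a Biomarker. If the intended reading is "either a Pathway or a Protein", one possible remodeling is a single domain that is an OWL unionOf the two classes, or separate sub-properties per domain/range pair.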