Primary Immunodeficiency Disease (PID) PhenomeR
(An integrated web-based ontology resource towards establishment of PID E-clinical decision support system)
Subazini Thankaswamy Kosalai and Sujatha Mohan (1)
(1) Research Unit for Immunoinformatics, RIKEN Research Center for Allergy and Immunology, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan.
ABSTRACT
The main challenge in in silico genotype-phenotype correlation for any genetic disease is standardizing the phenotype
ontology terms and the genotype data. We previously developed and established a molecular disease database named RAPID,
the Resource of Asian Primary Immunodeficiency Diseases (PID) (http://rapid.rcai.riken.jp), a web-based informatics platform
that enables PID experts to easily mine collected genomic, transcriptomic, and proteomic data on PID-causing genes. As of
February 2013, RAPID comprises 265 PIDs and 243 genes, of which 233 genes are reported with over 5000 unique
disease-causing mutations annotated from about 1800 PubMed citations. Here we introduce a newly
developed PID ontology browser, “PhenomeR” (http://rapid.rcai.riken.jp/ontology/v1.0/phenomer.php), for systematic
integration and analysis of PID phenotypes with the genotype data taken from RAPID. It currently holds 1438 PID-phenotype
terms that are mapped and standardized using a logic-based assessment approach and represented in Web
Ontology Language (OWL) and Resource Description Framework (RDF) formats using semantic web technology, for easy
data exchange and validation and for interpretation of PID phenotype-genotype correlation using various computational approaches.
PhenomeR was developed mainly to help researchers and clinicians identify reported and novel
PID-causing genes, and to determine the genes involved in PID through reported disease-causing mutations
and their observed symptoms. In essence, PID PhenomeR serves as an active integrated platform for PID phenotype
data, in which the generated semantic framework is implemented in an integrated knowledge-base query interface, i.e. a SPARQL
Protocol and RDF Query Language (SPARQL) endpoint, for establishing a well-informed PID e-clinical decision support system.
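The idea of moving from observed symptoms to candidate genes can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the gene names, the phenotype terms, and the simple match-count ranking are placeholders, not RAPID data or the scoring used by PhenomeR.

```python
from collections import Counter

# Hypothetical phenotype-to-gene annotations (placeholders, not RAPID data).
PHENOTYPE_TO_GENES = {
    "phenotype term A": {"GENE1", "GENE2"},
    "phenotype term B": {"GENE1"},
    "phenotype term C": {"GENE3"},
}


def rank_candidate_genes(observed_terms):
    """Count, per gene, how many observed phenotype terms are annotated to it."""
    counts = Counter()
    for term in observed_terms:
        for gene in PHENOTYPE_TO_GENES.get(term, ()):
            counts[gene] += 1
    # Sort by descending match count, then alphabetically for a stable order.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


ranking = rank_candidate_genes(["phenotype term A", "phenotype term B"])
print(ranking)
```

A real decision support system would need weighting by term specificity and mutation evidence; the plain count here only shows the shape of the lookup.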
Overview of PID-PhenomeR
(A) DATA COLLECTION
[Flowchart: PID phenotype terms are collected from RAPID, IDR and the literature using a phenotype annotation tool, then mapped to terms from standard sources; an "Is mapped?" decision follows, with Yes and No branches.]
Standard sources used for mapping:
Human Disease Ontology (DOID)
Human Phenotype Ontology (HPO)
Online Mendelian Inheritance in Man Metathesaurus source processing (OMIM-MTHU)
Symptom Ontology (SYMP)
Systematized Nomenclature of Medicine Clinical Terms (SNOMEDCT)
The Unified Medical Language System - Concept Unique Identifiers (UMLS_CUI)
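The mapping decision in the data collection step can be sketched as a lookup of each collected term against the standard sources. This is only an assumed, simplified version: the lookup tables, the identifiers, and the normalization rule are hypothetical, and the actual annotation tool is not described in the poster.

```python
# Hypothetical lookup tables keyed by normalized label; the identifiers are
# placeholders, not real DOID/HPO/SNOMEDCT codes.
STANDARD_SOURCES = {
    "HPO": {"example phenotype one": "HPO:PLACEHOLDER_1"},
    "DOID": {"example disease two": "DOID:PLACEHOLDER_2"},
}


def normalize(term):
    """Lowercase and collapse whitespace so trivially different spellings match."""
    return " ".join(term.lower().split())


def map_term(term):
    """Return (source, identifier) pairs for every source containing the term."""
    key = normalize(term)
    return [
        (source, table[key])
        for source, table in STANDARD_SOURCES.items()
        if key in table
    ]


def split_mapped(terms):
    """Partition terms into mapped (with their hits) and unmapped."""
    mapped, unmapped = {}, []
    for term in terms:
        hits = map_term(term)
        if hits:
            mapped[term] = hits
        else:
            unmapped.append(term)
    return mapped, unmapped


mapped, unmapped = split_mapped(["Example  Phenotype One", "a novel term"])
print(mapped)
print(unmapped)
```

Terms that end up in the unmapped list correspond to the "No" branch of the decision and would need manual curation.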
PID-PhenomeR features

Presents a user-friendly web-based interface for accessing, querying, browsing and analyzing PID phenotype terms

Integrates semantically standardized phenotype vocabularies from RAPID, along with PIDs, genes and disease-causing mutations, into a relational ontology for inference of genotype-phenotype correlation

Provides PID-phenotype data in various standardized downloadable formats (OWL, RDF and Excel) for easy sharing and data exchange with other interested research groups

Displays the phenotype terms in a tree structure using the NCBO widget

Facilitates an integrated knowledge-base query interface: SPARQL Protocol and RDF Query Language (SPARQL)

Promotes a network of active, open, community-driven semantic web technology
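To make the OWL/RDF download format concrete, the following sketch writes two OWL classes as RDF/XML with Python's standard library. The base namespace, class names and labels are hypothetical; only the rdf, rdfs and owl namespace URIs are the standard W3C ones. The actual PhenomeR files may be structured differently.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"
BASE = "http://example.org/pid-phenotype#"  # hypothetical namespace

# Register readable prefixes so the output uses rdf:, rdfs: and owl:.
for prefix, uri in (("rdf", RDF), ("rdfs", RDFS), ("owl", OWL)):
    ET.register_namespace(prefix, uri)


def owl_class(root, local_name, label, parent=None):
    """Append an owl:Class with an rdfs:label and an optional rdfs:subClassOf."""
    cls = ET.SubElement(
        root, f"{{{OWL}}}Class", {f"{{{RDF}}}about": BASE + local_name}
    )
    ET.SubElement(cls, f"{{{RDFS}}}label").text = label
    if parent is not None:
        ET.SubElement(
            cls, f"{{{RDFS}}}subClassOf", {f"{{{RDF}}}resource": BASE + parent}
        )
    return cls


root = ET.Element(f"{{{RDF}}}RDF")
owl_class(root, "ExampleCategory", "Example category")
owl_class(root, "ExampleTerm", "Example phenotype term", parent="ExampleCategory")
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

A file of this shape can be opened in an ontology editor such as Protégé, which is how the poster's screenshots show the data being viewed.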
(B) DATA STANDARDIZATION
[Flowchart: PID quality check by a logic-based assessment method with three checks (conservativity principle, consistency principle, locality principle), each with Yes and No branches and a RESPONSE path; PID quality check by a semi-automated method.]
[Figure panels: RAPID home page; PID PhenomeR database schema; RDF and OWL formats viewed in Link Data and Protégé; home page; term "C3 deficiency" viewed using Protégé 4.1 OntoGraf.]
Masuya, H., Y. Makita, et al. (2011). "The RIKEN integrated database of mammals." Nucleic Acids Res. 39:D861-70.
(C) DATA STORAGE & RETRIEVAL
[Flowchart labels: OWL and RDF file generation; phenotype ontology database; PID Phenotype KnowledgeBase; search and query interface "PhenomeR"; QUERY.]
[Figure panels: PID PhenomeR download option; RDF file generated using OWL Syntax Converter; search result of phenotype term; primary information page of STK4 gene in RAPID.]
Statistics
Database Statistics
Phenotype terms
OWL Statistics
1466
Classes
Semantic types
24
Individuals
1549
-
Category
29
Classes with single subclass
144
Subcategory
45
Classes with more than 25 subclasses
1346
Terms in multiple categories
17
Average number of Siblings
276
Terms in multiple subcategories
10
Object Property
161
Newly mapped terms
51
Data Property
9
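Structural metrics of the kind listed in the OWL statistics can be computed from a parent-to-subclass mapping. The sketch below is an assumption-laden illustration: the hierarchy is invented, and "average number of siblings" is interpreted here as the mean number of direct subclasses per class that has any, which may not match the definition behind the reported figures.

```python
# Hypothetical class hierarchy: parent class -> list of direct subclasses.
HIERARCHY = {
    "Root": ["CategoryA", "CategoryB"],
    "CategoryA": ["Term1"],
    "CategoryB": ["Term2", "Term3", "Term4"],
}


def hierarchy_metrics(children_of, many=25):
    """Compute simple structural metrics over a parent -> children mapping."""
    classes = set(children_of)
    for children in children_of.values():
        classes.update(children)
    # Sizes of each non-empty group of direct subclasses.
    sizes = [len(children) for children in children_of.values() if children]
    return {
        "classes": len(classes),
        "single_subclass": sum(1 for n in sizes if n == 1),
        "more_than_many": sum(1 for n in sizes if n > many),
        # Assumed definition: mean number of direct subclasses per parent.
        "average_children": sum(sizes) / len(sizes) if sizes else 0.0,
    }


# The threshold is lowered from 25 to 2 so the toy hierarchy triggers it.
metrics = hierarchy_metrics(HIERARCHY, many=2)
print(metrics)
```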
[Figure panel: PID PhenomeR advanced search options]
Successful outcome and challenges
PhenomeR aims to build hierarchical ontology class structures and entities
for all observed PID phenotype terms. These can then be queried through an
integrated knowledge-base interface, a SPARQL Protocol and RDF Query
Language (SPARQL) endpoint, to screen terms and to implement algorithms that
compile data from multiple sources and identify statistically significant
datasets with greater sensitivity, specificity and degree of confidence,
in support of a well-informed clinical decision support system.
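As a sketch of how a client might address such a SPARQL endpoint, the snippet below encodes a query as a SPARQL Protocol GET request, which carries the query text in a `query` parameter. The endpoint URL is a hypothetical placeholder, since the poster does not give the real one, and the request is only built, not sent. The query lists distinct subjects, the same kind of query one of the poster's screenshots describes.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical endpoint URL; the poster does not state the real one.
ENDPOINT = "http://example.org/sparql"

# Lists distinct subjects in the store, capped to keep the result small.
QUERY = "SELECT DISTINCT ?s WHERE { ?s ?p ?o } LIMIT 100"


def build_request_url(endpoint, query):
    """Encode a query as a SPARQL Protocol GET request URL."""
    return endpoint + "?" + urlencode({"query": query})


url = build_request_url(ENDPOINT, QUERY)
print(url)

# Decoding the URL again recovers the original query text.
print(parse_qs(urlsplit(url).query)["query"][0])
```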
Mapping the unmapped terms in PhenomeR is a challenging task, since some
of them are not available in any of the source databases. This ongoing
effort will soon implement a systematic, integrated approach for mapping
all of these new terms, towards an open, community-driven semantic web (SW) technology.
PhenomeR makes it easy to access, search, query and analyze PID
phenotype terms associated with genes, diseases and mutations
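Searches of the kind the interface offers (by term prefix, by category, or by semantic type) amount to filtering term records on optional criteria. The sketch below assumes a flat record with three fields; the records themselves are invented for illustration and are not taken from the PhenomeR database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhenotypeTerm:
    name: str
    category: str
    semantic_type: str


# Hypothetical records for illustration only.
TERMS = [
    PhenotypeTerm("Recurrent example finding", "Cardiovascular", "Finding"),
    PhenotypeTerm("Example abnormality", "Cardiovascular", "Acquired Abnormality"),
    PhenotypeTerm("Recurrent other finding", "Respiratory", "Finding"),
]


def search(terms, prefix=None, category=None, semantic_type=None):
    """Return names of terms matching every filter that was supplied."""
    results = []
    for term in terms:
        # Prefix matching is case-insensitive; the other filters are exact.
        if prefix is not None and not term.name.lower().startswith(prefix.lower()):
            continue
        if category is not None and term.category != category:
            continue
        if semantic_type is not None and term.semantic_type != semantic_type:
            continue
        results.append(term.name)
    return results


by_prefix = search(TERMS, prefix="recurrent")
by_type = search(TERMS, category="Cardiovascular", semantic_type="Acquired Abnormality")
print(by_prefix)
print(by_type)
```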
[Figure panels: reported list of genes; reported list of mutation data; mutation analysis of STK4 gene; search result of phenotype term beginning with 'Recurrent'; multiple terms search output; hyperlinked PubMed reference citation; search result of PID phenotype term with category 'Cardiovascular'; search result of PID phenotype term with semantic type 'Acquired Abnormality'; master list of PID phenotype terms, associated features and relationships in Excel format; PID PhenomeR download option, OWL format.]
CONCLUSION
Overall, this kind of analysis should bridge the gap between genotype and phenotype,
thereby improving phenotype-based genetic analysis of PID genes.
Moreover, it should help clinicians confirm PID diagnoses early and
implement appropriate therapeutic interventions.
We sincerely believe that the structured data format presented in RPO should help
biomedical researchers carry out further computational analysis and
assist clinicians in identifying diagnosed PID.
Publications – PID project
Subazini Thankaswamy Kosalai and Sujatha Mohan. PID PhenomeR - An integrated platform for developing phenotype ontology structures for primary immunodeficiency diseases (Database, Oxford University Press; in communication)
PID PhenomeR project in NCBO BioPortal: http://bioportal.bioontology.org/projects/171
RPO summary page in NCBO BioPortal: http://bioportal.bioontology.org/ontologies/3114
[Figure panels: term hierarchy visualization using NCBO widget from NCI thesaurus; all distinct subjects from RPO ontology queried using SPARQL; registration form for submitting new PID terms.]
Contact: [email protected]
Acknowledgements
The authors acknowledge RIKEN for providing necessary computing resources, the
research team at the Institute of Bioinformatics (IOB), Bangalore, India, for their
collaboration in developing RAPID, and alumni of our lab as well as all PID physicians
involved in the PID Japan project for their valuable input and suggestions.
Collaboration and funding
The PID project was initiated by the IOB and the Immunogenomics research
group at the Research Center for Allergy and Immunology (RCAI), RIKEN Yokohama
Institute, Japan, and was funded by the Asia S&T Strategic Cooperation Promotion
Program, Special Coordination Funds for Promoting Science and Technology, MEXT,
Japan.