A trial to develop an integrative database of mouse phenotype-related information
Hiroshi Masuya1 ([email protected]), Koji Kozaki3, Norio Kobayashi2 ([email protected]),
Shigeharu Wakana4, Nobuhiko Tanaka1, Kazunori Waki1 ([email protected]),
Riichiro Mizoguchi3 ([email protected]), Tetsuro Toyoda2
1 Technology and development unit for knowledge base of mouse phenotype,
RIKEN Bioresource center, 3-1-1 Kouyadai, Tsukuba, Ibaraki 305-0074, Japan
2 Bioinformatics and systems engineering division, RIKEN Yokohama Institute, 1-7-22
Suehiro-cho, Tsurumi-ku, Yokohama City, Kanagawa, 230-0045, Japan
3 Department of Knowledge Systems, The Institute of Scientific and Industrial Research (ISIR),
Osaka University, 8-1 Mihogaoka, Ibaraki, Osaka, 567-0047 Japan
4 Technology and Development Team for Mouse Phenotype Analysis: Japan Mouse Clinic,
RIKEN Bioresource center, 3-1-1 Kouyadai, Tsukuba, Ibaraki 305-0074, Japan
Keywords: mouse, database, phenotype, ontology, semantic web
1 Introduction
Following the sequencing of the whole mouse genome, large-scale international projects are underway to
generate collections of knockout mouse mutants and subsequently to perform high-throughput phenotype
assessments. The complexity and scale of the resulting phenotype data raise new challenges for
bioinformatics and require a sophisticated informatics infrastructure for the integration of
phenotype-related information. Semantics-based description with an interoperable set of ontologies is one
of the ideal ways to integrate such information [1]. However, there have been few efforts to develop
practical databases with ontological descriptions. We have developed a base system for database
development, collaboration and publication, named "RIKEN SciNeS" [2], which is built on semantic web
technology and can host various data, such as ontologies and scalable datasets, in a machine-readable format.
We developed a database hosted by SciNeS to integrate phenotypic data from two major databases: the
PhenoPub database [3] of the Japan Mouse Clinic (JMC) project and the EuroPhenome database [4] of the
European Mouse Disease Clinic (EUMODIC) project.
2 Method and Results
2.1 The basic schema for the integration of phenotype related information
To integrate a broad range of concepts using a general data model, we need a logically well-formed
framework that incorporates accurate representations of biological reality. We have developed a comprehensive
data framework to represent entities in experimental genetics workflows, the “YATO-Genetics Ontology (GXO)
framework” [5], based on a top-level ontology, Yet Another Top-level Ontology (YATO) [6]. This
framework provides consistent guidelines on how to build data schemas in combination with public datasets
such as data records in Mouse Genome Informatics (MGI), the Mouse Adult Gross Anatomy Ontology (MA) and the
Phenotypic Quality Ontology (PATO). It also covers two formalisms for describing phenotypes, Entity +
Quality (EQ) and Entity + Attribute + Value (EAV) [7], which are candidates for an internationally
standardized syntax that expresses biological phenotypes as combinations of ontology terms.
We have imported the YATO-GXO framework into SciNeS [8]. In developing the integrated database, we
generated each database table from a concept (class) of the YATO-GXO framework as a specialized “subclass”.
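The difference between the EQ and EAV formalisms can be illustrated with a minimal sketch. The class names, the render format and the example terms below are assumptions made for illustration only; they are not taken from the YATO-GXO framework or from SciNeS, and the term labels are not verified ontology entries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    """An ontology term; labels here are illustrative, not verified entries."""
    ontology: str  # source ontology prefix, e.g. "MA" or "PATO"
    label: str

    def render(self) -> str:
        return f"{self.ontology}:{self.label}"


@dataclass(frozen=True)
class EQ:
    """Entity + Quality: a single quality term carries the whole finding."""
    entity: Term
    quality: Term

    def render(self) -> str:
        return f"{self.entity.render()} | {self.quality.render()}"


@dataclass(frozen=True)
class EAV:
    """Entity + Attribute + Value: the attribute is kept apart from its value."""
    entity: Term
    attribute: Term
    value: str

    def render(self) -> str:
        return f"{self.entity.render()} | {self.attribute.render()} = {self.value}"


# Hypothetical example: the same finding expressed in both formalisms.
tail = Term("MA", "tail")
eq = EQ(entity=tail, quality=Term("PATO", "decreased length"))
eav = EAV(entity=tail, attribute=Term("PATO", "length"), value="decreased")

print(eq.render())
print(eav.render())
```

In this sketch both descriptions share the same entity term, which is what would allow records written in either syntax to be compared on the anatomical entity they refer to.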
2.2 The composition of the integrated phenotype database
We analyzed the data schemas of the PhenoPub and EuroPhenome databases. As a result, we concluded that
six mutually related database tables, Annotation of Phenotype Data, Statistics, Measurement and
Observation, Cohort, Individual and Experimental Parameter, are sufficient to represent the contents of
both databases. We prepared a “template” project containing these six database tables (“classes” in
SciNeS). Two “child” projects were generated from the template to hold the datasets imported from
PhenoPub and EuroPhenome, respectively [9]. As a result, two databases of the same style hold phenotyping
data from different projects. These databases represent a common formalism integrated with public data sources.
Figure 1: Overview of integrated mouse phenotype database in SciNeS.
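The template and child project arrangement described above can be sketched as follows. Only the six class names come from the text; the Project class, the helper functions and the sample records are hypothetical and do not reflect the actual SciNeS interface or real data.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# The six classes named in the text; everything else below is hypothetical.
TEMPLATE_CLASSES = (
    "Annotation of Phenotype Data",
    "Statistics",
    "Measurement and Observation",
    "Cohort",
    "Individual",
    "Experimental Parameter",
)


@dataclass
class Project:
    """A named set of tables, each holding a list of records."""
    name: str
    tables: Dict[str, List[dict]] = field(default_factory=dict)

    def add_record(self, table: str, record: dict) -> None:
        # Reject tables that are not part of the shared schema.
        if table not in self.tables:
            raise KeyError(f"{table!r} is not a class of project {self.name!r}")
        self.tables[table].append(record)


def make_template() -> Project:
    """Build the template project with six empty tables."""
    return Project("template", {cls: [] for cls in TEMPLATE_CLASSES})


def spawn_child(template: Project, name: str) -> Project:
    """Copy the template's class names into a new, empty child project."""
    return Project(name, {cls: [] for cls in template.tables})


template = make_template()
phenopub = spawn_child(template, "PhenoPub import")
europhenome = spawn_child(template, "EuroPhenome import")

# Invented records, for illustration only.
phenopub.add_record("Cohort", {"id": "cohort-1", "size": 10})
europhenome.add_record("Cohort", {"id": "cohort-A", "size": 12})

# Both children expose the same set of tables, so they can be queried alike.
same_schema = set(phenopub.tables) == set(europhenome.tables)
print(same_schema)
```

Because each child copies only the class names from the template, the two imports stay structurally identical while their records remain separate.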
3 Discussions
The value of any kind of data is greatly enhanced when it exists in a form that allows it to be integrated with
other data. One approach to integration is through the annotation of multiple bodies of data using ontologies.
We tried to develop an integrated database based on two infrastructures. One is YATO-GXO, a logical
framework of data structure and meaning. The other is SciNeS, a semantic-web-based data hosting system.
The integrated mouse phenotype database project implemented in SciNeS yielded a common, logically
well-formed and practical formalism for phenotype-related data based on the top-level ontology YATO.
Standardization of phenotype descriptions enables better comparison and integration of mutant
information. Furthermore, equivalences and differences between JMC and EUMODIC, such as
phenotyping procedures or methods of statistical analysis, can be represented by annotation with commonly
formed data. This trial showed that semantic-web-based integration is workable for complex and scalable
phenotypic information. In the future, we will establish interoperability between the YATO-GXO framework
and similar external frameworks, such as the Ontology for Biomedical Investigations (OBI) [10], which is
based on Basic Formal Ontology (BFO) [11], to realize data integration across broader communities.
References
[1] Smith B et al, The OBO Foundry: coordinated evolution of ontologies to support biomedical data
integration. Nature Biotechnology, 25, 1251-1255, 2007.
[2] http://database.riken.jp/
[3] http://www.brc.riken.go.jp/lab/jmc/mouse_clinic/m-strain_jp.html
[4] http://www.europhenome.org/
[5] http://www.brc.riken.go.jp/lab/bpmp/Ontologies/GXO/GXO.html
[6] http://www.ei.sanken.osaka-u.ac.jp/hozo/onto_library/upperOnto.htm
[7] Hancock JM et al, Mouse, man, and meaning: bridging the semantics of mouse phenotype and human
disease. Mammalian Genome, in press
[8] https://database.riken.jp/sw/view#RIB00023
[9] http://renkei.gsc.riken.jp/sw/view#RIA00110
[10] http://obi-ontology.org/
[11] http://www.ifomis.org/bfo