A trial to develop an integrative database of mouse phenotype related information

Hiroshi Masuya1, Koji Kozaki3, Norio Kobayashi2, Shigeharu Wakana4, Nobuhiko Tanaka1, Kazunori Waki1, Riichiro Mizoguchi3, Tetsuro Toyoda2

1 Technology and development unit for knowledge base of mouse phenotype, RIKEN Bioresource center, 3-1-1 Kouyadai, Tsukuba, Ibaraki 305-0074, Japan
2 Bioinformatics and systems engineering division, RIKEN Yokohama Institute, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama City, Kanagawa, 230-0045, Japan
3 Department of Knowledge Systems, The Institute of Scientific and Industrial Research (ISIR), Osaka University, 8-1 Mihogaoka, Ibaraki, Osaka, 567-0047 Japan
4 Technology and Development Team for Mouse Phenotype Analysis: Japan Mouse Clinic, RIKEN Bioresource center, 3-1-1 Kouyadai, Tsukuba, Ibaraki 305-0074, Japan

Keywords: mouse, database, phenotype, ontology, semantic web

1 Introduction

Following the sequencing of the whole mouse genome, large-scale international projects are underway to generate collections of knockout mouse mutants and then to perform high-throughput phenotype assessments. The complexity and scale of the resulting phenotype data raise new challenges for bioinformatics and call for a sophisticated informatics infrastructure that integrates phenotype related information. Semantics-based description with an interoperable set of ontologies is one of the ideal ways to integrate such information [1]. However, there have been few efforts to develop a practical database with ontological descriptions. We have developed a base system for database development, collaboration and publication, named “RIKEN SciNeS” [2], which is built on semantic web technology and can host various data, such as ontologies and scalable datasets, in a machine-readable format.
We developed a database hosted by SciNeS to integrate the phenotypic data of two major databases: the PhenoPub database [3] of the Japan Mouse Clinic (JMC) project and the EuroPhenome database [4] of the European Mouse Disease Clinic (EUMODIC) project.

2 Method and Results

2.1 The basic schema for the integration of phenotype related information

To integrate a broad range of concepts under a general data model, we need a logically well-formed framework that incorporates accurate representations of biological reality. We have developed a comprehensive data framework to represent entities in experimental genetics workflows, the “YATO-Genetics Ontology (GXO) framework” [5], based on a top-level ontology, Yet Another Top-level Ontology (YATO) [6]. This framework provides consistent guidelines on how to build a data schema in combination with public datasets such as the data records in Mouse Genome Informatics (MGI), the Mouse Adult Gross Anatomy Ontology (MA) and the Phenotypic Quality Ontology (PATO). It also covers two formalisms for describing phenotypes by combining ontology terms, Entity + Quality (EQ) and Entity + Attribute + Value (EAV) [7], both candidates for an internationally standardized syntax of biological phenotypes. We have imported the YATO-GXO framework into SciNeS [8]. In developing the integrated database, we generated each database table from a concept (class) of the YATO-GXO framework as a specific “subclass”.

2.2 The composition of the integrated phenotype database

We analyzed the data schemas of the PhenoPub and EuroPhenome databases. As a result, we concluded that six mutually related database tables, Annotation of Phenotype Data, Statistics, Measurement and Observation, Cohort, Individual and Experimental Parameter, are sufficient to represent the contents of both databases. We prepared a “template” project containing these six database tables (“classes” in SciNeS).

Figure 1: Overview of integrated mouse phenotype database in SciNeS.
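
The six tables named above can be sketched as plain record types. This is only an illustration under stated assumptions: the source gives the table names but not their columns or the links between them, so every field name, every foreign-key style reference, and the `summarize` helper below are hypothetical. The annotation uses the Entity + Attribute + Value form with placeholder labels instead of real ontology term identifiers, and the sample values are invented for the example.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List, Tuple

# Hypothetical sketch: only the six table names come from the text.
# All fields and relationships are assumptions made for illustration.


@dataclass(frozen=True)
class ExperimentalParameter:
    param_id: str
    name: str
    unit: str


@dataclass(frozen=True)
class Individual:
    individual_id: str
    genotype: str


@dataclass(frozen=True)
class Cohort:
    cohort_id: str
    member_ids: Tuple[str, ...]  # assumed link to Individual.individual_id


@dataclass(frozen=True)
class Measurement:  # stands in for "Measurement and Observation"
    individual_id: str
    param_id: str
    value: float


@dataclass(frozen=True)
class Statistics:
    cohort_id: str
    param_id: str
    mean_value: float
    n: int


@dataclass(frozen=True)
class PhenotypeAnnotation:  # stands in for "Annotation of Phenotype Data"
    cohort_id: str
    entity: str     # EAV entity (placeholder label, not a real term ID)
    attribute: str  # EAV attribute
    value: str      # EAV value


def summarize(cohort: Cohort, param: ExperimentalParameter,
              measurements: List[Measurement]) -> Statistics:
    """Average one parameter over the members of one cohort."""
    values = [m.value for m in measurements
              if m.individual_id in cohort.member_ids
              and m.param_id == param.param_id]
    return Statistics(cohort.cohort_id, param.param_id, mean(values), len(values))


# Invented sample data, used only to show how the records relate.
weight = ExperimentalParameter("p1", "body weight", "g")
mice = [Individual("m1", "mutant"), Individual("m2", "mutant"),
        Individual("m3", "wild type")]
mutants = Cohort("c1", ("m1", "m2"))
records = [Measurement("m1", "p1", 20.0), Measurement("m2", "p1", 22.0),
           Measurement("m3", "p1", 30.0)]
stats = summarize(mutants, weight, records)
annotation = PhenotypeAnnotation(stats.cohort_id, "whole body", "weight", "decreased")
```

In this sketch the annotation points at a cohort-level summary and not at raw measurements. That is one plausible reading of "mutually related" tables, not a description of the actual SciNeS schema.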
Two “child” projects were generated from the template to hold the datasets imported from PhenoPub and EuroPhenome, respectively [9]. As a result, two databases with the same structure hold phenotyping data from different projects. These databases share a common formalism that is integrated with public data sources.

3 Discussions

The value of any kind of data is greatly enhanced when it exists in a form that allows it to be integrated with other data. One approach to integration is the annotation of multiple bodies of data using ontologies. We tried to develop an integrated database based on two infrastructures. One is YATO-GXO, a logical framework of data structure and meaning. The other is SciNeS, a semantic-web based data hosting system. The integrated mouse phenotype database project implemented in SciNeS yielded a common, logically well-formed and practical formalism for phenotype related data based on the top-level ontology YATO. Standardization of the phenotype description enables better comparison and integration of mutant information. Furthermore, equivalences and differences between JMC and EUMODIC, such as phenotyping procedures or methods of statistical analysis, can be represented by annotation with commonly structured data. This trial showed that semantic-web based integration is workable for complex and scalable phenotypic information. In the future, we will establish interoperability between the YATO-GXO framework and similar external frameworks, such as the Ontology for Biomedical Investigations (OBI) [10], which is based on Basic Formal Ontology (BFO) [11], to realize data integration across broader communities.

References

[1] Barry Smith et al., The OBO Foundry: coordinated evolution of ontologies to support biomedical data integration, Nature Biotechnology, 25, 1251-1255, 2007.
[2] http://database.riken.jp/
[3] http://www.brc.riken.go.jp/lab/jmc/mouse_clinic/m-strain_jp.html
[4] http://www.europhenome.org/
[5] http://www.brc.riken.go.jp/lab/bpmp/Ontologies/GXO/GXO.html
[6] http://www.ei.sanken.osaka-u.ac.jp/hozo/onto_library/upperOnto.htm
[7] Hancock JM et al., Mouse, man, and meaning: bridging the semantics of mouse phenotype and human disease, Mammalian Genome, in press.
[8] https://database.riken.jp/sw/view#RIB00023
[9] http://renkei.gsc.riken.jp/sw/view#RIA00110
[10] http://obi-ontology.org/
[11] http://www.ifomis.org/bfo