Tae-Hyung Kim1 [email protected]
InSong Koh2 [email protected]
Gil-Mi Ryu1,2 [email protected]
Jong Park3 [email protected]

1 Department of Bioinformatics, Bioinformatics Cooperative Course, Pusan National University, Pusan, Korea
2 Section of Bioinformatics, Central Genome Center, National Institute of Health, Nokbun-Dong 5, Seoul, Korea
3 MRC-DUNN, Hills Road, Cambridge CB2 2XY, England, UK

1 Introduction

One of the major obstacles in bioinformatics is the difficulty of computing with literature information. Unlike for sequence and structure data, it is impossible to establish homology, similarity, interaction and function criteria for literature information. To ease this problem, attempts to clarify ontological problems have become bioinformatics projects in their own right. The idea of an ontology is to define terms and concepts in mechanical, computable units. The result is a clear classification and mapping of text elements for computers.

We have applied this ontological advantage of classifying elements to the bioinformatics field itself. An important merit of this project is that it supports efficient understanding and dissemination of bioinformatics knowledge in this fast-growing field. Any intuitive classification system of bioinformatics itself can provide us with valuable project ideas and future directions.

Our ontology of the bioinformatics field has three main components: 1) classification based on methodology, 2) knowledge based classification (database systems) and 3) classification based on biological data types. These components overlap, and they are different aspects of the same or similar information. However, depending on the user's interest, a particular view can be more relevant for designing and organizing a bioinformatics project.

2 Method and Results

2.1 Classification based on methodology

We classified the bioinformatics field according to the methods used to analyze biological data (DNA, RNA, protein). In this way, bioinformatics can be understood intuitively through a schematic map.
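The methodology map can be illustrated with a small tree. The following is a minimal sketch, assuming the map is modeled as a nested dictionary; the branch names follow the Figure 3 description (DNA: mapping, sequencing, assembly, searching; RNA: cDNA chip leading to expression analysis; protein: comparative and predictive methods), while the dictionary layout and the helper names render and find_path are hypothetical and are not taken from the authors' system.

```python
# Illustrative sketch only: node names come from the text; the nested-dict
# layout and the helper names are assumptions, not the authors' implementation.
METHODOLOGY = {
    "DNA": {"mapping": {}, "sequencing": {}, "assembly": {}, "searching": {}},
    "RNA": {"cDNA chip": {"expression analysis": {}}},
    "Protein": {"comparative methods": {}, "predictive methods": {}},
}


def render(tree, depth=0):
    """Return the tree as indented lines, one node per line."""
    lines = []
    for name, children in tree.items():
        lines.append("  " * depth + name)
        lines.extend(render(children, depth + 1))
    return lines


def find_path(tree, term, prefix=()):
    """Return the path from the root to `term`, or None if it is absent."""
    for name, children in tree.items():
        path = prefix + (name,)
        if name == term:
            return path
        found = find_path(children, term, path)
        if found is not None:
            return found
    return None


if __name__ == "__main__":
    # Print the schematic map as an indented outline.
    print("\n".join(render(METHODOLOGY)))
    # Locate one concept within the classification.
    print(find_path(METHODOLOGY, "expression analysis"))
```

A nested dictionary keeps the sketch short; a lookup such as find_path is the kind of computation on classified elements that an ontology is meant to make possible.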
Figure 1. Main window.

Figure 2. Sub windows.

Figure 3. Ontological classification based on methodology. The methodology for DNA sequence determination can be classified according to work procedure, such as mapping, sequencing, assembly, and searching. RNA analysis is classified according to the cDNA chip procedure resulting in expression analysis. Protein analysis methodology can be classified into comparative and predictive methods.

2.2 Knowledge based classification (database systems)

Databases can be classified according to their data features into 1) sequence, 2) protein, 3) metabolic pathway, 4) organism and 5) RNA groups.

Figure 4. Classification of databases. Biological databases can be classified according to their data features. The popular databases used in the biological community are included in this schematic map.

2.3 Classification based on biological data types

We categorized the component fields of bioinformatics according to the implementation types used by biologists after data acquisition. We differentiate them by the common procedures and tools that biologists usually apply to biological knowledge.

Figure 5. Classification according to biological data types. According to this classification map, biological data can be identified through prediction of sequence structure and function. As information acquired from data flows from right to left, it becomes more and more clear.

3 Discussion

In this classification of the components of bioinformatics, we introduced our ontology schema for classifying and mapping the bioinformatics field itself. This ontological procedure was designed to represent methodology, features of databases and data content. It therefore allows us to find projects and relate problem domains in bioinformatics in a much more systematic way. It can also be used to cluster biological sequence data based on their bioinformatics ontology characteristics, and it supports computation on specific elements such as sequences and databases. In addition, schematic maps are drawn as visual trees so that one can get a global picture of the bioinformatics field and obtain more precise information intuitively and efficiently. The lower levels of each classification criterion are linked to web pages (http://nihcgc.re.kr/BioinfoMap and http://interaction.mrc-dunn.cam.ac.uk/BioinfoMap/). The classification system is still being developed and will be stored in an SQL based database for more dynamic navigation between the different component concepts of the bioinformatics field.

Acknowledgement

We thank Mi-Ae Yoo and Heui-Soo Kim (Pusan National University) for support. This work was funded in part by the Bioinformatics Training Grant of the Ministry of Health & Welfare, Korea, and supported by Pusan National University, Korea, and MRC, UK.

References

[1] Patricia G. Baker, Carole A. Goble, Sean Bechhofer, Norman W. Paton, Robert Stevens, Andy Brass, An ontology for bioinformatics applications, Bioinformatics, vol. 15, no. 6, 510-520, 1999.
[2] Robert Stevens, Patricia Baker, Sean Bechhofer, Gary Ng, Alex Jacoby, Norman W. Paton, Carole A. Goble, Andy Brass, TAMBIS: Transparent Access to Multiple Bioinformatics Information Sources, Bioinformatics, vol. 16, no. 2, 184-185, 2000.
[3] Andreas D. Baxevanis, The Molecular Biology Database Collection: an online compilation of relevant database resources, Nucleic Acids Research, vol. 28, no. 1, 2000.
[4] The Gene Ontology Consortium, Gene ontology: tool for the unification of biology, Nature America Inc., http://genetics.nature.com, Nature Genetics, vol. 25, 2000.