An Integrated Database and Knowledge-Base of Interaction between Human Proteins and Commonly Used Drugs

Hiroyasu Shimada1, Masashi Nemoto1, Ken Horiuchi1, Atsuko Yamaguchi1, Motoi Tobita2, Kenji Araki3, Tetsuo Nishikawa1

1 Reverse Proteomics Research Institute, 2-6-7 Kazusa-Kamatari, Kisarazu, Chiba 292-0818, Japan
2 Hitachi, Advanced Research Laboratory, 1-280 Higashi-Koigakubo, Kokubunji, Tokyo 185-8601, Japan
3 Mochida Pharmaceutical, 722 Uenohara, Jimba, Gotemba, Shizuoka 412-8524, Japan

Keywords: protein-drug interaction, database, human full length cDNA, knowledge-base

1 Introduction

Finding new drug target proteins is indispensable for designing novel drugs. However, few target proteins of commonly used drugs have been identified so far. It is therefore important to clarify the target proteins of commonly used drugs, both to optimize those drugs and to design novel ones. The Reverse Proteomics Research Institute (REPRORI) aims to find new target proteins among 6,000 proteins obtained from the human full-length cDNA clones generated in the NEDO FL project, using 800 known commonly used drugs as probes. It started a research project [1] in 2002 in order to build a platform for drug design. In this paper, we report a database system [2] that integrates annotation data of proteins, drugs, and interactions, and a knowledge-base system that collects knowledge extracted from the database. One of the unique characteristics of the database is that it collects comprehensive interaction data as matrices.
From the analysis of the database, various findings are expected to be derived, such as new drug target proteins, different target proteins corresponding to different pharmacological effects or adverse effects, and new pharmacophores. Therefore, we clustered the interaction matrices and analyzed the clusters to search for properties of the drugs or proteins common to the members of each cluster. We also performed modeling and docking analyses for the measured interactions. We developed a knowledge-base system that collects cluster analysis results and structural analysis results for interaction pairs. New target proteins and the related information in the knowledge-base system will give us a basis for designing novel drugs.

2 Method and Results

[Figure 1: Construction procedure of integrated database and knowledge-base. Steps shown: information collection, pre-processing, database construction, data-mining, knowledge-base construction.]

2.1 Construction Procedure of Integrated Database and Knowledge-Base

The five-step construction procedure is shown in Figure 1. In the information collection step, information on commonly used drugs, FL cDNA sequences and amino acid sequences, known interaction data between drugs and proteins, and measured interaction data is collected. In the pre-processing step, physical properties of drugs are calculated. In addition, drugs are clustered based on these properties, the measured interaction data are normalized, and the cDNA sequences are functionally annotated and clustered based on genome mapping.
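As a minimal illustration of the pre-processing step, the sketch below arranges measured signals into a drug-by-protein matrix and rescales them. The text does not specify the normalization method, so the per-protein z-score used here, the record layout, the example data, and the names `build_matrix` and `zscore_columns` are all assumptions made for illustration only.

```python
from statistics import mean, pstdev


def build_matrix(records, drugs, proteins):
    """Arrange (drug, protein, signal) records into a drugs x proteins matrix.

    Pairs that were never measured are stored as None.
    """
    row_of = {d: i for i, d in enumerate(drugs)}
    col_of = {p: j for j, p in enumerate(proteins)}
    matrix = [[None] * len(proteins) for _ in drugs]
    for drug, protein, signal in records:
        matrix[row_of[drug]][col_of[protein]] = float(signal)
    return matrix


def zscore_columns(matrix):
    """Z-score each protein column over its measured entries.

    This normalization scheme is an assumption, not the method of the paper.
    Columns with zero spread are mapped to 0.0; unmeasured pairs stay None.
    """
    out = [row[:] for row in matrix]
    n_cols = len(matrix[0]) if matrix else 0
    for j in range(n_cols):
        values = [row[j] for row in matrix if row[j] is not None]
        if not values:
            continue
        mu, sigma = mean(values), pstdev(values)
        for i, row in enumerate(matrix):
            if row[j] is not None:
                out[i][j] = 0.0 if sigma == 0 else (row[j] - mu) / sigma
    return out


# Hypothetical measurements: (drug, protein, raw signal).
drugs = ["drugA", "drugB", "drugC"]
proteins = ["protein1", "protein2"]
records = [
    ("drugA", "protein1", 1.0),
    ("drugB", "protein1", 2.0),
    ("drugC", "protein1", 3.0),
    ("drugA", "protein2", 5.0),
]
raw = build_matrix(records, drugs, proteins)
normalized = zscore_columns(raw)
```

Keeping unmeasured pairs as `None` rather than zero distinguishes "not screened" from "no interaction", which matters once the matrix is clustered.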
In the database construction step, pre-processed data are gathered into a PostgreSQL database. Graphical user interfaces are built in order to visualize information on drugs, proteins, and interactions. In the data-mining step, interaction cluster analysis and structural analysis are performed. In the knowledge-base construction step, the results of the clustering analysis and structural analysis are gathered.

2.2 Integrated Database System

The database contains annotations for more than 1,000 commonly used drugs and for about 20,000 proteins encoded by the human full-length cDNA sequences. Annotations for drugs include CAS No., synonyms, structures, physical properties, pharmacological effects, and adverse effects. Annotations for proteins include functional annotations, such as similarity and motif search results, and genome mapping. The database also contains experimental interaction data obtained from several screening methods, such as size exclusion chromatography and surface plasmon resonance, and known interaction data collected from public databases such as ChemBank and from the literature. By using the database, we can refer to information from journals, information from public databases, and experimental data at the same time, and observe connections among them. Examining these data together can lead to the discovery of interactions related to pharmacological effects or adverse effects.

2.3 Knowledge-Base System

The construction procedure and interfaces of the knowledge-base system are shown in Figure 2. First, the interaction matrix is clustered using a clustering method we developed, which uses the concept of a biclique cover and maximizes the average cluster size. To check the clustering results, we used BirdsAnts [3], a viewer we developed.
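To make the biclique-cover idea concrete, the following sketch covers a binary drug-protein interaction set with bicliques, that is, groups of drugs and proteins in which every drug interacts with every protein. It is a simple greedy heuristic written for illustration and is not the clustering method described above, which additionally maximizes the average cluster size. The function name `greedy_biclique_cover` and the example data are hypothetical.

```python
def greedy_biclique_cover(interactions):
    """Cover all drug-protein interactions with bicliques (greedy heuristic).

    interactions: dict mapping each drug to the set of proteins it binds.
    Returns a list of (drugs, proteins) frozenset pairs. Within each pair,
    every drug binds every protein, and together the pairs cover every
    interaction at least once.
    """
    # Invert the mapping: protein -> set of drugs that bind it.
    by_protein = {}
    for drug, prots in interactions.items():
        for p in prots:
            by_protein.setdefault(p, set()).add(drug)

    uncovered = {(d, p) for d, prots in interactions.items() for p in prots}
    cover = []
    while uncovered:
        drug, protein = min(uncovered)  # deterministic seed interaction

        # Candidate A: all proteins of the seed drug, plus every drug
        # that binds all of those proteins.
        prots_a = set(interactions[drug])
        drugs_a = {d for d, ps in interactions.items() if prots_a <= ps}

        # Candidate B: all drugs of the seed protein, plus every protein
        # bound by all of those drugs.
        drugs_b = set(by_protein[protein])
        prots_b = set.intersection(*(interactions[d] for d in drugs_b))

        # Keep the candidate with more drug-protein pairs.
        if len(drugs_a) * len(prots_a) >= len(drugs_b) * len(prots_b):
            drugs, prots = drugs_a, prots_a
        else:
            drugs, prots = drugs_b, prots_b

        cover.append((frozenset(drugs), frozenset(prots)))
        uncovered -= {(d, p) for d in drugs for p in prots}
    return cover


# Hypothetical interaction data.
interactions = {
    "drug1": {"P1", "P2"},
    "drug2": {"P1", "P2"},
    "drug3": {"P2", "P3"},
}
cover = greedy_biclique_cover(interactions)
```

Both candidates contain the seed interaction, so each iteration covers at least one new pair and the loop terminates. A cover found this way is not guaranteed to be minimal or to maximize average cluster size.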
For the obtained clusters, analyses such as pharmacophore analysis and protein domain analysis were performed. If an interesting pair of a drug and a protein is found as a result of the clustering analysis, and if the structure of that protein can be inferred, homology modeling and docking calculations are performed. The analyzed data are accumulated in the knowledge-base system. In the interaction cluster annotation window, the features extracted from the analysis of the clusters, such as pharmacophores, are shown. The cluster is visualized with several properties of drugs and proteins by BirdsAnts. In the structural annotation window, modeling and docking data are visualized using jV version 3 [4].

[Figure 2: Construction of knowledge-base. a) Construction procedure, b) Interface of the system.]

3 Acknowledgments

This work was supported by a grant from the NEDO Project of the Ministry of Economy, Trade and Industry of Japan.

References

[1] Suwa, Y., Exploring novel drug targets through the comprehensive analysis of human protein-drug interactions, HUGO's 10th Human Genome Meeting, Kyoto, Japan, 18-21 April 2005.
[2] Nemoto, M., Araki, K., Horiuchi, K., Tobita, M., and Nishikawa, T., An integrated database of interaction between human proteins and commonly used drugs, Genome Inform., 14:599-600, 2003.
[3] Tobita, M., Horiuchi, K., Araki, K., Nemoto, M., and Nishikawa, T., BirdsAnts: bringing informative rules from a database system, aimed at novel targets search, Genome Inform., 14:286-289, 2003.
[4] http://pdbj.protein.osaka-u.ac.jp/PDBjViewer/index.html