2007 IBM developerWorks Developer Conference (IBM developerWorks 開發者大會)

Composing a New Movement in Biotechnology: An Integrated Platform for Biomedical R&D
(演繹生化科技新樂章 - 生醫研發整合平台)
October 30, 2007
Joe Lu (呂政欣), Ph.D.
Life Sciences Center of Excellence, Software Group, IBM Taiwan
[email protected]
IBM developerWorks | Oct 2007 © 2007 IBM Corporation

IBM Taiwan – Life Sciences Center of Excellence
– Established in September 2002 with the support of the Ministry of Economic Affairs.
– Based in the Nankang Software Park, Taipei.
– The first and only IBM life science and healthcare solution development and enabling center in the Asia Pacific region outside Japan.
– Develops and builds a flexible, on-demand bioinformatics common platform based on open standards.
– Serves as a link and bridge to IBM's worldwide research capacity.

The Current Problems in Biomedical Studies
– Data Analysis: How can I use and link tools on different sites?
– Data Integration: How can I integrate information from different data resources?
– Computation Power: What can I do with limited computing power?

The Current Problems in Biomedical Studies (continued)
Data Analysis
– A single analysis process needs multiple bioinformatics tools.
– The tools are spread across various websites and systems.
– User interfaces and data formats are not unified.
Data Integration
– Information must be integrated to gain a global view.
– Data are dispersed across different websites and database systems, with no common standard.
Computation Power
– Bioinformatics analyses and data integration need a powerful and flexible system and environment.

LSCE's Approaches to Tackle the Problems
Data Analysis
– Integrated workplace and unified interface.
– "Lego"-type workflows and a workflow engine.
– Web Services as the standard.
Data Integration
– Federated data integration.
– Easy data query and search for end users.
Computation Power
– High performance computing.
– Grid computing architecture.

Web Services as Unified Standards
Source: Nature. 2002 May 9;417(6885):119-20, adapted from a keynote speech given by Lincoln Stein at the 2002 O'Reilly Open Bioinformatics Conference.

"Lego" Type Component and Workflow
[Figure: Traditional approach: Tools 1 to 4 run together on one LINUX system. Our approach: Tools 1 to 4 are hosted at different sites (Sites A, B, C, E, X, Z) on LINUX, MS and UNIX systems and linked into one workflow.]

LSCE Static Workflow
– Bioinformatics tools

Traditional Analysis Process
Each step runs on a separate website, with a manual format conversion between the output of one step and the input of the next.
– Sequence Alignment: BLAST (US, NCBI)
– Multiple Sequence Alignment: ClustalW (UK, EBI)
– Sequence Relation Analysis: Protdist (France, Phylip)
– Sequence Distance Analysis: Neighbor (Germany, Max-Planck)
[Figure axis: analysis time]

IBM Bioinformatics Common Platform
Phylogenetic Analysis Web Services
– Sequence Alignment: BLAST
– Multiple Sequence Alignment: ClustalW
– Sequence Relation Analysis: Protdist
– Sequence Distance Analysis: Neighbor
Benefits
– Saves time spent searching for tools.
– Saves time spent on format conversion.
– Unified user interface.
– Fewer manual errors.
[Figure axis: analysis time]

Bioinformatics Common Platform – Resource Sharing
[Figure: users A to E each run their own workflow (Workflow A to E) on the IBM Bioinformatics Common Platform; every workflow draws its tools from one shared Web Services pool of bioinformatics analysis tools.]

LSCE Dynamic Workflow
– Adaptors to link components
– Bioinformatics database access
– Bioinformatics analysis tools

DynaFlow – Unified Interface
– Mouse over a component box to select a command: "Set Inputs", "Set Options" or "Set Outputs".

DynaFlow – User Interface
– Ready State: all parameters have been set; press the "Run" button to run this step.
– Running State: the step is executing.
– Finish State: everything went well.
– Error State: the process failed.

DynaFlow – Adaptor
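The workflow ideas on the DynaFlow slides (components linked by adaptors, each step moving through Ready, Running, Finish or Error) can be illustrated with a minimal Python sketch. This is not the DynaFlow implementation: the names `Component`, `run_workflow`, `wrap_as_fasta`, `fasta_to_tab` and `longest_record` are hypothetical, and plain local functions stand in for the remote web service calls a real platform would make.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Dict, List, Optional


class State(Enum):
    READY = "Ready"
    RUNNING = "Running"
    FINISH = "Finish"
    ERROR = "Error"


@dataclass
class Component:
    """One workflow box: a named tool, its options, and its run state."""
    name: str
    tool: Callable[[str, Dict[str, str]], str]  # stand-in for a web service call
    options: Dict[str, str] = field(default_factory=dict)
    state: State = State.READY
    output: Optional[str] = None

    def run(self, data: str) -> str:
        self.state = State.RUNNING
        try:
            self.output = self.tool(data, self.options)
        except Exception:
            self.state = State.ERROR
            raise
        self.state = State.FINISH
        return self.output


Adaptor = Callable[[str], str]


def run_workflow(steps: List[Component], adaptors: List[Adaptor], data: str) -> str:
    """Run steps in order; adaptors[i] converts the output of steps[i]
    into the input format expected by steps[i + 1]."""
    if len(adaptors) != len(steps) - 1:
        raise ValueError("need exactly one adaptor between each pair of steps")
    for i, step in enumerate(steps):
        data = step.run(data)
        if i < len(adaptors):
            data = adaptors[i](data)
    return data


def wrap_as_fasta(data: str, options: Dict[str, str]) -> str:
    """Toy tool: turn whitespace-separated sequences into FASTA records."""
    seqs = data.split()
    if not seqs:
        raise ValueError("no sequences given")
    prefix = options.get("prefix", "seq")
    return "\n".join(f">{prefix}{i}\n{s.upper()}" for i, s in enumerate(seqs, 1))


def fasta_to_tab(fasta: str) -> str:
    """Adaptor: two-line FASTA records -> 'id<TAB>sequence' lines."""
    lines = fasta.splitlines()
    return "\n".join(f"{h[1:]}\t{s}" for h, s in zip(lines[0::2], lines[1::2]))


def longest_record(table: str, options: Dict[str, str]) -> str:
    """Toy tool: return the id of the longest sequence in a tab-separated table."""
    rows = [line.split("\t") for line in table.splitlines()]
    return max(rows, key=lambda row: len(row[1]))[0]


steps = [
    Component("wrap", wrap_as_fasta, {"prefix": "s"}),
    Component("longest", longest_record),
]
result = run_workflow(steps, [fasta_to_tab], "acgt acgtac ac")
print(result)                               # prints: s2
print([s.state.value for s in steps])       # prints: ['Finish', 'Finish']
```

Keeping the format conversion in a separate adaptor, rather than inside each tool, is what lets the same component be reused in different workflows; a failing step is left in the Error state and later steps stay Ready.
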
Integration of Bioinformatics Tools
Bioinformatics Analysis Workflows
– 16 bioinformatics analysis workflows.
– Applications for microarray, EST, transcription factor binding site, protein structure, antigenic site, alternative splicing site and phylogenetic analyses, etc.
Bio Web Services and Data Integration
– 68 bioinformatics and 25 data integration web services components.
– Bioinformatics web services components include BLAST, CAP3, ClustalW, GLAM, Glimmer, HMMER, Modeller, RasMol, RepeatMasker, SIM4, TMHMM, Vectorstrip, etc.
– Data integration web services components include NR, Unigene, LocusLink, dbEST, KEGG, Pfam, GO, JASPAR, InParanoid, etc.
Value Added Databases
– 6 Value Added Databases (VADs) developed.
– VADs include Microarray Primer Analysis, Protein–Protein Interaction, Repeat Sequences, SCA disease, etc.

Federated Data Integration
[Figure: Information Integrator (II) federates relational databases (DB2, Oracle, MS SQL, Sybase, MySQL) and bioinformatics sources and tools (PALS, Pfam, HMMER, KEGG, BLAST, Unigene, Mitochondria, NT, E. coli, lambda, LocusLink, dbEST, rRNA, CGAP, HUGO, FASTA, NR, Homologene) through wrappers. Legend: IBM II supported wrappers; LSCE developed wrappers.]

Example
– I am looking for genes expressed in human liver in the UniGene database.
– Is there any difference in gene expression between normal tissue and tumor?
– I am only interested in genes involved in signal transduction.
– My research focuses on chromosomes 9, 19 and 22.
– I also want to know their roles in Gene Ontology and in biological pathways.
– Do these genes have entries in the OMIM disease database?

Data Integration & Query System
– Data Discovery & Query Builder (DDQB)
– Information Integrator (II)
[Screens: Click and Select Query Condition; Click and Select Output Data; Query Management; Query Results.]

LSCE System Architecture
[Figure: layered architecture diagram. Labels recoverable from the slide:
– Portal Layer: WebSphere Portal Server; contents, collaboration and applications based on personalization; application integration.
– Application Layer: DDQB, Workflows, IDA, MAA, EAP, CREPP, MPL, DynaFlow Engine, other applications.
– Web Services (SOA): BLAST, EMBOSS, CAP3, Phred, R, others; SOAP, WSDL, XML.
– Bioinformatics Common Platform: WebSphere Application Server.
– Computing Layer: GRID Service MGMT, GRID Security Svc, GRID Prog. Service, GRID Registry Svc; Globus Toolkit.
– Database Layer: Information Integrator (II), federated database access and integration; database sources (DB2, KEGG, GO, Hugo, UniGene, Swiss-Prot, dbEST, Pfam, ...); Value Added Database (DB2, PALS, SCAdb, CHANGE, PPI, ...).]

Collaboration & Implementation
– Hsinchu Biomedical Science Park (新竹生醫園區)

For More Information
– IBM Taiwan, Life Sciences Center of Excellence: http://www.petridish.cc/
– IBM Healthcare and Life Sciences: http://www-03.ibm.com/industries/healthcare/index.jsp
– IBM Research: http://www.research.ibm.com/
– World Community Grid: www.worldcommunitygrid.org
– The Genographic Project: https://www5.nationalgeographic.com/genographic/