Expression Data Integration
Microarray Gene Expression Database Meeting
Sunday 14th November 1999

Key Topics
• Incyte’s experience with expression databases
• The need for integrated data management and analysis
• A technology-independent exchange format for expression data

Technologies For Genome Wide Analysis
• Gene: Genome
• Transcript: Transcriptome
• Protein: Proteome

Key Components of Expression Databases
Incyte has the key components
• 4,645,958 Sequences
• >1500 CPU PC-Farm
• 109,938 human Genes (5’-3’ confirmed)
• >75 Terabytes of capacity
• Software
• Genes
• GEM™ cDNA Microarrays
• Data Management
• Toxicology
• Drug profiles
• Pathway analysis
• Disease tissues
• Normal tissues
• Proteomics
• 10,000 genes per GEM
• Databases
• >100,000 genes on GEMs
• HTP technology with OGS
• 19 different GEMs in total
• Matched RNA/Protein Exp.

Incyte’s Expression Databases Support Target Discovery and Lead Optimization
• Phases: Target Discovery; Lead Discovery and Optimization
• Stages shown: Target Idn., Target Seln., Screen Dev., Primary Screening, Secondary Screening, Lead Optn., Make-Test Cycle
• Data types shown: RNA, Protein
• “Accelerating Selection of High Quality Targets”
• “Accelerating Compound Selection and Decreasing the Attrition Rate”

Gene Expression Databases Require Integration
• Analytical Tools
• Integrated DB
• Proprietary Data
• Non-Proprietary Data

The rate of change in microarray technology will accelerate as major players enter the marketplace
Current players
• Incyte
• Affymetrix
• NEN
• Clontech
Emerging players
• Motorola
• Hewlett-Packard
• Perkin-Elmer
• Amersham
• Corning
• Roche

Microarray Data Management and Analysis
• Technology-independent: can store and analyze data from any microarray technology (single or dual channel)
• Provides tools to allow users to load their own microarray data into the database

Data Management
• Clinical and experiment information
• Sample preparation
• Hybridization conditions
• Genes/clones/sequences
• Expression values
• Summarization/Normalization Methods
• Microarray Design

Analytical Tools
• Query on most database attributes
• Average hybridizations; Composite hybridizations
• Data visualization
• Clustering
• Sequence analysis
• User-defined gene groups
• Data export; Spotfire™ integration
• Links to Incyte and PD databases

Take-home message: Central to the successful creation of an expression community will be the ratification of a common data exchange protocol and format.

LifeArray™ Data Import
Data flow: Raw Data File → PMD Driver → PMD File → Database Loader → RDBMS
• Requires no knowledge of database structure
• Minimizes the need for the end user to change software when schema changes are required
• No knowledge of user-specific systems required

Combining the Power of PMD with the Extensibility of XML
Why XML?
• XML: Extensible Markup Language
• W3C Standard: V1.0, Feb 1998
• Powerful language for defining custom markup languages
• Well suited for PMD content
Diagram labels: DTD, DB
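
The idea of a technology-independent, XML-based exchange record can be illustrated with a small sketch. This is not the PMD format, whose structure the slides do not describe. The element names (`hybridization`, `channels`, `expression_values`, `gene`, `value`), the function names, and the sample identifiers and intensities are all hypothetical and chosen only for illustration. The sketch shows how one record layout might carry both single-channel and dual-channel measurements, so that a loader would not need to know which microarray technology produced the data.

```python
import xml.etree.ElementTree as ET


def build_hybridization(hyb_id, platform, channels, normalization, values):
    """Build a hypothetical exchange record for one hybridization.

    values is a list of (gene_id, {channel_name: intensity}) pairs.
    All element and attribute names here are illustrative assumptions.
    """
    root = ET.Element("hybridization", id=hyb_id, platform=platform)
    ET.SubElement(root, "normalization").text = normalization
    chans = ET.SubElement(root, "channels")
    for name in channels:
        ET.SubElement(chans, "channel", name=name)
    data = ET.SubElement(root, "expression_values")
    for gene_id, by_channel in values:
        gene = ET.SubElement(data, "gene", id=gene_id)
        for name in channels:
            # One <value> per channel, so one or two channels use the same layout.
            ET.SubElement(gene, "value", channel=name).text = str(by_channel[name])
    return root


def read_values(root):
    """Read expression values back without assuming the number of channels."""
    result = {}
    for gene in root.iter("gene"):
        result[gene.get("id")] = {
            v.get("channel"): float(v.text) for v in gene.findall("value")
        }
    return result


# Dual-channel example (made-up intensities).
dual = build_hybridization(
    "hyb-001", "dual-channel", ["Cy3", "Cy5"], "example-method",
    [("gene-A", {"Cy3": 1200.0, "Cy5": 2400.0})],
)
# Single-channel example using the same record layout.
single = build_hybridization(
    "hyb-002", "single-channel", ["signal"], "example-method",
    [("gene-A", {"signal": 830.5})],
)

dual_text = ET.tostring(dual, encoding="unicode")
single_text = ET.tostring(single, encoding="unicode")
dual_parsed = read_values(ET.fromstring(dual_text))
single_parsed = read_values(ET.fromstring(single_text))
```

Because the reader iterates over whatever `value` elements are present, the same loading code handles both examples. A real exchange format would also need a DTD or schema and fields for the sample, hybridization, and array-design information listed on the Data Management slide.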