Overview of the Biotech Industry
Srinivasan Seshadri, CEO, Strand Genomics

The drug discovery process is currently being transformed by emerging technologies

EMERGING DISCOVERY PROCESS

Basic process and current impact of new technologies
• Identify disease mechanism – define targets: fundamentally new approach
• Identify compounds to interact with target (lead): better/faster
• Optimise leads: better/faster
• Biological validation: no major breakthrough
  – Still reliant on in vitro/in vivo models of disease

Old world
• Molecular biology
• Physiology
• Biochemistry
• Slow, largely manual chemical synthesis of leads
• Slow, manual screening on limited range of assays

New world
• Genomics
• Combinatorial chemistry
• High throughput screening
• More sophisticated/automated (high throughput) screening has increased lead identification productivity by 30 times
• Rate of compound generation increased by factor >1,000 through combinatorial chemistry techniques (although not clear what % are useful)
• Greater use of more sophisticated ‘genetic’ models, but currently complex and slow

Genomics is central to this evolving landscape. The goal of Genomics is to unravel the genetic basis of health and disease, providing a huge array of potential drug targets. Given the complexity of the genome and the volumes of data being generated, significant challenges exist in accessing and leveraging this data effectively. New IT-based arenas are emerging to do this

GENOMICS: A BIOLOGICAL DEFINITION

Genomics is the study of the genetic composition of an organism and provides information on the structure, role and genetic linkage of genes. Some gene function is implicated in disease and it is therefore believed that better, more specific information about the origins of a disease will lead to more effective treatments.

[Figure: DNA base pairs (A-T, T-A, G-C, C-G)]

The characteristics (phenotype) of each individual, and of their organs and tissues, are determined by their genetic makeup (genotype).
Every cell contains the full complement of the individual’s genetic material – the genome

“Genomics represents a paradigm shift in disease treatment from ‘underlying mechanism’ to ‘root cause’”
CSO, Genomic-co

The genome consists of a length of double-stranded DNA to which are attached 4 types of molecule (bases). There are 3 billion of these in total.

Some of these bases code for proteins (the cell manufactures protein using the DNA template). Others fill in the gaps and have no other function.

A gene represents a section of DNA which codes for a protein or other functional piece of cellular machinery (e.g.,

Bioinformatics describes one of two information-driven arenas within pharmaceutical drug discovery . . .

BIOINFORMATICS: A DEFINITION

Informatics

Bioinformatics (focus of this document)
• Description
  – Information technology designed/used to generate and access genetic data and derive information from it
• Key applications
  – Searching external genomic databases
  – Constructing and managing proprietary databases
  – Extracting information from data: gene expression in health and disease; gene function in health and disease
• Relevance for pharmacos
  – Pharmacos need to be able to effectively access external sources
  – Pharmacos need to create proprietary databases (derived from data from multiple sources) so that they can be tailored to the needs of the internal discovery function
  – Targets need to be identified and their role defined leveraging genetic data

Cheminformatics
• Description
  – Information technology used to design molecular libraries to interact with identified targets
• Key applications
  – Molecular structure design
  – Structure-activity relationships
  – Molecular library management/manipulation
• Relevance for pharmacos
  – IT solutions required to manage the increasing scale of molecule generation with discovery process
  – Predominantly addressed within pharmacos although may require leveraging links with partners and across

. . . and currently assists in leveraging genetic data.
As the arena develops, the current boundaries with cheminformatics are likely to blur

INFORMATICS USED IN DRUG DISCOVERY

Stages of the process
• Define target (gene sequence, protein sequence, function/active site, target role)
• Identify lead compounds which interact with targets
• Optimise leads

Activity (running from bioinformatics to cheminformatics)
• Trawl genomic databases for genes of interest
• Define amino acid sequence derived from gene
• Determine structure of protein and how it folds into active molecule
• Define protein activity
• Define likely molecule structures to interact with target
• Trawl molecular databases for likely activity against target
• Refine search/development of lead compound
• Links with combinatorial chemistry and high throughput screening

“Analysing sequence data is just the starting point for bioinformatics – the key step will be relating that data back to protein structure”
J. Craig Venter, TIGR

Many significant hurdles remain before the value of bioinformatics can be fully exploited

KEY TECHNOLOGICAL AND SCIENTIFIC HURDLES/CHALLENGES

Key challenges for bioinformatics, with why each is important (size of challenge rated High, Medium or Low on the original slide)

Technology/IT-based
• Developing tools which improve human-computer interface
  – Lower the barriers for effective use of computers by multiple disciplines to access databases and translate data into user-friendly format
• Allowing disparate systems to interface with genomic databases
  – IT architectural differences constrain access to databases
• Developing industry standard low cost infrastructure to access databases (internal and external)
  – Developing proprietary infrastructure is often too costly/time consuming
• Increasing the efficiency and effectiveness of database mining
  – Current database data mining generates vast quantities of data irrelevant to search criteria

Science-based
• Predicting tertiary protein structure
  – Most drug targets are proteins
  – Need to have clear understanding of role of protein to drug design
  – Currently carried out using laborious methodologies of moderate efficacy, e.g., X-ray crystallography
• Predicting protein function
  – Currently expensive and time consuming
  – Experimental methodologies being developed, e.g., gene knockouts, although time consuming and ill-developed
• Understanding how genes and proteins are expressed/modified in vivo
  – Currently unknown
  – Gene structure predicted from genetic sequences may not reflect the gene expressed in vivo, and proteins can be modified into alternative structures with differing function to those predicted

Source: interviews, press search

The current activity is only a small part of what bioinformatics (integrating with the other emerging technologies) could contribute to the way we understand and treat disease

BIOINFORMATICS HIERARCHY OF POSSIBILITIES

Levels
• In silico research
• Increasing the productivity of discovery
• Increasing the effectiveness and efficiency of genomic database mining

Examples and status
• Replacing animal-based ‘wet biology’ with computer-based predictive models: very preliminary
• Replacing crystallography to determine protein structure with predictive models: in early development
• Generating better information about disease and patient populations allowing better targeting of clinical trials: in medium stage development
• Mining genetic databases from normal and diseased populations to elucidate gene function: ongoing
• Increasing the effectiveness/efficiency of generating information from genomic data: ongoing

Source: interviews, articles

To date, bioinformatics has developed symbiotically with Genomics.
It is now emerging as a field in its own right

BIOINFORMATICS INDUSTRY EVOLUTION: A DESCRIPTION

Academia-driven
• Multiple academic groups leveraging existing IT competencies to develop insights into identification and role of genes

Genomics-driven
• As genomic data becomes easier to generate, genomic companies (positional cloners and sequencers) develop IT systems to facilitate access to genome sequences
• Key differentiating factor is scope and scale of gene databases to which clients have access

Gene function-driven
• Focus of effort becomes role and function of genes, and in particular, gene products. Organisations develop IT skills to push the boundaries of knowledge (e.g., predicting protein structure-activity relationships)
• Key differentiating factor is becoming a company’s ability to provide bioinformatic solutions to extract ‘information’ from genes

Source: Team interviews

BIOINFORMATICS INDUSTRY EVOLUTION: KEY MILESTONES

Phases: ACADEMIA-DRIVEN, GENOMICS-DRIVEN, GENE FUNCTION-DRIVEN

Key Scientific Milestones
• 1972 First DNA cloning (Boyer & Cohen)
• 1977 Chemical method for sequencing DNA devised (Gilbert & Maxam)
• 1981 First 579 human genes mapped
• 1983 Method for automated DNA sequencing (Caruthers & Hood)
• 1983 Huntington’s disease gene demonstrated to be on chromosome 4 (Gusella)
• 1991 Expressed sequence tags (ESTs) created (Venter)
• 1992 First genetic linkage map of entire human genome published, and first whole human chromosome physical maps (Y and 21)

Genomics/Bioinformatics Industry Activity
• 1977 First genetic engineering company, Genentech, founded
• 1982 GenBank established
• 1988 Human Genome Organisation (HUGO) founded
• 1990 Human Genome Project launched
• 1993 Incyte goes public, the first of many U.S. genomic companies to do so
• 1996/7 Genomic industry broadening value proposition
• 1997 Emergence of bioinformatic players with no genomic heritage, e.g.,
  – Pangea
  – MAG
  – NetGenics

SUMMARY

TECHNOLOGIES
Three broad enabling technologies are driving progress in drug R&D:
• Genomics leads to better disease understanding and target identification
• Combinatorial chemistry generates more lead compounds
• High throughput screening tests more leads on a greater number of targets

BIOINFORMATICS
The explosion of data and the increasing demand for sophisticated analytical tools have given rise to a rapidly growing bioinformatics market with three major service areas:
• Database providers who generate and organize genome and discovery data
• Discovery software providers who provide cutting-edge IT solutions to elements of the discovery process
• Research enterprise ASPs who integrate multiple databases and analysis tools into a single platform

THREE BROAD TECHNOLOGIES ARE DRIVING DRUG DISCOVERY

GENOMICS (supported by BIOINFORMATICS)
• Study of both structural and functional aspects of the genome, including both genes and proteins, leading to a greater understanding of cellular processes and disease

CATALYTIC/COMBINATORIAL CHEMISTRY
• Rapid and systematic generation of a variety of molecular entities, or building blocks, in many different or unique combinations

HIGH THROUGHPUT SCREENING
• Use of robotic automation to allow for massive parallel experimentation and testing of many compounds or targets

MANY SPECIFIC EMERGING TECHNOLOGIES HAVE LED TO THE ADVANCES IN GENOMICS, COMBINATORIAL CHEMISTRY AND HIGH THROUGHPUT SCREENING

[Diagram: emerging technologies arranged around GENOMICS, CATALYTIC/COMBINATORIAL CHEMISTRY and HIGH THROUGHPUT SCREENING]
• Synthetic biopolymers
• Biochemical drug delivery and encapsulation systems
• Antisense
• Transgenics
• Gene therapy
• Pathway mapping
• Surrogate markers
• Animal-free disease models
• Genetic networks
• Intelligent chemical systems
• HT DNA sequencing
• HT proteomics
• Biochip microarrays
• Pharmacogenomics
• Biosensors
• Lab automation
• Micromachines/miniaturization
• CC library arrays
• Chemical chips
• Advanced biophysical assays

Note: HT = High Throughput; CC = Combinatorial Chemistry

BIOINFORMATICS IS THE “BRAINS OF BIOTECHNOLOGY”

In order for Genomics, HTS, and combinatorial chemistry to have impact, they must increasingly rely on bioinformatic capabilities.
BIOINFORMATICS: “BROAD SCIENCE THAT INVOLVES BOTH CONCEPTUAL AND PRACTICAL TOOLS FOR THE UNDERSTANDING, GENERATION, PROCESSING, AND PROPAGATION OF BIOLOGICAL INFORMATION” (1)

[Diagram: GENOMICS, CATALYTIC/COMBINATORIAL CHEMISTRY and HIGH THROUGHPUT SCREENING, supported by BIOINFORMATICS]

(1) Science, “Bioinformatics in the Information Age”, April 2000; 287; 1221
Source: “Brains of Biotechnology” is from Karl Thiel, Biospace.com

NEW TECHNOLOGIES ARE DRIVING THE NEED FOR BIOINFORMATICS DATA AND ANALYSIS CAPABILITIES

NEW TECHNOLOGIES
• Genomics
• Catalytic/combinatorial chemistry
• High throughput screening

EXPLOSION OF DATA (demand for different types of databases)
• Gene (DNA) sequences
• Protein sequences
• SNP mapping and disease mapping
• Gene expression profiles by tissue, species, and drug influence
• Protein expression profiles
• Protein:protein interaction profiles
• Protein structure information
• CC libraries
• Screening activity data (SAR)
• Toxicology databases

ANALYSIS TOOLS (demand for discovery software)
• Sequence alignment searches (BLAST)
• Relational alignment programs (phylogeny)
• Virtual lab processes software (PCR, elongation)
• Protein folding algorithms
• Structure-based target design using virtual SAR modeling
• Virtual CC generation and screening
• ADME and toxicology profiling software

The global market for bioinformatics is expected to show significant growth over the next five years. However, the state of infancy of the industry poses credibility issues on the estimates from research houses

[Chart: Growth in Global Bioinformatics, 1998 to 2003. Values shown: $300m, $5Bln, $10-20Bln]
• Numbers likely to include
  – Software solutions
  – Automation tools
  – “Hardware” such as microarrays

The current biotechnology market in India is focused on the AgBio, Industrial and Vaccine sectors, but will see emerging opportunities in Bioinformatics and vaccines

Growth in Indian Biotechnology
• 1998 (100% = $500m): Industrial 22%, Ag Products 25%, Health Products 47%
• 2003 (100% = ??): adds Bioinformatics, Genome Technologies, Vaccines
• The future will witness additional opportunities in Informatics and related genome-based technologies

Source: Biosupportinida

Projected growth in Pharma and Biotech R&D spending will enable the industry to attain its projected targets

[Chart: Pharmaceutical R&D Budgets, 100% = 100Bln. Categories: Discovery, PreClinical, Clinical CMC, Clinical Trials, Production/manufacturing. Values shown: $46B, $20B, $13B, $13B, $7B]
• Typical IT budgets will be 10-20% of total R&D

Source: PhRMA

The Lehman Report consisted of interviews with decision makers in Pharma and Biotech and highlighted some interesting findings

Summary of Findings
• New Biology will significantly increase R&D costs, a large chunk of which are technology driven
  – Companies will see substantial pressure on earnings
  – Attempts to use “today's relatively immature technology” will result in higher failure rates amongst “novel” targets. These failures will likely also stretch out the time period for the arrival of new drugs that Genomics promises
• High risk of “novel target failure”
  – Less understood (only 8 publications per novel target vs. 100 for those generated by conventional methods)
  – Companies pushing these less understood targets through the drug pipeline
  – Traditional chemical technologies will not be sufficient to identify novel chemical entities that can interact with a target; this could have adverse outcomes during the clinical trial process

Source: & Company

Genomics influenced increase in R&D Costs: assuming no increase in technology
[Chart: Total Annual R&D Budget, NCEs, and Annual R&D Budget/NCE output, 1995 to 2010. Annotation: more than doubles from current]

Genomics influenced increase in R&D Costs: assuming moderate increase in technology
[Chart: Total Annual R&D Budget, NCEs, and Annual R&D Budget/NCE output, 1995 to 2010. Annotation: promise of productivity expansion]

Most technologies are likely to make an impact only 5-10 years from today

5-10 FROM IMPACT: Integrated Technologies*
[Chart: value (0 to 7) against time, 2000 to 2015]
• Seq Human Genome
• Map out human genome
• Map out human proteome
• Map out biological pathways
• Delineate disease mechanisms

“We are still years away from the real impact of Genomics technology. Most of them have just got started” -Biotech Executive

*Integrated technologies include both experimental and informatics approach

Most technologies are likely to make an impact only 5-10 years from today

5-10 FROM IMPACT: Protein Chips
[Chart: value (0 to 6) against time, 2000 to 2015]
• Identify some cellular proteins
• Identify differential expression
• Identify key post-translational modifications
• Profile complex diseases

“It will be a few years before we have a protein chip that is cheap, fast and accurate” -Biotech Executive

“Proteomics will be a big help with target validation.
However, we still need to increase speed and improve productivity” -Pharma R&D executive

Most technologies are likely to make an impact only 5-10 years from today

5-10 FROM IMPACT: Bioinformatics data mining
[Chart: value (0 to 7) against time, 2000 to 2015]
• Basic protein structure homology queries
• Assign single function based on functional Genomics data
• Correlate expression data and protein interaction data
• Correlate gene/protein expression data with function

“Most of the data mining algorithms are pretty primitive and straightforward today” -Biotech Executive

“We are facing more explosive data produced by Genomics technologies. Unfortunately, the informatics tools are still not there to allow us to explore them fully” -Pharma R&D executive

Large investments are necessary to reap the benefits of technology

THRESHOLD LEVEL OF INVESTMENT NECESSARY

Threshold annual expenditures
• INFORMATICS: $20-40m
• TARGET VALIDATION: $20-40m
• LEAD OPTIMIZATION: $10-20
• EXPLORATORY DEVELOPMENT: $20-30m

Key means/technologies to achieve impact
• Informatics: Bioinformatics, Chemoinformatics, Clinical Informatics at bottleneck
• Target validation: Functional Genomics tools, Database subscriptions
• Lead optimization and exploratory development: Closed loop chemistry, Process improvements, Pharmacogenomics, ADME, Computer aided trial design, HTS

There are three broad organizational models emerging

BIOINFORMATICS PRODUCT/SERVICE MODELS

Gene Database Designer
• Provide user friendly access to proprietary and public gene databases compatible with customer IT architecture
• Requires bioinformatic and genomic competencies/assets
• Assumes customer does not need to develop significant in-house capabilities

Discovery Services Provider
• Conduct discrete stages of discovery process
• Requires broad informatic and drug discovery capabilities
• Value proposition built on superior informatic capabilities

IT Architects
• Provide off-the-shelf/bespoke informatic solutions
• Requires leading edge bioinformatic capabilities
• Assumes customer has in-house skills and competencies to be able to leverage and manipulate genetic data

“The trouble is that bioinformatics is so new, and the market so ill-defined, that companies are having difficulty settling on the business model they will follow”
In Vivo

Source: ; press search

THESE DEMANDS FOR BIOINFORMATICS ARE ADDRESSED BY THREE MAJOR SERVICE MODELS...

[Chart axes: DATABASE FOCUS (narrow to integrated) against ANALYTICAL CAPABILITIES (simple to complex)]

RESEARCH ENTERPRISE ASPs
• Provide user friendly interface that can access both off-the-shelf bioinformatics software and more sophisticated IT solutions
• Require extensive IT capabilities

DATABASE PROVIDERS
• Provide access to proprietary and public databases, e.g., gene and protein sequences
• Require data acquisition assets (e.g., Genomics heritage) along with solid bioinformatics capabilities

DISCOVERY SOFTWARE PROVIDERS
• Provide cutting-edge computational solutions to discrete components of the discovery process
• Require extensive expertise in drug discovery and bioinformatics capabilities

Source: analysis

...AND MANY PLAYERS HAVE ADOPTED EACH SERVICE MODEL

[Chart: players positioned by DATABASE FOCUS (narrow to integrated) and ANALYTICAL CAPABILITIES (simple to complex), grouped as RESEARCH ENTERPRISE ASPs, DATABASE PROVIDERS and DISCOVERY SOFTWARE]
Players shown: eBioinformatics, DoubleTwist, NetGenics, Base4, Viaken, Bioreason, Strand, Celera Genomics, Incyte, Structural GenomiX, Compugen, Tripos, Hyseq, Molecular Simulations, Spotfire

Source: analysis; company websites

It is not yet clear which, if any, of the current approaches will prove sustainable

CORE BELIEFS AND CHALLENGES FOR EACH BUSINESS MODEL

Gene Database Designer
• Core beliefs
  – Databases sufficiently fragmented, thus rendering it inefficient for pharmacos to ‘go it alone’
  – Ability to remain ahead of pharmacos vis-a-vis technological innovation
  – Genomic heritage a prerequisite for success
• Issues
  – Multiple public databases challenging role of proprietary databases
  – Pharmacos are developing skills to create bespoke databases in-house
  – Real risk that skill could become a commodity (e.g., cost of sequencing a bacterial genome fell from $12m to $0.5m in 1997)

IT Architects
• Core beliefs
  – IT skills are the defining basis of competition, not knowledge of Genomics
  – IT solution will not emerge from existing pharmaco IT suppliers
  – Ability to remain ahead of other entities vis-a-vis technological innovation
• Issues
  – Unclear who are the natural owners/developers (“several pharmacos have thought about this longer than we have . . . we need to stay on the cutting edge” VP S&M, Molecular Applications Group)
  – Clear potential for non-pharma IT players to enter market
  – Potential commoditisation of services

Discovery Services Provider
• Core beliefs
  – Pharmacos will increasingly seek discovery-oriented solutions requiring broader skill set (increasing proportion of research investments are external)
  – Value creating in longer term as provides a base for full integration
• Issues
  – Not clear under which conditions pharmacos will outsource discovery functions
  – Issues of skills, critical mass and focus present real challenges to companies developing from a Genomics/IT heritage

Source: Team interviews; articles

The traditional genomic companies are polarising into two categories: those that design databases, and those that are broadening their value proposition to encompass ‘discovery’ offerings. The new breed of bioinformatic companies are establishing themselves in a third category – IT architects

CATEGORISING TODAY’S BIOINFORMATICS COMPANIES

Product services providers
• Gene database designers: building and distributing annotated gene databases and services from public and private sources
• IT architects: building IT systems to enable the sequencing, synthesis and access of genomic data
• Discovery Services Provider: conducting discrete stages of the drug discovery process using proprietary systems and knowledge

Gene database designers
• Alphagene
• Digital Gene Technologies Inc.
• Genome Therapeutics Corp.
• Human Genome Sciences Inc.
• Hyseq
• Incyte Pharmaceuticals
• Myriad Genetics
• Sequana Therapeutics

IT architects
• Base 4 bioinformatics
• Genecodes
• GeneTrace Systems
• Genomica Corp
• Informax Inc.
• MDL Information Systems Inc.
• Molecular Applications Group
• Molecular informatics Inc.
• Netgenics
• Oncormed
• Oxford Molecular
• Pangea Systems Inc.
• PE Applied Biosystems

Discovery Services Provider
• Acacia Biosciences
• Affymetrix
• Ariad Pharmaceuticals
• Chiroscience (acq. Darwin Molecular)
• Exelixis Pharmaceuticals Inc.
• Genelogic
• Genetech
• Millennium
• Mitokor
• Ontogeny
• Pharmagene
• Progenitor
• Structural Bioinformatics Inc.
• Xenometrix

Source: Annual reports; text lines; interviews; team analysis

MOST OF THE LATEST R&D TECHNOLOGIES WERE DEVELOPED OUTSIDE BIG PHARMA
• Genomics
• Cheminformatics
• Bioinformatics
• Transgenic animals
• High throughput screening
• Pharmacogenomics
• Combinatorial Chemistry
• Proteomics
• Molecular modelling
• Antisense

HT DNA Sequencing

Technology basics
• Typically, a sample of DNA is amplified using PCR* with specific fluorescent probes for A, G, T and C, separated by electrophoresis through automated technology, and the DNA sequence is analyzed
• For sequencing of both genomic DNA and expressed genes (cDNA)
• Supplements DNA mapping and positional or functional cloning

Competitive landscape
• Many players are involved in sequencing the genome, contributing to both proprietary and public databases:
  – Public: Human Genome Project
  – Human Genome Sciences
  – Incyte Genomics
  – Celera Genomics

[Chart: DNA SEQUENCING TECHNOLOGY, nucleotides/day. Old method (1990): 1000s; 2000: 1,000,000s**]

Status and current issues
• Entire human genome will be sequenced by end of 2001 (Celera appears to be leading the way)
  – All 3 billion nucleotides, on 23 pairs of chromosomes, composing about 100,000 genes
• Sequencing does not provide any insights about gene function, merely a blueprint for proteins
• Patents on genes or gene fragments (expressed sequence tags, or ESTs), without annotated function data, are not likely to be approved
• Viability of business model for companies only sequencing DNA is questionable. Most recognize the need to move towards functional Genomics and protein studies

* PCR refers to Polymerase Chain Reaction, a technique for amplifying specific sequences of DNA
** Celera’s shotgun approach and powerful computers can sequence 11,000,000 nucleotides per day

HT Proteomics

Technology basics
• Analysis of proteins and protein expression in diseased and normal states
• Proteomics deals with two areas:
  – protein sequence, expression, and modification analysis using techniques of protein separation, including 2-dimensional electrophoresis (2-DE) and protein chips, and identification, typically involving mass spectrometry
  – 3D structure analysis by X-ray crystallography and nuclear magnetic resonance (NMR), as well as complex computer modeling. These structures are useful for structure-based drug design

Status and current issues
• Protein function depends on 3-D structure and, at present, even the best computer software is not good at modeling protein structure

Competitive landscape
• Fewer companies are engaged in HT proteomics work than HT DNA sequencing
• Key players in HT proteomics:
  – Oxford GlycoSciences
  – Large Scale Proteomics Corp.
  – Proteome Inc.
  – Ciphergen Biosystems
• Players in 3D protein folding (mostly software) include:
  – Structural GenomiX, Inc
  – Structural Bioinformatics
  – Bio-IT Ltd.
• Understanding how proteins are modified after expression, especially in the presence of drugs and/or disease, will dramatically aid drug development

[Chart: PROTEIN ANALYSIS TECHNOLOGY, proteins analyzed/day. 1990: <1; 2000: 1000s; Prototypes*: 100,000s]

* Prototypes, which should be commercial within 2 years, involve high throughput separation techniques (HPLC) and advanced mass spectroscopy (MALDI-TOF)

Source: Science journals, popular press, public biotechnology reports

Biochip Microarrays

Technology basics
• Biochip microarrays are ordered sets of known molecules (DNA, proteins, etc.) attached to a solid support (silica, fibers, etc.) that allow for a vast number of parallel experiments in miniature
• DNA chips are made by either “building” short sequences of DNA on chips or by attaching pre-made oligonucleotides (short pieces of DNA) to the chip
• Expressed cDNA prepared from samples is then allowed to interact with the DNA on the chips and these interactions are detected
• This same principle can be applied with proteins and small molecules

Competitive landscape
• Current market for biochips is about $175 Million, and is dominated by Affymetrix; however, many new players are entering the market with alternative chip technologies:
  – Nanogen (electroactive chips)
  – Illumina (fiber optic bead-based)
  – Sequenom (“industrial Genomics” with mass spectroscopy)
  – Ciphergen (protein chips)
• Affymetrix business model: it nearly “gives away” a detection machine ($175,000) and then hopes to make money from the sale of its disposable GeneChips (razor blade approach)

Status and current issues
• As of today, chips with ~250,000 probes are commercially available; in the near future, probes representing entire genomes should be available
• “The use of DNA arrays to interrogate biological information represents a paradigm change that will profoundly alter biology and medicine”
  Dr. Leroy Hood, University of Washington
• Uses for biochip microarrays are exploding:
  – gene sequencing
  – polymorphism identification
  – genetic testing
  – gene expression profiling
  – toxicology analysis
  – forensics
  – immunoassays
  – proteomics
  – drug screening

Source: Literature, BioInsight

GENE CHIP MICROARRAYS ARE SMALL GRIDS CONTAINING PIECES OF DNA

Technology Basics
• Gene (or DNA) chips are grids
• Each square (feature) on the grid contains the same known repeating DNA sequence
  [Figure: one feature holding many copies of the sequence TTA]
• Different squares contain different sequences
  [Figure: grid of features, each holding a different three-base sequence, e.g., TTA, TTC, TTG, TTT, TAA, TAC, TAG, TAT]
• Add mixture of unknown fluorescently labeled probes to DNA chip
• Probes stick (hybridize) to squares that have a matching (complementary) sequence to the probe
• A laser reads out which squares the probes stick to
• Software makes the information intelligible

Because the DNA sequence is known at each location on the DNA chip, unknown probe sequences can be determined by monitoring where on the DNA chip these probes stick

BIOCHIPS CAN BE USED IN GENE EXPRESSION MONITORING AS A POWERFUL TOOL FOR IDENTIFYING KEY GENES INVOLVED IN OR AFFECTED BY DISEASE PROCESSES

Technology basics
• Compare readouts from chips exposed to healthy and diseased samples (probes)
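The chip read-out and the healthy-versus-diseased comparison described above can be sketched in a few lines of Python. This is only an illustrative toy, not how any vendor's software works: the 2x2 chip, the probe lists and the function names (`reverse_complement`, `read_chip`, `differing_squares`) are all hypothetical, probes are assumed to stick only to an exactly complementary feature, and a simple probe count stands in for the fluorescence intensity a laser scanner would measure.

```python
from collections import Counter

# Watson-Crick base pairing, used to build the complementary strand.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the strand that would pair with `seq`."""
    return seq.translate(_COMPLEMENT)[::-1]


def read_chip(chip, probes):
    """Count how many labeled probes stick to each square of the chip.

    Simplifying assumption: a probe sticks only to the square whose known
    sequence is its exact reverse complement (no partial matches).
    """
    square_for = {seq: square for square, seq in chip.items()}
    signal = Counter()
    for probe in probes:
        square = square_for.get(reverse_complement(probe))
        if square is not None:
            signal[square] += 1
    return signal


def differing_squares(chip, healthy_probes, diseased_probes):
    """Squares whose signal differs between the two samples."""
    healthy = read_chip(chip, healthy_probes)
    diseased = read_chip(chip, diseased_probes)
    return sorted(sq for sq in chip if healthy[sq] != diseased[sq])


# Hypothetical 2x2 chip: each (row, column) square holds one known sequence.
chip = {(0, 0): "TTA", (0, 1): "TTC", (1, 0): "TTG", (1, 1): "TTT"}
healthy_probes = ["TAA", "GAA"]          # made-up probe sets
diseased_probes = ["TAA", "CAA", "CAA"]

changed = differing_squares(chip, healthy_probes, diseased_probes)
print(changed)  # [(0, 1), (1, 0)]
```

Because every square's sequence is known in advance, the squares that differ map directly back to candidate sequences, which is the point the slide makes about monitoring where probes stick.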
DNA chip RNA Diseased Tissue Cell Probes DNA chip Find healthy and diseased individuals Isolate healthy and diseased tissues Isolate RNA from each sample (RNA tells us which genes are turned on) Make fluorescently labeled probes from RNA (probes are pieces of DNA that represent genes which are turned on) Expose DNA chips (which have thousands of known genes on them) to probes – probes will only stick to DNA chips in certain locations (see next page) • Differences (dashed boxes) indicate genes that may be involved in the disease process • Gene products (proteins) from these genes may serve as good disease targets, therapeutics, or markers 35 THE COMPETITIVE LANDSCAPE FOR BIOCHIP ARRAYS IS HEATING UP AS THE TECHNOLOGY RAPIDLY EVOLVES Competitive Landscape Five example companies and their technologies Affymetrix • Disposable GeneChip array has oligos* attached to it by photolithography • Early leader in biochip development • Oligos are bound by fluorescent probes Nanogen Illumina • Pre-made oligos are bound to reuseable semiconductor chip • Electroactive spots on chip direct and move attached oligos, which interact with fluorescent probes • Oligos (or drugs, proteins) are attached to microbeads, which self-assemble onto the tips of fibers in an optical fiber bundled microarray • Analyzed by fluorescence with fiber optics Sequenom • MassArray chips have oligos attached to them • Analysis by laser-ionization and mass spectroscopy • Called “industrial Genomics” CipherGen • ProteinChip array has defined proteins (like antibodies) bound to it which interact with ligands in the sample • Analyzed by laser-ionization and mass spectroscopy Uses of biochip microarrays continues to expolode : • Gene sequencing • Polymorphism identification • Genetic testing (genotyping) • Gene expression profiling • Toxicology analysis • Forensics • Immunoassays • Proteomics • Drug screening • Many others Over 75 public and private Biotech firms make biochip technology * Oligos are 
oligonucleotides, or short (25 bases) sequences of DNA

Sources: Press reviews, scientific journals, company reports

WHILE AFFYMETRIX HAS DOMINATED THE BIOCHIP MARKETPLACE, STRONG COMPETITION FROM NEW BIOCHIP TECHNOLOGIES WILL LIKELY FRAGMENT THE SECTOR FURTHER

Competitive Landscape

Market share, percent, 1999 (1999 market: ~$176 Million)
[Pie chart. Segment labels: Affymetrix, Homemade, Ciphergen, ACLARA, Caliper, Incyte, Phase-1, Other. Values as extracted, not matched to labels: 11%, 43%, 24%, 2%, 2%, 6%, 9%, 3%]

Trends in competitive landscape
• New biochip technology players will cut into Affymetrix’s market share
• Biochip market expected to grow to ~$1 Billion by 2005
• Use of homemade chips will likely decrease as the complexity and versatility of commercial chips increase
• The market for hardware and bioinformatic software for chip detection and data collection/analysis will also explode

Source: Literature; BioInsights

Pharmacogenomics

Technology basics
• Every individual has a distinct set of “polymorphisms”, or gene variants. These variants could lead to enhanced or diminished responses to therapy.
• Pharmacogenomics applies genetic testing techniques to identify the variants that are predictive of a patient’s response to a therapeutic agent
• Pharmacogenomics can be used to:
– increase the likelihood of a drug’s success in the clinic by identifying patients who are more likely to respond to the drug
– rescue drugs that failed or were taken off the market for safety concerns by identifying safe patient populations

Competitive landscape
• Key players include:
– Genset is working on a map of SNPs for clinical testing (with Abbott Labs)
– Other companies include:
· Affymetrix
· Celera
· GeneLogic
· Incyte
· LJL Biosystems
· Lynx Therapeutics
· Millennium Predictive Medicine

Status and current issues
• Pharma community is in agreement that pharmacogenomics is important, but its effects are uncertain:
– “The FDA has asked us (senior pharma people) to come in and discuss pharmacogenomic testing with them” B.
Michael Silber, Director of Clinical Diagnostics, Pfizer
– “Rescuing drugs has the potential to absolutely take off, or it might not” Greg Miller, Head of Molecular Profiling, Genzyme

Lab Automation

Technology basics
• With the explosion of compounds from combinatorial chemistry and the accelerated identification of gene targets from Genomics, the ability to analyze and screen compounds becomes a critical rate-limiting step. Highly automated lab technologies have therefore developed in four major areas:
– Microplate readers and equipment
– Liquid handling, manipulating, and dispensing devices
– Robotics
– Software to control the process

Competitive landscape
• Key players include:
– Robotics: LJL Biosystems, Robocon, Zymark
– Microplate: Perkin Elmer, Molecular Devices, Dynex
– Liquid: Beckman Coulter, Gilson
– Software: Oxford Molecular Group, Tripos, MSI, MDL Information Systems

Market breakdown by sector, 1998 (total market $1.1 Billion)
[Pie chart. Segment labels: Robotics and Software; Microplate-related equipment; Liquid Handling/Manipulating/Dispensing (dispensers, workstations, organic synthesizers, solid-phase extraction devices). Values as extracted: 16%, 41%, 43%]

Lab Automation market (WW), $ Millions
[Bar chart: 1993: 577; 1998: 1,100; 2003E: 2,100; 13% CAGR]

Status and current issues
• Likely to see high growth in the next few years as lab automation increases
• Miniaturization will lead to lower reagent costs; likely value shift to equipment and software
• Huge need for quality bioinformatics software that is capable of data acquisition/collection as well as data analysis and storage

Source: Literature, Genetic Engineering News

Database suppliers/designers

Technology basics
• Provides remote access to their proprietary database, as well as public ones, typically using an internet or intranet platform

Competitive landscape
• Key players include:

Status and current issues
• Multiple public databases, like GenBank, are challenging the role and importance of proprietary databases in many areas (especially Genomics).
Technology basics (continued)
• Data acquisition skills (e.g., DNA sequencing heritage) are a prerequisite for success in this segment
• Generally, three main revenue models:
– Subscription-based access
– Royalties-based and shared risk
– Fee-for-service

Competitive landscape: key players (continued)
– Celera Genomics (subscriptions to gene database, ESTs)
– Incyte Genomics (online “Incyte 2.0”: LifeSeq and LifeExpress databases)
– Human Genome Sciences (exclusive databases for Human Gene Consortium)
– GeneLogic (Expression databases)
– AlphaGene (DNA)
– Hyseq (GeneSolutions.com provides access to proprietary data)
– Myriad Genetics (ProNet, a protein:protein interaction database)
– Sequana
– Genset (SNPs database)
– Orchid Biosciences (SNPs)
– Oxford GlycoSciences (LifeExpress with Incyte)

Status and current issues (continued)
• Many pharmacos/biotechs are developing their own bioinformatics skills to handle databases in-house
• Large risk that gene data acquisition skills could become a commodity (and therefore limit the value of proprietary databases), e.g., cost of sequencing a bacterial genome fell from $12m to $0.5m by 1997
• Belief that current databases are fragmented and inefficient, leading many pharmaco/biotech firms to outsource database management

Source: Literature; press releases

Discovery Software Providers

Technology basics
• Provides cutting-edge informatics solutions to discrete components of the discovery process, e.g., protein folding or combinatorial chemistry (CC) library selection and screening
• Drug discovery process is increasingly seeking more sophisticated IT solutions/software that require a specialized skill set
• Requires deep expertise in drug discovery as well as leading edge bioinformatics/IT capabilities
• Simply put, these are drug discovery tool kit companies

Status and current issues
• Not clear which activities will be outsourced and which will be developed in-house

Competitive landscape
• Key players include:
– Structural Bioinformatics, Inc.
(structure-based target ID using sophisticated protein structure modeling and database)
– Tripos (offers several discovery tools, including FlexX, a virtual CC library software)
– Molecular Simulations, Inc. (Pharmacopeia subsidiary; software simulates molecular interactions of drugs, proteins)
– Compugen’s LabOnWeb.com (aimed at early gene sequence PCR work)
– Bioreason (chemical entity analysis programs)
– Spotfire (decision analytic software aimed at researcher productivity)
– Molecular Mining Corp.

Status and current issues (continued)
• Critical mass, skills, and focus are important issues for firms developing from a data acquisition heritage.
• Value proposition must include superior IT tools.

Source: Literature; press releases

Research Enterprise ASPs

Technology basics
• Offer ASP platforms that integrate broad databases and sophisticated IT applications, coupled with “research portal” functionality.
• Provides a user-friendly interface that offers a suite of off-the-shelf bioinformatics solutions, enabling users to access a broad range of applications for data and analysis
• Requires leading edge IT capabilities, but does not rely on any specific drug discovery or data acquisition knowledge.

Competitive landscape
• Key players include:
– DoubleTwist (leader in the research enterprise ASP space; formerly Pangea)
– eBioinformatics
– Base4 (collaborative knowledge and project management platform with database handling applications)
– NetGenics (subscription ASP distributing computing platform with broad discovery applications)
– Genomica (Discovery Manager software suite)
– Viaken (a premier life science ASP for database hosting and analytic software)

Status and current issues
• Clear potential entry point for non-life-science IT players
• Potential threat of commoditization of services
• Unclear who are the natural owners of this space

Source: Literature; press releases
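
Returning to the gene chip slides earlier in this deck: the readout idea (each grid position holds a known sequence, so where a probe sticks reveals what the probe is) and the healthy versus diseased comparison can be illustrated with a small Python sketch. Everything here is an assumption made for illustration, not part of the original slides: the function names, the toy 2x2 chip, the made-up probe mixtures, the rule that a probe sticks only to an exact reverse-complement feature (real hybridization tolerates mismatches and depends on conditions), and the hypothetical `min_ratio` threshold.

```python
from collections import Counter

# Watson-Crick base pairing, used to decide whether a probe sticks to a feature.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the strand that would pair with `seq`."""
    return seq.translate(COMPLEMENT)[::-1]


def read_chip(chip: dict, probes: list) -> Counter:
    """Count, per grid position, how many probes hybridize there.

    `chip` maps (row, col) -> known feature sequence. Simplifying assumption:
    a probe sticks only if it is the exact reverse complement of the feature.
    Assumes every feature sequence on the chip is distinct.
    """
    # Index each position by the probe sequence that would bind it.
    binds_at = {reverse_complement(seq): pos for pos, seq in chip.items()}
    signal = Counter()
    for probe in probes:
        pos = binds_at.get(probe)
        if pos is not None:
            signal[pos] += 1
    return signal


def differing_features(healthy: Counter, diseased: Counter,
                       min_ratio: float = 2.0) -> list:
    """Return positions whose signal differs by at least `min_ratio`."""
    flagged = []
    for pos in sorted(set(healthy) | set(diseased)):
        h = healthy[pos] + 1  # +1 avoids division by zero for silent features
        d = diseased[pos] + 1
        if max(h, d) / min(h, d) >= min_ratio:
            flagged.append(pos)
    return flagged


# Toy 2x2 chip; sequences and probe mixtures are invented for illustration.
chip = {(0, 0): "TTA", (0, 1): "TTC", (1, 0): "GAT", (1, 1): "CCG"}
healthy_probes = ["TAA", "TAA", "GAA", "ATC"]
diseased_probes = ["TAA", "TAA", "GAA", "CGG", "CGG", "CGG"]

healthy_signal = read_chip(chip, healthy_probes)
diseased_signal = read_chip(chip, diseased_probes)
flagged = differing_features(healthy_signal, diseased_signal)
print(flagged)
```

In this toy run the positions flagged are the ones where the two samples light up differently, which corresponds to the "dashed boxes" on the gene expression slide. A real analysis would work with measured fluorescence intensities and a proper normalization step in place of raw probe counts.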