Information System for Comparative Analysis of Legume Genomes
Anita Dalwani
Advisors: Dr. Roger Innes, Dr. Haixu Tang

Layout
• Motivation
• Participants
• Background
• Design
• Results/Demo
• Future Work

Motivation
• Goal of the legume genome project
  - Investigate the process of genome restructuring following polyploidization in plants (soybean and its relatives in the genus Glycine)
  - Try to answer questions about:
    - Genome evolution on both short (<100,000 yrs) and long (>50 million yrs) time scales
    - Evolution of disease resistance (R) genes
• To answer these questions:
  - 1 Mbp syntenic genomic regions from six taxa, as well as their duplicated regions in the polyploid members (12 such regions in total), will be sequenced and analyzed.
  - These regions contain several important disease resistance (R) genes.

  Plant species and accession      No. of regions to be analysed   Whole genome size (megabases)
  G. max cultivar Williams 82      2                               1103
  G. max PI 96983                  2                               1103
  G. tomentella G1188 (2n=80)      4                               2083
  G. tomentella race D3 (2n=40)    2                               1103
  Teramnus labialis                1                               < 700
  Medicago truncatula              1                               466

• Information System
  - central repository for the data
  - stores and retrieves updated information
  - bioinformatics and visualization tools

Participants

  Participant           Lab / University                              Role
  Roger Innes           Innes Lab, Indiana University, Bloomington    Principal Investigator
  Tom Ashfield          Innes Lab, Indiana University, Bloomington    R gene evolution
  Anita Dalwani         Innes Lab, Indiana University, Bloomington    Database development, Web application
  Murali Mohan          Innes Lab, Indiana University, Bloomington    Database development
  Nevin Young           Young Lab, University of Minnesota            Co-PI
  Steve Cannon          Young Lab, University of Minnesota            Phylogenetic; R genes; comparative genomics
  Roxanne Denny         Young Lab, University of Minnesota            Lab Manager
  Jeff Doyle            Doyle Lab, Cornell University                 Co-PI
  Bernard Pfeil         Doyle Lab, Cornell University                 Phylogenetic and polyploidy
  Bruce Roe             Roe Lab, Oklahoma University                  Co-PI
  Majesta Siegfried     Roe Lab, Oklahoma University                  BAC sequencing
  Saghai Maroof         Maroof Lab, Virginia Tech                     Co-PI
  Milind Ratnaparkhe    Maroof Lab, Virginia Tech                     R genes; comparative genomics
  Jafar Mammado         Maroof Lab, Virginia Tech                     R genes; comparative genomics

Background
• Procedure
1. Create and make available Bacterial Artificial Chromosome (BAC) libraries of each species.
   - Index the available BACs, BAC end sequences, libraries, probes, vectors, and gel images.
2. Assemble syntenic BAC contigs from each library.
   i. Strategically chosen soybean clones are used as probes.
      [Figure: soybean probes (Probe 53, Probe 21, Probe 9, Probe 26, Probe 1, Probe 3) shown with example sequence fragments]
   ii. Individual probes are hybridized to high-density BAC filters representing all the target genomes.
   iii. Integrity of contigs is confirmed by fingerprinting.
   iv. Clones that hybridize to two or more probes are selected.
   v. BACs representing the tentative minimum tiling path will be end sequenced.
      [Figure: BACs (Bac1 to Bac8) placed against Probe 53, Probe 21, Probe 9, Probe 26, Probe 1, and Probe 3; a second panel shows only Bac2, Bac3, Bac4, and Bac8, with example sequence fragments]
3. DNA sequencing, assembly, and annotation.
4. Compare the content, order, and sequence of genes.
5. Results are made available to the public.

Importance
• Information System
  - Centrally available data
  - User-friendly interface for retrieving the information
  - Updated progress information
  - Tools for interpreting the results
  - Works as a Laboratory Management Information System

Design
• Steps for designing the Information System
1. Design the database
   - Data: BACs, BES, probes, libraries, vectors, library screen hits, etc.
   - Visualize the relationships among this large amount of data.
     For example, the Library table stores detailed information about each library used, rather than having each BAC store the library information.
   - Tables were created based on these relationships. The main tables used in the database are:
       BAC                  BES
       GEL IMAGES           GENOMIC SOUTHERNS
       GENOTYPE             LIBRARY
       LIBRARY SCREENS      LIBRARY SCREEN HITS
       PRIMER               PROBE
       PROBE WITHIN BACS    VECTOR
     [Figure: entity-relationship diagram relating PROBE, LIBRARY, BAC, PRIMER, and BES through "Has", "Derived from", and "is a" relationships]
     [Figure: Library Screen "Has" Library Screen Hits]
2. Populate the database with an initial set of data
   - The initial data set was stored as MS Excel spreadsheets.
   - A Perl script was used to parse the information.

• Web Database Application
  - understanding the needs of the project
  - Web database interface
    - displays information about the project
    - add and update interface
    - tools for analyses

• For determining the tiling path
  - Designing a visualization tool
    - displays the locations of the clones with respect to probes
    - probes are strategically chosen from soybean genomes
  - Input: library name
  - The subset of probes with at least one hit in the library is selected.
  - The BAC clones of the library that have hits with these probes are retrieved.
  - Probes are arranged in order of their position.
  - BACs are mapped to these probes.

• System Specifications
  - Database: Oracle 9i
  - Languages: PHP, Perl, HTML, JavaScript
  - Web Server: Apache 1.3.29
  - Platform: Unix (SunOS 5.9)

Results

Future Work
• Comparative physical mapping
• Bioinformatics tools
• Public interface

Acknowledgements
• Dr. Roger Innes
• Dr. Haixu Tang
• Dr. Sun Kim
• Legume genome project team

References
• Innes, Roger W. Comparative Analysis of Legume Genome Evolution. Proposal submitted to the National Science Foundation.
• Tang, Haixu. Comparative physical mapping: ordering clones by cross-species hybridization. Dec 2004.
• www.bio.indiana.edu/~nsflegume
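
Appendix: sketch of the tiling-path layout steps

The steps listed under Design for the tiling-path visualization tool (take a library name, select the probes with at least one hit, collect the BACs that hit them, order the probes by position, map the BACs onto the probes) can be illustrated roughly as below. This is a minimal Python sketch for illustration only, not the project's PHP/Perl and Oracle implementation. The hit records, the probe positions, the library names, and the function names `layout_for_library` and `render` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical library screen hits: (library, probe, BAC clone).
# All values are invented for illustration.
HITS = [
    ("LibA", "Probe53", "Bac1"),
    ("LibA", "Probe21", "Bac1"),
    ("LibA", "Probe21", "Bac2"),
    ("LibA", "Probe9", "Bac2"),
    ("LibA", "Probe26", "Bac3"),
    ("LibB", "Probe1", "Bac9"),
]

# Hypothetical order of the probes along the soybean reference region.
PROBE_POSITION = {
    "Probe53": 1, "Probe21": 2, "Probe9": 3,
    "Probe26": 4, "Probe1": 5, "Probe3": 6,
}


def layout_for_library(library, hits=HITS, positions=PROBE_POSITION):
    """Return (ordered probes, {BAC: ordered probes it hits}) for one library."""
    # Keep only the hits recorded for the requested library.
    lib_hits = [(probe, bac) for lib, probe, bac in hits if lib == library]
    # Probes with at least one hit, arranged by their position.
    probes = sorted({probe for probe, _ in lib_hits}, key=positions.__getitem__)
    # Map each BAC to the probes it hybridizes to.
    bac_to_probes = defaultdict(set)
    for probe, bac in lib_hits:
        bac_to_probes[bac].add(probe)
    mapping = {
        bac: sorted(hit, key=positions.__getitem__)
        for bac, hit in sorted(bac_to_probes.items())
    }
    return probes, mapping


def render(probes, mapping):
    """Draw a plain-text grid: one row per BAC, 'x' under each probe it hits."""
    if not probes:
        return ""
    width = max(len(p) for p in probes) + 1
    label = max(len(b) for b in mapping) + 1
    lines = [" " * label + "".join(p.ljust(width) for p in probes)]
    for bac, hit in mapping.items():
        cells = "".join(("x" if p in hit else ".").ljust(width) for p in probes)
        lines.append(bac.ljust(label) + cells)
    return "\n".join(lines)


probes, mapping = layout_for_library("LibA")
print(render(probes, mapping))
```

In this sketch, a BAC whose row has marks under two or more adjacent probes (Bac1 and Bac2 in the invented data) is the kind of clone the Background procedure selects as a tiling-path candidate. In a database-backed version, the two in-memory structures would presumably be replaced by queries against the LIBRARY SCREEN HITS and PROBE tables.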