Comparative Genomics of Aspergilli
William Nierman
TIGR

Electrophoretic Karyotyping
[Figure: CHEF gel, lanes Sp, Af, Sc; band labels in Mb: 5.7, 4.6, 3.5, 1.8 and 5.0, 4.0, 3.5]
CHEF DRII, 1.2% CGA, 1x TAE, 14 °C, 1.8 V/cm: 2200 s, 48 h; 2200-1800 s, 68 h (5 day run); sizes in Mb

A. fumigatus Chromosomes
Chromosome   Size (Mb)
1            4.891
2            4.834
3            4.018
4            3.933
5            3.922
6            3.779
7            2.021
8            1.789
[Figure: chromosome map marking centromeric areas, telomeres, and ~35 copies of rDNA]

Centromeres and Telomeres
• Telomere repeat TTAGGG, 7-21 repeat units
  – Subtelomeric regions: identical sequences for several kb, helicase pseudogenes, 7 secondary metabolite clusters, niche adaptation role? (Mark Farman)
• Centromeres
  – Uncloned in shotgun libraries; 36.2-55.9 kb
  – Flanked on each side by a low-complexity, AT-rich repeat region
  – Chromosome 2 centromere: 12 kb PCR product 75% AT; overall centromeric AT of 63%, 40 kb.

Annotation Pipeline
Eukaryotic Genome Control (EGC) is the annotation pipeline responsible for processing genomic sequence.
[Flowchart stages: Finished chromosome sequences; Masked genomic sequence; Gene prediction; EST alignments; Protein alignments; Optimize Predictions]

Training Data
Gene and splice-site predictors, including Glimmer, Exonomy, Unveil, Phat and GeneSplicer, were trained with the following experimental data:
  – Full-length cDNAs (625) and 42 partials from 589 loci in 19 Aspergillus species
  – 2,633 A. fumigatus ESTs from UK and Spanish collaborators

Optimize Predictions
Combiner combines gene model evidence from:
• Gene prediction programs
• Splice site prediction programs
• Alignments from protein, cDNA and EST databases
It then generates the final gene model. All genes were manually reviewed, and the observed splits and merges were corrected.
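The manual review step above mentions correcting "splits and merges" in gene models. As a rough illustration only (this is not the Combiner algorithm or TIGR's review tooling), the sketch below flags candidate splits and merges by interval overlap between predicted gene models and reference models such as full-length cDNA alignments. The function names, the dictionary input format, and the example coordinates are all hypothetical, and the sketch assumes both sets lie on the same contig and strand.

```python
from collections import defaultdict


def overlaps(a, b):
    """True if two inclusive (start, end) intervals share at least one base."""
    return a[0] <= b[1] and b[0] <= a[1]


def find_splits_and_merges(predicted, reference):
    """Flag candidate split and merged gene models.

    predicted, reference: dict mapping a model name to (start, end),
    assumed (hypothetically) to be on the same contig and strand.
    Returns (splits, merges):
      splits: reference model -> predictions that jointly cover it
      merges: prediction -> reference models it spans
    """
    pred_hits = defaultdict(list)  # prediction -> overlapping references
    ref_hits = defaultdict(list)   # reference -> overlapping predictions
    for p_name, p_iv in predicted.items():
        for r_name, r_iv in reference.items():
            if overlaps(p_iv, r_iv):
                pred_hits[p_name].append(r_name)
                ref_hits[r_name].append(p_name)
    # One prediction covering several references suggests a merge.
    merges = {p: sorted(rs) for p, rs in pred_hits.items() if len(rs) > 1}
    # One reference covered by several predictions suggests a split.
    splits = {r: sorted(ps) for r, ps in ref_hits.items() if len(ps) > 1}
    return splits, merges


if __name__ == "__main__":
    # Made-up coordinates for illustration only.
    predicted = {"pred1": (100, 900), "pred2": (1000, 1400), "pred3": (1500, 1900)}
    reference = {"cdnaA": (120, 400), "cdnaB": (500, 880), "cdnaC": (1010, 1890)}
    splits, merges = find_splits_and_merges(predicted, reference)
    print("possible splits:", splits)  # {'cdnaC': ['pred2', 'pred3']}
    print("possible merges:", merges)  # {'pred1': ['cdnaA', 'cdnaB']}
```

A flag from a check like this is only a candidate for review: overlapping or nested genes also produce multiple overlaps, which is one reason the models were inspected manually.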
Annotation Station Screenshot
[Figure: screenshot with gene labels Brown 2, Brown 1, Scytalone dehydratase, Yellowish-green, 1,3,6,8-tetrahydroxynaphthalene reductase, Polyketide synthetase]

Gene Summary Statistics
(AFU = A. fumigatus, ANA = A. nidulans, AOA = A. oryzae)

Chromosome                     AFU         ANA         AOA
Size (bp)                      28635699    30068514    36746653
GC Content (%)                 49.9        50.3        48.3
# of Genes                     9746        9967        14063
Mean Gene Length (bp)          1442.4      1535.9      1177.5
Gene Density (bp per gene)     2938.2      3016.8      2613
Percent of Coding              49.1        50.9        45.1
Percent Genes with Introns     75.8        88.7        80.7

Exons                          AFU         ANA         AOA
Number                         26181       36249       40133
Mean # per Gene                2.7         3.6         2.9
GC Content (%)                 54          53.4        52
Mean Length (bp)               536.9       422.3       412.6
Total Length (bp)              14057166    15308196    16559586

Introns                        AFU         ANA         AOA
Number                         16432       26282       26070
GC Content (%)                 46.3        46.1        45.5
Mean Length (bp)               121.8       104.6       129.7
Total Length (bp)              2000799     2748240     3380731

Intergenic Regions             AFU         ANA         AOA
GC Content (%)                 46          47.5        45.3
Mean Length (bp)               1276.4      1159.5      1174.3

Functional Annotation
[Figure: functional annotation for AFU, ANA, AOA]

Most Common Domains in A. fumigatus
Domain     Domain name                                            # Proteins
PF00172    Fungal Zn(2)-Cys(6) binuclear cluster domain           147
PF00083    Major facilitator superfamily                          109
PF00400    WD domain G-beta repeat                                105
PF00069    Protein kinase domain                                  105
PF00106    Oxidoreductase, short-chain dehydrogenase/reductase    95
PF00271    Helicase conserved C-terminal domain                   75
PF00023    Ankyrin repeat                                         64
PF00067    Cytochrome P450                                        65
PF00096    Zinc finger C2H2 type                                  61
PF00107    Oxidoreductase, Zn-binding dehydrogenase               61
PF00076    RNA recognition motif                                  59
PF00005    ABC transporter                                        51
PF00501    AMP-binding enzyme                                     44
PF00270    DEAD/DEAH box helicase                                 39
PF01360    Monooxygenase                                          39

Synteny Map of A. fumigatus and A. nidulans
[Figure: synteny map]

Synteny Map of A. fumigatus and A. oryzae
[Figure: synteny map]

Synteny Map of A. fumigatus, A. nidulans, A. oryzae
[Figure: synteny map]

Overview – Comparative Statistics
Orthologs were computed by performing an all vs. all BlastP of the three proteomes with an E-value cut-off of 1e-15 (no length requirement). The mutual best hits were then organized into clusters based on shared protein nodes.

COG type    Species present (A. fumigatus, A. oryzae, A. nidulans)    avg_pctid    avg_coverage    num_cogs
3 member    + + +                                                     70%          86%             5899
2 member    two of three                                              65%          84%             967
2 member    two of three                                              61%          79%             533
2 member    two of three                                              61%          80%             936

Species         # genes included in COG    Percent of predicted proteome
A. fumigatus    7507                       79%
A. nidulans     7429                       75%
A. oryzae       7988                       57%
Total           22924                      68% (22924/33552)

TIGR Autoannotation vs Sanger Curated Annotation
Status                                         Count
Total Sanger genes analyzed                    360
Same gene structure                            137
Different gene structure                       177
Sanger missing in TIGR annotation              37
Sanger matches multiple TIGR annotations       2
Sanger, TIGR annotations opposite strands      7
TIGR missing in Sanger annotation              12
TIGR matches multiple Sanger annotations       9

Using Ortholog Clusters to Identify Potential Annotation Problems
• Different exon number due to annotation discrepancy
• In some cases, differences in exon number are real
• We need to be able to distinguish annotation inconsistencies from real, interesting phenomena

Apoptosis in Fungi
• Apoptosis-like process detected in S. cerevisiae, S. pombe, and Aspergilli.
• Fungal genomes lack metazoan upstream machinery.
• Metacaspase-dependent phenotype observed in A. fumigatus and A. nidulans.
• Analysis by Geoff Robson

Apoptosis in Fungi
[Table: one block per domain or protein family; entries listed for S. cerevisiae, S. pombe, A. fumigatus, A. nidulans, A. oryzae]

DOMAINS

NB-ARC
  S. cerevisiae: X
  S. pombe: X
  A. fumigatus: 57.m05394, 56.m02424, 72.m19821, 66.m04653, asfu05688
  A. nidulans: 10025.m00126, 10051.m00442, 10115.m00081, 10157.m00054, 10176.m00005, 10016.m00178, 10150.m00052, 10062.m00136, 10153.m00210
  A. oryzae: 20175.m00427, 20175.m00347, 20116.m00078, 20180.m00891, 20167.m00347, 20122.m00102, 20168.m00299

Caspase-activated nuclease
  S. cerevisiae: X
  S. pombe: X
  A. fumigatus: X
  A. nidulans: X
  A. oryzae: X

CAS/CSE
  S. cerevisiae: CSE-1
  S. pombe: CSE-1
  A. fumigatus: X
  A. nidulans: X
  A. oryzae: X

MATH
  S. cerevisiae and S. pombe: UBPF, UBPF, UBP5
  A. fumigatus: 53.m03780, 53.m04162
  A. nidulans: 10139.m00184
  A. oryzae: 20147.m00277

PROTEIN FAMILY

Metacaspase
  S. cerevisiae: MCA1
  S. pombe: AL031179
  A. fumigatus: 59.m08486, 54.m06827
  A. nidulans: 10098.m00299, 10042.m00047, 10062.m00137
  A. oryzae: 20149.m0027, 20166.m00204, 20161.m00321

Anti silencing protein 1
  S. cerevisiae: ASF1
  S. pombe: ASF1
  A. fumigatus: 59m.08789
  A. nidulans: 10084.m00239
  A. oryzae: 20175m.00377

STM1/MPT4
  S. cerevisiae: STM1
  S. pombe: Q42914
  A. fumigatus: X
  A. nidulans: X
  A. oryzae: X

CDC48
  S. cerevisiae: CDC48p
  S. pombe: CDC48
  A. fumigatus: 72.m19795
  A. nidulans: 10124.m00023
  A. oryzae: 20134.m00118

Aspergillus fumigatus Secondary Metabolites
• Heterogeneous group of low molecular weight products.
• Toxic, antibiotic, and immunosuppressant activities.
  – fumagillin, gliotoxin (apoptosis and phagocyte dysfunction), fumitremorgin, verruculogen, fumigaclavine, helvolic acid, phthioic acid (granulomas when injected into mice) and sphingofungins
• Virulence properties may be augmented by the numerous secondary metabolites of A. fumigatus.

Secondary Metabolite Genes
Gene type                A. oryzae    A. fumigatus    A. nidulans
PKS                      30           14              27
NRPS                     18           14              14
FAS                      5            1               6
Sesquiterpene cyclase    1            (1)             (1)
DMATS                    2            7               2
Analysis by G. Turner, N. Keller, Dr. Kitamoto, and R. Kulkarni

[Figure: precursor, enzyme and product labels: Serine + Phenylalanine, 2-module NRPS?, Gliotoxin; Terpene, Sesquiterpene cyclase, Fumagillin; Tryptophan, DMAT synthetase (x2), Fumigaclavines; Tryptophan + Proline, NRPS?, DMAT synthetase, Fumitremorgins; Five 2-module NRPS]

A. fumigatus Secondary Metabolite Genes
• Few true orthologues across the genus Aspergillus. Each species has its own repertoire.
• Gene/product relationship requires functional analysis in most cases.
• Indole alkaloid pathway in A. fumigatus only. Closely related to the Claviceps purpurea ergotamine pathway.
• Penicillin and aflatoxin pathways are absent.
• A hybrid PKS/monomodular NRPS seems to be present in several fungi.

Identify A. fumigatus specific genes
[Flowchart, in order]
A. fumigatus genes (9746)
All vs. all BlastP of the AFU1, ANA1, AOA1 proteomes, cut-off E-value 1e-15, filtering the results for mutual best hits between genomes
A. fumigatus singletons (2075)
BLASTP vs ANA1 and AOA1 proteomes
A. fumigatus singletons, E-value > e-10 (1081)
Extend 50 bp on both ends of the gene in the genome; Tblastx the genomic sequence of the gene vs ANA and AOA genomic sequence
A. fumigatus specific gene candidates, E-value > e-50 (1011)

[Flowchart, E-value bins]
BLASTP vs ANA1 and AOA1 proteomes: e-5 > E-value > e-10 (203); e-50 < E-value < e-10 (181); E-value > e-5 (808)
Extend 50 bp on both ends of the gene in the genome; Tblastx the genomic sequence of the gene vs ANA and AOA genomic sequence: e-5 > E-value > e-10 (75); E-value > e-5 (552)

Aspergillus fumigatus Unique Genes
• Vast majority are hypothetical
• Includes
  – Several transcriptional regulators
  – A chaperonin
  – An hsp70-related protein

Arsenic Fungi
• 19th century poisonings associated with green pigments.
• 1892: B. Gosio showed that certain fungi could metabolize arsenic pigments, producing toxic trimethylarsine (Gosio gas).
• Screen in the 1930s (Thom & Raper) found A. fumigatus to be an arsenic fungus.
• Napoleon: imperial colors green and gold, copper arsenite (Jones 1982).
• Analysis of history and genome by J. Bennett, N. Hall, J. Wortman, C. Lu.

A. fumigatus Arsenate Genes
• Arsenite efflux pump
• Arsenite translocating ATPase
• Two possibly duplicated clusters
  – arsC – arsenate reductase (A. fumigatus unique)
  – arsB – arsenite symporter
  – arsH – Methyltransferase
[Figure: gene clusters on Chromosome 5 and Chromosome 1, each labeled arsH, Methyltransferase, arsB, arsC]

A. fumigatus Teichoic Acid Biosynthesis Protein
• Good homology to the full length of the Streptomyces griseus protein.
• Secretion signal peptide may direct it to the cell wall.
• Teichoic acids demonstrated to be a virulence factor for Staphylococcus aureus.
• No intervening sequences in gene.
Analysis by Neil Hall

A. fumigatus Thermotolerance
[Figure: panels "More highly expressed at 48 °C" and "More highly expressed at 37 °C"]

A. fumigatus Thermotolerance
• Relatively few genes altered
• Some HSPs transiently or stably induced (weakly) and repressed at 37 °C.
• HSPs induced throughout the 180 min 48 °C period
• Transposases induced at 48 °C (Mariner 4).
• Stress-related genes upregulated at 48 °C.
• Metabolic proteins downregulated at 48 °C
"This fungus likes it hot." J. Bennett

Microarray Detection of Clusters
[Figure: microarray detection of clusters]

Aspergillus fumigatus AF293 Project Participants
• The University of Manchester, UK
• The Wellcome Trust Sanger Centre, UK
• The Institute for Genomic Research, USA
• The University of Salamanca, Spain
• Complutense University, Spain
• Centro de Investigaciones Biológicas, Spain

Aspergillus fumigatus AF293
Joan Bennett, David Denning, Matt Berriman, Michael Anderson, Jean Paul Latge, Arnab Pain, Paul Dyer, Geoff Robson, Paul Bowyer, Javier Arroyo, Neil Hall, Geoff Turner, David Archer

Aspergillus nidulans – James Galagan
Aspergillus oryzae – Masayuki Machida

TIGR
Sequencing and Closure: Tamara Feldblyum, Hoda Khouri
Annotation, Lab Group: Jennifer Wortman, Heenam Kim, Jiaqi Huang, Dan Chen, Resham Kulkarni, Natalie Fedorova, Claire Fraser, Charles Lu

NIAID and Dennis Dixon
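
The ortholog procedure described on the "Overview – Comparative Statistics" and "Identify A. fumigatus specific genes" slides (all vs. all BlastP, E-value cut-off 1e-15, mutual best hits, clusters built from shared protein nodes, unclustered genes treated as singletons) can be sketched roughly as below. This is a minimal illustration, not the project's actual code. The "SPECIES|gene" identifier format, the use of bit score to choose the best hit, and all function names are assumptions made for the example, and the toy rows are invented.

```python
from collections import defaultdict

E_CUTOFF = 1e-15  # cut-off stated on the slides


def best_hits(rows):
    """Best hit of each query protein in each other species.

    rows: iterable of (query, subject, evalue, bitscore); protein IDs are
    assumed (hypothetically) to look like 'AFU|g1'. Ranking hits by bit
    score is also an assumption; the slides do not say how "best" was chosen.
    """
    best = {}
    for query, subject, evalue, bits in rows:
        q_sp, s_sp = query.split("|")[0], subject.split("|")[0]
        if q_sp == s_sp or evalue > E_CUTOFF:
            continue  # skip same-species hits and hits above the cut-off
        key = (query, s_sp)
        if key not in best or bits > best[key][1]:
            best[key] = (subject, bits)
    return {key: hit[0] for key, hit in best.items()}


def mutual_best_hit_clusters(rows):
    """Join mutual best hits into clusters (connected components)."""
    best = best_hits(rows)
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for (query, s_sp), subject in best.items():
        q_sp = query.split("|")[0]
        if best.get((subject, q_sp)) == query:  # hit is reciprocal
            root_q, root_s = find(query), find(subject)
            if root_q != root_s:
                parent[root_s] = root_q

    clusters = defaultdict(set)
    for protein in parent:
        clusters[find(protein)].add(protein)
    return sorted(sorted(members) for members in clusters.values())


def singletons(all_proteins, clusters, species):
    """Proteins of one species that fall in no cluster."""
    clustered = {p for members in clusters for p in members}
    prefix = species + "|"
    return sorted(p for p in all_proteins
                  if p.startswith(prefix) and p not in clustered)


if __name__ == "__main__":
    rows = [  # toy data, not real BLAST output
        ("AFU|g1", "ANA|g1", 1e-80, 300.0), ("ANA|g1", "AFU|g1", 1e-80, 300.0),
        ("AFU|g1", "AOA|g1", 1e-70, 280.0), ("AOA|g1", "AFU|g1", 1e-70, 280.0),
        ("ANA|g1", "AOA|g1", 1e-60, 250.0), ("AOA|g1", "ANA|g1", 1e-60, 250.0),
        ("AFU|g2", "ANA|g1", 1e-20, 90.0),  # one-way hit only
        ("ANA|g1", "AFU|g2", 1e-20, 90.0),  # weaker than ANA|g1 -> AFU|g1
        ("AFU|g3", "ANA|g3", 1e-8, 50.0),   # fails the E-value cut-off
    ]
    clusters = mutual_best_hit_clusters(rows)
    print(clusters)
    print(singletons(["AFU|g1", "AFU|g2", "AFU|g3"], clusters, "AFU"))
```

In the toy data, AFU|g1, ANA|g1 and AOA|g1 form one 3-member cluster, while AFU|g2 and AFU|g3 remain singletons. On the slides, singletons like these are the starting point for the later BLASTP and Tblastx filtering.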