Comparative Genomics of Aspergilli
William Nierman
TIGR
Electrophoretic Karyotyping
[Figure: pulsed-field gel, 5 day run; lanes Sp, Af, Sc; band sizes in Mb: 5.7, 4.6, 3.5, 1.8]
CHEF DRII, 1.2% CGA, 1x TAE, 14 °C, 1.8 V/cm: 2200 s, 48 h; 2200-1800 s, 68 h
A. fumigatus Chromosomes
Chromosome   Size (Mb)
1            4.891
2            4.834
3            4.018
4            3.933
5            3.922
6            3.779
7            2.021
8            1.789
[Figure annotations: ~35 copies rDNA; centromeric area; telomere]
Centromeres and Telomeres
• Telomere repeat TTAGGG, 7-21 repeat units
– Subtelomeric regions: identical sequences for several kb, helicase pseudogenes, 7 secondary metabolite clusters, niche adaptation role? (Mark Farman)
• Centromeres
– Uncloned in shotgun libraries; 36.2-55.9 kb
– Flanked on each side by low-complexity AT-rich repeat region
– Chromosome 2 centromere: 12 kb PCR product 75% AT; overall centromeric AT of 63%, 40 kb.
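The telomere repeat count above can be made concrete with a small sketch. It assumes a chromosome-end sequence written so that the TTAGGG units run to the very end of the string; the helper name `count_terminal_repeats` and the example sequence are made up for illustration and are not part of the project's tooling.

```python
def count_terminal_repeats(seq: str, unit: str = "TTAGGG") -> int:
    """Count tandem copies of `unit` at the end of `seq`."""
    seq = seq.upper()
    count = 0
    end = len(seq)
    # Walk backwards one repeat unit at a time while the unit matches.
    while end >= len(unit) and seq[end - len(unit):end] == unit:
        count += 1
        end -= len(unit)
    return count


# Made-up chromosome end: unique sequence followed by 7 repeat units.
example_end = "ACGTACGTAC" + "TTAGGG" * 7
n = count_terminal_repeats(example_end)
print(n, 7 <= n <= 21)
```

A count inside the 7-21 range would be consistent with the slide; partial units at the very end are ignored by this sketch.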
Annotation Pipeline
Eukaryotic Genome Control (EGC) is the annotation pipeline responsible for processing genomic sequence.
[Flow diagram boxes: Finished chromosome sequences; Masked genomic sequence; Gene prediction; EST alignments; Optimize Predictions; Protein alignments]
Training Data
Gene and splice site predictors, including Glimmer, Exonomy, Unveil, Phat and GeneSplicer, were trained with the following experimental data:
– Full-length cDNAs (625) and 42 partials from 589 loci in 19 Aspergillus species
– 2,633 A. fumigatus ESTs from UK and Spanish collaborators
Optimize Predictions
Combiner combines gene model evidence from:
• Gene prediction programs
• Splice site prediction programs
• Alignments from protein, cDNA and EST databases
Combiner then generates the final gene model.
All genes were manually reviewed, and the observed splits and merges were corrected.
Annotation Station Screenshot
[Screenshot labels: Brown 2; Brown 1; Scytalone dehydratase; Yellowish-green; 1,3,6,8-tetrahydroxynaphthalene reductase; Polyketide synthetase]
Gene Summary Statistics
Genome                        AFU        ANA        AOA
Size (bp)                     28635699   30068514   36746653
GC Content (%)                49.9       50.3       48.3
# of Genes                    9746       9967       14063
Mean Gene Length (bp)         1442.4     1535.9     1177.5
Gene Density (bp per gene)    2938.2     3016.8     2613
Percent of Coding             49.1       50.9       45.1
Percent Genes with Introns    75.8       88.7       80.7
Exons
                     AFU        ANA        AOA
Number               26181      36249      40133
Mean # per Gene      2.7        3.6        2.9
GC Content (%)       54         53.4       52
Mean Length (bp)     536.9      422.3      412.6
Total Length (bp)    14057166   15308196   16559586
Introns
                     AFU        ANA        AOA
Number               16432      26282      26070
GC Content (%)       46.3       46.1       45.5
Mean Length (bp)     121.8      104.6      129.7
Total Length (bp)    2000799    2748240    3380731
Intergenic Regions
                     AFU        ANA        AOA
GC Content (%)       46         47.5       45.3
Mean Length (bp)     1276.4     1159.5     1174.3
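The derived rows in the summary tables above can be recomputed from the raw counts. The sketch below is an arithmetic check, not the project's own code: the dictionary layout and the function name `derived_stats` are illustrative, and the reading of each derived value (for example, that mean gene length is total exon length divided by gene count) is inferred from the numbers agreeing, not stated on the slides.

```python
# Raw counts copied from the summary tables (AFU, ANA, AOA columns).
GENOMES = {
    "AFU": {"size": 28635699, "genes": 9746, "exons": 26181, "exon_bp": 14057166},
    "ANA": {"size": 30068514, "genes": 9967, "exons": 36249, "exon_bp": 15308196},
    "AOA": {"size": 36746653, "genes": 14063, "exons": 40133, "exon_bp": 16559586},
}


def derived_stats(g: dict) -> dict:
    """Recompute the derived table rows from size, gene and exon counts."""
    return {
        "gene_density_bp_per_gene": round(g["size"] / g["genes"], 1),
        "percent_coding": round(100 * g["exon_bp"] / g["size"], 1),
        "mean_gene_length_bp": round(g["exon_bp"] / g["genes"], 1),
        "mean_exons_per_gene": round(g["exons"] / g["genes"], 1),
        "mean_exon_length_bp": round(g["exon_bp"] / g["exons"], 1),
    }


for name, g in GENOMES.items():
    print(name, derived_stats(g))
```

Under these assumptions the recomputed values agree with the tabulated gene density, percent coding, mean gene length, exons per gene and mean exon length for all three genomes.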
Functional Annotation
[Figure panels: AFU, ANA, AOA]
Most Common Domains in A. fumigatus
Domain    Domain name                                        # Proteins
PF00172   Fungal Zn(2)-Cys(6) binuclear cluster domain       147
PF00083   Major facilitator superfamily                      109
PF00400   WD domain G-beta repeat                            105
PF00069   Protein kinase domain                              105
PF00106   Oxidoreductase, short-chain dehydrogenase/reductase 95
PF00271   Helicase conserved C-terminal domain               75
PF00023   Ankyrin repeat                                     64
PF00067   Cytochrome P450                                    65
PF00096   Zinc finger C2H2 type                              61
PF00107   Oxidoreductase, Zn-binding dehydrogenase           61
PF00076   RNA recognition motif                              59
PF00005   ABC transporter                                    51
PF00501   AMP-binding enzyme                                 44
PF00270   DEAD/DEAH box helicase                             39
PF01360   Monooxygenase                                      39
Synteny Map of A. fumigatus and A. nidulans
Synteny Map of A. fumigatus and A. oryzae
Synteny Map of A. fumigatus, A. nidulans, A. oryzae
Overview – Comparative Statistics
Orthologs were computed by performing an all vs. all BlastP of the three proteomes with an E-value cut-off of 1e-15 (no length requirement). The mutual best hits were then organized into clusters based on shared protein nodes.
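The clustering step described above can be sketched as follows. This is a minimal illustration, not the pipeline actually used: it assumes BlastP hits are already parsed into (query, subject, E-value) tuples, takes the lowest-E-value hit per target species as the best hit, keeps mutual best hits at or below the 1e-15 cut-off, and joins them into clusters as connected components. The protein IDs and the `species_of` naming convention are hypothetical.

```python
from collections import defaultdict

E_CUTOFF = 1e-15  # cut-off quoted on the slide


def species_of(protein_id: str) -> str:
    # Hypothetical ID convention: "AFU|g1" -> "AFU".
    return protein_id.split("|", 1)[0]


def mutual_best_hits(hits):
    """hits: iterable of (query, subject, evalue); returns a set of frozenset pairs."""
    best = {}  # (query, subject species) -> (evalue, subject)
    for q, s, e in hits:
        if e > E_CUTOFF or species_of(q) == species_of(s):
            continue
        key = (q, species_of(s))
        if key not in best or e < best[key][0]:
            best[key] = (e, s)
    pairs = set()
    for (q, _), (_, s) in best.items():
        back = best.get((s, species_of(q)))
        if back is not None and back[1] == q:
            pairs.add(frozenset((q, s)))
    return pairs


def cluster(pairs):
    """Group proteins linked by mutual best hits (connected components)."""
    graph = defaultdict(set)
    for pair in pairs:
        a, b = tuple(pair)
        graph[a].add(b)
        graph[b].add(a)
    seen, clusters = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        stack, members = [start], set()
        while stack:
            node = stack.pop()
            if node in members:
                continue
            members.add(node)
            stack.extend(graph[node] - members)
        seen |= members
        clusters.append(sorted(members))
    return clusters


# Made-up hits: one 3-member cluster, one 2-member cluster, one hit above the cut-off.
hits = [
    ("AFU|g1", "ANA|g1", 1e-80), ("ANA|g1", "AFU|g1", 1e-78),
    ("AFU|g1", "AOA|g1", 1e-60), ("AOA|g1", "AFU|g1", 1e-61),
    ("ANA|g1", "AOA|g1", 1e-55), ("AOA|g1", "ANA|g1", 1e-50),
    ("AFU|g2", "ANA|g2", 1e-20), ("ANA|g2", "AFU|g2", 1e-22),
    ("AFU|g3", "AOA|g9", 1e-5),
]
print(cluster(mutual_best_hits(hits)))
```

Proteins that end up in no cluster, such as `AFU|g3` here, correspond to singletons.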
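COG class   Species present                          avg_pctid   avg_coverage   num_cogs
3 member    A. fumigatus + A. oryzae + A. nidulans   70%         86%            5899
2 member    two species (pairing not recoverable)    65%         84%            967
2 member    two species (pairing not recoverable)    61%         79%            533
2 member    two species (pairing not recoverable)    61%         80%            936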
Species        # genes included in COG   Percent of predicted proteome
A. fumigatus   7507                      79%
A. nidulans    7429                      75%
A. oryzae      7988                      57%
Total          22924                     68% (22924/33552)
TIGR Autoannotation vs Sanger Curated Annotation
Status                                          Count
Total Sanger genes analyzed                     360
Same gene structure                             137
Different gene structure                        177
Sanger missing in TIGR annotation               37
Sanger matches multiple TIGR annotations        2
Sanger, TIGR annotations on opposite strands    7
TIGR missing in Sanger annotation               12
TIGR matches multiple Sanger annotations        9
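A comparison of this kind could be sketched as follows, assuming each gene model is reduced to a strand and a list of exon coordinates. The `GeneModel` type, the `compare_models` function and all coordinates are hypothetical; the slides do not say how the comparison was actually implemented.

```python
from typing import List, NamedTuple, Tuple


class GeneModel(NamedTuple):
    strand: str                   # "+" or "-"
    exons: List[Tuple[int, int]]  # inclusive (start, end) pairs


def span(model: GeneModel) -> Tuple[int, int]:
    """Leftmost start and rightmost end over all exons."""
    return min(s for s, _ in model.exons), max(e for _, e in model.exons)


def compare_models(sanger: GeneModel, tigr: GeneModel) -> str:
    (s_start, s_end), (t_start, t_end) = span(sanger), span(tigr)
    if s_end < t_start or t_end < s_start:
        return "no overlap"
    if sanger.strand != tigr.strand:
        return "opposite strands"
    if sorted(sanger.exons) == sorted(tigr.exons):
        return "same gene structure"
    return "different gene structure"


# Made-up coordinates for illustration.
curated = GeneModel("+", [(100, 250), (310, 600)])
auto_same = GeneModel("+", [(100, 250), (310, 600)])
auto_merged = GeneModel("+", [(100, 600)])
auto_flipped = GeneModel("-", [(100, 250), (310, 600)])

for auto in (auto_same, auto_merged, auto_flipped):
    print(compare_models(curated, auto))

# The five Sanger-side categories in the table add up to the 360 genes analyzed.
print(137 + 177 + 37 + 2 + 7)
```

Detecting one curated gene matching several automated models (or the reverse) would need an extra pass over all overlapping pairs, which this sketch leaves out.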
Using Ortholog Clusters to Identify Potential Annotation Problems
• Different exon number due to annotation discrepancy
• In some cases, differences in exon number are real
• We need to be able to distinguish annotation inconsistencies from real, interesting phenomena
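One way to surface candidates for this kind of review is to flag ortholog clusters whose members disagree in exon count. The sketch below assumes clusters are lists of gene IDs and that exon counts are available per gene; the function name, IDs and counts are hypothetical.

```python
def flag_exon_discrepancies(clusters, exon_counts):
    """Return clusters whose members do not all have the same exon count."""
    flagged = []
    for members in clusters:
        counts = {gene: exon_counts[gene] for gene in members}
        if len(set(counts.values())) > 1:
            flagged.append(counts)
    return flagged


# Made-up clusters and exon counts.
clusters = [["AFU|g1", "ANA|g1", "AOA|g1"], ["AFU|g2", "ANA|g2"]]
exon_counts = {"AFU|g1": 3, "ANA|g1": 3, "AOA|g1": 3, "AFU|g2": 2, "ANA|g2": 4}
print(flag_exon_discrepancies(clusters, exon_counts))
```

A flagged cluster is only a candidate: as the slide notes, some differences in exon number are real, so each one still needs inspection.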
Apoptosis in Fungi
• Apoptosis-like process detected in S. cerevisiae,
S. pombe, and Aspergilli.
• Fungal genomes lack metazoan upstream
machinery.
• Metacaspase-dependent phenotype observed in
A. fumigatus and A. nidulans.
• Analysis by Geoff Robson
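Apoptosis in Fungi
DOMAINS
NB-ARC
  S. cerevisiae: X
  S. pombe: X
  A. fumigatus: 57.m05394, 56.m02424, 72.m19821, 66.m04653, asfu05688
  A. nidulans: 10025.m00126, 10051.m00442, 10115.m00081, 10157.m00054, 10176.m00005, 10016.m00178, 10150.m00052, 10062.m00136, 10153.m00210
  A. oryzae: 20175.m00427, 20175.m00347, 20116.m00078, 20180.m00891, 20167.m00347, 20122.m00102, 20168.m00299
Caspase-activated nuclease
  S. cerevisiae: X; S. pombe: X; A. fumigatus: X; A. nidulans: X; A. oryzae: X
CAS/CSE
  S. cerevisiae: CSE-1; S. pombe: CSE-1; A. fumigatus: X; A. nidulans: X; A. oryzae: X
MATH
  S. cerevisiae / S. pombe: UBPF, UBPF, UBP5
  A. fumigatus: 53.m03780, 53.m04162
  A. nidulans: 10139.m00184
  A. oryzae: 20147.m00277
PROTEIN FAMILY
Metacaspase
  S. cerevisiae: MCA1; S. pombe: AL031179
  A. fumigatus: 59.m08486, 54.m06827
  A. nidulans: 10098.m00299, 10042.m00047, 10062.m00137
  A. oryzae: 20149.m0027, 20166.m00204, 20161.m00321
Anti silencing protein1
  S. cerevisiae: ASF1; S. pombe: ASF1
  A. fumigatus: 59m.08789
  A. nidulans: 10084.m00239
  A. oryzae: 20175m.00377
STM1/MPT4
  S. cerevisiae / S. pombe: STM1, Q42914
  A. fumigatus: X; A. nidulans: X; A. oryzae: X
CDC48
  S. cerevisiae / S. pombe: CDC48p, CDC48
  A. fumigatus: 72.m19795
  A. nidulans: 10124.m00023
  A. oryzae: 20134.m00118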
Aspergillus fumigatus Secondary Metabolites
• Heterogeneous group of low molecular weight products.
• Toxic, antibiotic, and immunosuppressant activities.
– fumagillin, gliotoxin (apoptosis and phagocyte dysfunction), fumitremorgin, verruculogen, fumigaclavine, helvolic acid, phthioic acid (granulomas when injected into mice) and sphingofungins
• Virulence may be augmented by the numerous secondary metabolites of A. fumigatus.
Secondary Metabolite Genes
Gene type               A. oryzae   A. fumigatus   A. nidulans
PKS                     30          14             27
NRPS                    18          14             14
FAS                     5           1              6
Sesquiterpene cyclase   1           (1)            (1)
DMATS                   2           7              2
Analysis by G. Turner, N. Keller, Dr. Kitamoto, and R. Kulkarni
[Pathway diagram]
Serine + Phenylalanine -> 2 module NRPS? -> Gliotoxin
Terpene -> Sesquiterpene cyclase -> Fumagillin
Tryptophan -> DMAT synthetase (X2) -> Fumigaclavines
Tryptophan + Proline -> NRPS? + DMAT synthetase -> Fumitremorgins
Five 2-module NRPS
A. fumigatus Secondary Metabolite Genes
• Few true orthologues across the genus Aspergillus. Each species has its own repertoire.
• Gene/product relationship requires functional analysis in most cases.
• Indole alkaloid pathway in A. fumigatus only. Closely related to Claviceps purpurea ergotamine pathway.
• Penicillin and aflatoxin pathways are absent.
• A hybrid PKS/monomodular NRPS seems to be present in several fungi.
Identify A. fumigatus specific genes
A. fumigatus genes (9746)
  -> All vs. all BlastP of the AFU1, ANA1, AOA1 proteomes, cut-off E-value 1e-15, filtering the results for mutual best hits between genomes
A. fumigatus singletons (2075)
  -> BLASTP vs ANA1 and AOA1 proteomes
A. fumigatus singletons, E-value > e-10 (1081)
  -> Extend 50 bp on both ends of the gene in the genome; Tblastx the genomic sequence of the gene vs ANA and AOA genomic sequence
A. fumigatus specific gene candidates, E-value > e-50 (1011)

Breakdown by E-value
BLASTP vs ANA1 and AOA1 proteomes:
  e-5 > E-value > e-10 (203)
  e-50 < E-value < e-10 (181)
  E-value > e-5 (808)
Extend 50 bp on both ends of the gene in the genome; Tblastx the genomic sequence of the gene vs ANA and AOA genomic sequence:
  e-5 > E-value > e-10 (75)
  E-value > e-5 (552)
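The E-value binning used in this filter can be sketched as a small helper. It assumes the best E-value of each singleton against the other two species is already known, with `None` meaning no hit at all. The function name, the gene names, the handling of values that fall exactly on a boundary, and the final "E-value <= e-50" label are assumptions, not taken from the slides.

```python
from collections import Counter


def bin_by_evalue(best_evalue):
    """Assign a best-hit E-value (or None for no hit) to a bin."""
    if best_evalue is None or best_evalue > 1e-5:
        return "E-value > e-5"
    if best_evalue > 1e-10:
        return "e-5 > E-value > e-10"
    if best_evalue > 1e-50:
        return "e-50 < E-value < e-10"
    return "E-value <= e-50"


# Made-up best hits for four hypothetical singletons.
best_hits = {"geneA": None, "geneB": 3e-7, "geneC": 2e-30, "geneD": 0.5}
print(Counter(bin_by_evalue(e) for e in best_hits.values()))
```

Genes in the weakest-similarity bins are the ones carried forward as species-specific candidates; the same binning can be applied to the Tblastx results.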
Aspergillus fumigatus Unique Genes
• Vast majority are hypothetical
• Includes
– Several transcriptional regulators
– A chaperonin
– An Hsp70-related protein
Arsenic Fungi
• 19th century poisonings associated with green pigments.
• 1892: B. Gosio showed that certain fungi could metabolize arsenic pigments, producing toxic trimethylarsine (Gosio gas).
• Screen in the 1930s (Thom & Raper) found A. fumigatus to be an arsenic fungus.
• Napoleon: imperial colors green and gold, copper arsenite (Jones 1982).
• Analysis of history and genome by J. Bennett, N. Hall, J. Wortman, C. Lu.
A. fumigatus Arsenate Genes
• Arsenite efflux pump
• Arsenite translocating ATPase
• Two possibly duplicated clusters
– arsC – arsenate reductase (A. fumigatus unique)
– arsB – arsenite symporter
– arsH
– Methyltransferase
[Diagram: the clusters on chromosome 5 and chromosome 1, each showing arsH, arsB, arsC and a methyltransferase]
A. fumigatus Teichoic Acid Biosynthesis Protein
• Good homology across the full length of the Streptomyces griseus protein.
• Secretion signal peptide may direct it to the cell wall.
• Teichoic acids demonstrated to be a virulence factor for Staphylococcus aureus.
• No intervening sequences in gene.
Analysis by Neil Hall
A. fumigatus Thermotolerance
[Figure: genes more highly expressed at 48 °C; genes more highly expressed at 37 °C]
• Relatively few genes altered
• Some HSPs transiently or stably induced (weakly) and repressed at 37 °C.
• HSPs induced throughout the 180 min 48 °C period
• Transposases induced at 48 °C (Mariner 4).
• Stress-related genes upregulated at 48 °C.
• Metabolic proteins downregulated at 48 °C
“This fungus likes it hot.” J. Bennett
Microarray Detection of Clusters
Aspergillus fumigatus AF293 Project Participants
• The University of Manchester, UK
• The Wellcome Trust Sanger Centre, UK
• The Institute for Genomic Research, USA
• The University of Salamanca, Spain
• Complutense University, Spain
• Centro de Investigaciones Biológicas, Spain
Aspergillus fumigatus AF293
Joan Bennett
David Denning
Matt Berriman
Michael Anderson
Jean Paul Latge
Arnab Pain
Paul Dyer
Geoff Robson
Paul Bowyer
Javier Arroyo
Neil Hall
Geoff Turner
David Archer
Aspergillus nidulans – James Galagan
Aspergillus oryzae – Masayuki Machida
TIGR
Sequencing and Closure
Tamara Feldblyum
Hoda Khouri
Annotation
Lab Group
Jennifer Wortman
Heenam Kim
Jiaqi Huang
Dan Chen
Resham Kulkarni
Natalie Fedorova
Claire Fraser
Charles Lu
NIAID and Dennis Dixon