Comparative genomic analysis
of T-box regulation:
identification of new structural classes
and reconstruction of evolution
Mikhail Gelfand
Research and Training Center “Bioinformatics”
Institute for Information Transmission Problems
Moscow, Russia
HHMI Conference, June 2008
T-boxes: the mechanism
(Grundy & Henkin; Putzer & Grunberg-Manago)
Partial alignment of predicted T-boxes
Terminator(underlined)
===========> <===========
TGG: T-box Antiterminator
==> ===>
<===<==
Aminoacyl-tRNA synthetases
SA  serS     -> 26  CGTTA  51  AAATAGGGTGGCAACGCGTAGAC------------CACGTCCCTTGTAGGGATGTGGTCTTTTTTTA
DHA tyrZ     -> 47  CGTTA  65  AGGTAAGGTGGTAACACGGGAGCA-------TACTCTCGTCCTTCTGGCAATGAAGGACGGGAGTTTTTTGTTTT
ST  trpS     -> 37  CCTTA  61  AATTGAGGTGGTACCGCGTATTACTT----GTAATAACGCCCTCACGTTTTAATAGCGTGGGGACTTTTTGCTAT
CA  aspS     -> 39  CGTTA  34  ATAAAGGATGGCACCGTGAAAA----------GCCTTCACTCCTTACTGGAGTGGAGGCTTTTTTTATTTTAAATAAA
DF  valS     -> 41  CGTTA  77  AATTAAGGTGGTAACGCGAGC------------TTTTCGTCCTTTTTAAAGAGGATGAAGAGCTCTTTTTTATTTCT
PN  thrS     -> 30  CGTTA  38  AATGAAGGTGGAACCACGTTG-------------CGACGTCCTTTCGAGGATGTCGCATTTTTTTATTAG
MN  ileS     -> 89  CGTTA  68  AATTAAGGTGGTACCACGAGC-------------TTTCGTCCTTTGATGAAAGTTCTTTTTTATTGAT
DF  leuS     -> 28  AGCTA  29  AATTAGGGTGGTACCGCGAAGATT-------TATCCTCGTCCCTAAACGTAAGTTTAGTGACGAGGATTTTTTATTTTCA
HD  argS     -> 41  CGTTA  27  AACGAGAGTGGTACCGCGGGTAA---------AAGCTCGCCTCTTTTTAGAAGAGGCGGGTTTTTTATTTT
DF  proS     -> 33  CGTTA  30  AACTAGAGTGGTACCGCGGAAAT-----TAAACCTTTCGTCTCTATACTTGTATAGAGATGAGAGGTTTTTTATATTTTCAGG
ZC  lysS     -> 46  CGTTA  63  AACTGAGGTGGTACCGCGAAGCTAA-----CAACTCTCGTCCTCAAGATGAATAATCTTGGGGGTGGGAGTTTTTTTGTTGCA
BQ  metS     -> 55  CGTTA  66  AAATAAGGTGGTACCGCGACTGTTTA---TACAGCCCCGCCCTTATCTTTTTTAGATAAGGGCGGGGCTTTTTATATTTAA
MN  pheS     -> 14  AATTA  20  AAAACGGATGGTACCGCGTGTC-------------AACGCTCCGCTTAAGGAGTTTTGGCACTTTTTTTGTTTT
MN  glyQ     -> 14  AGCTA  23  AATTAGGGTGGAACCGCGTTT------------CAAACGCCCCTATGTCAGTTGGCATGGGAGTGATTGAGCGTGGCTCTTTT
ST  alaS     -> 20  AATTA  18  AATAGAGGTGGTACCGCGGTT--------------TTCGCCCTCTGTGAGATGGACTTGTTTTGTATGGAGGACTATTTGAAA

Amino acid biosynthetic genes
SA  trpE     -> 32  AATTA   4  AACTAAGGTGGCACCACGGTA-------------ACGCGTCCTTACAGGTATATGCGTTATGTGGTGTCTTTTT
BS  ilvB     -> 50  CGTTA  47  AACAAGGGTGGTACCGCGGAAAGAAA---AGCCTTTTCGCCCCTTTTAGCTATCGCAGTTACTGCGCGGCTGATTGT
CA  ilvC     -> 40  CGTTA  14  AATTTGGGTGGTACCGCGCGACCAAA-----AATTCTCGCCCCAAGCAGGGAATTTTGGCCGTTTTTTTATATAAATAAAT
BQ  asnA     -> 51  CGTTA  62  AATTTGGGTGGTACCGCGGAACC-----AAAGCCTTTCGTCCCAGTTTTTTGGGAAAGAAGGGCTTTTTTTGTTGGCTT
BS  proB     -> 33  CGTTA  30  AATCAAGGTGGTACCACGGAAAC--------CCATTTCGTCCTTATGAATCAGGATGAAATGGGTTTTTTTATTGTAGA
SA  cysE     -> 33  CATTA  62  ATTCAGAGTGGAACCGTGCGG-------------AAGCGCCTCTAACAATACAATTTGTATGTTAGTGGTGCTTTTTTG
MN  hisC     -> 46  CGTTA  50  AATGAAGGTGGAACCACGTGTGT---------GTCAGCGTCCTTGCAAGTTTTTTGCAAGGGCGCTTTTTTGAATAGT
DHA pheA     -> 41  CGTTA  50  AAAAAGGGTGGTACCGCGTGAC---------TTAACTCGTCCCTTATTTGGGGGTGAGGTAAGTCTTTTTTTATTTA
HD  serA     -> 42  cgtta  57  AATGAGGGTGGCACCGCGGTATG-------AACCTTCCGCCCCTCACGACAGTCGTCGTGTGGGCAGAAGGTTTTTTTACTAT
BQ  phhA     -> 51  CGTTA  34  AAATAGGGTGGTACCGCGATTC------------TTTCGCCCCTATCGGATTTTCCGATAGGGGCTTTTTCTATTTC
EF  yxjH     -> 40  CGTTA  51  AAAAAAGGTGGTACCGCGATAA-----------TAATCGCCCTTTTACTAGTTACGGCTAGTAAAAGGGCGTTTTTTTATAAA

Amino acid transporters
CA  yckK     -> 38  CGTTA  57  AATTAGAGTGGTACCGTGGAATT-------CAACTTCTGCCTCTAACTATGAGGATAGAAGTTTTTTGTTTTTAT
DF  yqiX     -> 41  CCTTA  30  AAAAAGAGTGGTAACGCGGATAT----------AATTCGTCTCTTAGCTGTAAAGCTAAGGGACTTTTTTGATTTA
HD  BH0807   -> 74  TGTTA  56  AACTGGGGTGGCACCACGACAAG----------TGATCGTCCCCAAGACTTTTATCAGTCTTGGGGACGTTTTTTTGTTCAT
EF  yheL     ->  8  AATTA  33  AATTAAGGTGGTACCGCGGAGA-----------GATTCGTCCTTATTCTTTAAGGATGAATCTCTCTTTTTATGTAGC
BQ  ykbA     -> 46  CGTTA  45  AACAAGGGTGGAACCACGAATAT--------AACACTCGTCCCTTTTTTAGGGAGGAGTGTTTTTTTATT
BQ  sdt2     -> 40  CGTTA  56  AATTGAGGTGGTACCACGGTATTAACATTACATATATCGTCCTCTACATGCATATTTGCGTGTAGGGGACTTTTTTATTTTC
EF  yusC     -> 42  CGTTA  60  AATTAAGGTGGTATCACGAAATGA-----CAAACTTTCGTCCTTTTTGCTGTAATAGCAAAAGGATGGAAGTTTTTTTGTTT
CA  yhaG     -> 48  CGTTA  51  AATTTAGGTGGTACCGCGGAAGT---------ATCTCCGTCCTAATTAATAAGATTAGGGCGGAGTTTTTTATTTGC
BQ  brnQ     -> 44  CGTTA  66  AATTAGGGTGGTATCGCGGGTAAA------TATAACTCGTCCCTTTCTTTAGGGACGAGTTTTTTGTGTTCTT
    REF01723 -> 44  CGTTA  55  AATTGAGGTGGCACCACGAATGC----------GATTCGTCCTCTTGGCTCACAGCCAAGAGGCTTTTTTGTTTTTTTAATA
BS  yvbW     -> 56  CGTTA  32  AACAAGAGTGGTACCGCGGTCAGC--CGAAGGCTCGTCGTCTCTTTATCTATTAGATTAGGTAGGAGACGGCGGGCTTTTTT
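The rows of the partial alignment share a conserved core around GTGG near the 5' end of the antiterminator. As a rough illustration of how such a core could be located in raw sequence, the sketch below scans for a degenerate pattern. The pattern `[AG]TGG.A.C.[CT]G` was read off the rows shown here by eye and is an assumption, not the search profile used in the talk; the names `TBOX_CORE` and `find_tbox_cores` are hypothetical.

```python
import re

# Hypothetical degenerate pattern, derived by eye from the alignment rows.
# It is an illustration only, not a validated T-box model.
TBOX_CORE = re.compile(r"[AG]TGG[ACGT]A[ACGT]C[ACGT][CT]G")


def find_tbox_cores(sequence):
    """Return (position, matched_text) pairs for the candidate core pattern.

    Alignment gap characters ('-') are removed before searching, so the
    positions refer to the ungapped sequence (0-based).
    """
    ungapped = sequence.upper().replace("-", "")
    return [(m.start(), m.group()) for m in TBOX_CORE.finditer(ungapped)]


if __name__ == "__main__":
    # serS row (SA) from the partial alignment.
    row = "AAATAGGGTGGCAACGCGTAGAC------------CACGTCCCTTGTAGGGATGTGGTCTTTTTTTA"
    for position, core in find_tbox_cores(row):
        print(position, core)
```

A real search would also need to check the terminator and antiterminator hairpins, which a flat pattern like this cannot do.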
… continued (in the 5’ direction)
specifier hairpin
===>
==>
===>
<=== <==
anti-anti
(specifier)
codon
SC<===
SA
DHA
ST
CA
DF
PN
MN
DF
HD
DF
ZC
BQ
MN
MN
ST
SERS
tyrZ
trpS
ASPS
VALS
THRS
ileS
leuS
ARGS
proS
lysS
metS
pheS
glyQ
alaS
SER
Tyr
Trp
ASP
VAL
THR
Ile
Leu
ARG
Pro
Lys
Met
Phe
Gly
Ala
---GTAGGACAAGTA
----AAGAACAAGTA
---ATTAGAAGAGTA
-----GAGAAAAGTA
-GAAGAAGAGGAGTA
----AGAGACAAGTC
----CAAAAACACAA
----CTAGAGCAGTA
-----TGGGAGAGTA
---AAAGAAATAGTA
---AAGAGAAGAGTA
---AAAGGAAAAGTA
----TGAGATTAGTA
---AGAAAGAGAGTT
-AGTTAAGAATTGTT
19
18
16
18
16
18
17
19
20
18
19
19
18
15
17
AGAGAGCTTGTGGTT---AGTGTGAACAAG--AGAAAGTTGCCGGCT---GATGAGAGGCGCTT
AGAGAGTTAGTGGTT---GGTGCAAGCTAACAGCGAATTGGGAAAT---GGTGTGAGCCCAAAGAGAGGAAAATTCACTGGCTGTAAGATTTTC
AGAGAGTGCGTGGTT---GCTGGAAACGCATAGCGAATAGGTGAT----GGTGTAAGACCTATT
AGAGGAAGTGGAA-----GGTGAGAACTAATATT
AGCGAGTCGGGAT-----GGTGGGAGCCGATAGAGAGAAAACGGT----GGTGAGAGTTTTC-AGAGAGCTCTGGTA----GCTGAGAAAGAGC-AGAGAGCTTCGGTA----GCTGAGAAGAAGC-AGGGAATGCGGGGCGTG-ACTGGAAACCCGCAGCGAACCTGAGAG----AGTGTAAGTCAGGT
AGAAAAGTGACGGTT---GCTGCGAGTCATT-
15
18
12
15
17
14
18
10
14
14
15
14
16
14
17
GAA--TCTACCTACTT
GAA--TACCTCTTTGA
GAAA-TGGACTAATGA
GAAA-GACATCTCGGA
GAAT-GTAGCTTTGGA
GAT--ACTACTCTTGA
-----ATCATTTTGTT
GAA--CTTACTAGATT
GAAA-CGCACCCATGA
GAA--CCTGTCTTTTA
GAAAAAAGACTTGGAG
GAACAATGGCCTTTGA
GAA--TTCACTCAGAA
GACT-GGCACTTTCTC
-----GCTACTTAACT
->
->
->
->
->
->
->
->
->
->
->
->
->
->
->
Amino acid
biosynthetic
genes
SA
BS
CA
BQ
BS
SA
MN
DHA
HD
BQ
EF
trpE
ilvB
ilvC
asnA
proB
cysE
hisC
pheA
serA
phhA
yxjH
Trp
Leu
Val
Asn
Pro
Cys
His
Phe
Ser
Tyr
Met
TCTAAAGAAATAGTA
---TGAGGATAAGTA
-----AGGAAGAGTA
--AGGACGAGTAGTA
-----AGGATTAGTA
--CGAAGGATTAGTA
-----AGAGAAAAAA
-----AAAGAGAGCA
----GAAGATGAGGA
AGAATCGCAGTAGTA
-----TAGGAAAGTA
22
20
17
15
18
18
16
19
17
17
17
AGAAAGCTAATGGGT---GATGGGAATTAGC-AGAGAACCGGGTTA----GCTGAGAACCGG--AGAGAGTGAGATACT---GGTGGGAACTCAT-AGCGAGTCAGGGGT----GGTGTGAGCCTGA-AGAGAGCAAAATGAACC-GCTGAAACATTTTGC
AGAGAGTGTACGGTT---GCTGTGAGTACA--AGAGAGTATGGGAA----GCTGAAAACATAC-AGGGAACTAAAGTCGGAGACTGAAAGCTTTAGT
AGAGAGCTGGTGGTT---GCTGTGAACCAGCTAGAGAGCTAATGGTC---GGTGGAAATTGGC-AGAGAGACTTTGGTT---GGTGAAAAAAGTT--
14
16
13
15
15
14
15
14
18
14
13
GAAT-TGGACTTTGGA
GAA--CTCGCCTCAGA
GAAG-GTAGCCTTTGA
GAAG-AACCTCCTGGA
GAA--CCTGCCTTGGA
GAA--TGCACCTTCGT
-----CACATTCTTGA
GAGA-TTCACTCTGGA
-----AGCCCTTCTGA
GAAT-TACAATTCTGG
GAAAAATGGCCTAGGA
->
->
->
->
->
->
->
->
->
->
->
Amino acid
transporters
CA yckK
DF yqiX
HD BH0807
EF yheL
BQ ykbA
BQ sdt2
EF yusC
CA yhaG
BQ brnQ
REF01723
BS yvbW
Cys
Arg
Lys
Tyr
Thr
Trp
Met
Trp
Ile
His
Leu
----AAGAACCAGTA
-----AGAGAAAGTA
----AGAGAAGAGTA
-TTATTAGCCCAGTA
--GAGGACACGATCA
---GCAAGAAGAGTA
----AAAGAAGAGTA
----AAGGAAGAGTA
----GAGAACGAGTA
--TTAGGACATAGTA
-----GGGAGCAGTA
17
16
19
19
16
18
18
18
19
18
18
AGAGAAAAATCTCCAAG-GCTGAAAGGGATTTT
AGCGAGTTAGGGGTT---GGTGTAAGCCTAGCAGAAAGCCTGTAGTT---GCTGAGAACGGGT-AGAAAGTCGATGGTT---GCTGCGAATCGAT-AGAGAGGGAAGCCTTTG-GCTGTGAGCTTCCTAGAGAGCTGGGGGAA---GGTGTGAGCCCGGTAGAGAGCCCTGTTT----GCTGAGAATGGG--AGAGAGCTGAGGGT----GGTGTGATCTCAGTAGAGAGTTGGCGATTT--GCTGAAAGCCAAC-AGAGACTTTTTCATTG--GCTGAAAGAAAAAGAGAGAGCTGCGGGGT---GGTGCGACGCAGC--
15
14
14
13
14
15
16
15
15
17
13
GAA--TGCATCTTTGA
GAAG-AGAGCTCTGGA
GAAGCAAGACTCTGAG
GAAT-TACACTAATAA
GATT-ACCACCTCTGA
GAA--TGGGCTTGCGA
GAAG-ATGGTCTTTGA
GAA--TGGACCTTTTA
GAAA-ATCATCTCCGA
-----CACACCTAAAA
GAA--CTCGCCCGGGA
->
->
->
->
->
->
->
->
->
->
->
Aminoacyl-tRNA synthetases
Why T-boxes?
• May be easily identified
• In most cases functional specificity may
be reliably predicted by the analysis of
the specifier codons (anti-anti-codons)
• Sufficiently long to retain phylogenetic
signal
=> T-boxes are a good model of
regulatory evolution
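To illustrate the second point: once the specifier codon has been located, predicting the amino-acid specificity amounts to reading that codon through the standard genetic code. The sketch below is a minimal Python version under that assumption. The function name `predict_specificity` is hypothetical, and locating the codon within the specifier hairpin is not covered.

```python
from itertools import product

# Standard genetic code, written in the conventional TCAG base order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def predict_specificity(specifier_codon):
    """Return the one-letter amino acid encoded by a specifier codon.

    Accepts DNA or RNA letters in either case; '*' denotes a stop codon.
    """
    codon = specifier_codon.upper().replace("U", "T")
    if codon not in CODON_TABLE:
        raise ValueError(f"not a valid codon: {specifier_codon!r}")
    return CODON_TABLE[codon]


if __name__ == "__main__":
    # Specifier codons named on later slides of this talk.
    for codon in ("AAC", "GAC", "CAC", "ATC", "CTC", "GTC"):
        print(codon, predict_specificity(codon))
```

Applied to the codons named later in the talk, this gives N for AAC, D for GAC, H for CAC, I for ATC, L for CTC and V for GTC.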
805 T-boxes in 96 bacteria
• Firmicutes
– aa-tRNA synthetases
– enzymes
– transporters
– all amino acids excluding glutamate
• Actinobacteria (regulation of translation – predicted)
– branched chain (ileS)
– aromatic (Atopobium minutum)
• Delta-proteobacteria
– branched chain (leu – enzymes)
• Thermus/Deinococcus group (aa-tRNA synthetases)
– branched chain (ileS, valS)
– glycine
• Chloroflexi, Dictyoglomi
– aromatic (trp – enzymes)
– branched chain (ileS)
– threonine
Double and partially double T-boxes
• TRP: trp operon (Bacillales, C. beijerinckii, D. hafniense)
• TYR: pah (B. cereus)
• THR: thrZ (Bacillales); hom (C. difficile)
• ILE: ilv operon (B. cereus)
• LEU: leuA (C. thermocellum)
• ILE-LEU: ilvDBNCB-leuACDBA (Desulfotomaculum reducens)
• TRP: trp operon (T. tengcongensis)
• PHE: arpLA-pheA (D. reducens, S. wolfei)
• PHE: trpXY2 (D. reducens)
• PHE: yngI (D. reducens)
• TYR: yheL (B. cereus)
• SER: serCA (D. hafniense)
• THR: thrZ (S. uberis)
• THR: brnQ-braB1 (C. thermocellum)
• HIS: hisXYZ (Lactobacillales)
• ARG: yqiXYZ (C. difficile)
Predicted regulation of translation:
ileS in many Actinobacteria
• Instead of the terminator, the sequester hairpin
(hides the translation initiation site)
• Same mechanism regulates different processes –
cf. riboswitches
A new type of translational T-boxes
in Actinobacteria
• Shorter specifier hairpin
• Anti-anti-codon in the “head”
loop, not a bulge loop
• A majority of cases (all
except Streptomyces spp.)
Same enzymes – different regulators
(common part of the aromatic amino acids biosynthesis pathway)
[Figure: pathway diagram. Metabolites: PEP, E4P, DAHP, SHIKIMATE, CHORISMATE, ANTHRANILATE, ADC, FOLATE, kynurenine pathway, TRP, TYR, PHE. Genes: aroA, aroB, aroC, aroD, aroI, aroE, aroF, aroH, pabA, pabB, trpE, trpG, trpDCFBA, pheB, tyrA, hisC, aspB, phhA.]
aro:
• Regulated by TYR (BC)
• Regulated by PHE (SWO, DRE, HMO, CH, MTH, CTH)
• Regulated by TRP (DE, DEH)
Also labelled: yhaG; TRP trpXYZ; TRP\PHE yocR family; TYR yheL
cf. E. coli: aroF,G,H: feedback inhibition by TRP, TYR, PHE; transcriptional regulation by TrpR, TyrR
Recent duplications and bursts:
ARG-T-box in Clostridium difficile
[Figure: phylogenetic tree of ARG-specific T-boxes. Labelled groups: Lactobacillales (argS), Clostridiales (argS), Bacillales (argS). Clostridium difficile leaves: CDF_YQIXYZ, RDF02391, CDF_ARGC, CDF_ARGH; two branches are marked NEW. Gene labels for Clostridium difficile: yqiXYZ, RDF02391, argCJBDF, argH, argG, grouped as predicted amino acid transporters, amino acid biosynthetic genes and others.]
Legend: ARG-specific T-box regulatory site; aminoacyl-tRNA synthetase; biosynthetic genes; amino acid transporters
… caused by loss of transcription factor AhrC
Gram+ bacteria:
• AhrC regulatory protein (negative regulation of arginine metabolism, positive regulation of arginine catabolism)
• Binding to 5’ UTR gene region: regulation of gene expression
Clostridium difficile:
• AhrC is lost
• Expansion of T-box regulon: regulation of expression of arginine biosynthetic and transport genes by T-box antitermination
[Figure: operon diagrams for other clostridia spp. (CA, CTC, CTH, CPE, CB, CPE) and for Clostridium difficile, showing yqiXYZ, argC, argH and argG with AhrC binding sites and ARG-specific T-box regulatory sites marked.]
Duplications and changes in specificity: ASN/ASP/HIS T-boxes
Rapid mutation of regulatory codons
[Figure: phylogenetic tree of ASN-, ASP- and HIS-specific T-boxes. Labelled groups: Bacillales, other Gram+, Lactobacillales, Clostridiales, P. pentosaceus, L. reuteri, L. johnsonii. Labelled genes: hisS aspS, his operon, hisXYZ, asnS, asnA, aspS, hisS, glnQHMP. Specifier codons shown: ASP GAC, ASN AAC. One branch is marked NEW.]
Blow-up 1
[Figure: enlarged Lactobacillales part of the ASN/ASP/HIS T-box tree. Labelled groups: Lactobacillales (hisXYZ; asnA; hisS aspS), P. pentosaceus (asnS), L. reuteri (aspS, hisS), L. johnsonii (asnA, glnQHMP). Specifier codons shown: ASN AAC, HIS CAC, ASP GAC.]
disruption of hisS-aspS operon
mutation of regulatory codon
Blow-up 2. Prediction
Regulators lost in lineages with expanded HIS-T-box regulon??
… and validation
• conserved motifs upstream of HIS biosynthesis genes
[Figure: motifs shown for Bacillales (his operon), Clostridiales, Thermoanaerobacteriales, Halanaerobiales]
• candidate transcription factor yerC co-localized with the his genes
• present only in genomes with the motifs upstream of the his genes
• genomes with neither YerC motif nor HIS-T-boxes: attenuators
New histidine transporters
• hisXYZ
(ATP-binding cassette (ABC) superfamily)
Firmicutes
• yuiF
(Na+/H+ antiporter, NhaC family)
Bacillales, some Clostridiales
(regulated by his-attenuator in Haemophilus influenzae)
• Cphy_3090
(SSS sodium solute transporter superfamily)
Clostridiales, Thermoanaerobacteriales, Halanaerobiales
The evolutionary history of the his
genes regulation in the Firmicutes
More duplications:
THR-T-box in C. difficile and B. cereus
[Figure: phylogenetic tree of THR-specific T-boxes. Labelled groups: Bacillales (thrS, thrZ), Lactobacillaceae and Leuconostocaceae (thrS), Clostridiales (thrS, thrZ), Streptococcaceae (thrS). B. cereus and C. difficile are highlighted, with the gene labels thrZ, hom, thrCB and brnQ.]
Legend: THR-specific T-box regulatory site; aminoacyl-tRNA synthetase; biosynthetic genes; amino acid transporters; others
Duplications and changes in specificity:
branched-chain amino acids
[Figure: phylogenetic tree of LEU-, ILE- and VAL-specific T-boxes. Labelled groups: Firmicutes, Bacillales, Lactobacillales, Lactobacillaceae, Clostridiaceae, δ-proteobacteria. Labelled species: C. thermocellum, B. cereus, Clostridium difficile, C. acetobutylicum, Desulfitobacterium hafniense, Desulfotomaculum reducens, Syntrophomonas wolfei, Oceanobacillus iheyensis, B. subtilis, B. licheniformis, Lactobacillus casei, Lactobacillus plantarum, Lactobacillus johnsonii, Lactobacillus reuteri, Heliobacillus mobilis, Carboxydothermus hydrogenoformans. Labelled genes: leuS, ileS, valS, ilv operon, leu operon, ilvC, ilvCB, ilvBN, brnQ, yvbW, lp3666, opp, panE.]
T-box duplication and mutation of regulatory codon (ATC, CTC, GTC)
Recent T-box duplication and mutation of regulatory codon (ILE, LEU; ATC, CTC)
Blow-up
transporter: ATC, GTC
dual regulation of common enzymes: ATC, CTC
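The specifier codons named in these slides (ATC, CTC and GTC for the branched-chain amino acids; AAC, GAC and CAC for ASN, ASP and HIS) differ from one another by single nucleotides, which is why one point mutation in a duplicated T-box can switch its specificity. The sketch below makes that check explicit with a Hamming distance. The dictionary of codons is taken from the slides, while the function names are hypothetical.

```python
from itertools import combinations

# Specifier codons and specificities named on the slides.
SPECIFIER_CODONS = {
    "ATC": "ILE",
    "CTC": "LEU",
    "GTC": "VAL",
    "AAC": "ASN",
    "GAC": "ASP",
    "CAC": "HIS",
}


def hamming(a, b):
    """Number of positions at which two equal-length codons differ."""
    if len(a) != len(b):
        raise ValueError("codons must have the same length")
    return sum(x != y for x, y in zip(a, b))


def one_step_switches(codon_to_aa):
    """List codon pairs that are exactly one point mutation apart."""
    return [
        (c1, codon_to_aa[c1], c2, codon_to_aa[c2])
        for c1, c2 in combinations(codon_to_aa, 2)
        if hamming(c1, c2) == 1
    ]


if __name__ == "__main__":
    for c1, aa1, c2, aa2 in one_step_switches(SPECIFIER_CODONS):
        print(f"{aa1} ({c1}) <-> {aa2} ({c2})")
```

The pairs reported include ILE/LEU (ATC/CTC) and ILE/VAL (ATC/GTC), the two switches shown in the blow-up.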
Summary / History
Other results
• Bacteria (comparative genomics of regulation)
– Reconstruction of metabolic pathways and their regulation
• niacin
• ethanolamine
– Prediction of regulation
• cysteine and methionine pathways in the Streptococcus spp.
• radiation resistance in the Deinococcus spp.
– Identification and experimental validation (collaborators) of
a new class of transporters with shared ATP-dependent
energizing modules
– Identification of new microcins
– Analysis of co-evolution of transcription factors and their
binding motifs
• Eukaryotes (alternative splicing)
– Evolution of the exon-intron structure and alternative
splicing in the Drosophila spp. and in mammals
• estimates of the rate of intron and exon gain and loss
– Proof of positive selection in minor-isoform alternative
regions of human genes
Acknowledgements
• Alexei Vitreschak
• Ekaterina Ermakova
• Alexei Kazakov
• Marat Kazanov
• Galina Kovaleva
• Andrei Mironov
• Ramil Nurtdinov
• Mikhail Pyatnitsky
• Alexandra Rakhmaninova
• Dmitry Ravcheev
• Valery Sorokin
• Olga Tsoy
• Anna Gerasimova (Ann Arbor)
• Olga Kalinina (Heidelberg)
• Dmitry Rodionov (La Jolla)
• Thomas Eitinger (Berlin)
• Dmitry Malko (Moscow)
• Andrey Osterman (La Jolla)
• Vasily Ramensky (Moscow)
• Konstantin Severinov (Moscow)
• HHMI
• RFBR
• RAS (program “Molecular and
Cellular Biology”)