Evolution of bacterial regulatory systems
Mikhail Gelfand
Research and Training Center “Bioinformatics”
Institute for Information Transmission Problems
Moscow, Russia
January 2008

Plan
• Individual sites
• Transcription factors and their binding signals
• Regulatory systems and regulons

Birth and death of sites is a very dynamic process
NadR-binding sites upstream of pnuB seem absent in Klebsiella pneumoniae and Serratia marcescens …
… but there are candidate sites further upstream …
… and they are clearly different (not simply misaligned).

Cryptic sites and loss of regulators
Loss of RbsR in Y. pestis (ABC-transporter also is lost)
[Figure labels: RbsR binding site; start codon of rbsD]

Unexpected conservation of non-consensus positions in orthologous sites
Regulatory site of LexA upstream of lexA (consensus nucleotides are in caps)
Escherichia coli        TgCTGTATATActcACAGcA
Salmonella typhi        aACTGTATATActcACAGcA
Yersinia pestis         agCTGTATATActcACAGcA
Haemophilus influenzae  atCTGTATAcAatacCAGTt
Pasteurella multocida   TtCTGTATATAataACAGTt
Vibrio cholerae         cACTGgATATActcACAGTc
Wrong consensus?
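
The observation about the LexA sites can be checked directly from the six sequences on the slide. The following is a minimal Python sketch, not the method used in the talk: it computes per-column nucleotide counts and Shannon entropy, then lists the columns that are written in lower case (non-consensus) in every genome yet carry the same base everywhere. The function names are illustrative assumptions.

```python
import math
from collections import Counter

# LexA sites upstream of lexA, copied from the slide.
# Upper case marks consensus nucleotides, lower case non-consensus ones.
LEXA_SITES = {
    "Escherichia coli": "TgCTGTATATActcACAGcA",
    "Salmonella typhi": "aACTGTATATActcACAGcA",
    "Yersinia pestis": "agCTGTATATActcACAGcA",
    "Haemophilus influenzae": "atCTGTATAcAatacCAGTt",
    "Pasteurella multocida": "TtCTGTATATAataACAGTt",
    "Vibrio cholerae": "cACTGgATATActcACAGTc",
}


def column_counts(sites):
    """Case-insensitive nucleotide counts for every alignment column."""
    seqs = [s.upper() for s in sites]
    width = len(seqs[0])
    if any(len(s) != width for s in seqs):
        raise ValueError("sites must have equal length")
    return [Counter(s[i] for s in seqs) for i in range(width)]


def shannon_entropy(counts):
    """Shannon entropy (bits) of one column; 0 means fully conserved."""
    total = sum(counts.values())
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


def invariant_columns(sites):
    """0-based indices of columns with a single base in all sites."""
    return [i for i, c in enumerate(column_counts(sites)) if len(c) == 1]


def conserved_nonconsensus_columns(sites):
    """Invariant columns that are lower case (non-consensus) in every site."""
    sites = list(sites)
    return [i for i in invariant_columns(sites)
            if all(s[i].islower() for s in sites)]


if __name__ == "__main__":
    sites = list(LEXA_SITES.values())
    for i, counts in enumerate(column_counts(sites), start=1):
        print(i, dict(counts), round(shannon_entropy(counts), 2))
    print("conserved non-consensus columns (0-based):",
          conserved_nonconsensus_columns(sites))
```

On these six sites the only column that is non-consensus everywhere and still invariant is the 13th (index 12), which is t in all genomes.
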

TF PurR, gene purL
Escherichia coli        ACGCAAACGgTTtCGT
Salmonella typhi        ACGCAAACGgTTtCGT
Yersinia pestis         ACGCAAACGgTTtCGT
Haemophilus influenzae  AtGCAAACGTTTGCtT
Pasteurella multocida   ACGCAAACGTTTtCGT
Vibrio cholerae         ACGCAAACGgTTGCtT

TF PurR, gene purM
Escherichia coli        tCGCAAACGTTTGCtT
Salmonella typhi        tCGCAAACGTTTGCtT
Yersinia pestis         tCGCAAACGTTTGCcT
Haemophilus influenzae  tCGCAAACGTTTGCtT
Pasteurella multocida   tCGCAAACGTTTGCtT
Vibrio cholerae         ACGCAAACGTTTtCcT

Non-consensus positions are more conserved than synonymous codon positions

Regulators and their motifs
• Cases of motif conservation at surprisingly large distances
• Subtle changes at close evolutionary distances
• Correlation between contacting nucleotides and amino acid residues
• Changes in symmetry patterns

NrdR (regulator of ribonucleotide reductases and some other replication-related genes): conservation at large distances

DNA motifs and protein-DNA interactions
Entropy at aligned sites and the number of contacts (heavy atoms in a base pair at a distance < cutoff from a protein atom)
[Figure panels: CRP, PurR, IHF, TrpR]

The LacI family: subtle changes in motifs at close distances
[Figure: motif logos; surviving letters: G A CG Gn GC n]

Specificity-determining positions in the LacI family
Training set: 459 sequences, average length 338 amino acids, 85 specificity groups
– 44 SDPs
10 residues contact NPF (analog of the effector)
7 residues in the effector contact zone (5 Å < dmin < 10 Å)
6 residues in the intersubunit contacts
5 residues in the intersubunit contact zone (5 Å < dmin < 10 Å)
7 residues contact the operator sequence
6 residues in the operator contact zone (5 Å < dmin < 10 Å)
[Figure: LacI from E. coli]

The CRP/FNR family of regulators
CooA (Desulfovibrio)  TGTCGGCnnGCCGACA
CRP (Gamma)           TTGTGAnnnnnnTCACAA
FNR (Gamma)           TTGATnnnnATCAA
HcpR (Desulfovibrio)  TTGTgAnnnnnnTcACAA

Correlation between contacting nucleotides and amino acid residues
• CooA in Desulfovibrio spp.
• CRP in Gamma-proteobacteria
• HcpR in Desulfovibrio spp.
• FNR in Gamma-proteobacteria

Genome  Factor  Sequence
DD      COOA    ALTTEQLSLHMGATRQTVSTLLNNLVR
DV      COOA    ELTMEQLAGLVGTTRQTASTLLNDMIR
EC      CRP     KITRQEIGQIVGCSRETVGRILKMLED
YP      CRP     KXTRQEIGQIVGCSRETVGRILKMLED
VC      CRP     KITRQEIGQIVGCSRETVGRILKMLEE
DD      HCPR    DVSKSLLAGVLGTARETLSRALAKLVE
DV      HCPR    DVTKGLLAGLLGTARETLSRCLSRMVE
EC      FNR     TMTRGDIGNYLGLTVETISRLLGRFQK
YP      FNR     TMTRGDIGNYLGLTVETISRLLGRFQK
VC      FNR     TMTRGDIGNYLGLTVETISRLLGRFQK

Motifs (in the same order: CooA, CRP, HcpR, FNR)
TGTCGGCnnGCCGACA
TTGTGAnnnnnnTCACAA
TTGTgAnnnnnnTcACAA
TTGATnnnnATCAA

Contacting residues: REnnnR
TG: 1st arginine
GA: glutamate and 2nd arginine

The correlation holds for other factors in the family

NrtR (regulator of NAD metabolism): systematic search for correlated positions
• analysis of correlated positions in proteins and sites
• analysis of specificity-determining positions
• the same positions in one alpha-helix identified
• plans for experimental verification

NiaR: changed dimer structure?

The GalR family and C-proteins of RM systems: direct and inverted repeats

BirA: changed spacing

What are the events leading to the present-day state?
• Expansion and contraction of regulons
• New regulators (where from?)
• Duplications of regulators with or without regulated loci
• Loss of regulators with or without regulated loci
• Re-assortment of regulators and structural genes
• … especially in complex systems
• Horizontal transfer

Trehalose/maltose catabolism in alpha-proteobacteria
Duplicated LacI-family regulators: lineage-specific post-duplication loss
The binding motifs are very similar (the blue branch is somewhat different: to avoid cross-recognition?)
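
The CRP/FNR slide states a rule (TG goes with the 1st arginine; GA goes with the glutamate and the 2nd arginine) that can be verified against the ten listed sequences. The sketch below is one way to do that, under two assumptions that are not stated in the talk: the palindromic motifs are aligned by their center of symmetry, so that TG sits at positions 8-7 and GA at positions 5-4 counted leftward from the center, and the REnnnR residues sit at 0-based offsets 14, 15 and 19 of the listed fragments (read off the CRP fragment ...SRETVGR...). All helper names are hypothetical.

```python
# Helix-region fragments (genome code, factor, sequence) as listed on the slide.
SEQUENCES = [
    ("DD", "COOA", "ALTTEQLSLHMGATRQTVSTLLNNLVR"),
    ("DV", "COOA", "ELTMEQLAGLVGTTRQTASTLLNDMIR"),
    ("EC", "CRP", "KITRQEIGQIVGCSRETVGRILKMLED"),
    ("YP", "CRP", "KXTRQEIGQIVGCSRETVGRILKMLED"),
    ("VC", "CRP", "KITRQEIGQIVGCSRETVGRILKMLEE"),
    ("DD", "HCPR", "DVSKSLLAGVLGTARETLSRALAKLVE"),
    ("DV", "HCPR", "DVTKGLLAGLLGTARETLSRCLSRMVE"),
    ("EC", "FNR", "TMTRGDIGNYLGLTVETISRLLGRFQK"),
    ("YP", "FNR", "TMTRGDIGNYLGLTVETISRLLGRFQK"),
    ("VC", "FNR", "TMTRGDIGNYLGLTVETISRLLGRFQK"),
]

# Binding motifs per factor, as listed on the slide.
MOTIFS = {
    "COOA": "TGTCGGCnnGCCGACA",
    "CRP": "TTGTGAnnnnnnTCACAA",
    "HCPR": "TTGTgAnnnnnnTcACAA",
    "FNR": "TTGATnnnnATCAA",
}

# Assumed 0-based offsets of the R, E and second R of the REnnnR pattern.
FIRST_ARG, GLU, SECOND_ARG = 14, 15, 19


def base_from_center(motif, k):
    """Base k positions left of the palindrome center (k=1 is adjacent).

    Returns 'N' when the motif is too short to reach position k.
    """
    idx = len(motif) // 2 - k
    return motif[idx].upper() if idx >= 0 else "N"


def dinucleotide(motif, k_outer):
    """Two adjacent bases at positions k_outer and k_outer - 1 from the center."""
    return base_from_center(motif, k_outer) + base_from_center(motif, k_outer - 1)


def check_rule():
    """One row per sequence: (genome, factor, TG?, Arg1?, GA?, Glu+Arg2?)."""
    rows = []
    for genome, factor, seq in SEQUENCES:
        motif = MOTIFS[factor]
        has_tg = dinucleotide(motif, 8) == "TG"
        has_ga = dinucleotide(motif, 5) == "GA"
        has_arg1 = seq[FIRST_ARG] == "R"
        has_glu_arg2 = seq[GLU] == "E" and seq[SECOND_ARG] == "R"
        rows.append((genome, factor, has_tg, has_arg1, has_ga, has_glu_arg2))
    return rows


if __name__ == "__main__":
    for row in check_rule():
        print(*row)
```

Under these assumptions every row is consistent: CooA has TG and the 1st arginine but neither GA nor the glutamate and 2nd arginine; FNR has GA with the glutamate and 2nd arginine but lacks both TG and the 1st arginine; CRP and HcpR have both pairs. Ten sequences from four factors are far too few to establish a correlation, so this only confirms that the listed data agree with the stated rule.
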

Utilization of an unknown galactoside in gamma-proteobacteria
Yersinia and Klebsiella: two regulons, GalR and LacI-X
Erwinia: one regulon, GalR
Loss of regulator and merger of regulons: it seems that lacI-X was present in the common ancestor (Klebsiella is an outgroup)

Utilization of maltose/maltodextrin in Firmicutes
Displacement: invasion of a regulator from a different subfamily (horizontal transfer from a related species?) – blue sites

Orthologous TFs with completely different regulons (alpha-proteobacteria and Xanthomonadales)

Catabolism of gluconate in proteobacteria
Extreme variability of the regulation of “marginal” regulon members
[Figure labels: β; Pseudomonas spp.; γ]

Regulation of amino acid biosynthesis in Firmicutes
• Interplay between regulatory RNA elements and transcription factors
• Expansion of T-box systems (normally – RNA structures regulating aminoacyl-tRNA-synthetases)

Three regulatory systems for the methionine biosynthesis
A. SAM-dependent riboswitch
B. Met-T-box
C. MtaR: repressor of transcription

Methionine regulatory systems: loss of S-box regulons
• S-boxes (SAM-1 riboswitch)
  – Bacillales
  – Clostridiales
  – the Zoo:
    • Petrotoga
    • actinobacteria (Streptomyces, Thermobifida)
    • Chlorobium, Chloroflexus, Cytophaga
    • Fusobacterium
    • Deinococcus
    • proteobacteria (Xanthomonas, Geobacter)
• Met-T-boxes (Met-tRNA-dependent attenuator) + SAM-2 riboswitch for metK
  – Lactobacillales
• MET-boxes (candidate transcription signal)
  – Streptococcales
[Tree labels: ZOO, Lact., Strep., Bac., Clostr.]

Recent duplications and bursts: Arg-T-box in Clostridium difficile
[Figure: tree of ARG-specific T-box regulatory sites]
Leaves: LR_ARGS, CPE_ARGS, CAC_ARGS, CB_ARGS, CBE_ARGS, CTC_ARGS, LP_ARGS, LME_ARGS, LJ_ARGS, CDF_YQIXYZ, LGA_ARGS, RDF02391, PPE_ARGS, LSA_ARGS, CDF_ARGC, BC_ARGS2, EF_ARGS, BH_ARGS, CDF_ARGH
Clade labels: Lactobacillales, Clostridiales, Bacillales (argS)
Legend: aminoacyl-tRNA synthetase; biosynthetic genes; amino acid transporters; two entries marked NEW
Clostridium difficile genes with an ARG-specific T-box regulatory site: yqiXYZ, RDF02391, argCJBDF, argH, argG, others
Gene categories: predicted amino acid transporters; amino acid biosynthetic genes

… following transcription factor loss
Gram+ bacteria:
  AhrC regulatory protein (negative regulation of arginine metabolism, positive regulation of arginine catabolism)
  Binding to 5’ UTR gene region
  Regulation of gene expression
Clostridium difficile:
  AhrC is lost
  Expansion of T-box regulon
  Regulation of expression of arginine biosynthetic and transport genes by T-box antitermination
[Figure: other clostridia spp. (CA, CTC, CTH, CPE, CB, CPE) compared with C. difficile; genes yqiXYZ, argC, argH, argG; legend: AhrC binding site, ARG-specific T-box regulatory site]

Regulon expansion, or how FruR has become CRA
• CRA (a.k.a. FruR) in Escherichia coli:
  – global regulator
  – well-studied in experiment (many regulated genes known)
• Going back in time: looking for candidate CRA/FruR sites upstream of (orthologs of) genes known to be regulated in E. coli

Common ancestor of gamma-proteobacteria
[Figure: pathway map with sugars Mannose, Glucose, Mannitol, Fructose and genes manXYZ, ptsHI-crr, edd, epd, eda, adhE, aceEF, mtlA, gapA, fbp, pykF, mtlD, fruBA, fruK, pfkA, pgk, gpmA, icdA, ppsA, pckA, aceA, tpiA, aceB. Lineage label: Gamma-proteobacteria]

Common ancestor of the Enterobacteriales
[Figure: same pathway map and gene set. Lineage labels: Gamma-proteobacteria, Enterobacteriales]

Common ancestor of Escherichia and Salmonella
[Figure: same pathway map and gene set. Lineage labels: Gamma-proteobacteria, Enterobacteriales, E. coli and Salmonella spp.]

Life without Fur

Regulation of iron homeostasis (the Escherichia coli paradigm)
Iron:
• essential cofactor (limiting in many environments)
• dangerous at large concentrations
FUR (responds to iron):
• synthesis of siderophores
• transport (siderophores, heme, Fe2+, Fe3+)
• storage
• iron-dependent enzymes
• synthesis of heme
• synthesis of Fe-S clusters
Similar in Bacillus subtilis

Regulation of iron homeostasis in α-proteobacteria
[Figure: regulatory scheme under [−Fe] and [+Fe] conditions. Transcription factors: RirA, Irr, Fur, IscR. Signals: Fe, FeS, heme, FeS status of cell; label "degraded". Regulated functions: siderophore uptake, Fe2+/Fe3+ uptake, iron uptake systems, iron storage ferritins, FeS synthesis, heme synthesis, iron-requiring enzymes (iron cofactor)]

Experimental studies:
• FUR/MUR: Bradyrhizobium, Rhizobium and Sinorhizobium
• RirA (Rrf2 family): Rhizobium and Sinorhizobium
• Irr (FUR family): Bradyrhizobium, Rhizobium and Brucella

Distribution of transcription factors in genomes
Search for candidate motifs and binding sites using standard comparative genomic techniques
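
A common way to search for candidate binding sites is to build a position weight matrix (PWM) from known sites and scan upstream regions with it. The talk does not say which scoring scheme was used, so the following is only a sketch under stated assumptions: a log-odds PWM with a pseudocount of 1 and a uniform background, trained on the six LexA sites from the earlier slide because they are the only full set of sites available here. The upstream sequence `UPSTREAM` is made up for illustration, and only one strand is scanned.

```python
import math

ALPHABET = "ACGT"

# Training sites: the six LexA sites from the slide, upper-cased.
SITES = [s.upper() for s in (
    "TgCTGTATATActcACAGcA",
    "aACTGTATATActcACAGcA",
    "agCTGTATATActcACAGcA",
    "atCTGTATAcAatacCAGTt",
    "TtCTGTATATAataACAGTt",
    "cACTGgATATActcACAGTc",
)]

# Hypothetical upstream region with the E. coli site embedded at offset 8.
UPSTREAM = "ACGTTAGC" + SITES[0] + "GGATCCAT"


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Log2-odds weights per column, smoothed with a pseudocount."""
    width = len(sites[0])
    pwm = []
    for i in range(width):
        column = [s[i] for s in sites]
        total = len(column) + pseudocount * len(ALPHABET)
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in ALPHABET
        })
    return pwm


def score(pwm, window):
    """Sum of column weights for a window of the same width as the PWM."""
    return sum(column[base] for column, base in zip(pwm, window))


def scan(pwm, sequence, threshold):
    """Return (start, window, score) for windows scoring at least threshold.

    The sequence is assumed to be upper-case and to contain only A, C, G, T.
    """
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        value = score(pwm, window)
        if value >= threshold:
            hits.append((start, window, value))
    return hits


if __name__ == "__main__":
    pwm = build_pwm(SITES)
    # Assumed cutoff: the lowest score among the training sites.
    threshold = min(score(pwm, s) for s in SITES)
    for start, window, value in scan(pwm, UPSTREAM, threshold):
        print(start, window, round(value, 2))
```

In a comparative setting the same scan would be run upstream of orthologous genes in each genome, and a candidate site would be retained when it recurs across related genomes; the cutoff used here is an arbitrary choice.
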

Regulation of genes in functional subsystems
[Figure panels: Rhizobiales; Bradyrhizobiaceae; Rhodobacteriales; the Zoo (likely ancestral state)]

Reconstruction of history
Frequent co-regulation with Irr
Strict division of function with Irr
Appearance of the iron-Rhodo motif

All logos and Some Very Tempting Hypotheses:
1. Cross-recognition of FUR and IscR motifs in the ancestor.
2. When FUR had become MUR, and IscR had been lost in Rhizobiales, emerging RirA (from the Rrf2 family, with a rather different general consensus) took over their sites.
3. Iron-Rhodo boxes are recognized by IscR: directly testable

Summary and open problems
• Regulatory systems are very flexible
  – easily lost
  – easily expanded (in particular, by duplication)
  – may change specificity
  – rapid turnover of regulatory sites
• With more stories like these, we can start thinking about a general theory
  – catalog of elementary events; how frequent?
  – mechanisms (duplication, birth e.g. from enzymes, horizontal transfer)
  – conserved (regulon cores) and non-conserved (marginal regulon members) genes in relation to metabolic and functional subsystems/roles
  – (TF family-specific) protein-DNA recognition code
  – distribution of TF families in genomes; distribution of regulon sizes; etc.

People
• Andrei A. Mironov – software, algorithms
• Alexandra Rakhmaninova – SDP, protein-DNA correlations

• Anna Gerasimova (now at U. Michigan) – NadR
• Olga Kalinina (on loan to EMBL) – SDP
• Yuri Korostelev – protein-DNA correlations
• Ekaterina Kotelnikova (now at Ariadne Genomics) – evolution of sites
• Olga Laikova – LacI
• Dmitry Ravcheev – CRA/FruR
• Dmitry Rodionov (on loan to Burnham Institute) – iron etc.
• Alexei Vitreschak – T-boxes and riboswitches

• Andy Johnston (U. of East Anglia) – experimental validation (iron)
• Leonid Mirny (MIT) – protein-DNA, SDP
• Andrei Osterman (Burnham Institute) – experimental validation

Howard Hughes Medical Institute
Russian Foundation of Basic Research
Russian Academy of Sciences, program “Molecular and Cellular Biology”
INTAS