Regulatory RNA structures

RNA genes:
• sRNAs: CsrB/RsmB, CsrC, DsrA, GadY, MicC, OxyS, RyhB, RydC, etc.
• Antisense RNAs: CopA, DicF, MicF, RNAI, QaRNA, etc.
Cis-acting UTR regulatory RNAs: riboswitches, T-boxes, attenuators, IREs, etc.

sRNAs: DsrA RNA
Regulation of rpoS:
• Overcoming transcriptional silencing
• Promoting translation
Found in: E. coli, Salmonella spp., Shigella spp.

sRNAs: CsrB/RsmB RNA family
• The RNA binds approximately 18 copies of the CsrA protein
• Negative effect: glycogen biosynthesis, gluconeogenesis, glycogen catabolism
• Positive effect: glycolysis
• Conserved motif: CAGGXX
Found in: enterobacteria

sRNAs: PrrB/RsmZ RNA family
• 5'-AGGA-3' repeats in loops
• The RNA possibly interacts with a CsrA-like protein
• Involved in regulation of 2,4-diacetylphloroglucinol (Phl) and hydrogen cyanide (HCN) production
Found in: Pseudomonas spp.

sRNAs: GadY
• GadY interacts with the 3' UTR of the gadX mRNA and increases the stability of the transcript
Found in: E. coli, Salmonella spp., Shigella spp.

sRNAs: RydC RNA
• RydC is known to bind the protein Hfq
• The Hfq/RydC complex causes degradation of the target mRNA
Found in: E. coli, Salmonella spp.

Antisense RNAs: CopA-like RNA
• These RNAs regulate plasmid copy number
• Inhibition involves a four-way junction structure between CopA and its target CopT in the mRNA

Antisense RNAs: MicF RNA
• Regulates ompF expression by inhibiting translation and inducing degradation

UTR RNA regulatory elements
Mediators of regulation:
• Ribosomes (transcription attenuation)
• Repressor/activator proteins (feedback inhibition of gene translation/splicing, antitermination (bgl), IREs (regulation of translation/mRNA stability), etc.)
• Uncharged tRNA (T-boxes)
• Small molecules (various riboswitch regulatory elements)

Alternative RNA structures in transcription termination
• Antitermination
• Termination (anti-antitermination)
Attenuation of transcription (Yanofsky).
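The termination-side structure mentioned above (a stem-loop followed by a run of U residues, written as T in DNA) can be searched for with a very simple scan. The following is a minimal sketch, not the method used in these slides: the function names, the requirement of a perfect stem, and all thresholds (`stem_len`, `min_loop`, `max_loop`, `tail_len`, `min_t`) are illustrative assumptions.

```python
"""Toy scan for terminator-like hairpins: a perfect stem-loop followed by a T-rich tract."""

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    # Unknown symbols map to "?" so they can never take part in a stem.
    return "".join(COMPLEMENT.get(base, "?") for base in reversed(seq))


def find_terminator_like_hairpins(seq, stem_len=6, min_loop=3, max_loop=8,
                                  tail_len=6, min_t=5):
    """Return (start, left_arm, loop, tail) tuples for candidate hairpins.

    A candidate is a left arm of stem_len bases, a loop of min_loop..max_loop
    bases, a right arm that is the exact reverse complement of the left arm,
    and a tail of tail_len bases containing at least min_t T residues.
    All thresholds are hypothetical defaults chosen for illustration.
    """
    seq = seq.upper().replace("U", "T")
    hits = []
    shortest = 2 * stem_len + min_loop + tail_len
    for start in range(len(seq) - shortest + 1):
        left = seq[start:start + stem_len]
        for loop_len in range(min_loop, max_loop + 1):
            right_start = start + stem_len + loop_len
            right_end = right_start + stem_len
            tail = seq[right_end:right_end + tail_len]
            if len(tail) < tail_len:
                break  # longer loops would run past the end as well
            right = seq[right_start:right_end]
            if right == reverse_complement(left) and tail.count("T") >= min_t:
                loop = seq[start + stem_len:right_start]
                hits.append((start, left, loop, tail))
    return hits


if __name__ == "__main__":
    # Synthetic example: flank + stem + loop + complementary stem + T-run + flank.
    toy = "ACGC" + "GGCCGC" + "GAAA" + "GCGGCC" + "TTTTTT" + "ACG"
    for hit in find_terminator_like_hairpins(toy):
        print(hit)  # (4, 'GGCCGC', 'GAAA', 'TTTTTT')
```

Real terminators and antiterminators tolerate mismatches and G-U pairs, so a practical search would score stems by free energy rather than demand exact complementarity. The sketch only illustrates the "hairpin plus U-tract" pattern that the alternative-structure models rely on.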
Prediction of attenuators: Amino acid biosynthesis (branched amino acids (ILE, LEU, VAL), histidine, threonine, tryptophan, and phenylalanine) (gamma- and alpha-proteobacteria, in some cases low-GC Gram-positive bacteria, Thermotogales and Bacteroidetes/Chlorobi) Three new histidine transporters were predicted: • ortholog of BS- yuiF and yvsH • from lysQ/lysP family • HI0325 (Haemophylus influenzae ) E. coli: three aspartate kinase isozymes, ThrA, MetL and LysC thrA: ILE-THR attenuator metL: MetJ lysC: LYS-element Pasteurellales (two aspartate kinase isozymes): thrA THR-MET-ILE attenuator LysC: LYS-element Detection of 5’ UTR RNA-elements The RNApattern program: RNA pattern: • consensus motifs • RNA secondary structure: number of helices length of each helix loop lengths parameters of topology and distance between pairs of helices Partial alignment of predicted T-boxes specifier hairpin ===> ==> ===> <=== <== SC<=== SA DHA ST CA DF PN MN DF HD DF ZC BQ MN MN ST SERS tyrZ trpS ASPS VALS THRS ileS leuS ARGS proS lysS metS pheS glyQ alaS SER Tyr Trp ASP VAL THR Ile Leu ARG Pro Lys Met Phe Gly Ala ---GTAGGACAAGTA ----AAGAACAAGTA ---ATTAGAAGAGTA -----GAGAAAAGTA -GAAGAAGAGGAGTA ----AGAGACAAGTC ----CAAAAACACAA ----CTAGAGCAGTA -----TGGGAGAGTA ---AAAGAAATAGTA ---AAGAGAAGAGTA ---AAAGGAAAAGTA ----TGAGATTAGTA ---AGAAAGAGAGTT -AGTTAAGAATTGTT 19 18 16 18 16 18 17 19 20 18 19 19 18 15 17 AGAGAGCTTGTGGTT---AGTGTGAACAAG--AGAAAGTTGCCGGCT---GATGAGAGGCGCTT AGAGAGTTAGTGGTT---GGTGCAAGCTAACAGCGAATTGGGAAAT---GGTGTGAGCCCAAAGAGAGGAAAATTCACTGGCTGTAAGATTTTC AGAGAGTGCGTGGTT---GCTGGAAACGCATAGCGAATAGGTGAT----GGTGTAAGACCTATT AGAGGAAGTGGAA-----GGTGAGAACTAATATT AGCGAGTCGGGAT-----GGTGGGAGCCGATAGAGAGAAAACGGT----GGTGAGAGTTTTC-AGAGAGCTCTGGTA----GCTGAGAAAGAGC-AGAGAGCTTCGGTA----GCTGAGAAGAAGC-AGGGAATGCGGGGCGTG-ACTGGAAACCCGCAGCGAACCTGAGAG----AGTGTAAGTCAGGT AGAAAAGTGACGGTT---GCTGCGAGTCATT- 15 18 12 15 17 14 18 10 14 14 15 14 16 14 17 GAA--TCTACCTACTT GAA--TACCTCTTTGA GAAA-TGGACTAATGA GAAA-GACATCTCGGA 
GAAT-GTAGCTTTGGA GAT--ACTACTCTTGA -----ATCATTTTGTT GAA--CTTACTAGATT GAAA-CGCACCCATGA GAA--CCTGTCTTTTA GAAAAAAGACTTGGAG GAACAATGGCCTTTGA GAA--TTCACTCAGAA GACT-GGCACTTTCTC -----GCTACTTAACT -> -> -> -> -> -> -> -> -> -> -> -> -> -> -> Amino acid biosynthetic genes SA BS CA BQ BS SA MN DHA HD BQ EF trpE ilvB ilvC asnA proB cysE hisC pheA serA phhA yxjH Trp Leu Val Asn Pro Cys His Phe Ser Tyr Met TCTAAAGAAATAGTA ---TGAGGATAAGTA -----AGGAAGAGTA --AGGACGAGTAGTA -----AGGATTAGTA --CGAAGGATTAGTA -----AGAGAAAAAA -----AAAGAGAGCA ----GAAGATGAGGA AGAATCGCAGTAGTA -----TAGGAAAGTA 22 20 17 15 18 18 16 19 17 17 17 AGAAAGCTAATGGGT---GATGGGAATTAGC-AGAGAACCGGGTTA----GCTGAGAACCGG--AGAGAGTGAGATACT---GGTGGGAACTCAT-AGCGAGTCAGGGGT----GGTGTGAGCCTGA-AGAGAGCAAAATGAACC-GCTGAAACATTTTGC AGAGAGTGTACGGTT---GCTGTGAGTACA--AGAGAGTATGGGAA----GCTGAAAACATAC-AGGGAACTAAAGTCGGAGACTGAAAGCTTTAGT AGAGAGCTGGTGGTT---GCTGTGAACCAGCTAGAGAGCTAATGGTC---GGTGGAAATTGGC-AGAGAGACTTTGGTT---GGTGAAAAAAGTT-- 14 16 13 15 15 14 15 14 18 14 13 GAAT-TGGACTTTGGA GAA--CTCGCCTCAGA GAAG-GTAGCCTTTGA GAAG-AACCTCCTGGA GAA--CCTGCCTTGGA GAA--TGCACCTTCGT -----CACATTCTTGA GAGA-TTCACTCTGGA -----AGCCCTTCTGA GAAT-TACAATTCTGG GAAAAATGGCCTAGGA -> -> -> -> -> -> -> -> -> -> -> Amino acid transporters CA yckK DF yqiX HD BH0807 EF yheL BQ ykbA BQ sdt2 EF yusC CA yhaG BQ brnQ REF01723 BS yvbW Cys Arg Lys Tyr Thr Trp Met Trp Ile His Leu ----AAGAACCAGTA -----AGAGAAAGTA ----AGAGAAGAGTA -TTATTAGCCCAGTA --GAGGACACGATCA ---GCAAGAAGAGTA ----AAAGAAGAGTA ----AAGGAAGAGTA ----GAGAACGAGTA --TTAGGACATAGTA -----GGGAGCAGTA 17 16 19 19 16 18 18 18 19 18 18 AGAGAAAAATCTCCAAG-GCTGAAAGGGATTTT AGCGAGTTAGGGGTT---GGTGTAAGCCTAGCAGAAAGCCTGTAGTT---GCTGAGAACGGGT-AGAAAGTCGATGGTT---GCTGCGAATCGAT-AGAGAGGGAAGCCTTTG-GCTGTGAGCTTCCTAGAGAGCTGGGGGAA---GGTGTGAGCCCGGTAGAGAGCCCTGTTT----GCTGAGAATGGG--AGAGAGCTGAGGGT----GGTGTGATCTCAGTAGAGAGTTGGCGATTT--GCTGAAAGCCAAC-AGAGACTTTTTCATTG--GCTGAAAGAAAAAGAGAGAGCTGCGGGGT---GGTGCGACGCAGC-- 15 14 14 13 14 15 16 15 15 17 13 GAA--TGCATCTTTGA 
GAAG-AGAGCTCTGGA GAAGCAAGACTCTGAG GAAT-TACACTAATAA GATT-ACCACCTCTGA GAA--TGGGCTTGCGA GAAG-ATGGTCTTTGA GAA--TGGACCTTTTA GAAA-ATCATCTCCGA -----CACACCTAAAA GAA--CTCGCCCGGGA -> -> -> -> -> -> -> -> -> -> -> AminoacyltRNA synthetases … continued AminoacyltRNA synthetases Amino acid biosynthetic genes Amino acid transporters Terminator(underlined) ===========> <=========== Antiterminator ==> ===> <===<== SA DHA ST CA DF PN MN DF HD DF ZC BQ MN MN ST serS tyrZ trpS aspS valS thrS ileS leuS argS proS lysS metS pheS glyQ alaS -> -> -> -> -> -> -> -> -> -> -> -> -> -> -> 26 47 37 39 41 30 89 28 41 33 46 55 14 14 20 CGTTA CGTTA CCTTA CGTTA CGTTA CGTTA CGTTA AGCTA CGTTA CGTTA CGTTA CGTTA AATTA AGCTA AATTA 51 65 61 34 77 38 68 29 27 30 63 66 20 23 18 AAATAGGGTGGCAACGCGTAGAC------------CACGTCCCTTGTAGGGATGTGGTCTTTTTTTA AGGTAAGGTGGTAACACGGGAGCA-------TACTCTCGTCCTTCTGGCAATGAAGGACGGGAGTTTTTTGTTTT AATTGAGGTGGTACCGCGTATTACTT----GTAATAACGCCCTCACGTTTTAATAGCGTGGGGACTTTTTGCTAT ATAAAGGATGGCACCGTGAAAA----------GCCTTCACTCCTTACTGGAGTGGAGGCTTTTTTTATTTTAAATAAA AATTAAGGTGGTAACGCGAGC------------TTTTCGTCCTTTTTAAAGAGGATGAAGAGCTCTTTTTTATTTCT AATGAAGGTGGAACCACGTTG-------------CGACGTCCTTTCGAGGATGTCGCATTTTTTTATTAG AATTAAGGTGGTACCACGAGC-------------TTTCGTCCTTTGATGAAAGTTCTTTTTTATTGAT AATTAGGGTGGTACCGCGAAGATT-------TATCCTCGTCCCTAAACGTAAGTTTAGTGACGAGGATTTTTTATTTTCA AACGAGAGTGGTACCGCGGGTAA---------AAGCTCGCCTCTTTTTAGAAGAGGCGGGTTTTTTATTTT AACTAGAGTGGTACCGCGGAAAT-----TAAACCTTTCGTCTCTATACTTGTATAGAGATGAGAGGTTTTTTATATTTTCAGG AACTGAGGTGGTACCGCGAAGCTAA-----CAACTCTCGTCCTCAAGATGAATAATCTTGGGGGTGGGAGTTTTTTTGTTGCA AAATAAGGTGGTACCGCGACTGTTTA---TACAGCCCCGCCCTTATCTTTTTTAGATAAGGGCGGGGCTTTTTATATTTAA AAAACGGATGGTACCGCGTGTC-------------AACGCTCCGCTTAAGGAGTTTTGGCACTTTTTTTGTTTT AATTAGGGTGGAACCGCGTTT------------CAAACGCCCCTATGTCAGTTGGCATGGGAGTGATTGAGCGTGGCTCTTTT AATAGAGGTGGTACCGCGGTT--------------TTCGCCCTCTGTGAGATGGACTTGTTTTGTATGGAGGACTATTTGAAA SA BS CA BQ BS SA MN DHA HD BQ EF trpE ilvB ilvC asnA proB cysE hisC pheA serA phhA yxjH 
-> -> -> -> -> -> -> -> -> -> -> 32 50 40 51 33 33 46 41 42 51 40 AATTA CGTTA CGTTA CGTTA CGTTA CATTA CGTTA CGTTA cgtta CGTTA CGTTA 4 47 14 62 30 62 50 50 57 34 51 AACTAAGGTGGCACCACGGTA-------------ACGCGTCCTTACAGGTATATGCGTTATGTGGTGTCTTTTT AACAAGGGTGGTACCGCGGAAAGAAA---AGCCTTTTCGCCCCTTTTAGCTATCGCAGTTACTGCGCGGCTGATTGT AATTTGGGTGGTACCGCGCGACCAAA-----AATTCTCGCCCCAAGCAGGGAATTTTGGCCGTTTTTTTATATAAATAAAT AATTTGGGTGGTACCGCGGAACC-----AAAGCCTTTCGTCCCAGTTTTTTGGGAAAGAAGGGCTTTTTTTGTTGGCTT AATCAAGGTGGTACCACGGAAAC--------CCATTTCGTCCTTATGAATCAGGATGAAATGGGTTTTTTTATTGTAGA ATTCAGAGTGGAACCGTGCGG-------------AAGCGCCTCTAACAATACAATTTGTATGTTAGTGGTGCTTTTTTG AATGAAGGTGGAACCACGTGTGT---------GTCAGCGTCCTTGCAAGTTTTTTGCAAGGGCGCTTTTTTGAATAGT AAAAAGGGTGGTACCGCGTGAC---------TTAACTCGTCCCTTATTTGGGGGTGAGGTAAGTCTTTTTTTATTTA AATGAGGGTGGCACCGCGGTATG-------AACCTTCCGCCCCTCACGACAGTCGTCGTGTGGGCAGAAGGTTTTTTTACTAT AAATAGGGTGGTACCGCGATTC------------TTTCGCCCCTATCGGATTTTCCGATAGGGGCTTTTTCTATTTC AAAAAAGGTGGTACCGCGATAA-----------TAATCGCCCTTTTACTAGTTACGGCTAGTAAAAGGGCGTTTTTTTATAAA CA yckK -> 38 DF yqiX -> 41 HD BH0807->74 EF yheL -> 8 BQ ykbA -> 46 BQ sdt2 -> 40 EF yusC -> 42 CA yhaG -> 48 BQ brnQ -> 44 REF01723 -> 44 BS yvbW -> 56 CGTTA CCTTA TGTTA AATTA CGTTA CGTTA CGTTA CGTTA CGTTA CGTTA CGTTA 57 30 56 33 45 56 60 51 66 55 32 AATTAGAGTGGTACCGTGGAATT-------CAACTTCTGCCTCTAACTATGAGGATAGAAGTTTTTTGTTTTTAT AAAAAGAGTGGTAACGCGGATAT----------AATTCGTCTCTTAGCTGTAAAGCTAAGGGACTTTTTTGATTTA AACTGGGGTGGCACCACGACAAG----------TGATCGTCCCCAAGACTTTTATCAGTCTTGGGGACGTTTTTTTGTTCAT AATTAAGGTGGTACCGCGGAGA-----------GATTCGTCCTTATTCTTTAAGGATGAATCTCTCTTTTTATGTAGC AACAAGGGTGGAACCACGAATAT--------AACACTCGTCCCTTTTTTAGGGAGGAGTGTTTTTTTATT AATTGAGGTGGTACCACGGTATTAACATTACATATATCGTCCTCTACATGCATATTTGCGTGTAGGGGACTTTTTTATTTTC AATTAAGGTGGTATCACGAAATGA-----CAAACTTTCGTCCTTTTTGCTGTAATAGCAAAAGGATGGAAGTTTTTTTGTTT AATTTAGGTGGTACCGCGGAAGT---------ATCTCCGTCCTAATTAATAAGATTAGGGCGGAGTTTTTTATTTGC AATTAGGGTGGTATCGCGGGTAAA------TATAACTCGTCCCTTTCTTTAGGGACGAGTTTTTTGTGTTCTT 
AATTGAGGTGGCACCACGAATGC----------GATTCGTCCTCTTGGCTCACAGCCAAGAGGCTTTTTTGTTTTTTTAATA
AACAAGAGTGGTACCGCGGTCAGC--CGAAGGCTCGTCGTCTCTTTATCTATTAGATTAGGTAGGAGACGGCGGGCTTTTTT

Aminoacyl-tRNA synthetases
Aromatic amino acids (TRP, PHE, TYR): most Firmicutes, Atopobium minutum
Branched-chain amino acids (ILE, LEU, VAL): most Firmicutes, Actinobacteria (ileS), Deinococcales/Thermales (ileS, valS), Chloroflexi (ileS), Thermomicrobium roseum (leuS)
methionine: Bacillales, Clostridiales, Thermoanaerobacter tengcongensis
proline: some Bacillales, Clostridiales
cysteine: Bacillales, some Lactobacillales, Clostridiales, Thermoanaerobacteriales
histidine: Bacillales, Lactobacillales (except Streptococcus spp.), some Clostridiales, Thermoanaerobacter tengcongensis
arginine: Bacillales, Lactobacillales (except Streptococcus spp.), Clostridiales
threonine: Bacillales, Lactobacillales, Clostridiales, Dictyoglomi, Thermomicrobium roseum
serine: most Firmicutes
alanine: Bacillales, Lactobacillales, Clostridiales
ASP, ASN: most Firmicutes (except Streptococcus spp., Mycoplasmatales, Entomoplasmatales)
glycine: most Firmicutes, Deinococcales/Thermales
lysine: Bacillus cereus, Clostridium thermocellum

Amino acid biosynthetic genes
Aromatic amino acids (TRP, PHE, TYR): most Firmicutes, Chloroflexi and Dictyoglomi (trp operon), some Firmicutes (aro genes, pheA, pah)
Branched-chain amino acids (ILE, LEU, VAL): Bacillales, Clostridiales, Syntrophomonas wolfei, δ-proteobacteria (leu), Dictyoglomi, Thermomicrobium roseum
methionine: Lactobacillales (except Streptococcus spp.), Desulfotomaculum reducens
proline: Bacillales, Desulfitobacterium hafniense, Desulfotomaculum reducens
cysteine: Bacillales, Enterococcus faecalis, Clostridium acetobutylicum, Dictyoglomi
histidine: some Lactobacillales
arginine: Clostridium difficile
threonine: Bacillus cereus, Clostridium difficile
serine: some Firmicutes
alanine: (none listed)
ASP, ASN: some Firmicutes
glutamine: Clostridium perfringens
glycine: (none listed)
lysine: -

ycbK yhaG yvbW ykbA ybgF/aapA T-box specificity TRP TRP LEU THR ? 
yheL TYR LysX LYS Gene name Predicted function T-box srecifier codon Bacillus subtilis, Bacillus licheniformis Clostridiales Bacillus subtilis, Bacillus licheniformis Bacillus subtilis Lactobacillus reuteri yusCBA yqiXYZ MET ARG tryptophan-specific permease tryptophan-specific permease leucine-specific permease threonine-specific permease ? Tyrosine transporter (Na+/H+ antiporter) lysine transporter Branched-chain amino acid transporter family: ILE-specific Branched-chain amino acid transporter family:: THR-specific Branched-chain amino acid transporter family:: VAL-specific methionine ABC transporter arginine ABC transporter hisXYZ HIS histidine ABC transporter yckKJI CYS MET ASP MET cysteine ABC transporter methionine ABC transporter ASP(ASN) ABC transporter methionine ABC transporter TRP-specific sodium dependent transporter PHE-specific sodium dependent transporter LEU-specific sodium dependent transporter sodium dependent transporter uptake of unknown methionine precursors, possibly oligopeptides ILE brnQ_braB THR VAL aspQHMP ytmKLM TRP yocR(yhdH) PHE LEU TYR?\MET some Bacillales and Lactobacillales some Bacillales some Bacillales, Lactobacillales andClostridiales Bacillus cereus, Clostridium tetani some Lactobacillales Lactobacillales, Enterococcus faecalis Clostridium difficile Lactobacillales, Clostridium difficile, Listeria monocytogenes, Enterococcus faecalis Clostridium acetobutylicum some Lactobacillales Lactobacillus johnsonii Leuconostoc mesenteroides Bacillus cereus Bacillus cereus Bacillus cereus Clostridium tetani mtsABC opp MET trpXYZ TRP tryptophan ABC transporter RDF02391 ABC-like transporter CBX gltT like ARG arginine permease Peptococcaceae, Streptococcus spp., Paenibacillus larvae Clostridium difficile ? ? Desulfotomaculum reducens ? ? ? ? Clostridium botulinum some Clostridium spp. 
some Lactobacillales
New predicted amino acid transporters

Conserved RNA secondary structure of the regulatory RFN element
[Figure: consensus secondary structure of the RFN element, with the additional stem-loop and the variable stem-loop marked]
Capitals: invariant (absolutely conserved) positions. Lower case letters: strongly conserved positions. Dashes and stars: obligatory and facultative base pairs. Degenerate positions: R = A or G; Y = C or U; K = G or U; B = not A; V = not U.

5’ UTR regions of riboflavin genes from various bacteria
BS BQ BE HD Bam CA DF SA LLX PN TM DR TQ AO DU CAU FN TFU SX BU BPS REU RSO EC TY KP HI VK VC YP AB BP AC Spu PP AU PU PY PA MLO SM BME BS BQ BE CA DF EF LLX LO PN ST MN SA AMI DHA FN GLU
1 2 2’ 3 =========> ==> <== ===>
TTGTATCTTCGGGG-CAGGGTGGAAATCCCGACCGGCGGT AGCATCCTTCGGGG-TCGGGTGAAATTCCCAACCGGCGGT TGCATCCTTCGGGG-CAGGGTGAAATTCCCGACCGGCGGT TTTATCCTTCGGGG-CTGGGTGGAAATCCCGACCGGCGGT TGTATCCTTCGGGG-CTGGGTGAAAATCCCGACCGGCGGT GATGTTCTTCAGGG-ATGGGTGAAATTCCCAATCGGCGGT CTTAATCTTCGGGG-TAGGGTGAAATTCCCAATCGGCGGT TAATTCTTTCGGGG-CAGGGTGAAATTCCCAACCGGCAGT ATAAATCTTCAGGG-CAGGGTGTAATTCCCTACCGGCGGT AACTATCTTCAGGG-CAGGGTGAAATTCCCTACCGGTGGT AAACGCTCTCGGGG-CAGGGTGGAATTCCCGACCGGCGGT GACCTCTTTCGGGG-CGGGGCGAAATTCCCCACCGGCGGT CACCTCCTTCGGGG-CGGGGTGGAAGTCCCCACCGGCGGT AATAATCTTCAGGG-CAGGGTGAAATTCCCGATCGGCGGT TTTAATCTTCAGGG-CAGGGTGAAATTCCCGATCGGTGGT GAAGACCTTCGGGG-CAAGGTGAAATTCCTGATCGGCGGT TAAAGTCTTCAGGG-CAGGGTGAAATTCCCGACCGGTGGT ACGCGTGCTCCGGG-GTCGGTGAAAGTCCGAACCGGCGGT -AGCGCACTCCGGG-GTCGGTGAAAGTCCGAACCGGCGGT GTGCGTCTTCAGGG-CGGGGTGAAATTCCCCACCGGCGGT GTGCGTCTTCAGGG-CGGGGCGAAATTCCCCACCGGCGGT TTACGTCTTCAGGG-CGGGGTGCAATTCCCCACCGGCGGT GTACGTCTTCAGGG-CGGGGTGGAATTCCCCACCGGCGGT GCTTATTCTCAGGG-CGGGGCGAAATTCCCCACCGGCGGT GCTTATTCTCAGGG-CGGGGCGAAATTCCCCACCGGCGGT GCTTATTCTCAGGG-CGGGGCGAAATTCCCCACCGGCGGT TCGCATTCTCAGGG-CAGGGTGAAATTCCCTACCGGTGGT 
GCGCATTCTCAGGG-CAGGGTGAAATTCCCTACCGGTGGT CAATATTCTCAGGG-CGGGGCGAAATTCCCCACCGGTGGT GCTTATTCTCAGGG-CGGGGTGAAAGTCCCCACCGGCGGT GCGCATTCTCAGGG-CAGGGTGAAAGTCCCTACCGGTGGT GTACGTCTTCAGGG-CGGGGTGCAATTCCCCACCGGCGGT ACATCGCTTCAGGG-CGGGGCGTAATTCCCCACCGGCGGT AACAATTCTCAGGG-CGGGGTGAAACTCCCCACCGGCGGT GTCGGTCTTCAGGG-CGGGGTGTAAGTCCCCACCGGCGGT GGTTGTTCTCAGGG-CGGGGTGCAATTCCCCACCGGCGGT AAACGTTCTCAGGG-CGGGGTGCAATTCCCCACCGGCGGT TAACGTTCTCAGGG-CGGGGTGCAACTCCCCACCGGCGGT TAACGTTCTCAGGG-CGGGGTGAAAGTCCCCACCGGCGGT TAAAGTTCTCAGGG-CGGGGTGAAAGTCCCCACCGGCGGT AAGCGTTCTCAGGG-CGGGGTGAAATTCCCCACCGGCGGT GCTTGTTCTCGGGG-CGGGGTGAAACTCCCCACCGGCGGT ATCAATCTTCGGGG-CAGGGTGAAATTCCCTACCGGCGGT GTCTATCTTCGGGG-CAGGGTGAAAATCCCGACCGGCGGT ATTCATCTTCGGGG-CAGGGTGAAATTCCCGACCGGCGGT AATGATCTTCAGGG-CAGGGTGAAATTCCCTACCGGCGGT GAAGATCTTCGGGG-CAGGGTGAAATTCCCTACCGGCGGT GTTCGTCTTCAGGGGCAGGGTGTAATTCCCGACCGGTGGT AAATATCTTCAGGG-CACCGTGTAATTCGGGACCGGCGGT GTTCATCTTCGGGG-CAGGGTGCAATTCCCGACCGGTGGT AAGAGTCTTCAGGG-CAGGGTGAAATTCCCGACCGGCGGT AAGTGTCTTCAGGG-CAGGGTGTGATTCCCGACCGGCGGT AAGTGTCTTCAGGG-CAGGGTGAGATTCCCGACCGGCGGT ATTCATCTTCGGGG-TCGGGTGTAATTCCCAACCGGCAGT TCACAGTTTCAGGG-CGGGGTGCAATTCCCCACTGGCGGT ACGAACCTTCGAGG-TAGGGTGAAATTCCCGACCGGCGGT AATAATCTTCGGGG-CAGGGTGAAATTCCCGACCGGTGGT ---TGTTCTCAGGG-CGGGGCGAAATTCCCCACCGGCGGT Add. 
3’ -><<=== 21 AGCCCGTGAC-19 AGTCCGTGAC-20 AGCCCGCGA--19 AGTCCGTGAC-23 AGCCCGTGAC-2 AGCCCGCAA--2 AGCCCGCG---6 AGCCTGCGAC-2 AGCCCGCGA--2 AGCCCACGA--3 AGCCCGCGAG-15 AGCCCGCGAA-3 AGCCCGCGAA-2 AGTCCGCGA--2 AGTCCGCGA--20 AGCCCGCGA--2 AGTCCACG---3 AGTCCGCGAC-3 AGTCCGCGAC-30 AGCCCGCGAGCG 21 AGCCCGCGAGCG 31 AGCCCGCGAGCG 21 AGCCCGCGAGCG 17 AGCCCGCGAGCG 67 AGCCCGCGAGCG 20 AGCCCGCGAGCG 2 AGCCCACGAGCG 14 AGCCCACGAGCG 13 AGCCCACGAGCG 40 AGCCCGCGAGCG 25 AGCCCACGAGCG 18 AGCCCGCGAGCG 16 AGCCCGCGAGCA 34 AGCCCGCGAGCG 13 AGCCCGCGAGCG 17 AGCCCGCGAGCG 19 AGCCCGCGAGCG 19 AGCCCGCGAGCG 19 AGCCCGCGAGCG 16 AGCCCGCGAGCG 34 AGCCCGCGAGCG 17 AGCCCGCGAGCG 18 AGCCCGCGA--27 AGCCCGCGA—-20 AGCCCGCGA--2 AGCCCGCGAG-2 AGCCCGCG---3 AGTCCACGAC-21 ACTCCGCGAT-3 AGTCCACGAT-125 AGTCCGTG---14 AGTCCGCG---104 AGTCCGCG---6 AGCCTGCGAC-14 AGCCCGCGC--20 AGCCCGCAAC-2 AGTCCACG---28 AGCCCGCGAGCG Variable 4 4’ 5 5’ 1’ -> <====> <==== ==> <== <========= 8 4 8 -----TGGATTCAGTTTAA-GCTGAAGCCGACAGTGAA-AGTCTGGAT-GGGAGAAGGATGAT 8 5 8 -----TGGATCTAGTGAAACTCTAGGGCCGACAGT-AT-AGTCTGGAT-GGGAGAAGGATATG 3 4 3 -----AGGATCCGGTGCGATTCCGGAGCCGACAGT-AT-AGTCTGGAT-GGGAGAAGGATGCC 10 4 10 ----–TGGACCTGGTGAAAATCCGGGACCGACAGTGAA-AGTCTGGAT-GGGAGAAGGAAACG 8 4 8 ----–TGGATTCAGTGAAAAGCTGAAGCCGACAGTGAA-AGTCTGGAT-GGGAGAAGGATGAG 3 4 3 ------AGATCCGGTTAAACTCCGGGGCCGACAGTTAA-AGTCTGGAT-GAAAGAAGAAATAG 7 6 7 --------ATTTGGTTAAATTCCAAAGCCGACAGT-AA-AGTCTGGAT-GGAAGAAGATATTT 11 3 11 ----–CTGATCTAGTGAGATTCTAGAGCCGACAGTTAA-AGTCTGGAT-GGGAGAAAGAATGT 4 4 4 -----ATGATTCGGTGAAACTCCGAGGCCGACAGT-AT-AGTCTGGAT-GAAAGAAGATAATA 3 4 3 -----ATGATTTGGTGAAATTCCAAAGCCGACAGT-AT-AGTCTGGAT-GAAAGAAGATAAAA 5 4 5 ----–TTGACCCGGTGGAATTCCGGGGCCGACGGTGAA-AGTCCGGAT-GGGAGAGAGCGTGA 8 12 9 ----–CCGATGCCGCGCAACTCGGCAGCCGACGGTCAC-AGTCCGGAC-GAAAGAAGGAGGAG 5 4 5 -----CCGACCCGGTGGAATTCCGGGGCCGACGGTGAA-AGTCCGGAT-GGGAGAAGGAGGGC 7 7 7 -----AGGAACCGGTGAGATTCCGGTACCGACAGT-AT-AGTCTGGAT-GGAAGAAGATGAAA 13 4 12 -----AGGAACTAGTGAAATTCTAGTACCGACAGT-AT-AGTCTGGAT-GGAAGAAGAGCAGA 3 4 3 
-----AGGACCCGGTGTGATTCCGGGGCCGACGGT-AT-AGTCCGGAT-GGGAGAAGGTCGGC 5 4 5 -------GATTTGGTGAAATTCCAAAACCGACAGT-AG-AGTCTGGAT-GGGAGAAGAATTAG 8 5 8 -----TGGAACCGGTGAAACTCCGGTACCGACGGTGAA-AGTCCGGAT-GGGAGGTAGTACGTG 8 5 8 -----TTGACCAGGTGAAATTCCTGGACCGACGGTTAA-AGTCCGGAT-GGGAGGCAGTGCGCG 137 GTCAGCAGATCTGGTGAGAAGCCAGAGCCGACGGTTAG-AGTCCGGAT-GGAAGAAGATGTGC 8 4 8 GTCAGCAGATCTGGTCCGATGCCAGAGCCGACGGTCAT-AGTCCGGAT-GAAAGAAGATGTGC 7 5 7 GTCAGCAGATCTGGTGAGAGGCCAGGGCCGACGGTTAA-AGTCCGGAT-GAAAGAAGATGGGC 11 3 11 GTCAGCAGATCCGGTGAGATGCCGGGGCCGACGGTCAG-AGTCCGGAT-GGAAGAAGATGTGC 8 4 8 GACAGCAGATCCGGTGTAATTCCGGGGCCGACGGTTAG-AGTCCGGAT-GGGAGAGAGTAACG 8 3 8 GTCAGCAGATCCGGTGTAATTCCGGGGCCGACGGTTAA-AGTCCGGAT-GGGAGAGGGTAACG 8 4 8 GTCAGCAGATCCGGTGTAATTCCGGGGCCGACGGTTAA-AGTCCGGAT-GGGAGAGAGTAACG 26 9 30 GTCAGCAGATTTGGTGAAATTCCAAAGCCGACAGT-AA-AGTCTGGAT-GAAAGAGAATAAAA 11 9 11 GTCAGCAGATTTGGTGAGAATCCAAAGCCGACAGT-AT-AGTCTGGAT-GAAAGAGAATAAGC 5 4 5 GTCAGCAGATCTGGTGAGAAGCCAGGGCCGACGGTTAC-AGTCCGGAT-GAGAGAGAATGACA 16 6 16 GTCAGCAGACCCGGTGTAATTCCGGGGCCGACGGTTAT-AGTCCGGAT-GGGAGAGAGTAACG 16 4 27 GTCAGCAGATTTGGTGCGAATCCAAAGCCGACAGTGAC-AGTCTGGAT-GAAAGAGAATAAAA 10 4 10 GTCAGCAGACCTGGTGAGATGCCAGGGCCGACGGTCAT-AGTCCGGAT-GAGAGAAGATGTGC 10 3 11 ---CGCAGATCTGGTGTAAATCCAGAGCCGACGGT-AT-AGTCCGGAT-GAAAGAAGACGACG 6 6 6 GTCAGCAGATCTGGTG 52 TCCAGAGCCGACGGT 31 AGTCCGGAT-GGAAGAGAATGTAA 7 3 7 GTCAGCAGATCTGGTGCAACTCCAGAGCCGACGGTCAT-AGTCCGGAT-GAAAGAAGGCGTCA 7 9 7 GTCAGCAGATCCGGTGAGAGGCCGGAGCCGACGGT-AT-AGTCCGGAT-GGAAGAGGACAAGG 19 4 18 GTCAGCAGACCCGGTGTGATTCCGGGGCCGACGGTCAC-AGTCCGGATGAAGAGAGAACGGGA 15 4 16 GTCAGCAGACCCGGTGTGATTCCGGGGCCGACGGTCAT-AGTCCGGATGAAGAGAGAGCGGGA 14 4 13 GTCAGCAGACCCGGTGCGATTCCGGGGCCGACGGTCAT-AGTCCGGATAAAGAGAGAACGGGA 8 5 8 GTCAGCAGATCCGGTGTGATTCCGGAGCCGACGGTTAG-AGTCCGGAT-GAAAGAGGACGAAA 8 3 8 GTCAGCAGATCCGGTCGAATTCCGGAGCCGACGGTTAT-AGTCCGGAT-GGAAGAGAGCAAGC 10 15 10 GTCAGCAGATCCGGTGAGATGCCGGAGCCGACGGTTAA-AGTCCGGAT-GGAAGAGAGCGAAT 5 4 5 -----AGGATTCGGTGAGATTCCGGAGCCGACAGT-AC-AGTCTGGAT-GGGAGAAGATGGAG 3 5 3 
-----AGGATTTGGTGTGATTCCAAAGCCGACAGT-AT-AGTCTGGAT-GGGAGAAGATGGAG 3 4 3 -----AGGATCCGGTGCGAGTCCGGAGCCGACAGT-AT-AGTCTGGAT-GGGAGAAGATGAAG 3 4 3 ----TATGATCCGGTTTGATTCCGGAGCCGACAGT-AA-AGTCTGGAT-GAAAGAAGATATAT 6 4 6 -------GATTTGGTGAGATTCCAAAGCCGACAGT-AA-AGTCTGGAT-GAGAGAAGATATTT 5 3 5 ----ATTGAATTGGTGTAATTCCAATACCGACAGT-AT-AGTCTGGAT--AAAGAAGATAGGG 4 4 4 -----TTGAAGCAGTGAGAATCTGCTAGCGACAGT-AA-AGTCTGGAT-GGAAGAAGATGAAC 3 10 3 ----TTGACTCTGGTGTAATTCCAGGACCGACAGT-AT-AGTCTGGAT-GGGAGAAGATGTTG 3 4 3 -------GATGTGGTGAGATTCCACAACCGACAGT-AT-AGTCTGGAT-GGGAGAAGACGAAA 3 4 3 -------GATGTGGTGTAACTCCACAACCGACAGT-AT-AGTCTGGAT-GAGAGAAGACCGGG 3 4 3 -------GATGTGGTGAAATTCCACAACCGACAGT-AA-AGTCTGGAT-GGGAGAAGACTGAG 11 3 11 -----CTGATCTAGTGAGATTCTAGAGCCGACAGT-AT-AGTCTGGAT-GGGAGAAGATGGAG 5 5 5 ------TGATCTGGTGCAAATCCAGAGCCAACGGT-AT-AGTCCGGAT-GGAAGAAACGGAGC 11 4 11 --CGACTGACTTGGTGAGACTCCAAGGCCGACGGT-AT-AGTCCGGAT-GGGAGAAGGTACAA 4 6 4 -------GATTTGGTGAAATTCCAAAACCGACAGT-AG-AGTCTGGAT-GAGAGAAGAAAAGA 10 4 10 GTCAGCAGATCCGGTTAAATTCCGGAGCCGACGGTCAT-AGTCCGGAT-GCAAGAGAACC---

Distribution of RFN-elements in bacterial genomes
RFN regulates riboflavin biosynthetic genes and transporters

Genomes                    Number of analyzed genomes   Number of genomes with RFN   Number of RFN elements
α-proteobacteria           8                            4                            4
β-proteobacteria           7                            4                            4
γ-proteobacteria           17                           15                           15
ε- and δ-proteobacteria    3                            0                            0
Bacillus/Clostridium       12                           12                           19
Actinomycetes              9                            4                            4
Cyanobacteria              5                            0                            0
Other eubacteria           7                            5                            6
Total                      68                           47                           52

Some predicted transporters are NEW

Alternative RNA secondary structures upstream of riboflavin operons with RFN elements
Attenuation of transcription via antitermination mechanism
Antiterminator The RFN element
Bam BS BQ BE HD CA DF LLX PN* PN* TM AO DU FN SA DHA FN CA DF BS BQ BE PN ST MN SA EF LLX LO
GACAAAAAAATATTGATTGTATCCTTCGGGGCTGGGTG GGACAAATGAATAAAGATTGTATCTTCGGGGCAGGGTG CTATAATTTGAGCAAACAGCATCCTTCGGGGTCGGGTG ACATAACGATATAGTGATGCATCCTTCGGGGCAGGGTG AAATTGAATAATTAATTTTTATCCTTCGGGGCTGGGTG TAATGGTAATTTAATAGGATGTTCTTCAGGGATGGGTG 
TAAATATAAATTTAATACTTAATCTTCGGGGTAGGGTG ACTTTAGCTACAATTGAATAAATCTTCAGGGCAGGGTG ATCATCTGTAATTGAATAACTATCTTCAGGGCAGGGTG ATCATCTGTAATTGAATAACTATCTTCAGGGCAGGGTG AAAACTGAATACAAAAGAAACGCTCTCGGGGCAGGGTG ATTTGCAACAATTTTTTAATAATCTTCAGGGCAGGGTG AATTTTTTTAATACTATTTTAATCTTCAGGGCAGGGTG TAATCGAATATGTAAAATAAAGTCTTCAGGGCAGGGTG TATAACAATTTCATATATAATTCTTTCGGGGCAGGGTG ACTCTTTTTAGATGAATACGAACCTTCGAGGTAGGGTG GAAAAATAAATATTAAAAATAATCTTCGGGGCAGGGTG AATATAAAAAAATAAAGAATGATCTTCAGGGCAGGGTG AAAATTAAAAAATCAAAGAAGATCTTCGGGGCAGGGTG TAATTAAATTTCATATGATCAATCTTCGGGGCAGGGTG GGGAAAATAGAATATCGGTCTATCTTCGGGGCAGGGTG ATAAAAATGTATAAGCGATTCATCTTCGGGGCAGGGTG GTTTTTTGTTATGATAAAAGAGTCTTCAGGGCAGGGTG TAAATCTGCTATGCTAGAAGTGTCTTCAGGGCAGGGTG ATTTTTTGATATGCTATAAGTGTCTTCAGGGCAGGGTG AAATTTAATAATGTAAAATTCATCTTCGGGGTCGGGTG AAAAAATATAATACAAGGTTCGTCTTCAGGGGCAGGGT TTTTTGTGCTATAATAAAAATATCTTCAGGGCACCGTG ATTGTAAGAAAATATTCGTTCATCTTCGGGGCAGGGTG ----------------------------------------------------------- TCTGGATGGGAGAAGGATGA 59 TCTGGATGGGAGAAGGATGA 59 TCTGGATGGGAGAAGGATAT 250 TCTGGATGGGAGAAGGATGC 155 TCTGGATGGGAGAAGGAAAC 148 TCTGGATGAAAGAAGAAATA 34 TCTGGATGGAAGAAGATATT 63 TCTGGATGAAAGAAGATAAT 127 TCTGGATGAAAGAAGATAAA 81 TCTGGATGAAAGAAGATAAA 19 TCCGGATGGGAGAGAGCGTG 13 TCTGGATGGAAGAAGATGAA 33 TCTGGATGGAAGAAGAAGAG 47 TCTGGATGGGAGAAGAATTA 18 TCTGGATGGGAGAAAGAATG 74 TCCGGATGGGAGAAGGTACA 43 TCTGGATGAGAGAAGAAAAG 40 TCTGGATGAAAGAAGATATA 19 TCTGGATGAGAGAAGATATT 45 TCTGGATGGGAGAAGATGGA 103 TCTGGATGGGAGAAGATGGA 54 TCTGGATGGGAGAAGATGAA 114 TCTGGATGGGAGAAGACGAA 137 TCTGGATGAGAGAAGACCGG 130 TCTGGATGGGAGAAGACTGA 138 TCTGGATGGGAGAAGATGGA 17 GTCTGGATAAAGAAGATAGG 33 TCTGGATGGAAGAAGATGAA 66 TCTGGATGGGAGAAGATGTTG 79 Terminator +RBS sequestor ----------GTAAAGCCCCGAATGTGTAA---ACATTCGGGGCTTTTTGACGCCAAAT ----------CTAAAGCCCCGAATTTTTTA--TAAATTCGGGGCTTTTTTGACGGTAAA -----------CCAAACCCCAAGGATATTAAA--ATCCTTGGGGTTTTTTGTTTTTTTT ------------TGAGCCCCCGGGGACAT--------CCCGGGGGTTTCATTTTTATTG -------------ATGCCCCGTGAGAACAAAA-----TCTCTGGGGCTTTTTTGCGCGC 
-------------AATCTCCGAAGGATTACC----TTTCTTTGGAGATTTTTTTATTTG ------------TAAACCCTGAGTTAATT--------CTCAGGGTTTTTTGTTTAAAAA ----------AAAAGACCCTGAAATTTT------ATTTTAGGGTCTTATTTTTTATTAG ----------TGTATGCCTTGAGTAGTCCCC---TATTCAAGGTATATTTTTTTGGAGG ------------CGTGCTCTGAAATGATTACTTGTCATTTCAGAGCATTTTTGTTAATC -----------ATGGGACCCGAGA----------------GGGTCCCTTTTCTTTTACA --------TTTACAAGCCTTGAGATCGAAAG----ATTTCAAGGCTTTTTTCATCATTA --------TGCATAAGCCTTGAGATCTTAG----GATTTCAAGGCTTTTTCATTAGTTA ----------ATATTGCTCAGACTTT------------GTTTGAGCATTTTTTTATTAA ------TTTTCTCCTTGCATCTTAATT----------GATGTGAGGATTTTTGTTTATA -----------GTTTATGCCTCGAGGAACACCATTTCCTCGAGGCATTTTTGTTCTTTC ------------CTTACCCGAATTCTAT------------AATTCGGTTTTTTTATTTT ----------–-TATGCCCTGACGTTTTT---------CGTTGGGGCTTTTTTAATGCT ----------ATAAAAACTCGAAGATAGGG----TCTTCGAGTTTTTTGTTTTTCCTAA --AAAGAACCTTTCCGTTTTCGAGTAAGATGTGATCGAAAAGGAGAGAATGAAGTGAAA -------ATTCTCCCTTTGTGTAAA------------ACACAAAGGGTTTTTTCGTTCTAT --------GGCAGCCTTCTTCTTGTGAGGATGAATCACGAGAAGGGGAGGAGAACAAGCAT -–AACTTCTTCTGATTTTATAG------------AAAATTGGAGGAACCTGTTATGACA ---GGAACTTCTTTCAATTTGAAA-----------AAATTGGAGGAATTTTTTAATGTC ---–GGCCTTCTTTCGATTTGTAA-----------AAATTGGAGGAATTTTTTTATGAA --------TCCTCCTATTCTTACG--------AGATGAATGGAAGGAGAAAATTGAATATG ---CTACTCTATTTTTCCCTGCAGA------------AAAATAGGGTTTTTTTGTATGA -–TCAACTTCCTCGAAATTTGAAGAAT-TATTTTCTCATATTTGGAGGTTTTTTTATGT ---ATGCACAAACTCTCCCTCAACTTTTTTTA--------GTTGAGGTTTTTTATTTGC Antiterminator Alternative RNA secondary structures upstream of riboflavin genes with RFN elements Attenuation of translation by sequestering of the RBS Antisequestor The RFN element EC TY KP HI VK AB YP VC Spu MLO AC BP BPS BU REU RSO PP PY PU PA BME CAU TFU GLU DR SM TQ AMI AATCCGCTTATTCTCAGGGCGGGGCG AACCCGCTTATTCTCAGGGCGGGGCG ATCTCGCTTATTCTCAGGGCGGGGCG TTAGCTCGCATTCTCAGGGCAGGGTG TATTTGCGCATTCTCAGGGCAGGGTG TAGGCGCGCATTCTCAGGGCAGGGTG ATGGGGCTTATTCTCAGGGCGGGGTG CACAACAATATTCTCAGGGCGGGGCG CTATCAACAATTCTCAGGGCGGGGTG GACGTTAAAGTTCTCAGGGCGGGGTG 
AAGCGACATCGCTTCAGGGCGGGGCG AAGCAGTACGTCTTCAGGGCGGGGTG AGTCAGTGCGTCTTCAGGGCGGGGCG AATCAGTGCGTCTTCAGGGCGGGGTG CATCGTTACGTCTTCAGGGCGGGGTG GCTTGGTACGTCTTCAGGGCGGGGTG GGTCGGTCGGTCTTCAGGGCGGGGTG GCCGGTAACGTTCTCAGGGCGGGGTG CGGCGAAACGTTCTCAGGGCGGGGTG GGCCGTAACGTTCTCAGGGCGGGGTG CGCGGGCTTGTTCTCGGGGCGGGGTG AATCCGAAGACCTTCGGGGCAAGGTG GTACACACGCGTGCTCCGGGGTCGGT TGAGTTTTGTTCTCAGGGCGGGGCG GAACCGACCTCTTTCGGGGCGGGGCG GTCGCAAGCGTTCTCAGGGCGGGGTG TTCGGCACCTCCTTCGGGGCGGGGTG CTTACTCACAGTTTCAGGGCGGGGTG --------------------------------------------------------- RBS-sequestor TCCGGATGGGAGAGAGTAACG 59 ----------CTGCCCTGATTCTGGTAACCATAATTTTAGTGAGGTTTTT-------TACCATGAATCAGACGCTA TCCGGATGGGAGAGGGTAACG 61 ----------CTGCCCTGATTCTGGTAACCATAATGTTAATGAGGTTTTTT------TACCATGAATCAGACGCTA TCCGGATGGGAGAGAGTAACG 61 ----------CTGCCCTGATTCTGGTAACCATAATTTTAATGAGGTTTTTT------TACCATGAATCAGACGCTC TCTGGATGAAAGAGAATAAAA 41 ----------CAGCCCTGATTCTGGTATTTAATTGAAATCTCAAAT-TAGGAAAT--TACTATGAATCAGTCAATT TCTGGATGAAAGAGAATAAGC 76 ----------CAGCCCTGATTCTGGTATCTAAATATCTTTATATTTCAAGGAATT--TACTATGAATCAGTCTATT TCTGGATGAAAGAGAATAAAA 54 ----------CCGCCCTGATTCTGGTATAAATTCATCTTATTAAA—AAGGCATT---TACTATGAATCAGTCATTA TCCGGATGGGAGAGAGTAACG 194 ----------CCGCCCTGATTCTGGTAATCCATAATTTTTTAATGAGGTTTCT---TTACCATGAATCAGACGCTT TCCGGATGAGAGAGAATGACA 83 ----------AAGCCCTGATTCTGGTCATTTTTT--------------GGAGTATT--ACCATGAATCAGTCCTCA TCCGGATGGAAGAGAATGTAA 145 ----------ACGCCCTGATTCTGGATATTCCCATGTCGTATTTTTGAAGGATATTAA-CCATGAATCAGTCTTTA TCCGGATGAAAGAGGACGAAA 44 -------CGTGCGTCCTGATTCTGGTTCGAAACGGA--------------AGGATGGACCCATGAATCAGCATTCC TCCGGATGAAAGAAGACGACG 51 ----------CAGTCCTGAAATGTTTAACCGTAATT-------------------TACGAGAGCATTTCATATGTC TCCGGATGAGAGAAGATGTGC 62 ----------TAGCCCTGAAACGTTTTTCGCCATTTCCTTTTTT------------GCGAGAGCGTTTCAATGTCC TCCGGATGAAAGAAGATGTGC 86 ----------GAGCCCTGAAACGTTTTTCGCCCATTCATGTTTC-----------GCGAGGAGCGTTTCACATCATG GCCGGATGGAAGAAGATGTGC 99 ----------ATGCCCTGAAACGTTTTTCGCCCAACTTTT--------------GCGATGAGCGTTTCAACTATGT 
TCCGGATGAAAGAAGATGGGC 77 ----------ATCCCCTGAAACGCCCATCCATGGAAATCCACGCAC-------------GGAGCGTTTCAATGCTG TCCGGATGGAAGAAGATGTGC 80 ---------CGTGCCCTGGAACGTCTTGTCGCCCATTTCA---------------GCGAGGAGCGTTTCCATGTTG TCCGGATGAAAGAAGGCGTCA 50 ----------TCGCCCCGAGACGTTCATCGATCATTCA------------------CGAGGAGCGTTTCATGTTCA CCGGATGAAGAGAGAGCGGGA 91 ----------ATGCCCTGTTTTTTCATTAAATT---------------------AAACAGGAGTCAGAACACGTGC CCGGATGAAGAGAGAACGGGA 68 ----------ACGCCCTGTTTTTCACAC--------------------------AAACAGGAGTCAGAACATGCAA CCGGATAAAGAGAGAACGGG 53 ---------AAAGCCCTGTTTTTCAC---------------------------GAAACAGGAGTTCGTCATATG-TCCGGATGGAAGAGAGCGAAT 54 ----------GCGCCCTGATTCTAGTTTCGTG--------------------------AGGAACCTATGAACCAAA TCCGGATGGGAGAAGGTCGGC 116 ------CGCGATGCCCCGAAGGTGTG-----------------------------TTCAGGGGTGTCGCGATGAAC GGATGGGAGGTAGTACGTGGT 58 -------GCCTTACCCCGGAGCCTGACCT-------------------------GGCTAGGGGGAAGGCTTCTCGCAT TCCGGATGCAAGAGAACCG 32 ---------AAGGCCCCGAGGATTACATGCTTTTAAATCCTTTGAAAAGGGGACAAGATCATGAATCCTATAACCG TCCGGACGAAAGAAGGAGGAG 1 GACGCTCAGCTTGCCCCCCA------------------------------------GCAGGCGGCGTCCGCGTATG TCCGGATGGAAGAGAGCAAGC 45 ATCATTGGAAAAATGCCAACCCTGAAA-------------------GGCTTGAGACCATGACCATACTT TCCGGATGGGAGAAGGAGGGCCACTTGCGC TCCGGATGGAAGAAACGGAGCGCCTTATGG Direct RBS sequestering

The predicted mechanism of the RFN-mediated regulation of riboflavin genes and operons
• Transcription attenuation
• Translation attenuation

Phylogenetic tree of RFN-elements

New candidate flavin transporters:
1. ImpX, found in Fusobacterium nucleatum and Desulfitobacterium hafniense (impX). It has 9 predicted transmembrane segments and no homology to any known genes.
2. PnuX, found in actinobacteria (pnuX in Streptomyces coelicolor, Thermomonospora fusca, and Corynebacterium glutamicum). It has 6 predicted transmembrane segments and is homologous to PnuC (the N-ribosylnicotinamide transporter).

Known Thi-box signal in diverse bacterial genomes (Miranda-Rios et al., 1997)
THIC_EC TTCGGGATCCGCGGAACCTGA-TCAGGCTAA-TACCTGCG-AAGGGAACAAGAGTTA
THIC_VC TTCGGGATCCGTTGAACCTGA-TCAGGTTAA-TACCTGCG-AAGGGAACAAGAGAAG
THIC_MLO GCAGTGACCCGTTGAACCTGA-TCCAGTTCA-TACTGGCG-TAGGGACGGTGCAAGC
THIC_SM GCAGTGACCCGTTGAACCTGA-TCCAGTTCA-CACTGGCG-TAGGGACGGTGCAGAC
THIC_NM AGAAATACCCTTTACACCCGA-TCGGGATAA-TACCTGCG-TGGGGAGTTTTCACGG
thiC_BS TTCTTAACCCTTTGGACCTGA-TCTGGTTCG-TACCAGCG-TGGGGAAGTAGAGGAA
THIC_MT CCGTCGACCGTACGAACCTGA--CCGGGTAA-TGCCGGCG-TAGGGAGTTGCAAATG
THIT2_TVO GGATCGACCCTTTGAACCTGA-TCCGGGTAA-TGCCGGCG-GAGGGAAATTATGTCG
thi1_TM TCCTCGACCCCAAGAACCTGA-TCCGGGTAA-TGCCGGCG-GAGGGATCGGGGAAGG
Notation: Red: conserved nucleotides; Green: purine- or pyrimidine-conserved nucleotides; Blue: non-conserved nucleotides

Predicted regulatory THI-elements in bacterial genomes
1 2 3 3' FACULTATIVE STEM-LOOP 2' 4 5 5' 4' 1'
----====>===> -=====> <===== ========> <======= <=== ===> =====> <===== <=== <====---
BACILLUS/CLOSTRIDIUM GROUP
BS_THIC TAGTTACTGGGGGTGCCCGCT----------------TTCcgGGCTGAGAGAGAAGGCA-------------AGCTTCTTAACCCTTT---GGACCTGA-TCTGGTTCG-TACCAGCG-TGGGGA-AGTAGAGGA
BS_TENA TAACCACTAGGGGTGTCCTTC----------------ATAAGGGCTGAGATAAAAGTGT-------------GACTTTTAGACCCTCA---TAACTTGA-ACAGGTTCA-GACCTGCG-TAGGGA-AGTGGAGCG
BS_YLMB TTCATCCTAGGGGTGCTTTG-------------------CGAAGCTGAGAGAGACTT-----------------TGTCTCAACCCTTT---TGACCTGA-TCTGGATCA-TGCCAGCG-GAGGGA-AGCGGTGAA
BS_YKOF AAAGCACTAGGGGTGCTGT--------------------TTTGGCTGAGATAAAGCGCGGAA-----GAAACGCGCTTTGATCCCTTA---TGACCCGA-TCTGGATAA-TACCAGCG-TGGGGA-AGTGCAGGT
SA_TENA GAACTACTAGGGGAGCCTAAT----------------GATATGGCTGAGATGAATT-------------------GTTCAGACCCTTA---TGACCTGA-TTTGGTTAG-TACCAACG-TAGGAA-AGTAGTTAT
SA_YKOE 
CACACACTAGGGGTGTTT----------------------TATACTGAGATGAGGCTT---------------GCCCTCAAACCCTTT---GAACCTGA-TCTAGCTTG-AACTAGCG-TAGGAA-AGTGTTACT LLX_YUAJ TTTGCACAATGGGTCTATTGACAAA---------ACTGTCAGTAGCGAGA----------------------------AATACCATC----TGACCTGA-TCTGGGTAA-TGCCAGCG-TAGGAA-TGTGTTAAG CA_THIS ATAGTTAACGGGGAGCCTGTA-----------------GACAGGCTGAGAGTGGAATG--------------TGATTCCAGACCCTCA---TAACCTGA-TTTGGATAA-TGCCAACG-TAGGGA-GTTAATGCA CA_YUAJ TATGTGCTAGGGGTGCCTT---------------------TAGGCTGAGAAACAGTTT--------------GTCACGTTAACCCTT-----AACCTGA-TCTGGATAA-TACCAGCG-TAGGGA-AGCAGTTTG ST_YUAJ TTTCACAAAGGAGTGCTT-----------------------TGGCTGAGATCGCAA------------------TTGCGAAATCCTGA---GGACCTGA-TCTTGTTAG-TACAAGCG-TAGGGA-TTGTGACCA DHA_THIC TAATCACTAGGGGGGCCGAATA---------------AGGTCGGCTGAGATAAAGGACCCA---------AGAATCCTTTGACCCTT-----AACCTGA-TCTGGGTAA-TGCCAGCG-TAGGGAAGGTGGATAA LMO_TENA GAAAAACTAGGGGGGCCGAT-------------------TCTGGCTGAGATAGGAAGGTAAT-----------GCTTTCTGACCCTTT---GAACCTGT-TT--GTTAG-TGCAAGCG-TAGGGA-AGTGAATGT LMO_YUAJ TTACCACAGGGGGGGCTTC---------------------TTAGCTGAGATTGAGTCCACGTGT-----TTTTGGATTCTGACCCTTT---GAACCTGT-TC--GTTAA-TACGAGCG-TAGGGA-TTGTGGCGA PROTEOBACTERIA EC_THIB GTTCTCAACGGGGTGCCACGCGT------------ACGCGTGCGCTGAGAAA---------------------------ATACCCGTCGA---ACCTGA-TCCGGATAA-CGCCGGCG-AAGGGATTTGAGGC EC_THIM AAACGACTCGGGGTGCCCTTCTGC-------------GTGAAGGCTGAGAAA----------------------------TACCCGTATC---ACCTGA-TCTGGATAA-TGCCAGCG-TAGGGA-AGTCACG EC_THIC TTTCTTGTCGGAGTGCCTTA-------------------ACTGGCTGAGACCGTTT------------------ATTCGGGATCCGCGGA---ACCTGA-TCAGGCTAA-TACCTGCG-AAGGGA-ACAAGAG VC_THIC CCACTTGTCGGAGTGCCAT---------------------TGGGCTGAGACCGTTT------------------ATTCGGGATCCGTTGA---ACCTGA-TCAGGTTAA-TACCTGCG-AAGGGA-ACAAGAG VC_THID CCTGTAGTCGGGGAGCCTGAGAG-- 66 5 71 -AATTAAAGGCTGAGATCGCGT-------------------AGCGAGACCCGTTGA---ACCTGA-TTCAGTTAG-GACTGACG-TAGGGA-ACTATCC VC_THIB CCCACTCACGGGGGGCCACCCATTCAT-------CCGAATGGCGCTGAGATCAAGCAC---------------TGCTTGGGACCCGCA 21 
-ACCTGA-ACCAGATAA-TGCTGGCG-TAGGAATTGAGCTA XFA_THIC TTTGAAGCGGGGGTACCATAGCCA------------AGCTGCGGTTGAGAC----------------------------ACACCCTTCGA---ACCTGA-TCCGGTTTA-CACCGGCG-TAGGAAAGCTTCGT MLO_THIC CATTCACCAGGGGAGTCCCGG----------------CAAGGGGCTGAGATACTGCTGGCTTTC------GCGGCGCAGTGACCCGTTGA---ACCTGA-TCCAGTTCA-TACTGGCG-TAGGGACGGTGCAA MLO_THIB CGCTCTAACGGGGTGCCGGA------ 5 3 5 -----GACCGGCTGAGAGGCAGT------------------CTCGCCAACCCGCTGA---ACCTGA-TCCGGTTTG-TACCGGCG-GAGGGA-TTAGACG MLO_YK GCCCATCCACAGGGGTGCTCCGTAC-------------GGTCGGGGCTGAGACGGGGGCGG-----------CAAGCCCACAGACCCTAGA----AGCTGA-TCTGGGTAA-TACCAGCG-GAGCGA-GGCGGGCG NX_CITX CTCCTTGTCGGAGTGCCGCCGC---------------CGGGCGGCTGAGATTGCGA------------------AAGCAGAATCCGTAGA---ACCTGT--CGGGGTAA-TGCCTGCG-TAGGAA-ACAAACC NX_THIC ATTGAAACAGGGGTGCTGCCTGAT----------GTTTAGGCGGCTGAGAA----------------------------ATACCCTTTAC---ACCCGA-TCGGGATAA-TACCTGCG-TGGGGA-GTTTTCA ACTINOBACTERIAE MT_THIO CTGTAGACACGGGAGTCCCGGG--------------AGCGGGGTCTGAGAGTGGGCGCGCCT-------------GCCCTTACCGTCAC----ACCTGA-TCCGGATCA-TGCCGGCG-AAGGGAGGTCAAGGATG MT_THIC GTACCCACGCGGGAGCGCACGC--------------CGAGTGCGCTGAGAGGACGGCTCGGG------------GCCGTCGACCGTACGA---ACCTGA--CCGGGTAA-TGCCGGCG-TAGGGAGTTGCAAATG CGL_THIC CAGTCCCCACGGGCGCCCGA-----------------GCACGGGCTGAGATCGCGCTGATT---------GCTGCGCGAGCACCGTTTGA---ACCTG--TCCGGTTAG-CACCGGCG-AAGGAAGAGAGGAATGGTGC CGL_THID ACTAGGCACGGGGTGCCAACCGGATGG---AAAAATTCCGGAGGCTGAGAAA---------------------------ACACCCGTTGA---ACCTGC-TCTAGCTCG-TACTAGCG-AAGGGATGGCCTTAACGTG CGL_THIE CTTACCCCACGGGTGCCCAAT---------------GCATTGGGCTGAGATTGCGCGCTGT---------TGCTGCGCGGGACCGTTCGA---ACCTG--TCTGGTTAA-CACCAGCG-AAGGAAGCGAGGATTGATTG CGL_YKOE TCATAGACACGGGTGCTCGGTGA------------AAATCCGGGCTGAGATCTGGCA----------------TAGCCACGACCGTCGA----ACCTG-ATCCGGATAA-TGCCGGCG-ATAGGGAGGAAAAATATG CGL_OARX TAGTGACACGGGGTGCAAAAGCACTTT----AAAAAAGCTTTCGCTGAGATT---------------------------ACACCCGTCGA---ACCTG-ATCCAGTTAG-TACTGGCG-AAGGGACTGTCGCAT CYANOBACTERIA NPU_THIC 
TCCATGCTAGGGGTGCCTACAT---------------AACCAGGCTGAGATC---------------------------ACACCCTTAAC---ACCTGAGTCTGGGTAA-TACCAGCG-GAGGGAAGCTGTTTATTG CY_THIC CCATAGCTAGGGGTGTCTAGAA---------------AGCTAGGCTGAGAA----------------------------AAACCCTTAGA---ACCTGAGACTGGGTAA-TACCAGCG-GAGGGAAGCTCACCATTC AN_THIC TCCATGCTAGGGGTGCTTGCAC---------------TAACAGGCTGAGATT---------------------------ACACCCTTAAC---ACCTGAGACTGGGTAA-TACCAGCG-AAGGGAAGCTGTTTATTG THERMUS/DEINOCOCCUS, THERMOTOGALES, Fusobacterium, CFB group DR_THIB CGCGTCACCGGGGGTGCCCTGCTT------------CGGCAGCGGCTGAGAAC---------------------------ACACCCCAGGA---ACCTGA-ACCGGGTCA-TTCCGGCG-GAGGGAGTGTGATGC DR_THIC ATCGTCAACAGGGGTGCCTCCGCATA--------TGGGCCGGAGGCTGAGAGGGCAACT---------------CGGGCCTAACCCTATGA---ACCTGA-ACTGGTTAG-CACCAGCG-GAGGGA-GTGTGACG TQ_THIBGGCCGTCACCGGGGGTGCCCCA------------------AAAGGGCTGAGAGC---------------------------ATACCCTTGGA---ACCTGA-TCCGGGTCA-TGCCGGCG-TAGGGAAGGTGACGGCC TM_THI1 CCTTCCCCAGGGGGAGCTCCTAT---------------TCCGGGGCTGAGAGGAGGACGG-------------AAGTCCTCGACCCCAAGA---ACCTGA-TCCGGGTAA-TGCCGGCG-GAGGGATCGGGGAAGGA FN_THIC TATATGTACTGGGGAGCTT----------------------TGTGCTGAGATTAGAACCT------------TTTTTCTTAGACCCATAGT---ACCT-GA-TTTGGATAA-TGCCAACG-AAGGGA—GTACCA FN_THIX ACTAGTTACAAGGGAGTTAATA-----------------AATTGACTGAGAAAAGGATG--------------TGAGCCTTGACCTTTTG----ACCT-GA-TTTGGATAA-TGCCAACG-TAGGAA--GTAAA PG_THIS AGACCGCTACGGGGGTGCTTGCCG--- 4 3 4 -GATACGGCAGGCTGAGAT---------------------------AATACCCATAG---ACCT-GA-TCCGGATAA-TACCGGCG-GAGGGAT-GTAG PG_OMR ATTGGGAGAAGGGGTGCTTCCTGTA--- 3 7 3 --GTGGATGGCTGAGAAC---------------------------AAACCCTCATC---ACCT-GA-ACCGGATAA-TACCGGCG-TAGGAAA-CTCTC BX_THIS TAAAGACAAAGGGGTGCCACC------------------CGGTGGCTGAGATT---------------------------ATACCCTAAGA---ACCT-GA-TGCAGTTAG-TACTGCCG-AAGGGA—TTGTG ARCHAEA TAC_T1 GGTGTGGTGGGGGAGCTCCAT-----------------AAGGGGCTGAGAGGATCCGG---------------ATGGATCGATCCCTGGA---ACCTGA-TCCGGGTAA-TACCGGCG-GAGGGAAATTATG FAC_T1 
AGTTATACCGGGGAGCTAA---------------------AATGCTGAGAGGATAA-------------------GGATCGACCCGTGCA---ACCTGA-TCCGGACAA-TACCGGCG-GAGGGAGATGGATA

Conserved RNA secondary structure of the regulatory THI element
[Figure: consensus secondary structure of the THI-element, showing helices 1 to 5, the facultative stem-loop, and the Thi-box.]
Capitals: strongly conserved positions. Dashes and dots: obligatory and facultative base pairs, respectively.
Degenerate positions: R = A or G; Y = C or U; K = G or U; M = A or C; N = any nucleotide.

Distribution of THI elements in bacterial genomes
The THI-element regulates thiamine biosynthetic genes and transporters.

Genomes                    Number of analyzed genomes   Number of genomes with THI   Number of THI elements
α-proteobacteria                        7                            7                         15
β-proteobacteria                        6                            6                         12
γ-proteobacteria                       18                           17                         38
ε- and δ-proteobacteria                 3                            1                          1
Bacillus/Clostridium                   18                           18                         51
Actinomycetes                           9                            9                         25
Cyanobacteria                           5                            5                          5
Other eubacteria                       14                           11                         11
Archaea (Thermoplasma)                 17                            3                          6
Total                                  97                           77                        164

A number of NEW candidate thiamine-related transporters were identified.
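The degenerate-position legend above (R, Y, K, M, N) maps directly onto regular-expression character classes, which is one simple way to scan a sequence for a degenerate motif. The following is a minimal sketch in Python, not the method used in this work: the function names are hypothetical, the motif `YRCYNGCG` is an illustrative pattern chosen for this example rather than a published consensus, and the test row is the E. coli thiC Thi-box row from the alignment. Because the alignments are written in the DNA alphabet, U and T are treated as equivalent.

```python
import re

# Degenerate codes from the legend (R, Y, K, M, N), widened so that both
# RNA (U) and DNA-alphabet (T) text match. Assumption: T and U are equivalent.
DEGENERATE = {
    "A": "A", "C": "C", "G": "G", "U": "[TU]", "T": "[TU]",
    "R": "[AG]", "Y": "[CTU]", "K": "[GTU]", "M": "[AC]", "N": "[ACGTU]",
}


def consensus_to_regex(consensus):
    """Translate a degenerate consensus string into a compiled regex."""
    return re.compile("".join(DEGENERATE[symbol] for symbol in consensus.upper()))


def find_motif(consensus, aligned_row):
    """Strip alignment gaps ('-') and return (start, match) of the first hit, or None."""
    sequence = aligned_row.replace("-", "").upper()
    match = consensus_to_regex(consensus).search(sequence)
    return (match.start(), match.group()) if match else None


# Illustrative motif only (hypothetical, not the published Thi-box consensus).
ILLUSTRATIVE_MOTIF = "YRCYNGCG"
# E. coli thiC Thi-box row, copied from the alignment.
EC_THIC_ROW = "TTCGGGATCCGCGGAACCTGA-TCAGGCTAA-TACCTGCG-AAGGGAACAAGAGTTA"

hit = find_motif(ILLUSTRATIVE_MOTIF, EC_THIC_ROW)
print(hit)  # (30, 'TACCTGCG'): position in the gap-stripped sequence
```

Reported positions refer to the gap-stripped sequence, not to alignment columns. A regex scan of this kind only checks primary-sequence conservation; it says nothing about the base-paired helices that define the element.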
The predicted mechanism of the THI-mediated regulation of thiamin genes
• Transcription attenuation: Bacillus/Clostridium group, Thermotoga, Fusobacterium, Chloroflexus
• Translation attenuation: Thermus/Deinococcus group, CFB group, Proteobacteria
• Direct RBS sequestering: Actinobacteria, Cyanobacteria, Archaea

New functional predictions
• Hydroxyethylthiazole transport (Gram-positive bacteria)
• Hydroxymethylpyrimidine transport (Gram-negative bacteria)

Predicted THI-regulated genes (more enzymes)
• tenA: gene of unknown function, associated with thiD. Found in most firmicutes, some proteobacteria and archaea; ThiD-TenA gene fusions occur in some eukaryotes; forms clusters with thiD and other THI-element-regulated genes in most bacteria; a single tenA gene is also regulated by THI-elements in some bacteria; not found in genomes without the thiamin pathway; always co-occurs with the thiD and thiE genes.
• tenI: gene of unknown function, thiE paralog. Found in some unrelated bacteria; forms a separate branch in the phylogenetic tree for thiE; in most bacteria, located in clusters of THI-element-regulated genes.
• ylmB from Bacilli belongs to the ArgE/dapE/ACY1/CPG2/yscS family of metallopeptidases; regulated by THI-elements in B. subtilis and B. halodurans, not regulated in B. cereus.
• thi-4 from Thermotoga maritima belongs to a family of putative thiamine biosynthetic enzymes from archaea and eukaryotes; located in one operon with thiC and thiD.
• oarX from Methylobacillus and Staphylococcus is a single THI-element-regulated gene; belongs to the short-chain dehydrogenase/reductase (SDR) superfamily.

Regulation of cobalamin-related genes
Experimentally known facts:
• An extensive region of the mRNA leader is essential for regulation of the btuB gene by vitamin B12.
• The highly conserved B12-box (rAGYCMGgAgaCCkGCcd) is involved in regulation of the cobalamin biosynthetic genes (E. coli, S. typhimurium).
• Post-transcriptional regulation: an RBS-sequestering hairpin is essential for regulation of the btuB and cbiA genes.
• Ado-CBL is an effector molecule involved in the regulation of the CBL genes.
• Identification of other conserved sequence regions and prediction of the common RNA secondary structure of the B12-element.

The B12-element: a regulator of the cobalamin pathway
[Figure: consensus secondary structure of the B12-element, showing the base stem (helix 0), helices 1 to 6, additional hairpins I and II, the facultative hairpin, Part I, Part II, and the B12-box. Panel labels: Bacillus/Clostridium group; proteobacteria; various taxonomic groups.]

Alignment of B12-elements: alpha- and beta-proteobacteria
0 1 1' 2 AddI 2' 3 3' 4
======> -===><=======> >< <==== ===> <== =====>
-proteobacteria hgGtkcy rg aa aGGGAA cgGtg a tCcg RCdG-ycCcCGChaCKGTra
MLO_METE -285 GCGCATGTCGTGGTTCT 22 AGC--TAAGAGGGAA--GCCGGTG 2 ATGCCGGCGCTG-CCCCCGCAACTGTTAGCGGCGAG
MLO_CFRX -237 CCGCTCCAGACGGTCCC 15 GGGGCTAAGAGGGAA--TGCGGTG 16 AATCCGCGGCTG-TCCCCGCAACTGTAAGCGAAGAG
MLO_BTUD -290 GGGTGCGTGATGGTCCC 16 GGGT-GAAAAGGGAA--CACGGTG 16 AGACCGTGGCTG-CCCCCGCAACTGTAAGCGGAGAG
MLO_CBTAB -213 AGTCATGCAGTCGTCGG 13 CC----AAGAGGGAA--TGCGGTG 19 ATGCCGTGGCTG-CCCCCGCAACTGTGTGCGGTAGT
MLO_BLUB -233 CGCCACTGCCTGGTGCC 11 GGA--GAATCGGGAA--CACGGTT 2 ACTCCGTGGCGT--GCCCAACGCTGTAAGGGGGACC
MLO_ARDX -308 ATGTCATCTCAGGTGCC 18 GGA--GAATTGGGAA--GCCGGTC 2 AGTCCGGCGCTG-CCCCCGCAACGGTGGTGGAGTTC
SM_ARDX -310 AGGACACTCAAGGTGCC 16 GGA--GAATTGGGAA--GCCGGTC 2 ATCCCGGCGCTG-CCCCCGCAACGGTGGTGGAGCGA
SM_BTUF -391 CTGGGACCGACGGTTCC 19 GGAT-TAATAGGGAA--CACGGTG 21 AAACCGTGGCTG-CCCCCGCAACTGTAAGCGGATCG
SM_BLUB -251 TGCCGCCGTCAGGTGCC 11 GGG--GAATCGGGAA--GCCGGTG 2 GTTCCGGCACGT-GCCC---AACGCTGTGAAGGGGA
SM_CBTC -255 GATCATGTGATGGTTCC 18 GGAT-GAAAAGGGAA--CACGGTG 21 AAACCGTGGCTG-CCCCCGCAACTGTGAGCGGCGAG
SM_COBU -527 GCAGTATGGATGGTTCT 21 GGAG-TAAATGGGAA--TGCGAAG 23
TTATCGCAGCCG-ACCCCGCGACTGTAGAACGGTCA PD_COBU -586 AGGTGTTGGATGGTTCC 21 GGAA-TAATTGGGAA--TGTGACG 22 TTATCGCAGCCG-ACCCCGCGACTGTAGAACGGTCA BME_BTUB -378 TTTCAGGAGACGGTTCC 11 GGAT-GAAAAGGGAA--CACGGTG 14 AAACCGAGACTG-CCCCCGCAACTGTAACCGGAGAG BME_BTUF -398 ACCGTCATGACGGTTCC 17 GGAT-TAATAGGGAA--CACGGTG 22 AGACCGTGGCTG-CCCCCGCAACTGTAAGCGGATTG BME_NRDH -558 CTTGTGTTCGAGGTTCT 19 AGCT-AAGACGGGAA--TCCGGTG 23 ATGCCGGAGCTG-CCCCCGCAACTGTAAGCGGCGAG BME_CBTAB -281 ACCATGTGACAGGTTTT 19 AATACCAAAAGGGAA--TGCGACG 22 TTATCGCAGCCG-ACCCCGCGACTGTAGAGCGGAGA AU_CFRX -329 AAGGGACTGACGGTCTT 16 AAGC-TAAGAGGGAA--CACGGTT 18 ATTCCGTGGCTG-CCCCCGCAACTGTAAGCGGTAAG AU_NRDH -257 GTGGTGTTCAAGGTTCT 20 AGCT-AAGACGGGAA--TTCGGTG 23 AGGCCGAAACTG-CCCCCGCAACTGTGAGCGGCGAG AU_CBTAB -382 ATGTCCGTGATGGTTCC 17 GGT--GAAAAGGGAA--CACGATA 12 CATTCGTGGCTG-CCCCCGCAACTGTGAGCGGAGAG AU_ACHX -299 TTAGCCATCGTGGT-TC 16 GAGC-TAAGAGGGAA--TTCGGTG 20 AATCCGAAGCTG-CCCCCGCAACTGTAAGCGACGAG AU_BTUF -386 GAGAAAGCGACGGTTCC 18 GGAT-TAATAGGGAA--CATGGTG 20 ATGCCTTGGCTG-CCCCCGCAACTGTAAGCGGATTG AU_BLUB -272 TTCTCCGGTCAGGTGCC 9 GGC 4 AATCGGGAA--TCCGGTG 2 AGACCGGAACGT-GCCC-AACGCTGTAAGGCGGATG BJA_BTUB -321 TGATCGGTGACGGTTCT 9 GAT CAAAAGGGAA--CGTGGTG 30 ACGCCACGGCTG-CCCCCGCAACTGTAAGCGGTGAA BJA_METE -296 CAAGTCGTCGAGGTTCT 12 GAT 8 AAGAGGGAA--GCCGGTG 3 ATGCCGGCTCTG-CCCCCGCAACTGTGAGCGGCGAG BJA_CBTC -250 AGGACGGGCATGGTGCT 22 GCA--TAATCGGGAA--TGGGGAT 24 AAACCCCAGCCG-CCCCCGCGACTGTAAGCGGTGAA BJA_BTUB3 -308 ATGCTCGCGACGGTTTC 11 GAT--GAAAAGGGAA--TGCGGTG 16 ATGCCGCGGCTG-CCCCCGCAACTGTAAGCGGATAA BJA_CFRX -308 GGCCCGGCGTTGGTTCC 12 GGC--GAAGAGGGAA--TGCGATA 27 AAAATGCAGCCG-CCCCCGCGACCGTGACCGGAGAG RC_CBTF -327 AAGGCGGGATTGGTTCC 12 GGAT-GAAAAGGGAA--TGCGGTG 12 AACCCGCAGCTG-CCCCCGCAACTGTAAGCGGCGAG RC_BTUB -313 TGTCCCGTCCAAGTTCC 12 GGAT-TGAAAGGGAA--CACGGAA 14 AGACCGTGGCTG-CCCCCGCAACTGTGAGCGGCGAG RC_X-CBIP3 -264 GCCCGGGCCTTGGTTCC 14 GGAC-GAAGAGGGAA--GCCGGTG 2 AGTCCGGCGCTG-CCCCCGCAACTGTAAGCGGCAAG RC_BTUF -361 CCAGCGGCGTCGGTTTC 6 GAAT-TGAAAGGGAA--TCCGTTG 15 
GAACCGGAACTG-CCCCCGCAACTGTAGGCGGCGAG RC_ARDX -246 GAAGGCCTCAGGGTGCC 14 GGA--GAATTGGGAA--GCCGGTG 2 AGACCGGCGCTG-CCCCCGCAACGGTCAGCAATGAG RC_X-CNOA -200 GGGGCGTCATCGGTCCC 24 GGGGGAAAGAGGGAA--TACGGTG 21 AATCCGTGGCTG-CCCCCGCAACTGTGAGCGGCGAG RC_X-BTUD -240 TCGCGGCAGATGGTTCC 21 GGT--GAAAAGGGAA--TACGGTG 20 AATCCGTAACTG-CCCCCGCAACTGTAAGCGGCGAG RC_CFRX -295 GGGCGGGCGCTGGTTTC 13 GC---GAAGAGGGAA-----TGTG 31 CGACCGCAGCCG-CCCCCGCGACCGTGACCGGAGAG RC_CBIM -282 CAACAGGCGATGGTTCC 10 GGAT-TAATAGGGAA--CACGGTG 21 AATCCGTGGCTG-CCCCCGCAACTGTGAGCGGCGAG RC_EXBB -322 TGACGTGTTCAAGTTCC 12 GGAT-TGAAAGGGAA--CACGGAA 14 AGACCGTGGCTG-CCCCCGCAACTGTGAGCGGCGAG RC_CRDX -264 CAGCGGGCCTTGGT-CC 16 GGGG-TAATAGGGAA--GCCGGTG 2 ACTCCGGCGCTG-CCCCCGCAACTGTCAGCGGCAAG RC_NRDD -272 GTGACGCTCTGGGT-CT 14 AGC--CAAGAGGGAA--GCCGGTG 2 ATTCCGGCGCTG-CCCCCGCAACTGTAAGCGGCGAG RC04759 -466 CTTGTGGCGATGGTGGC 17 GCCT-GAAAAGGGAA--TGCGGTG 14 AGGCCGCGGCTG-CCCCCGCAACTGTGAGCGACGAG RS_BLUB -217 GGCAGGGGTCAGGTGCC 10 GGA--GAATCGGGAA--GCCGGTG 2 AATCCGGCGCGG-GCCC-GCCGCTGTGACGGGGATG RS_BLUE -287 GTGCGGGCGACGGTTCC 14 GGC--GAAGAGGGAA--TGCGGTG 17 AAGCCGCGACTG-CCCCCGCAACTGTAGGCGGCGAG RS_CFRX -286 TCCGGCGCGCTGGTTCC 14 GGC--GAAGAGGGAA--TGCCCCA 0 --GAGGCAGCCG-CCCCCGCGACCGTGACCGGAGAG RS_CBTC -267 CGGGCTATGACGGTTCC 19 GGAT-GAAAAGGGAA--CGCGGTG 16 GTTCCGCGACTG-CCCCCGCAACTGTGAGCGGCGAG RS_BTUB -320 GGAACGGCTTCGGTTCC 12 GGAT-GAAAAGGGAA--CGCGGTG 16 ACTCCGCGGCTG-CCCCCGCAACTGTAGGCGGCGAG RS_BTUF -365 CAATCCTCGTCGGTTTC 6 GAAT-TGAAAGGGAA--TCCGCCG 15 GAACCGGAACTG-CCCCCGCAACTGTAGGCGGCGAG SAR_BTUB -400 TTGATCGCGCCGGTGCC 8 GGGCTTAATCGGGAA--TGCGGTG 16 AATCCGCGGCTG-TCTCTGCAACTGTAAGCGGATAG SAR_COBW -403 ATGATCGCGCCGGTGCC 8 GGCT-TAATCGGGAA--TGCGGTG 16 AATCCGTGGCTG-TCCCTGCAACTGTAAGCGGATAG SAR_BTUBF -297 CCGACGCCAGAGGTGCC 10 GGCT--AAGAGGGAA--GCCGGTT 2 ATTCCGGCGCTG-CCCCCGCAACTGTAACCGGATAG CO_METE -339 GCCGTTGTCGTGGT-CT 18 AGC--TAAGAGGGAA--GTCGGTG 16 AATCCGGCGCTG-CCCCCGCAACTGTGAGCGGCGAG CO_BTUB -318 GCTTCGCGTCAGGTTCC 8 GGAT-GAAAAGGGAA--CGAGGTT 2 
AGACCTCGGCTG-CCCCCGCAACTGTAAGCGGCGAG RPA_HOXN -281 GCGCCCGTTCAGGTGTG 15 CAC------AGGGAA--GCCGGTG 28 AATCCGGCGCTG-CGCCCGCAACTGTGAGCGGTGAG RPA_BTUB3 -448 TGACCAGCGACGGTTCC 6 GGAT-CAATAGGGAA--CGCGGTG 16 ATTCCGCGGCTG-CCCCCGCAACTGTAAGCGGCGAG RPA_CFRX5 -383 TTGACGTCTTCGGTGCC 10 GGTG-AAACTGGGAA--TACGGTG 15 AATCCGTAGCTG-CCCCCGCAACTGTAGGCGGATCT RPA_CRDX -364 TGCCAAGCGATGGTCCT 10 AGGT-GAAAAGGGAA--GCCGGTG 19 ATCCCGGAGCTG-CCCCCGCAACTGTAAGCGACGAG RPA_METE -297 ATCGCCGTCGAGGTTCT 19 AGCT--AAGAGGGAA--GCCGGTG 2 AGGCCGGCGCTG-CCCCCGCAACTGTTAGCGGTGAG RPA_COBT2 -412 CCGCTCGCTTCGGTGCC 12 GGTG--AAACGGGAA--TGCGGTG 16 AGTCCGCGGCTG-CCCCCGCAACTGTAAGCGGATCG RPA_BTUF2 -320 GAGGTTGTACCGGTGCC 13 GGTG--AAACGGGAA--TGCGGTG 15 ATGCCGCAGCTG-CCCTCGCAACTGTGGGCGGATCG RPA_BTUB -304 ATGGCGGTGACGGTTCC 5 GGGATGAAAAGGGAA--TACGGTG 24 AGGCCGTAGCTG-TTCCCGCAACTGTAAGCGGATCG RPA_CBIC CGCGCGCCGACGGTGTC 14 GACG--AAGAGGGAA-TATCGGAA 20 GCGCCGAAGCTG-CCCCCGCAACTGTAAACGGTGAG BPS_HOXN -591 GCTCGCGTTTCGGTGCT 23 AGT--CAAACGGGAA--ACAGGGA 22 CAACCTGTGCTG-CCCCCGCAACGGTAAGCGAAGGC BPS_BTUB -329 GGCGCCGCCTCGGTGCT 16 GGT--TAAACGGGAA--GCAGGGC 22 CAACCTGCGCTG-CCCCCGCAACGGTAAGCGATCGC BPS_COBE -391 TGCGCGCGTTCGGTGCC 22 GCC---CAACGGGAA--ACAGGAA 17 CAACCTGTGCTGCCCCCCGCAACGGTAAGCCGCCTG BPS_COBG -303 GTCCGTCGACCGGCGCC 6 GGC---AAGAGGGAA--CGCAGGG 9 CCGCTGCGGCTG-CCCCCGCAACTGTGAGCAGCGAG NE_BTUB -343 CCCTTGTTTGAGGTGTC 20 GAT--GAAACGGGAA--GCCGGTG 22 ATGCCGGCACTG-CCCCCGCAACGGTAAATGAGTCA MFL_BTUB -327 CCAAGTTTTGAGGTGTC 22 GGTG-AAACTGGGAA--ACAGGTG 23 ATGCCTGTGCTG-CCCCCGCAACGGTAAGCAAGCCG MFL_BTUB2 -411 ACCTCACTTACGGTTTT 19 AAAT--AATAGGGAA--TCCGGTG 16 AATCCGGAACTG-CCCCCGCAACTGTAATCGGTGAG MFL_NRDA -365 ACACCATCTACGGTGTC 22 GA----AACAGGGAA--TGCGGTC 16 AAGCCGCAGCTG-CCCCCGCAACTGTGACCAGTGAG REU_BTUB -252 CCCCCGTTCCAGGTGCT 24 AGTT--CAACGGGAA--ACAGGGA 34 CAACCTGTGCTG-CCCCCGCAACGGTAAGCGACCGC RSO_HOXN -270 CTCACGATGATGGTGCC 7 GGTG--AAACGGGAA--CGCGGTG 2 ATGCCGCGGCTG-CCCCCGCAACTGTAAGCGACGAG RSO_BTUB -388 CGCCGCGTCCTGGTGCC 16 AGTT--AAACGGGAA--GCAGGGA 22 
CAACCTGCGCTG-CCCCCGCAACGGTAAGCGAACGC VS 11 9 10 8 9 12 13 26 37 11 76 77 28 28 10 24 10 13 10 11 29 10 12 14 11 12 8 11 9 8 13 7 10 9 8 10 9 9 10 10 10 11 6 11 11 12 29 28 13 13 10 11 10 11 8 12 11 11 10 9 53 70 28 13 10 9 13 15 35 10 59 5 6 AddII 6' 5' VS -==> ======> >< <====== <== gcCACTG YGGGAAGgc GGTGTCACTGAGGCGAA-----CGGCCTCGGGAAGACGGG 9 AAAGCCACTGGGACG---------TTCCCGGGAAGGCGGC 11 CATGCCACTGGCCGGC-------AAGGCTGGGAAGGCAGG 9 TATGCCACTGAAGATT------CGTCTTCGGGAAGGTGGG 9 AATGCCACTGTCGA-----------TGACGGGAAGGCACC 9 GAGACCACTGGGCAA--------AAGCCTGGGAAGGTGTC 16 AAGGCCACTGGACACC-------GCGTCCGGGAAGGCGCC 18 CCAGCCACTGCGCGCG-------TTGCGCGGGAAGGCAGA 9 TTTGCCACTGAATATTGA---AGCTATTCGGGAAGGCGGC 8 GATGCCATTGGCCATGA-----ATCGGCTGATAAGGCGGA 8 AAAGCCACTGGCGT--- 69 ---ACGCCGGGAAGGCGAG 76 AAAGCCACTGGCGT-- 119 AAGACGCCGGGAAGGTGAG 64 AAAGCCACTGAAA---- 15 ----AATCGGGAAGGCGGA 10 CATGCCACTGTGCCCA-------CGGCACGGGAAGGCAGA 10 CATGCCACTGGCGA----------AAGCCGGGAAGGCGGG 9 GCAGCCACTGGAAATCAGA-TGGATTTCTGGGAAGGCGCT 10 AAAGCCACTGAACCTTTA-TGATCGGTTCGGGAAGGCGGT 12 TGAGCCACTGGAGCCAA-----AAGCTCCGGGAAGGCTGG 11 AATGCCACTGGCAA--- 29 --AATGCCGGGAAGGTGTT 8 CATGTCACTGAGGCC--------GGCCTCGGGAAGACGGA 9 CATGCCACTGTTTTTTT----CGGAATGCGGGAAGGCAGA 10 CATGCCACTGAAGC----------AATTCGGGAAGGCGAA 9 TATGCCACTGGGAATCT-----CGGTCCTGGGAAGGCGAC 9 GATGTCGCTGAAGCCTGC---ACGGCTTCGGGAAGGCCGG 10 ACCGCCACTGGGCCGCA------AGGTCCGGGAAGGCCGG 10 GAAGCCACTGGGTCCC-------GGTCCCGGGAAGGCGAC 10 GAGGCCACTGATCCCTG----ACGGGATCGGGAAGGCGGG 18 GATGCCACTGGGGAT---------GCCCCGGGAAGGCCGA 9 CATGCCACTGGGAT----------TTCCCGGGAAGGCCGA 8 ACGGCCACTGATCGC--------AGGATCGGGAAGGCGCA 9 AAAGTCACTGTGGCGC------ATGCCATGGGAAGGCCGC 11 AAGGCCACTGGACCCC------GGGGTTCGGGAAGGCGCT 18 GATGCCACTGGGCCTG------CCGGTCCGGGAAGGCCGG 8 GCAACCACTGGCCCCGAC--CGCGGGGCCGGGAAGGTGGG 7 GAGGCCACTGGCAC----------CAGCCGGGAAGGCGGG 24 AAGACCACTGGCCC--- 18 -AACGGCCGGGAAGGTGAC 10 CATGCCACTGGGCC----------CGCCCGGGAAGGCCGA 8 AСACCCACTGGCCC----------CGGCCGGGAAGGGGCA 9 CACGCCACTGGGCCC--------CGGCCCGGGAAGGCCAG 8 
AATGCCACTGGGCCC--------GCGCCCGGGAAGGTCCG 10 GAGGCCACCGGTT------------CGCCGGGAAGGCGCC 9 ATGCCCACTGGCCCAGG-----ACAGGCCGGGAAGGGCGG 9 CAAGCCACTGGCCGCA--------AGGCCGGGAAGGCGGG 23 GACGCCACTGGACCGAA-----AGGGCCCGGGAAGGTTCG 10 GGCGCCACTGGGAT----------GTCCCGGGAAGGCCGG 9 AAAGCCACTGTGGCCTC-----AAGCCATGGGAAGGCCGC 10 GCAGCCACTGGGCCGG- 15 --AGGTCTGGGAAGGCGTG 10 GCAGCCACTGGGCAG-- 10 -CAAGTCTGGGAAGGCGTG 10 CATGCCACTGGTGTCGG- 6 CCAGCACCGGGAAGGCGGG 10 CGTGTCACTGACGCG-- 15 GATGCGTCGGGAAGGCCAG 11 CATGCCACTGGGCCCAA-----AAGGCCTGGGAAGGCGAC 12 TCGGCCACTGGGCAGCA-----CTTGCCCGGGAAGGCGAA 9 CACGCCACTGGGCTTT-------CGTCCTGGGAAGGCGGT 9 GTAGCCACTGACGTCCT-----CGGCGTCGGGAAGGCGGT 7 GAGGCCACTGGGAA----------TTCCTGGGAAGGCGGC 9 AAAGCCACTGGGAGC--------GATCCCGGGAAGGTCGA 10 TCCGCCACTGAG----- 18 -----CTCGGGAAGGCGAC 7 CATGCCACTGACCAGA-------TCGGTCGGGAAGGCGGA 8 GATGCCACTGGGAACCT-----CGGTCCTGGGAAGGCGAC 6 TACGCCACTGGATCA---------TATCCGGGAAGGCCGC 8 CCAACCACTGGACGCAT-----CGCGTCCGGGAAGGTGAA 5 GGTGCCACTGCGCTTC-------GCGCGCGGGAAGGCGAG 5 ACGGCCACTGTCCTC--------GCGGATGGGAAGGCGGC 7 CACGCCACTGGCCAC-- 16 ----CGCCGGGAAGGCCCG 10 CACGCCACTGTGCTGT------ATGGCACGGGAAGGCGCA 20 CATGCCACTGTGAAAGA-----CCTTCATGGGAAGGCGGC 9 CTTGCCACTGGACTT---------GATCCGGGAAGGCCGC 11 AAGGTCACTGGGCCTGG- 5 TGAGGCCCGGGAAGACAGG 10 AACGCCACTGAATC--- 17 -----ATCGGGAAGGCGGC 6 CCAGCCACCGCACG----------ATGCCGGGAAGGCGGC 9 CATGCCACTGTTCCG--------CGGAACGGGAAGGCGGC 6 4' 0' <===== ---> <---<====== rAGYCMGgAgaCCkGCcd TGACCCGCGAGCCAGGAGACCTGCCACGACGAACAAC TGACCCGCGAGCCAGGAGACCTGCCGTCTGCGACAAA AGACCCGCGAGCCAGGAGACCTGCCATCACTGAGTTG TGATCCGTGAGCCAGGAGACCTGCCGACGACGGCAAA TTGATCCCGAGCCAGAAGACCGGCCTGGCAGGCATCG ACACTCCAGAGCCCGGAAACCAGCCCGAGATTTTTGA CGGCTCCAGAGCCCGGAAACCAGCCTTGAAGCAGAAA CTGTCCGTGAGCCAGGAGACCTGCCGTCAAATCGATC ATGATCCGAAGTCAGAAGACCGGCCTGGCGAGATAGA CGACCCGCAAGCCAGGAGACCTGCCATCACCTTGGGC GACGCCGTGAGCCAGGAGACCTGCCATCCGTCAGGGC GACGCCGTGAGCCAGGAGACCTGCCATCCGGCATGGG AGACCCGGAAGTCAGGAGACCTGCCGTATCCGGTCAC TTATCCGCAAGCCAGGAGACCTGCCGTCTTACGTAGT 
TGAGCCGTGAGCCAGGAGACCTGCCTTGAGCGTGAAC AGACCCGCGAGCCAGGAGACCTGCCTGTTGCATGAGG ATAGCCGCAAGCCAGGAGACCTGCCGTTTCAGGAAAA TGACCCGCAAGTCAGGAGACCTGCCTTGAGCGCAAAT TGACCCGTAAGCCAGGAGACCTGCCATCACGGAAATA TGACCCGCAAGCCAGGAGACCTGCCGCGATAGATAAC AAATCCGTGAGCCAGGAGACCTGCCGTCAAAATGGAA -TGAAGCTTAGTCAGAAGACCGGCCTGGCAGGATAGA CGACCCGCGAGCCAGGAGACCTGCCGTCAGCCGTGGT TGACCAGCAAGCCAGGAGACCGGCCCCGACAATATAT TGAACCGCGAGCCAGGAGACCGGCCGTGCATGTTTTG CGACCCGCGAGCCAGGAGACCTGCCGTCAGCCGTGGT TGCTCCGCAAGCCGGGAGACCTGCCAGCGCGGACGAT GGACCCGCAAGCCAGGAGACCTGCCACCCCCCGGGCC AGACCCGTGAGCCAGGAGACCTGCTTGGACGATCACC CGAGCCGCAAGCCAGGAGACCTGCCAGGCCGAAACCA -GACCCGCCAGTCAGGAGACCTGCCGACACGTCGAAA CGCATTGCAAGCCCGGAAACCAGCCCTGTGACCGCCG AGACCCGCAAGCCAGGAGACCTGCCTGTGATGCGCCC CGACCCGCAAGTCAGGAGACCTGCCATCAGCGTCATC GCATCCGCAAGCCGGGAGACCTGCCAGCGCATGGATT CGAACCGCAAGTCAGGAGACCTGCCATCGCTCTGGCG AGACCCGCGAGCCAGGAGACCTGCTTGGACATTCACC CGAGCCGCAAGCCAGGAGACCTGCCAGGCCAAAGACC TGACCCGCAAAGCAGGAGACCTGCCCGAGCCTTGATG -GACCCGCAAGCCAGGAGACCTGCCATCGCAGACGTT ATGAACCGGAGCCAGAAGACCGGCCTGACGCAGAGGT AGACCCGCGAGTCAGGAGACCTGCCGTCGACGGACCT ACATCCGCAAGCCGGGAGACCTGCCAGCGCTGAGACT CGACCCGCGAGTCAGGAGACCTGCCGTCGAGCGCGCA CGACCCGCAAGTCAGGAGACCTGCCGGAGCGATCACC TGACCCGCCAGTCAGGAGACCTGCCGGCGTTCGATCT TGACCCGTGAGCCAGGAGACCTGCCGGCGTCTGGTCG TGACCCGCGAGCCAGGAGACCTGCCGGCGCACCGGTC TGACCCGGAAGCCAGGAAACCTGCCCTTGGTTGTCGT CGACCCGTGAGCCAGGAGACCTGCCTCGACAGATAAC TGACCCGTGAGCCAGGAGACCTGCCCGGCGCAGTCGT CGACCCGTGAGCCAGGAGACCGGCCTGAGTACGTCAT CGACCCGCGAGCCAGGAGACCTGCCGTCAGTCGTGGT ATATCCGTGAGCCAGGAGACCGGCCGAAGACGGGAAG CGACTCGCGAGCCAGGAGACCTGCCATCGCGTATTGT TGACCCGCGAGCCAGGAGACCTGCCTCGTCGAACGAA ATGTCCGCGAGCCAGGAGACCGGCCGAAGTCCGCAAC ATATCCGCGAGCCAGGAGACCGGCCGGTACAAGGTGT TCAACCGCGAGCCAGGAGACCTGCCGTCATTCGTGGT CGACCCGTGAGCCAGGAGACCTGCCGTCGCCTGCTAT GTTTTCGTCAGCCCGGATACCGGCCGAGACACGGGGC GACGTCGCGAGCCCGGATACCGGCCGAGGCGGGGAGG CACGCGGCCAGCCCGGATACCGGCCGACGCACGGGGC CGACCTGCCAGCCAGGAGACCTGCCGGGACGTTTCGT CCGCTCATAAGTCCGGAGACCGGCCTGAAGCAATATC AATCTTGCAAGCCCGGAGACCGGCCTGAAAACGATCA 
TGACCCGAGAGTCAGGAGACCTGCCGCAAGTGAGCTA GGACCTGGGAGCCAGGAAACCTGCCGTAGATCATTTT GATGTCGTCAGCCCGGATACCGGCCTGCAGAACGAGG TGACGCGCGAGCCAGGAGACCGGCCATCTCCTTCTGT CGGTTCGCCAGCCCGGATACCGGCCAGGACAGTGGGT Allignment of B12-elements (continued) Gamma-proteobacteria, the Bacillus/Clostridium group EC_BTUB SY_BTUB SY_CBIA KP_CBIA KP_BTUB YP_BTUB YE_BTUB YE_CBIA EO_BTUB VC_BTUB PA_BTUB PA_BTUB2 PA_COBW PA_COBG PA_CBTAB PP_BTUR PP_BTUF PP_BTUB2 PP_COBW PP_CBTAB PU_BTUR PU_COBW PU_CBTAB PY_COBW PY_BTUR PY_CBTAB PY_BTUF SON_BTUD SON_BTUB AV_BTUB XAX_BTUB BS_BTUF ZC_METE HD_ACHX HD_BTUF HD_METE HD_COBT HD_NRDA BE_NRDA BE_BTUF BE_CBIW BI_CBIW LMO_X LMO_CBIA CA_BTUF CA_CBIM CPE_CBIM CPE_CBIK CPE_BTUF CPE_CBLT CB_CBIP DF_CBIM DF_CBIP DF_BTUF THT_BTUR THT_BTUF EF_BTUF HMO_CBIM HMO01408 HMO_CBIQ HMO_CBLS HMO_CBID DHA_BTUF DHA_CBIET DHA_CBLS DHA_CNOA DHA_NRDD DHA05379 -248 -252 -265 -264 -245 -324 -288 -282 -360 -326 -297 -297 -305 -244 -245 -334 -302 -319 -299 -309 -300 -302 -335 -331 -298 -303 -321 -303 -332 -302 -327 -237 -309 -377 -401 -322 -247 -345 -318 -333 -346 -566 -318 -332 -332 -307 -480 -294 -482 -537 -317 -367 -287 -393 -337 -352 -340 -396 -271 -382 -306 -294 -297 -298 -282 -352 -316 -325 0 1 1' 2 AddI 2' 3 3' 4 VS 5 6 AddII 6' 5' VS 4' 0' ======> -===><=======> >< <==== ===> <== =====> -==> ======> >< <====== <== <===== ---> <---<====== hgGtkcy rg aa aGGGAA cgGtg a tCcg RCdG-ycCcCGChaCKGTra gcCACTG YGGGAAGgc rAGYCMGgAgaCCkGCcd ATCCACTTGCCGGT-CCTGTGAGTT--AATAGGGAA--TCCAGTG 2 AATCTGGAGCTG--ACGCGCAGCGGTAAGGAAAGGT 16 GCGGACACTGCCAT----------TCGGTGGGAAGTCATC 19 ACCCCTCCAAGCCCGAAGACCTGCCGGCCAACGTCGC ATCCGTGGGCCGGT-CCTGTGAGTT--AATAGGGAA--TCCAGTG 2 AATCTGGAGCTG--ACGCGCAGCGGTAAGGAAAGGT 15 GCAGACACTGCCTC-----------CGGCGGGAAGTCATC 24 AACCCTCCAAGCCCGAAGACCTGCCGGCTAACGTCGC GTAAACCAACAGGTTTG 12 T--------AGGGAA--GGGGGTG 2 AATCCCCCGCAG-CCCCCGCTGCTGTGATGCTGACG 8 AAGACCACTGATCGC--------AAGATTGGGAAGGACGG 6 AGGACGCTAAGCCAGAAGACCTGCCTGTCGGTGATAA ACAAACCGACAGGTTCG 15 C--------AGGGAA--GGGGGTG 2 
AATCCCCCGCAG-CCCCCGCTGCTGTGATGCTGACG 8 AAAACCACTGATCGA--------AAGATTGGGAAGGGCGG 6 ACGAGGCTAAGCCAGAAGACCTGCCTGCCGGTAACTG ATTCGCCTACCGGT-CCTGTGAGTT--AAAAGGGAA--CCCAGTG 2 AATCTGGGGCTG--ACGCGCAGCGGTAAGGAAGGTG 19 GCAGACACTGCGGCT--------AGCCGTGGGAAGTCATT 11 CAGCCTCCAAGCCCGAAGACCTGCCGGAATACGTCGC CATTGTGGTCCGGC-CT 22 AGAGTTAAAAGGGAA--TCCGGTG 2 AATCCGGAGCTG--ACGCGCAGCGGTAAGGGGAAGT 18 ACAGACACTGTCCGC--------AAGGATGGGAAGTCATC 67 GAGATCCTAAGCCCGAAGACCTGCCGGTATTACGTCG CATTGCGGTCCGGC-CT 22 AGAGTTAAAAGGGAA--TCCGGTG 2 AATCCGGAGCTG--ACGCGCAGCGGTAAGGGGAAGT 18 CCAGACACTGTCCGT--------AAGGATGGGAAGTCATC 32 GAGATCCCAAGCCCGAAAACCTGCCGGTATACGTCGC ATACTGAAACAGGTATG 15 T--------TGGGAA--GGGGGTG 2 AATCCCCCGCAG-CCCCCGCTGCTGTGATGCTGACG 8 GAGACCACTGATCCAT-------AGGATTGGGAAGGTAGC 8 GTGACGCTAAGCCAGAAGACCAGCCAAATCAGTAAAG GATGAGCGTCCGGC-CTT 7 AAGTC-AAAAGGGAA--TCCAGTG 2 AATCTGGAGCTG--ACGCGCAGCGGTAAGGAATGCC 17 GCAGACACTGTTAT--- 80 ----CGATGGGAAGTCATC 45 CGGCATCCAAGCCCGAAGACCTGCCGGAATACGTCGC AGCGCCAAGCTGGTGCT 26 GGCT-GAAAAGGGAA--TCCGGTG 2 ACTCCGGAACTG--ACGCGCAGCGGTAAGAGAGAAC 9 AACGACACTGCTTTT---------CGAGTGGGAAGTCGAG 14 GTGCTCTCAAGTCCGAAGACCTGCCAGCAACTGAGTT GCCTTGCGACAGGTGCC 8 GGTG-AAACAGGGAA--GCTGGTG 15 AGGCCAGCGCTG-CCCCCGCAACGGTAGGCGAATCA 12 ATGACCACTGTGCTC--------CGGCATGGGAAGGCGCG 19 TCGCTCGCGAGCCCGGAGACCGGCCTGACGCACCCAC GGCCCGTTCCAGGTGCC 18 GGTG--AAACGGGAA--GCCGGTG 16 AGTCCGGCGCTG-CCCCCGCAACGGTAAGCGAGCGA 9 CAGGCCACTGTGCTC--------CGGCATGGGAAGGCGAG 9 ACCCTCGCAAGCCCGGAGACCGGCCTGCAACGCCCTG TGCCGGTTCGAGGTTCC 16 GGC--TAAGAGGGAA--CGCGGTC 1 ATGCCGCGGCTG-CCCCCGCAACTGTGAACGGCGAT 8 AATGCCACTGCGTG-----------ACGCGGGAAGGCGGG 16 CAGACCGTGAGCCAGGAGACCTGCCTCGTCGATCCCG GCGCGTTCGTCGGTGCC 37 ------AAGAGGGAA--CACGGAG 25 TAGCCGTGGCTG-CCCCCGCAACTGTATGCAGCCTG 11 TTCGCCACTGGAT------------TACCGGGAAGGCGGC 33 CGGGCTGCGAGCCAGGAGACCTGCCGCCGAAACCAGT GGGTTGTCCCAGGTGTC 17 AGGT-GAAACGGGAA--GCCGGTG 14 AGTCCGGCGCTG-CCCCCGCAACGGTAAGCGCATC---------------------------------------------------GCGCGCGAGCCCGGAGACCGGCCTGGAACCTTTCG 
GGCGTGTTTCAGGTGCC 21 GGTG-AAACTGGGAA--GCCGGTG 17 ATTCCGGCGCTG-CCCCCGCAACGGTGGATGAGTAA 10 AGGGCCACTGGATGCC------AGCATCCGGGAAGGCGCG 17 CCACTCACAAGCCCGGAGACCGGCCTGATACTGCCAA TGCGGGCCGCCGGTTTC 7 GAAC-TAACAGGGAA--TCCCAGG 15 CAATCGGAACTG-CCCCCGCAACTGTAGGTGCCGAG 11 GATGCCACTGGGCCTG-------CCGCCCGGGAAGGCCGG 11 -GACGCACCAGTCAGGAGACCTGCCGGCCTACATTCA CGCCAGTTTCAGGTGCC 18 GGTG--AAACGGGAA--ACCGGTG 19 AGTCCGGTGCTG-CCCCCGCAACGGTAAGCGAGCGA 9 GATACCACTGTGCTC--------AAGCATGGGAAGGTGAA 9 CCCCTCGCAAGCCCGGAGACCGGCCTGGAGCTTCACT TGCCACTTCGAGGTTCT 13 AGCT-AAGACGGGAA--CGCGGTA 1 AAGCCGCGGCTG-CCCCCGCAACTGTAAGCACCGAC 11 ACAGCCACTGCGCCA--------ACGCGCGGGAAGGCGTC 27 AACGGTGCAAGCCAGGAGACCTGCCTCGTCACGTTTT CCTCGCGTTCAGGTGCC 8 GGTG--AAACGGGAA--ACCGGTG 29 ATGCCGGTGCTG-CCCCCGCAACGGTAAGCGAGTGA 6 TGTACCACTGTGCCTCGT-AGTACGGCATGGGAAGGTGAC 20 TTCCTCGCAAGCCCGGAGACCGGCCTGGCGTTCATGA GGCTTGTTTCAGGTGCT 19 AGTG-AAACAGGGAA--GCCGGTG 30 ATCCCGGCGCTG-CCCCCGCAACGGTAAATGAGTAA 11 GATGCCACTGCTTA----------ACAGCGGGAAGGCGCG 14 CCGCTCATGAGCCCGGAGACCGGCCTGATCCATCCAG TGCGCTTTCGAGGTTCT 14 AGCT-AAGAAGGGAA--CGCGGTC 1 AAGCCGCGGCTG-CCCCCGCAACTGTGAACGGTGCT 9 CACGCCACTGCCAA--- 12 ---CCAGCGGGAAGGCGCA 22 AACACCGTCAGCCAGGAGACCTGCCTCGTCACAGATT AACTTGTTACGGGTGCC 8 GGTG--AAACGGGAA--ACCGGTG 29 AGTCCGGTGCTG-CCCCCGCAACGGTAAGCGAGCGA 6 AGATCCACTGTGCCCA------CGGGCATGGGAAGGTGAC 23 CCCCTCGTGAGCCCGGAGACCGGCCCGCAACACACAG TGCCGGTTCGAGGTTCT 25 AGCT-AAGACGGGAA--TGCGGTA 1 ATGCCGCAGCTG-CCCCCGCAACTGTAAACGGTCAT 9 ACAGCCACTGCTG------------CGGCGGGAAGGCGCG 39 GCTGCCGTGAGCCAGGAGACCTGCCTCGAACCGGGCT GGCTTGTTTCAGGTGCT 20 GGTG-AAACAGGGAA--GCCGGTG 16 ATCCCGGCGCTG-CCCCCGCAACGGTAAATGAGTCA 11 CGTGCCACTGTGTTT--------CGACACGGGAAGGCGCG 13 CCGCTCATGAGCCCGGAGACCGGCCTGAACCACTCAA ACCTTGTTTCGGGTGCC 8 GGTG--AAACGGGAA--ACCGGTG 18 AGTCCGGTGCTG-CCCCCGCAACGGTAAGCGAGAGA 5 TGATCCACTGTGCTC--------TGGCATGGGAAGGTGAC 30 CCCCTCGCGAGCCCGGAGACCGGCCCGACATTTTTCC TGCCGGCCGTCGGTTTC 6 GAAC-TAACAGGGAA--TTCGCCA 17 AAAACGAAACTG-CCCCCGCAACTGTAGGCATCGAG 11 ACTGCCACTGGATTC--------AGATCCGGGAAGGCCGG 11 
-GACATGCCAGTCAGGAGACCTGCCGACCCGATTCAA --------------------------TAATAGGGAA--TCGGGGC 13 CAGCCCGAACTG-TACCCGCAACTGTGAGTAGTTAA----------------------69------------------------TTTTCTACAAGTCAGGAGACCTGCCTATTGCTGTTTT CAACCTTCTGTGGTGCT 18 AGA--TAATCGGGAA--GCCAGTG 2 ATTCTGGCACTG-CCCCCGCAACGGTAAAAGGTGAG----------------------89------------------------TATAGCCTAAGTCCGGAGACCGGCCCTAAAGGTGTTT GCCTCGCTTCAGGTGCC 5 GGTG-AAACAGGGAA--GCCGGTG 24 AGGCCGGCGCTG-CCCCCGCAACGGTAGACGAGTCG 10 ATAGCCACTGTGTTGC-----TCGGACACGGGAAGGCGCG 25 TCGCTCGTGAGCCCGGAGACCGGCCTGTGGCGATCCA CGCGCCCCTGAGGTGAC 16 GTTT--AAACGGGAA--TCCGGTG 24 ATTCCGGAGCTG-CCCCCGCAACGGTGGGCGAGGTC 11 TACGCCACTGTGCAG--------TCGCATGGGAAGGCGCG 19 CCACTCGCAAGCCCGGAGACCGGCCTGAGGGATTGAC AATGTCAAATAGGTGCC 18 GGCT-TAAAAGGGAA--ACCGGTA 1 AAGCCGGTGCGG-T-CCCGCCACTGTAATTGGCCAA-------------------------------------------------GCGCCAAGAGCCAGGATACCTGCCTGTTTGATCAGC AAAGGAAAATAGGTACA 16 TGTT-TAAAAGGGAAG-CTTGGTG 2 ACTCCAACACGG-T-CCCGCCACTGTAAATGCTGAG 9 TGGTGCCACTGTGA-----------AAACGGGAAGGTAAA 10 TGAAGCATAAGTCAGGAGACCTGCCTGTTTTAACAAC CTCAAGCATTAGGTGGT 16 ATCT-GAAAAGGGAA--GCTGGTG 2 AGTCCAGCACGG-T-CGCGCCACTGTAATAAGGAGC 10 GAAACCACTGTCCAA---------AGGATGGGAAGGTACA 9 -TTATCTTAAGTCAGGAGACCTGCCTAATGTATGCAC TCGCGCTGAAGGGTCGT 11 GCGT-GAAAAGGGAA--GTCGGTG 2 AATCCGACACGG-T-CCCGCCACTGTAAATGGGAGA 8 AGATCCACTGTCTA----------GCGACGGGAAGGGGGC 9 ATGAACATAAGTCAGGAGACCTGCCTTTCAGTTTGAG GTTTGGGAACAGGTACG 22 TGTT-TAAAAGGGAA--TCCGGTG 2 AATCCGGAGCGG-T-CCCGCCACTGTCATAGCTGAG 10 ATTGTCACTGACCGTTС-----ATTGGTTGGGAAGACTGT 8 TGACGCTAGAGCCAGGAGACCTGCCTGTTCTAACAGC TAGGCTTCTTAGGTGCC 9 GGA--GAATAGGGAA---GTTCTG 2 A---CGACGCGG-AGCCCGCCACTGTAGTCGAGGAG 7 AATACCACTGGGA------------AACTGGGAAGGTGTA 8 -TGAATCGGAGCCAGGAGACCTGCCTAAGAAGATGCG GTGGACGGTAAGGTGCC 6 GGCT-TAAAAGGGAA--TCTGGTG 2 AATCCGGAGCTG-TCCCCGCAACTGTGAGTGCTACG 10 TTTGCCACTGTACATC- 14 AAATGTATGGGAAGGCTTC 8 TAAAGCACGAGTCAGGAGACCTGCCTTACTTCCACAA TGCCAAGCAATGGTGTC 6 GACT-TAATAGGGAA--TCCGGCG 2 AATCCGGAACTG-CCCCCGCAACTGTATGTGCGGAC 8 
ATGGCCACTGGCGGCA- 14 -CGCCGCTGGGAAGGCCCC 9 CGATGCACGAGTCAGGAGACCTGCCTTGCTTGGAACG ATTCGCAGCAAGGTGCC 6 GGCT-TAATAGGGAA--TCCGGTG 2 AATCCGGAGCTG-TCCCCGCAACTGTCAATGCGGAC 8 ATCGCCACTGTACGGAC 18 -TCCGTACGGGAAGGCTTC 9 TGAAGCATGAGCCAGTAGACCTGCCTTGCTTGCCGCA AGCCTGCTTAAGGCTTGGGT-AG----AAAGGGGAAG-CCCGGTG 3 AATCCGGCACGG-TGCCCGCCACTGTGGTGGGGAGC 10 CAAGTCACTGAAGGA--------TGCTTCGGGAAGACGCC 8 ATGATCCTAAGTCAGGAGACCTGCCTTGTTTGGATCG GCAACAGTAAAGGTGCC 5 GGCT-TAATAGGGAA--ACTGGTG 2 AGACCAGTACTG-CCCCCGCAACTGTAAGTGTGGAC 8 ATAACCACTGTGAAAA-------AATCACGGGAAGGTTCT 9 TGATACACAAGTCAGGAGACCTGTCTTTATTGTGAAG GGTCTTATGTTGGTGGA 12 TTCT-GAAAGAGGAA--TTCGGTG 2 ATGCCGAAACTG-CCCCCGCAACTGTAAGGTGGACA 9 ATAACCACTGTACGTTTT---TAGCGTATGGGAAGGTTCG 8 ATGAAGCCAAGTCAGGATACTCGCCAAATAAGACGGA ACAACTAAATAGGTGAA 4 TTA---ATCCGGGAA--AGAGGTG 2 AATCCTCTACAGGCCCTAGCTACTGTAATACGGACG 11 TATGTCACTGGAAGC--------AATTCCGGGAAGACTGG 8 ATGATGTTAAGTCAGGAGACC-GCTTTTATATTCGAT ACCATATTTTAGGCACC 8 GGTT-TAATAGGGAA--ATTGGTG 2 AATCCAATGCAA-CCCCCGTTACTGTATACAGTTAC 7 ATGTCCACTGGAGTT--------TTCTCTGGGAAGGATGG 7 TAAACTGTGAGCCAGGAGACCTACCTAAAATATTATG TAAAATTTGTAGGTTCA 16 TGAT-TAAAAAGGAA--TCAGGTG 2 AAGCCTGAGCGG-T-CCCGCCACTGTAATAAAGGAG 11 TATGTCACTGGGA------------AACTGGGAAGGCGTA 10 -GATTTTTGAGCCAGGATACTTGCCATATTCTAGTAT ATTTAGAAATAGGTTAA 20 ATAT-TAAAAGGGAAG-TTGGGTT 2 AATCCCACGCGG-T-CCCGCCGCTGTAATAGAGGAG 12 TAAGCCACTGGAATATA-----ATATTTTGGGAAGGCCAC 9 TGATACTTGAGCCAGAAGACCTGCCTATTTTTAAAAC TTATATTTTTAGGTTTG 4 TAAT-TAAAAGGGAA--AGTGGTT 2 AGTCCACTACAG-CCCCCGCTACTGTGATAGGATAC 10 TTGACCACTGATTATA-------TAAATTGGGAAGGGAGA 8 TAAGCCTTAAGTCAGGATACCTGCCTAAAGATCATGA AACTAATAATTGGTGTG 5 CGCT-TAATAGGGAA--TGAAGTT 2 AGTCTTCAACTA-CC------TCAGTAACCGTGAAG 15 TATGTCACTGCATTT-------TTTGTGTGGGAAGACGAG 7 AAGAAGCAAAGTCGGGATACCTGCCTTTTATTTAAGT TAAGAGCATTAGGTGTT 4 AACT-TAATAGGGAA-----AGTT 2 AAACT---GCAG-CCCCCGCTACTGTTGATAAGGAC 8 AAAGCCACTGTGATAA-----ATAGTCATGGAAAGGATTG 9 -GATTTATTAGCCAGGAGACCTGCCTAGTATGCTATT AAAAAGATTTAGGTGCC 11 GG-T-GAAAAGGGAA--TGTGGTA 2 
A-GCCACAGCAG-CCCCCGCTACTGTAATTGAGGAC 10 TAAACCACTCTTTA----------AAAAGGGGAAGGGAAA 8 TGAATCATGAGCCAGGAGACCTGCCTAGATTTTTATT ATAATATTATAGGTTCT 7 AGAT-TAATAGGGAA--AAAGGTT 2 ATTCCTTTACAG-CCCCCGCTACTGTGATGCAGACG 9 TTAGCCACTATGATG-- 13 ---CTCATGGGAAGGAAAA 8 ATGAAGCTAAGTCAGGAGACCTGCCTAAAATATTAAA GATTAAAATTAGGTTCT 5 AGA 4 AAAAGGGAA--AAAGGTT 2 ATGCCTTTGCAG-CCCCCGCTACTGTGAAACCAACG 8 AATACCACTGTCAGT---------TTGATGGGAAGGTTAT 9 ATGAAGTTAAGCCAGGATACCTGCCTAATTTAATTTA AATAATACTAGGGTACT 5 AGTT-TAATAGGGAA--AGTGATG 2 AATTCACTACAG-CCCCCGCTACTGTATACGGATAC 7 AAATCCACTGAAATTTAT--AAAAATTTTGGGAAGGGTGA 7 AAAGCCGTGAGTCAGGAGACCTGCCCAGTATTATATA TAAAGCCTTATGGTCCC 5 GGGT-TAAAAGGGAAG-ACGGGTG 2 AATCCCGCGCAG-CCCCCGCTACTGTGAGGGAGGAC 10 TAAGCCACTGTCCGG-- 60 --CCGGGTGGGAAGGCAGG 8 TGAGTCCCGAGCCAGGAGACCTGCCATAAGGTTTTAG AAAAGCCTTATGGTCCC 5 GGGT-TAAAAGGGAAG-ACGGGTG 2 AATCCCGCGCAG-CCCCCGCTACTGTGAGGGAGGAC 10 TAAGCCACTGTCCGG-- 60 --CCGGATGGGAAGGCAGG 8 TGAGTCCCGAGCCAGGAGACCTGCCATAAGGTTTTTA TGTACTTATGAAGTGTC-------------AGGGAA--AGAGGTG 2 AATCCTCTACAG-ACCTACCTACTGTATGGTGGATG 8 AAGACCACAGATT------------ATTCTGGAAGGATTG 8 AAGAAGCTAAGTCAGGATACCGGCTTGATAAGTCTAA TTGCTGGAACAGGTCGC 20 GCGTTAGAAAGGGAAG-TTCGGTG 2 AATCCGACGCGG-T-CCCGCCACTGTAAGGGGAATG 10 AATGTCACTGGCGTTT------AAGCGCTGGGAAGACGGA 8 ATGAACCCGAGCCAGGAGACCTGCCTGTTACCACGTC CTACGGTTACAGGTGCC 6 GGA--GAATAGGGAA--CCGGGTG 2 AATCCTGGGCGG-T-CCCGCCGCTGTATGGTCGAGT 9 TGAGCCACTTCGT------------GTGAGGGAAGGCGCC 7 ATTAAGCCGAGCCAGAAGACCTGCCTGTACACTGTTC CGATGTCTGCAGGCGCC 5 GGCT-GAAAAGGGAA--TGAGGTG 2 AGACCTCAGCAG-CCCCCGCTACTGTATGGGAAGAC 12 GAATCCACTGGACTG--------CCGTCTGGGAAGGAAAC 9 TGATTCCTGAGCCAGGAGACCTGCCTGTCGCGACAAA TAACCGTTTCAGGTGCC 8 GGA--GAATAGGGAA--CTGGGTG 2 AATCCCGGACGG-A-CCCACCACTGTAAGAGGAGCT 8 TTGGCCACTGGGA------------TTCTGGGAAGGCGTG 7 ATGATTCGGAGTCAGGAGACCTGCCTGTAACGCTCGG ATGATGCAAAGGGTGGC 34 GTC 6 ATTAGGGAA--GTCGGTG 2 ATTCCGACGCGG-T-GCCGCCACTGTGAAAGGGGAG 10 CAGGCCACCGGGT------------AACCGGGAAGGCGAA 8 ATGAACCTGAGCCAGGAAACCTGCCTGTCCCCGCACC AACTATTGACAGGTTTA 6 
TAAT-GAAAAGGGAA--TCAGGTG 2 AATCCTGAGCAA-CCCCCGTTACTGTAAGCGCCGTT 17 CATGCCACTGGCGA----------AGACTGGGAAGGCGAT 4 AAAGGCGCGAGCCAGGAGACCTGCCTGTTAATAAAAC
ATAGTATTCAAGGTTCC 8 GGAA-GAAAAGGGAA--GCCGGTG 2 AATCCGGCGCGG-T-CCCGCCACTGTGAACCACGAG 9 AGGCCCACTGGGATG---------AGCCTGGGAAGGGAAG 7 AAGACTGGAAGCCAGGAGACCTGCCTTGAACATTGCG
TAAGGATTTCAGGTGCC 13 GGA--GAATAGGGAA--CCGGGTG 2 ATTCCCGGACGG-A-CCCGCCACTGTAAAGAGGAGT 10 AATGCCACTGGGT------------AACTGGGAAGGCAGC 9 ATGACTCGAAGTCAGGAGACCTGCCTGGATCCGGGGA
ACATAGCTTAAGGTGCC 5 GGA--GAATAGGGAA--ACCGGTA 2 AGTCCGGTGCGG-A-TCCGCCGCTGTAATCGGAGAC 10 AATGTCACTGTCTTTTT-----TAGAGATGGGAAGGCGTG 9 TGACACGAGAGCCAGAAGACCTGCCTTTTAGAAAGCT
GGAATCTCATAGGTGAC 12 GTT--GAAAAGGGAA--GCCGGTT 2 AGGCCGGCACGG-T-CCCGCCGCTGTAAGGGAAATA 11 ATTACCACTGAAAGG---------GTTTCGGGAAGGTAAG 8 ATGATCCTAAGTCAGAAGAC-TGCCTATGTGTATACC
AGGCGGAATAGGGTTGC 6 GCAT-TAATAGGGAAC-TCCGGTG 2 AAGCCGGGACAG-C-CCCGCTACTGTAAGAAGGACG 11 GGATCCACTGGTGA----------AAACCGGGAAGGTAAG 8 ATGAGTTCAAGTCAGGATACCTGCCCCATTCCGGAAA

Alignment of B12-elements (continued) (Actinobacteria, Cyanobacteria, the CFB group, Thermotogales, the Thermus/Deinococcus group and some others)
DI_BTUC MT_CBTG MT_METE ML_CBTG ML_METE RK_CHLID RK_COBN RK_CBTE RK_BTUF SX_CBIM SX_METE SX_PDUX SX_BTUF SX_NRDA SX_BTUC SX12454 TFU_COBN TFU_CHLID TFU_CBTE PI_CBIB PI_CBIL PI_MUTA PG_BTUB4 PG00461 PG_BTUF PG_NRDD PG_X_CBTD BX_BTUB BX_PCCC BX_BTUB4 BX_NRDA BX_CBTD BX_METE BX_NRDD CL_BTUB2 CL_X_CBIM CL_X_FRD CL_X_NRDJ CL_BTUB AN_X_CBIJ AN_CFRX AN_COBG TE_X_METE TE_CBIX CY_HUPE SN_HUPE PMA_HUPE DR_BTUFC DR_BTUFR DR_ACHX LI_CBIX LI_BTUB FN_BTUF FN_BTUB TM_BTUF CAU_BTUR CAU_BTUF GME_COBU TDE_CBTF TDE_ROCG TDE_BTUF
-246 -309 -362 -270 -369 -224 -260 -137 -153 -209 -387 -365 -190 -271 -311 -204 -299 -225 -134 -164 -185 -253 -526 -556 -354 -342 -228 -371 -344 -344 -280 -269 -264 -210 -231 -227 -498 -265 -364 -153 -187 -152 -160 -141 -160 -210 -232 -236 -312 -270 -365 -279 -276 -240 -224 -268 -397 -290 -520 -490 -371
0 1 1' 2 AddI 2' 3 3' 4 VS 5 6 AddII 6'
5' VS 4' 0' ======> -===><=======> >< <==== ===> <== =====> -==> ======> >< <====== <== <===== ---> <---<====== hgGtkcy rg aa aGGGAA cgGtg a tCcg RCdG-ycCcCGChaCKGTra gcCACTG YGGGAAGgc rAGYCMGgAgaCCkGCcd CACATTGATTAGGTGCA 12 TGC-----ATGGGAA--TCTGGTG 2 AATCCAGAGCTG-A-CGCGCAGCGGTGAAGGTGCAA 14 GTAGCCACTGAGAGTATA--AAAACTCTTGGGAAGGTGAG 17 GAGCACCCCAGTCCGAAGACCGGCCTAATCAGAAACA TCAGGCGATGACGAT--------------GCAGGAA--GCCGGTG 2 AATCCGGCGCGG-T-CCCGCCACTGTCACCGGGGAG 9 TAAGCCACGGCCAC-----------AGGCTGGAAGGCGAG 8 CGATCCGGGAGCCAGGAGACTCGCGTCATCGCGTCCT ACCACGCAGCTGGTCTG-48-------GAGAGGGAA--CCTGGTG 2 AATCCGGGACTG-T-CCCGCAGCGGTATGCAGGAAC 20 ACAAGCACTGGTCTCA-------ACGACTGGGAAGCGACG 17 GAGCCTGCGAGTCCGAAGACCTGCCAGCCGTGCCGGA AAAGGCGATGACGATGC--------------AGGAA--GTCGGTG 2 AAGCCGGCGCGG-T-CCCGCCACTGTAATCGGGGAG 9 TAGGCCACGGCCAT-----------TGGCTGGAAGGCGAG 8 TGATCCGAGAGCCAGGAAACTCGCGTCATCGCGTCCT GCTGGTCTGCTGGTTCC 44 ------GAGAGGGAA--CCCGGTG 2 AATCCGGGACTG-T-CCCGCAGCGGTATGCAGGAAC 11 GGAAGCACTGGTCTTA-- 8 -CGAGACTGGGAAGCGATG 18 GCGCCTGCGAGTCCGAAGACCTGCCGGCTGTGTCGGG AAGACAATCGAGGTGCC 8 GGA--TAATCGGGAA--GCCGGTG 2 AATCCGGCACAG-G-CCCGCTGCGGTGACCCGGGAG 22 GCAGCCACTGGACCGG------CCGGTCCGGGAAGGCGAT 11 CGACCGGGAAGTCCGAATACCGGCCTCGATTTCAGCT CCACCTGCCGTGGTGCT-------------CGGGAA--GCCGGTG 2 AGACCGGCGCGG-CCCTCGCCACTGTGAGCGGGTAG 35 GAGACCACTGGACGG--------AAGTCCGGGAAGGTCGG 11 TGATCCGTCAGCCAGGAGACCGGCCACGGCGCGGGAA CACACGTGCCGAGGTGC-------------AGGCAA--TCCGGTG 2 AGTCCGGAGCGG-T-CGCGCCACTGTGACCGGGCGA-----------------------1------------------------CCGCCCGGGAGTCAGAAAACTGTCTCGGCGCATGGAT GCTGACGCCCGTGC----------------AGGGAAAGTCCGGTC 2 AGTCCGGCGCTG-A-CCCGCAACGGTAGGCCGTCCA-----------------------1------------------------CCGACGGTGAGCCCGATCGCCTGCACGGGGGTGCGCG CGCCACGCCTTGGTG------------AACGGGAAA--TCCGGTG 2 ATGCCGGTGCGG-CCCTCGCCACTGTGAATCGGGAA 21 GCAGCCACTGGATCGCT---TGCGGTCCGGGAAGGCGGA 12 GTACCCGTAAGCCAGGAGACCGGCCAAGGCGCGTCGT CCCGTGCAGCTGGTTCG 21 CGTCGCAAGAGGGAA--CCCGGTG 2 AATCCGGGACTG-C-CCCGCAGCGGTGAGCGGGAAC 10 
ATACGCACTGGGCCCG- 6 -CGGGCCCGGGAAGCGACG 29 GGGCCCGCGAGTCCGAAGACCTGCCACCTGCCCGCGC TGCCCGCAGTTGGTTCG 30 CGACGCAAGAGGGAA--CCCGGTG 2 AATCCGGGACTG-T-CCCGCAGCGGTGAGTGGGAAC 10 AACAGCACTGGGCC-- 13 ---AGCCCGGGAAGCGACG 40 GCGCCCACGAGTCCGAAGACCTGCCACTGCGCCCGTA TCGCCGCGACGGGAG--------------ACAGGAA--GCCGGTG 2 AATCCGGCACGG-T-CCCGCCACTGTGACCGGGGAG 10 CACGCCACTGCGCGC--------CGCGCGGGAAGGCCAG 10 CGATCCGGGAGTCAGGACACTGGCCTGTCGCGGGCCC ---TCGCTGTCGCCGC-------------AGGGGAA--TCCGGTG 2 AATCCGGAACTG-T-CCCGCAACGGTGTACTTGCGT----------------------38------------------------CGCCTGTCCAGTCCGAGGACCTGCCGACAGTGCGCCC CGAAGCGCCTCGTGG---------------GGGAA-GTCCGGTC 2 AGTCCGGCGCTG-A-CCCGCAACGGTAGGCGGAGCC-----------------------8------------------------GGTCCCGTGAGCCCGATTACCCGCGGTGGTGAAGCCC CAGGGCGACGACGGTC-------------CGAGGAA--GCCGGTG 2 AATCCGGCGCGG-T-CCCGCCACTGTGATCGGTGAG 11 GTTGCCACTGCCCCGG------AGGGGCGGGAAGGCCGG 9 TGACCCGGGAGCCAGGAAACTCACGTCGTCGCCTCCT TGCGCTATGGTGGTCGC 3 GTGGT-GAACGGGAA-GACCGGTG 2 AGACCGGCGCGG-CCCTCGCCACTGTGATCGAGGAG 30 GCGGCCACCGGGCAC-------CAGCCTGGGAAGGTCAG 11 TGACTCGTCAGCCAGGAGACCGGCCACGACGCGTCAT GGAACCGCCGAGGTGCC 11 GGA--TAATCGGGAA--GCCGGTG 2 AATCCGGCACAG-G-CCCGCTGCGGTGACCTGGGAG 20 GCAGCCACTGGACGG-------CAGTCTGGGAAGGCGAT 10 TGATCAGGAAGTCCGAAGACCGGCCTCGGCATGGCTG AAGAGCGTCGGGTGC----------------AGGCA-ATCCGGTC 2 AGTCCGGAGCGG-T-CGCGCCACTGTAGACGGGCTC------------------------------------------------AAGCCCGTGAGCCAGAAAACTCACCCGGCGTAGTGGT CGGCCAGCGCGCGTCCG------------CAGCGAA--GCCGGTG 2 AATCCGGCGCTG-T-CCCGCAACGGTGATGGGGCCC------------------------1---------------------- GCCCCG47CAGCCCCACGAGCTGCCTGCGCGTGCACC TCCCGGGCACCGGATGA 33 ---------GAGGAAT-GCCGGTG 23 AGTCCGCGACGG-T-CCCGCCACTGTGAGCCGGTGA-------------------------------------------------AGCCGGCGAGTCAGACACTCCGCCGGTGCCGCTGAC CTAGTAGTGCTGGTTCG 16 CGTCGCAAGAGGGAA--TCCGGTG 3 ATTCCGGAACTG-T-CCCGCAGCGGTCAATGGGAAC 9 TAAGGCACTGGGCGGC------AACGCCTGGGAAGTAGTA 28 ATGCCCATGAGTCCGAAGACCTGCCAGCAGCGACAAC GAAAAGACTGAAGTAAC 19 GTGC----AAGGGAA--TCCGGTG 2 
ATTCCGGAGCTG-AGCCCTCAGCTGTAATGCTTCGA 45 GATGGTCACTGTAGA--- 11 -CCCTATGGGAAGGCCGA 12 AAGAAGCTAAGCCAGAAGACCTGCTTTAGTAGATTTG GCCGTGTCAATGGTTTT 23 AAT--GAAAAGGGAA--CCCAGTG 2 ATTCTGGGACTG-TACCCTCAGCTGTAAGTTCAGAT 19 AAAGCCACTATACAGA------ATCGTATGGGAAGGCAGC 4 CCTTGAATAAGTCAGAAGACCTGCCATTACAAGCGTT CGGATATGTGCGGTTCA 39 GAT--TAAAAGAGAA--TTTGGTG 2 AAGCCAAAACTA-TCCCCGTAGCCGTATGGTCGTAC 15 GATGCCACTGCATAT--------CGATGTGGGAAGGCGTA 4 TTTAGGCCGAGTCGGAAGACCTGCCGCACATATCTAA CCCATCGTAGTGGTCCC 23 GGG 4 AAGAGGGAA--TCGGGTG 2 AATCCCGAGCAG-T-CCCGCTGCTGTAAGCTTTTAC 44 GATGCCACTGTTCATT-- 19 GCTGAATGGGAAGGCGCG 14 GATGAAGTAAGCCAGAATACCTGCCTCTACGAGTTGC TGTGCGGACTTTGTTCA 33 TGAT-TAAAAGGGAA--TCGGGTG 2 AATCCCGAGCAG-T-CCCGCTGCTGTGAACCTTGTT 15 ATATCCACTGTCCGTTCT---GTGCGGATGGGAAGGAGTC 5 TATGGGGTGAGCCAGAAGACCTGCAAAGTCTTTGTCT TGCAGTGCATTGGTTTG 22 CAAT-TAAAAGGGAA--TCAGGTG 2 AATCCTGAACAG-T-CCCGCTGCTGTAAGTTTCACA 24 CTTGCCACTGGGAAAC------GTTTCCTGGGAAGGCGCT 5 ACAGAAACGAGTCAGAAGACCTGCCTGTGCATCTTTT GTCGCCGAATTGGTTCG 18 CGA 5 AAAAGGGAA--TCGGGTG 2 AATCCCGGACAG-T-CCCGCTGCTGTGAAACTCTGT 26 CGTACCACTGACAGAAA-- 7 CTCTGTCGGGAAGGTCCC 7 TGTAGAGTCAGTCAGAAGACCTGCCATTCGTGAATAA GCAGCCGCTTAGGTGAT 25 AT----AAAAGGGAA--TCGGGTG 2 AATCCCGAACAG-TGCCCGCTACTGTGATCCCCCTG 53 TATACCACTGTCATA--- 10 -CATGACGGGAAGGTAGC 6 AAAAGGGATAGTCAGGAAACCTGCCGAAGCAGACATA GCTCCCTGATCGGTTCC 20 GGAT-TAAAAGGGAA--TCGGGTG 2 AATCCCGGACAG-T-CCCGCTGCTGTGAAGCTCCGT 20 GTTGCCACTGGGA----- 26 ---CACCGGGAAGGCGTС 5 CAAGGAGTCAGTCAGAAGACCTGCCGCTTATCAAAGG TGTCCCGAATTGGTTTC 21 GGAT-TAAAAGGGAA--TCGGGTG 2 AATCCCGGACAG-T-CCCGCTGCTGTGAAGCTTCAT 20 TTCGCCACTGACGT---- 15 ----GTCGGGAAGGCTTС 4 TTAGAAGTCAGTCAGAAGACCTGCCGTTCATCAAAGG GTCGGCAGATTGGTTCG 21 CGAT-TAAAAGGGAA--TCGGGTG 2 ACTCCCGGACAG-T-CCCGCTGCTGTGAAGTTTTAT 25 TTGGCCACTGACTCGT-------GTAGTCGGGAAGGCGTT 5 TGGAAGCTAAGTCAGAAGACCTGCCACTCTCGCTGAT TAGCAGATTTCAGTACT 12 AGT--CATAAGGGAA--CGCTGTG 2 AATCGGCGACAG-TACCCGCTGCTGTAATTCTCTGA 12 TATGCCACTGCGCCC---------AGCGTGGGAAGGCGTT 5 GGAGAGATAAGTCAGAAGACCTGCTGAAAAAGTAAAC TTACGGTTTCCGGTGCC 
6 GGC 9 AAAAGGGAA--CCCGGTG 2 AATCCGGGACAG-TGCCCGCTGCTGTGATCCTCCCG 37 GAGGCCACTGGTTCGCGC--CCGCGAACCGGGAAGGCCGG 3 CGAGGGGAGAGTCAGAAGACCTGCCGTAATGCAGTAA TCCGATTATGTGGTGCC 17 GGCT-TAAAAGGGAA--TCCGGTG 2 AGTCCGGAACAG-TACCCGCTGCTGTAATTCCGCGC 32 AATGCCACTGTCCCGTT-----CAGGGATGGGAAGGCCGG 4 ATCCGGGAAAGTCAGAAGACCTGCCTCATATTTTTTG TCGCCATGACAGGTGCC 12 GGA--GAATAGGGAA--GTACGTG 2 ATTCGTACACTG-TACCCGCAACTGTACAACGGTTA 47 CAGGTCACTGCCGGTT-- 13 -AACTGCGGGAAGGTTTG 11 TGCCGTGAAAGTCAGGAGACCTGCCAGTCATGCATTT TTCAGCATTACGGTGCC 14 GGA--TAATAGGGAA--GTGCGTG 2 AATCGCACACTG-TGCCCGCAACTGTAAGATGGTAT 50 TGTATCCACTCCGCCA-- 20 --ATGCGGGGGAAGGCTG 29 AGCCATCGAAGTCAGGAGACCTGCCGTAGTGGTTGGC CATGATTAGCTGGTGCC 12 GGA--GAATAGGGAA--GTACGTG 2 ATTCGTACACTG-TACCCGCAACTGTACAACGGAAA 47 CACGTCACTGCCAG---- 15 ---GGGCGGGAAGGCTGC 8 AAGCCGTAAAGTCAGGAGACCTGCCAGTTACTCTTTG AATATCAACTCGGTTCT 17 AGAGGTAAGGGGGAAAGTCCGGTG 2 AATCCGGCGCTG-T-CCCGCAACTGTAATGGGGCTT------------------------------------------------ATGCCTCAAAGTCAGAATGCCCGCCGAAAGTACAACA AATAAATATTCGGTTCT 17 AqAGGTAACGGGGAA26AACGGTG 2 AGTCCGGCGCTG-T-CCCGCAACTGTGAAGGAAAGA-----------------------10-----------------------AACTTTCCCAGTCAGAACGCCCGCCGAAATTGACGAT TACTAGAACTTGGTGTT--------------GGGAAACTCCGGTG 2 ATTCCGGGGCTG-T-GCCGCAGCTGTGATGAAAAGT-----------------------18-----------------------AACTTTCCGAGTCAGAATGCCAATTCCAAGAGTTAGC CTTAGTTGCTCGGTTCT 17 AGACGTAAGGGGGAAAGTGCAGTG 2 AATCTGCCGCTG-T-CCCGCAGCTGTGAGGAGAGA-------------------------3-----------------------CACTCTCTAAGTCAGAATGCCCGCCGAGTGGTCAACC TGTGAGAAGCAGCCTGT-------------AGGGAAAATCCAGTG 2 AGTCTGGTGCTG-T-GCCGCAGCTGTGATGGGAAT--------------------------------------------------CTTCCCTCAGCCAGAATGCCTACTTGCTGTGGTTCA TAAGTTTAGTTGGTTCC 17 GGAGGTAACGGGGAAAAGCTGGTG 2 AAGCCAATACTG-T-CCCGCAACTGTGATGGGCCC--------------------------------------------------AGGCCCTAAGTCAGGATGCCCGCCAACGATGGCCGA GGTTTGGGTCTGGTTTC 17 GATGGAAACGGGGAAAGAACGGTG 2 
AATCCGTCGCTG-T-CCCGCAGCTGTAAAGCGTCCGGCCC-------------------------------------------CGCCGGCGTCAGTCAGAACGCCCGCCAGGAGCACTACC ATCCATCAATCGGTTTC 17 GAAGGAAACGGGGAAAGTTCGGCG 2 AATCCGGCACTG-T-CCCGCAGCTGTAAAGCGCAAC-----------------------15-----------------------ACTTGCGCGAGTCAGAATACCCGCCGAATTTCCATCG TCCTCGCAGCAGGCGC--------------AGGGAAAGTCCGGTT 2 AGTCCGGCACTG-T-CGCGCAACGGTTTT---------------------------------------------------------------CAGTCCGAACACCTCGCCTGCTCGCGCTG TGAGGCCACCTGAGCC-------------AGGGGAA-GCCCGGTG 34 ATTCCGGCACTG-T-CGCGCAGCGGTGAATCGGCCT------------------------2-----------------------AGGGCCGTCAGTCCGAATGCCTCTCAGGGACGCGAAC AAGCCTCCCGAGGAAC------------AGAGGGAA-GTCCGGTC 20 AGTCCGGCACAG-T-CGCGCTACGGTTA----------------------------------------------------------------CAGTCCGAACGCTCGCCTCGTGGAGAACG ATGTTTCACATAGAT----------------AGGAA--GACGGTT 2 AATCCGTCACGG-TATCCGCCGCTGTAAGAAGGACG 10 TAAGCCACTGGGAC----------AACCTGGGAAGGCGTG 9 AAGATTTCAAGTCAGAATACGACCTATGAAAATTCCT ACGGAAAACTTGTTTAT 7 ATG--AGGAAGGGAA--TCCGGTT 2 AATCCGGAGCTG-AACCCGCAGCTGTAATCGCCGAA 16 CATGCCACTGCGTTAA-------ATACGCGGGAAGGCTGC 3 ATCGGCGAAAGCCAGAAGACCTAACAAGTAAAAAAAC CATGTCAATTATGTTCC 11 GGC--TAAGAGGGAA--TTTGGTG 2 ATACCAAAACGA-G-CCCGTCGCTGTAATTGAGTTT 10 TATACCACTGGATTT---------TATTTGGGAAGGTAAA 6 TAAATCATAAGTCAGAAGACCTGCATAATTGAATTAC AGAAACAAATAGGTGCT 4 GGCTTAATAAAGGAA-GTTGGGTG 2 AATCCCACACAG-C-AATGCTACTGTATTGTGGACG 8 ATAGCCACTGGGA------------AACTGGGAAGGTGTA 8 TTGAAACTAAGTCAGGAGACTTACCATTATTTTATAT CCTCACCGTGCGGTACC 6 GGTT-CAAAGGGGAA--GCCGGTG 2 AATCCGGCGCGG-G-GCCGCCACCGTGACCGGGGAC 11 AACGCCACTGGGGCGA------TCACCCTGGGAAGGCGCG 10 TGATCCGGAAGCCGGGAAACCCGCCCGCGGTGAAGGG TAGATCGTCGCGGTGAC 28 GTG-----GAAGGAA--GCTGGTG 2 AGTCCAGCACTG-T-GCCGCAACTGTAACCGGCTGT-----------------------2------------------------CAGGCCGGAAGTCAGGACGCCTGCCGCGATGTGTTGT CATATCGTCGCGGTGAC 25 ------AAGGTGGAA--GCTGGTG 2 AGTCCAGCGCTG-T-GCCGCAACTGTAACCGGTTAG------------------------------------------------AAAGCCGGAAGCCAGGACGCCTGCCGCGATGTGATGA 
TTTTACGTTCAGGTGCT 16 AGG--TAAAAGGGAA--AAGGGTG 2 ACTCCCTTGCTG--TCCCGCAACTGTGAACGGTGAT 14 GATGCCACTGATCT---- 18 -----CCGGGAAGGCGCG 10 TGATCCGTGAGCCAGGAAACCTGCCTGACCGTCAGCT
TAGACAAATAAGGTTCT 14 AGAT-TAAAAGGGAA--ACCGGTG 2 AAACCGGCACAGCC-CCCGCTACTGTAATTGAGTTT 34 TAAGCCACTGTTA------------ATATGGGAAGGCGAT 6 TAAATCATAAGTCAGGAGACCTGCCTATTTGTATTAC
GTTTCGGTCTTGGTGCT 14 AGTG--AAAAGGGAA--TCAGGTG 2 AGTCCTGAGCAG-T-CCGGCTGACGTAAGTGAGAGA 14 AATGCCACTGGTTT----------ATTCCGGGAAGGCGAA 9 TGACCTCCGAGCCGTAAGACCTGCCAATGACTATAAG
CAAACCATACAGGTGCC 7 GGTT--AAAAGGGAA-GCACGGTG 2 ATTCCGTCACGG-T-CCCGCCGCTGTAAGAGAATAG 13 TATGTCACTCGGGA----------AATCGGGGAAGGCTTA 10 GAAGCTCGAAGTCAGAATACCTGCCTGTAAAGGACTA
B2

B12-regulon: identification of genes and regulatory elements
Mesorhizobium loti Bradyrhizobium japonicum Pseudomonas denitrificans # Sinorhizobium meliloti Brucella melitensis Agrobacterium tumefaciens Rhodopseudomonas palustris Rhodobacter capsulatus # Rhodobacter sphaeroides # Sphing. aromaticivorans # Rickettsia prowazekii Caulobacter crescentus Bordetella pertussis Burkholderia pseudomallei Neisseria meningitidis Nitrosomonas europaea Methylobacillus flagellatus # Ralstonia eutropha # Ralstonia solanacearum Escherichia coli Salmonella typhimurium Klebsiella pneumoniae # Yersinia enterocolitica Y. pestis; E. carotovora Vibrio cholerae Pasteurellaceae Pseudomonas aeruginosa Pseudomonas putida Pseudomonas fluorescens # Pseudomonas syringae # Shewanella oneidensis Azotobacter vinelandii # Xanthomonas axonopodis Xylella fastidiosa H. pylori, C. 
jejuni Magnetococcus # Geobacter metallireducens # T/D Deinococcus radiodurans Thermus thermophilus # Fusobacterium nucleatum 6 ardX2<>&transp5-6-cobU-btuR><cbiB<>cobD-X; &G1-cobW-cobN-cobG-cbiCLH><cbiJ<>cbi(ET)-cobE-cbiF-cobA-cbiA-cobS-cobT1; cobF; X-cbiP; cobT2<>gene4-transp8; &G1-btuDFC; &ardX-frdX; &metE; &cbiY BJA 4 &transp7-cobF-cobT] [cobS-cbiY] [cbiB-cobD><cbiP-btuR<>&G1-cobW-cobN-cobG] [cbiLH><cbiJ] [cbi(ET)-cobE-cbiF-cbiA-cobA]; &btuB; &metE [btuB3-transp4] PD 1 cbiP; &cobU-cobW-cobN-btuR-//-cobE-cobA-cbiA-cobD-cbiB; cobF-cobG-cbiCLH><cbiJ<>cbi(ET)-cbiF; cobT<>cobS-gene4-transp8 SM 5 cbiP; &cobU-cobW-cobN-btuR-//-cobE-cobA-cbiA-cobD-cbiB; cobF-cobG-cbiCLH><cbiJ<>cbi(ET)-cbiF; cobT<>cobS-gene4-transp8; &cbiY; &btuFCD; &ardXfrdX; &transp7 BME 4 cbiP-&transp5-6-cobU-cobW-cobN-btuR-//-cobE-cbiF><cbiJ-cbiD<>cobA-cbiA-cobD-cbiB; cbi(ET)<>cobG-cbiCL-cbi(GH); cobT<>cobS-gene4-transp8; &btuFCD2 &btuBFCD; &nrdHIEF; cbiY AU 6 cobD<>cbiB><cbiP-btuR<>&G1-cobW-cobN-cobG-cbiCLH><cbiJ<>cbi(ET)-cobE-cbiF><cbiD<>cobA-cbiA; cobU; cobT<>cobS-gene4-transp8; &btuFCD; &nrdHIEF; &transp5-6; &cbiY; yxjH-ATU04068<>&ATU04066-metR &metE-ZUR~btuB-cobN-gene2-3; RPA 9 cbiY-cobT1<>&G1-cobU-cobW-cobN-btuR-cbiP1><cbiB-btuF<>cobD><btuDC-hoxN&; cobF-cobP2-cobS-cobT2&<>ORF663-&cbiCLH>-<cbiJ<>cbi(ET)-cobE-cbiF-cbiA-cobA><btuF2&; &btuB; &btuB3-transp4; cobC-&gene5 RC 12 cobSTU<>cobC-bluE-cbiB-cobD-cbiY><cbiP-btuR<>&G1-cobW-cobN-cbiCLH><cbiJ<>cbi(ET)-cobE-cbiF-cobA-cobF-&cbiMNQO-ORF663-cbiA; &btuFCD; &btuD; frdX-ardX&<metH<cbiP2-cbiP3&<>&gene6&btuBFCD-cobX; &exbBD-tonB; &gene5; ?&<>&nrdDG; &oppABCD RS 6 [cobT; &X-bluE-cobD; &cbiY; cbiB; cbiP; btuR-X<>&G1-cobW-cobN-cbiCLH><cbiJ<>cbi(ET)-cobE-cbiF-cobA-cobF; &transp7-cbiA; &btuBFCD-cobX; &btuFC] SAR 3 hoxN-&cobW-cobN-cobG-cbiCLH><cbiJ<>cbi(ET)-cobE-cbiF-cbiA-cobF-btuR-cbiY-G1-cobT-cobC-cobS-cobD-cbiB-cbiP-cobU; &btuBFCD; &~btuB RP 0 no CO 2 cobT<>gene4-transp8; &btuB-cbiP1; cbiP2; X-btuFCD; &metE BP 0 metH<>btuBF; btuB3-transp4 BPS 4 
cbi(GH)-cbiLC-cobG&<>cbi(ET)-cbiDJF; cbiA-btuR-cobE&<>&hoxN-cobW-cobN--chlID; &btuBCD-cobTSC><btuF-cobD-cbiB-cobU<>cbiP; cbiY NM 0 no NE 1 &btuB-transp3 -btuR><gene2-3-cobN; ~btuB-cobN-gene2-3 MFL 3 &btuB-transp3-btuR--cbiA-cobN-gene2-3-cbiY-btuF-cobU; cbiP<>cbiB-cobD><cobC-cobST; &btuB3-transp4; &nrdAB REU 1 &btuBCD-btuR-cobTS-cobC><btuF-cobD-cbiB-cobU<>cbiP; cbiA; cbiY RSO 2 &btuBCD-cobTS-cobC><btuF-cobD-cbiB-cobU<>cbiP; &hoxN-cbiW-cobN-chl(ID)--cbiZ-(cbiX-cbiC)-cbiDLFG-cbi(HJ)-cobW-btuR-cbiA-cbiY EC 1 &btuB; btuCDE; X-btuF-X; X-btuR; cobC; cobUST SY 2 &btuB; btuCDE; X-btuF-X; X-btuR; cobC<>cobD; pdu cluster-&cbiABCDETFGHJKLMNQO-cbiP-cobUST KP 2 &btuB; btuC-//-btuED; X-btuF-X; X-btuR; cobC; cobD; pdu cluster-&cbiABCDETFGHJKLMNQO-cbiP-cobU-cobT2; cobS-cobT1 YE 2 &btuB; btuCED; X-btuF-X; X-btuR; pduX-cobD-cobA<-pdu cluster-&cbiABCDETFGHJKLMNQO-cbiP-cobUS-cobC-cobT; cobA2-pduX2 YP,EO 1 &btuB; btuCDE; X-btuF-X; X-btuR VC 1 &btuB; btuCD; X-cbiB-btuF-X; btuR; cbiP-X; cobTSU-cobC HI,VK,AB 0 no PA 5 &btuB-btuR-cbiA-cbiY-cbiB-cobD-cbiP-cobUTCS;chlD-I//-cobN-cobW&<>&transp(5-6)-cobE-cbiF; cbi(GH)-cbiLC-cobG&<>cbi(ET)-cbiDJ; &bruB2-btuDFC; ZURbtuB3-cobN; -gene2-3--metE; btuB3-transp4 &cobW-cobN-chlID; &transp5-6-cobE-cbiF; cbi(GH)-cbiLC-cobG<>cbi(ET)-cbiDJ; cobF; PP 5 &btuR-cbiA-cbiY-cbiB-cobD-cbiP-cobUTCS; &btuB2-X-&btuFCD; btuF><btuB chlDI-cobN-cobW&<>&transp5-6-cobE-cbiF; cbi(GH)-cbiLC-cobG<>cbi(ET)-cbiDJ; cobC2-cobF; PU 3 &btuR-cbiA-cbiY-cbiB-cobD-cbiP-cobUTCS; btuB2-btuDFC; btuB3; btuF><btuB] chlDI-cobN-cobW&<>&transp5-6 -cobE-cbiF; cbi(GH)-cbiLC-cobG<>cbi(ET)-cbiDJ; cobF; PY 4 &btuR-cbiA-cbiY-cbiB-cobD-cbiP-cobUTCS; &btuFCD; metXY-bruB2-btuDFC SH 2 &btuB-X-metR<>metE; metH<>cobC-&btuDC--cobTSU-cbiP-btuR; cbiB; X-btuF AV 1 &(btuB-transp3-btuR)-cbiA-cbiY-cbiB-cobD-cbiP-cobUTCS; btuB; X-btuF XAX 1 &btuB-transp3-transp3-btuR-cbiB-cobD-cbiP-cobUT-cobC-cobS; btuB3-transp4 XFA 0 no HP,CJ 0 no MCO 0 [cbiO-cbiA-cbiF-cbiX-cbiCD-cbi(TE)-cbiLG-cobE-cbiH><cbiP<>cobD; cbiB; X-btuR; 
cbiY; cobC; cobTS; cobU GME 1 &cobUTSC-cbiA-X-cbiMNQO -cbiX-cbiCD-X-cbi(ET)-cbiLFGH-cbiPB-cobD DR 3 XX-cobTS-X; X-cobU; &btuFCD; &btuF-btuR-cbiA-cbiB-cobD-cbiP; X-cobC; &achX-nrdIEF TQ 3 cbiA-btuR-X-cobST<>cobC-&hoxN-cbiDC-cbi(ET)-cbiLFHG-cbiX-cobA-cbiY-cbiB-XX-cobD--(cbiP-cobU); &btuFCD; &achX-nrdBA FN 2 cbiP1-X-cbiB-X-cobD-cbiA-X-cbiC-//-cbiDE-X-cbiT-//-cbiL-X-cbiF-//-cbiGHJ; btuR; cobUS-cobC-cobT; cbiK; transp11; &btuFCD; &btuBFCD; btuB<>btuFCD MLO

B12-regulon: identification of genes and regulatory elements
B/C Bacillus subtilis BS Bacillus cereus ZC Bacillus megaterium # BI Bacillus halodurans HD Bacillus stearothermophilus # BE Staphylococcus aureus SA Listeria monocytogenes LMO Clostridium acetobutylicum CA Clostridium perfringens CPE Clostridium botulinum # CB Clostridium difficile # DF Thermoanaerobacter tengcongensis TTE Enterococcus faecalis EF Streptococci (ST, PN, MN, LL) Act HMO Heliobacillus mobilis # Desulfitobacterium hafniense DHA Corynebacterium glutamicum CGL Corynebacterium diphtheriae DI Mycobacterium tuberculosis MT ML Mycobacterium leprae TFU Thermobifida fusca # RK Rhodococcus str. # Streptomyces coelicolor SX PI Propionibacterium freudenreichii # (PMA,CY,SN) Anabaena sp. AN T. elongatus TE CAU CL PG 3 1 3 2 2 4 5 BX 7 Cya Chloroflexus aurantiacus # Chlorobium tepidum CFB Porphyromonas gingivalis Bacteroides fragilis # Thermotoga maritima Treponema denticola Leptospira interrogans 1 1 1 5 3 0 2 2 4 1 3 2 1 0 5 6 0 1 2 2 3 4 7 TM TDE LEP AA Aquifex aeolicus Arc Thermoplasma volcanicum TVO Methanosarcina acetivorans MAC HSL Halobacterium sp. 
AG Archaeoglobus fulgidus AP Aeropyrum pernix Methanopyrus kandleri MK Methanococcus jannaschii MJ Methanobacterium thermoaut.TH PK Pyrobaculum aerophilum PH Pyrococcus horikoshii PO Pyrococcus abysii PF Pyrococcus furiosus Sulfolobus tokodaii STO 1 3 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 &btuFCD-pduO pduO; &metE [&cbiW-cbi(H-?)-cbiX-cbiJCD-cbi(ET)-cbiLFGA-cobA-cbiY-btuR] &btuFCD-cbiB-cobD-cobU-cbiP-cobS-cobC-~cobU-pduO; &cobT; cbiA; cblX; &nrdAB; &metE; &achX &btuFCD-cbiB-cobD-cobUS-cobC-~cobU-pduO-cblT-cblX; [&cbiW-cbiMNQO-cbi(H-?)-cbiX-cbiJCD-cbi(ET)-cbiLFG-cbiAP--cobT-cobA-btuR; &nrdAB btuFC pdu-cblX-cobUSC><pocR<>&pduABCDEGHKJL-eutJ-pduMNOPQFW-cobD-pduX; cblT-&cbiABCDETFGHJ-(cobA-hemD)-cbiKLMNQO-cbiP-pduO; btuF &cbiMNQO-cobD-cbiG-pduX-cobT-cbiK-cbiP-cbiACDTLFJH-cobUS-cobC; cbiB -cbiK2-cbiE; &btuFCD-gene5 cobTSUC-cbiB-cobD-&btuFCD-gene7-cbiP; &cbiK-cbiCDETLFG--cbiHJ-btuFCD; btuR-X; &cbiMNQO; &cblT-cblX &cbiPB-cobD-cbiMNQ--hemC-(cobA-hemD)--hemB-cbiAC; cbiD-cbiE-X--cbiT; cbiLFGHJ-cblT-cbiK-pduX-btuR; cobC; cobU; cobS--cblX; btuFCD-1; btuFCD-2 B cobTSU-cobC-&cbiP-cbiA-cbiB-cobD-pduX-cbiCDETFGHJ-cbi(LK)-hemC-(cobA-hemD)-hemB-cysG ; &cbiMNQO; &btuFCD--nrdEF &btuR-&btuFCD-cbiA-cbiPB-cblT-cblX-cobD-cobUS &btuFCD-pduV-pduO-pdu cluster no &cbiD-cbi(ET)-cbiLFGHJ-cbiX-cbiC-//-cobU-cbiPBA-btuR-cobTS-pduX-cobD-~cobC; cbiK; &cbiM]; &cbiQO; &cblX]; cblT-cbiO; &HMO01408 cbiD-&cbi(ET)-cbiLFGHJ-cbiX-cbiC-cbiPB--cbiA-btuR-pduX-cobD]; [cobTUS]; cobC]; &btuFCD; &oppAB]; &cblX; [cbiMN]; [cbiQO]; &DHA05379; &nrdD] gene8-cobUTS; pduO; btuCFD gene8-cobUTS; chlID-btuR-cbiA-//-cobA; cbiP; cobF; cbiB-//-cobD; cobN<>cobG-cbiC-cbi(LH)><cbiJ-cbiF-cbi(ET); &btuCDF gene8-cobTS; chl(ID)-btuR-cbiA--cysG; &transp12-cbiP-cobU; cbiB-//-cobD; cobN<>cobG-cbiC-cbi(LH)><metZ-X<>X><cbiJ-cbiF-cbi(ET); &metE gene8-cobTS; btuR-cbiA; &transp12; &metE gene8-//-cobUS]; cbiB<>cobD-(cbiY-cobT)--cysG; cobF-cobN&<>cobG-cbiC-cbi(LH)><cbiF-cbi(ET)<>&chl(ID)-btuR-cbiA><cbiJ-cbiP-transp10& gene8--cobT]; cobU; [cobD]; 
&transp10-cbiP]; [cobF];[cobN&<>cobG-cbiC-cbi(LH)];[cbiF];[cbi(ET)<>&chl(ID)];[btuR];[cbiA];[cbiJ><cbiB]; &btuFCD gene8-//-cobU--cobT-//-cobS; (cbiY-cobT)--cysG; cobF; cbiBP-cobN-chl(ID)-btuR-cbiA-cbiL><SX03279<>cbiF-cbi(ET)-cbi(GH)-cbiX-cobD; cbiC><cbiJ; &pduX-XX; &cbiMNQO; &btuFCD; &btuCDF; &RSX12454; &nrdAB; &metE B mutBA&<>&cbiLF-cbi(EGH)-(cbiX-cysG )><cbiDCTJ; &cbiBP-btuR]; cbiMNQO-cobA cbiJ; cbiF; cbiC; cbiL; cbi(GH); cbi(ET); cbiD; cbiB; cbiA; cobU; cobS; btuR; cbiP; cobD; cobN; cobW; cbiX; &hupE &G1-cbiJ; cbiF; &cobG-cbiCL-X-cbi(GH); cbi(ET); cbiD; cbiB; cbiA; cobU; cobS; btuR; cbiP; cobD; cobN; cobW; &G1-btuB-(genW-cbiW)-btuFCD; cbiMNQO cbiJ; cbiF; cbiC; cbiL; cbi(GH); cbi(ET); cbiD; cbiB; cbiA; cobU; cobS; btuR; cbiP; cobD; cobN; cobW; &cbiX; cbiMNQO; &metE B &btuF-cbiW-btuCD-genW]; [X-cbiCD-cbi(ET)-cbiLF]; [cbiG]; [(cbiH-cysG )-cobA-cbiA-cobUD-cbiP; &btuR-cobN-cobT-cbiMNQO-cbiB; cobCS; chl(ID) &btuBF-cbiY-cbiP-cobD-cbiB-btuR-btuCD-X-cobUT-X-cobS;&cbiMNQO-cobA-cbiK-cbiL-cbi(HC)-(ET)-(GF)-(JD); chl(ID)-cobN-&btuB2;cbiA-btuF2; &nrdJ cobUTSC; cbiA-pduO-cbiP-cobD-cbiB; &transp9; &btuB4-cbiK-btuFCD1; cbi(HC)-cbi(ET)-cbi(GF)-cbi(JD); cbiL; &btuFCD2; hmuY-hmuR(~btuB)-cobN-X-gene2-3; &nrdDG; &PG00461-62-63 cobUTSC><cbiB-cobD-cbiP; cbiA-pduO-//-transp9&<>&btuB4-cbiK-btuB3-transp4-cobN-X-gene2-3-cbi(HC)-cbi(ET)-cbi(GF)-cbi(JD); cbiL<>btuFCD &btuB; ~btuB-cobN-X-gene2-3-&nrdAB; btuFC; ~btuB-&metE; &nrdDG; &BX01357-58-59 &btuFCD; btuR &btuFCD; cbiK-cbiLA; cbiG-cbiF; cbiHJ-btuD3; cbiET; X-cbiC; cbiD; cobUSC; btuR; cbiB; X-cbiP; cobD; chlID-cobN--btuFCD2; &transp11]; &rocG &btuB; &(cbiX-cbiW)-X-frd-cbiDC-cbi(ET)-cbiLG-cbi(H?)-cbiF-btuR-cbiA-cobX-cobU-cbiPB; cobTSC no B cysG -cobA-cbiCHDTLF-cbi(GE); cbiPB; X-cobT; X-cbiA-XX; cobDS; btuR; cobC; cobX cobY><btuD-X; btuF; btuC cbiTLFGHC; cbiDE; cbiMNQO; cbiA1; cbiA2; cbiP; cobD-cbiB--cobZ-cobS-cobY; btuR; cobT; transp2-gene2-3-cobN; opp-cobN-chlID; btuFC-X-btuD 
cbiTLFGH-HSL00646-(cbiX-cbiW)-X><chl(ID)<>cobN-cbiC-cbiE><cbiD<>cobT-cbiA-btuR-cbiP-HSL01294-cbiB-cobSYD-cobX; btuF<>btuCD cbiT-cbiMNQO<>cbiLFG-cbi(HC)-cbiDE--cbiX-gene7; cbiB-X><cbiP; cbiA; cobY-X-cobS1; cobS2; ??-cobD-X; cobT; btuCD-XX-btuF cobT-(cobX-cobZ)-cobY-cobD-cobS-cbiB-X-gene7; $~btuFCD btuF<$>mutBA-ygfD X-cbiC; X-cbiD; cbiT<>cbiL; cbiFGH; cbiE; cbiA1; cbiA2-X; cbiB; cbiP-X; ??-cobT; (cobX-cobZ)-cobS-cobY; ?-cobD; cobN<>cobN2-X-metE cbiC; cbiD; cbiE; cbiF; cbiG; X-cbiH; cbiJ-X; cbiL-X; cbiT-X; cbiMNQO; cbiP; cbiA; X-cbiB-X; cobS-X; X-cobT; cobD; cobY; cobZ; cobN; btuF<>btuCD cbiC-X; cbiD; cbiE; cbiF; cbiB-cbiG; cbiH; X-cbiJ; cbiL; cbiT; cbiMNQO; cbiP; cbiA; (cobX-cobZ)-cobS-X; cobD; cobT-X; X-cobY;cobN-transp2-gene2 cbiCHDTLF; cbiGE; X-cobD-cobS-cbiB; cbiA; cobT-(cobX-cobZ)-btuR; cobD-cbiB<$>gene7-cobZ--cobS-cobY; $cobT; btuR; $nrdDG; $mutB*-$ygfD-mmcE; $mutB^; $sucS; $btuF; $btuCD cobD-cbiB<$>cobZ--cobS-XX-cobY; gene7; $cobT; btuR; $nrdDG; $mutB*-$ygfD-mmcE; $mutB^; $sucS; $btuF; $btuCD cobD-cobZ-gene7-cbiB$<//>cobS-cobY><cbiP-X<$>cobT; btuR; $cobS2; $nrdDG; $mutB*-$ygfD-mmcE; $mutB^-X; $sucS; $btuF; $btuCD $cbiGECHDTLF; cbiP<>cobS-cbiB; $cobT; X-cobD; cobY; cobC; cbiA; $hoxN; btuF<$>btuCD Distribution of B12-elements in bacterial genomes B12-element regulates cobalamin biosynthetic genes and transporters, cobalt transporters and a number of other cobalamin related genes. 
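The per-genome numbers in the operon tables above can be cross-checked against the locus strings. The sketch below is only an illustration and rests on two assumptions that the slides do not state explicitly: that "&" marks a predicted B12-element in front of a gene or operon, and that the number after each genome code is the count of such elements. The function names and the `ROWS` dictionary are hypothetical; the four rows (EC, SY, VC, SM) are copied from the tables as extracted.

```python
# Assumption: "&" marks a predicted B12-element; the integer after the genome
# code is the reported number of elements in that genome.
ROWS = {
    "EC": (1, "&btuB; btuCDE; X-btuF-X; X-btuR; cobC; cobUST"),
    "SY": (2, "&btuB; btuCDE; X-btuF-X; X-btuR; cobC<>cobD; "
              "pdu cluster-&cbiABCDETFGHJKLMNQO-cbiP-cobUST"),
    "VC": (1, "&btuB; btuCD; X-cbiB-btuF-X; btuR; cbiP-X; cobTSU-cobC"),
    "SM": (5, "cbiP; &cobU-cobW-cobN-btuR-//-cobE-cobA-cbiA-cobD-cbiB; "
              "cobF-cobG-cbiCLH><cbiJ<>cbi(ET)-cbiF; cobT<>cobS-gene4-transp8; "
              "&cbiY; &btuFCD; &ardXfrdX; &transp7"),
}


def split_loci(loci: str) -> list[str]:
    """Split a table cell into individual loci (separated by ';')."""
    return [part.strip() for part in loci.split(";") if part.strip()]


def count_elements(loci: str) -> int:
    """Count '&' markers, i.e. predicted B12-elements under the assumption above."""
    return loci.count("&")


def marked_loci(loci: str) -> list[str]:
    """Return only the loci that carry at least one '&' marker."""
    return [locus for locus in split_loci(loci) if "&" in locus]


def check_rows(rows: dict[str, tuple[int, str]]) -> dict[str, bool]:
    """For each genome, report whether the '&' count equals the listed number."""
    return {code: count_elements(loci) == reported
            for code, (reported, loci) in rows.items()}


if __name__ == "__main__":
    for code, ok in check_rows(ROWS).items():
        print(code, "consistent" if ok else "mismatch", marked_loci(ROWS[code][1]))
```

The check is demonstrated only on these four rows. Rows that use the divergent-operon notation "&<>&", or rows whose text was misaligned during extraction, may need a more careful parser before the counts can be compared.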
Phylogenetic tree of B12-elements
[Figure: phylogenetic tree of B12-elements. Leaves are labeled by genome abbreviation and downstream gene (for example EC_BTUB, BS_BTUF, DR_ACHX). Legend: elements without the B2 domain are shown in gray squares.]

The predicted mechanism of the B12-mediated regulation of cobalamin genes (panels A and B)
[Figure: the conserved B12-element structure (helices P0 to P6 and a pseudoknot) with Ado-CBL, drawn for two regulatory mechanisms: a terminator versus an antiterminator, and an RBS-sequestor hairpin versus an antisequestor.]

Phylogenetic distribution of gene clusters regulated by B12-elements
Columns: Gene cluster; Function; Taxonomic group
1. CBL biosynthesis:
cbi and cob: cobalamin biosynthesis
cbt, hoxN, cbiMNQO, hupE: cobalt transporters
orf1-cobW-cobN-chlID: cobalt chelation
bluB: cobalt reduction
btuR: CBL adenosyltransferase
Taxonomic group: proteobacteria, the Bacillus/Clostridium group; -, -proteobacteria, Pseudomonadaceae, actinobacteria; -proteobacteria; -, -proteobacteria, Pseudomonadaceae
2. Vitamin B12 transport:
btuB: vitamin B12 receptor
btuFCD: vitamin B12 transporter (ABC components)
Taxonomic group: all CBL-synthesizing bacteria; proteobacteria; -, -proteobacteria, Pseudomonadaceae, the Bacillus/Clostridium and CFB groups, Deinococcus radiodurans, actinobacteria, spirochetes, Fusobacteriaceae, Thermotogales, Chloroflexaceae
3. 
B12-dependent or alternative metabolic pathways:
metE: methionine synthase (various groups)
nrd: ribonucleotide reductase (various groups)
ardX-frdX: predicted enzymes (-proteobacteria)
achX: predicted enzymes (Deinococcus radiodurans and some other species)

B12-dependent and B12-independent isozymes
Ribonucleotide reductases: NrdJ (B12-dependent), NrdAB/NrdDG (B12-independent)
Methionine synthases: MetH (B12-dependent), MetE (B12-independent)
+ – + – – + – + + + + + B12 B12 B12
B12-independent isozymes of methionine synthase and ribonucleotide reductase are regulated by B12-elements in genomes that possess both isozymes (this was not previously known).

Conserved S-box structure
[Figure: consensus secondary structure of the S-box, drawn 5' to 3', with a base stem and helices P1 to P5.]

Distribution of MET regulatory elements
Genome AB Methionine biosynthetic genes Bacillales: Bacillus subtilis BS MetK metB; &metI-metC; &metF*; &metE; &metK &cysH-ylnABCDEF Bacillus cereus BC &metY-metB-hom; metC-metI&<>&metF*-metH; &metK &cysH-ylnBCADEF; metX; metE Bacillus halodurans BH metY; metB; &hom; &metI-metC-metF*-metH; metE&metK Bacillus stearothermophilus # BE [metY; [metB; [&metI-metC; [&metF*-metH &metK Oceanobacillus iheyensis OB &metY1; metB; Staphylococcus aureus Listeria monocytogenes &b hmT; &X-metY2 SA &metX; $metI-metC-metF*-metE-mdh LMO &metY-metX; &metE-metI-metC-metF* Lactobacillales: Enterococcus faecalis Lactobacillus plantarum Lactobacillus gasseri # Lactobacillus casei # Lactobacillus delbrueckii # Lactobacillus brevis # Oenococcus oeni # Leuconostoc mesenteroides # Pediococcus pentosaceus # EF no metK LP $metB-metY-hom; $metI; $metE- metF* metK LGA no ? LCA $metE-metF metK LDB metY; metB-rhc1-yusA2; $rhc2-metE-metF metK LB no ? 
Oenococcus oeni # (OOE): metY; yxjH2-metI-metC-metB metK
Leuconostoc mesenteroides # (LME): $metB-metI-metC-yxjH1--rhc-$yxjH2-metY; $metF-E metK
Pediococcus pentosaceus # (PPE): no metK

Streptococcaceae:
Lactococcus lactis (LL): metY; metB-metI;
Streptococcus agalactiae (SAG):
Streptococcus mutans (MN): #metY-mdh; ##metB; metI;
Streptococcus pneumoniae (PN): #metY; #metB; #metI;
Streptococcus pyogenes (ST): metC
Streptococcus suis # (SSU): #metY; metB; [metI;
Streptococcus thermophilus # (STH): metY; ##metB; #metI;
Streptococcus uberis # (SUB): no
[metE-metF column for this group, as extracted; row alignment not recoverable:] #metE-metF #metE-metF* ##metE-metF* #metE-metF #metE-metF ##metE-metF

Clostridia:
Clostridium acetobutylicum (CAC): &metY; &metB; &metI-metC; metF*; &metH
Clostridium perfringens (CPE): no
Clostridium botulinum (CB): &metY-hom-metB; &metF-msd-metH^
Clostridium tetani (CTC): &metY-metB; -msd-metH^
Clostridium difficile # (DF): &metY-metB; &hom; folD-X-metF; rhc-msd-metH^
Thermoanaerobacter tengcongensis (TTE): &hom-metY-metX; &metF-metH^

Other genomes: Streptomyces coelicolor (SX), Thermobifida fusca # (TFU), Chlorobium tepidum (CL), Chloroflexus aurantiacus (CAU), Cytophaga hutchinsonii (CHU), Thermotogales (TM, PMI)
[Gene columns for these genomes, as extracted; row alignment not recoverable:]
TFU metY-metX; CL &metY-metX; CAU &metY-metX; CHU &metY; metX; metY-metB;
metF; metE; metH metF; metH metF; metH metF; metH metE; metH--metF metF-msd-metH^

Transporters / Other genes
[Columns as extracted; row alignment with the genomes above not recoverable:]
&yusCBA; yusA2 mtnZYXW&mtnV<>mtnU&mtnKS; &yoaD; yrrT-mtn-yrhAB; rhc; &yxjH1-&yxjH2 mtnZYXW&mtnV<>mtnU&mtnKS; yrrT-mtn-yrhAB; rhc; &mdh; &hmrA &BH0835; mtn; rhc mtnZYXW&mtnV<>mtnU&mtnKS; yrrT-mtn-yrhAB; rhc yrrT-rhc–yrhAB; mtn; &yxjH; &OB1276; OB3079&<>&OB3078; &OB2779-OB2778 yrhAB-yusCBA2; rhc<>hmrA; mtn &yxjH; mtn; rhc &yusCBA1-yusA2; &yusCBA3; &yusACB4; &metT; &mtnABC; &oppBCDFA; &mtsABC &yusCBA &yusCBA; mtnABC &metK &yusCBA1; &X-yusACB2; &yusCBA3; &yusCBA4-hmrA &metK &metT; &yusCBA; hcp-mtsABC &metK &yusCBA1; &yusACB2; &oppABC $yusCBA1;$$yusCBA2; yusCBA3; $opp; mtsABC $yusCBA1; $yusCBA2 $yusACB; hcp-mtsABC $yusA1; $yusA2CB $yusACB; hcp-mtsABC $yusACB $yusCBA1; $yusA2-yxjH1-hmrB-yusCB; mtsABC [yusA1-yusA2-hmrB-yusCB; yusA3; $hcp-mtsABC $yusCBA metK metK metK metK metK metK metK metK
yusA1-A2-A3-A4-yusCB-mtsABC #yusA-hmrB-yusCB-/-mtsABC; #yusA2; yusCBA3 #yusA-hmrB-yusCB-X-#hcp-mtsABC #yusA-hmrB-yusCB; #hcp-mtsABC #yusACB [yusA-hmrB-yusCB]; [hcp-mtsABC #yusA-hmrB-yusCB-X-hcp-mtsABC;yusA2 #yusA-hmrB-yusCB; #hcp-mtsABC &metK &metK &metK &metK &metK &metK metK metK metK metK metK metK &yusCBA &metT; &metT; &yusCBA1-yusA2; yusCB2 &metT; &yusCBA &yusCBA1; &yusA2 mtsABC yusCBA yusCBA &yusACB (only in Petrotoga miotherma) $yxjH; $yxjH1; yxjH2; $yxjH-rhc; $yxjH-rhc; rhc; rhc; $yxjH; yxjH3; mdh1; $mdh2 $yxjH; rhc; rhc; yxjH; #yxjH; #fhs; #folD; #yxjH; #mdh; #yxjH; #mdh; rhc; rhc; rhc; rhc; rhc; rhc; rhc rhc; rhc Tcub iG-yrhBA><&; rhc; ub iG-yrhBA-rhc; ub iG-yrhB-rhc-yrhA; &SCD95A.26 &SCD95A.26 mtn mtn mtn mtn mtn mtn mtn mtn mtn mtn mtn mtn mtn mtn mtn mtn mtn mtn

[Figure: methionine biosynthesis and recycling pathway. Metabolites shown: aspartate semialdehyde, homoserine, threonine, O-acetylhomoserine, sulfide, cystathionine, homocysteine, methionine, S-adenosylmethionine (SAM), S-adenosylhomocysteine (SAH), S-ribosylhomocysteine (SRH), MTA, methylthioribose (MTR), THF, methylene-THF, methyl-THF. Genes shown: hom, cysH-..., metX, metB, metI, metY, metC, metE, metH, metF, metK, yrhA, yrhB, yxjH*, mtn, mtnKSUVWXYZ.]

Phylogenetic tree of the NhaC Na+:H+ antiporter superfamily including predicted methionine-, lysine- and tyrosine-specific transporters
[Figure: phylogenetic tree; labeled branches include LysT, TyrT, MetT and MleN, with Pasteurellaceae, clostridia and Archaea groups marked. Leaf labels are genome abbreviations and locus tags.]
Riboswitch classes

RFN
• Function: riboflavin biosynthesis and transport
• Ligand: FMN (flavin mononucleotide)
• Distribution: Bacillus/Clostridium group, proteobacteria, actinobacteria, other bacteria

THI
• Function: biosynthesis and transport of thiamin and related compounds
• Ligand: thiamin pyrophosphate
• Distribution: Bacillus/Clostridium group, proteobacteria, actinobacteria, cyanobacteria, other bacteria, archaea (thermoplasmas), plants, fungi

B12
• Function: biosynthesis of cobalamin, transport of cobalt, cobalamin-dependent enzymes
• Ligand: adenosylcobalamin
• Distribution: Bacillus/Clostridium group, proteobacteria, actinobacteria, cyanobacteria, spirochaetes, other bacteria

S-box
• Function: metabolism of methionine and cysteine
• Ligand: adenosylmethionine
• Distribution: Bacillus/Clostridium group and some other bacteria

LYS
• Function: lysine metabolism
• Ligand: lysine
• Distribution: Bacillus/Clostridium group, enterobacteria, other bacteria

G-box
• Function: metabolism of purines
• Ligand: guanine, adenine
• Distribution: Bacillus/Clostridium group and some other bacteria

Properties of riboswitches
• Direct binding of ligands
• Same structure, different mechanisms
• Distribution in all taxonomic groups (diverse bacteria; archaea: thermoplasmas; eukaryotes: plants and fungi)
• Correlation between the mechanism and taxonomy:
  – Attenuation of transcription preferred (anti-antiterminator): Bacillus/Clostridium group
  – Attenuation of translation preferred (anti-antisequestor of translation initiation): proteobacteria

Some confirmed predictions of metabolite promoting RNA riboswitches.

Structure of RFN-element and RNA riboswitch mechanism (regulation of riboflavin metabolism and transport genes)
• Bacillus subtilis: Winkler et al., 2002b; Mironov et al., 2002

Structure of THI-element and RNA riboswitch mechanism (regulation of thiamin metabolism and transport genes)
• Bacillus subtilis: Mironov et al., 2002
• Escherichia coli: Winkler et al., 2002a

Structure of B12-element (regulation of cobalamin metabolism, transport and other cobalamin-related genes)
• Escherichia coli: Nahvi et al., 2002
• Streptomyces coelicolor: Borovok et al., 2006

• Dmitry Rodionov
• Andrei Mironov
• Mikhail Gelfand
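
The riboswitch classes listed in these slides (element, regulated process, ligand) can be kept as a small lookup table, which is convenient when annotating predicted elements. The following is only a minimal sketch: the element names, processes and ligands are copied from the slides, while the `Riboswitch` type and the `elements_for_ligand` helper are hypothetical names introduced here for illustration.

```python
from typing import List, NamedTuple


class Riboswitch(NamedTuple):
    """One riboswitch class as listed on the slides (illustrative type)."""
    element: str
    process: str
    ligand: str


# Entries transcribed from the slide table; nothing is added beyond it.
RIBOSWITCHES = [
    Riboswitch("RFN", "riboflavin biosynthesis and transport",
               "FMN (flavin mononucleotide)"),
    Riboswitch("THI", "biosynthesis and transport of thiamin and related compounds",
               "thiamin pyrophosphate"),
    Riboswitch("B12", "cobalamin biosynthesis, cobalt transport, "
               "cobalamin-dependent enzymes", "adenosylcobalamin"),
    Riboswitch("S-box", "metabolism of methionine and cysteine",
               "adenosylmethionine"),
    Riboswitch("LYS", "lysine metabolism", "lysine"),
    Riboswitch("G-box", "metabolism of purines", "guanine, adenine"),
]


def elements_for_ligand(ligand: str) -> List[str]:
    """Return element names whose ligand text contains `ligand` (case-insensitive)."""
    needle = ligand.lower()
    return [r.element for r in RIBOSWITCHES if needle in r.ligand.lower()]


if __name__ == "__main__":
    for query in ("FMN", "lysine", "adenine"):
        print(query, "->", elements_for_ligand(query))
```

A substring match is used so that a query such as "adenine" finds the G-box entry, whose ligand field lists two purines; a stricter scheme would need one ligand per record.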