Volume 12 Number 11 1984 Nucleic Acids Research

Nucleotide sequence of the tmr locus of Agrobacterium tumefaciens pTi T37 T-DNA

S.B.Goldberg, J.S.Flick and S.G.Rogers*

Monsanto Company, 800 North Lindbergh Boulevard, Saint Louis, MO 63167, USA

Received 5 March 1984; Revised and Accepted 16 May 1984

ABSTRACT

The nucleotide sequence of the tmr locus from the nopaline-type pTi T37 plasmid of Agrobacterium tumefaciens was determined. Examination of this sequence allowed us to identify an open reading frame of 720 nucleotides capable of encoding a protein with a derived molecular weight of 27025 d. Comparison of the pTi T37 tmr sequence with the published sequence of the pTi Ach5 tmr locus shows over 88% homology in the 240 bases 5' to the translational initiation codon and over 91% homology in the coding sequences. The 3' nontranslated regions show less than 50% homology, as expected for the 3' regions of divergent related genes. The possible significance of areas of conserved sequences, particularly in the 5' regulatory regions, is discussed.

INTRODUCTION

Agrobacterium tumefaciens causes crown gall disease by transfer of a DNA segment from its large resident Ti plasmid into the plant cell, where this DNA is covalently integrated into the genome (1-5). Expression of certain genes located on the transferred DNA (T-DNA) results in in situ neoplastic growth or phytohormone-independent growth of the infected tissue when placed into culture (6-7). Recent results implicate certain specific T-DNA encoded genetic and transcriptional units, the tms and tmr loci, as the units of expression responsible for the hormone-independent growth (8-12). Specifically, expression of the tms locus results in elevated auxin (indole acetic acid) levels in Ti transformed cells, while expression of the tmr locus elicits increased levels of cytokinins (13) in tumor tissues relative to non-transformed tissues.
Mutations at these loci have specific effects on the morphology of the tumors induced by the mutant Ti plasmid (8). Tumors induced by a strain with an inactivated tms (tumor morphology shooty) locus show large numbers of shoots appearing on the tumor tissue. Tumors induced by a strain with an inactivated tmr (tumor morphology rooty) locus display excessive root development. The phenotype of the tumor induced by strains carrying mutations at these loci can be reverted to normal crown gall callus by supplying exogenous phytohormones. Added cytokinins reverse the effect of the tmr mutation; added auxins reverse the effect of the tms mutation on the morphology of normal cultured tumor tissue (14). From these results, it is evident that the products of the tms and tmr loci are involved in the regulation of phytohormone levels in the tumor tissue.

As a first step to defining and better understanding the functions of these genes, we have determined the nucleotide sequence of the tmr locus from the nopaline-type pTi T37 plasmid. During the preparation of this manuscript, the nucleotide sequence for the tmr locus of an octopine-type pTi Ach5 plasmid was published by Heidekamp et al. (15). The tmr locus resides in the DNA region common to both nopaline and octopine type Ti plasmids, as determined by DNA heteroduplex analysis and by genetic and transcript mapping (16,10-12). The availability of the DNA sequences of both tmr loci provides a unique opportunity to examine two functionally related genes for the extent of similarity or variation in their regulatory and structural regions. Such a comparison permits insight into the importance of various DNA sequences within the common regulatory regions and of particular amino acids in the protein encoding regions. In this report we describe the nucleotide sequence of the pTi T37 tmr locus and compare this sequence with that of the previously reported pTi Ach5 homologue.

© IRL Press Limited, Oxford, England.
MATERIALS AND METHODS

Bacteria and bacteriophage

The Escherichia coli recipient for plasmid transformation was strain LE392: F-, hsdR514 (rk-, mk+), metB1 (17). The host for M13 phage cloning and growth was JM101 (18). M13 mp8 and mp9 were obtained from BRL (Gaithersburg, MD).

Enzymes

All restriction endonucleases were purchased from New England Biolabs (Beverly, MA) and used according to the manufacturer's instructions. See Roberts (19) for specificity. Bacteriophage T4 DNA ligase was prepared using a modification of the procedure of Murray et al. (20).

Plasmid and phage DNA reconstructions

Cleavage of DNAs, ligations, and transformations were performed as described by Taylor et al. (23) for plasmids and as described by Messing et al. (18) for M13.

DNA preparation and sequencing

Plasmid DNA was prepared as described by Davis et al. (21). M13 DNA was prepared by the procedure of Messing et al. (18) and used as template for the dideoxynucleotide chain termination method described by Anderson (22). Analysis and assembly of the DNA sequence data were performed using programs obtained from IntelliGenetics (Palo Alto, CA).

RESULTS

Cloning of the tmr locus

The tmr locus was first isolated on the 3.8 kb HindIII-22 fragment prepared by digestion of the nos::Tn7 derivative of pTi T37, pGV3106 (24). This fragment was inserted into the HindIII site of pBR327 (25) to yield pMON69. Restriction mapping showed that the inserted fragment was indeed HindIII-22 by comparison of the internal BamHI cleavage sites with published restriction cleavage site maps of the pTi T37 plasmid (11-12,26). Transcript mapping carried out by both Bevan and Chilton (12) and Willmitzer et al. (11) had identified the tmr transcript as a 1200 bp mRNA that mapped entirely within the 2.0 kb BamHI to HindIII segment from the right side of fragment HindIII-22 (Fig. 1). This 2.0 kb BamHI-HindIII fragment was isolated from pMON69 and inserted into similarly cleaved pBR327 to yield pMON99. The 2.0 kb insert was mapped by cleavage with various restriction endonucleases to provide the detailed map in Fig. 1. The presence of the unique HpaI site at approximately nucleotide 1350 served to locate the active portion of the pTi T37 tmr gene, since insertion of DNA fragments encoding antibiotic resistance at this site results in the tmr phenotype and inactivates the gene (27-28).

[Figure 1: restriction map of the pTi T37 HindIII-22 fragment, showing the tmr coding region and the sequenced clones.]

Figure 1. Restriction endonuclease cleavage map of the pTi T37 HindIII-22 fragment and the 2 kb BamHI to HindIII subfragment containing the tmr locus. The major restriction endonuclease cleavage sites are shown for the BamHI-HindIII subfragment. The arrows beneath the map show the independent clones of various subfragments, #, and the number of times each was used for sequence determinations ( ). The length of the arrow shows the approximate extent of the sequence data obtained. Continuous sequence through all junctions showed that no small fragments were lost during subcloning.

Nucleotide sequence determination of the tmr locus

The resulting restriction map (Fig.
1) provided a large number of cleavage sites, all of which were used, alone or in combination, to obtain subfragments that were cloned into M13 mp8 or mp9 for subsequent dideoxy sequencing.

   1 GGATCCTGTT ACAAGTATTG CACGTTTTAT AAATTGCATA TTAATGCAAT CTTGATTTTC
  61 AACAACGAAG GTAATGGCGT AAAAGAAAAA ATGTATGTTA TTGTATTGAT CTTTCATGAT
 121 GTTGAAGCGT GCCATAATAT GATGATGTAT AATTAAAATA TTAACTGTCG CATTTTATTG
 181 AAATGGCACT GTTATTTCAA CCATATCTTT GATTCTGTTA CATGACACGA CTGCAAGAAG
 241 TAAATAATAG ACGCCGTTGT TAAAGAATTG CTATCATATG TGCCTAACTA GAGGGAATTT
 301 GAGCGTCAGA CCTAATCAAA TATTACAAAA TATCTCACTC TGTCGCCAGC AATGGTGTAA
 361 TCAGCGCAGA CAAATGGCGT AAAGATCGCG GAAAAACCTC CCCGAGTGGC ATGATAGCTG
 421 CCTCTGTATT GCTGATTTAG TCAGCCTTAT TTGACTTAAG GGTGCCCTCG TTAGTGACAA
 481 ATTGCTTTCA AGGAGACAGC CATGCCCCAC ACTTTGTTGA AAAACAAATT GCCTTTGGGG
 541 AGACGGTAAA GCCAGTTGCT CTTCAATAAG GAATCTCGAG GAGGCAATAT AACCGCCTCT
 601 GGTAGTACAC TTCTCTAATC CAAAAATCAA TTTGTATTCA AGATACCGCA AAAAACTT

 659 ATG GAT CTG CGT CTA ATT TTC GGT CCA ACT TGC ACA GGA AAG ACG TCG
     MET Asp Leu Arg Leu Ile Phe Gly Pro Thr Cys Thr Gly Lys Thr Ser
 707 ACC GCG GTA GCT CTT GCC CAG CAG ACT GGG CTT CCA GTC CTT TCG CTC
     Thr Ala Val Ala Leu Ala Gln Gln Thr Gly Leu Pro Val Leu Ser Leu
 755 GAT CGG GTC CAA TGT TGT CCT CAG CTG TCA ACC GGA AGC GGA CGA CCA
     Asp Arg Val Gln Cys Cys Pro Gln Leu Ser Thr Gly Ser Gly Arg Pro
 803 ACA GTG GAA GAA CTG AAA GGA ACG AGC CGT CTA TAC CTT GAT GAT CGG
     Thr Val Glu Glu Leu Lys Gly Thr Ser Arg Leu Tyr Leu Asp Asp Arg
 851 CCT CTG GTG AAG GGT ATC ATC GCA GCC AAG CAA GCT CAT GAA AGG CTG
     Pro Leu Val Lys Gly Ile Ile Ala Ala Lys Gln Ala His Glu Arg Leu
 899 ATG GGG GAG GTG TAT AAT TAT GAG GCC CAC GGC GGG CTT ATT CTT GAG
     MET Gly Glu Val Tyr Asn Tyr Glu Ala His Gly Gly Leu Ile Leu Glu
 947 GGA GGA TCT ATC TCG TTG CTC AAG TGC ATG GCG CAA AGC AGT TAT TGG
     Gly Gly Ser Ile Ser Leu Leu Lys Cys MET Ala Gln Ser Ser Tyr Trp
 995 AGT GCG GAT TTT CGT TGG CAT ATT ATT CGC CAC GAG TTA GCA GAC GAA
     Ser Ala Asp Phe Arg Trp His Ile Ile Arg His Glu Leu Ala Asp Glu
1043 GAG ACC TTC ATG AAC GTG GCC AAG GCC AGA GTT AAG CAG ATG TTA CGC
     Glu Thr Phe MET Asn Val Ala Lys Ala Arg Val Lys Gln MET Leu Arg
1091 CCT GCT GCA GGC CTT TCT ATT ATC CAA GAG TTG GTT GAT CTT TGG AAA
     Pro Ala Ala Gly Leu Ser Ile Ile Gln Glu Leu Val Asp Leu Trp Lys
1139 GAG CCT CGG CTG AGG CCC ATA CTG AAA GAG ATC GAT GGA TAT CGA TAT
     Glu Pro Arg Leu Arg Pro Ile Leu Lys Glu Ile Asp Gly Tyr Arg Tyr
1187 GCC ATG TTG TTT GCT AGC CAG AAC CAG ATC ACA TCC GAT ATG CTA TTG
     Ala MET Leu Phe Ala Ser Gln Asn Gln Ile Thr Ser Asp MET Leu Leu
1235 CAG CTT GAC GCA GAT ATG GAG GAT AAG TTG ATT CAT GGG ATC GCT CAG
     Gln Leu Asp Ala Asp MET Glu Asp Lys Leu Ile His Gly Ile Ala Gln
1283 GAG TAT CTC ATC CAT GCA CGC CGA CAA GAA CAG AAA TTC CCT CGA GTT
     Glu Tyr Leu Ile His Ala Arg Arg Gln Glu Gln Lys Phe Pro Arg Val
1331 AAC GCA GCC GCT TAC GAC GGA TTC GAA GGT CAT CCA TTC GGA ATG TAT
     Asn Ala Ala Ala Tyr Asp Gly Phe Glu Gly His Pro Phe Gly MET Tyr

1379 TAG TTTGCACCAG CTCCGCGTCA CACCTGTCTT CATTTGAATA AGATGTTCGC
1432 AATTGTTTTT AGCTTTGTCT TGTTGTGGCA GGGCGGCAAG TGCTTCAGAC ATCATTCTGT
1492 TTTCAAATTT TATGCTGGAG AACAGCTTCT TAATTCCTTT GGAAATAATA GACTGCGTCT
1552 TAAAATTCAG ATGTCTGGAT ATAGATATGA TTGTAAAATA ACCTATTTAA GTGTCATTTA
1612 GAACATAAGT TTTATGAATG TTCTTCCATT TTCGTCATCG AACGAATAAG AGTAAATACA
1672 CCTTTTTTAA CATTATAAAT AAGTTCTTAT ACGTTGTTTA TACACCGGGA ATCATTTCCA
1732 TTATTTTCGC GCAAAAGTCA CGGATATTCG TGAAAGCGAC AAAAACTGCG AAATTTGCGG
1792 GGAGTGTCTT CAGTTTGCCT ATTAATATTT AGTTTGACAC TMATTGTTAC CATTGCAGCC
1852 AAGCTCAGCT GTTTCTTTTC TTAAAAACGC AGGATCGAAA GAGCATGACT CGGCAAGGTT
1912 GGCTTGTACC ATGCCTTTCT CATGGCAAAG ATGATCAACT GCAGGATGAA CTCTCGGAGC
1972 TTTCAAAAGC TT

Figure 2. Nucleotide sequence of the 2 kb BamHI-HindIII fragment containing the pTi T37 tmr locus. The 720 bp open reading frame and derived amino acid sequence begin at nucleotide 659 with an ATG translation initiator and end at nucleotide 1378, adjacent to a TAG translational terminator.
The strategy for the subcloning and sequencing appears in Fig. 1. No difficulty was encountered in obtaining clones of any of the subfragments nor in their sequencing. The final nucleotide sequence appears in Fig. 2. The total sequence extending from the beginning of the BamHI recognition sequence to the end of the HindIII recognition sequence comprises 1983 nucleotides. An open reading frame of 720 nucleotides, sufficient to encode a protein of derived molecular weight 27025 d, was found. Significantly, this open reading frame includes the HpaI cleavage site, preceding nucleotide 1331, where insertions of foreign DNAs result in inactivation of the pTi T37 tmr locus (27-28). This coding sequence starts with an ATG initiator codon beginning at nucleotide 659 and ends at nucleotide 1378, which is adjacent to a TAG translational termination codon.

 419 TGCCTCTGTA TTGCTGATTT AGTCAGCCTT ATTTGACTTA AGGGTGCCCT CGTTAGTGAC
 450 TTCCTCTGCA TTGCCAATTT ATTCAGCTTT ATTTGACTTA GGTGTGCCTT CGTTAGCGAC

 479 AAATTGCTTT CAAGGAGACA GCCATCCCCC ACACTTTGTT GAAAAACAAA TTGCCTTTGG
 510 AAATTGCTTT CAAGGAGACA GCCATCCCCC ACACTTTGTT GAAAAACAAG TTGCCTTTTG

 539 GGAGACGGTA AAGCCAGTTG CTCTTCAATA AGGAATCTCG AGGAGGCAAT ATAACCGCCT
 570 GGATACGGTA AAGCCAGTTG CACTTCAATA ATGAATTTCA AGGAGACAAT ATAACCGCCT

 599 CTGGTAGTAC ACTTCTCTAA TCCAAAAATC AATTTGTATT CAAGATACCG CAAAAAACTT ATG
 630 CTGATAACAC AATTCTCTAA TATAAAAATC AGTTTGTATT CAATATACTG CAAAAAACTT ATG

Figure 3. Comparison of the 5' nontranslated regions of the tmr loci from the T37 (upper lines) and Ach5 (lower lines) Ti plasmids. The underscored nucleotides are those in the Ach5 sequence that differ from the T37 sequence. The enclosed nucleotides are regions of potential importance in RNA polymerase II binding and transcription initiation.
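The open reading frame coordinates given above can be checked with a few lines of Python. This is only an illustrative sketch, not the IntelliGenetics software used for the original analysis: the function names are hypothetical, the codon table is the standard genetic code, and the two short sequences are transcribed by hand from the start and end of the reading frame in Fig. 2.

```python
# Illustrative sketch; names are hypothetical and not taken from the paper.
BASES = "TCAG"
# Standard genetic code in TCAG order ('*' marks a stop codon).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def orf_length(first_nt, last_nt):
    """Length of a region given inclusive 1-based coordinates."""
    return last_nt - first_nt + 1

def translate(dna):
    """Translate in frame 1 until a stop codon or the end of the sequence."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)

# First 16 codons (nucleotides 659-706) and the last three codons plus TAG.
ORF_START = "ATGGATCTGCGTCTAATTTTCGGTCCAACTTGCACAGGAAAGACGTCG"
ORF_END = "GGAATGTATTAG"

print(orf_length(659, 1378))        # 720 nucleotides
print(orf_length(659, 1378) // 3)   # 240 codons
print(translate(ORF_START))         # MDLRLIFGPTCTGKTS
print(translate(ORF_END))           # GMY (translation stops at TAG)
```

The 240 codons account for the 720 nucleotides quoted in the text; adding the TAG terminator gives the 723 nucleotides used later when the two coding sequences are compared.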
The derived size for the proposed tmr protein is in agreement with the bacterial expression and hybrid-selected translation data of Schroder and his co-workers (29-30) and with the derived octopine tmr protein size of 27003 d reported by Heidekamp et al. (15). Further similarities to the octopine tmr protein will be discussed in the comparison of the coding sequences below.

Examination of the DNA sequences immediately preceding the coding sequence reveals the features expected for an RNA polymerase II recognition and transcription initiation region. These include a 5'-TATAA- sequence beginning at nucleotide 588. This canonical "TATA box" is preceded at nucleotide 545 by the sequence 5'-GGTAAAG-, which was also identified by Heidekamp et al. (15) and bears some resemblance to the canonical "CAAT box" (5'-GGC/TCAATCT-) described for non-plant eucaryotic RNA polymerase II recognition regions (31). Based upon our current understanding of plant gene regulatory elements (38), it is possible that plant gene promoters do not contain this feature. Although we have not performed S1 digestion analysis to precisely locate the 5' end of the transcript, the similarity of the signals just described for the pTi T37 tmr gene and those of the pTi Ach5 tmr gene discussed below suggests strongly that these signals are indeed those recognized during transcription of the pTi T37 tmr gene in transformed plant cells.

Comparison of the nucleotide sequences of the pTi T37 tmr and pTi Ach5 tmr loci

The 5' regions

The sequences of the 240 nucleotides preceding the ATG initiator codon of both the pTi T37 and pTi Ach5 tmr genes show greater than 88% homology (Fig. 3). Interestingly, the region with the greatest continuous conserved sequence falls between bases 477 and 526, which are approximately 130 to 180 nucleotides 5' to the ATG initiation codon.
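The positions quoted for the promoter motifs (the 5'-TATAA- sequence at nucleotide 588 and 5'-GGTAAAG- at nucleotide 545) can be reproduced by a plain substring search. The sketch below assumes that nucleotides 541-600 have been transcribed correctly from Fig. 2. The helper name `locate` is hypothetical.

```python
# Illustrative sketch; `locate` is a hypothetical helper, not from the paper.
def locate(segment, first_nt, motif):
    """Return the 1-based coordinate of the first occurrence of motif.

    first_nt is the coordinate of segment[0]; None is returned if the
    motif does not occur in the segment.
    """
    index = segment.find(motif)
    return None if index < 0 else first_nt + index

# Nucleotides 541-600 of the pTi T37 sequence, transcribed from Fig. 2.
UPSTREAM_541_600 = (
    "AGACGGTAAA" "GCCAGTTGCT" "CTTCAATAAG"
    "GAATCTCGAG" "GAGGCAATAT" "AACCGCCTCT"
)

print(locate(UPSTREAM_541_600, 541, "TATAA"))    # 588
print(locate(UPSTREAM_541_600, 541, "GGTAAAG"))  # 545
```

A plain search like this only reports where a string occurs. It says nothing about whether the motif is functional, which is the open question raised in the text.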
Whether this has significance with respect to promoter function will await deletion or site-directed mutagenesis analysis of these sequences. Of the 27 base changes that occur in this 240 nucleotide segment, most (17 of 27) are transitions, which preserve a purine or pyrimidine, respectively, at the site of the change. Of the transversions that have occurred, most have been of the G→T type when comparing the pTi T37 to the pTi Ach5 sequence. Without quantitative comparison of transcription levels from the two tmr loci, it is not possible to assess the overall effects of these base changes on relative promoter strength.

Heidekamp et al. found two mRNAs from the pTi Ach5 tmr locus: a minor, "long start" transcript (5' end: nucleotides 646-651, Ach5; nucleotides 615-620, T37) and a major, "short start" transcript (5' end: nucleotides 679-683, Ach5; nucleotides 649-653, T37). Each of these starts is preceded by a canonical "TATA box" approximately 30 nucleotides upstream. Significantly, the "TATA box" (5'-TATAAA) for the "short" transcript has been mutated at nucleotides 620 and 621 of the pTi T37 sequence to become 5'-TCCAAA, presumably eliminating this transcript of the pTi T37 tmr locus. As mentioned previously, we have not mapped the transcription start of the pTi T37 tmr RNA and cannot say for certain that the T37 gene will show only one mRNA equivalent to the longer of the two transcripts described for the Ach5 tmr gene. Certainly that would be the prediction based on studies which demonstrate the importance of the "TATA box" in positioning the start point for transcription of other eucaryotic genes (31). The answer to this question awaits further experimental analysis. If only the "long start" is used during pTi T37 tmr transcription, then the transcription start should lie between nucleotides 615 and 620 of the T37 sequence, as shown by Heidekamp and co-workers (15) for Ach5.
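The transition/transversion tally described above is a simple position-by-position comparison of two aligned sequences. As a hedged illustration, the sketch below applies it to only the last 60 aligned nucleotides before the ATG (T37 nucleotides 599-658 and Ach5 nucleotides 630-689, transcribed from Fig. 3), not to the full 240 nucleotide region. The function names are hypothetical.

```python
# Illustrative sketch; function names are hypothetical, not from the paper.
PURINES = {"A", "G"}

def differences(seq_a, seq_b, first_nt):
    """1-based coordinates (in seq_a numbering) where two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [first_nt + i for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]

def classify_substitutions(seq_a, seq_b):
    """Return (transitions, transversions) between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = 0
    for a, b in zip(seq_a, seq_b):
        if a == b:
            continue
        # A transition keeps the base class (purine<->purine or
        # pyrimidine<->pyrimidine); a transversion switches it.
        if (a in PURINES) == (b in PURINES):
            transitions += 1
        else:
            transversions += 1
    return transitions, transversions

# Last 60 nucleotides 5' to the ATG, transcribed from Fig. 3.
T37_599_658 = ("CTGGTAGTAC" "ACTTCTCTAA" "TCCAAAAATC"
               "AATTTGTATT" "CAAGATACCG" "CAAAAAACTT")
ACH5_630_689 = ("CTGATAACAC" "AATTCTCTAA" "TATAAAAATC"
                "AGTTTGTATT" "CAATATACTG" "CAAAAAACTT")

positions = differences(T37_599_658, ACH5_630_689, 599)
print(positions)                                          # T37 coordinates of the changes
print(classify_substitutions(T37_599_658, ACH5_630_689))  # (transitions, transversions)
print(sum(1 for p in positions if p >= 617))              # changes at or beyond nucleotide 617
```

On this 60 nucleotide window the comparison finds nine differences, five of them at or beyond nucleotide 617, including the two changes at nucleotides 620 and 621 that convert 5'-TATAAA to 5'-TCCAAA.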
This means that 5 out of the 27 changes have occurred in the 5' nontranslated leader (nucleotides 617-659) of the tmr gene and confirms the variability seen in the 5' nontranslated sequences of other plant gene transcripts of different members of the same functional gene family, such as those of the pea small subunit of ribulose bis-phosphate carboxylase family (32-33).

 659 ATG GAT CTG CGT CTA ATT TTC GGT CCA ACT TGC ACA GGA AAG ACG TCG
     MET Asp Leu Arg Leu Ile Phe Gly Pro Thr Cys Thr Gly Lys Thr Ser
     Ach5 nt: C A A
     Ach5 aa: Asp His Thr
 707 ACC GCG GTA GCT CTT GCC CAG CAG ACT GGG CTT CCA GTC CTT TCG CTC
     Thr Ala Val Ala Leu Ala Gln Gln Thr Gly Leu Pro Val Leu Ser Leu
     Ach5 nt: A A T
     Ach5 aa: Ile Thr Leu
 755 GAT CGG GTC CAA TGT TGT CCT CAG CTG TCA ACC GGA AGC GGA CGA CCA
     Asp Arg Val Gln Cys Cys Pro Gln Leu Ser Thr Gly Ser Gly Arg Pro
     Ach5 nt: C A A
     Ach5 aa: Gln Leu Cys
 803 ACA GTG GAA GAA CTG AAA GGA ACG AGC CGT CTA TAC CTT GAT GAT CGG
     Thr Val Glu Glu Leu Lys Gly Thr Ser Arg Leu Tyr Leu Asp Asp Arg
     Ach5 nt: C
     Ach5 aa: Leu
 851 CCT CTG GTG AAG GGT ATC ATC GCA GCC AAG CAA GCT CAT GAA AGG CTG
     Pro Leu Val Lys Gly Ile Ile Ala Ala Lys Gln Ala His Glu Arg Leu
     Ach5 nt: G C T
     Ach5 aa: Glu His
 899 ATG GGG GAG GTG TAT AAT TAT GAG GCC CAC GGC GGG CTT ATT CTT GAG
     MET Gly Glu Val Tyr Asn Tyr Glu Ala His Gly Gly Leu Ile Leu Glu
     Ach5 nt: C A C A
     Ach5 aa: Ile Glu His Asn
 947 GGA GGA TCT ATC TCG TTG CTC AAG TGC ATG GCG CAA AGC AGT TAT TGG
     Gly Gly Ser Ile Ser Leu Leu Lys Cys MET Ala Gln Ser Ser Tyr Trp
     Ach5 nt: C C C C G A
     Ach5 aa: Ser Thr Asn Arg Asn Ser
 995 AGT GCG GAT TTT CGT TGG CAT ATT ATT CGC CAC GAG TTA GCA GAC GAA
     Ser Ala Asp Phe Arg Trp His Ile Ile Arg His Glu Leu Ala Asp Glu
     Ach5 nt: A CC A C
     Ach5 aa: Ala Lys Pro Gln
1043 GAG ACC TTC ATG AAC GTG GCC AAG GCC AGA GTT AAG CAG ATG TTA CGC
     Glu Thr Phe MET Asn Val Ala Lys Ala Arg Val Lys Gln MET Leu Arg
     Ach5 nt: A C G A
     Ach5 aa: Lys Ala Leu His
1091 CCT GCT GCA GGC CTT TCT ATT ATC CAA GAG TTG GTT GAT CTT TGG AAA
     Pro Ala Ala Gly Leu Ser Ile Ile Gln Glu Leu Val Asp Leu Trp Lys
     Ach5 nt: C A T T T
     Ach5 aa: Pro His Ile Asn Tyr
1139 GAG CCT CGG CTG AGG CCC ATA CTG AAA GAG ATC GAT GGA TAT CGA TAT
     Glu Pro Arg Leu Arg Pro Ile Leu Lys Glu Ile Asp Gly Tyr Arg Tyr
     Ach5 nt: A T
     Ach5 aa: Glu Ile
1187 GCC ATG TTG TTT GCT AGC CAG AAC CAG ATC ACA TCC GAT ATG CTA TTG
     Ala MET Leu Phe Ala Ser Gln Asn Gln Ile Thr Ser Asp MET Leu Leu
     Ach5 nt: G G A
     Ach5 aa: Thr Ala
1235 CAG CTT GAC GCA GAT ATG GAG GAT AAG TTG ATT CAT GGG ATC GCT CAG
     Gln Leu Asp Ala Asp MET Glu Asp Lys Leu Ile His Gly Ile Ala Gln
     Ach5 nt: A A G A
     Ach5 aa: Asn Glu Gly Asn
1283 GAG TAT CTC ATC CAT GCA CGC CGA CAA GAA CAG AAA TTC CCT CGA GTT
     Glu Tyr Leu Ile His Ala Arg Arg Gln Glu Gln Lys Phe Pro Arg Val
     Ach5 nt: T A G G C A
     Ach5 aa: Phe Ala Gln Gln Pro Gln
1331 AAC GCA GCC GCT TAC GAC GGA TTC GAA GGT CAT CCA TTC GGA ATG TAT TAG
     Asn Ala Ala Ala Tyr Asp Gly Phe Glu Gly His Pro Phe Gly MET Tyr
     Ach5 nt: T G
     Ach5 aa: Phe Pro

Figure 4. Comparison of the nucleotide and derived amino acid sequences of the tmr loci from the T37 (upper lines) and Ach5 (lower lines) Ti plasmids.

The coding sequence

The 723 nucleotides that comprise the coding sequence and termination codons of both tmr loci are shown in Fig. 4, and the deduced amino acid sequence for each is also presented. There is greater than 91% nucleotide homology. Of the 54 nucleotide changes that occur, 21 are third position changes which do not alter the amino acid at this position. Nine of the remaining changes result in substitution of similar amino acids, such as the change of a threonine for a serine at amino acid 16. The remaining changes result in substantial differences of the amino acid; for example, the replacement of lysine by glutamate at amino acid 68 or the replacement of aspartate by tyrosine at amino acid 157. Overall, these changes result in a net negative charge of -2 for the T37 tmr protein versus a net negative charge of -5 for the Ach5 protein. These changes have not substantially altered the basic function of the resulting tmr proteins, since genetic evidence suggests that each performs the same function in T37 or Ach5 transformed tissues.
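The net charge quoted for the T37 tmr protein can be reproduced with a simple residue count. The sketch below assumes the usual shorthand of counting lysine and arginine as +1 and aspartate and glutamate as -1, and ignoring histidine and the chain termini. The paper does not state how its figure was obtained, so this is only one plausible reading. The one-letter sequence is transcribed from the derived amino acid sequence in Fig. 2, and the function name is hypothetical.

```python
# Illustrative sketch; `net_charge` is a hypothetical helper and the
# charge model (Lys/Arg = +1, Asp/Glu = -1, His ignored) is an assumption.
T37_TMR = (
    "MDLRLIFGPTCTGKTS" "TAVALAQQTGLPVLSL" "DRVQCCPQLSTGSGRP"
    "TVEELKGTSRLYLDDR" "PLVKGIIAAKQAHERL" "MGEVYNYEAHGGLILE"
    "GGSISLLKCMAQSSYW" "SADFRWHIIRHELADE" "ETFMNVAKARVKQMLR"
    "PAAGLSIIQELVDLWK" "EPRLRPILKEIDGYRY" "AMLFASQNQITSDMLL"
    "QLDADMEDKLIHGIAQ" "EYLIHARRQEQKFPRV" "NAAAYDGFEGHPFGMY"
)

def net_charge(protein):
    """Count basic residues (K, R) minus acidic residues (D, E)."""
    positive = sum(protein.count(residue) for residue in "KR")
    negative = sum(protein.count(residue) for residue in "DE")
    return positive - negative

print(len(T37_TMR))         # 240 residues
print(net_charge(T37_TMR))  # -2
```

Under this counting rule the 240 residue T37 sequence carries 27 basic and 29 acidic residues, which agrees with the net charge of -2 given in the text.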
What effects these substitutions might have on the efficiency with which each of the respective tmr proteins fulfills its intracellular role awaits identification of biological activity and comparison of the two purified proteins. It should be noted that there is no great difference in the codon usage for the two coding sequences. Codons that appear infrequently in either of the tmr genes are not underrepresented in the codon usage of both the octopine and nopaline synthase proteins (34-35) and probably represent only random variation in codon usage in the smaller tmr proteins with their smaller number of codons.

The 3' region

The sequences of nearly 360 nucleotides from the 3' end of the pTi T37 and pTi Ach5 tmr genes appear in Fig. 5. We have attempted to align these so that the maximum homology is shown. This has been accomplished by including spaces in both sequences where insertions or deletions appear to have occurred. As has been described for the 3' nontranslated regions of different members of a gene family (33,36-37), a great amount of variability exists between the two tmr genes. The overall homology is only about 50%.
1382 TTTGCACCAG CTCCGCGTCA CACCTGTCTT CATTTGAATA AGATGTTCGC AATTGTTTTT
1413 GTTACGCCAG CCCTGCGTCG CACCTGTCTT CATCTGGATA AGATGTTCGT AATTGTTTTT

1447 AGCTTTGTCT TGTTGTGGCA GGGCGGCAAG TGCTTCAGAC ATCATTCTG TTTTCAAAT
1473 GGCTTTGTCC TGTTGTGGCA GGGCGGCAAA TACTTGCGAC AATCCATCGT GTCTTCAAAC

1500 TTTATGCTGG AGAACAGCTT CTTAATTCCT TTGGAAATAA TAGACTGCGT CTTAAAATT
1533 TTTATGCTGG TGAACAAGTC TTAGTTTCCA CGAAAGTA TTATGTTAAA TTTTAAAATT

1559 CAGATGTCTG GATATAGATA TGATTGTAAA ATAACCTATT TAAGTGTCA TTTAGAACAT
1591 TCGATGTATA ATGTGGCTAT AATTGTAAAA ATAAACTATC GTAAGTGTGC GTGTTATGTA

1618 AAGTTTTATG AATGTTCTTC CATTTTCGTC ATCGAACGAA TAAGAGTAAA TACACCTTTT
1651 TAATTTGTCT AAATGTTTAA TATATATCAT AGAACGCAAT AAATATTAAA TATAGCGCTT

1678 TTAACAT TA TAAATAAGTT CTTATACGTT GTTTATACAC CGGGAATCAT TTCCATTATT
1711 TTATGAAATA TAAATACATC ATTACAAGTT GTTTATATTT CGGGTACCTT TTCCATTATT

Figure 5. Comparison of the 3' nontranslated regions of the tmr loci from the T37 (upper lines) and Ach5 (lower lines) Ti plasmids. Spaces have been inserted into both sequences to achieve maximal alignment. The nucleotide numbering is the same as in Fig. 2 and has been adjusted for the inserted spaces in the pTi T37 tmr sequence. The underscored nucleotides are potential poly-adenylation signals.

Because neither we nor Heidekamp and his co-workers (15) have performed S1 analysis to accurately determine the location of the 3' end of the respective tmr mRNAs, the following conclusions will be based entirely on inspection of the sequences for the presence of the canonical plant poly-adenylation site 5'-G/AATAA- (38). These sites are marked on Fig. 5. It is interesting that both the pTi T37 and the pTi Ach5 tmr loci show a consensus poly-adenylation signal near to the coding sequence (nucleotide 1416; 5'-AATAA- for T37 and 5'-GATAA for Ach5).
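Scanning for the consensus 5'-G/AATAA- amounts to a regular-expression search for G or A followed by ATAA. The sketch below runs it on the first 50 nucleotides after the TAG terminator of each gene, transcribed from Fig. 5, and reports offsets counted from the first nucleotide after the terminator rather than absolute coordinates. The function name is hypothetical, and overlapping matches are found with a lookahead so that adjacent signals are not missed.

```python
import re

# Illustrative sketch; `find_polya_signals` is a hypothetical helper.
# The lookahead makes the search report overlapping matches as well.
POLYA_CONSENSUS = re.compile(r"(?=([GA]ATAA))")

def find_polya_signals(sequence):
    """Return (offset, motif) pairs; offset 0 is the first nucleotide given."""
    return [(m.start(), m.group(1)) for m in POLYA_CONSENSUS.finditer(sequence)]

# First 50 nucleotides 3' of the TAG terminator, transcribed from Fig. 5.
T37_3PRIME = ("TTTGCACCAG" "CTCCGCGTCA" "CACCTGTCTT"
              "CATTTGAATA" "AGATGTTCGC")
ACH5_3PRIME = ("GTTACGCCAG" "CCCTGCGTCG" "CACCTGTCTT"
               "CATCTGGATA" "AGATGTTCGT")

print(find_polya_signals(T37_3PRIME))   # [(36, 'AATAA')]
print(find_polya_signals(ACH5_3PRIME))  # [(36, 'GATAA')]
```

Both genes give a single match 36 nucleotides past the terminator within this window, AATAA in T37 and GATAA in Ach5. A match to the consensus is only a candidate signal; which sites are actually used remains an experimental question.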
The significance of these signals approximately 36 nucleotides from the translational termination codon is not known, but they have been found in most of the plant 3' nontranslated sequences examined (38). In addition to these "close-in" poly-adenylation signals, both the pTi T37 and pTi Ach5 3' regions show consensus plant signals at similar locations, approximately 200 and 270 nucleotides downstream from the translation terminator. The pTi T37 sequence shows two additional consensus poly-adenylation signals, one of which is located 155 nucleotides from the terminator codon and the other of which occurs approximately 300 nucleotides from the terminator. The relative utilization of these various signals in posttranscriptional modification of the respective tmr mRNAs awaits further experimentation.

DISCUSSION

In this paper we report the nucleotide sequence of the pTi T37 tmr locus and compare and contrast this with the sequence of the pTi Ach5 tmr locus. The results raise many basic questions concerning plant gene expression, as have previous reports describing and comparing nucleotide sequences in the absence of experimental manipulation of these DNAs. The existence of two functionally identical but structurally different DNAs has allowed us to reach the following conclusions concerning the significance of, in particular, the conserved sequences. We suggest that the extreme conservation of sequences located 130 to 180 nucleotides 5' of the translational start signal indicates a more significant role of these distal sequences in proper binding and interaction with the plant cell RNA polymerase II complex than is usually presumed. The importance of these regions might be assessed by experimental analysis. The significance of the pTi T37 single "TATA box" versus two such signals and two different mRNAs for the pTi Ach5 promoter can only be assessed by quantitation of the total amount of transcription from the two different tmr gene promoters.
Fortunately, all of the questions raised are answerable. We now have access to the nucleotide sequences and the means to alter and re-introduce modified DNAs into plant cells to assay the effects of our manipulations (39-40). In addition, the availability of the coding sequence permits us to modify the pTi T37 tmr gene for expression in Escherichia coli. This will enable us to obtain the product free from contaminating plant proteins and to perform assays for the possible cytokinin biosynthetic enzyme activities of this protein (42). Such experiments are currently in progress.

ACKNOWLEDGEMENTS

The authors are grateful to Dr. J. Schell for plasmid pGV3106. The authors wish to thank Ms. P. Guenther for exceptional patience during the preparation of the figures and text of this manuscript and Drs. R. Fraley, R. Horsch, G. Barry and A. Levine for their critical reading of this manuscript.

*To whom correspondence should be addressed

ABBREVIATIONS

nos, nopaline synthase; kb, kilobases, 1000 bases

REFERENCES

1. Chilton, M.-D., Drummond, M. H., Merlo, D. J., Sciaky, D., Montoya, A. L., Gordon, M. P. and Nester, E. W. (1977) Cell 11, 263-271.
2. Zaenen, I., Van Larebeke, N., Teuchy, H., Van Montagu, M. and Schell, J. (1979) J. Mol. Biol. 86, 109-127.
3. Chilton, M.-D. (1982) in Molecular Biology of Plant Tumors, Kahl, G. and Schell, J. Eds., Academic Press, New York.
4. Holsters, M., Hernalsteens, J.-P., Van Montagu, M. and Schell, J. (1982) in Molecular Biology of Plant Tumors, Kahl, G. and Schell, J. Eds., Academic Press, New York.
5. Bevan, M. W. and Chilton, M.-D. (1982) Ann. Rev. Genet. 16, 357-384.
6. Braun, A. C. (1958) Proc. Natl. Acad. Sci. USA 44, 344-349.
7. Braun, A. C. (1978) Biochim. Biophys. Acta 516, 167-191.
8. Garfinkel, D. J., Simpson, R. B., Ream, L. W., White, F. F., Gordon, M. P. and Nester, E. W. (1981) Cell 27, 143-153.
9. Leemans, J., Deblaere, R., Willmitzer, L., De Greve, H., Hernalsteens, J.-P., Van Montagu, M. and Schell, J. (1982) The EMBO Journal 1, 147-152.
10. Joos, H., Inze, D., Caplan, A., Sormann, M., Van Montagu, M. and Schell, J. (1983) Cell 32, 1057-1067.
11. Willmitzer, L., Dhaese, P., Schreier, P. H., Schmalenbach, W., Van Montagu, M. and Schell, J. (1983) Cell 32, 1045-1056.
12. Bevan, M. W. and Chilton, M.-D. (1982) J. Mol. Appl. Genet. 1, 539-546.
13. Akiyoshi, D. E., Morris, R. O., Hinz, R., Mischke, B. S., Kosuge, T., Garfinkel, D. J., Gordon, M. P. and Nester, E. W. (1983) Proc. Natl. Acad. Sci. USA 80, 407-411.
14. Ooms, G., Klapwijk, P. M., Poulis, J. A. and Schilperoort, R. A. (1980) J. Bacteriology 144, 82-91.
15. Heidekamp, F., Dirkse, W. G., Hille, J. and van Ormondt, H. (1983) Nucleic Acids Res. 11, 6211-6223.
16. Engler, G., Depicker, A., Maenhaut, R., Villarroel, R., Van Montagu, M. and Schell, J. (1981) J. Mol. Biol. 152, 183-208.
17. Maniatis, T., Fritsch, E. F. and Sambrook, J. (1982) Molecular Cloning, A Laboratory Manual. Cold Spring Harbor Laboratory, Cold Spring Harbor, New York.
18. Messing, J., Crea, R. and Seeburg, P. H. (1981) Nucleic Acids Res. 9, 309-321.
19. Roberts, R. J. (1982) Nucleic Acids Res. 10, 1830 (r117-r144).
20. Murray, N. E., Bruce, S. A. and Murray, K. (1979) J. Mol. Biol. 132, 493-505.
21. Davis, R. W., Botstein, D. and Roth, J. R. (1980) Advanced Bacterial Genetics. Cold Spring Harbor Laboratory, Cold Spring Harbor, New York.
22. Anderson, S. (1981) Nucleic Acids Res. 9, 3015-3027.
23. Taylor, A. F., Silicano, P. G. and Weiss, B. (1980) Gene 9, 321-336.
24. Hernalsteens, J.-P., Van Vliet, F., De Beuckeleer, M., DePicker, A., Engler, G., Lemmers, M., Holsters, M., Van Montagu, M. and Schell, J. (1980) Nature 287, 654-656.
25. Soberon, X., Covarrubias, L. and Bolivar, F. (1980) Gene 9, 287-305.
26. Hepburn, A. G., Clarke, L. E., Blundy, K. S. and White, J. (1983) J. Mol. Appl. Genet. 2, 211-224.
27. Matzke, A. J. M. and Chilton, M.-D. (1981) J. Mol. Appl. Genet. 1, 39-49.
28. Barton, K. A., Binns, A. N., Matzke, A. J. M. and Chilton, M.-D. (1983) Cell 32, 1033-1043.
29. Schröder, G., Klipp, W., Hillebrand, A., Ehring, R., Koncz, C. and Schröder, J. (1983) The EMBO J. 2, 403-409.
30. Schröder, G. and Schröder, J. (1982) Mol. Gen. Genet. 185, 51-55.
31. Breathnach, R. and Chambon, P. (1981) Ann. Rev. Biochem. 50, 349-383.
32. Cashmore, A. R. (1983) in Genetic Engineering of Plants, an Agricultural Perspective, Kosuge, T., Meredith, C. P. and Hollaender, A. Eds., pp. 29-38, Plenum Press, New York.
33. Broglie, R., Coruzzi, G., Lamppa, G., Keith, B. and Chua, N.-H. (1983) Biotechnology 1, 55-61.
34. De Greve, H., Dhaese, P., Seurinck, J., Lemmers, M., Van Montagu, M. and Schell, J. (1982) J. Mol. Appl. Genet. 1, 499-512.
35. Depicker, A., Stachel, S., Dhaese, P., Zambryski, P. and Goodman, H. M. (1982) J. Mol. Appl. Genet. 1, 561-574.
36. Berry-Lowe, S. L., McKnight, T. D., Shah, D. M. and Meagher, R. B. (1982) J. Mol. Appl. Genet. 1, 483-498.
37. Coruzzi, G., Broglie, R., Edwards, C. and Chua, N.-H. (1984) Cell, submitted for publication.
38. Messing, J., Geraghty, D., Heidecker, G., Hu, N.-T., Kridl, J. and Rubenstein, I. (1983) in Genetic Engineering of Plants, an Agricultural Perspective, Kosuge, T., Meredith, C. P. and Hollaender, A. Eds., pp. 211-227, Plenum Press, New York.
39. Herrera-Estrella, L., De Block, M., Hernalsteens, J.-P., Van Montagu, M. and Schell, J. (1983) The EMBO Journal 2, 987-995.
40. Bevan, M. W., Flavell, R. and Chilton, M.-D. (1983) Nature 304, 184-187.
41. Fraley, R. T., Rogers, S. G., Horsch, R. B., Sanders, P. R., Flick, J. S., Adams, S. P., Bittner, M. L., Brand, L. B., Fink, C. L., Fry, J. S., Galluppi, G. R., Goldberg, S. B., Hoffmann, N. L. and Woo, S. C. (1983) Proc. Natl. Acad. Sci. USA 80, 4803-4807.
42. Morris, R. O., Akiyoshi, D. E., MacDonald, E. M. S., Morris, J. W., Regier, D. A. and Zaerr, J. B. (1982) in Plant Growth Substances, Waring, P. F., Ed., Academic Press, London.