Journal of General Virology (1990), 71, 2451-2456. Printed in Great Britain

Nucleotide sequence of a cytomegalovirus single-stranded DNA-binding protein gene: comparison with alpha- and gammaherpesvirus counterparts reveals conserved segments

David G. Anders

Virology Laboratories, The Wadsworth Center for Laboratories and Research, SUNY at Albany, School of Public Health, New York State Department of Health, Albany, New York, 12201-0509, U.S.A.

The genomic sequence encoding a cytomegalovirus strain Colburn homologue (DB129) of the herpes simplex virus major DNA-binding protein (ICP8) was determined. Multiple alignments of the deduced DB129 amino acid sequence and three alpha- and gammaherpesvirus homologues revealed that 56% of the amino acid residues identical in all four homologues are contained within 12 relatively conserved segments, which together constitute only 11.2% of the shortest aligned sequence. In light of published ICP8 deletion analyses, this alignment suggests conserved segments that may participate in forming DNA contacts. The identified conserved regions present interesting targets for site-directed mutagenesis in structure-function analyses.

Previous studies have identified an early nuclear ssDNA-binding protein, present in cytomegalovirus (CMV)-infected cells, as a probable homologue of the herpes simplex virus type 1 (HSV-1) major DNA-binding protein (ICP8) (Anders et al., 1986, 1987) and have mapped its gene near the centre of the long unique component of the viral genome (Anders & Gibson, 1988; Kemble et al., 1987). The CMV strain Colburn protein (DB129) has an estimated Mr of 129000, whereas its immunologically cross-reactive human CMV (HCMV) counterpart (DB140), encoded by the UL57 open reading frame (ORF; M. Chee, personal communication), has an estimated Mr of about 140000 (Anders et al., 1986).
The extensive biochemical (Anders et al., 1986; Kemble et al., 1987) and sequence (this paper) similarities of this CMV protein to HSV-1 ICP8 argue that it plays a comparable role in infection, although no genetic evidence for its function(s) is available. This laboratory is studying the CMV ssDNA-binding protein as a model for the structure and function of this group-common gene product. To facilitate these studies, and to allow comparison with herpesvirus counterparts, I determined the nucleotide sequence encoding DB129 and its flanking regions.

0000-9628 © 1990 SGM

Strain Colburn CMV (Gibson, 1981, 1983), from which all recombinant clones were derived, was obtained from W. Gibson. Plasmid pDGA8 contains the EcoRI D fragment of CMV(Colburn) and has been described (Anders & Gibson, 1988). All recombinant plasmids were propagated in Escherichia coli DH5α (Bethesda Research Laboratories).

Fig. 1. Location of DB129 gene and sequencing strategy. (a) EcoRI physical map of the CMV(Colburn) genome (LaFemina & Hayward, 1980; Jeang & Hayward, 1983). Cleavage sites are indicated by cross-hatches and the larger fragments are designated with uppercase letters. (b) Restriction map of EcoRI-D, cloned as pDGA8, showing restriction sites designated as follows: E, EcoRI; H, HindIII; K, KpnI; P, PstI; S, SalI; Sp, SphI; X, XbaI. The approximate position of the most abundant transcript from this region and its orientation are indicated above the map. The HindIII fragment subcloned from EcoRI-D in both a and b orientations, as pDGA16 and pDGA24 respectively, is filled. The cross-hatched segment shows the location of sequence data given in Fig. 2.

Fig. 2. Nucleotide sequence of the DB129-coding segment of EcoRI-D and flanking regions, and deduced amino acid sequence. The first nucleotide of the probable initiation codon is taken as +1. The TATA consensus, the polyadenylation signal and other potential regulatory or signal sequences discussed in the text are underlined. The deduced amino acid sequence is given beneath the nucleotide sequence.

Hybrid-arrested in vitro translation experiments previously showed that the gene encoding CMV(Colburn) DB129 lies within EcoRI-D; its boundaries were predicted from limited sequence data and transcript mapping (Anders & Gibson, 1988). To establish the structure of the DB129 gene, the 5.6 kb HindIII subfragment of EcoRI-D containing the DB129 gene (Fig. 1) was excised from pDGA8 and ligated into the HindIII site of pBluescribe (Stratagene Cloning Systems) in both a and b orientations to generate pDGA16 and pDGA24, respectively. Nested sets of
progressive unidirectional deletions, in both directions with respect to the DB129 gene, were prepared using the exonuclease III method of Henikoff (1984). The sequence across the deletion junction of selected clones was determined using the dideoxynucleotide chain termination method (Sanger et al., 1977). When it was necessary to fill in gaps, reactions were primed with an oligonucleotide corresponding to a known sequence within the insert. Primers were extended with modified T7 DNA polymerase (Sequenase; U.S. Biochemical) according to the supplier's instructions, in the presence of [α-35S]dATP (Amersham). Nucleotide sequence data were assembled and analysed on a Digital Equipment Corporation VAX/VMS computer using the Genetics Computer Group Sequence Analysis Software Package, version 5.3 (GCG; Devereux et al., 1984). Both strands were essentially completely sequenced. A preliminary report of this work was presented at the 13th International Herpesvirus Workshop, 7 to 13 August, 1988, Irvine, California, U.S.A.

The data revealed a 3480 nucleotide (nt) contiguous ORF in the region to which the DB129 gene was previously mapped, transcribed from right to left with respect to the standard orientation of the Colburn genome (LaFemina & Hayward, 1980; Jeang & Hayward, 1983). The nucleotide sequence and deduced amino acid sequence are shown in Fig. 2. Three lines of evidence indicate that this ORF encodes the complete DB129 polypeptide. First, the ORF predicts a protein of Mr 129005, in good agreement with the estimated Mr of DB129 (Anders et al., 1986) and similar to the Mr of the HSV major DNA-binding protein (Quinn & McGeoch, 1985). Second, as described below, sequence comparisons show similarity to the full length of herpesvirus homologues. Third, the ORF is flanked by sets of elements that probably define the structure of encoding transcripts.
Upstream there is a TATA box homology (TATAAA), at nt -43 to -38 relative to the predicted translational start site, and several candidate regulatory elements (discussed below). Immediately downstream of the ORF, forming a portion of the termination codon, is the polyadenylation signal AATAAA (Proudfoot & Brownlee, 1976); further downstream are several blocks of GT-rich sequence (e.g. nt 3518 to 3522; additional blocks are not shown) and another short consensus, CACTG (nt 3529 to 3533), present distal to the polyadenylation signals of many genes (McLauchlan et al., 1985; Berget, 1984). The most abundant transcript detected by coding-region probes on Northern transfers was estimated to be 3.9 kb in length (Anders & Gibson, 1988), consistent with the utilization of these flanking signals. Although the cap site has not been determined, upstream probes do not detect the 3.9 kb transcript, also consistent with the use of the TATA box at -43 to -38 (Anders & Gibson, 1988; D. G. Anders, unpublished results). These data predict an unspliced mRNA with a short 5' untranslated region, similar to the organization of the HSV-1 ICP8 transcription unit (Rafield & Knipe, 1984; Su & Knipe, 1987).

The first ATG of the ORF, at nt +1 to +3, is likely to be the translational start codon because (i) it is the first ATG after the putative TATA box at position -43 to -38 and, because transcription usually initiates 19 to 27 nt downstream from the TATA sequence, this ATG is probably 3' to the cap site, (ii) it is in a favourable sequence context (5' ACCACCATGA 3') to initiate translation (Kozak, 1987) and (iii) the amino-terminal amino acid sequence deduced from this site shows significant similarity to that predicted for ICP8 and other herpesvirus homologues, as shown below. Alternatively, or in addition, the next in-frame ATG, at nt +79 to +81, also in a favourable context, may be utilized to begin translation.
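As a rough check on the coordinates quoted above, the following Python sketch converts codon numbers to nucleotide positions in the numbering where +1 is the A of the first ATG, and inspects the two positions most often cited as important in the Kozak context (a purine at -3 and a G at +4). The function names are illustrative, and reducing the Kozak (1987) context to these two positions is a simplifying assumption, not the criterion applied in the text.

```python
def codon_start(codon_number: int) -> int:
    """Return the nt coordinate (+1 = A of the first ATG) of a codon's first base."""
    if codon_number < 1:
        raise ValueError("codon numbers start at 1")
    return 3 * (codon_number - 1) + 1


def kozak_flags(context: str, atg_offset: int) -> dict:
    """Inspect positions -3 and +4 around the ATG at context[atg_offset:atg_offset + 3].

    Assumption: only these two positions are scored; the full context is richer.
    """
    seq = context.upper()
    if seq[atg_offset:atg_offset + 3] != "ATG":
        raise ValueError("no ATG at the given offset")
    minus3 = seq[atg_offset - 3] if atg_offset >= 3 else None
    plus4 = seq[atg_offset + 3] if atg_offset + 3 < len(seq) else None
    return {
        "minus3": minus3,
        "minus3_is_purine": minus3 in ("A", "G"),
        "plus4": plus4,
        "plus4_is_G": plus4 == "G",
    }


# Context quoted in the text for the first ATG: 5' ACCACCATGA 3'
first_atg = kozak_flags("ACCACCATGA", atg_offset=6)

# A 3480 nt ORF corresponds to 3480 / 3 codons.
orf_codons = 3480 // 3

# Codon 27 begins at the position given for the next in-frame ATG.
second_atg_start = codon_start(27)
```

With the quoted context, position -3 is A (a purine) whereas position +4 is A rather than G. Codon 27 starts at nt +79, which matches the position given for the next in-frame ATG, and a 3480 nt ORF spans 1160 codons.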
More detailed transcript mapping and peptide sequence analysis will be required to establish the organization of the transcription unit and confirm the predicted DB129 protein sequence.

The region from nt -240 to -6 functioned as an orientation-specific promoter, albeit a weak one, in transient assays using reporter constructs (D. G. Anders & S. Punturieri, unpublished results). Inspection of this sequence revealed several motifs that have been implicated in regulating transcription in various systems. At nt -66 to -59, immediately upstream of the TATA box, is a close match (ATGACGTCT) to the cyclic AMP response element (CRE) consensus (Montminy et al., 1986) and an adjacent partial CRE (CGTCA). This motif, recognized by the factor ATF/CREB (Montminy & Bilezikjian, 1987; Lee et al., 1987), is present in the adenovirus E2 promoter, upstream of the HCMV 2.2 kb early promoter described by Staprans et al. (1988), and in multiple copies upstream of the CMV(Colburn) and HCMV major immediate early genes (Hunninghake et al., 1989; Chang et al., 1990). Situated further upstream are two GC-rich regions at nt -130 to -111 and at nt -189 to -168, each of which contains the 7 bp sequence GGCGGCG. A CCAAT box consensus was not found within the 240 nt 5' to the ORF. However, between the two GC-rich blocks, at nt -157 to -148, is the sequence TCGCAAATCG, intriguingly similar to the octamer binding site (Pruijn et al., 1987). Also present in this region is a direct repeat of the 7 nt sequence GGACAGT at positions -112 to -106 and -203 to -197.

The deduced DB129 amino acid sequence shares about 72% identity with its HCMV counterpart UL57 and the similarity is roughly collinear from the amino to the carboxyl terminus (M.
Chee, personal communication), consistent with their shared biochemical properties and immunological cross-reactivity (Anders et al., 1986, 1987). The observed higher Mr of the HCMV protein may be accounted for by the presence in UL57 of a glycine-rich 40 amino acid segment between residues 545 and 585 and two other short glycine-rich stretches near the carboxyl terminus, which are absent in DB129, if the encoding sequences are not spliced out of the transcript.

Similarity matrices comparing DB129 with alphaherpesvirus [HSV-1 ICP8; Quinn & McGeoch, 1985; varicella-zoster virus (VZV) gene 29; Davison & Scott, 1986] and gammaherpesvirus [Epstein-Barr virus (EBV) BALF2; Baer et al., 1984] homologues also revealed collinear nucleotide and amino acid sequence similarity (not shown). Pairwise alignments made using BESTFIT and GAP, which maximize the quality statistic, substantiated the impressions conveyed by homology matrices. BESTFIT alignment of DB129 with ICP8 yielded a quality of 502.5, alignment of DB129 with VZV gene 29 a quality of 500.1 and alignment of DB129 with BALF2 a quality of 581.7, whereas alignment of the two alphaherpesvirus proteins, ICP8 and VZV gene 29, gave a quality of 1101.3. For comparison, alignment of DB129 with HCMV UL57 yielded a quality of 1396.7, and self-alignment of DB129 (i.e. a perfect match) gave a quality of 1740. Similar results were obtained using GAP, which does not truncate paired sequences. This hierarchy of similarities is consistent with previous studies comparing herpesvirus homologues (e.g. Chee et al., 1989).

To resolve those regions most conserved during the evolutionary divergence of alpha-, beta- and gammaherpesvirus major DNA-binding protein homologues, their deduced amino acid sequences were compared in multiple alignments generated using two different methods. First, an iterative approach using BESTFIT and GAP to make rounds of pairwise comparisons was applied (not shown).
Second, the CLUSTAL programs (Higgins & Sharp, 1988) were used to produce a multiple alignment (Fig. 3). Results obtained using either approach were similar. Inspection of the alignments revealed clusters of residues conserved in all four compared sequences. These segments, numbered I to XII, are ungapped and were further defined on the basis of arbitrary criteria (given in the legend to Fig. 3). Two larger conserved regions, a and b, which contain a gap or fall below the arbitrary standards, but which are clearly more conserved than the mean, are enclosed by broken lines. Adding HCMV UL57 to this alignment caused no significant changes in the conserved regions defined by these criteria (data not shown). Together, the identified segments contain 51 of a total of 91 amino acids conserved in all four aligned proteins, or 56%, within a combined 130 residues (11.2% of the shortest aligned sequence). If regions a and b and the carboxyl terminus-proximal KR sequence (boxed, discussed below) are included, the corresponding numbers become 69.2% of conserved residues in 16.4% of the shortest aligned sequence. The density of identical residues (fraction identical/fraction total) is about fivefold higher in the conserved segments than in the whole sequence, and about 10-fold higher than that of the excluded (i.e. unboxed) sequence.

Several laboratories have investigated the functional organization of the prototype homologue HSV-1 ICP8 and it is of interest to consider their results as regards this alignment. ICP8 mutants missing as few as 36 residues of the carboxyl terminus fail to accumulate in the nucleus although they still bind to ssDNA in vitro (Gao & Knipe, 1989); the conserved sequence Lys-Arg (Fig. 3), along with a nearby Pro, may form all or part of a nuclear localization signal. Leinbach & Heath (1988, 1989) showed that fragments of ICP8 made using in vitro transcription and translation, containing either residues 571 to 1196 or 332 to 564, can bind ssDNA in vitro.
Consistent with those results, truncated ICP8 proteins, produced by the amino-terminal region deletion mutants d101 and d102, efficiently bind ssDNA in vitro but fail to support DNA replication (Gao & Knipe, 1989). Temperature-sensitive mutants tsHA1 and ts13, reported to show thermolabile DNA-binding, mapped to ICP8 positions 348 and 450, respectively (Gao et al., 1988). I note that these substitutions occurred proximal to, but not at, conserved positions. Mutant n2, in which the carboxyl terminus of ICP8 is truncated to residue 1029, deleting 163 residues but retaining conserved segment XII, still binds to ssDNA avidly (i.e. in 0.3 M-NaCl), though less efficiently than the wild-type protein (Gao & Knipe, 1989).

Fig. 3. Multiple alignment of DB129, EBV BALF2, HSV-1 ICP8 and VZV gene 29 performed using the CLUSTAL package (Higgins & Sharp, 1988) with default parameters. Residues that are identical in all aligned sequences are indicated with an asterisk below and where only conserved substitutions have occurred a period appears. Conserved segments, boxed and numbered I to XII, contained no gaps and met one of the following criteria: two or more adjacent conserved residues (i.e. asterisks), three conserved residues in 10, four conserved residues in 15, or five conserved residues in 20.
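The segment criteria listed in the legend to Fig. 3 can be expressed as a small test on the consensus line of an alignment. The Python sketch below is one possible reading of those criteria, not the procedure actually used to draw the boxes: the function name, the `has_gap` flag and the choice to count only asterisk columns as "conserved residues" are assumptions made for illustration.

```python
# Window sizes and minimum counts of '*' columns, as listed in the Fig. 3 legend.
WINDOW_RULES = ((10, 3), (15, 4), (20, 5))


def meets_segment_criteria(marks: str, has_gap: bool = False) -> bool:
    """Return True if a candidate segment satisfies the Fig. 3 legend criteria.

    marks: one character per alignment column, '*' where all sequences are identical.
    has_gap: True if any aligned sequence contains a gap inside the segment.
    """
    if has_gap or not marks:
        return False
    hits = [ch == "*" for ch in marks]
    # Criterion 1: two or more adjacent conserved residues.
    if any(a and b for a, b in zip(hits, hits[1:])):
        return True
    # Criteria 2 to 4: at least `needed` conserved residues within `window` columns.
    for window, needed in WINDOW_RULES:
        last_start = max(0, len(hits) - window)
        for start in range(last_start + 1):
            if sum(hits[start:start + window]) >= needed:
                return True
    return False
```

For example, `meets_segment_criteria("*..*..*...")` is true because three asterisks fall within 10 columns, whereas `meets_segment_criteria("*....*....")` is false. The sketch only accepts or rejects a candidate span; how segment boundaries were chosen is not specified in the legend and is left open here.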
Site-specific alteration of two cysteine residues in the zinc-finger-like sequence within conserved region IX also greatly reduced in vitro ssDNA binding (Gao et al., 1988; Gao & Knipe, 1989). Finally, a 56K fragment of ICP8 produced by limited proteolysis, the amino terminus of which is position 300, bound ssDNA in vitro (Wang & Hall, 1990). The latter authors also suggested a possible ssDNA-binding motif, which includes the segment identified here as conserved segment XI. Together, the above results indicate that the ssDNA-binding domain, or domains, resides within residues 348 to 1029, spanning the conserved segments VI to XII. As conserved residues of homologous proteins often form inter- and intramolecular contacts (e.g. Pabo et al., 1990), the results further imply that some subset of residues within one or more of these conserved segments form the ssDNA-binding site(s) and contact DNA.

A full understanding of the interaction between this ssDNA-binding protein and its nucleic acid substrate awaits determination of its three-dimensional structure in co-crystals. In the meantime, the identified conserved segments and other conserved residues suggest themselves as targets for site-specific mutagenesis in structure-function analyses of this multifunctional herpesvirus group-common protein. Carefully chosen substitutions might be expected to inactivate selected functions and thus have the potential to reveal important details of the protein's interaction with DNA and with other elements of the herpesvirus DNA replication apparatus.
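The enrichment of invariant residues in the conserved segments, which is what makes them attractive mutagenesis targets, can be rechecked from the counts reported in the text. The Python sketch below does this arithmetic. The length of 1160 residues for the shortest aligned sequence is an assumption (chosen because 130 residues is 11.2% of roughly that length), as is treating everything outside segments I to XII as the excluded sequence.

```python
conserved_in_segments = 51   # invariant residues inside segments I to XII (from the text)
conserved_total = 91         # invariant residues in the four-way alignment (from the text)
segment_residues = 130       # combined length of segments I to XII (from the text)
shortest_length = 1160       # ASSUMPTION: length of the shortest aligned sequence

# "Fraction identical / fraction total", as defined in the text.
fraction_identical = conserved_in_segments / conserved_total
fraction_total = segment_residues / shortest_length
enrichment_vs_whole = fraction_identical / fraction_total

# Density of invariant residues inside versus outside the segments.
density_inside = conserved_in_segments / segment_residues
density_outside = (conserved_total - conserved_in_segments) / (
    shortest_length - segment_residues
)
enrichment_vs_excluded = density_inside / density_outside

print(f"{fraction_identical:.1%} of invariant residues in {fraction_total:.1%} of the sequence")
print(f"enrichment relative to whole sequence: {enrichment_vs_whole:.1f}")
print(f"enrichment relative to excluded sequence: {enrichment_vs_excluded:.1f}")
```

Under these assumptions the figures come out at about 56% of invariant residues in 11.2% of the sequence, with roughly fivefold and tenfold enrichments, in line with the values quoted in the text.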
I thank Wade Gibson for making available virus strains and cells as well as for advice and encouragement, Louise Belensz and Suzanne Punturieri for technical assistance, Mark Chee for sharing sequence data before publication, Ivan Auger for help with sequence alignments and using GCG, and Paul Masters for helpful criticisms of the manuscript.

References

ANDERS, D. G. & GIBSON, W. (1988). Location, transcript analysis, and partial nucleotide sequence of the cytomegalovirus gene encoding an early DNA-binding protein with similarities to ICP8 of herpes simplex virus type 1. Journal of Virology 62, 1364-1372.
ANDERS, D. G., IRMIERE, A. & GIBSON, W. (1986). Identification and characterization of a major cytomegalovirus early DNA-binding protein. Journal of Virology 58, 253-262.
ANDERS, D. G., KIDD, J. R. & GIBSON, W. (1987). Immunological characterization of an early cytomegalovirus single-strand DNA-binding protein with similarities to the HSV major DNA-binding protein. Virology 161, 579-588.
BAER, R., BANKIER, A. T., BIGGIN, M. D., DEININGER, P. L., FARRELL, P. J., GIBSON, T. J., HATFULL, G., HUDSON, G. S., SATCHWELL, S. C., SÉGUIN, C., TUFFNELL, P. S. & BARRELL, B. G. (1984). DNA sequence and expression of the B95-8 Epstein-Barr virus genome. Nature, London 310, 207-211.
BERGET, S. M. (1984). Are U4 small nuclear ribonuclear proteins involved in polyadenylation? Nature, London 309, 179-182.
CHANG, Y.-N., CRAWFORD, S., STALL, J., RAWLINS, D. R., JEANG, K.-T. & HAYWARD, G. S. (1990). The palindromic series I repeats in the simian cytomegalovirus major immediate-early promoter behave as both strong basal enhancers and cyclic AMP response elements. Journal of Virology 64, 264-267.
CHEE, M., RUDOLPH, S.-A., PLACHTER, B., BARRELL, B. & JAHN, G. (1989). Identification of the major capsid protein gene of human cytomegalovirus. Journal of Virology 63, 1345-1353.
DAVISON, A. J. & SCOTT, J. E. (1986). The complete DNA sequence of varicella-zoster virus. Journal of General Virology 67, 1759-1816.
DEVEREUX, J., HAEBERLI, P. & SMITHIES, O. (1984). A comprehensive set of sequence analysis programs for the VAX. Nucleic Acids Research 12, 387-395.
GAO, M. & KNIPE, D. M. (1989). Genetic evidence for multiple nuclear functions of the herpes simplex virus ICP8 DNA-binding protein. Journal of Virology 63, 5258-5267.
GAO, M., BOUCHEY, J., CURTIN, K. & KNIPE, D. M. (1988). Genetic identification of a portion of the herpes simplex virus ICP8 protein required for DNA-binding. Virology 163, 319-329.
GIBSON, W. (1981). Structural and nonstructural proteins of strain Colburn cytomegalovirus. Virology 111, 516-537.
GIBSON, W. (1983). Protein counterparts of human and simian cytomegaloviruses. Virology 128, 391-406.
HENIKOFF, S. (1984). Unidirectional digestion with exonuclease III creates targeted breakpoints for DNA sequencing. Gene 28, 351-359.
HIGGINS, D. G. & SHARP, P. M. (1988). CLUSTAL: a package for performing multiple sequence alignments on a microcomputer. Gene 73, 237-244.
HUNNINGHAKE, G. W., MONICK, M. M., LIU, B. & STINSKI, M. F. (1989). The promoter-regulatory region of the major immediate-early gene of human cytomegalovirus responds to T-lymphocyte stimulation and contains functional cyclic AMP response elements. Journal of Virology 63, 3026-3033.
JEANG, K.-T. & HAYWARD, G. S. (1983). A cytomegalovirus DNA sequence containing tracts of tandemly repeated CA nucleotides hybridizes to highly repetitive dispersed elements in mammalian cell genomes. Molecular and Cellular Biology 3, 1389-1402.
KEMBLE, G. W., McCORMICK, A. L., PEREIRA, L. & MOCARSKI, E. S. (1987). A cytomegalovirus protein with properties of herpes simplex virus ICP8: partial purification of the polypeptide and map position of the gene. Journal of Virology 61, 3143-3151.
KOZAK, M. (1987). At least six nucleotides preceding the AUG initiator codon enhance translation in mammalian cells. Journal of Molecular Biology 196, 947-950.
LAFEMINA, R. L. & HAYWARD, G. S. (1980). Structural organization of the DNA molecules from human cytomegalovirus. In Animal Virus Genetics, vol. 28, pp. 39-55. Edited by B. N. Fields, R. Jaenisch and C. F. Fox. New York: Academic Press.
LEE, K. A. W., HAI, T.-S., SIVARAMAN, L., THIMMAPPAYA, B., HURST, H. C., JONES, N. C. & GREEN, M. R. (1987). A cellular protein, activating transcription factor, activates transcription of multiple E1A-inducible adenovirus early promoters. Proceedings of the National Academy of Sciences, U.S.A. 84, 8355-8359.
LEINBACH, S. S. & HEATH, L. S. (1988). A carboxyl-terminal peptide of the DNA-binding protein ICP8 of herpes simplex virus contains a single-stranded DNA-binding site. Virology 166, 10-16.
LEINBACH, S. S. & HEATH, L. S. (1989). Characterization of the single-stranded DNA-binding domain of the herpes simplex virus protein ICP8. Biochimica et biophysica acta 1008, 281-286.
McLAUCHLAN, J., GAFFNEY, D., WHITTON, J. L. & CLEMENTS, J. B. (1985). The consensus sequence YGTGTTYY located downstream from the AATAAA signal is required for efficient formation of mRNA 3' termini. Nucleic Acids Research 13, 1347-1368.
MONTMINY, M. R. & BILEZIKJIAN, L. M. (1987). Binding of a nuclear protein to the cyclic AMP response element of the somatostatin gene. Nature, London 328, 175-178.
MONTMINY, M. R., SEVARINO, K. A., WAGNER, J. A., MANDEL, G. & GOODMAN, R. H. (1986). Identification of a cyclic-AMP-responsive element within the rat somatostatin gene. Proceedings of the National Academy of Sciences, U.S.A. 83, 6682-6686.
PABO, C. O., AGGARWAL, A. K., JORDAN, S. R., BEAMER, L. J., OBEYSEKARE, U. R. & HARRISON, S. C. (1990). Conserved residues make similar contacts in two repressor-operator complexes. Science 247, 1210-1213.
PROUDFOOT, N. J. & BROWNLEE, G. G. (1976). 3' non-coding region sequences in eukaryotic mRNA. Nature, London 263, 211-214.
PRUIJN, G. J. M., VAN DRIEL, W., VAN MILTENBURG, R. T. & VAN DER VLIET, P. C. (1987). Promoter and enhancer elements containing a conserved sequence motif are recognized by nuclear factor III, a protein stimulating adenovirus DNA replication. EMBO Journal 6, 3771-3778.
QUINN, J. P. & McGEOCH, D. J. (1985). DNA sequence of the region in the genome of herpes simplex virus type 1 containing the genes for DNA polymerase and the major DNA binding protein. Nucleic Acids Research 13, 8143-8163.
RAFIELD, L. F. & KNIPE, D. M. (1984). Characterization of the major mRNAs transcribed from the genes for glycoprotein B and DNA-binding protein ICP8 of herpes simplex virus type 1. Journal of Virology 49, 960-969.
SANGER, F., NICKLEN, S. & COULSON, A. R. (1977). DNA sequencing with chain-terminating inhibitors. Proceedings of the National Academy of Sciences, U.S.A. 74, 5463-5467.
STAPRANS, S. J., RABERT, D. K. & SPECTOR, D. H. (1988). Identification of sequence requirements and trans-acting functions necessary for regulated expression of a human cytomegalovirus early gene. Journal of Virology 62, 3463-3473.
SU, L. & KNIPE, D. M. (1987). Mapping of the transcriptional initiation site of the herpes simplex virus type 1 ICP8 gene in infected and transfected cells. Journal of Virology 61, 615-620.
WANG, Y. & HALL, J. D. (1990). Characterization of a major DNA-binding domain in the herpes simplex virus type 1 DNA-binding protein (ICP8). Journal of Virology 64, 2028-2089.
(Received 17 April 1990; Accepted 25 June 1990)
