THE JOURNAL OF BIOLOGICAL CHEMISTRY, Vol. 278, No. 22, Issue of May 30, pp. 19649–19659, 2003
© 2003 by The American Society for Biochemistry and Molecular Biology, Inc. Printed in U.S.A.

Identification of a cis-Element That Determines Autonomous DNA Replication in Eukaryotic Cells*

Received for publication, July 12, 2002, and in revised form, March 12, 2003
Published, JBC Papers in Press, March 27, 2003, DOI 10.1074/jbc.M207002200

Gerald B. Price‡§, Minna Allarakhia‡¶, Nandini Cossons‖, Torsten Nielsen**‡‡, Maria Diaz-Perez‡, Paula Friedlander‡, Liang Tao‡, and Maria Zannis-Hadjopoulos‡

From the ‡McGill Cancer Centre, McGill University, Montreal, Quebec H3G 1Y6, the ‖Department of Internal Medicine, the Ottawa Hospital-General Campus, Ottawa, Ontario K1H 8L6, and the **Faculty of Medicine, University of British Columbia, Vancouver, British Columbia V6T 1Z3, Canada

A 36-bp human consensus sequence (CCTMDAWKSGBYTSMAAWTWBCMYTTRSCAAATTCC) is capable of supporting autonomous replication of a plasmid after transfection into eukaryotic cells. After transfection and in vitro DNA replication, replicated plasmid DNA containing a mixture of oligonucleotides of this consensus was found to reiterate the consensus. Initiation of DNA replication in vitro occurs within the consensus. One version, A3/4, in pYACneo, could be maintained under selection in HeLa cells, unrearranged and replicating continuously for >170 cell doublings. Stability of plasmid without selection was high (>0.9/cell/generation). Homologs of the consensus are found consistently at mammalian chromosomal sites of initiation and within CpG islands. Versions of the consensus function as origins of DNA replication in normal and malignant human cells, immortalized monkey and mouse cells, and normal cow, chicken, and fruit fly cells. Random mutagenesis studies suggest an internal 20-bp consensus sequence of the 36 bp may be sufficient to act as a core origin element.
This cis-element consensus sequence is an opportunity for focused analyses of core origin elements and the regulation of initiation of DNA replication.

* This work was supported in part by grants from the Cancer Research Society (to G. B. P.), the Canadian Institutes of Health Research (to M. Z.-H.), and REPLICor, Inc. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
§ To whom correspondence should be addressed: McGill Cancer Centre, McGill University, 3655 Sir William Osler Promenade, Montreal, Quebec H3G 1Y6, Canada.
¶ Recipient of a Fonds pour la Formation de Chercheurs et l’Aide à Recherche Centre Studentship and Canadian Institutes of Health Research doctoral research studentship.
‡‡ Recipient of a Canadian Institutes of Health Research studentship.
¹ The abbreviations used are: ARS, autonomous replicating sequence; BrdUrd, bromodeoxyuridine; HH, heavy-heavy; HL, heavy-light; LL, light-light; OBA, origin binding activity; ors, origin-enriched sequence(s).
This paper is available on line at http://www.jbc.org

A key to the development of our knowledge about yeast replication origins was the autonomous replicating sequence (ARS)¹ assay; genomic fragments cloned into prokaryotic vectors were found to function as yeast replication origins. ARS plasmids transform yeast at a high frequency, replicate autonomously, and can be maintained in vivo as episomal genetic elements (1). These constructs, however, are lost without selective pressure because of imperfect partition and can integrate into the genome during long term culture. ARS plasmids were invaluable in identifying and defining replication origins in yeast (2, 3). Progress in understanding other eukaryotic DNA replication, particularly in mammalian cells, has been slower (for review, see Refs. 4 and 5).
Our studies with small fragments of DNA that can support autonomous replication of a plasmid in mammalian cells (6–10) encouraged us to look further for putative replicator sequences. We report here the identification and testing of a putative consensus sequence that will aid in identification of initiation sites (origins) of DNA replication in mammalian and higher eukaryotic cells. We used four mammalian autonomously replicating sequences containing α-satellite sequence and a reiterative process between pairs of African green monkey and human sequences to minimize derivation of an α-satellite consensus. The resultant consensus sequence was 36 bp.

EXPERIMENTAL PROCEDURES

Cloning of a Mixture of Oligonucleotides—A mixture of oligonucleotides was generated using a putative consensus sequence. The oligonucleotides were designed to contain effective primers (T3, T7, and M13 reverse primers) for PCR amplification (e.g. CATTAACCCTCACTAAAGGGAACAAAAGCTGGGTACC-consensus sequence-TGAGCTCCAATTCACTGGCCGTCGTTTTAC). After PCR amplification, the products were cloned into the SrfI site of pCRscript (Stratagene) for individual variant analysis. A fraction of the ligation reaction mixture was used to transform bacteria (see below) and suggested that more than 100,000 independent clones were represented. To ascertain the minimal essential sequence, a functional assay for “origin” activity was employed. A portion of the ligation mixture was subjected to replication after transfection into HeLa cells (9) and in an in vitro replication system (9, 11). After digestion of the reaction products with DpnI endonuclease to remove unreplicated pCRscript clones (9), the DpnI-resistant DNA was transfected into bacteria, and individual clones were isolated and sequenced to ascertain whether they contained consensus sequences.

CpG Island Clones—Two clones (CP9, HS14C3R Locus; and 6K, HS8F7F Locus) were obtained from the UK HGMP Resource Centre, Cambridge, UK.
The sequence homolog in each clone is CATCGAAGCGCTTGAAATCTCCACTTACAAATTCC for CP9 and CCTCAAAGCGCTTGAAAATCTCCACTTGCAAATTCC for 6K. The CpG clones are in the vector pGEM-5Zf(−).

Plasmid DNA—All plasmid DNA clones were propagated in bacteria in LB medium containing 100 µg/ml ampicillin, and large scale amounts of supercoiled plasmid DNA, essential for autonomous replication assays in vivo and in vitro, were prepared using the Qiagen tip 500 columns according to the manufacturer’s specifications (Qiagen).

Cell Culture and Transfection—HeLa cells were obtained from the American Type Culture Collection and cultured in Alpha-modified Eagle’s medium (Invitrogen) supplemented with 10% fetal bovine serum (Flow Laboratories). The cultures were maintained in a 37 °C incubator containing an atmosphere of 10% CO2 + air. All normal primary cells (WI38 human embryo lung fibroblasts, bovine embryo kidney fibroblasts, and chicken embryo fibroblasts) were obtained from BioWhittaker and maintained in culture as described for HeLa cells. Drosophila S2 cells were maintained in Schneider’s Drosophila medium (Invitrogen) with 10% heat-inactivated fetal calf serum supplemented with glutamine, asparagine, and penicillin/streptomycin. The cells were sealed in Nunc tissue culture flasks and incubated at room temperature in the dark. Transfections were carried out as described previously (12, 13). Cells were cultured at 1 × 10^4 cells/cm² in T25 tissue culture flasks (Nunclon) overnight before transfection with 5 µg of supercoiled plasmid DNA by the calcium coprecipitation method (14). After transfection, the cells were grown for 24 h in medium containing bromodeoxyuridine (BrdUrd), as described previously (7, 12, 15). Plasmids were recovered by Hirt lysis (16), loaded onto CsCl gradients (initial refractive index 1.408), and centrifuged as described previously (7, 12).
An aliquot of each fraction was either dot- or slot-blotted onto a GeneScreen Plus membrane (PerkinElmer Life Sciences), hybridized to 32P-labeled vector (pCRscript or pBluescript) DNA, exposed to an imaging plate, and quantified by densitometry performed using a PhosphorImager (Fuji BAS 2000).

In some cases, episomal DNA was recovered by Hirt lysis 3 days after HeLa cells were cotransfected with plasmids containing various versions of the consensus sequence and an expression plasmid carrying the luciferase gene, pRSVLUC (17). The low molecular weight DNA was digested with DpnI and then used to transform the DH5α strain of Escherichia coli in a bacterial retransformation assay, as described previously (7, 9, 10). Some of the transfected HeLa cells were used to determine variations in cell density and efficiency of transfection, by measuring levels of luciferase as described previously (17). The levels of luciferase were used to normalize the transformed bacterial colonies detected on LB agar plates containing 100 µg/ml ampicillin. In some cases, we also used pCMV/β-galactosidase (Applied Biosystems) and a β-galactosidase assay kit (Invitrogen).

In Vitro DNA Replication—The cell-free replication assay was adapted from the method described previously (11) and as performed previously (18). The earliest labeled fragment method was performed using the in vitro DNA replication system, as described previously (11, 18). In brief, the in vitro reactions were stopped at 4 and 8 min of incubation, and the DNA products were digested with DdeI and PvuII and then separated on a 1.5% agarose gel in 1× TAE buffer. The gel was dried and exposed to a PhosphorImaging plate. Incorporation of [α-32P]dCTP and [α-32P]dTTP into each fragment was quantitated by densitometry of a PhosphorImager screen using the Fuji BAS 2000 analyzer and expressed as incorporation/kb of DNA.
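The normalization used in the earliest labeled fragment method (incorporation expressed per kb of DNA) can be illustrated with a minimal sketch. The fragment names, sizes, and counts below are hypothetical values chosen only for illustration and are not data from this study; the function names are likewise assumptions.

```python
from typing import Dict, Tuple


def incorporation_per_kb(counts: float, fragment_bp: int) -> float:
    """Normalize label incorporation by fragment length (signal per kb)."""
    if fragment_bp <= 0:
        raise ValueError("fragment length must be positive")
    return counts / (fragment_bp / 1000.0)


def earliest_labeled_fragment(fragments: Dict[str, Tuple[float, int]]) -> str:
    """Return the fragment with the highest incorporation per kb.

    `fragments` maps a fragment name to (densitometry counts, length in bp).
    """
    return max(fragments, key=lambda name: incorporation_per_kb(*fragments[name]))


# Hypothetical densitometry readings (arbitrary units) for three fragments.
example_fragments = {
    "frag_260bp": (520.0, 260),
    "frag_1000bp": (900.0, 1000),
    "frag_2500bp": (1500.0, 2500),
}

for name, (counts, size) in example_fragments.items():
    print(name, incorporation_per_kb(counts, size))
print("highest incorporation/kb:", earliest_labeled_fragment(example_fragments))
```

Dividing by length matters because a long fragment can carry more total label than a short one even when the short fragment is labeled more densely, as in the made-up numbers above.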
Stability of pYACneo Constructs Containing Origins and the A3/4 Consensus Sequence—After transfection of pYACneo (Clontech) constructs, including pYACneo with the A3/4 insert placed at the EcoRI site, clones of HeLa cells that were resistant to G418 were maintained in continuous culture. A fluctuation assay, as described previously (10), was performed upon six independent HeLa cell clones maintained for more than 40 cell doublings in medium containing 400 µg/ml G418.

Mutagenesis of A3/4 Version of the Consensus Sequence—The GeneMorph PCR mutagenesis kit (Stratagene) was used according to the manufacturer’s instructions to introduce random mutations in the 36-bp consensus sequence known as A3/4. One of the original clones containing A3/4 in pCRscript was used with T3 and M13 universal primers to prepare the product for ligation into pCRscript at the SmaI site. After transformation of DH5α competent cells (Invitrogen), numerous colonies were isolated and sequenced using the T7 primer in an ABI Prism 3700 DNA Analyzer (PerkinElmer Life Sciences). Among more than 100 clones examined, we found 38 variants with one or more mutations in the 36-bp region comprising A3/4 (see Table VI). After identification of these 38 variants, plasmid preparations were made using Qiagen HiSpeed Mini or Midi Kits. Then, an equimolar pool that contained DNA from the 38 variants plus A3/4 was used to transfect HeLa cells, as described above. After isolation of the low molecular weight DNA fraction by Hirt lysis, the DNA was digested with DpnI to remove unreplicated DNA. The digested DNA pool was then used to transform competent bacterial cells, and colonies containing replicated plasmid DNA were isolated. The sequence of plasmid DNA from 60 such clones was obtained using an ABI Prism 3700 DNA Analyzer (PerkinElmer Life Sciences) (see Table VI).
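The size of the degenerate oligonucleotide pool described under “Cloning of a Mixture of Oligonucleotides” can be estimated by multiplying the number of bases allowed at each position of the 36-bp consensus. The sketch below assumes the standard IUPAC nucleotide code and that each degenerate position varies independently; the function name is an assumption introduced here.

```python
from math import prod

# Standard IUPAC nucleotide code: symbol -> allowed bases.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "M": "AC", "R": "AG", "W": "AT", "S": "CG", "Y": "CT", "K": "GT",
    "V": "ACG", "H": "ACT", "D": "AGT", "B": "CGT", "N": "ACGT",
}

CONSENSUS = "CCTMDAWKSGBYTSMAAWTWBCMYTTRSCAAATTCC"


def degeneracy(sequence: str) -> int:
    """Number of distinct plain sequences encoded by a degenerate sequence."""
    return prod(len(IUPAC[symbol]) for symbol in sequence)


# 13 twofold and 3 threefold positions: 2**13 * 3**3 = 221184 variants.
print(len(CONSENSUS), degeneracy(CONSENSUS))
```

Under these assumptions the consensus encodes 221,184 possible versions, the same order of magnitude as the more than 100,000 independent clones estimated for the ligation pool.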
RESULTS

Consensus Sequence Derivation—A consensus sequence was derived from autonomously replicating sequences associated with α-satellite sequences that had been isolated previously from African green monkey CV-1 cells (ors14 and ors23) (7) and from autonomously replicating DNA associated with α-satellite sequences (F5 and F20) obtained as anticruciform antibody affinity-purified DNA from normal human skin fibroblasts (9). We used a reiterative process between pairs of African green monkey and human sequences to minimize derivation of an α-satellite consensus.

TABLE I
Recovery of autonomously replicated sequences

Name   No. timesᵇ   Sequence recoveredᵃ
A3/4   5            CCTCAAATGGTCTCCAATTTTCCTTTGGCAAATTCC
A6     3            CCTAAATTGGTCTGCAAATTGCATTTAGCAAATTCC
A7     2            CCTAGATTGGCTTGAAATTTTCCCTTACCAAATTCC
A15    2            CCTCAATTGGTTTCCAATCAGCATTTAGCAAATTCC
A16    2            CCTCGATGGGTTTGCAAATTCCCCTTAGCAAATTCC
A1     1            CCTAGAAGCGGTTCCAATTTGCATTTAGCAAATTCC
A5     1            CCTCAATTGGTTTCCAAATATCACTTGGCAAATTCC
A39    1            CCTCTAATGGGTTGCAATCTGCATTTAGCAAATTCC

ᵃ 36-bp consensus sequence recovered after autonomous replication in HeLa cells.
ᵇ No. of times the sequence was detected from 17 isolates.

TABLE II
Fisher’s exact test of autonomously replicating sequences
Results are p = 0.000153.

                          Replicatedᶜ   Random pickᵈ   Total
No. with one duplicateᵃ   14            0              14
No. with no duplicateᵇ    3             8              11
Total                     17            8              25

ᵃ No. of consensus sequence clones recovered which have at least one identical clone in the assessed population.
ᵇ No. of consensus sequence clones recovered which have no other example in the assessed population.
ᶜ Population of consensus sequence clones that were recovered as having been replicated in HeLa cells.
ᵈ Population of consensus sequence clones that were randomly picked from pool of clones generated from degenerate oligonucleotide mixture.
We compared ors14 with F5 and F20, and ors23 with F5, using PILEUP (GCG software) to identify regions useful for generating a consensus sequence, using CONSENSUS with a certainty level of 75%. Using these four sequences and minimizing α-satellite repetitive sequence, we derived a 36-bp consensus: CCTMDAWKSGBYTSMAAWTWBCMYTTRSCAAATTCC.²

² Further details of sequence and method to generate the consensus are available upon request. Nucleotide code: M = A or C; D = A, G, or T; W = A or T; K = G or T; S = C or G; B = C, G, or T; Y = C or T; R = A or G; H = A, C, or T; V = A, C, or G; N = A, C, G, or T.

TABLE III
Examples of CpG island DNA homology to the consensus sequence

Locusᵃ       Accession no.ᵃ   Nucleotideᵇ   Lengthᶜ   Gapsᶜ   Homologyᶜ (%)
HS8F7Fᵈ      Z66331           97–132        36        0       89
HS43D5F      Z61072           97–132        36        0       89
HS171C10F    Z57320           96–130        35        0       89
HS71E12R     Z62676           93–128        36        0       83
HS17D2f      Z54973           2–25          24        0       83
HS77C2R      Z63037           90–123        34        0       76
HS14C3Rᵈ     Z59323           80–114        35        1       97
HS36D10R     Z60841           96–130        35        1       97
HS30G4R      Z58181           95–129        35        1       100
HS8F11R      Z63768           95–129        35        1       100
HS12B11F     Z56579           97–131        35        1       100
HS28C6R      Z55240           95–129        35        1       100
HS18H3F      Z57696           97–131        35        1       100
HS90G9F      Z63828           97–131        35        1       100
HS37A8R      Z55373           95–129        35        1       97
HS173D8R     Z64869           76–130        35        1       97

ᵃ Locus and accession no. from GenBank for CpG island sequences from Cross et al. (30).
ᵇ Nucleotide position of homologous sequence.
ᶜ Length of homologous sequence, no. of gaps in homologous sequence, and percent homology to the consensus sequence.
ᵈ HS8F7F corresponds to 6K, and HS14C3R corresponds to CP9.

TABLE IV
Comparison of the autonomous replication activity of different versions of the consensus sequence in different species
Autonomous replication activity was determined either by BrdUrd incorporation or by bacterial retransformation, as described under “Results” and “Experimental Procedures.” Cells used were: human (HeLa and WI38); avian (chicken embryo fibroblasts; CEF); bovine embryo kidney fibroblasts (BEKF); and murine (mouse 3T3).

         BrdUrd incorporation                           Bacterial retransformation
Clone    Human (HeLa)   Avian (CEF)   Bovine (BEKF)    Human (WI38)   Murine (3T3)
CP9      +              +             NDᵃ              +              +
6K       −              −             +                −              +
A16      +              +             ND               +              +
A3/4     +              ND            ND               +              +
30.4     −              −             −                −              −

ᵃ ND, not determined.

Recovery of Autonomously Replicating Sequences—After synthesis of oligonucleotides containing T3, T7, and M13 reverse primers bracketing the consensus sequence, the mixed pool of oligonucleotides was amplified by PCR using the primers and ligated into pCRscript using the SrfI restriction site. The ligation pool was used in transfection of HeLa cells and as a template in an in vitro DNA replication system using HeLa cell extracts as a source of replication proteins (11). (Previously, we have shown that in this in vitro replication system, initiation is site-specific and maps to the same site as in vivo (8, 11, 19).) To eliminate unreplicated DNA, the DNA recovered after transfection into HeLa cells or in vitro replication was digested with DpnI. Then, the pool of DNA products from both replication systems, presumably containing some replicated DpnI-resistant plasmid + consensus inserts, was used to transform competent bacteria and obtain bacterial clones of versions of the consensus sequence capable of autonomous replication. All bacterial clones recovered after selection with ampicillin were found to contain plasmid constructs as identified by agarose gel electrophoresis of plasmid DNA preparations. Seventeen independent clones were sequenced and shown to contain various versions of the consensus with appropriate flanking sequence. Table I summarizes the consensus sequences recovered. Among the 17 clones, some versions of the consensus were represented multiple times; the A3/4 version of the consensus sequence was represented 5 times in this group of 17 clones.
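Whether a given sequence is a version of the degenerate consensus can be checked mechanically against the nucleotide code in Footnote 2. The following is a minimal sketch, not the procedure used in this study: it assumes ungapped, position-by-position comparison, counts a base as matching when the degenerate symbol allows it, and uses helper names introduced here for illustration. The A3/4 sequence is taken from Table I and the 6K homolog from “CpG Island Clones.”

```python
import re

# Standard IUPAC nucleotide code: symbol -> allowed bases.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "M": "AC", "R": "AG", "W": "AT", "S": "CG", "Y": "CT", "K": "GT",
    "V": "ACG", "H": "ACT", "D": "AGT", "B": "CGT", "N": "ACGT",
}

CONSENSUS = "CCTMDAWKSGBYTSMAAWTWBCMYTTRSCAAATTCC"
A3_4 = "CCTCAAATGGTCTCCAATTTTCCTTTGGCAAATTCC"   # Table I
SIX_K = "CCTCAAAGCGCTTGAAAATCTCCACTTGCAAATTCC"  # 6K homolog (HS8F7F)


def consensus_regex(consensus: str) -> "re.Pattern[str]":
    """Translate a degenerate sequence into a regular expression."""
    return re.compile("".join(f"[{IUPAC[s]}]" for s in consensus))


def mismatch_positions(consensus: str, seq: str) -> list:
    """1-based positions where `seq` carries a base the consensus disallows."""
    if len(consensus) != len(seq):
        raise ValueError("ungapped comparison needs equal lengths")
    return [i for i, (s, b) in enumerate(zip(consensus, seq), start=1)
            if b not in IUPAC[s]]


def percent_identity(consensus: str, seq: str) -> float:
    """Percentage of positions compatible with the consensus."""
    return 100.0 * (len(seq) - len(mismatch_positions(consensus, seq))) / len(seq)


print(bool(consensus_regex(CONSENSUS).fullmatch(A3_4)))
print(mismatch_positions(CONSENSUS, SIX_K), percent_identity(CONSENSUS, SIX_K))
```

With this scoring, A3/4 is a full match, and the 6K homolog differs at 4 of 36 positions (32/36, about 89%), which agrees with the value listed for HS8F7F in Table III.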
When these 17 sequences were compiled to generate another consensus sequence, we derived the same 36-bp consensus used to generate the oligonucleotide mixture used in this study. Only 3 clones, A1, A5, and A39 (Table I), were uniquely represented in the group of 17. We next asked whether the assortment of the 17 sequences could be considered as different from randomly selected clones, i.e. without selection by replication in our in vitro system. We randomly picked and sequenced 8 clones transformed with an aliquot of the same ligation pool, but not subjected to replication in vitro. All 8 clones were found to be unique versions with no multiple representations of any single version. As shown by Fisher’s exact test of autonomously replicating sequences (Table II), it is unlikely (p = 0.000153) that the assortment of the 17 independent clones recovered after replication in the in vitro system could be attributed to random sampling of the available plasmid + consensus insert versions. In other words, the library from which autonomously replicating clones were obtained was not limited and did not have any apparent overrepresentation of the replicating clones.

Occurrence of Homologs in Data Base Sequences—A search for homologs in the GenBank data base revealed significant similarity to a number of sequences, including some in regions wherein origins of DNA replication have been mapped. None of the sequences within such regions were 100% similar to the 36-bp consensus. Homologs varied in similarity from 71 to 88% over 21 to 35 bp in initiation regions for c-myc (20), lamin B2 (21), NOA3 (22), β-globin (23), IgM chain enhancer (24), heat shock protein 70 (25), the Chinese hamster ovary dhfr (26), and the rodent RPS14 (27). In vivo footprinting of the lamin B2 origin region revealed that an area of 70 bp was protected on one strand (28).
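The p value quoted for Table II can be checked directly from the hypergeometric distribution. The sketch below is an independent recalculation under the usual two-sided definition (summing all tables with the same margins that are no more probable than the observed one); it is not necessarily the software or convention the authors used, and the function names are assumptions.

```python
from math import comb


def table_probability(a: int, b: int, c: int, d: int) -> float:
    """Hypergeometric probability of the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    return comb(a + b, a) * comb(c + d, c) / comb(n, a + c)


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test p value for [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    p_observed = table_probability(a, b, c, d)
    total = 0.0
    # Enumerate every table with the same row and column totals.
    for x in range(max(0, col1 - row2), min(row1, col1) + 1):
        p = table_probability(x, row1 - x, col1 - x, row2 - (col1 - x))
        if p <= p_observed * (1 + 1e-9):
            total += p
    return total


# Rows: replicated clones, randomly picked clones.
# Columns: with at least one duplicate, with no duplicate (Table II).
p_value = fisher_exact_two_sided(14, 3, 0, 8)
print(f"p = {p_value:.6f}")
```

For these counts the observed table is the most extreme one compatible with the margins, so the one-sided and two-sided values coincide at 165/1,081,575, or about 0.000153.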
More recently, a 79% homology match over 24 bp of the 36-bp consensus sequence was observed at the human lamin B2 origin site and was mapped to the 5′→3′ strand at a predicted bidirectional start site, position 3933 (29). Among the most interesting homologs were those present in sequences isolated and characterized as CpG islands (30). CpG islands are regions of about 1 kb that are GC-rich (65%) and occur in association with promoters of about 50% of all mammalian genes (31, 32). Replication origins have also been detected at several promoters, including those for the c-myc gene (20, 33, 34), the Hsp70 gene (25), the ppv1 gene at the 3′-end of lamin B2 (21), and rat aldolase B gene (35). Based upon this background, Delgado et al. (36) showed that CpG islands are initiation sites for both transcription and DNA replication. Over 36 bp, several sequences, identified as CpG islands (30), had 83 and 89% homology; and with the allowance of a single base gap over 35 bp, several CpG island sequences contained homologs of 100% similarity (Table III). A bidirectional origin of replication was mapped 3′ to the chicken lysozyme locus within a CpG island (37). Analysis of the 600-bp region within which the initiation site resides showed a 70% homology to the consensus over 24 bp.

Autonomous Replicating Activity of Consensus Versions—Using the BrdUrd semiconservative replication assay, as described previously (6, 7, 38), we examined a number of plasmid clones for autonomously replicating activity after transfection into HeLa cells, bovine embryo kidney fibroblasts, and chicken embryo fibroblasts (Table IV). After one round of semiconservative replication, the DNA product should be of heavy-light (HL) hybrid density, whereas two or more rounds of replication should yield fully substituted DNA of heavy-heavy (HH) strands. Plasmids containing the A3/4 and A16 versions of the consensus sequence (for sequence, see Table I) exhibited efficient autonomous semiconservative replication (Fig. 1). For both plasmid clones shown, a peak of unreplicated (LL) DNA was recovered at the top of each gradient (Fig. 1A, region including fractions 22–24; and Fig. 1B, region including fractions 19–24). Additional peaks of replicated HH DNA were also obtained, indicating two or more rounds of replication (Fig. 1, A and B), with some potential HL DNA in Fig. 1A. The linearity of each gradient was verified by measuring the refractive index of every other fraction (Fig. 1, A and B). As a negative control, the plasmid vector (pCRscript) alone was transfected in separate flasks, for which only LL DNA was recovered (data not shown; examples of negative controls can also be found in Refs. 6, 7, 12, and 38 and in Fig. 4, A and C, below).

FIG. 1. Semiconservative autonomous replication of consensus clone A3/4 (A) and consensus clone A16 (B) in HeLa cells. The relative DNA content (left ordinate and bars) in individual CsCl fractions (abscissa) as assessed by Southern dot blot analysis (below abscissa) is shown. The refractive index (right ordinate, diamonds), linearity of the gradient, and regions corresponding to HH, double-stranded substitution, HL, single-stranded substitution, and LL input DNA are indicated. DNA detected between regions is partially replicated, BrdUrd-substituted LL and HL DNA.

To confirm that the most frequently represented version of the consensus sequence could indeed initiate DNA replication, we performed an earliest labeled fragment assay on the in vitro replication (19) of a plasmid containing the A3/4 sequence. As shown in Fig. 2, the highest amount of incorporation of radioactive nucleotides appeared to occur within a 260-bp fragment, indicating that initiation occurred in that fragment. This fragment contains the 36-bp A3/4 consensus version plus the flanking sequence (consensus + primers and restriction enzyme sites is 104 bp total). Vector alone consistently shows no preferential initiation site(s), with any visible incorporation attributable to random repair of damage sites in the template vector (11, 18).

FIG. 2. Earliest labeled fragment analysis of consensus clone A3/4. Solid bars, 4 min, and gray bars, 8 min of reaction time in the in vitro DNA replication system. Restriction fragment size (in kb) is indicated on the abscissa with the restriction map for DdeI (D) and PvuII (P) digests. The asterisk indicates the location of the fragment containing the A3/4 consensus sequence. Relative incorporation/kb is indicated on the ordinate.

Two clones of CpG islands (30), CP9 (HS14C3R locus) and 6K (HS8F7F locus), which contain versions of the consensus sequence, were also tested for autonomous replicating activity by the semiconservative BrdUrd incorporation assay after transfection into HeLa cells (Fig. 3). Only CP9 (Fig. 3A) readily supported autonomous replication after transfection into HeLa cells, with high amounts of both HH and HL replicated plasmid DNA, whereas clone 6K (Fig. 3B) was incapable of autonomous replication because all plasmid was recovered as unreplicated (LL).

FIG. 3. Semiconservative autonomous replication of CpG island clone CP9 (A) and CpG island clone 6K (B) in HeLa cells. The ordinate is the percent relative DNA content where 100% is taken as the highest density value from Southern dot-blots from the fractions. See also the Fig. 1 legend.

We also tested whether versions of the consensus sequence might support autonomous replication of plasmid DNA after transfection into eukaryotic cells of other species. Bovine embryo kidney fibroblasts and chicken embryo fibroblasts were transfected with plasmid DNA containing different versions of the consensus sequence (Fig. 4). A negative control plasmid (clone 30.4; Fig. 4A) and the CpG island clone 6K containing the consensus (Fig. 4B) were tested for autonomous replication activity in bovine embryo kidney fibroblasts. The 6K plasmid clone showed strong autonomous replication activity with high amounts of both HL and HH DNA being recovered (Fig. 4B) relative to plasmid clone 30.4 (Fig. 4A), in which the majority of DNA was recovered as unreplicated (LL). In Fig. 4, C and D, clone 30.4 and the CP9 consensus clone, respectively, were tested for autonomous replication activity in chicken embryo fibroblasts. Again, in these cells, clone 30.4 exhibited a very low (background) level of replication (Fig. 4C), whereas clone CP9 exhibited efficient autonomous replicating activity, with large amounts of HH and HL DNA being recovered (Fig. 4D). Finally, the autonomously replicating activity of clone A16 (see Table I) in chicken embryo fibroblasts (Fig. 5A) was compared with that of clone 6K (Fig. 5B). Although clone A16 was able to replicate autonomously in chicken embryo fibroblasts, the activity of clone 6K in these cells was very low.

We next assessed the ability of the different versions of the consensus sequence (i.e. A3/4, 6K, A16, and CP9) to replicate in normal human cells (WI38 embryo lung fibroblasts) and in immortal murine fibroblasts (3T3 cells). For this, we used the DpnI resistance assay to detect plasmid DNA replicated in mammalian cells, which lack a deoxyadenosine methylase (dam) gene, making the replicated DNA resistant to digestion with DpnI. After low molecular weight DNA preparations from Hirt lysates of transfected cells were digested with DpnI, the DNA was used to transform bacteria, as an indicator of autonomous replication potential as previously described (10). As shown in Fig. 6, both mouse 3T3 cells and normal human (WI38) cells supported the replication of the consensus sequence variants, compared with a negative control plasmid, 30.4. The most efficient replication was observed with the A3/4 consensus version plasmid in both mouse and human cells.
6K also demonstrated autonomous replicating activity in 3T3 cells, but not in human cells, consistent with the results obtained after its transfection into HeLa cells, using the semiconservative assay for autonomous replication (Fig. 3B). A16 and CP9 also replicated autonomously in both mouse and human cells, albeit with much lower efficiency than A3/4 or 6K (Fig. 6); both A16 and CP9 replicated with higher efficiency in mouse 3T3 cells than in human (WI38) cells. Table IV summarizes these results.

Finally, a double-stranded oligonucleotide of 40 bp (TTTTTTTTTTCCAATGATTTGTAATATACATTTTATGACT), spanning the region inclusive of the lamin B2 origin and start site (29) with homology to the consensus sequence (see “Occurrence of Homologs in Data Base Sequences”), was cloned into pBluescript II. This plasmid was then tested for its ability to support autonomous replication in HeLa cells by the DpnI resistance bacterial retransformation assay, as described previously (7, 9, 10). In preliminary experiments, the sequence inclusive of the lamin B2 start site (107 ± 36 colonies/plate, mean ± S.D. of three plates) was found to support autonomous replication as efficiently as did the 36-bp A3/4 consensus version cloned into pBluescript II (70 ± 17 colonies/plate; background using a plasmid without consensus or lamin B2 sequence was 7 ± 2 colonies/plate).

Stability of pYACneo Constructs Containing the A3/4 Consensus Sequence—The A3/4 version of the consensus sequence was subcloned from the pCRscript clone into the EcoRI restriction site of pYACneo. After transfection into HeLa cells, independent clones were selected with G418 and maintained continuously in culture in the presence of 400 µg/ml G418, as described previously (10). After >170 cell doublings, one of the clones was labeled with BrdUrd and then low molecular weight episomal DNA was recovered.
The DNA was loaded onto a CsCl gradient, and fractions were collected, blotted, and hybridized with pYACneo containing the A3/4 insert. As shown in Fig. 7, there is an absence of the usual high amount of unreplicated (LL) DNA present in short term (2–3 days after transfection) assays. There are additional peaks of replicated, HL and HH DNA, indicative of continuing efficient semiconservative replication of this episome in the HeLa cells. As before, the linearity of the gradient was verified by measuring the refractive index of every other fraction (Fig. 7). As a negative control, the pYACneo vector alone was transfected and monitored in parallel in separate flasks, for which only LL DNA was recovered (data not shown).

Table V summarizes the results; for comparison, the data from previous fluctuation tests of short mammalian origin sequences maintained as HeLa episomes are also shown. HeLa cells transfected with pYACneo alone yielded stable cell clones (three of three) that had integrated the plasmid into the genomic DNA. pYACneo was not observed to be maintained as an episome. As can be seen for all six independent clones, there was no integration of plasmid and the pYACneo + A3/4 construct (A3/4 in pYACneo) was maintained as an episome. Furthermore, the stability of the episome in the absence of selection was found to be ~0.9/cell/generation compared with the nonepisomally maintained (integrated) plasmids that had a stability of 1.0/cell/generation. Low molecular weight episomal DNA was used to obtain bacterial transformants; the plasmid DNA recovered in three independent clones was tested with several different restriction enzymes to indicate any apparent rearrangements. For example, digests of DNA with AvaI and HindIII enzymes gave the predicted fragment size of DNA from each of three independent plasmid clones recovered from two of the six independent HeLa cell clones that contain only nonintegrated episomal pYACneo + A3/4 DNA (Fig. 8 and Table V).
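To give a sense of what a stability of about 0.9/cell/generation implies, the following sketch assumes a simple model in which a constant fraction of cells retains the episome at each generation in the absence of selection. This geometric model and the function names are assumptions made for illustration; they are not the fluctuation assay calculation of Ref. 10.

```python
def retained_fraction(stability: float, generations: int) -> float:
    """Expected fraction of cells still carrying the plasmid.

    Assumes a constant per-generation retention probability `stability`.
    """
    if not 0.0 <= stability <= 1.0:
        raise ValueError("stability must lie between 0 and 1")
    return stability ** generations


def per_generation_stability(retained: float, generations: int) -> float:
    """Invert the model: per-generation stability from an observed fraction."""
    if generations <= 0:
        raise ValueError("generations must be positive")
    return retained ** (1.0 / generations)


for n in (1, 5, 10, 20):
    print(n, round(retained_fraction(0.9, n), 4))
```

Under this assumption an episome with a stability of 0.9 would be retained by roughly 35% of cells after 10 generations without selection, whereas an integrated plasmid with a stability of 1.0 would be retained by all of them.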
Autonomous Replication of Consensus Sequence Containing Plasmids in Drosophila Cells—The apparent activity of versions of the consensus sequence across many species, including the taxonomic classes Mammalia and Aves (see Table IV), caused us to wonder whether versions of the consensus sequence might be active across the phyla Chordata and Arthropoda (e.g. Drosophila melanogaster, an invertebrate). Because homology (66–73%) to the consensus sequence was detected in Drosophila DNA, we tested the ability of A3/4 to support autonomous replication in D. melanogaster cells by the semiconservative BrdUrd incorporation assay. After transfection of the A3/4 version of the consensus sequence cloned in pCRscript (pCRscript + A3/4) into Drosophila S2 cells, peaks of DNA near the HL and HH positions of the gradient were recovered (Fig. 9, open bars), indicative of autonomous replication, whereas the negative control plasmid, 30.4, was replication-negative (Fig. 9, solid bars). An unusual feature was the virtual absence (very low level) of input (LL) DNA recovered from either the (pCRscript + A3/4) or the clone 30.4 plasmids. Such a result suggests that in Drosophila cells the input plasmids that are not competent for replication were degraded rapidly.

FIG. 4. Semiconservative autonomous replication of control plasmid clone 30.4 (A) and CpG island clone 6K (B) in bovine embryo kidney cells is shown. Semiconservative autonomous replication of control plasmid clone 30.4 (C) and CpG island clone CP9 (D) in chicken embryo fibroblasts is also shown. See the Fig. 3 legend.

Preliminary Mutagenesis Studies—To test further the potential of this consensus sequence in control of eukaryotic DNA replication, preliminary mutagenesis studies of a version of the consensus sequence were conducted.
Random mutagenesis was performed on the A3/4 version of the consensus sequence, resulting in 52 changes in 38 variant clones: 2 gaps; 8 pyrimidine to pyrimidine changes; 26 pyrimidine to purine changes; 10 purine to purine changes; and 6 purine to pyrimidine changes. Only 5 of the 36 bases (at positions 3, 13, 14, 16, and 22; asterisks in Table VI) showed no detectable change in any of the clones. A pool of equimolar amounts of plasmid DNA obtained from each of the 38 clones plus A3/4 was transfected into HeLa cells. Of the 38 variant clones, 10 (those whose sequences were obtained from more than a single bacterial colony) were recovered as resistant to DpnI and having replicated in the HeLa cells. These clones were detected by sequencing DpnI-resistant plasmids isolated from each of 60 bacterial colonies. These 10 mutated versions of A3/4 plus unmutated A3/4 were found among 47 bacterial colonies (Table VI), whereas 13 additional clones were each represented in only a single bacterial colony. Thus a total of 23 of the 38 variant clones were resistant to digestion by DpnI, indicating that they had replicated autonomously in HeLa cells.

FIG. 5. Semiconservative autonomous replication of consensus clone A16 (A) and CpG island clone 6K (B) in chicken embryo fibroblasts. See the Fig. 3 legend.

FIG. 6. Comparison of consensus sequence variants A3/4, 6K, A16, and CP9 with the control sequence 30.4 in a bacterial retransformation assay (9) after short term (3-day) culture subsequent to their transfection into either 3T3 cells (solid bars) or normal human embryonic lung fibroblast WI38 cells (open bars). The average number of DpnI-resistant colonies/plate was normalized within each independent experiment (i.e. transfection into 3T3 cells or into WI38 cells) using cotransfection of a luciferase expression plasmid to control for transfection efficiency from one flask of cells to another. >500 indicates plates that were estimated to contain up to twice as many as 500 colonies but were in fact not countable. The bars denoting the number of DpnI-resistant colonies for 30.4 represent the background after DpnI digestion in these experiments.

For statistical analysis, a more stringent criterion for segregating the clones was used. The replicating, DpnI-resistant plasmids that were able to transform bacteria and were detected among 60 bacterial colonies 1) more than once or 2) not at all were compared with those that were 3) not detected as replicating or 4) detected in only a single bacterial colony of the 60. For those clones represented more than once among the 60 bacterial colonies containing plasmid, the probability (Fisher's exact test) that 1) the 20-bp region (positions 3–22; see line in Table VI) in the 36-bp A3/4 sequence would be present as unmutated in all of the 10 variant clones and 2) that there would be 17 clones with mutations in the 20-bp region or 11 clones with mutations outside the region that were not represented more than once or at all is p < 0.03. (Note that two clones, clones 2 and 7, are included as unmutated in the 20-bp region because their mutations are permissive relative to the consensus sequence; see Table VI and footnotes.) Thus, the 20-bp region has been identified as a putative minimal sequence that appears to be necessary.

If the 20-bp internal sequence (positions 3–22 in the 36-bp consensus sequence) is used for assessment of homology, the homology to CpG islands shown in Table III improves in most cases, with many 100% homologies with no gaps (Table VII). There is only one case in which a gap is now detected (accession no. Z54973). For two regions mapped as initiation sites of DNA replication in c-myc, there is between 75 and 89% homology to the 20 bp over 18–20 bp (20, 34). The homology for the 20 bp to lamin B2 adjoins the initiation site in the lamin B2 locus (21, 29).
The homology for the autonomously replicating sequence and origin known as NOA3 (12, 22) and for heat shock protein 70 (25) is also shown in Table VII. An origin of DNA replication has been reported for the Chinese hamster ovary dhfr locus (26), and homologies of 89% are present on each strand, located within an autonomously replicating fragment, X24 (8).

FIG. 7. Semiconservative autonomous replication of A3/4 consensus sequence cloned into the EcoRI site of pYACneo and maintained as episomal DNA in HeLa cells (HeLa cell clone A9) for >170 cell doublings under selection with G418. Cells were labeled with BrdUrd, and low molecular weight episomal DNA was isolated and run on a cesium chloride gradient. See also the Fig. 1 legend.

We have used two 20-bp duplexes, each placed separately into the EcoRI site of pBluescript, to test for autonomous replication ability in HeLa cells using the DpnI resistance assay and bacterial retransformation. Clone 20 is identical to the relevant 20 bp of A3/4 except in the last position, in which there is a G instead of C. Clone 85 is identical to A3/4 except for inversion of the first 2 bases from TC in A3/4 to CT in the 20-mer clone (see Table VI). 20-mer clone 20 gave 85 ± 16 (S.D.); 20-mer clone 85 gave 54 ± 14; and A3/4 gave 63 ± 13 bacterial colonies/plate. (Background, pBluescript vector alone (21 ± 7), was subtracted from these values.) We also tested two examples of mutations in the internal 20-bp sequence of the A3/4 sequence that were not recovered among the replicating clones. Mutated clone A1 has a nonpermissive change from G to A at position 9 of the 36-bp consensus, and mutated clone C2 has a change from A3/4 of T to C at position 11 of the 36-bp consensus sequence.
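Whether a substitution is permissive relative to the consensus can be checked mechanically from the IUPAC degenerate nucleotide code. The Python sketch below is an illustration only: it uses the 36-bp consensus as printed in the abstract and the A3/4 sequence as listed in Table VI, and the helper names are hypothetical rather than part of the original work.

```python
# IUPAC nucleotide ambiguity codes mapped to the bases they allow.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# 36-bp consensus as printed in the abstract (Table VI lists Y rather than T
# at position 19; A3/4 carries T there, which satisfies both versions).
CONSENSUS_36 = "CCTMDAWKSGBYTSMAAWTWBCMYTTRSCAAATTCC"
# A3/4 version of the consensus, as listed in Table VI.
A3_4 = "CCTCAAATGGTCTCCAATTTTCCTTTGGCAAATTCC"


def is_permissive(position: int, base: str, consensus: str = CONSENSUS_36) -> bool:
    """True if `base` is allowed by the degenerate code at 1-based `position`."""
    return base.upper() in IUPAC[consensus[position - 1]]


def nonpermissive_positions(seq: str, consensus: str = CONSENSUS_36) -> list:
    """1-based positions at which `seq` falls outside the degenerate consensus."""
    if len(seq) != len(consensus):
        raise ValueError("sequence and consensus must have the same length")
    return [i + 1 for i, (b, c) in enumerate(zip(seq.upper(), consensus))
            if b not in IUPAC[c]]


def substitute(seq: str, position: int, base: str) -> str:
    """Return `seq` with the base at 1-based `position` replaced by `base`."""
    return seq[:position - 1] + base + seq[position:]


if __name__ == "__main__":
    print(nonpermissive_positions(A3_4))                      # A3/4 itself
    print(nonpermissive_positions(substitute(A3_4, 9, "A")))  # G to A at position 9
    print(is_permissive(11, "C"))                             # T to C at position 11
```

Under these assumptions, the G to A change at position 9 falls outside S (G or C), whereas the T to C change at position 11 stays within B (C, G, or T). A degenerate-code check of this kind is therefore only a first filter and says nothing about replication activity itself.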
In autonomous replication experiments, A3/4 gave 41 ± 7 and clone 85 gave 42 ± 6, whereas mutated clone A1 and mutated clone C2 gave 0 ± 0 and 1 ± 1, respectively, suggesting that these changes within the 20 bp of the consensus greatly affect replication activity. In other preliminary experiments, both 20-mer clones (clone 20 and clone 85) competed with the 36-bp A3/4 sequence for OBA/Ku86 binding (41–43) (data not shown).

Distribution of 20-Mer Consensus Sequence on Human Chromosomes. The distribution of the 20-mer consensus sequence over 1 Mb of continuous human genomic sequence for chromosomes 1, 20, 21, and 22 was examined using fuzznuc of the EMBOSS suite of software. The results shown in Table VIII were obtained allowing two through five mismatches (90 through 75% homology) with no gaps. Under these conditions, two mismatches gave a range of 19–51 homologs, whereas more mismatches rapidly increased the number of homologs, to a maximum of 9,101–12,597 for five mismatches, no gaps. The distribution on the DNA +/- strands was approximately equal. For the allowance of two mismatches and assuming an even distribution, initiation sites would be spaced from ~20 kb to ~50 kb apart. However, as demonstrated in Fig. 10, the distribution is not even, and the spacing can vary from ≤1,000 bases to ≥200 kb. A comparison with the distribution of the Saccharomyces cerevisiae ARS core consensus sequence, WTTTATRTTTW, using fuzznuc with no mismatches for chromosomes IV, VII, XII, and XV over the first 1 Mb of sequence as obtained from the Saccharomyces Genome Database,3 indicated 25/23 (+/-), 20/23, 20/25, and 17/19 homologs, respectively. The total of homologs, ranging from 36 to 48, compares favorably with the number of homologs of the 20-mer consensus sequence on 1 Mb of human chromosomes.

The homologs of the 20-mer sequence did not overlap any CpG islands (UCSC Genome Browser v17; >200-base length, >0.5 GC content, and a ratio of >0.60 for the observed proportion of CG dinucleotides to the expected proportion on the basis of the GC content of the segment); that is, 0/13 in chromosome 1, 0/6 in chromosome 20, 0/1 in chromosome 21, and 0/11 in chromosome 22. For α-satellite sequence, only the 1 Mb of sequence at chromosome 22q11.1 contained an example of α-satellite that overlapped 3 homologs, corresponding to those with the 3 above the vertical bar in Fig. 10. Homologs (2 mismatches, no gaps), using BESTFIT of the GCG suite of software, of the 20-mer sequence to α-satellite and centromere sequence of the individual chromosomes were 4/6 in chromosome 1, 1/10 in chromosome 20, 4/10 in chromosome 21, and 1/10 in chromosome 22.

3 K. Dolinski, R. Balakrishnan, K. R. Christie, M. C. Costanzo, S. S. Dwight, S. R. Engel, D. G. Fisk, J. E. Hirschman, E. L. Hong, L. Issel-Tarver, A. Sethuraman, C. L. Theesfeld, G. Binkley, C. Lane, M. Schroeder, S. Dong, S. Weng, R. Andrada, D. Botstein, and J. M. Cherry, ftp://genome-ftp.stanford.edu/pub/yeast/SacchDB/ March 11, 2003 (date of access).

TABLE V
Stability of consensus sequence constructs in comparison to other mammalian origins

Host            Clone                          Integrated   Episomal   Stability^a
HeLa            YACneo (3 clones)              +            -          1.0
HeLa            YACS3 (1 clone)^b              -            +          0.8
HeLa            YACS3 (1 clone)^b              +            ?^c        1.0
HeLa            Y343 (1 clone)^b               +            -          1.0
HeLa            Y343 (3 clones)^b              -            +          0.8–0.9
HeLa            X24 (1 clone)^b                +            ?^c        1.0
HeLa            A3/4 in pYACneo (6 clones)     -            +          0.9
HeLa            Linear YACneo^b (1 clone)      +            -          1.0
HeLa            Linear Y343^b (1 clone)        +            -          1.0
S. cerevisiae   Circular ARS plasmid^d         -            +          0.7
S. cerevisiae   Linear ARS plasmid^d           -            +          0.8
S. cerevisiae   CEN-containing YAC^d           -            +          0.999
Any host        Integrated DNA^d               +            -          1.0

a Stability per division is the chance per cell division that a daughter cell will inherit the selectable marker (10).
b Data summarized from Nielsen and co-workers (10). S3 (9, 22), 343 (12, 13), and X24 are sequences which contain origins of DNA replication. X24 contains the bidirectional origin of DNA replication, ori, from the dhfr gene (8).
c ? indicates that the presence of episomal constructs was not analyzed due to the detection of an integrated copy.
d Data from Murray and Szostak (39) and Hahnenberger et al. (40).

FIG. 8. Restriction enzyme digests of DNA of plasmid clones recovered from episomal DNA of HeLa cell clones A9 and A37. AvaI (left panel) and HindIII (right panel) digests are displayed after separation by agarose gel electrophoresis. Arrows indicate the predicted sizes to be obtained from the plasmid DNA of pYACneo + A3/4 sequence. See also the Fig. 7 legend and Table IV.

FIG. 9. Semiconservative autonomous replication of A3/4 consensus sequence compared with the control sequence 30.4 in Drosophila S2 cells. Open bars indicate pCRscript + A3/4 sequence versus solid bars indicating 30.4 plasmid. See also the Fig. 1 legend.

DISCUSSION

Identification of the yeast ARS consensus was aided by the availability of a large number of minimal origin sequences defined by the ARS plasmid assay (44). We report here the identification and testing of a putative consensus sequence capable of identifying initiation sites (origins) of DNA replication. We used four autonomously replicating sequences containing α-satellite sequence and a reiterative process between pairs of African green monkey and human sequences to minimize derivation of an α-satellite consensus. The resultant consensus sequence was 36 bp. Most importantly, we took the opportunity to enrich for versions of the consensus sequence by using a mixed pool of plasmids bearing versions of this consensus as template in a mammalian in vitro replication system and then selecting for plasmids that had been replicated in the system.
Because analysis of 17 clones containing eight different versions of the consensus regenerated the same consensus, when viewed one nucleotide at a time across the 36 bp, we believe that in this context and in these functional assays, the 36-bp consensus sequence will be useful in studies of eukaryotic core origin sequences. Although homologies could be observed for stretches of sequence within initiation regions of defined origins of replication, the length of homology and degree of similarity were not complete. We have been successful in using the consensus sequence to predict probable regions that may contain autonomous replication activity and origins at the γ-aminobutyric acid receptor subunit β3 and α5 gene cluster (45, 46) and at the dnmt1 (human DNA methyltransferase) locus, wherein binding to Ku86 was also verified (47).

In a FASTA search of the GenBank database, very significant homology was observed with certain CpG island clones (30): up to 89% over 36 bp and, with the allowance of a single base gap, 100% over 36 bp. Using the internal 20-bp sequence from the 36 bp, the homology was improved, with gaps being eliminated except in one case (Table VII). Consistent with this observation is the report by Delgado et al. (36), which demonstrated initiation of DNA replication within CpG islands. Homologies to other fragments of DNA containing origins of replication were also, in general, improved (Table VII).

TABLE VI
Mutagenesis and functional assay of A3/4 version of the consensus sequence
The combined sequence of all 38 variants and their individual mutations are given to indicate the departures from the original A3/4 sequence at each position. An underline under a letter indicates a gap in a variant clone at this position. * indicates no change in any of the mutated clones or a conserved/permissible change relative to the consensus sequence. A dash indicates base positions that were not included in the 20-bp region of conserved sequence in the 10 replicating clones that were detected. Mut. clone indicates clonal designation followed by the number of times among 60 bacterial colonies that the clone was represented.

Mutagenesis summary
Base no.                                          1–10         11–20        21–30        31–36
Consensus sequence                                CCTMDAWKSG   BYTSMAAWYW   BCMYTTRSCA   AATTCC
A3/4 sequence                                     CCTCAAATGG   TCTCCAATTT   TCCTTTGGCA   AATTCC
All 38 individual mutations combined              YMTYMRMDRV   HMTCHAWTWN   WCHKWKTTBW   VRKWHM
Conserved/unchanged (*)                           ..*.......   ..**.*....   .*........   ......
Region unchanged in 10 replicating clones
from A3/4 and consensus                           --TCAAATGG   TCTCMAATTW   TC--------   ------

Mutated and replicating clones
Clone                          Sequence or substituted bases (in 5' to 3' order)
A3/4, 7 colonies               CCTCAAATGGTCTCCAATTTTCCTTTGGCAAATTCC
Mut. clone 1, 7 colonies       T A
Mut. clone 2, 7 colonies       A T
Mut. clone 3, 5 colonies       G
Mut. clone 4, 5 colonies       A A
Mut. clone 5, 4 colonies       G T
Mut. clone 6, 3 colonies       A
Mut. clone 7, 3 colonies       A G G
Mut. clone 8, 2 colonies       A
Mut. clone 9, 2 colonies       A C
Mut. clone 10, 2 colonies      A

TABLE VII
Examples of DNA homology to 20 bp of the consensus sequence

Accession no.^a   Nucleotide^b   Length^c   Homology^c (%)
Z66331            99–118         20         95
Z61072            99–118         20         100
Z57320            98–117         20         95
Z62676            95–114         20         90
Z54973            9–29           20         95^d
Z63037            256–274        19         84
Z59323            82–101         20         100
Z60841            97–116         20         100
Z58181            97–116         20         100
Z63768            97–116         20         100
Z56579            99–118         20         100
Z55240            97–116         20         100
Z57696            99–118         20         100
Z63828            99–118         20         100
Z55373            97–116         20         95
Z64869            99–117         19         100
HUMMYCC           1877–1896      20         75
HUMMYCC           4918–4936      18         89
HUMLAMBBB         3910–3929      20         75
HSHSP70A          356–374        19         84
HSAUTONF          179–196        18         89
CGDHFRORI         2470–2487      18         89
CGDHFRORI         3806–3823      18         89

a Accession no. of locus from GenBank for CpG island sequences from Cross et al. (30) and for other origin-containing sequences. HUMMYCC is human c-myc (two initiation sites) (20, 34); HUMLAMBBB is human lamin B2 inclusive of the initiation start site (21, 29); HSHSP70A is human heat shock protein 70 (25); HSAUTONF contains an origin of DNA replication also known as NOA3 (22); and CGDHFRORI is the Chinese hamster ovary dhfr origin of DNA replication (ori) overlapping the autonomously replicating clone known as X24 (8, 26).
b Nucleotide position of homologous sequence.
c Length of homologous sequence for an internal 20-bp sequence of the consensus: TMDAWKSGBYTSMAAWYWBC.
d Contains a gap.

A human origin binding activity (OBA) has been isolated using a minimal 186-bp fragment from the monkey autonomous replicating sequence ors8 (43). Homology to the consensus sequence was observed within a 59-bp fragment that was the most effective competitor for binding of OBA to the 186-bp minimal autonomous replicating sequence. We then showed that the A3/4 (36 bp) consensus was as effective a competitor as the 59-bp fragment and used it to affinity purify OBA, which was identified as the Ku86 subunit of Ku antigen (42). Ku antigen is identical to a DNA-dependent ATPase isolated from HeLa cells (48), which had been reported previously to cofractionate with a 21 S multiprotein complex competent for DNA synthesis from HeLa cells (49) and is capable of interaction with a region containing the replication origin of lamin B2 (41). The finding that a version of the consensus sequence is an effective competitor for OBA/Ku86 binding to a minimal autonomously replicating sequence lends support to the functionality of at least some versions of the consensus sequence.
More recently, we have demonstrated, using chromatin immunoprecipitation assays, that Ku is associated in vivo with mammalian origins of DNA replication, including ors8 and ors12, in a cell cycle-specific fashion, namely at G1/S (50). Recently, we have also obtained DNase I footprints of OBA/Ku86 and recombinant Ku upon a plasmid fragment containing A3/4, including bases in positions 1–3, 9–29, and 32–36 (which includes part of the 20-bp internal autonomously replicating sequence, positions 3–22); similarly, a footprint was obtained over the consensus homologous regions of ors8 (51). The isolation and identification of this origin binding activity, using A3/4 as an affinity purification step, and its in vivo association with origins of DNA replication provide further supporting evidence consistent with A3/4 possessing origin activity in episomes as well as in vivo, in chromatin.

TABLE VIII
Homologs of 20-mer consensus sequence in 1 megabase of human genomic sequence
Homologs were obtained using fuzznuc of the EMBOSS suite of software with specified mismatches.

No. of mismatches                               2        3          4            5
Chromosome 22q11.1^a, 13025001-14025000
  DNA strand +/- homologs                       11/8     124/129    968/924      4580/4521
  Total homologs                                19       253        1992         9101
Chromosome 20p12.2^a, 9470001-10470000
  DNA strand +/- homologs                       19/19    155/190    1154/1218    5617/5642
  Total homologs                                38       345        2372         11259
Chromosome 1q44^a, 238500001-239500000
  DNA strand +/- homologs                       17/17    155/167    1119/1126    5469/5438
  Total homologs                                34       322        2245         10907
Chromosome 21q21.1^a, 19000001-20000000
  DNA strand +/- homologs                       29/22    238/208    1372/1367    6382/6215
  Total homologs                                51       446        2739         12597

a Sequence obtained from UCSC Human Genome Browser Gateway, the Human Nov. 2002 assembly, http://genome.ucsc.edu/cgi-bin/hgGateway?org=human.

FIG. 10. Distribution of homologs of a 20-mer consensus sequence over 1 Mb on human chromosomes. Vertical bars indicate the positions of homologs with up to two mismatches but no gaps. The numbers over some bars indicate the number of homologs clustered too close together to place separately in the graph. The chromosome, band location, and position are given as obtained from the UCSC Human Genome Browser. The distribution of the yeast ARS consensus sequence homologs (no mismatches) on S. cerevisiae chromosome XV is also shown. See also the footnotes of Table VIII.

Various versions of the 36-bp consensus sequence were capable of autonomous replication after transfection into HeLa cells, as shown by the BrdUrd incorporation assay, leading to the production of both HL (hybrid density) and HH (fully substituted) DNA, diagnostic of semiconservative replication. The consensus served as an initiation site, as shown by the earliest labeled fragment method in a mammalian in vitro DNA replication system (11), which mapped the earliest incorporation of radiolabeled nucleotides to a minimal fragment of the plasmid containing the A3/4 version of the consensus. Two of the CpG island clones, CP9 and 6K, which contain versions of the consensus sequence, were also tested for autonomous replication activity, with one (CP9) demonstrating autonomous replicating activity in HeLa cells.
However, analysis of the autonomous replicating activity of the various versions of the 36-bp consensus and of the homologs present in the CpG island sequences in human, bovine, and chicken cells suggested that there may be species preference for subsets of the versions of the consensus and its homologs. For example, clone 6K was found to have activity in bovine cells but not in human and chicken cells, whereas clone CP9 had significant activity in chicken and human cells. This apparent species preference may be associated with the initiator proteins involved in the recognition of the critical nucleotide sequence elements of the consensus and its homologs. Furthermore, the context in which the consensus sequences are present was varied, from multiple cloning sites in pCRscript and pBluescript, to CpG islands (Table III), to the EcoRI site of pYACneo. It was not possible in these experiments to establish fully what contribution, if any, context may make to the activity of consensus sequences.

Versions of the consensus sequence, particularly A3/4, were found to replicate in both normal and malignant human cells (WI38 and HeLa, respectively). A pYACneo construct containing A3/4 could be maintained exclusively as episomes under selection for long periods of time. After ≥170 cell doublings, we demonstrated the continuing autonomous replication of the episomes in HeLa cells. Episomes recovered did not indicate any rearrangements of the constructs introduced into HeLa cells. Removal of selective pressure demonstrated that the episome had a surprising stability of ≥0.9/cell/generation.

In anticipation of the next step in derivation of a minimal consensus core sequence for eukaryotic and mammalian DNA replication, preliminary mutagenesis studies were done in which a minimal 20-bp region is apparently correlated with the ability to replicate in HeLa cells.
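The homolog counts in Table VIII come from an ungapped search for the degenerate 20-mer with a fixed number of allowed mismatches on both DNA strands. The Python sketch below illustrates that kind of search; it is not the fuzznuc implementation, the function names are hypothetical, and the 20-mer is taken from footnote c of Table VII. The spacing helper simply divides the region length by the number of homologs, which assumes an even distribution.

```python
# IUPAC nucleotide ambiguity codes mapped to the bases they allow.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Internal 20-bp consensus (positions 3-22), from footnote c of Table VII.
CONSENSUS_20 = "TMDAWKSGBYTSMAAWYWBC"


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_homologs(sequence: str, pattern: str, max_mismatches: int) -> list:
    """Return (start, strand, mismatches) for every ungapped window that
    matches the degenerate pattern with at most `max_mismatches` mismatches.
    `start` is the 0-based forward-strand coordinate of the window."""
    sequence = sequence.upper()
    allowed = [IUPAC[c] for c in pattern.upper()]
    n = len(allowed)
    hits = []
    for strand, text in (("+", sequence), ("-", reverse_complement(sequence))):
        for i in range(len(text) - n + 1):
            mismatches = 0
            for base, ok in zip(text[i:i + n], allowed):
                if base not in ok:
                    mismatches += 1
                    if mismatches > max_mismatches:
                        break
            else:  # window scanned without exceeding the mismatch limit
                start = i if strand == "+" else len(text) - i - n
                hits.append((start, strand, mismatches))
    return hits


def mean_spacing_kb(region_bp: int, n_homologs: int) -> float:
    """Average spacing in kb if the homologs were evenly distributed."""
    return region_bp / n_homologs / 1000.0


if __name__ == "__main__":
    toy = "GGGG" + "TCAAATGGTCTCCAATTTTC" + "GGGG"  # A3/4 positions 3-22 with flanks
    print(find_homologs(toy, CONSENSUS_20, 0))
    print(round(mean_spacing_kb(1_000_000, 51), 1), round(mean_spacing_kb(1_000_000, 19), 1))
```

With the two-mismatch counts of 19 to 51 homologs per Mb, the even-spacing estimate works out to roughly 20 to 53 kb, in line with the ~20 kb to ~50 kb range quoted in the Results; the observed distribution in Fig. 10 is far less regular.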
The distribution of this 20-mer consensus sequence over 1 Mb of human chromosomes is similar, quantitatively and qualitatively (relative proximity to each other), to the distribution of ARS sequence on S. cerevisiae chromosomes. However, it may be that specific elements or combinations of bases involved in the binding of initiator protein or cooperating replicative proteins are still to be revealed, allowing further delimitation of a minimal core consensus sequence. With functional testing and further mutagenesis, it should be possible to derive a minimal core consensus sequence. It appears likely that the 20-bp sequence is required for control of autonomous replication. Depending on its position relative to other surrounding sequence, the associated sequence may play a role in regulation of replication origin activity at different times and in different cell types.

This consensus will provide for an advancement in the understanding of the regulation of DNA replication in higher eukaryotes similar to that which the yeast ARS consensus provided for DNA replication in yeast. With this greater definition will come a more rational approach to the development of compounds that affect DNA replication. This technology also has direct application to the development of nonviral vectors for gene transfer. At a time when adenoviral gene delivery systems in particular and gene therapy in general are under close scrutiny because of adverse effects in gene therapy trials (52, 53), a consensus sequence of host cell composition, which can maintain DNA replication and expression of accompanying genes, provides a new opportunity for a gene delivery system of cellular (host) DNA origin.

REFERENCES
1. Stinchcomb, D. T., Struhl, K., and Davis, R. W. (1979) Nature 282, 39–43
2. Fangman, W., and Brewer, B. (1991) Annu. Rev. Cell Biol. 7, 375–402
3. Held, P., and Heintz, N. (1992) Biochim. Biophys. Acta 1130, 235–246
4. Zannis-Hadjopoulos, M., and Price, G. B. (1998) Crit. Rev. Eukaryot. Gene Expr. 8, 81–106
5. Zannis-Hadjopoulos, M., and Price, G. B. (1999) J. Cell. Biochem. Suppl. 32/33, 1–14
6. Frappier, L., and Zannis-Hadjopoulos, M. (1987) Proc. Natl. Acad. Sci. U. S. A. 84, 6668–6672
7. Landry, S., and Zannis-Hadjopoulos, M. (1991) Biochim. Biophys. Acta 1088, 234–244
8. Zannis-Hadjopoulos, M., Nielsen, T. O., Todd, A., and Price, G. B. (1994) Gene (Amst.) 151, 273–277
9. Nielsen, T., Bell, D., Lamoureux, C., Zannis-Hadjopoulos, M., and Price, G. (1994) Mol. Gen. Genet. 242, 280–288
10. Nielsen, T. O., Cossons, N. H., Zannis-Hadjopoulos, M., and Price, G. (2000) J. Cell. Biochem. 76, 674–685
11. Pearson, C. E., Frappier, L., and Zannis-Hadjopoulos, M. (1991) Biochim. Biophys. Acta 1090, 156–166
12. Wu, C., Friedlander, P., Lamoureux, C., Zannis-Hadjopoulos, M., and Price, G. B. (1993) Biochim. Biophys. Acta 1174, 241–257
13. Wu, C., Zannis-Hadjopoulos, M., and Price, G. B. (1993) Biochim. Biophys. Acta 1174, 258–266
14. Graham, F. L., and van der Eb, A. J. (1973) Virology 52, 456–467
15. Todd, A., Landry, S., Pearson, E. E., Khoury, V., and Zannis-Hadjopoulos, M. (1995) J. Cell. Biochem. 57, 280–289
16. Hirt, B. (1967) J. Mol. Biol. 26, 365–369
17. Popperl, H., and Featherstone, M. S. (1992) EMBO J. 11, 3673–3680
18. Diaz-Perez, M. J., Wainer, I. W., Zannis-Hadjopoulos, M., and Price, G. B. (1996) J. Cell. Biochem. 61, 444–451
19. Pearson, C. E., Shihab-El-Deen, A., Price, G. B., and Zannis-Hadjopoulos, M. (1994) Somat. Cell Mol. Genet. 20, 147–152
20. Waltz, S. E., Trivedi, A. A., and Leffak, M. (1996) Nucleic Acids Res. 24, 1887–1894
21. Giacca, M., Zentilin, L., Norio, P., Diviacco, S., Dimitrova, D., Contreas, C., Biamonti, G., Perini, G., Weighardt, F., Riva, S., and Falaschi, A. (1994) Proc. Natl. Acad. Sci. U. S. A. 91, 7119–7123
22. Tao, L., Nielsen, T., Friedlander, P., Zannis-Hadjopoulos, M., and Price, G. B. (1997) J. Mol. Biol. 273, 509–518
23. Aladjem, M. I., Groudine, M., Brody, L. L., Dieken, E. S., Fournier, R. E. K., Wahl, G. M., and Epner, E. M. (1995) Science 270, 815–819
24. Ariizumi, K., Wang, Z., and Tucker, P. W. (1993) Proc. Natl. Acad. Sci. U. S. A. 90, 3695–3699
25. Taira, T., Iguchi-Ariga, S. M. M., and Ariga, H. (1994) Mol. Cell. Biol. 14, 6386–6387
26. Burhans, W. C., Vassilev, L. T., Caddle, M. S., Heintz, N. H., and DePamphilis, M. L. (1990) Cell 62, 955–965
27. Tasheva, E. S., and Roufa, D. J. (1994) Mol. Cell. Biol. 14, 5628–5635
28. Dimitrova, D., Giacca, M., Demarchi, F., Biamonti, G., Riva, S., and Falaschi, A. (1996) Proc. Natl. Acad. Sci. U. S. A. 93, 1498–1503
29. Abdurashidova, G., Deganuto, M., Klima, R., Riva, S., Biamonti, G., Giacca, M., and Falaschi, A. (2000) Science 287, 2023–2026
30. Cross, S. H., Charlton, J. A., Nan, X., and Bird, A. P. (1994) Nat. Genet. 6, 236–244
31. Larsen, F., Gundersen, G., Lopez, R., and Prydz, H. (1992) Genomics 13, 1095–1107
32. Antequera, F., and Bird, A. (1993) Proc. Natl. Acad. Sci. U. S. A. 90, 11995–11999
33. Vassilev, L. T., and Johnson, E. M. (1990) Mol. Cell. Biol. 10, 4899–4904
34. Tao, L., Dong, Z., Leffak, M., Zannis-Hadjopoulos, M., and Price, G. B. (2000) J. Cell. Biochem. 78, 442–457
35. Zhao, Y., Tsutsumi, R., Yamaki, M., Nagatsuda, Y., Ejiri, S., and Tsutsumi, K. (1994) Nucleic Acids Res. 22, 5385–5390
36. Delgado, S., Gomez, M., Bird, A., and Antequera, F. (1998) EMBO J. 17, 2426–2435
37. Phi-van, L., and Stratling, W. H. (1999) Nucleic Acids Res. 27, 3009–3017
38. Pelletier, R., Mah, D. C. W., Landry, S., Matheos, D., Price, G. B., and Zannis-Hadjopoulos, M. (1997) J. Cell. Biochem. 66, 87–97
39. Murray, A. W., and Szostak, J. W. (1983) Nature 305, 189–193
40. Hahnenberger, K. M., Baum, M. P., Polizzi, C. M., Carbon, J., and Clarke, L. (1989) Proc. Natl. Acad. Sci. U. S. A. 86, 577–581
41. Toth, E. C., Marusic, L., Ochem, A., Patthy, A., Pongor, S., Giacca, M., and Falaschi, A. (1993) Nucleic Acids Res. 21, 3257–3263
42. Ruiz, M. T., Matheos, D., Price, G. B., and Zannis-Hadjopoulos, M. (1999) Mol. Biol. Cell 10, 567–580
43. Ruiz, M. T., Pearson, C. E., Nielsen, T. O., Price, G. B., and Zannis-Hadjopoulos, M. (1995) J. Cell. Biochem. 58, 221–236
44. Newlon, C. S. (1996) DNA Replication in Eukaryotic Cells (DePamphilis, M. L., ed) pp. 873–914, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY
45. Sinnett, D., Woolf, E., Xie, W., Glatt, K., Kirkness, E. F., Nielsen, T. O., Zannis-Hadjopoulos, M., Price, G. B., and Lalande, M. (1996) Gene (Amst.) 173, 171–177
46. Strehl, S., LaSalle, J. M., and Lalande, M. (1997) Mol. Cell. Biol. 17, 6157–6166
47. Araujo, F. D., Knox, J. D., Ramchandani, S., Pelletier, R., Bigey, P., Price, G., Szyf, M., and Zannis-Hadjopoulos, M. (1999) J. Biol. Chem. 274, 9335–9341
48. Cao, Q. P., Pitt, S., Leszyk, J., and Baril, E. F. (1994) Biochemistry 33, 8548–8557
49. Vishwanatha, J. K., and Baril, E. F. (1990) Biochemistry 29, 8753–8759
50. Novac, O., Matheos, D., Araujo, F. D., Price, G. B., and Zannis-Hadjopoulos, M. (2001) Mol. Biol. Cell 12, 3386–3401
51. Schild-Poulter, C., Matheos, D., Novac, O., Cui, B., Giffin, W., Ruiz, M. T., Price, G. B., Zannis-Hadjopoulos, M., and Hache, R. J. G. (2003) DNA Cell Biol. 22, 65–78
52. Fox, J. L. (1999) Nat. Biotechnol. 17, 1153
53. Fox, J. L. (2000) Nat. Biotechnol. 18, 377