THE JOURNAL OF BIOLOGICAL CHEMISTRY
© 2003 by The American Society for Biochemistry and Molecular Biology, Inc.
Vol. 278, No. 22, Issue of May 30, pp. 19649 –19659, 2003
Printed in U.S.A.
Identification of a cis-Element That Determines Autonomous DNA
Replication in Eukaryotic Cells*
Received for publication, July 12, 2002, and in revised form, March 12, 2003
Published, JBC Papers in Press, March 27, 2003, DOI 10.1074/jbc.M207002200
Gerald B. Price‡§, Minna Allarakhia‡¶, Nandini Cossons‖, Torsten Nielsen**‡‡,
Maria Diaz-Perez‡, Paula Friedlander‡, Liang Tao‡, and Maria Zannis-Hadjopoulos‡
From the ‡McGill Cancer Centre, McGill University, Montreal, Quebec H3G 1Y6, the ‖Department of Internal Medicine,
the Ottawa Hospital-General Campus, Ottawa, Ontario K1H 8L6, and the **Faculty of Medicine,
University of British Columbia, Vancouver, British Columbia V6T 1Z3, Canada
A 36-bp human consensus sequence (CCTMDAWKSGBYTSMAAWTWBCMYTTRSCAAATTCC) is capable of
supporting autonomous replication of a plasmid after
transfection into eukaryotic cells. After transfection
and in vitro DNA replication, replicated plasmid DNA
containing a mixture of oligonucleotides of this consensus was found to reiterate the consensus. Initiation of
DNA replication in vitro occurs within the consensus.
One version, A3/4, in pYACneo, could be maintained under selection in HeLa cells, unrearranged and replicating continuously for >170 cell doublings. Stability of
plasmid without selection was high (>0.9/cell/generation). Homologs of the consensus are found consistently
at mammalian chromosomal sites of initiation and
within CpG islands. Versions of the consensus function
as origins of DNA replication in normal and malignant
human cells, immortalized monkey and mouse cells, and
normal cow, chicken, and fruit fly cells. Random mutagenesis studies suggest an internal 20-bp consensus
sequence of the 36 bp may be sufficient to act as a core
origin element. This cis-element consensus sequence is
an opportunity for focused analyses of core origin
elements and the regulation of initiation of DNA
replication.
A key to the development of our knowledge about yeast
replication origins was the autonomous replicating sequence
(ARS)1 assay; genomic fragments cloned into prokaryotic vectors were found to function as yeast replication origins. ARS
plasmids transform yeast at a high frequency, replicate autonomously, and can be maintained in vivo as episomal genetic
elements (1). These constructs, however, are lost without selective pressure because of imperfect partition and can integrate into the genome during long term culture. ARS plasmids
were invaluable in identifying and defining replication origins
in yeast (2, 3). Progress in understanding other eukaryotic
DNA replication, particularly in mammalian cells, has been
slower (for review, see Refs. 4 and 5).
* This work was supported in part by grants from the Cancer Research Society (to G. B. P.), the Canadian Institutes of Health Research
(to M. Z.-H.), and REPLICor, Inc. The costs of publication of this article
were defrayed in part by the payment of page charges. This article must
therefore be hereby marked “advertisement” in accordance with 18
U.S.C. Section 1734 solely to indicate this fact.
§ To whom correspondence should be addressed: McGill Cancer Centre, McGill University, 3655 Sir William Osler Promenade, Montreal,
Quebec H3G 1Y6, Canada.
¶ Recipient of a Fonds pour la Formation de Chercheurs et l’Aide à la
Recherche Centre Studentship and Canadian Institutes of Health Research doctoral research studentship.
‡‡ Recipient of a Canadian Institutes of Health Research
studentship.
1 The abbreviations used are: ARS, autonomous replicating sequence;
BrdUrd, bromodeoxyuridine; HH, heavy-heavy; HL, heavy-light;
LL, light-light; OBA, origin binding activity; ors, origin-enriched
sequence(s).
This paper is available on line at http://www.jbc.org
Our studies with small fragments of DNA which can support
autonomous replication of a plasmid in mammalian cells (6–10)
encouraged us to look further for putative replicator sequences. We report here the identification and testing of a
putative consensus sequence that will aid in identification of
initiation sites (origins) of DNA replication in mammalian and
higher eukaryotic cells. We used four mammalian autonomously replicating sequences containing α-satellite sequence
and a reiterative process between pairs of African green monkey and human sequences to minimize derivation of an α-satellite consensus. The resultant consensus sequence was 36 bp.
EXPERIMENTAL PROCEDURES
Cloning of a Mixture of Oligonucleotides—A mixture of oligonucleotides was generated using a putative consensus sequence. The oligonucleotides were designed to contain effective primers (T3, T7, and M13
reverse primers) for PCR amplification (e.g. CATTAACCCTCACTAAAGGGAACAAAAGCTGGGTACC-consensus sequence-TGAGCTCCAATTCACTGGCCGTCGTTTTAC). After PCR amplification, the products
were cloned into the SrfI site of pCRscript (Stratagene) for individual
variant analysis. A fraction of the ligation reaction mixture was used to
transform bacteria (see below) and suggested that there was representation of greater than 100,000 independent clones. To ascertain the
minimal essential sequence, a functional assay for “origin” activity was
employed. A portion of the ligation mixture was subjected to replication
after transfection into HeLa cells (9) and in an in vitro replication
system (9, 11). After digestion of the reaction products with DpnI endonuclease to remove unreplicated pCRscript clones (9), the DpnI-resistant DNA was transfected into bacteria, and individual clones were
isolated and sequenced to ascertain whether they contained consensus
sequences.
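As a rough check on how a library of more than 100,000 independent clones compares with the degenerate oligonucleotide pool, the number of distinct sequences encoded by the 36-bp consensus can be counted from the standard IUPAC nucleotide code. The sketch below is illustrative only; the function name and the degeneracy table are assumptions introduced here, not part of the original methods.

```python
from math import prod

# Number of bases each IUPAC symbol can stand for (standard nucleotide code).
DEGENERACY = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "M": 2, "R": 2, "W": 2, "S": 2, "Y": 2, "K": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
}

CONSENSUS = "CCTMDAWKSGBYTSMAAWTWBCMYTTRSCAAATTCC"


def library_complexity(degenerate: str) -> int:
    """Count the distinct sequences a degenerate oligonucleotide can encode."""
    return prod(DEGENERACY[symbol] for symbol in degenerate)


print(len(CONSENSUS))                 # 36
print(library_complexity(CONSENSUS))  # 221184, i.e. 2**13 * 3**3
```

Under this count the consensus encodes 221,184 possible versions, so a library of somewhat more than 100,000 independent clones would be expected to sample a large share of the pool, though not necessarily every version.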
CpG Island Clones—Two clones (CP9, HS14C3R Locus; and 6K,
HS8F7F Locus) were obtained from the UK HGMP Resource Centre,
Cambridge, UK. The sequence homolog in each clone is CATCGAAGCGCTTGAAATCTCCACTTACAAATTCC for CP9, and CCTCAAAGCGCTTGAAAATCTCCACTTGCAAATTCC for 6K. The CpG clones are in
the vector pGEM-5Zf(−).
Plasmid DNA—All plasmid DNA clones were propagated in bacteria
in LB medium containing 100 μg/ml ampicillin, and large scale
amounts of supercoiled plasmid DNA, essential for autonomous replication assays in vivo and in vitro, were prepared using the Qiagen tip
500 columns according to the manufacturer’s specifications (Qiagen).
Cell Culture and Transfection—HeLa cells were obtained from the
American Type Culture Collection and cultured in Alpha-modified
Eagle’s medium (Invitrogen) supplemented with 10% fetal bovine serum (Flow Laboratories). The cultures were maintained in a 37 °C
incubator containing an atmosphere of 10% CO2 + air. All normal
primary cells (WI38 human embryo lung fibroblasts, bovine embryo
kidney fibroblasts, and chicken embryo fibroblasts) were obtained from
BioWhittaker and maintained in culture as described for HeLa cells.
Drosophila S2 cells were maintained in Schneider’s Drosophila medium
(Invitrogen) with 10% heat-inactivated fetal calf serum supplemented
with glutamine, asparagine, and penicillin/streptomycin. The cells were
sealed in Nunc tissue culture flasks and incubated at room temperature
in the dark.
Transfections were carried out as described previously (12, 13). Cells
were cultured at 1 × 10^4 cells/cm^2 in tissue culture flasks, T25 (Nunclon) overnight before transfection with 5 μg of supercoiled plasmid
DNA, prepared using the calcium coprecipitation method (14). After
transfection, the cells were grown for 24 h in medium containing bromodeoxyuridine (BrdUrd), as described previously (7, 12, 15). Plasmids
were recovered by Hirt lysis (16), loaded onto CsCl gradients (initial
refractive index 1.408), and centrifuged as described previously (7, 12).
An aliquot of each fraction was either dot- or slot-blotted onto a GeneScreen Plus membrane (PerkinElmer Life Sciences), hybridized to 32P-labeled vector (pCRscript or pBluescript) DNA, exposed to an imaging
plate, and quantified by densitometry performed using a PhosphorImager (Fuji BAS 2000).
In some cases, episomal DNA was recovered by Hirt lysis 3 days after
HeLa cells were cotransfected with plasmids containing various versions of the consensus sequence and an expression plasmid carrying the
luciferase gene, pRSVLUC (17). The low molecular weight DNA was
digested with DpnI and then used to transform the DH5α strain of
Escherichia coli in a bacterial retransformation assay, as described
previously (7, 9, 10). Some of the transfected HeLa cells were used to
determine variations in cell density and efficiency of transfection, by
measuring levels of luciferase as described previously (17). The levels of
luciferase were used to normalize the transformed bacterial colonies
detected on LB agar plates containing 100 μg/ml ampicillin. In some
cases, we also used pCMV/β-galactosidase (Applied Biosystems) and a
β-galactosidase assay kit (Invitrogen).
In Vitro DNA Replication—The cell-free replication assay was
adapted from the method described previously (11) and as performed
previously (18). The earliest labeled fragment method was performed
using the in vitro DNA replication system, as described previously (11,
18). In brief, the in vitro reactions were stopped at 4 and 8 min of
incubation, the DNA products were digested with DdeI and PvuII and
then separated on a 1.5% agarose gel in 1× TAE buffer. The gel was
dried and exposed to a PhosphorImaging plate. Incorporation of
[α-32P]dCTP and [α-32P]dTTP into each fragment was quantitated by
densitometry of a PhosphorImager screen using the Fuji BAS 2000
analyzer and expressed as incorporation/kb of DNA.
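The normalization step in the earliest labeled fragment method (signal expressed as incorporation/kb, so that long fragments are not favored simply for their length) can be sketched as follows. The function names and the numbers are hypothetical and chosen only for illustration; they are not measurements from this study.

```python
def incorporation_per_kb(signal: float, size_kb: float) -> float:
    """Normalize a fragment's radioactive signal by its length in kb."""
    if size_kb <= 0:
        raise ValueError("fragment size must be positive")
    return signal / size_kb


def earliest_labeled_fragment(fragments: dict[str, tuple[float, float]]) -> str:
    """Return the fragment with the highest incorporation/kb.

    `fragments` maps a fragment name to (size in kb, raw signal).
    """
    return max(
        fragments,
        key=lambda name: incorporation_per_kb(fragments[name][1], fragments[name][0]),
    )


# Hypothetical values for illustration only; NOT data from this study.
example = {
    "frag_0.26kb": (0.26, 52.0),   # about 200 per kb
    "frag_1.0kb": (1.0, 100.0),    # 100 per kb
    "frag_2.0kb": (2.0, 150.0),    # 75 per kb
}
print(earliest_labeled_fragment(example))  # frag_0.26kb
```

In this toy input the shortest fragment has the lowest raw signal but the highest signal per kb, which is the situation the per-kb normalization is meant to expose.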
Stability of pYACneo Constructs Containing Origins and the A3/4
Consensus Sequence—After transfection of pYACneo (Clontech) constructs, including pYACneo with the A3/4 insert placed at the EcoRI
site, clones of HeLa cells that were resistant to G418 were maintained
in continuous culture. A fluctuation assay, as described previously (10),
was performed upon six independent HeLa cell clones maintained for
more than 40 cell doublings in medium containing 400 μg/ml G418.
Mutagenesis of A3/4 Version of the Consensus Sequence—The GeneMorph PCR mutagenesis kit (Stratagene) was used according to the
manufacturer’s instructions to introduce random mutations in the
36-bp consensus sequence known as A3/4. One of the original clones
containing A3/4 in pCRscript was used with T3 and M13 universal
primers to prepare the product for ligation into pCRscript at the SmaI
site. After transformation of DH5α competent cells (Invitrogen), numerous colonies were isolated and sequenced using the T7 primer in an ABI
Prism 3700 DNA Analyzer (PerkinElmer Life Sciences). Among more
than 100 clones examined, we found 38 variants with one or more
mutations in the 36-bp region comprising A3/4 (see Table VI). After
identification of these 38 variants, plasmid preparations were made
using Qiagen HiSpeed Mini or Midi Kits. Then, an equimolar pool that
contained DNA from the 38 variants plus A3/4 was used to transfect
HeLa cells, as described above. After isolation of the low molecular
weight DNA fraction by Hirt lysis, the DNA was digested with DpnI to
remove unreplicated DNA. The digested DNA pool was then used to
transform competent bacterial cells, and colonies containing replicated
plasmid DNA were isolated. The sequence of plasmid DNA from 60 such
clones was obtained using an ABI Prism 3700 DNA Analyzer
(PerkinElmer Life Sciences) (see Table VI).
RESULTS
Consensus Sequence Derivation—A consensus sequence was
derived from autonomously replicating sequences associated
with α-satellite sequences that had been isolated previously
from African green monkey CV-1 cells (ors14 and ors23) (7) and
from autonomously replicating DNA associated with α-satellite
sequences (F5 and F20) obtained as anticruciform antibody
affinity-purified DNA from normal human skin fibroblasts (9).
We used a reiterative process between pairs of African green
monkey and human sequences to minimize derivation of an
α-satellite consensus. We compared ors14 to F5 and
F20, and ors23 to F5, using PILEUP (GCG software) to identify
regions suitable for generating a
consensus sequence, using CONSENSUS with a certainty level
of 75%. Using these four sequences and minimizing α-satellite
repetitive sequence, we derived a 36-bp consensus: CCTMDAWKSGBYTSMAAWTWBCMYTTRSCAAATTCC.2

TABLE I
Recovery of autonomously replicated sequences
Name    No. times(b)    Sequence recovered(a)
A3/4    5               CCTCAAATGGTCTCCAATTTTCCTTTGGCAAATTCC
A6      3               CCTAAATTGGTCTGCAAATTGCATTTAGCAAATTCC
A7      2               CCTAGATTGGCTTGAAATTTTCCCTTACCAAATTCC
A15     2               CCTCAATTGGTTTCCAATCAGCATTTAGCAAATTCC
A16     2               CCTCGATGGGTTTGCAAATTCCCCTTAGCAAATTCC
A1      1               CCTAGAAGCGGTTCCAATTTGCATTTAGCAAATTCC
A5      1               CCTCAATTGGTTTCCAAATATCACTTGGCAAATTCC
A39     1               CCTCTAATGGGTTGCAATCTGCATTTAGCAAATTCC
(a) 36-bp consensus sequence recovered after autonomous replication in HeLa cells.
(b) No. of times the sequence was detected from 17 isolates.

TABLE II
Fisher's exact test of autonomously replicating sequences
Results are p = 0.000153.
                  No. with one duplicate(a)    No. with no duplicate(b)    Total
Replicated(c)     14                           3                           17
Random pick(d)    0                            8                           8
Total             14                           11                          25
(a) No. of consensus sequence clones recovered which have at least one identical clone in the assessed population.
(b) No. of consensus sequence clones recovered which have no other example in the assessed population.
(c) Population of consensus sequence clones that were recovered as having been replicated in HeLa cells.
(d) Population of consensus sequence clones that were randomly picked from pool of clones generated from degenerate oligonucleotide mixture.
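To make the degenerate notation concrete, the sketch below checks a candidate 36-mer against the consensus position by position, using the nucleotide code given in footnote 2. It is a minimal illustration; the helper name `mismatches` is an assumption introduced here and is not the PILEUP/CONSENSUS procedure used to derive the sequence.

```python
# IUPAC nucleotide code: each symbol maps to the bases it permits.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "M": "AC", "R": "AG", "W": "AT", "S": "CG", "Y": "CT", "K": "GT",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

CONSENSUS = "CCTMDAWKSGBYTSMAAWTWBCMYTTRSCAAATTCC"
A3_4 = "CCTCAAATGGTCTCCAATTTTCCTTTGGCAAATTCC"  # A3/4 version (Table I)


def mismatches(seq: str, consensus: str = CONSENSUS) -> list[int]:
    """Return 1-based positions where `seq` is not permitted by `consensus`."""
    if len(seq) != len(consensus):
        raise ValueError("sequence and consensus must have the same length")
    return [
        pos
        for pos, (base, symbol) in enumerate(zip(seq.upper(), consensus), start=1)
        if base not in IUPAC[symbol]
    ]


print(mismatches(A3_4))  # [] : every position of A3/4 is permitted
```

An empty list means the sequence is one of the versions encoded by the consensus; a non-empty list gives the positions that would need to change.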
2 Further details of sequence and method to generate the consensus
are available upon request. Nucleotide code: M = A or C; D = A, G, or
T; W = A or T; K = G or T; S = C or G; B = C, G, or T; Y = C or T; R =
A or G; H = A, C, or T; V = A, C, or G; N = A, C, G, or T.

TABLE III
Examples of CpG island DNA homology to the consensus sequence
Locus(a)      Accession no.(a)    Nucleotide(b)    Length(c)    Gaps(c)    Homology(c) (%)
HS8F7F(d)     Z66331              97–132           36           0          89
HS43D5F       Z61072              97–132           36           0          89
HS171C10F     Z57320              96–130           35           0          89
HS71E12R      Z62676              93–128           36           0          83
HS17D2f       Z54973              2–25             24           0          83
HS77C2R       Z63037              90–123           34           0          76
HS14C3R(d)    Z59323              80–114           35           1          97
HS36D10R      Z60841              96–130           35           1          97
HS30G4R       Z58181              95–129           35           1          100
HS8F11R       Z63768              95–129           35           1          100
HS12B11F      Z56579              97–131           35           1          100
HS28C6R       Z55240              95–129           35           1          100
HS18H3F       Z57696              97–131           35           1          100
HS90G9F       Z63828              97–131           35           1          100
HS37A8R       Z55373              95–129           35           1          97
HS173D8R      Z64869              76–130           35           1          97
(a) Locus and accession no. from GenBank for CpG island sequences from Cross et al. (30).
(b) Nucleotide position of homologous sequence.
(c) Length of homologous sequence, no. of gaps in homologous sequence, and percent homology to the consensus sequence.
(d) HS8F7F corresponds to 6K, and HS14C3R corresponds to CP9.

TABLE IV
Comparison of the autonomous replication activity of different versions of the consensus sequence in different species
Autonomous replication activity was determined either by BrdUrd incorporation or by bacterial retransformation, as described under “Results”
and “Experimental Procedures.” Cells used were: human (HeLa and WI38); avian (chicken embryo fibroblasts; CEF); bovine embryo kidney
fibroblasts (BEKF); and murine (mouse 3T3).
         BrdUrd incorporation                             Bacterial retransformation
Clone    Human (HeLa)    Avian (CEF)    Bovine (BEKF)     Human (WI38)    Murine (3T3)
CP9      +               +              ND(a)             +               +
6K       −               −              +                 −               +
A16      +               +              ND                +               +
A3/4     +               ND             ND                +               +
30.4     −               −              −                 −               −
(a) ND, not determined.

Recovery of Autonomously Replicating Sequences—After synthesis of oligonucleotides containing T3, T7, and M13 reverse
primers bracketing the consensus sequence, the mixed pool of
oligonucleotides was amplified by PCR using the primers and
ligated into pCRscript using the SrfI restriction site. The ligation pool was used in transfection of HeLa cells and as a
template in an in vitro DNA replication system using HeLa cell
extracts as a source of replication proteins (11). (Previously, we
have shown that in this in vitro replication system, initiation is
site-specific and maps to the same site as in vivo (8, 11, 19).) To
eliminate unreplicated DNA, the DNA recovered after transfection into HeLa cells or in vitro replication was digested with
DpnI. Then, the pool of DNA products from both replication
systems, presumably containing some replicated DpnI-resistant plasmid + consensus inserts, was used to transform competent bacteria and obtain bacterial clones of versions of the
consensus sequence capable of autonomous replication. All bacterial clones recovered after selection with ampicillin were
found to contain plasmid constructs as identified by agarose gel
electrophoresis of plasmid DNA preparations. Seventeen independent
clones were sequenced and shown to contain various versions of
the consensus with appropriate flanking sequence. Table I
summarizes the consensus sequences recovered. Within the 17
clones, some versions of the consensus were represented multiple times; the A3/4 version of the consensus sequence was represented 5 times in this group of 17 clones.
When these 17 sequences were compiled to generate another
consensus sequence, we derived the same 36-bp consensus used
to generate the oligonucleotide mixture used in this study.
Only 3 clones, A1, A5, and A39 (Table I), were uniquely represented in the group of 17. We next asked whether the assortment of the 17 sequences could be considered as different from
randomly selected clones, i.e. without selection by replication in
our in vitro system. We randomly picked and sequenced 8
clones transformed with an aliquot of the same ligation pool,
but not subjected to replication in vitro. All 8 clones were found
to be unique versions with no multiple representations of any
single version. As shown in the Fisher's exact test of autonomously replicating sequences (Table II), it is unlikely that
the distribution of the 17 independent clones recovered after replication in the in
vitro system could be attributed to random sampling of the
available plasmid + consensus insert versions (p =
0.000153). In other words, the library from which autonomously replicating clones were obtained was not limited and
did not have any apparent overrepresentation of the replicating
clones.
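The reported p value can be reproduced from the 2 × 2 counts in Table II (replicated clones: 14 with a duplicate, 3 without; randomly picked clones: 0 with a duplicate, 8 without). The following is a minimal sketch of a two-sided Fisher's exact test built from the hypergeometric distribution; the function names are assumptions introduced here, and the original analysis software is not stated in the paper.

```python
from math import comb


def table_probability(a: int, b: int, c: int, d: int) -> float:
    """Hypergeometric probability of the table [[a, b], [c, d]] with fixed margins."""
    n = a + b + c + d
    return comb(a + b, a) * comb(c + d, c) / comb(n, a + c)


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Sum the probabilities of all tables no more likely than the observed one."""
    row1, row2, col1 = a + b, c + d, a + c
    p_observed = table_probability(a, b, c, d)
    total = 0.0
    for x in range(max(0, col1 - row2), min(row1, col1) + 1):
        p = table_probability(x, row1 - x, col1 - x, row2 - (col1 - x))
        if p <= p_observed * (1 + 1e-9):  # small tolerance for float ties
            total += p
    return total


p = fisher_exact_two_sided(14, 3, 0, 8)
print(f"{p:.6f}")  # 0.000153
```

For these counts the observed table is the most extreme one possible, so the one-sided and two-sided values coincide at 680/4,457,400.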
Occurrence of Homologs in Data Base Sequences—A search
for homologs in the GenBank data base revealed significant
similarity to a number of sequences, including some in regions
wherein origins of DNA replication have been mapped. None of
the sequences within such regions were 100% similar to the
36-bp consensus. Homologs varied in similarity from 71 to 88%
over 21 to 35 bp in initiation regions for c-myc (20), lamin B2
(21), NOA3 (22), β-globin (23), IgM μ chain enhancer (24), heat
shock protein 70 (25), the Chinese hamster ovary dhfr (26), and
the rodent RPS14 (27). In vivo footprinting of the lamin B2
origin region revealed that an area of 70 bp was protected on
one strand (28). More recently, a 79% homology match over 24
bp of the 36-bp consensus sequence was observed at the human
lamin B2 origin site and was mapped to the 5′→3′ strand at
a predicted bidirectional start site, position 3933 (29).
Among the most interesting homologs were those present in
sequences isolated and characterized as CpG islands (30). CpG
islands are regions of about 1 kb that are GC-rich (65%) and
occur in association with promoters of about 50% of all mammalian genes (31, 32). Replication origins have also been detected at several promoters, including those for the c-myc gene
(20, 33, 34), the Hsp70 gene (25), the ppv1 gene at the 3′-end of
lamin B2 (21), and rat aldolase B gene (35). Based upon this
background, Delgado et al. (36) showed that CpG islands are
initiation sites for both transcription and DNA replication.
Over 36 bp, several sequences, identified as CpG islands (30),
had 83 and 89% homology; and with the allowance of a single
base gap over 35 bp, several CpG island sequences contained
homologs of 100% similarity (Table III). A bidirectional origin
of replication was mapped 3′ to the chicken lysozyme locus
within a CpG island (37). Analysis of the 600-bp region within
which the initiation site resides showed a 70% homology to the
consensus over 24 bp.
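The percent homology figures quoted above are match counts over an aligned window (for example, 19 matching positions over 24 bp gives 79%). As an illustration of how such a figure can be obtained, the sketch below slides the 36-bp degenerate consensus along a target sequence and reports the best ungapped window. This is a simplification and an assumption on our part: the data base searches in this study allowed gaps and partial-length matches, which this sketch does not, and the function name is hypothetical.

```python
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "M": "AC", "R": "AG", "W": "AT", "S": "CG", "Y": "CT", "K": "GT",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

CONSENSUS = "CCTMDAWKSGBYTSMAAWTWBCMYTTRSCAAATTCC"


def best_ungapped_window(target: str, consensus: str = CONSENSUS) -> tuple[int, float]:
    """Return (0-based offset, percent identity) of the best full-length window."""
    width = len(consensus)
    if len(target) < width:
        raise ValueError("target is shorter than the consensus")
    best_offset, best_pct = 0, -1.0
    for offset in range(len(target) - width + 1):
        window = target[offset:offset + width].upper()
        hits = sum(base in IUPAC[symbol] for base, symbol in zip(window, consensus))
        pct = 100.0 * hits / width
        if pct > best_pct:  # keep the first window that reaches the maximum
            best_offset, best_pct = offset, pct
    return best_offset, best_pct


# The A3/4 version (Table I) embedded in arbitrary flanking bases.
target = "GGGG" + "CCTCAAATGGTCTCCAATTTTCCTTTGGCAAATTCC" + "TTTT"
print(best_ungapped_window(target))  # (4, 100.0)
```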
FIG. 1. Semiconservative autonomous replication of consensus clone A3/4 (A) and consensus clone A16 (B) in HeLa cells. The
relative DNA content (left ordinate and bars) in individual CsCl fractions (abscissa) as assessed by Southern dot blot analysis (below abscissa) is shown. The refractive index (right ordinate, diamonds), linearity of the gradient, and regions corresponding to HH, double-stranded
substitution, HL, single-stranded substitution, and LL input DNA
are indicated. DNA detected between regions is partially replicated,
BrdUrd-substituted LL and HL DNA.
Autonomous Replicating Activity of Consensus Versions—
Using the BrdUrd semiconservative replication assay, as described previously (6, 7, 38), we examined a
number of plasmid clones for autonomously replicating activity
after transfection into HeLa cells, bovine embryo kidney fibroblasts, and chicken embryo fibroblasts (Table IV). After one
round of semiconservative replication, the DNA product should
be of heavy-light (HL) hybrid density, whereas two or more
rounds of replication should yield fully substituted DNA of
heavy-heavy (HH) strands. Plasmids containing the A3/4 and
A16 versions of the consensus sequence (for sequence, see Table I) exhibited efficient autonomous semiconservative replication (Fig. 1). For both plasmid clones shown, a peak of unreplicated (LL) DNA was recovered at the top of each gradient
(Fig. 1A, region including fractions 22–24; and Fig. 1B, region
including fractions 19 –24). Additional peaks of replicated, HH
DNA were also obtained, indicating two or more rounds of
replication, respectively (Fig. 1, A and B), with some potential
HL DNA in Fig. 1A. The linearity of each gradient was verified
by measuring the refractive index of every other fraction (Fig.
1, A and B). As negative control, a plasmid vector (pCRscript)
alone was transfected in separate flasks, for which only LL
DNA was recovered (data not shown; examples of negative
controls can also be found elsewhere in Refs. 6, 7, 12, 38, and in
Fig. 4, A and C, below).
FIG. 2. Earliest labeled fragment analysis of consensus clone
A3/4. Solid bars, 4 min, and gray bars, 8 min of reaction time in the in
vitro DNA replication system. Restriction fragment size (in kb) is indicated on the abscissa with the restriction map for DdeI (D) and PvuII
(P) digests. The asterisk indicates the location of the fragment containing the A3/4 consensus
sequence. Relative incorporation/kb is indicated on
the ordinate.
To confirm that the most frequently represented version of
the consensus sequence could indeed initiate DNA replication,
we performed an earliest labeled fragment assay on the in vitro
replication (19) of a plasmid containing the A3/4 sequence. As
shown in Fig. 2, the highest amount of incorporation of radioactive nucleotides appeared to occur within a 260-bp fragment,
indicating that initiation occurred in that fragment. This fragment contains the 36-bp A3/4 consensus version plus the flanking
sequence (consensus + primers and restriction enzyme
sites is 104 bp total). Vector alone consistently shows no preferential initiation site(s), with any visible incorporation attributable to random repair of damage sites in the template vector
(11, 18).
Two clones of CpG islands (30), CP9 (HS14C3R locus) and 6K
(HS8F7F locus), which contain versions of the consensus sequence, were also tested for autonomous replicating activity by
the semiconservative BrdUrd incorporation assay after transfection into HeLa cells (Fig. 3). Only CP9 (Fig. 3A) was able to
support autonomous replication after transfection into HeLa
cells readily, with high amounts of both HH and HL replicated
plasmid DNA, whereas clone 6K (Fig. 3B) was incapable of
autonomous replication because all plasmid was recovered as
unreplicated (LL).
FIG. 3. Semiconservative autonomous replication of CpG island clone CP9 (A) and CpG island clone 6K (B) in HeLa cells.
The ordinate is the percent relative DNA content where 100% is taken
as the highest density value from Southern dot-blots from the fractions.
See also the Fig. 1 legend.
We also tested whether versions of the consensus sequence
might support autonomous replication of plasmid DNA after
transfection into eukaryotic cells of other species. Bovine embryo fibroblasts and chicken fibroblast cells were transfected
with plasmid DNA containing different versions of the consensus sequence (Fig. 4). A negative control plasmid (clone 30.4;
Fig. 4A) and the CpG island clone 6K containing the consensus
(Fig. 4B) were tested for autonomous replication activity in
bovine embryo kidney fibroblasts. The 6K plasmid clone
showed strong autonomous replication activity with high
amounts of both HL and HH DNA being recovered (Fig. 4B)
relative to plasmid clone 30.4 (Fig. 4A), in which the majority of
DNA was recovered as unreplicated (LL). In Fig. 4, C and D,
clone 30.4 and the CP9 consensus clone, respectively, were
tested for autonomous replication activity in chicken embryo
fibroblasts. Again, in these cells, clone 30.4 exhibited a very low
(background) level of replication (Fig. 4C), whereas clone CP9
exhibited efficient autonomous replicating activity, with large
amounts of HH and HL DNA being recovered (Fig. 4D). Finally,
the autonomously replicating activity of clone A16 (see Table I)
in chicken embryo fibroblasts (Fig. 5A) was compared with that
of clone 6K (Fig. 5B). Although clone A16 was able to replicate
autonomously in chicken embryo fibroblasts, the activity of
clone 6K in these cells was very low.
We next assessed the ability of the different versions of the
consensus sequence (i.e. A3/4, 6K, A16, and CP9) to replicate in
normal human cells (WI38 embryo lung fibroblasts) and in
immortal murine fibroblasts (3T3 cells). For this, we used the
DpnI resistance assay to detect plasmid DNA replicated in
mammalian cells that lack a deoxyadenosine methylase (dam)
gene, making the replicated DNA resistant to digestion with
DpnI. After low molecular weight DNA preparations from Hirt
lysates of transfected cells were digested with DpnI, the DNA
was used to transform bacteria, as an indicator of autonomous
replication potential as previously described (10). As shown in
Fig. 6, both mouse 3T3 cells and normal human (WI38) cells,
supported the replication of the consensus sequence variants,
compared with a negative control plasmid, 30.4. The most
efficient replication was observed with the A3/4 consensus version plasmid in both mouse and human cells. 6K also demonstrated autonomous replicating activity in 3T3 cells, but not in
human cells, consistent with the results obtained after its
transfection into HeLa cells, using the semiconservative assay
for autonomous replication (Fig. 3B). A16 and CP9 also replicated autonomously in both mouse and human cells, albeit with
much lower efficiency than A3/4 or 6K (Fig. 6); both A16 and
CP9 replicated with higher efficiency in mouse 3T3 cells than
in human (WI38) cells. Table IV summarizes these results.
Finally, a double-stranded oligonucleotide of 40 bp (TTTTTTTTTTCCAATGATTTGTAATATACATTTTATGACT),
spanning the region inclusive of the lamin B2 origin and start
site (29) with homology to the consensus sequence (see “Occurrence of Homologs in Data Base Sequences”) was cloned into
pBluescript II. This plasmid was then tested for its ability to
support autonomous replication in HeLa cells by the DpnI
resistance bacterial retransformation assay, as described previously (7, 9, 10). In preliminary experiments, the sequence
inclusive of the lamin B2 start site (107 ± 36 colonies/plate,
mean ± S.D. of three plates) was found to support autonomous
replication as efficiently as did the 36-bp A3/4 consensus version cloned into pBluescript II (70 ± 17 colonies/plate; background using a plasmid without consensus or lamin B2 sequence was 7 ± 2 colonies/plate).
Stability of pYACneo Constructs Containing the A3/4 Consensus Sequence—The A3/4 version of the consensus sequence
was subcloned from the pCRscript clone into the EcoRI restriction site of pYACneo. After transfection into HeLa cells, independent clones were selected with G418 and maintained continuously in culture in the presence of 400 μg/ml G418, as
described previously (10). After >170 cell doublings, one of the
clones was labeled with BrdUrd and then low molecular weight
episomal DNA was recovered. The DNA was loaded onto a CsCl
gradient, and fractions were collected, blotted, and hybridized
with pYACneo containing the A3/4 insert. As shown in Fig. 7,
there is an absence of the usual high amount of unreplicated
(LL) DNA present in short term (2–3 days after transfection)
assays. There are additional peaks of replicated, HL and HH
DNA, indicative of continuing efficient semiconservative replication of this episome in the HeLa cells. As before, the linearity
of the gradient was verified by measuring the refractive index
of every other fraction (Fig. 7). As a negative control, the
pYACneo vector alone was transfected and monitored in parallel in separate flasks, for which only LL DNA was recovered
(data not shown). Table V summarizes the results; for comparison, the data from previous fluctuation tests of short mammalian origin sequences maintained as HeLa episomes are also
shown. HeLa cells transfected with pYACneo alone yielded
stable cell clones (three of three) that had integrated the plasmid into the genomic DNA. pYACneo was not observed to be
maintained as an episome. As can be seen for all six independent clones, there was no integration of plasmid and the pYACneo + A3/4 construct (A3/4 in pYACneo) was maintained as an
episome. Furthermore, the stability of the episome in the absence of selection was found to be ~0.9/cell/generation compared with the nonepisomally maintained (integrated) plasmids that had a stability of 1.0/cell/generation. Low molecular
weight episomal DNA was used to obtain bacterial transformants; the plasmid DNA recovered in three independent clones
was tested with several different restriction enzymes to indicate any apparent rearrangements. For example, digests of
DNA with AvaI and HindIII enzyme gave the predicted fragment size of DNA from each of three independent plasmid
clones recovered from two of the six independent HeLa cell
clones that contain only nonintegrated episomal pYACneo +
A3/4 DNA (Fig. 8 and Table V).
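To give a sense of what a stability of ~0.9/cell/generation implies, the sketch below assumes that the figure can be read as the per-generation probability that a cell retains the episome once selection is removed, so that retention decays geometrically. That reading, and the function names, are assumptions made here for illustration; the fluctuation assay itself is described in Ref. 10.

```python
import math


def fraction_retaining(stability: float, generations: int) -> float:
    """Expected fraction of cells still carrying the plasmid after n generations."""
    return stability ** generations


def generations_to_fraction(stability: float, fraction: float) -> float:
    """Generations needed for the plasmid-bearing fraction to fall to `fraction`."""
    if not 0.0 < stability < 1.0:
        raise ValueError("stability must lie strictly between 0 and 1")
    return math.log(fraction) / math.log(stability)


print(round(fraction_retaining(0.9, 10), 3))         # 0.349
print(round(generations_to_fraction(0.9, 0.5), 1))   # 6.6
```

Under this simple model an integrated plasmid (stability 1.0) is never lost, whereas an episome at 0.9 would be retained by roughly a third of cells after 10 generations without selection.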
FIG. 4. Semiconservative autonomous replication of control plasmid clone 30.4 (A) and CpG island clone 6K (B) in bovine embryo kidney cells is
shown. Semiconservative autonomous replication of control plasmid clone 30.4 (C) and CpG island clone CP9 (D) in chicken embryo fibroblasts is
also shown. See the Fig. 3 legend.
Autonomous Replication of Consensus Sequence Containing
Plasmids in Drosophila Cells—The apparent activity of versions of the consensus sequence across many species, including
the taxonomic classes Mammalia and Aves (see Table IV),
caused us to ask whether versions of the consensus sequence might be active across the phyla Chordata and Arthropoda (e.g. Drosophila melanogaster, an invertebrate). Because
homology (66–73%) to the consensus sequence was detected in Drosophila DNA,
we tested the ability of A3/4 to support
autonomous replication in D. melanogaster cells by the semiconservative
BrdUrd incorporation assay. After transfection of the
A3/4 version of the consensus sequence cloned in pCRscript
(pCRscript ⫹ A3/4) into Drosophila S2 cells, peaks of DNA near
the HL and HH positions of the gradient were recovered (Fig. 9,
open bars), indicative of autonomous replication, whereas the
negative control plasmid, 30.4, was replication-negative (Fig. 9,
solid bars). An unusual feature was the virtual absence (very
low level) of input (LL) DNA recovered from either the
(pCRscript ⫹ A3/4) or from clone 30.4 plasmids. Such a result
suggests that in Drosophila cells the input plasmids that are
not competent for replication were degraded rapidly.
Preliminary Mutagenesis Studies—To test further the potential of this consensus sequence in control of eukaryotic DNA
replication, preliminary mutagenesis studies of a version of the
consensus sequence were conducted. Random mutagenesis was
performed on the A3/4 version of the consensus sequence, resulting in 52 changes that occurred in 38 variant clones: 2 gaps;
8 pyrimidine to pyrimidine changes; 26 pyrimidine to purine
changes; 10 purine to purine changes; and 6 purine to pyrimidine changes. Only 5 bases (at positions 3, 13, 14, 16, and 22;
asterisks in Table VI) of the 36 bases had no change that was
detectable in any of the clones. A pool of equimolar amounts of
plasmid DNA obtained from each of the 38 clones plus A3/4 was
transfected into HeLa cells. Of the 38 variant clones, 10 clones
(representing those sequences obtained from more than a single bacterial colony) were recovered as resistant to DpnI and
having replicated in the HeLa cells. These clones were detected
by sequencing of DpnI-resistant plasmids isolated from each of
60 bacterial colonies. These 10 mutated versions of A3/4 plus
unmutated A3/4 were found among 47 bacterial colonies (Table
VI), whereas 13 additional clones were found represented in
only a single bacterial colony. Thus a total of 23 of the 38 variant clones were resistant to digestion by DpnI, indicating
that they had replicated autonomously in HeLa cells.
FIG. 5. Semiconservative autonomous replication of consensus clone A16 (A) and CpG island clone 6K (B) in chicken embryo fibroblasts. See the Fig. 3 legend.
FIG. 6. Comparison of consensus sequence variants, A3/4, 6K, A16, and CP9 with the control sequence 30.4 in a bacterial retransformation assay (9) after short term (3-day) culture subsequent to their transfection into either 3T3 cells (solid bars) or normal human embryonic lung fibroblast WI38 cells (open bars). The average number of DpnI-resistant colonies/plate was normalized to each other for each independent experiment (i.e. transfection into 3T3 cells or into WI38 cells) using cotransfection of luciferase expression plasmid to control for transfection efficiency from one flask of cells to another. >500 indicates plates that were estimated to carry up to twice 500 colonies but were in fact not countable. The bars denoting the number of DpnI-resistant colonies for 30.4 represent the background after DpnI digestion in these experiments.
For statistical analysis, a more stringent criterion for segregating the clones was used. The replicating, DpnI-resistant plasmids that were able to transform bacteria and were detected
among 60 bacterial colonies 1) more than once or 2) not at all
were compared with those that were 3) not detected as replicating or 4) detected in only a single bacterial colony of the 60.
For those clones represented more than once among the 60
bacterial colonies containing plasmid, the probability (Fisher's
exact test) that 1) the 20-bp region (positions 3–22; see line in
Table VI) in the 36-bp A3/4 sequence would be present in all of
the 10 variant clones as unmutated and 2) that there would be 17
clones with mutations in the 20-bp region or 11 clones with mutations outside the
region that were not represented more than once or at all is p <
0.03. (Note that there are two clones, clones 2 and 7, included as
unmutated in the 20-bp region because the mutations are permissive relative to the consensus sequence; see Table VI and
footnotes.) Thus, the 20-bp region has been identified as a putative minimal sequence that appears to be necessary.
If the 20-bp internal sequence (3–22 in the 36-bp consensus
sequence) is used for assessment of homology, the homology to
CpG islands as shown in Table III improves in most cases, with
many 100% homologies with no gaps (Table VII). There is only
one case in which a gap is now detected (accession no. Z54973).
For two regions mapped as initiation sites of DNA replication
in c-myc, there is between 75 and 89% homology to the 20 bp
over 18 bp to 20 bp (20, 34). The homology for the 20 bp to lamin
B2 adjoins the initiation site in the lamin B2 locus (21, 29). The
homology for the autonomously replicating sequence and origin
known as NOA3 (12, 22) and heat shock protein 70 (25) is also
shown in Table VII. An origin of DNA replication has been
reported for the Chinese hamster ovary dhfr locus (26), and
there are homologies of 89% present on each strand as located
within an autonomously replicating fragment, X24 (8).
We have used two 20-bp duplexes, each placed separately into the EcoRI site of pBluescript, to test for autonomous replication ability in HeLa cells using the DpnI resistance assay and bacterial retransformation. Clone 20 is identical to the relevant 20 bp of A3/4 except in the last position, in which there is a G instead of C. Clone 85 is identical to A3/4 except for inversion of the first 2 bases from TC in A3/4 to CT in the 20-mer clone (see Table VI). 20-mer clone 20 gave 85 ± 16 (S.D.); 20-mer clone 85 gave 54 ± 14; and A3/4 gave 63 ± 13 bacterial colonies/plate. (Background, pBluescript vector alone (21 ± 7), was subtracted from these values.) We also tested two examples of mutations in the internal 20-bp sequence of the A3/4 sequence that were not recovered among the replicating clones. Mutated clone A1 has a nonpermissive change from G to A at position 9 of the 36-bp consensus, and mutated clone C2 has a change from A3/4 of T to C at position 11 of the 36-bp consensus sequence. In autonomous replication experiments, A3/4 gave 41 ± 7 and clone 85 gave 42 ± 6, whereas mutated clone A1 and mutated clone C2 gave 0 ± 0 and 1 ± 1, respectively, indicating that these changes within the 20 bp of the consensus greatly affect replication activity. In other preliminary experiments, both 20-mer clones (clone 20 and clone 85) competed with the 36-bp A3/4 sequence for OBA/Ku86 binding (41–43) (data not shown).
FIG. 7. Semiconservative autonomous replication of A3/4 consensus sequence cloned into the EcoRI site of pYACneo and maintained as episomal DNA in HeLa cells (HeLa cell clone A9) for >170 cell doublings under selection with G418. Cells were labeled with BrdUrd, and low molecular weight episomal DNA was isolated and run on a cesium chloride gradient. See also the Fig. 1 legend.
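Whether a given base change is permissive relative to the degenerate consensus can be checked mechanically. The sketch below uses the standard IUPAC nucleotide codes together with the 20-bp consensus as printed in the footnote to Table VII and the A3/4 sequence from Table VI. The helper names (`is_permissive`, `mismatch_positions`) are hypothetical, and the check only asks whether a base is allowed by the consensus code at that position; it makes no prediction about replication activity.

```python
# Standard IUPAC nucleotide codes mapped to the bases they allow.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

CONSENSUS_20 = "TMDAWKSGBYTSMAAWYWBC"  # positions 3-22, as printed under Table VII
A34_20 = "TCAAATGGTCTCCAATTTTC"        # positions 3-22 of A3/4 (Table VI)
REGION_START = 3                       # 36-bp position of the first base of the 20-mer


def is_permissive(position_36: int, base: str) -> bool:
    """True if `base` is allowed by the consensus code at a 36-bp position (3-22)."""
    code = CONSENSUS_20[position_36 - REGION_START]
    return base.upper() in IUPAC[code]


def mismatch_positions(seq_20: str) -> list:
    """36-bp positions at which a 20-mer departs from the degenerate consensus."""
    if len(seq_20) != len(CONSENSUS_20):
        raise ValueError("expected a 20-base sequence")
    return [
        i + REGION_START
        for i, (base, code) in enumerate(zip(seq_20.upper(), CONSENSUS_20))
        if base not in IUPAC[code]
    ]


if __name__ == "__main__":
    print("A3/4 mismatches:", mismatch_positions(A34_20))
    print("A1 (G to A at position 9) permissive:", is_permissive(9, "A"))
    print("C2 (T to C at position 11) permissive:", is_permissive(11, "C"))
    clone_20 = A34_20[:-1] + "G"  # last base G instead of C, as described for clone 20
    print("clone 20 mismatches:", mismatch_positions(clone_20))
```

With the consensus as printed, the G to A change in clone A1 falls outside the code S at position 9, whereas the T to C change in clone C2 remains within the code B at position 11, even though both clones were reported as replication-negative.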
Distribution of 20-Mer Consensus Sequence on Human Chromosomes—The distribution of the 20-mer consensus sequence
over 1 Mb of continuous human genomic sequence for chromosomes 1, 20, 21, and 22 was examined using fuzznuc of the
EMBOSS suite of software. The results shown in Table VIII
were obtained for up to two through five mismatches (90
through 75% homology) allowed with no gaps. Under these
conditions, two mismatches gave a range of 19–51 homologs,
whereas more mismatches rapidly increased the number of homologs to a maximum of 9,101–12,597 for five mismatches, no
gaps. The distribution on the DNA +/− strands was approximately equal. For the allowance of two mismatches and assuming an equal distribution, initiation sites would be spaced from
~20 kb to ~50 kb apart. However, as demonstrated in Fig. 10,
the distribution is not equal and can vary from ≤1,000 bases to
≥200 kb. A comparison with the distribution of the Saccharomyces cerevisiae ARS core consensus sequence, WTTTATRTTTW,
using fuzznuc with no mismatches for chromosomes IV, VII, XII, and
XV over the first 1 Mb of sequence as obtained from the Saccharomyces Genome Data base,3 indicated 25/23 (+/−), 20/23, 20/25,
17/19 homologs, respectively. The total number of homologs, ranging from 36 to 48, compares favorably with that of the 20-mer consensus sequence on 1 Mb of human chromosomes.
The homologs of the 20-mer sequence did not overlap any CpG islands (UCSC Genome Browser v17; >200-base length, >0.5 GC content, and a ratio of >0.60 for the observed proportion of CG dinucleotides to the expected proportion on the basis of the GC content of the segment); that is, 0/13 in chromosome 1, 0/6 in chromosome 20, 0/1 in chromosome 21, and 0/11 in chromosome 22. For α-satellite sequence, only the 1 Mb of sequence at chromosome 22q11.1 contained an example of α-satellite that overlapped 3 homologs, corresponding to those marked with the 3 above the vertical bar in Fig. 10. Homologs (2 mismatches, no gaps), using BESTFIT of the GCG suite of software, of the 20-mer sequence to α-satellite and centromere sequence of the individual chromosomes were 4/6 in chromosome 1, 1/10 in chromosome 20, 4/10 in chromosome 21, and 1/10 in chromosome 22.
3 K. Dolinski, R. Balakrishnan, K. R. Christie, M. C. Costanzo, S. S. Dwight, S. R. Engel, D. G. Fisk, J. E. Hirschman, E. L. Hong, L. Issel-Tarver, A. Sethuraman, C. L. Theesfeld, G. Binkley, C. Lane, M. Schroeder, S. Dong, S. Weng, R. Andrada, D. Botstein, and J. M. Cherry, ftp://genome-ftp.stanford.edu/pub/yeast/SacchDB/ March 11, 2003 (date of access).
TABLE V
Stability of consensus sequence constructs in comparison to other mammalian origins
Host            Clone                          Integrated   Episomal   Stability (a)
HeLa            YACneo (3 clones)              +            −          1.0
HeLa            YACS3 (1 clone) (b)            −            +          0.8
HeLa            YACS3 (1 clone) (b)            +            ? (c)      1.0
HeLa            Y343 (1 clone) (b)             +            −          1.0
HeLa            Y343 (3 clones) (b)            −            +          0.8–0.9
HeLa            X24 (1 clone) (b)              +            ? (c)      1.0
HeLa            A3/4 in pYACneo (6 clones)     −            +          0.9
HeLa            Linear YACneo (1 clone) (b)    +            −          1.0
HeLa            Linear Y343 (1 clone) (b)      +            −          1.0
S. cerevisiae   Circular ARS plasmid (d)       −            +          0.7
S. cerevisiae   Linear ARS plasmid (d)         −            +          0.8
S. cerevisiae   CEN-containing YAC (d)         −            +          0.999
Any host        Integrated DNA (d)             +            −          1.0
(a) Stability per division is the chance per cell division that a daughter cell will inherit the selectable marker (10).
(b) Data summarized from Nielsen and co-workers (10). S3 (9, 22), 343 (12, 13), and X24 are sequences that contain origins of DNA replication. X24 contains the bidirectional origin of DNA replication, oriβ, from the dhfr gene (8).
(c) ? indicates that the presence of episomal constructs was not analyzed due to the detection of an integrated copy.
(d) Data from Murray and Szostak (39) and Hahnenberger et al. (40).
FIG. 8. Restriction enzyme digests of DNA of plasmid clones recovered from episomal DNA of HeLa cell clones A9 and A37. AvaI (left panel) and HindIII (right panel) digests are displayed after separation by agarose gel electrophoresis. Arrows indicate the predicted sizes to be obtained from the plasmid DNA of pYACneo + A3/4 sequence. See also the Fig. 7 legend and Table IV.
FIG. 9. Semiconservative autonomous replication of A3/4 consensus sequence compared with the control sequence 30.4 in Drosophila S2 cells. Open bars indicate pCRscript + A3/4 sequence versus solid bars indicating 30.4 plasmid. See also the Fig. 1 legend.
DISCUSSION
Identification of the yeast ARS consensus was aided by the
availability of a large number of minimal origin sequences
defined by the ARS plasmid assay (44). We report here the
identification and testing of a putative consensus sequence
capable of identifying initiation sites (origins) of DNA replication. We used four autonomously replicating sequences containing α-satellite sequence and a reiterative process between
pairs of African green monkey and human sequences to minimize derivation of an α-satellite consensus. The resultant consensus sequence was 36 bp. Most importantly, we took the
opportunity to enrich for versions of the consensus sequence by
using a mixed pool of plasmids bearing versions of this consensus as template in a mammalian in vitro replication system
and then selecting for plasmids that had been replicated in the
system. Because analysis of 17 clones containing eight different
versions of the consensus regenerated the same consensus,
when viewed one nucleotide at a time across the 36 bp, we
believe that in this context and in these functional assays, the
36-bp consensus sequence will be useful in experiments of
eukaryotic core origin sequences. Although homologies could be
observed for stretches of sequence within initiation regions of
defined origins of replication, the length of homology and degree of similarity were not complete. We have been successful
in using the consensus sequence to predict probable regions
that may contain autonomous replication activity and origins
at the γ-aminobutyric acid receptor subunit β3 and α5 gene
cluster (45, 46) and at the dnmt1 (human DNA methyltransferase) locus, wherein binding to Ku86 was also verified (47). In
a FASTA search of the GenBank data base, very significant
homology was observed with certain CpG island clones (30), up
to 89% over 36 bp and with the allowance of a single base gap,
100% over 36 bp. Using the internal 20-bp sequence from the 36
bp, the homology was improved with gaps being eliminated,
except in one case (Table VII). Consistent with this observation is the report by Delgado et al. (36), which demonstrated initiation of DNA replication within CpG islands. Homologies to other fragments of DNA containing origins of replication were also, in general, improved (Table VII).
TABLE VI
Mutagenesis and functional assay of A3/4 version of the consensus sequence
The combined sequence of all 38 variants and their individual mutations are given to indicate the departures from the original A3/4 sequence at each position. An underline under a letter indicates a gap in a variant clone at this position. * indicates no change in any of mutated clones or a conserved/permissible change relative to the consensus sequence. A dash indicates base positions that were not included in the 20-bp region of conserved sequence in the 10 replicating clones that were detected. Mut. clone indicates clonal designation followed by the number of times among 60 bacterial colonies that the clone was represented.
Mutagenesis Summary
Base No.                                1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36
Consensus sequence                      C  C  T  M  D  A  W  K  S  G  B  Y  T  S  M  A  A  W  Y  W  B  C  M  Y  T  T  R  S  C  A  A  A  T  T  C  C
A3/4 sequence                           C  C  T  C  A  A  A  T  G  G  T  C  T  C  C  A  A  T  T  T  T  C  C  T  T  T  G  G  C  A  A  A  T  T  C  C
All 38 individual mutations combined    Y  M  T  Y  M  R  M  D  R  V  H  M  T  C  H  A  W  T  W  N  W  C  H  K  W  K  T  T  B  W  V  R  K  W  H  M
Conserved/unchanged*                          *                             *  *     *                 *
Region unchanged in 10 replicating      –  –  T  C  A  A  A  T  G  G  T  C  T  C  M  A  A  T  T  W  T  C  –  –  –  –  –  –  –  –  –  –  –  –  –  –
  clones from A3/4 and consensus
Mutated and replicating clones
A3/4, 7 colonies                        C  C  T  C  A  A  A  T  G  G  T  C  T  C  C  A  A  T  T  T  T  C  C  T  T  T  G  G  C  A  A  A  T  T  C  C
Mut. clone 1, 7 colonies                T  A
Mut. clone 2, 7 colonies                A  T
Mut. clone 3, 5 colonies                G
Mut. clone 4, 5 colonies                A  A
Mut. clone 5, 4 colonies                G  T
Mut. clone 6, 3 colonies                A
Mut. clone 7, 3 colonies                A  G G
Mut. clone 8, 2 colonies                A
Mut. clone 9, 2 colonies                A  C
Mut. clone 10, 2 colonies               A
TABLE VII
Examples of DNA homology to 20 bp of the consensus sequence
Accession no. (a)   Nucleotide (b)   Length (c)   Homology (c), %
Z66331              99–118           20           95
Z61072              99–118           20           100
Z57320              98–117           20           95
Z62676              95–114           20           90
Z54973              9–29             20           95 (d)
Z63037              256–274          19           84
Z59323              82–101           20           100
Z60841              97–116           20           100
Z58181              97–116           20           100
Z63768              97–116           20           100
Z56579              99–118           20           100
Z55240              97–116           20           100
Z57696              99–118           20           100
Z63828              99–118           20           100
Z55373              97–116           20           95
Z64869              99–117           19           100
HUMMYCC             1877–1896        20           75
HUMMYCC             4918–4936        18           89
HUMLAMBBB           3910–3929        20           75
HSHSP70A            356–374          19           84
HSAUTONF            179–196          18           89
CGDHFRORI           2470–2487        18           89
CGDHFRORI           3806–3823        18           89
(a) Accession no. of locus from GenBank for CpG island sequences from Cross et al. (30) and for other origin-containing sequences. HUMMYCC is human c-myc (two initiation sites) (20, 34); HUMLAMBBB is human lamin B2 inclusive of the initiation start site (21, 29); HSHSP70A is human heat shock protein 70 (25); HSAUTONF contains an origin of DNA replication also known as NOA3 (22); and CGDHFRORI is the Chinese hamster ovary dhfr origin of DNA replication (oriβ) overlapping the autonomously replicating clone known as X24 (8, 26).
(b) Nucleotide position of homologous sequence.
(c) Length of homologous sequence for an internal 20-bp sequence of the consensus: TMDAWKSGBYTSMAAWYWBC
(d) Contains a gap.
A human origin binding activity (OBA) has been isolated
using a minimal 186-bp fragment from the monkey autonomous replicating sequence ors8 (43). Homology to the consensus sequence was observed within a 59-bp fragment that was
the most effective competitor for binding of OBA to the 186-bp
minimal autonomous replicating sequence. We then showed
that A3/4 (36 bp) consensus was as effective a competitor as the
59-bp fragment and used it to affinity purify OBA, which was
identified as the Ku86 subunit of Ku antigen (42). Ku antigen
is identical to a DNA-dependent ATPase isolated from HeLa
cells (48), which had been reported previously to cofractionate
with a 21 S multiprotein complex competent for DNA synthesis
from HeLa cells (49) and is capable of interaction with a region
containing the replication origin of lamin B2 (41). The finding
that a version of the consensus sequence is an effective competitor for OBA/Ku86 binding to a minimal autonomously replicating sequence lends support to the functionality of at least
some versions of the consensus sequence. More recently, we
have demonstrated, using chromatin immunoprecipitation assays, that Ku is associated in vivo with mammalian origins of
DNA replication including ors8 and ors12, in a cell cycle-specific fashion, namely at G1/S (50). Recently, we have also obtained DNase I footprints of OBA/Ku86 and recombinant Ku
upon a plasmid fragment containing A3/4, including bases in
positions of 1–3, 9–29, and 32–36 (includes part of the 20-bp
internal autonomously replicating sequence, positions 3–22);
similarly, a footprint was obtained over the consensus homologous regions of ors8 (51). The isolation and identification of
this origin binding activity, using A3/4 as an affinity purification step, and its in vivo association with origins of DNA replication, provide further supporting evidence consistent with
A3/4 possessing origin activity in episomes as well as in vivo, in
chromatin.
TABLE VIII
Homologs of 20-mer consensus sequence in 1 Megabase of human genomic sequence
Homologs were obtained using fuzznuc of EMBOSS suite of software with specified mismatches.
                                                    No. of mismatches
Region (a)                                    2         3          4           5
Chromosome 22q11.1, 13025001-14025000
  DNA strand +/− homologs                     11/8      124/129    968/924     4580/4521
  Total homologs                              19        253        1992        9101
Chromosome 20p12.2, 9470001-10470000
  DNA strand +/− homologs                     19/19     155/190    1154/1218   5617/5642
  Total homologs                              38        345        2372        11259
Chromosome 1q44, 238500001-239500000
  DNA strand +/− homologs                     17/17     155/167    1119/1126   5469/5438
  Total homologs                              34        322        2245        10907
Chromosome 21q21.1, 19000001-20000000
  DNA strand +/− homologs                     29/22     238/208    1372/1367   6382/6215
  Total homologs                              51        446        2739        12597
(a) Sequence obtained from UCSC Human Genome Browser Gateway, the Human Nov. 2002 assembly, http://genome.ucsc.edu/cgi-bin/hgGateway?org=human.
FIG. 10. Distribution of homologs of a 20-mer consensus sequence over 1 Mb on human chromosomes. Vertical bars indicate the positions of homologs with up to two mismatches but no gaps. The numbers over some bars indicate the number of homologs clustered too close together to place separately in the graph. The chromosome, band location, and position are given as obtained from the UCSC Human Genome Browser. The distribution of the yeast ARS consensus sequence homologs (no mismatches) on S. cerevisiae chromosome XV is also shown. See also the footnotes of Table VIII.
Various versions of the 36-bp consensus sequence were capable of autonomous replication after transfection into HeLa
cells, as shown by the BrdUrd incorporation assay, leading to
the production of both HL (hybrid density) and HH (fully substituted) DNA, diagnostic of semiconservative replication. The
consensus served as an initiation site, as shown by the earliest
labeled fragment method in a mammalian in vitro DNA replication system (11), which mapped the earliest incorporation of
radiolabeled nucleotides to a minimal fragment of the plasmid
containing the A3/4 version of the consensus. Two of the CpG
island clones, CP9 and 6K, which contain versions of the consensus sequence, were also tested for autonomous replication
activity, with one (CP9) demonstrating autonomous replicating
activity in HeLa cells. However, analysis of autonomous replicating activity of the various versions of the 36-bp consensus
and homologs present in the CpG island sequences in human,
bovine, and chicken cells suggested that there may be species
preference for subsets of the versions of the consensus and its
homologs. For example, 6K clone was found to have activity in
bovine cells, but not in human and chicken cells, whereas CP9
clone had significant activity in chicken and human cells. This
apparent species preference may be associated with the initiator proteins involved in the recognition of the critical nucleotide
sequence elements of the consensus and its homologs. Furthermore, the context in which the consensus sequences are present
was varied from multiple cloning sites in pCRscript and pBluescript to CpG islands (Table III) and to the EcoRI site of
pYACneo. It was not possible in these experiments to establish
fully what contribution, if any, context may play in the activity
of consensus sequences.
Versions of the consensus sequence, particularly A3/4, were
found to replicate in both normal and malignant human cells
(WI38 and HeLa, respectively). A pYACneo construct containing A3/4 could be maintained exclusively as episomes under
selection for long periods of time. After ≥170 cell doublings, we
demonstrated the continuing autonomous replication of the
episomes in HeLa cells. Episomes recovered did not indicate
any rearrangements of the constructs introduced into HeLa
cells. Removal of selective pressure demonstrated that the episome had a surprising stability of ≥0.9/cell/generation.
In anticipation of the next step in derivation of a minimal
consensus core sequence for eukaryotic and mammalian DNA
replication, preliminary mutagenesis studies were done in
which a minimal 20-bp region is apparently correlated with the
ability to replicate in HeLa cells. The distribution of this 20-mer consensus sequence over 1 Mb of human chromosomes is
similar, quantitatively and qualitatively (relative proximity to
each other), to the distribution of ARS sequences on S. cerevisiae
chromosomes. However, it may be that specific elements or
combinations of bases involved in the binding of initiator protein or cooperating replicative proteins are still to be revealed, allowing further delimitation of a minimal core consensus sequence.
With functional testing and further mutagenesis, it should be
possible to derive a minimal core consensus sequence. It appears likely that the 20-bp sequence might be required for
control of autonomous replication. In a context related to its
position with regard to other surrounding sequence, the associated sequence may play a role in regulation of replication
origin activity at different times and in different cell types.
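The kind of ungapped, mismatch-tolerant search on both DNA strands that underlies the distribution analysis can be sketched as follows. This is not fuzznuc itself and makes no claim to reproduce its options or output; it is a minimal illustration, assuming standard IUPAC codes and the 20-mer consensus as printed under Table VII. The function names (`find_homologs`, `percent_homology`, `mean_spacing_kb`) are hypothetical.

```python
# Standard IUPAC nucleotide codes mapped to the bases they allow.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

CONSENSUS_20 = "TMDAWKSGBYTSMAAWYWBC"  # 20-mer as printed under Table VII


def reverse_complement(seq: str) -> str:
    """Reverse complement of a plain A/C/G/T sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_homologs(sequence: str, pattern: str = CONSENSUS_20, max_mismatches: int = 2):
    """Return (start, strand, mismatches) for every ungapped hit on either strand.

    `start` is the 0-based coordinate on the forward strand of the leftmost
    base covered by the hit.
    """
    sequence = sequence.upper()
    n, m = len(sequence), len(pattern)
    hits = []
    for strand, target in (("+", sequence), ("-", reverse_complement(sequence))):
        for i in range(n - m + 1):
            mismatches = 0
            for base, code in zip(target[i:i + m], pattern):
                if base not in IUPAC[code]:
                    mismatches += 1
                    if mismatches > max_mismatches:
                        break  # too many mismatches, abandon this window
            else:
                start = i if strand == "+" else n - m - i
                hits.append((start, strand, mismatches))
    return hits


def percent_homology(length: int, mismatches: int) -> float:
    """Percent of matching positions for an ungapped alignment."""
    return 100.0 * (length - mismatches) / length


def mean_spacing_kb(window_bp: int, n_homologs: int) -> float:
    """Average spacing in kb if homologs were spread evenly over the window."""
    return window_bp / n_homologs / 1000.0


if __name__ == "__main__":
    for k in (2, 3, 4, 5):
        print(k, "mismatches:", percent_homology(20, k), "% homology")
    # Totals for two mismatches quoted in the text (19 and 51 per 1 Mb).
    for total in (51, 19):
        print(total, "homologs per Mb:", round(mean_spacing_kb(1_000_000, total), 1), "kb")
```

Under the even-spacing assumption, 51 and 19 homologs per 1 Mb correspond to roughly 20 kb and 50 kb between sites, which is the range quoted for two mismatches; the observed distribution in Fig. 10 is far from even.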
This consensus will provide for a similar advancement in
understanding of regulation of DNA replication in higher eukaryotes, as the yeast ARS consensus did for DNA replication
in yeast. With this greater definition will come a more rational
approach to the development of compounds that affect DNA
replication. This technology also has direct application to the
development of nonviral vectors for gene transfer. Currently,
when adenoviral gene delivery systems in particular and
gene therapy in general are under close scrutiny because of
adverse effects in gene therapy trials (52, 53), a consensus
sequence of host cell composition, which can maintain DNA
replication and expression of accompanying genes, provides a
new opportunity for a gene delivery system of cellular (host)
DNA origin.
REFERENCES
1. Stinchcomb, D. T., Struhl, K., and Davis, R. W. (1979) Nature 282, 39–43
2. Fangman, W., and Brewer, B. (1991) Annu. Rev. Cell Biol. 7, 375–402
3. Held, P., and Heintz, N. (1992) Biochim. Biophys. Acta 1130, 235–246
4. Zannis-Hadjopoulos, M., and Price, G. B. (1998) Crit. Rev. Eukaryot. Gene Expr. 8, 81–106
5. Zannis-Hadjopoulos, M., and Price, G. B. (1999) J. Cell. Biochem. Suppl. 32/33, 1–14
6. Frappier, L., and Zannis-Hadjopoulos, M. (1987) Proc. Natl. Acad. Sci. U. S. A. 84, 6668–6672
7. Landry, S., and Zannis-Hadjopoulos, M. (1991) Biochim. Biophys. Acta 1088, 234–244
8. Zannis-Hadjopoulos, M., Nielsen, T. O., Todd, A., and Price, G. B. (1994) Gene (Amst.) 151, 273–277
9. Nielsen, T., Bell, D., Lamoureux, C., Zannis-Hadjopoulos, M., and Price, G. (1994) Mol. Gen. Genet. 242, 280–288
10. Nielsen, T. O., Cossons, N. H., Zannis-Hadjopoulos, M., and Price, G. (2000) J. Cell. Biochem. 76, 674–685
11. Pearson, C. E., Frappier, L., and Zannis-Hadjopoulos, M. (1991) Biochim. Biophys. Acta 1090, 156–166
12. Wu, C., Friedlander, P., Lamoureux, C., Zannis-Hadjopoulos, M., and Price, G. B. (1993) Biochim. Biophys. Acta 1174, 241–257
13. Wu, C., Zannis-Hadjopoulos, M., and Price, G. B. (1993) Biochim. Biophys. Acta 1174, 258–266
14. Graham, F. L., and van der Eb, A. J. (1973) Virology 52, 456–467
15. Todd, A., Landry, S., Pearson, E. E., Khoury, V., and Zannis-Hadjopoulos, M. (1995) J. Cell. Biochem. 57, 280–289
16. Hirt, B. (1967) J. Mol. Biol. 26, 365–369
17. Popperl, H., and Featherstone, M. S. (1992) EMBO J. 11, 3673–3680
18. Diaz-Perez, M. J., Wainer, I. W., Zannis-Hadjopoulos, M., and Price, G. B. (1996) J. Cell. Biochem. 61, 444–451
19. Pearson, C. E., Shihab-El-Deen, A., Price, G. B., and Zannis-Hadjopoulos, M. (1994) Somat. Cell Mol. Genet. 20, 147–152
20. Waltz, S. E., Trivedi, A. A., and Leffak, M. (1996) Nucleic Acids Res. 24, 1887–1894
21. Giacca, M., Zentilin, L., Norio, P., Diviacco, S., Dimitrova, D., Contreas, C., Biamonti, G., Perini, G., Weighardt, F., Riva, S., and Falaschi, A. (1994) Proc. Natl. Acad. Sci. U. S. A. 91, 7119–7123
22. Tao, L., Nielsen, T., Friedlander, P., Zannis-Hadjopoulos, M., and Price, G. B. (1997) J. Mol. Biol. 273, 509–518
23. Aladjem, M. I., Groudine, M., Brody, L. L., Dieken, E. S., Fournier, R. E. K., Wahl, G. M., and Epner, E. M. (1995) Science 270, 815–819
24. Ariizumi, K., Wang, Z., and Tucker, P. W. (1993) Proc. Natl. Acad. Sci. U. S. A.
90, 3695–3699
25. Taira, T., Iguchi-Ariga, S. M. M., and Ariga, H. (1994) Mol. Cell. Biol. 14,
6386 – 6387
26. Burhans, W. C., Vassilev, L. T., Caddle, M. S., Heintz, N. H., and DePamphilis,
M. L. (1990) Cell 62, 955–965
27. Tasheva, E. S., and Roufa, D. J. (1994) Mol. Cell. Biol. 14, 5628 –5635
28. Dimitrova, D., Giacca, M., Demarchi, F., Biamonti, G., Riva, S., and Falaschi,
A. (1996) Proc. Natl. Acad. Sci. U. S. A. 93, 1498 –1503
29. Abdurashidova, G., Deganuto, M., Klima, R., Riva, S., Biamonti, G., Giacca,
M., and Falaschi, A. (2000) Science 287, 2023–2026
30. Cross, S. H., Charlton, J. A., Nan, X., and Bird, A. P. (1994) Nat. Genet. 6,
236 –244
31. Larsen, F., Gundersen, G., Lopez, R., and Prydz, H. (1992) Genomics 13,
1095–1107
32. Antequera, F., and Bird, A. (1993) Proc. Natl. Acad. Sci. U. S. A. 90,
11995–11999
33. Vassilev, L. T., and Johnson, E. M. (1990) Mol. Cell. Biol. 10, 4899 – 4904
34. Tao, L., Dong, Z., Leffak, M., Zannis-Hadjopoulos, M., and Price, G. B. (2000)
J. Cell. Biochem. 78, 442– 457
35. Zhao, Y., Tsutsumi, R., Yamaki, M., Nagatsuda, Y., Ejiri, S., and Tsutsumi, K.
(1994) Nucleic Acids Res. 22, 5385–5390
36. Delgado, S., Gomez, M., Bird, A., and Antequera, F. (1998) EMBO J. 17,
2426 –2435
37. Phi-van, L., and Stratling, W. H. (1999) Nucleic Acids Res. 27, 3009 –3017
38. Pelletier, R., Mah, D. C. W., Landry, S., Matheos, D., Price, G. B., and
Zannis-Hadjopoulos, M. (1997) J. Cell. Biochem. 66, 87–97
39. Murray, A. W., and Szostak, J. W. (1983) Nature 305, 189–193
40. Hahnenberger, K. M., Baum, M. P., Polizzi, C. M., Carbon, J., and Clarke, L.
(1989) Proc. Natl. Acad. Sci. U. S. A. 86, 577–581
41. Toth, E. C., Marusic, L., Ochem, A., Patthy, A., Pongor, S., Giacca, M., and
Falaschi, A. (1993) Nucleic Acids Res. 21, 3257–3263
42. Ruiz, M. T., Matheos, D., Price, G. B., and Zannis-Hadjopoulos, M. (1999) Mol.
Biol. Cell 10, 567–580
43. Ruiz, M. T., Pearson, C. E., Nielsen, T. O., Price, G. B., and Zannis-Hadjopoulos, M. (1995) J. Cell. Biochem. 58, 221–236
44. Newlon, C. S. (1996) DNA Replication in Eukaryotic Cells (DePamphilis, M. L.,
ed) pp. 873–914, Cold Spring Harbor Laboratory Press, Cold Spring Harbor,
NY
45. Sinnett, D., Woolf, E., Xie, W., Glatt, K., Kirkness, E. F., Nielsen, T. O.,
Zannis-Hadjopoulos, M., Price, G. B., and Lalande, M. (1996) Gene (Amst.)
173, 171–177
46. Strehl, S., LaSalle, J. M., and Lalande, M. (1997) Mol. Cell. Biol. 17,
6157– 6166
47. Araujo, F. D., Knox, J. D., Ramchandani, S., Pelletier, R., Bigey, P., Price, G.,
Szyf, M., and Zannis-Hadjopoulos, M. (1999) J. Biol. Chem. 274, 9335–9341
48. Cao, Q. P., Pitt, S., Leszyk, J., and Baril, E. F. (1994) Biochemistry 33,
8548 – 8557
49. Vishwanatha, J. K., and Baril, E. F. (1990) Biochemistry 29, 8753– 8759
50. Novac, O., Matheos, D., Araujo, F. D., Price, G. B., and Zannis-Hadjopoulos, M.
(2001) Mol. Biol. Cell 12, 3386 –3401
51. Schild-Poulter, C., Matheos, D., Novac, O., Cui, B., Giffin, W., Ruiz, M. T.,
Price, G. B., Zannis-Hadjopoulos, M., and Hache, R. J. G. (2003) DNA Cell
Biol. 22, 65–78
52. Fox, J. L. (1999) Nat. Biotechnol. 17, 1153
53. Fox, J. L. (2000) Nat. Biotechnol. 18, 377