© 2000 Nature America Inc. • http://structbio.nature.com
articles
Understanding the immutability of restriction
enzymes: crystal structure of BglII and its DNA
substrate at 1.5 Å resolution
Christine M. Lukacs1, Rebecca Kucera2, Ira Schildkraut2 and Aneel K. Aggarwal1
Restriction endonucleases are remarkably resilient to alterations in their DNA binding specificity. To understand
the basis of this immutability, we have determined the crystal structure of endonuclease BglII bound to its
recognition sequence (AGATCT), at 1.5 Å resolution. We compare the structure of BglII to endonuclease BamHI,
which recognizes a closely related DNA site (GGATCC). We show that both enzymes share a similar α/β core, but in
BglII, the core is augmented by a β-sandwich domain that encircles the DNA to provide extra specificity.
Remarkably, the DNA is contorted differently in the two structures, leading to different protein–DNA contacts for
even the common base pairs. Furthermore, the BglII active site contains a glutamine in place of the glutamate at
the general base position in BamHI, and only a single metal is found coordinated to the putative nucleophilic
water and the phosphate oxygens. This surprising diversity in structures shows that different strategies can be
successful in achieving site-specific recognition and catalysis in restriction endonucleases.
Restriction endonucleases are paradigms for the study of protein–DNA recognition. Most of the >3,000 restriction endonucleases discovered to date belong to the type II class, which recognize
and cleave short palindromic DNA sites, requiring only Mg2+ for
optimal activity1. Their specificity is extraordinary. A single variation in the DNA sequence results in over a million-fold loss in
activity1. In marked contrast to several well-characterized families
of transcription factors, restriction endonucleases share little
sequence similarity. Nonetheless, several crystal structures of
restriction endonucleases have revealed a similar α/β core, consisting of a β-sheet flanked by several helices2,3. Interestingly, based on
these few structures, the similarity is strongest between endonucleases that share a similar cleavage pattern, such as BamHI4,5 and
EcoRI6, which cleave DNA to leave four base (5') overhangs, or
EcoRV7 and PvuII8,9, which cleave DNA to produce blunt ends.
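The palindromic nature of the two recognition sites discussed in this article, and the fact that they differ only at the outer base pair, can be checked with a few lines of Python. This is a minimal sketch for illustration only; the function names are hypothetical and are not part of any published tool.

```python
# Recognition sites quoted in the article.
BGLII_SITE = "AGATCT"
BAMHI_SITE = "GGATCC"

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def is_palindromic(seq: str) -> bool:
    """A site is palindromic if it equals its own reverse complement."""
    return seq == reverse_complement(seq)


def differing_positions(a: str, b: str) -> list[int]:
    """Return the 1-based positions at which two equal-length sites differ."""
    if len(a) != len(b):
        raise ValueError("sites must have the same length")
    return [i + 1 for i, (x, y) in enumerate(zip(a, b)) if x != y]
```

Both sites read the same on either strand, and `differing_positions(BGLII_SITE, BAMHI_SITE)` reports only the first and last positions, which together form the outer base pair of each half-site.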
Using the knowledge gained from these structures, a number
of attempts have been made to alter the specificity of restriction
endonucleases by both single-site and cassette mutagenesis10–15.
In general, these attempts have been unsuccessful. For instance,
the structure of the BamHI–DNA complex reveals that the
enzyme recognizes its cognate sequence GGATCC primarily
through residues Asn 116, Ser 118, Arg 122, Asp 154 and Arg 155
(ref. 15). However, cassette mutagenesis of these residues together with in vivo transcriptional selection reveals acceptable substitutions for the cognate but not any other DNA sequence15. Even
the closely related BglII recognition site AGATCT fails to yield
any viable mutations within the BamHI α/β core (unpublished
data), prompting us to ask why an existing restriction endonuclease scaffold cannot be modified to recognize a different, or
even a closely related, DNA site.
Fig. 1 Overall structure. a, A view of the BglII–DNA complex looking down the DNA axis. α-Helices, β-strands and loops A–E are labeled on one
monomer. Loops B and C make direct DNA contacts, while loops A and D are involved in water-mediated DNA contacts. The figure was produced
with Molscript38 and Raster3D39. b, The BglII–DNA complex rotated 90° from the view in (a) (left), and the corresponding view of the BamHI–DNA
structure (right). The common α/β core is displayed as a gray molecular surface and the rest of the protein is in ribbon form. In BglII, residues Asp 38,
Tyr 190 and Arg 189 from loops A and D of the β-sandwich subdomain and a water molecule (red sphere) participate in an extended hydrogen bond
network that effectively encloses the DNA. By comparison, BamHI has only a small secondary domain. The figure was produced with Grasp40,
Molscript38 and Raster3D39.
1Structural Biology Program, Department of Physiology and Biophysics, Mt. Sinai School of Medicine, 1425 Madison Avenue, New York, New York 10029, USA. 2New
England Biolabs, 32 Tozer Road, Beverly, Massachusetts 01915-5599, USA.
Correspondence should be addressed to A.K.A. email: [email protected]
nature structural biology • volume 7 number 2 • february 2000
To understand how nature has solved the problem of distinguishing the closely related DNA sites, we undertook the structure determination of BglII as a complement to our earlier work
on BamHI4,5,16,17. BglII, a 223-amino acid protein, was originally
isolated from B. globigii18 and its sequence shows little homology
to other endonucleases, including BamHI. The enzyme was
cocrystallized with a 16-base pair DNA fragment containing its
recognition sequence AGATCT. The structure was determined at
1.5 Å resolution, which to our knowledge makes it the highest
resolution protein–DNA complex structure solved to date. We
show that despite the lack of sequence homology between BglII
and BamHI, the two enzymes share a similar α/β core. However,
in BglII, the α/β core is augmented by a β-sandwich subdomain
that provides additional specificity (Fig. 1). Surprisingly, the
common base pairs (AGA TCT) are recognized in rather different ways in the two complexes because of differences in DNA
conformation, and the BglII active site residues suggest a novel
mechanism of DNA hydrolysis. Taken together, this unexpected
diversity in DNA recognition, DNA structure and active site
challenges our assumptions about the evolution of restriction
enzymes in bacteria and provides a basis for the generally unsuccessful attempts at modifying their specificities.
Overall fold
At 1.5 Å resolution, the BglII–DNA structure provides an exceptionally detailed picture of the numerous interactions that underlie BglII–DNA recognition. The majority of water molecules at
the protein–DNA interface were seen directly in the experimental
1.7 Å electron density map, calculated with phases from a multiwavelength anomalous dispersion (MAD) experiment. The outstanding quality of the MAD map (Fig. 2) was affirmed by the fact
that even thymine and cytosine bases, whose shapes differ by a
single methyl group, were clearly differentiated.
The BglII dimer approaches DNA from the major groove side and wraps around to the minor groove side. The monomer is composed of two domains: an α/β core domain that bears resemblance to BamHI, and a unique five-stranded β-sandwich domain (Fig. 1a). Loops from the β-sandwich domain reach into both the major and minor grooves to tightly grip the DNA (Fig. 1b, left). Consequently, the encircling of the DNA is much more extensive than in BamHI (Fig. 1b, right). Remarkably, Tyr 190 residues from the two monomers point directly at each other and are involved in a network of water-mediated interactions that effectively leads to a completely closed ring of protein around the DNA (Fig. 1b, left).
Fig. 2 Experimental electron density of the BglII–DNA complex from MAD phasing. a, Stereo view of a guanine:cytosine base pair and several water molecules near the center of the oligomer. b, A β-strand and an α-helix. The experimental map (1.7 Å) was calculated with the program SHARP34 and contoured at 1.2σ.
Protein conformation
The α/β core is similar to that of BamHI, containing a
six-stranded β-sheet (β1, β3, β4, β5, β6 and β7) surrounded by five α-helices (α1, α2, α3, α4 and α5), two
of which (α4 and α5) are involved in homodimerization. The loops preceding the dimerization helices
(loops B and C) carry the residues that contact bases in
the major groove. Although the BglII and BamHI cores
differ in the lengths of the α-helices and the β-strands
(particularly strands β4 and β5, consisting of 5 versus 9
and 8 versus 12 amino acids, respectively), a more profound difference is found upon comparing the dimers. A superimposition
of the central four strands of one monomer from each structure
shows that the other monomers do not overlie, and are in fact
displaced by as much as 4.3 Å in the outer helices. This alteration
in the relationship of the monomers in the two structures leads
to a narrower DNA binding cleft in BglII.
BamHI has a small two-stranded β-substructure outside of the
α/β core (Fig. 1b, right)5,16. In BglII, this small substructure is
elaborated into a full five-stranded β-sandwich subdomain (β2,
β8, β9, β10 and β11) that extends the overall size and shape of
the protein (Fig. 1a,b, left). β2 is composed of residues near the
N-terminus, whereas the rest of the domain is composed of the
C-terminal 50 residues. The residues following β2 form a long
loop (loop A) that projects into the DNA major groove, while the
corresponding residues in BamHI take a completely different
path and do not enter the groove (Fig. 1b, right). The loop
between strands β8 and β9 (loop D) is the primary site of interactions with bases in the minor groove. The loop between
strands β10 and β11 (loop E) is visible in only one of the
monomers, extending beyond the length of the 16-mer and making contacts with a translationally related DNA molecule (Fig.
1b, right). Overall, the second subdomain gives the BglII–DNA
complex a ‘squarelike’ appearance when viewed down the DNA
helix (Fig. 1a).
DNA conformation
The DNA is distorted by bending and by local unwinding and overwinding. In contrast to the relatively straight DNA in the BamHI
complex16, the BglII DNA bends away (∼23°) from the protein α/β
core in a fairly smooth fashion (Fig. 3a). This results in the stacked
DNA duplexes forming continuous, sinuous helices throughout
the crystal. However, the most striking contrast with BamHI DNA
arises in comparing the pattern of helical and propeller twists within the recognition sequence (Fig. 3b,c). In particular, the central step of the BglII recognition sequence (AGA TCT) is unwound by over 15° as compared to the BamHI complex, or to ideal B-DNA. This causes the orientation of the two inner base pairs to be almost superimposed, with nearly parallel Watson–Crick hydrogen bonds. Interestingly, a similar unwinding at the central step has been described in EcoRV– and EcoRI–DNA complexes, but in those cases, the two inner base pairs become unstacked, assume high roll angles, and cause large kinks in the DNA axis6,7. Moreover, the unwinding in the EcoRV and EcoRI complexes continues to the adjoining base steps, whereas we observe overwinding (by ∼7°) between the inner and middle base steps (AGA TCT). This compensates for the unwinding at the center, so that the outer recognition base pairs (AGA TCT) in the BglII DNA are only a few degrees offset from the equivalent base pairs in the BamHI complex (GGA TCC).
Fig. 3 DNA parameters. a, Side view of the DNA from both the BglII and BamHI complexes showing the axes of curvature. b, Stereo view of the inner and middle base pairs of BglII DNA. The DNA unwinds by 20° between the two inner base pairs, and the sugar of the middle cytosine has a C3'-endo conformation. c, A comparison of the helical and propeller twist values of BglII and BamHI. The DNA parameters were calculated with the program CURVES41.
In analyzing the DNA backbone, the most striking
deviation from ideal B-DNA occurs in the torsion
angles specifying the inner and middle pyrimidines
(AGA TCT). Sugars of the middle cytosines adopt a
C3'-endo conformation (pseudorotation angles of
33–34°), which is typical of A-DNA. The A-DNA
character is further reflected in the torsion angles δ
(C4'–C3') and χ (C1'–N1) of the cytosines, which
take on values (∼88° and 204°, respectively) that normally distinguish A- from B-DNA19. The inner
thymines assume a conformation known as BII,
characterized by torsion angles ε (C3'–O3') and
ζ (O3'–P) that are gauche-, trans rather than
trans, gauche- as in the more common BI form of B-DNA19. These
atypical backbone conformations at the inner and middle pyrimidines are stabilized in part by hydrogen bonds between their
phosphate groups and the main chain NH groups of residues 38
and 40. Interestingly, residues 38 and 40 are carried on a loop
(loop A) that is missing in BamHI (Fig. 1b), which may explain the
lack of analogous backbone changes in the BamHI DNA. Whether
these transitions in the DNA backbone account for the unwinding
and overwinding of the central bases in the BglII DNA is
unproven, but the correlation is highly suggestive. Overall, the resolution of our
structure provides a remarkably detailed view of the local changes
that occur in B-DNA upon protein binding. The accuracy of the
helical parameters and the backbone torsion angles is reflected by
their two-fold symmetry, even though the two halves of the DNA
were built independently and refined without any averaging.
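The backbone descriptions above (BI versus BII phosphates, and C3'-endo versus C2'-endo sugars) can be expressed as a simple classifier. The sketch below is illustrative only. It assumes the common three-way split of a torsion angle into gauche+ (near 60°), trans (near 180°) and gauche- (near 300°), and the conventional pseudorotation ranges of 0–36° for C3'-endo and 144–180° for C2'-endo. These cutoffs and all function names are assumptions of this sketch, not values or methods taken from the article.

```python
def torsion_class(angle_deg: float) -> str:
    """Classify a torsion angle as gauche+, trans or gauche-.

    Assumed split of the circle: [0, 120) gauche+, [120, 240) trans,
    [240, 360) gauche-. Negative angles are wrapped into 0-360.
    """
    a = angle_deg % 360.0
    if a < 120.0:
        return "gauche+"
    if a < 240.0:
        return "trans"
    return "gauche-"


def phosphate_form(epsilon_deg: float, zeta_deg: float) -> str:
    """Label a phosphate from its (epsilon, zeta) torsion pair.

    BI is (trans, gauche-) and BII is (gauche-, trans), as stated in the text.
    """
    pair = (torsion_class(epsilon_deg), torsion_class(zeta_deg))
    if pair == ("trans", "gauche-"):
        return "BI"
    if pair == ("gauche-", "trans"):
        return "BII"
    return "other"


def sugar_pucker(pseudorotation_deg: float) -> str:
    """Coarse sugar pucker label from the pseudorotation phase angle P."""
    p = pseudorotation_deg % 360.0
    if p <= 36.0:
        return "C3'-endo"  # typical of A-DNA
    if 144.0 <= p <= 180.0:
        return "C2'-endo"  # typical of B-DNA
    return "other"
```

With these assumed cutoffs, the pseudorotation angles of 33–34° reported for the middle cytosines fall in the C3'-endo range, consistent with the A-DNA character described above.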
DNA recognition
The BglII recognition site differs from that of BamHI by only the
outer base pair. Thus, we expected the recognition of the inner
and middle base pairs to be very similar. This turns out not to be
the case, even though the residues near these base pairs (loop B)
are similar. In particular, Asn 116, which recognizes the inner
and middle base pairs in BamHI (GGA)16, has an equivalent
residue in BglII (Asn 98), but it adopts a different configuration
and instead forms bidentate hydrogen bonds with its two-fold
related counterpart across the dimer interface. A superimposition of the two enzymes shows that Asn 98 is unable to interact
with the DNA in the same way as Asn 116 in BamHI, as a result of
differences in bending and other helical parameters between the
two DNAs. Consequently, the only contribution Asn 98 makes
toward recognition of the middle G:C base pair in BglII is a
water-mediated hydrogen bond to the guanine (Fig. 4a,b, left).
Also, there are no direct contacts to the inner A:T base pair in
BamHI (GGA TCC), but in BglII a direct hydrogen bond is
formed between the N7 of the adenine and the Oγ of Ser 97 from
the two-fold related monomer (Fig. 4b). The one common interaction maintained between BglII and BamHI is a hydrogen bond
between the N4 of cytosine of the middle G:C base pair and the
main chain carbonyl of Asn 140 (Asp 154 in BamHI).
Recognition of the variant outer base pair is achieved by two
equivalently positioned residues on the loop (loop C) preceding
the dimerization helix α5 (α6 in BamHI). Residues Asn 140 and
Ser 141 in BglII substitute for Asp 154 and Arg 155 in BamHI to
recognize an A:T base pair in place of a G:C base pair (Fig. 4b,c).
In BamHI, Arg 155 donates bidentate hydrogen bonds to the
outer guanine while Asp 154 accepts a single hydrogen bond from the cytosine. In BglII, Ser 141 forms bidentate hydrogen bonds with the outer adenine, and Asn 140 donates a single hydrogen bond to the O4 of the thymine (Fig. 4). Because of the low sequence homology between BglII and BamHI, the structural identity of these outer recognition residues was only apparent upon solving the BglII structure.
Fig. 4 DNA recognition. a, Stereo view of the protein–base contacts in one half-site of the recognition sequence. The two monomers of BglII are colored in different shades of yellow and are labeled A and B. b, Schematic representation of the phosphate and base contacts from (left) one monomer of BglII and (right) one monomer of BamHI. c, Superimposition of the outer base pair recognition residues Asn 140 and Ser 141 from BglII with Asp 154 and Arg 155 of BamHI. The superimposition reveals a 2.5 Å shift in the position of the outer bases.
In contrast to these residues that contact bases in the major
groove, no structural relationship exists between BglII and BamHI
for residues interacting in the minor groove. In BamHI, the minor
groove is contacted asymmetrically by a C-terminal ‘arm’ from
one subunit (Fig. 1b, right)16. However, in BglII the minor groove
is contacted symmetrically by a loop (residues Asp 187–Arg 192:
loop D) emanating from the β-sandwich domain, which wraps
around the minor groove to make water-mediated contacts with
all three base pairs in each half-site (Fig. 4a). In particular, Arg 192
forms a series of bridging water interactions with the pyrimidines
of the outer two base pairs (AGA), whereas the two symmetry-related Tyr 190 residues collaborate to fix a single water molecule at the center of the DNA, which in turn donates hydrogen bonds to the O2 of thymines of the inner base pairs (AGA TCT). (A second water molecule between these two tyrosines also contacts both Arg 189 residues, which in turn form hydrogen bonds with Asp 38, allowing loop A to buttress loop D and further extend the network involved in surrounding the DNA (Fig. 1b, left).)
The DNA backbone contacts extend well beyond the recognition sequence. Each subunit makes eight direct DNA phosphate
contacts and another 20 mediated via water molecules (Fig. 4b,
left). The extent of the DNA backbone contacts is enhanced by
the presence of the β-sandwich subdomain, which in addition to
the minor groove loop (loop D) has two other loops that project
outward, with one of them (loop A) brushing against the major
groove and the other (loop E) jutting beyond the length of the
16-base pair oligomer. Consequently, BglII buries substantially
more solvent-accessible surface area (∼5,515 Å2) than BamHI
(4,200 Å2) and far more than dimeric transcription factors such
as λ-repressor (3,022 Å2)20.
Active site
Structural studies on restriction endonucleases have revealed a similar architecture for the active site with the residues following the weak consensus sequence (Glu/Asp)-X(9–20)-(Glu/Asp/Ser)-X-(Lys/Glu). BamHI is the only endonuclease that contains a glutamate (Glu 113) at the final position of the consensus, the importance of which is underlined by the fact that mutating this residue to lysine inactivates the enzyme21. The presence of oppositely charged lysine or glutamate residues at the last position of the consensus has led to much debate about the catalytic mechanism(s) of restriction endonucleases. The BglII structure adds a further twist to the debate by revealing for the first time in restriction enzymes a glutamine at this position. Architecturally, the BglII active site is similar to other endonucleases but follows the sequence Asp 84-X9-Glu 93-X-Gln 95 (Fig. 5). Remarkably, BamHI becomes inactive when Glu 113 is mutated to a glutamine as in BglII (unpublished data).
Fig. 5 Active site. A stereo view closeup of the BglII active site around the scissile phosphodiester bond. Residues Asn 69, Asp 84, Glu 93 and Gln 95 correspond to the BamHI active site residues Glu 77, Asp 94, Glu 111 and Glu 113. The structure reveals numerous water molecules and an octahedrally coordinated cation (orange sphere) between the conserved residues and the DNA. Five of the six ligands have distances ranging from 2.2 to 2.5 Å; the sixth ligand (to the proposed nucleophilic water molecule) is 2.7 Å. One water molecule in the coordination sphere of the cation (labeled NW) makes a 155° angle with the scissile P–O3 bond, and appears to be the attacking nucleophile.
We see evidence of a single metal in the BglII active site,
although no divalent cations were used in the crystallization of the
native complex. Thus the metal may be a sodium ion from the
protein or crystallization buffer (Tris or MES) occupying a potential Mg2+ binding site. A similarly bound metal was found when we
soaked the native cocrystals in calcium. The single cation is octahedrally coordinated with the side chain oxygen of Asp 84, the
backbone carbonyl of Val 94, a phosphate oxygen, and three water
molecules, including one that makes a ∼155° angle with the P–O3
bond of the scissile phosphate group (Fig. 5). This water molecule
is well poised to act as the attacking nucleophile with a pKa lowered by its contact with the metal and its orientation fixed by a
hydrogen bond with the Oε1 of Gln 95. The presence of a single
metal in the BglII active site contrasts with BamHI, which contains
two metals in the active site, both of which are postulated to be
involved in the stabilization of the pentacovalent transition
state16,17. A superimposition of the BglII and BamHI active site
residues places the metal only 0.3 Å away from the metal A in
BamHI, but BglII lacks the equivalent of metal B which lies close to
the leaving O3' atom and is coordinated by residue Glu 77 (see
ref. 17 for details of the BamHI active site). The equivalent metal
may be missing in BglII because it has a less acidic Asn 69 at the
position corresponding to Glu 77 in BamHI (Fig. 5). Curiously,
Glu 93 in our structure is turned away from the active site,
although it could rotate to a conformation similar to Glu 111 in
BamHI in order to coordinate metal A. Another difference
between the two structures is that Gln 95 coordinates to the nucleophilic water, whereas its equivalent residue in BamHI (Glu 113)
directly contacts the phosphate oxygen.
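The attack geometry described above (the angle that the candidate nucleophilic water makes with the scissile bond at the phosphorus) can be computed from three atomic positions. The sketch below is a generic illustration: the coordinates are hypothetical placeholders constructed to reproduce a 155° angle and are not taken from the deposited BglII model. An ideal in-line attack would give 180°.

```python
import math


def angle_deg(a, b, c):
    """Return the angle a-b-c in degrees, with b at the vertex.

    Each point is an (x, y, z) tuple in the same length unit.
    """
    v1 = [a[i] - b[i] for i in range(3)]
    v2 = [c[i] - b[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(v1, v2))
    n1 = math.sqrt(sum(x * x for x in v1))
    n2 = math.sqrt(sum(x * x for x in v2))
    if n1 == 0.0 or n2 == 0.0:
        raise ValueError("coincident points: angle is undefined")
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_t = max(-1.0, min(1.0, dot / (n1 * n2)))
    return math.degrees(math.acos(cos_t))


# Hypothetical coordinates in Å (illustration only, not from the structure).
P = (0.0, 0.0, 0.0)            # phosphorus of the scissile phosphate
O3_PRIME = (1.6, 0.0, 0.0)     # leaving-group oxygen
_theta = math.radians(155.0)
WATER = (3.2 * math.cos(_theta), 3.2 * math.sin(_theta), 0.0)

attack_angle = angle_deg(WATER, P, O3_PRIME)
```

In practice the three positions would be read from the refined coordinates. The same helper applies to any water, phosphorus and leaving-oxygen triple.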
Discussion
BglII is remarkable in both differences and similarities from
BamHI. Both enzymes share a similar α/β core that carries the catalytic center and the residues that contact DNA bases in the major
groove. However, the α/β core in BglII is augmented by a β-sandwich subdomain that extends the size and shape of the enzyme.
Several projections (loops A, D and E) extend outward from the
β-sandwich subdomain to grip the DNA, effectively leading to a
closed ring of protein around the DNA. This total enclosure of the
DNA is unprecedented for a sequence-specific DNA binding protein, and suggests a possible hingelike motion between the two subdomains that allows the DNA to enter. Critical to the closure of the
enzyme is a network of water-mediated hydrogen bonds between
two symmetry-related tyrosines (Tyr 190) that complete the
embrace around the DNA (Fig. 1a,b, left).
The protein–DNA contacts are different between BglII and
BamHI. For instance, the minor groove contacts in BamHI are
made by the C-terminal arm of one subunit16, whereas the analogous contacts in BglII are made in a symmetrical manner by loops
D emanating from the β-sandwich subdomains. One intriguing
difference lies in the recognition of the two common (inner and
middle) base pairs in the major groove. Both enzymes position an
equivalent asparagine residue near these base pairs, but in BamHI
the asparagine (Asn 116) is directly hydrogen bonded to these base
pairs whereas in BglII Asn 98 is mediating dimer contacts.
Recognition of the variable, outer base pair is, as expected, different between BglII and BamHI. Residues Asn 140 and Ser 141 in
BglII substitute for residues Asp 154 and Arg 155 in BamHI in
order to recognize an A:T base pair instead of a G:C base pair. The
pattern of donor and acceptor atoms on these sets of residues is
such that in BglII it could only match an A:T base pair and in
BamHI only a G:C base pair. At first glance, changing the specificity of BamHI in order to recognize BglII appears to be a simple case
of substituting an asparagine and a serine for Asp 154 and Arg 155.
However, when we model these substitutions into BamHI, we see a
2.5 Å displacement of the outer base, caused by differences in DNA
bending in the two structures, that places the base too far away to
interact with a short serine residue (Fig. 4c). This small but critical
difference in how the two enzymes contort the DNA plays an
important role in their specificity.
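The donor and acceptor argument for the outer base pair can be made concrete with a toy model. The sketch below encodes the major-groove hydrogen-bonding groups of each base (G: N7 and O6 acceptors; A: N7 acceptor and N6 donor; C: N4 donor; T: O4 acceptor) and asks which base pair a given set of protein contacts could satisfy. The protein roles for BamHI follow the text directly. For BglII, treating the bidentate Ser 141 contact as one donated and one accepted hydrogen bond is an assumption of this sketch. The model ignores geometry entirely, which, as discussed above, is a critical part of the real specificity.

```python
# Major-groove hydrogen-bonding groups of each base (simplified).
MAJOR_GROOVE = {
    "G": ("acceptor", "acceptor"),  # N7, O6
    "A": ("acceptor", "donor"),     # N7, N6
    "C": ("donor",),                # N4
    "T": ("acceptor",),             # O4
}

# Roles supplied by the protein: (contacts to purine, contacts to pyrimidine).
BAMHI_CONTACTS = (("donor", "donor"), ("acceptor",))    # Arg 155, Asp 154
BGLII_CONTACTS = (("donor", "acceptor"), ("donor",))    # Ser 141 (assumed), Asn 140


def required_roles(base: str) -> list[str]:
    """Roles the protein must supply to satisfy every group on the base."""
    flip = {"donor": "acceptor", "acceptor": "donor"}
    return sorted(flip[role] for role in MAJOR_GROOVE[base])


def matching_base_pairs(contacts) -> list[str]:
    """Return the base pairs whose groups are exactly matched by the contacts."""
    purine_roles, pyrimidine_roles = contacts
    hits = []
    for name, (purine, pyrimidine) in {"G:C": ("G", "C"), "A:T": ("A", "T")}.items():
        if (sorted(purine_roles) == required_roles(purine)
                and sorted(pyrimidine_roles) == required_roles(pyrimidine)):
            hits.append(name)
    return hits
```

Under these simplifications the BamHI contacts match only G:C and the BglII contacts match only A:T, mirroring the pattern described in the text.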
A major reason that we undertook the X-ray analysis of BglII was
to learn why one cannot select BamHI mutants to recognize the
closely related BglII site based on the cassette mutagenesis of
residues Asn 116, Ser 118, Asp 154 and Arg 155 (ref. 15; unpublished data). Several attempts have been made to switch the specificities of restriction endonucleases, but generally without much
success10,12–15. This lack of success contrasts with transcription factors, where at least partial success has been achieved with proteins
containing a similar DNA binding motif, such as the helix-turn-helix motif or the zinc finger22–24. To understand the immutability
of restriction enzymes in evolutionary terms, we suggest that
they are under a unique selective pressure not to look too much
alike. For instance, we can imagine that if the specificity of BamHI
were to change to that of BglII through just a few point mutations, it
would be lethal for the host bacterium because it would become
susceptible to cleavage at the unmethylated BglII sites on its
genome. Thus, restriction enzymes may be under strong selective pressure to evolve an interface that is not only complementary toward their recognition sequence but also one that cannot so easily be altered to recognize another unprotected sequence. From our superimpositions, it is clear that additional residues, such as those mediating dimerization and DNA backbone contacts, also need to be mutated in order to configure the DNA for optimal contacts in the major groove. All in all, the BglII structure strongly reinforces a sense that protein–DNA recognition in restriction endonucleases must be considered in the context of the whole protein and not just the few residues that interact directly with the DNA bases.

Table 1 Data collection, phasing and refinement statistics

MAD data collection and phasing statistics at 1.5 Å resolution
                                                  Se-edge         Se-peak         Se-remote       Native
Wavelength (Å)                                    0.9792          0.9791          0.9494          0.9790
Number of reflections                             623,003         620,648         629,080         346,887
Number of unique reflections (F+ + F-)            123,511         123,628         123,664         91,032
Completeness (%, last shell)                      99.4 (94.0)     99.4 (92.8)     99.8 (99.3)     96.6 (97.5)
Rsym (last shell)1                                0.070 (0.151)   0.073 (0.146)   0.069 (0.162)   0.078 (0.289)
I / σ(I)                                          11.3            11.2            12.1            7.8
Phasing power (dispersive acentrics)              2.38            3.28            –
Phasing power (anomalous acentrics)               3.51            3.32            2.79
Mean overall figure of merit (centric/acentric)   –               –               0.46 / 0.69

Refinement statistics
Resolution range (Å)                              20–1.7          20–1.5
Number of reflections used in refinement          62,634 (2,148)  84,917 (2,144)
σ cutoff used in refinement                       0               1
Rcryst / Rfree2                                   0.182 / 0.200   0.195 / 0.225
Number of atoms
  Protein / DNA                                   3,590 / 650     3,590 / 650
  Water / ions                                    543 / 2         614 / 2
B factors (Å2)
  Protein / DNA                                   13.9 / 15.7     12.6 / 14.7
  Water / ions                                    22.8 / 24.8     21.5 / 26.0
R.m.s. deviations
  Bonds (Å)                                       0.018           0.017
  Angles (°)                                      1.88            1.91

1Rsym = Σ|Ih - <Ih>| / ΣIh over all h, where Ih is the intensity of reflection h.
2Rcryst / Rfree = Σ||Fo| - |Fc|| / Σ|Fo|. Rfree was calculated using 3–4% of data excluded from refinement.
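The agreement statistics defined in the footnotes to Table 1 are simple ratios of sums, and a minimal Python sketch may make the definitions concrete. The function names and the input layout (paired amplitude lists for the R factor, and a mapping from each unique reflection to its repeated intensity measurements for Rsym) are assumptions of this illustration and do not reflect how the refinement or scaling programs cited in the Methods organize their data.

```python
def r_factor(f_obs, f_calc):
    """R = sum(||Fo| - |Fc||) / sum(|Fo|) over paired structure-factor amplitudes.

    Applied to the working set this gives Rcryst; applied to reflections
    that were excluded from refinement it gives Rfree.
    """
    if len(f_obs) != len(f_calc):
        raise ValueError("f_obs and f_calc must have the same length")
    denominator = sum(abs(fo) for fo in f_obs)
    if denominator == 0:
        raise ValueError("sum of |Fo| is zero")
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    return numerator / denominator


def r_sym(measurements):
    """Rsym = sum |I - <I_h>| / sum I over all measurements of all reflections.

    `measurements` maps each unique reflection h (for example an (h, k, l)
    tuple) to the list of its repeated intensity measurements.
    """
    numerator = 0.0
    denominator = 0.0
    for intensities in measurements.values():
        mean_intensity = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean_intensity) for i in intensities)
        denominator += sum(intensities)
    if denominator == 0:
        raise ValueError("sum of intensities is zero")
    return numerator / denominator
```

For example, `r_factor([10.0, 20.0], [9.0, 22.0])` sums deviations of 1 and 2 over a total observed amplitude of 30.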
The BglII structure reveals a novel active site in which a glutamine residue is seen for the first time at a position corresponding to
the glutamate in BamHI and lysine in most other restriction
endonucleases2. Thus, the rather tentative active site consensus
sequence is further weakened to (Glu/Asp)-X(9–20)-(Glu/Asp/Ser)-X-(Lys/Glu/Gln). It is curious that even though all restriction
endonucleases catalyze the same chemical reaction and their active
sites are generally superimposable, there is this level of diversity in
the identity of the active site residues. Remarkably, BamHI
becomes inactive when the glutamate at the last position of the
consensus is changed to a lysine as in EcoRV21 or a glutamine as in
BglII (unpublished results), and (conversely) EcoRV loses activity
when lysine is changed to a glutamate25. Whether this variability in
active site residues actually reflects different mechanisms of catalysis is unclear at present.
For BamHI a two-metal mechanism of catalysis has been proposed17, whereas a variety of mechanisms have been postulated
for EcoRV, ranging from substrate-assisted catalysis to the
involvement of one, two and even three metals3,26–28. In the BglII
structure, we see evidence for the involvement of a single metal
in hydrolysis. The BglII active site bears resemblance to the active
site of the unrelated LAGLIDADG homing endonuclease I-CreI,
which also contains a single metal ion as well as a glutamine
residue coordinated to the nucleophilic water29. Thus, the structure supports suggestions of a link between the catalytic mechanisms of restriction enzymes and the LAGLIDADG homing
endonucleases29.
All in all, the number of known restriction endonuclease
structures (<10) is still very small and selective, when one considers that more than 3,000 restriction endonucleases have been
identified1. The disparity in BglII and BamHI structures shows
vividly the different successful strategies for site-specific recognition and catalysis in this large family of enzymes.
Methods
Crystallization. BglII was cloned from B. globigii RUB561 into
Escherichia coli18. Recombinant selenomethionyl BglII was expressed
in the presence of selenomethionine and by inhibiting the methionine synthetic pathway. The protein was purified by a series of low-pressure chromatographic steps: DEAE (flow-through), phospho-cellulose, heparin-Sepharose, and Q-Sepharose. The 2'tritylated oligomer 5'-TATTATAGATCTATAA-3' was purified by
HPLC, dissolved to 1 mM in TE (10 mM Tris.HCl pH 8.0, 1 mM EDTA)
and 100 mM KCl, and reannealed. Before crystallization, protein at
1.6 mg ml-1 (in 100 mM KCl, 10 mM Tris pH 7.4, 1 mM dithiothreitol
and 10% glycerol) was incubated with a 0.6 M equivalent of duplex
oligonucleotide at 0 °C for 30 min, then concentrated to 8–10 mg
ml-1. A 1 µl drop of protein–DNA complex was combined with 1 µl of
precipitant (17–20% PEG 1540, 0.2 M ammonium sulfate, 0.1 M MES
pH 5.2) in a well of a microbatch plate under a layer of paraffin oil.
Crystals grew at 20 °C within a few days and continued to grow
slowly for several weeks. Crystals were transferred in steps to a solution of 25% PEG 1540, 10–20% glycerol, 0.2 M ammonium sulfate,
0.1 M MES pH 5.2 and then frozen by immersion in liquid propane.
Data collection and structure determination. Diffraction data
were collected at liquid nitrogen temperature at the Advanced
Photon Source (Argonne National Laboratories) on beamline 17-ID,
which is equipped with a MAR CCD detector. Crystals diffracted to
better than 1.7 Å in resolution and belong to space group P2₁2₁2₁
with unit cell parameters a = 48.8 Å, b = 101.7 Å, c = 116.9 Å. The
asymmetric unit contains one full complex (a dimer of BglII bound
to the 16-base pair DNA duplex). An X-ray fluorescence scan was
obtained in order to determine the wavelengths for use in MAD
data collection30. The data were measured by the inverse beam
method. Individual oscillation images (t = 5 s) spanning 0.5° were
collected at the wavelength corresponding to the selenium edge,
over a total angular range of 120° at a crystal-to-detector distance
of 120 mm. This was then repeated at the same wavelength for φ =
φ + 180°, and the sequence was then repeated at the other two
wavelengths corresponding to the selenium peak and the remote
point. Diffraction data were indexed and integrated using DENZO
and reduced using SCALEPACK31.
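The bookkeeping implied by this data collection strategy can be sketched in a few lines of Python. The sketch assumes that the 120° range refers to a single sweep, so that the inverse-beam repeat at φ + 180° doubles the image count at each wavelength; this reading, the variable names and the neglect of detector readout overhead are assumptions, not statements from the text. The unit cell volume follows directly from the orthorhombic cell parameters, and P2₁2₁2₁ has four asymmetric units per cell.

```python
# Values quoted in the text.
SWEEP_DEG = 120.0        # angular range per sweep (assumed to be per sweep)
OSCILLATION_DEG = 0.5    # width of one oscillation image
EXPOSURE_S = 5.0         # exposure time per image
N_WAVELENGTHS = 3        # selenium edge, peak and remote
CELL = (48.8, 101.7, 116.9)  # a, b, c in angstroms
ASU_PER_CELL = 4         # asymmetric units per cell in P2(1)2(1)2(1)


def images_per_sweep(sweep_deg: float, oscillation_deg: float) -> int:
    """Number of oscillation images needed to cover one sweep."""
    return round(sweep_deg / oscillation_deg)


def orthorhombic_volume(a: float, b: float, c: float) -> float:
    """Volume of an orthorhombic cell (all angles 90 degrees)."""
    return a * b * c


per_sweep = images_per_sweep(SWEEP_DEG, OSCILLATION_DEG)
per_wavelength = 2 * per_sweep                  # phi and phi + 180 (inverse beam)
total_images = N_WAVELENGTHS * per_wavelength
total_exposure_s = total_images * EXPOSURE_S    # exposure only, no overhead

cell_volume = orthorhombic_volume(*CELL)        # cubic angstroms
asu_volume = cell_volume / ASU_PER_CELL

if __name__ == "__main__":
    print("images per sweep:", per_sweep)
    print("images per wavelength:", per_wavelength)
    print("total images:", total_images)
    print("total exposure (s):", total_exposure_s)
    print("cell volume (A^3): %.0f" % cell_volume)
    print("asymmetric unit volume (A^3): %.0f" % asu_volume)
```

Under these assumptions the experiment comprises 1,440 images and two hours of exposure time, and the unit cell volume is about 5.8 × 10⁵ Å³.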
Anomalous difference Patterson maps for each wavelength were
calculated using the PHASES package32. Six sites were immediately
apparent, and were verified and refined using PATSOL33. Anomalous
difference Fourier maps were calculated in order to place the sites at
the same origin. MAD analysis was performed using SHARP34 (with six
input sites), using the remote wavelength data as the reference.
Density modification with SOLOMON (CCP4; ref. 35) produced exceptionally clean experimental maps, which were fit with the program
O (ref. 36). The entire protein, except residues 119–122 in monomer A and
208–216 in monomer B, was fit into the experimental map, along with
31 of the 32 DNA bases. Coordinate refinement was first carried out
against the 1.7 Å data using CNS37. Following a four-body rigid-body
refinement, the crystallographic R factor was 0.448 (Rfree = 0.452).
After one round of positional refinement, the R factor dropped to
0.385 (Rfree = 0.403). Individual B-factor refinement and positional
refinement further dropped the R factor to 0.307 (Rfree = 0.328). At this
stage, 225 water molecules, the one remaining base, and residues
Ala 119–Ala 122 were added and simulated annealing carried out,
resulting in an R factor of 0.263 (Rfree = 0.279). Several cycles of rebuilding and fitting solvent molecules resulted in a model that included 543
water molecules, 2 ions, and excellent protein geometry (Table 1). At
this stage, a 1.5 Å resolution data set became available, measured
from a native crystal soaked in calcium at the National Synchrotron
Light Source Beamline X25, which is equipped with a Brandeis B4 CCD
detector. This data set was used to complete the refinement using
CNS. The refined 1.7 Å structure was used as a starting model. After
several cycles of simulated annealing, positional, B-factor, and occupancy (ions only) refinement, the final R factor is 0.198 (Rfree = 0.227)
with 614 water molecules. The only missing segment of the protein is
residues 210–214 of monomer B.
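For readers less familiar with the statistics quoted above, the conventional crystallographic R factor is the sum of the absolute differences between observed and calculated structure factor amplitudes, divided by the sum of the observed amplitudes; Rfree is the same quantity computed over a test set of reflections excluded from refinement. The Python sketch below implements this definition on toy amplitudes (invented for illustration and assumed to be on a common scale) and tabulates the R and Rfree values reported in the text. It is not the CNS implementation.

```python
def r_factor(f_obs, f_calc):
    """R = sum(||Fobs| - |Fcalc||) / sum(|Fobs|).

    Amplitudes are assumed to be on the same scale already.
    """
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


# Toy amplitudes, invented purely to exercise the formula.
toy_obs = [100.0, 50.0, 25.0]
toy_calc = [90.0, 55.0, 25.0]
toy_r = r_factor(toy_obs, toy_calc)   # (10 + 5 + 0) / 175

# (stage, R, Rfree) as reported in the refinement description.
STAGES = [
    ("rigid-body refinement (1.7 A data)", 0.448, 0.452),
    ("positional refinement", 0.385, 0.403),
    ("B-factor and positional refinement", 0.307, 0.328),
    ("waters added, simulated annealing", 0.263, 0.279),
    ("final model (1.5 A data)", 0.198, 0.227),
]

# Gap between Rfree and R at each stage.
gaps = [round(r_free - r, 3) for _, r, r_free in STAGES]

if __name__ == "__main__":
    print("toy R factor: %.4f" % toy_r)
    for (stage, r, r_free), gap in zip(STAGES, gaps):
        print("%-38s R = %.3f  Rfree = %.3f  gap = %.3f" % (stage, r, r_free, gap))
```

Note that the final values refer to the 1.5 Å data set, so they are not strictly comparable with the earlier stages, which were refined against the 1.7 Å data.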
1. Roberts, R.J. & Halford, S.E. Type II restriction endonucleases. in Nucleases (eds.
Linn, S.M., Lloyd, R.S. & Roberts, R.J.) 35-88 (Cold Spring Harbor Laboratory Press,
Cold Spring Harbor, New York; 1993).
2. Aggarwal, A.K. Structure and function of restriction endonucleases. Curr. Opin.
Struct. Biol. 5, 11–19 (1995).
3. Pingoud, A. & Jeltsch, A. Recognition and cleavage of DNA by type-II restriction
endonucleases. Eur. J. Biochem. 246, 1–22 (1997).
4. Newman, M., Strzelecka, T., Dorner, L.F., Schildkraut, I. & Aggarwal, A.K.
Structure of restriction endonuclease BamHI and its relationship to EcoRI. Nature
368, 660–664 (1994).
5. Newman, M., Strzelecka, T., Dorner, L.F., Schildkraut, I. & Aggarwal, A.K.
Structure of restriction endonuclease BamHI phased at 1.95Å resolution by MAD
analysis. Structure 2, 439–452 (1994).
6. Kim, Y.C., Grable, J.C., Love, R., Greene, P.J. & Rosenberg, J.M. Refinement of
EcoRI endonuclease crystal structure: a revised protein chain tracing. Science
249, 1307–1309 (1990).
7. Winkler, F.K. et al. The crystal structure of EcoRV endonuclease and of its complexes
with cognate and non-cognate DNA fragments. EMBO J. 12, 1781–1795 (1993).
8. Cheng, X., Balendiran, K., Schildkraut, I. & Anderson, J.E. Structure of PvuII
endonuclease with cognate DNA. EMBO J. 13, 3927–3935 (1994).
9. Athanasiadis, A. et al. Crystal structure of PvuII endonuclease reveals extensive
structural homologies to EcoRV. Nature Struct. Biol. 1, 469–475 (1994).
10. Alves, J. et al. Changing the hydrogen-bonding potential in the DNA binding site
of EcoRI by site-directed mutagenesis drastically reduces the enzymatic activity,
not, however, the preference of this restriction endonuclease for cleavage within
the site -GAATTC-. Biochemistry 28 (1989).
11. Heitman, J. & Model, P. Substrate recognition by the EcoRI endonuclease.
Proteins 7, 185–197 (1990).
12. Osuna, J., Flores, H. & Soberon, X. Combinatorial mutagenesis of three major
groove-contacting residues of EcoRI: single and double amino acid replacements
retaining methyltransferase-sensitive activities. Gene 106, 7–12 (1991).
13. Xu, S. & Schildkraut, I. Isolation of BamHI variants with reduced cleavage
activities. J. Biol. Chem. 266, 4425–4429 (1991).
14. Wenz, C. et al. Protein engineering of the restriction endonuclease EcoRV:
replacement of an amino acid residue in the DNA binding site leads to an altered
selectivity towards unmodified and modified substrates. Biochim. Biophys. Acta
1219, 73–80 (1994).
15. Dorner, L.F., Bitinaite, J., Whitaker, R.D. & Schildkraut, I. Genetic analysis of the
base-specific contacts of BamHI restriction endonuclease. J. Mol. Biol. 285,
1515–1523 (1999).
16. Newman, M., Strzelecka, T., Dorner, L.F., Schildkraut, I. & Aggarwal, A.K.
Structure of BamHI endonuclease bound to DNA: partial folding and unfolding
on DNA binding. Science 269, 656–663 (1995).
17. Viadiu, H. & Aggarwal, A.K. The role of metals in catalysis by the restriction
endonuclease BamHI. Nature Struct. Biol. 5, 910–916 (1998).
18. Anton, B.P. et al. Cloning and characterization of the BglII restriction-modification
system reveals a possible evolutionary footprint. Gene 187, 19–27 (1997).
19. Schneider, B., Neidle, S. & Berman, H.M. Conformations of the sugar-phosphate
backbone in helical DNA crystal structures. Biopolymers 42, 113–124 (1997).
20. Beamer, L.J. & Pabo, C.O. Refined 1.8 Angstrom crystal structure of the λ-
repressor operator complex. J. Mol. Biol. 227, 177–196 (1992).
21. Dorner, L.F. & Schildkraut, I. Direct selection of binding proficient/catalytic deficient
variants of BamHI endonuclease. Nucleic Acids Res. 22, 1068–1074 (1994).
22. Reber, E.J. & Pabo, C.O. Zinc finger phage: affinity selection of fingers with new
DNA-binding specificities. Science 263, 671–673 (1994).
23. Choo, Y. & Klug, A. Toward a code for the interactions of zinc fingers with DNA:
selection of randomized fingers displayed on phage. Proc. Natl. Acad. Sci. USA 91,
11163–11167 (1994).
24. Wharton, R.P. & Ptashne, M. Changing the binding specificity of a repressor by
redesigning an alpha-helix. Nature 316, 601–605 (1985).
25. Selent, U. et al. A site-directed mutagenesis study to identify amino acid residues
involved in the catalytic function of the restriction endonuclease EcoRV.
Biochemistry 31, 4808–4815 (1992).
26. Vipond, I.B., Baldwin, G.S. & Halford, S.E. Divalent metal ions at the active sites of
EcoRV and EcoRI restriction endonucleases. Biochemistry 34, 697–704 (1995).
27. Jeltsch, A., Alves, J., Wolfes, H., Maass, G. & Pingoud, A. Substrate-assisted
catalysis in the cleavage of DNA by the EcoRI and EcoRV restriction enzymes.
Proc. Natl. Acad. Sci. USA 90, 8499–8503 (1993).
28. Horton, N.C., Newberry, K.J. & Perona, J.J. Metal ion-mediated substrate-assisted
catalysis in type II restriction endonucleases. Proc. Natl. Acad. Sci. USA 95, 13489–13494 (1998).
29. Jurica, M.S., Monnat, R.J. Jr & Stoddard, B.L. DNA recognition and cleavage by
the LAGLIDADG homing endonuclease I-CreI. Mol. Cell 2, 469–476 (1998).
30. Hendrickson, W.A. Determination of macromolecular structures from anomalous
diffraction of synchrotron radiation. Science 254, 51–58 (1991).
31. Otwinowski, Z. & Minor, W. Processing of X-ray diffraction data collected in
oscillation mode. Methods Enzymol. 276, 307–326 (1997).
32. Furey, W. & Swaminathan, S. PHASES-95: a program package for the processing
and analysis of diffraction data from macromolecules. Methods Enzymol. 277,
590–629 (1997).
33. Tong, L. PATSOL - v1.2. (1993).
34. de La Fortelle, E. & Bricogne, G. Maximum-likelihood heavy-atom parameter
refinement for multiple isomorphous replacement and multiwavelength
anomalous diffraction methods. Methods Enzymol. 276, 472–494 (1997).
35. Collaborative Computational Project Number 4. CCP4 Suite: programs for protein
crystallography. Acta Crystallogr. D 50, 760–763 (1994).
36. Jones, T.A., Zou, J.Y., Cowan, S.W. & Kjeldgaard, M. Improved methods for
building protein models in electron density maps and the location of errors in
these models. Acta Crystallogr. A 47, 110–119 (1991).
37. Adams, P.D., Pannu, N.S., Read, R.J. & Brunger, A.T. Cross-validated maximum
likelihood enhances crystallographic simulated annealing refinement. Proc. Natl.
Acad. Sci. USA 94, 5018–5023 (1997).
38. Kraulis, P.J. MOLSCRIPT: a program to produce both detailed and schematic plots
of protein structures. J. Appl. Crystallogr. 24, 946–950 (1991).
39. Merritt, E.A. & Bacon, D.J. Raster3D photorealistic molecular graphics. Methods
Enzymol. 277, 505–524 (1997).
40. Nicholls, A., Sharp, K. & Honig, B. Protein folding and association: insights from
the interfacial and thermodynamic properties of hydrocarbons. Proteins Struct.
Funct. Genet. 11, 281–296 (1991).
41. Lavery, R. & Sklenar, H. The definition of generalized helicoidal parameters and of
axis curvature for irregular nucleic acids. J. Biomol. Struct. Dyn. 6, 63–91 (1988).
Coordinates. Coordinates of both structures have been deposited in
the Protein Data Bank (accession numbers 1D2I and 1DFM for the 1.7
and 1.5 Å structures, respectively).
Acknowledgments
We are grateful to P. Weber and C. Lesburg (Schering-Plough Research Institute)
for use of APS synchrotron time and for assistance with data collection,
respectively. We thank L. Berman and H. Lewis for facilitating data collection at
NSLS. We also thank H. Viadiu for stimulating discussion and analysis. A.K.A. is
supported by a grant from the NIH, and C.M.L. is supported by the Damon
Runyon-Walter Winchell Cancer Research Fund.
Received 5 November, 1999; accepted 28 December, 1999.
nature structural biology • volume 7 number 2 • february 2000