© 2000 Nature America Inc. • http://structbio.nature.com

articles

Understanding the immutability of restriction enzymes: crystal structure of BglII and its DNA substrate at 1.5 Å resolution

Christine M. Lukacs1, Rebecca Kucera2, Ira Schildkraut2 and Aneel K. Aggarwal1

Restriction endonucleases are remarkably resilient to alterations in their DNA binding specificity. To understand the basis of this immutability, we have determined the crystal structure of endonuclease BglII bound to its recognition sequence (AGATCT), at 1.5 Å resolution. We compare the structure of BglII to endonuclease BamHI, which recognizes a closely related DNA site (GGATCC). We show that both enzymes share a similar α/β core, but in BglII, the core is augmented by a β-sandwich domain that encircles the DNA to provide extra specificity. Remarkably, the DNA is contorted differently in the two structures, leading to different protein–DNA contacts for even the common base pairs. Furthermore, the BglII active site contains a glutamine in place of the glutamate at the general base position in BamHI, and only a single metal is found coordinated to the putative nucleophilic water and the phosphate oxygens. This surprising diversity in structures shows that different strategies can be successful in achieving site-specific recognition and catalysis in restriction endonucleases.

Restriction endonucleases are paradigms for the study of protein–DNA recognition. Most of the >3,000 restriction endonucleases discovered to date belong to the type II class, whose members recognize and cleave short palindromic DNA sites, requiring only Mg²⁺ for optimal activity1. Their specificity is extraordinary. A single variation in the DNA sequence results in over a million-fold loss in activity1. In marked contrast to several well-characterized families of transcription factors, restriction endonucleases share little sequence similarity.
Nonetheless, several crystal structures of restriction endonucleases have revealed a similar α/β core, consisting of a β-sheet flanked by several helices2,3. Interestingly, based on these few structures, the similarity is strongest between endonucleases that share a similar cleavage pattern, such as BamHI4,5 and EcoRI6, which cleave DNA to leave four-base 5' overhangs, or EcoRV7 and PvuII8,9, which cleave DNA to produce blunt ends.

Using the knowledge gained from these structures, a number of attempts have been made to alter the specificity of restriction endonucleases by both single-site and cassette mutagenesis10–15. In general, these attempts have been unsuccessful. For instance, the structure of the BamHI–DNA complex reveals that the enzyme recognizes its cognate sequence GGATCC primarily through residues Asn 116, Ser 118, Arg 122, Asp 154 and Arg 155 (ref. 15). However, cassette mutagenesis of these residues together with in vivo transcriptional selection reveals acceptable substitutions for the cognate sequence but not for any other DNA sequence15. Even the closely related BglII recognition site AGATCT fails to yield any viable mutations within the BamHI α/β core (unpublished data), prompting us to ask why an existing restriction endonuclease scaffold cannot be modified to recognize a different, or even a closely related, DNA site.

Fig. 1 Overall structure. a, A view of the BglII–DNA complex looking down the DNA axis. α-Helices, β-strands and loops A–E are labeled on one monomer. Loops B and C make direct DNA contacts, while loops A and D are involved in water-mediated DNA contacts. The figure was produced with Molscript38 and Raster3D39. b, The BglII–DNA complex rotated 90° from the view in (a) (left), and the corresponding view of the BamHI–DNA structure (right). The common α/β core is displayed as a gray molecular surface and the rest of the protein is in ribbon form.
In BglII, residues Asp 38, Tyr 190 and Arg 189 from loops A and D of the β-sandwich subdomain and a water molecule (red sphere) participate in an extended hydrogen bond network that effectively encloses the DNA. By comparison, BamHI has only a small secondary domain. The figure was produced with Grasp40, Molscript38 and Raster3D39.

1Structural Biology Program, Department of Physiology and Biophysics, Mt. Sinai School of Medicine, 1425 Madison Avenue, New York, New York 10029, USA. 2New England Biolabs, 32 Tozer Road, Beverly, Massachusetts 01915-5599, USA. Correspondence should be addressed to A.K.A. email: [email protected]

nature structural biology • volume 7 number 2 • february 2000

To understand how nature has solved the problem of distinguishing the closely related DNA sites, we undertook the structure determination of BglII as a complement to our earlier work on BamHI4,5,16,17. BglII, a 223-amino acid protein, was originally isolated from B. globigii18 and its sequence shows little homology to other endonucleases, including BamHI. The enzyme was cocrystallized with a 16-base pair DNA fragment containing its recognition sequence AGATCT. The structure was determined at 1.5 Å resolution, which to our knowledge makes it the highest resolution protein–DNA complex structure solved to date. We show that despite the lack of sequence homology between BglII and BamHI, the two enzymes share a similar α/β core. However, in BglII, the α/β core is augmented by a β-sandwich subdomain that provides additional specificity (Fig. 1). Surprisingly, the common base pairs (AGA TCT) are recognized in rather different ways in the two complexes because of differences in DNA conformation, and the BglII active site residues suggest a novel mechanism of DNA hydrolysis.
Taken together, this unexpected diversity in DNA recognition, DNA structure and active site challenges our assumptions about the evolution of restriction enzymes in bacteria and provides a basis for the generally unsuccessful attempts at modifying their specificities.

Overall fold

At 1.5 Å resolution, the BglII–DNA structure provides an exceptionally detailed picture of the numerous interactions that underlie BglII–DNA recognition. The majority of water molecules at the protein–DNA interface were seen directly in the experimental 1.7 Å electron density map, calculated with phases from a multiwavelength anomalous dispersion (MAD) experiment. The outstanding quality of the MAD map (Fig. 2) was affirmed by the fact that even thymine and cytosine bases, whose shapes differ by a single methyl group, were clearly differentiated. The BglII dimer approaches DNA from the major groove side and wraps around to the minor groove side. The monomer is composed of two domains: an α/β core domain that bears resemblance to BamHI, and a unique five-stranded β-sandwich domain (Fig. 1a). Loops from the β-sandwich domain reach into both the major and minor grooves to tightly grip the DNA (Fig. 1b, left). Consequently, the encircling of the DNA is much more extensive than in BamHI (Fig. 1b, right). Remarkably, Tyr 190 residues from the two monomers point directly at each other and are involved in a network of water-mediated interactions that effectively leads to a completely closed ring of protein around the DNA (Fig. 1b, left).

Fig. 2 Experimental electron density of the BglII–DNA complex from MAD phasing. a, Stereo view of a guanine:cytosine base pair and several water molecules near the center of the oligomer. b, A β-strand and an α-helix. The experimental map (1.7 Å) was calculated with the program SHARP34 and contoured at 1.2σ.
Protein conformation

The α/β core is similar to that of BamHI, containing a six-stranded β-sheet (β1, β3, β4, β5, β6 and β7) surrounded by five α-helices (α1, α2, α3, α4 and α5), two of which (α4 and α5) are involved in homodimerization. The loops preceding the dimerization helices (loops B and C) carry the residues that contact bases in the major groove. Although the BglII and BamHI cores differ in the lengths of the α-helices and the β-strands (particularly strands β4 and β5, consisting of 5 versus 9 and 8 versus 12 amino acids, respectively), a more profound difference is found upon comparing the dimers. A superimposition of the central four strands of one monomer from each structure shows that the other monomers do not overlie, and are in fact displaced by as much as 4.3 Å in the outer helices. This alteration in the relationship of the monomers in the two structures leads to a narrower DNA binding cleft in BglII.

BamHI has a small two-stranded β-substructure outside of the α/β core (Fig. 1b, right)5,16. In BglII, this small substructure is elaborated into a full five-stranded β-sandwich subdomain (β2, β8, β9, β10 and β11) that extends the overall size and shape of the protein (Fig. 1a,b, left). β2 is composed of residues near the N-terminus, whereas the rest of the domain is composed of the C-terminal 50 residues. The residues following β2 form a long loop (loop A) that projects into the DNA major groove, while the corresponding residues in BamHI take a completely different path and do not enter the groove (Fig. 1b, right). The loop between strands β8 and β9 (loop D) is the primary site of interactions with bases in the minor groove. The loop between strands β10 and β11 (loop E) is visible in only one of the monomers, extending beyond the length of the 16-mer and making contacts with a translationally related DNA molecule (Fig. 1b, right). Overall, the second subdomain gives the BglII–DNA complex a ‘square-like’ appearance when viewed down the DNA helix (Fig.
1a).

Fig. 3 DNA parameters. a, Side view of the DNA from both the BglII and BamHI complexes showing the axes of curvature. b, Stereo view of the inner and middle base pairs of BglII DNA. The DNA unwinds by 20° between the two inner base pairs, and the sugar of the middle cytosine has a C3'-endo conformation. c, A comparison of the helical and propeller twist values of BglII and BamHI. The DNA parameters were calculated with the program CURVES41.

DNA conformation

The DNA is distorted by bending and by local unwinding and overwinding. In contrast to the relatively straight DNA in the BamHI complex16, the BglII DNA bends away (∼23°) from the protein α/β core in a fairly smooth fashion (Fig. 3a). This results in the stacked DNA duplexes forming continuous, sinuous helices throughout the crystal. However, the most striking contrast with BamHI DNA arises in comparing the pattern of helical and propeller twists within the recognition sequence (Fig. 3b,c). In particular, the central step of the BglII recognition sequence (AGA TCT) is unwound by over 15° as compared to the BamHI complex, or to ideal B-DNA. This causes the orientation of the two inner base pairs to be almost superimposed, with nearly parallel Watson–Crick hydrogen bonds. Interestingly, a similar unwinding at the central step has been described in EcoRV– and EcoRI–DNA complexes, but in those cases, the two inner base pairs become unstacked, assume high roll angles, and cause large kinks in the DNA axis6,7. Moreover, the unwinding in the EcoRV and EcoRI complexes continues to the adjoining base steps, whereas we observe overwinding (by ∼7°) between the inner and middle base steps (AGA TCT).
This compensates for the unwinding at the center, so that the outer recognition base pairs (AGA TCT) in the BglII DNA are only a few degrees offset from the equivalent base pairs in the BamHI complex (GGA TCC).

In analyzing the DNA backbone, the most striking deviation from ideal B-DNA occurs in the torsion angles specifying the inner and middle pyrimidines (AGA TCT). Sugars of the middle cytosines adopt a C3'-endo conformation (pseudorotation angles of 33–34°), which is typical of A-DNA. The A-DNA character is further reflected in the torsion angles δ (C4'–C3') and χ (C1'–N1) of the cytosines, which take on values (∼88° and 204°, respectively) that normally distinguish A- from B-DNA19. The inner thymines assume a conformation known as BII, characterized by torsion angles ε (C3'–O3') and ζ (O3'–P) that are gauche-, trans rather than trans, gauche- as in the more common BI form of B-DNA19. These atypical backbone conformations at the inner and middle pyrimidines are stabilized in part by hydrogen bonds between their phosphate groups and the main chain NH groups of residues 38 and 40. Interestingly, residues 38 and 40 are carried on a loop (loop A) that is missing in BamHI (Fig. 1b), which may explain the lack of analogous backbone changes in the BamHI DNA. Whether these transitions in the DNA backbone account for the unwinding and overwinding of the central bases in the BglII DNA is unproven, but the correlation is highly suggestive.

Overall, the resolution of our structure provides a remarkably detailed view of the local changes that occur in B-DNA upon protein binding. The accuracy of the helical parameters and the backbone torsion angles is reflected by their two-fold symmetry, even though the two halves of the DNA were built independently and refined without any averaging.

DNA recognition

The BglII recognition site differs from that of BamHI by only the outer base pair. Thus, we expected the recognition of the inner and middle base pairs to be very similar.
This turns out not to be the case, even though the residues near these base pairs (loop B) are similar. In particular, Asn 116, which recognizes the inner and middle base pairs in BamHI (GGA)16, has an equivalent residue in BglII (Asn 98), but it adopts a different configuration and instead forms bidentate hydrogen bonds with its two-fold related counterpart across the dimer interface. A superimposition of the two enzymes shows that Asn 98 is unable to interact with the DNA in the same way as Asn 116 in BamHI, as a result of differences in bending and other helical parameters between the two DNAs. Consequently, the only contribution Asn 98 makes toward recognition of the middle G:C base pair in BglII is a water-mediated hydrogen bond to the guanine (Fig. 4a,b, left). Also, there are no direct contacts to the inner A:T base pair in BamHI (GGA TCC), but in BglII a direct hydrogen bond is formed between the N7 of the adenine and the Oγ of Ser 97 from the two-fold related monomer (Fig. 4b). The one common interaction maintained between BglII and BamHI is a hydrogen bond between the N4 of cytosine of the middle G:C base pair and the main chain carbonyl of Asn 140 (Asp 154 in BamHI).

Recognition of the variant outer base pair is achieved by two equivalently positioned residues on the loop (loop C) preceding the dimerization helix α5 (α6 in BamHI). Residues Asn 140 and Ser 141 in BglII substitute for Asp 154 and Arg 155 in BamHI to recognize an A:T base pair in place of a G:C base pair (Fig. 4b,c). In BamHI, Arg 155 donates bidentate hydrogen bonds to the outer guanine while Asp 154 accepts a single hydrogen bond from the cytosine. In BglII, Ser 141 forms bidentate hydrogen bonds with the outer adenine, and Asn 140 donates a single hydrogen bond to the O4 of the thymine (Fig. 4). Because of the low sequence homology between BglII and BamHI, the structural identity of these outer recognition residues was only apparent upon solving the BglII structure.

Fig. 4 DNA recognition. a, Stereo view of the protein–base contacts in one half-site of the recognition sequence. The two monomers of BglII are colored in different shades of yellow and are labeled A and B. b, Schematic representation of the phosphate and base contacts from (left) one monomer of BglII and (right) one monomer of BamHI. c, Superimposition of the outer base pair recognition residues Asn 140 and Ser 141 from BglII with Asp 154 and Arg 155 of BamHI. The superimposition reveals a 2.5 Å shift in the position of the outer bases.

In contrast to these residues that contact bases in the major groove, no structural relationship exists between BglII and BamHI for residues interacting in the minor groove. In BamHI, the minor groove is contacted asymmetrically by a C-terminal ‘arm’ from one subunit (Fig. 1b, right)16. However, in BglII the minor groove is contacted symmetrically by a loop (residues Asp 187–Arg 192: loop D) emanating from the β-sandwich domain, which wraps around the minor groove to make water-mediated contacts with all three base pairs in each half-site (Fig. 4a). In particular, Arg 192 forms a series of bridging water interactions with the pyrimidines of the outer two base pairs (AGA), whereas the two symmetry-related Tyr 190 residues collaborate to fix a single water molecule at the center of the DNA, which in turn donates hydrogen bonds to the O2 of thymines of the inner base pairs (AGA TCT). (A second water molecule between these two tyrosines also contacts both Arg 189 residues, which in turn form hydrogen bonds with Asp 38, allowing loop A to buttress loop D and further extend the network involved in surrounding the DNA (Fig. 1b, left)).
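
The relationship between the two recognition sites discussed in this section can be checked with a short sketch. The following Python snippet is only an illustration, not part of the original analysis: the function names are hypothetical, and the only inputs taken from the text are the two site sequences (AGATCT for BglII, GGATCC for BamHI). It confirms that both sites are palindromic (each equals its own reverse complement) and that they differ only at the outer positions.

```python
# Illustrative sketch: compare the BglII and BamHI recognition sites.
# Function names are hypothetical; only the two sequences come from the text.

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

BGLII_SITE = "AGATCT"
BAMHI_SITE = "GGATCC"


def reverse_complement(site: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(site))


def is_palindromic(site: str) -> bool:
    """A site is palindromic when it equals its own reverse complement."""
    return site == reverse_complement(site)


def differing_positions(site_a: str, site_b: str) -> list[int]:
    """Return the 1-based positions at which two equal-length sites differ."""
    if len(site_a) != len(site_b):
        raise ValueError("sites must have the same length")
    return [
        index
        for index, (a, b) in enumerate(zip(site_a, site_b), start=1)
        if a != b
    ]


if __name__ == "__main__":
    print("BglII palindromic:", is_palindromic(BGLII_SITE))
    print("BamHI palindromic:", is_palindromic(BAMHI_SITE))
    print("Differing positions:", differing_positions(BGLII_SITE, BAMHI_SITE))
```

Because the sites are palindromic, the single outer base pair difference per half-site appears as two differing positions (the first and last) when the full six-base sequences are compared.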
The DNA backbone contacts extend well beyond the recognition sequence. Each subunit makes eight direct DNA phosphate contacts and another 20 mediated via water molecules (Fig. 4b, left). The extent of the DNA backbone contacts is enhanced by the presence of the β-sandwich subdomain, which in addition to the minor groove loop (loop D) has two other loops that project outward, with one of them (loop A) brushing against the major groove and the other (loop E) jutting beyond the length of the 16-base pair oligomer. Consequently, BglII buries substantially more solvent-accessible surface area (∼5,515 Å²) than BamHI (4,200 Å²) and far more than dimeric transcription factors such as λ-repressor (3,022 Å²)20.

Fig. 5 Active site. A stereo view closeup of the BglII active site around the scissile phosphodiester bond. Residues Asn 69, Asp 84, Glu 93 and Gln 95 correspond to the BamHI active site residues Glu 77, Asp 94, Glu 111 and Glu 113. The structure reveals numerous water molecules and an octahedrally coordinated cation (orange sphere) between the conserved residues and the DNA. Five of the six ligands have distances ranging from 2.2 to 2.5 Å; the sixth ligand (to the proposed nucleophilic water molecule) is 2.7 Å. One water molecule in the coordination sphere of the cation (labeled NW) makes a 155° angle with the scissile P–O3' bond, and appears to be the attacking nucleophile.

Active site

Structural studies on restriction endonucleases have revealed a similar architecture for the active site with the residues following the weak consensus sequence (Glu/Asp)-X(9–20)-(Glu/Asp/Ser)-X-(Lys/Glu). BamHI is the only endonuclease that contains a glutamate (Glu 113) at the final position of the consensus, the importance of which is underlined by the fact that mutating this residue to lysine inactivates the enzyme21.
The presence of oppositely charged lysine or glutamate residues at the last position of the consensus has led to much debate about the catalytic mechanism(s) of restriction endonucleases. The BglII structure adds a further twist to the debate by revealing for the first time in restriction enzymes a glutamine at this position. Architecturally, the BglII active site is similar to other endonucleases but follows the sequence Asp 84-X9-Glu 93-X-Gln 95 (Fig. 5). Remarkably, BamHI becomes inactive when Glu 113 is mutated to a glutamine as in BglII (unpublished data).

We see evidence of a single metal in the BglII active site, although no divalent cations were used in the crystallization of the native complex. Thus the metal may be a sodium ion from the protein or crystallization buffer (Tris or MES) occupying a potential Mg²⁺ binding site. A similarly bound metal was found when we soaked the native cocrystals in calcium. The single cation is octahedrally coordinated with the side chain oxygen of Asp 84, the backbone carbonyl of Val 94, a phosphate oxygen, and three water molecules, including one that makes a ∼155° angle with the P–O3' bond of the scissile phosphate group (Fig. 5). This water molecule is well poised to act as the attacking nucleophile, with a pKa lowered by its contact with the metal and its orientation fixed by a hydrogen bond with the Oε1 of Gln 95.

The presence of a single metal in the BglII active site contrasts with BamHI, which contains two metals in the active site, both of which are postulated to be involved in the stabilization of the pentacovalent transition state16,17. A superimposition of the BglII and BamHI active site residues places the metal only 0.3 Å away from metal A in BamHI, but BglII lacks the equivalent of metal B, which lies close to the leaving O3' atom and is coordinated by residue Glu 77 (see ref. 17 for details of the BamHI active site).
The equivalent metal may be missing in BglII because it has a less acidic Asn 69 at the position corresponding to Glu 77 in BamHI (Fig. 5). Curiously, Glu 93 in our structure is turned away from the active site, although it could rotate to a conformation similar to Glu 111 in BamHI in order to coordinate metal A. Another difference between the two structures is that Gln 95 coordinates to the nucleophilic water, whereas its equivalent residue in BamHI (Glu 113) directly contacts the phosphate oxygen.

Discussion

BglII is remarkable in both its differences from and its similarities to BamHI. Both enzymes share a similar α/β core that carries the catalytic center and the residues that contact DNA bases in the major groove. However, the α/β core in BglII is augmented by a β-sandwich subdomain that extends the size and shape of the enzyme. Several projections (loops A, D and E) extend outward from the β-sandwich subdomain to grip the DNA, effectively leading to a closed ring of protein around the DNA. This total enclosure of the DNA is unprecedented for a sequence-specific DNA binding protein, and suggests a possible hingelike motion between the two subdomains that allows the DNA to enter. Critical to the closure of the enzyme is a network of water-mediated hydrogen bonds between two symmetry-related tyrosines (Tyr 190) that complete the embrace around the DNA (Fig. 1a,b, left).

The protein–DNA contacts are different between BglII and BamHI. For instance, the minor groove contacts in BamHI are made by the C-terminal arm of one subunit16, whereas the analogous contacts in BglII are made in a symmetrical manner by loops D emanating from the β-sandwich subdomains. One intriguing difference lies in the recognition of the two common (inner and middle) base pairs in the major groove.
Both enzymes position an equivalent asparagine residue near these base pairs, but in BamHI the asparagine (Asn 116) is directly hydrogen bonded to these base pairs whereas in BglII Asn 98 is mediating dimer contacts.

Recognition of the variable, outer base pair is, as expected, different between BglII and BamHI. Residues Asn 140 and Ser 141 in BglII substitute for residues Asp 154 and Arg 155 in BamHI in order to recognize an A:T base pair instead of a G:C base pair. The pattern of donor and acceptor atoms on these sets of residues is such that in BglII it could only match an A:T base pair and in BamHI only a G:C base pair. At first glance, changing the specificity of BamHI in order to recognize the BglII site appears to be a simple case of substituting an asparagine and a serine for Asp 154 and Arg 155. However, when we model these substitutions into BamHI, we see a 2.5 Å displacement of the outer base, caused by differences in DNA bending in the two structures, that places the base too far away to interact with a short serine residue (Fig. 4c). This small but critical difference in how the two enzymes contort the DNA plays an important role in their specificity.

A major reason that we undertook the X-ray analysis of BglII was to learn why one cannot select BamHI mutants to recognize the closely related BglII site based on the cassette mutagenesis of residues Asn 116, Ser 118, Asp 154 and Arg 155 (ref. 15; unpublished data). Several attempts have been made to switch the specificities of restriction endonucleases, but generally without much success10,12–15. This lack of success contrasts with transcription factors, where at least partial success has been achieved with proteins containing a similar DNA binding motif, such as the helix-turn-helix motif or the zinc finger22–24. To understand the immutability of restriction enzymes in evolutionary terms, we suggest that they are under a unique selective pressure not to look too much alike.
For instance, we can imagine that if the specificity of BamHI was to change to that of BglII through just a few point mutations, it would be lethal for the host bacterium because it would become susceptible to cleavage at the unmethylated BglII sites on its genome. Thus, restriction enzymes may be under strong selective pressure to evolve an interface that is not only complementary toward their recognition sequence but also one that cannot so easily be altered to recognize another unprotected sequence.

Table 1 Data collection, phasing and refinement statistics

MAD data collection and phasing statistics
                                                 Se-edge        Se-peak        Se-remote      Native
Wavelength (Å)                                   0.9792         0.9791         0.9494         0.9790
Number of reflections                            623,003        620,648        629,080        346,887
Number of unique reflections (F+ + F-)           123,511        123,628        123,664        91,032
Completeness (%, last shell)                     99.4 (94.0)    99.4 (92.8)    99.8 (99.3)    96.6 (97.5)
Rsym (last shell)1                               0.070 (0.151)  0.073 (0.146)  0.069 (0.162)  0.078 (0.289)
I / σ(I)                                         11.3           11.2           12.1           7.8
Phasing power (dispersive acentrics)             2.38           3.28           –
Phasing power (anomalous acentrics)              3.51           3.32           2.79
Mean overall figure of merit (centric/acentric)  –              –              0.46 / 0.69

Refinement statistics
                                                 at 1.7 Å resolution  at 1.5 Å resolution
Resolution range (Å)                             20–1.7               20–1.5
Number of reflections used in refinement         62,634 (2,148)       84,917 (2,144)
σ cutoff used in refinement                      0                    1
Rcryst / Rfree2                                  0.182 / 0.200        0.195 / 0.225
Number of atoms
  Protein / DNA                                  3,590 / 650          3,590 / 650
  Water / ions                                   543 / 2              614 / 2
B factors (Å²)
  Protein / DNA                                  13.9 / 15.7          12.6 / 14.7
  Water / ions                                   22.8 / 24.8          21.5 / 26.0
R.m.s. deviations
  Bonds (Å)                                      0.018                0.017
  Angles (°)                                     1.88                 1.91

1Rsym = Σ|Ih - <Ih>| / ΣIh over all h, where Ih is the intensity of reflection h.
2Rcryst / Rfree = Σ||Fo| - |Fc|| / Σ|Fo|. Rfree was calculated using 3–4% of data excluded from refinement.
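
The residuals defined in the Table 1 footnotes can be illustrated with a minimal Python sketch. This is not the authors' software (the data were processed with SCALEPACK and CNS); the function names and the toy numbers in the usage block are invented purely for illustration. The sketch assumes that Rsym is summed over repeated measurements of each unique reflection, and that Rcryst and Rfree share one formula and differ only in which reflections (working set or excluded test set) are passed in.

```python
# Illustrative sketch of the residuals in the Table 1 footnotes.
# All names and numbers below are hypothetical, not data from the paper.


def r_sym(measurements: dict[tuple[int, int, int], list[float]]) -> float:
    """Rsym = sum |I_h - <I_h>| / sum I_h over repeated measurements.

    `measurements` maps a reflection index (h, k, l) to the list of
    intensities measured for that reflection.
    """
    numerator = 0.0
    denominator = 0.0
    for intensities in measurements.values():
        mean_intensity = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean_intensity) for i in intensities)
        denominator += sum(intensities)
    return numerator / denominator


def r_factor(f_obs: list[float], f_calc: list[float]) -> float:
    """R = sum ||Fo| - |Fc|| / sum |Fo| over the supplied reflections.

    Passing the working set gives Rcryst; passing the reflections
    excluded from refinement gives Rfree.
    """
    if len(f_obs) != len(f_calc):
        raise ValueError("f_obs and f_calc must have the same length")
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator


if __name__ == "__main__":
    toy_measurements = {(1, 0, 0): [100.0, 110.0], (0, 1, 0): [50.0, 50.0]}
    print(f"Rsym (toy data): {r_sym(toy_measurements):.4f}")
    print(f"R (toy data):    {r_factor([100.0, 200.0, 300.0], [90.0, 210.0, 300.0]):.4f}")
```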
From our superimpositions, it is clear that additional residues, such as those mediating dimerization and DNA backbone contacts, also need to be mutated in order to configure the DNA for optimal contacts in the major groove. All in all, the BglII structure strongly reinforces a sense that protein–DNA recognition in restriction endonucleases must be considered in the context of the whole protein and not just the few residues that interact directly with the DNA bases.

The BglII structure reveals a novel active site in which a glutamine residue is seen for the first time at a position corresponding to the glutamate in BamHI and lysine in most other restriction endonucleases2. Thus, the rather tentative active site consensus sequence is further weakened to (Glu/Asp)-X(9–20)-(Glu/Asp/Ser)-X-(Lys/Glu/Gln). It is curious that even though all restriction endonucleases catalyze the same chemical reaction and their active sites are generally superimposable, there is this level of diversity in the identity of the active site residues. Remarkably, BamHI becomes inactive when the glutamate at the last position of the consensus is changed to a lysine as in EcoRV21 or a glutamine as in BglII (unpublished results), and (conversely) EcoRV loses activity when lysine is changed to a glutamate25. Whether this variability in active site residues actually reflects different mechanisms of catalysis is unclear at present. For BamHI a two-metal mechanism of catalysis has been proposed17, whereas a variety of mechanisms have been postulated for EcoRV, ranging from substrate-assisted catalysis to the involvement of one, two and even three metals3,26–28. In the BglII structure, we see evidence for the involvement of a single metal in hydrolysis.
The BglII active site bears resemblance to the active site of the unrelated LAGLIDADG homing endonuclease I-CreI, which also contains a single metal ion as well as a glutamine residue coordinated to the nucleophilic water29. Thus, the structure supports suggestions of a link between the catalytic mechanisms of restriction enzymes and the LAGLIDADG homing endonucleases29.

All in all, the number of known restriction endonuclease structures (<10) is still very small and selective, when one considers that more than 3,000 restriction endonucleases have been identified1. The disparity in BglII and BamHI structures shows vividly the different successful strategies for site-specific recognition and catalysis in this large family of enzymes.

Methods

Crystallization. BglII was cloned from B. globigii RUB561 into Escherichia coli18. Recombinant selenomethionyl BglII was expressed in the presence of selenomethionine and by inhibiting the methionine synthetic pathway. The protein was purified by a series of low-pressure chromatographic steps: DEAE (flow-through), phospho-cellulose, heparin-Sepharose, and Q-Sepharose. The 2'tritylated oligomer 5'-TATTATAGATCTATAA-3' was purified by HPLC, dissolved to 1 mM in TE (10 mM Tris-HCl pH 8.0, 1 mM EDTA) and 100 mM KCl, and reannealed. Before crystallization, protein at 1.6 mg ml-1 (in 100 mM KCl, 10 mM Tris pH 7.4, 1 mM dithiothreitol and 10% glycerol) was incubated with a 0.6 M equivalent of duplex oligonucleotide at 0 °C for 30 min, then concentrated to 8–10 mg ml-1. A 1 µl drop of protein–DNA complex was combined with 1 µl of precipitant (17–20% PEG 1540, 0.2 M ammonium sulfate, 0.1 M MES pH 5.2) in a well of a microbatch plate under a layer of paraffin oil. Crystals grew at 20 °C within a few days and continued to grow slowly for several weeks.
Crystals were transferred in steps to a solution of 25% PEG 1540, 10–20% glycerol, 0.2 M ammonium sulfate, 0.1 M MES pH 5.2 and then frozen by immersion in liquid propane.

Data collection and structure determination. Diffraction data were collected at liquid nitrogen temperature at the Advanced Photon Source (Argonne National Laboratories) on beamline 17-ID, which is equipped with a MAR CCD detector. Crystals diffracted to better than 1.7 Å in resolution and belong to space group P2₁2₁2₁ with unit cell parameters a = 48.8 Å, b = 101.7 Å, c = 116.9 Å. The asymmetric unit contains one full complex (a dimer of BglII bound to the 16-base pair DNA duplex). An X-ray fluorescence scan was obtained in order to determine the wavelengths for use in MAD data collection30. The data were measured by the inverse beam method. Individual oscillation images (t = 5 s) spanning 0.5° were collected at the wavelength corresponding to the selenium edge, over a total angular range of 120° at a crystal-to-detector distance of 120 mm. This was then repeated at the same wavelength for φ = φ + 180°, and the sequence was then repeated at the other two wavelengths corresponding to the selenium peak and the remote point. Diffraction data were indexed and integrated using DENZO and reduced using SCALEPACK31. Anomalous difference Patterson maps for each wavelength were calculated using the PHASES package32. Six sites were immediately apparent, and were verified and refined using PATSOL33. Anomalous difference Fourier maps were calculated in order to place the sites at the same origin. MAD analysis was performed using SHARP34 (with six input sites), using the remote wavelength data as the reference. Density modification with SOLOMON (CCP4; ref. 35) produced exceptionally clean experimental maps, which were fit with the program O36.
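
The data collection geometry described above implies some simple arithmetic, sketched here in Python as an illustration only. The unit cell edges, oscillation width, angular range, exposure time and number of wavelengths are taken from the text; the function and constant names are hypothetical. The sketch assumes that each of the three wavelengths was collected with the same pair of inverse-beam sweeps, and it uses the fact that an orthorhombic cell has volume a·b·c and that space group P2₁2₁2₁ has four asymmetric units per cell.

```python
# Illustrative arithmetic for the unit cell and the MAD data collection.
# Names are hypothetical; numeric inputs are taken from the Methods text.

CELL_A, CELL_B, CELL_C = 48.8, 101.7, 116.9  # unit cell edges in Å
ASYMMETRIC_UNITS_PER_CELL = 4                # general positions in P2(1)2(1)2(1)

OSCILLATION_DEG = 0.5     # width of one image
SWEEP_RANGE_DEG = 120.0   # total angular range of one sweep
EXPOSURE_S = 5.0          # exposure per image
SWEEPS_PER_WAVELENGTH = 2 # inverse beam: phi and phi + 180 degrees
WAVELENGTHS = 3           # selenium edge, peak and remote


def orthorhombic_cell_volume(a: float, b: float, c: float) -> float:
    """Volume of an orthorhombic cell (all angles 90 degrees), in Å^3."""
    return a * b * c


def images_per_sweep(sweep_range_deg: float, oscillation_deg: float) -> int:
    """Number of oscillation images needed to cover one sweep."""
    return round(sweep_range_deg / oscillation_deg)


if __name__ == "__main__":
    volume = orthorhombic_cell_volume(CELL_A, CELL_B, CELL_C)
    per_sweep = images_per_sweep(SWEEP_RANGE_DEG, OSCILLATION_DEG)
    total_images = per_sweep * SWEEPS_PER_WAVELENGTH * WAVELENGTHS
    print(f"Unit cell volume: {volume:.0f} Å^3")
    print(f"Volume per asymmetric unit: {volume / ASYMMETRIC_UNITS_PER_CELL:.0f} Å^3")
    print(f"Images per sweep: {per_sweep}")
    print(f"Total MAD images: {total_images}")
    print(f"Total exposure time: {total_images * EXPOSURE_S:.0f} s")
```

Under these assumptions the MAD experiment comprises 240 images per sweep and 1,440 images in total; actual counts would differ if any sweep was truncated or repeated.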
The entire protein, except residues 119–122 in monomer A and 208–216 in monomer B, was fit into the experimental map, along with 31 of the 32 DNA bases. Coordinate refinement was first carried out against the 1.7 Å data using CNS37. Following a four-body rigid-body refinement, the crystallographic R factor was 0.448 (Rfree = 0.452). After one round of positional refinement, the R factor dropped to 0.385 (Rfree = 0.403). Individual B-factor refinement and positional refinement further dropped the R factor to 0.307 (Rfree = 0.328). At this stage, 225 water molecules, the one remaining base, and residues Ala 119–Ala 122 were added and simulated annealing was carried out, resulting in an R factor of 0.263 (Rfree = 0.279). Several cycles of rebuilding and fitting solvent molecules resulted in a model that included 543 water molecules, 2 ions, and excellent protein geometry (Table 1). At this stage, a 1.5 Å resolution data set became available, measured from a native crystal soaked in calcium at the National Synchrotron Light Source Beamline X25, which is equipped with a Brandeis B4 CCD detector. This data set was used to complete the refinement using CNS, with the refined 1.7 Å structure as the starting model. After several cycles of simulated annealing, positional, B-factor, and occupancy (ions only) refinement, the final R factor was 0.198 (Rfree = 0.227) with 614 water molecules. The only missing segment of the protein is residues 210–214 of monomer B.

1. Roberts, R.J. & Halford, S.E. Type II restriction endonucleases. in Nucleases (eds. Linn, S.M., Lloyd, R.S. & Roberts, R.J.) 35–88 (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York; 1993). 2. Aggarwal, A.K. Structure and function of restriction endonucleases. Curr. Opin. Struct. Biol. 5, 11–19 (1995). 3. Pingoud, A. & Jeltsch, A. Recognition and cleavage of DNA by type-II restriction endonucleases. Eur. J. Biochem. 246, 1–22 (1997). 4. Newman, M., Strzelecka, T., Dorner, L.F., Schildkraut, I.
& Aggarwal, A.K. Structure of restriction endonuclease BamHI and its relationship to EcoRI. Nature 368, 660–664 (1994). 5. Newman, M., Strzelecka, T., Dorner, L.F., Schildkraut, I. & Aggarwal, A.K. Structure of restriction endonuclease BamHI phased at 1.95 Å resolution by MAD analysis. Structure 2, 439–452 (1994). 6. Kim, Y.C., Grable, J.C., Love, R., Greene, P.J. & Rosenberg, J.M. Refinement of EcoRI endonuclease crystal structure: a revised protein chain tracing. Science 249, 1307–1309 (1990). 7. Winkler, F.K. et al. The crystal structure of EcoRV endonuclease and of its complexes with cognate and non-cognate DNA fragments. EMBO J. 12, 1781–1795 (1993). 8. Cheng, X., Balendiran, K., Schildkraut, I. & Anderson, J.E. Structure of PvuII endonuclease with cognate DNA. EMBO J. 13, 3927–3935 (1994). 9. Athanasiadis, A. et al. Crystal structure of PvuII endonuclease reveals extensive structural homologies to EcoRV. Nature Struct. Biol. 1, 469–475 (1994). 10. Alves, J. et al. Changing the hydrogen-bonding potential in the DNA binding site of EcoRI by site-directed mutagenesis drastically reduces the enzymatic activity, not, however, the preference of this restriction endonuclease for cleavage within the site -GAATTC-. Biochemistry 28 (1989). 11. Heitman, J. & Model, P. Substrate recognition by the EcoRI endonuclease. Proteins 7, 185–197 (1990). 12. Osuna, J., Flores, H. & Soberon, X. Combinatorial mutagenesis of three major groove-contacting residues of EcoRI: single and double amino acid replacements retaining methyltransferase-sensitive activities. Gene 106, 7–12 (1991). 13. Xu, S. & Schildkraut, I. Isolation of BamHI variants with reduced cleavage activities. J. Biol. Chem. 266, 4425–4429 (1991). 14. Wenz, C. et al. Protein engineering of the restriction endonuclease EcoRV: replacement of an amino acid residue in the DNA binding site leads to an altered selectivity towards unmodified and modified substrates. Biochim. Biophys. Acta 1219, 73–80 (1994). 15.
Dorner, L.F., Bitinaite, J., Whitaker, R.D. & Schildkraut, I. Genetic analysis of the base-specific contacts of BamHI restriction endonuclease. J. Mol. Biol. 285, 1515–1523 (1999). 16. Newman, M., Strzelecka, T., Dorner, L.F., Schildkraut, I. & Aggarwal, A.K. Structure of BamHI endonuclease bound to DNA: partial folding and unfolding on DNA binding. Science 269, 656–663 (1995). 17. Viadiu, H. & Aggarwal, A.K. The role of metals in catalysis by the restriction endonuclease BamHI. Nature Struct. Biol. 5, 910–916 (1998). 18. Anton, B.P. et al. Cloning and characterization of the BglII restriction-modification system reveals a possible evolutionary footprint. Gene 187, 19–27 (1997). 19. Schneider, B., Neidle, S. & Berman, H.M. Conformations of the sugar-phosphate backbone in helical DNA crystal structures. Biopolymers 42, 113–124 (1997). 20. Beamer, L.J. & Pabo, C.O. Refined 1.8 Angstrom crystal structure of the λ-repressor operator complex. J. Mol. Biol. 227, 177–196 (1992). 21. Dorner, L.F. & Schildkraut, I. Direct selection of binding proficient/catalytic deficient variants of BamHI endonuclease. Nucleic Acids Res. 22, 1068–1074 (1994). 22. Reber, E.J. & Pabo, C.O. Zinc finger phage: affinity selection of fingers with new DNA-binding specificities. Science 263, 671–673 (1994). 23. Choo, Y. & Klug, A. Toward a code for the interactions of zinc fingers with DNA: selection of randomized fingers displayed on phage. Proc. Natl. Acad. Sci. USA 91, 11163–11167 (1994). 24. Wharton, R.P. & Ptashne, M. Changing the binding specificity of a repressor by redesigning an alpha-helix. Nature 316, 601–605 (1985). 25. Selent, U. et al. A site-directed mutagenesis study to identify amino acid residues involved in the catalytic function of the restriction endonuclease EcoRV. Biochemistry 31, 4808–4815 (1992). 26. Vipond, I.B., Baldwin, G.S. & Halford, S.E. Divalent metal ions at the active sites of EcoRV and EcoRI restriction endonucleases. Biochemistry 34, 697–704 (1995). 27.
Jeltsch, A., Alves, J., Wolfes, H., Maass, G. & Pingoud, A. Substrate-assisted catalysis in the cleavage of DNA by the EcoRI and EcoRV restriction enzymes. Proc. Natl. Acad. Sci. USA 90, 8499–8503 (1993). 28. Horton, N.C., Newberry, K.J. & Perona, J.J. Metal ion-mediated substrate-assisted catalysis in type II restriction endonucleases. Proc. Natl. Acad. Sci. USA 95, 13489–13494 (1998). 29. Jurica, M.S., Monnat, R.J. Jr. & Stoddard, B.L. DNA recognition and cleavage by the LAGLIDADG homing endonuclease I-CreI. Mol. Cell 2, 469–476 (1998). 30. Hendrickson, W.A. Determination of macromolecular structures from anomalous diffraction of synchrotron radiation. Science 254, 51–58 (1991). 31. Otwinowski, Z. & Minor, W. Processing of X-ray diffraction data collected in oscillation mode. Methods Enzymol. 276, 307–326 (1997). 32. Furey, W. & Swaminathan, S. PHASES-95: a program package for the processing and analysis of diffraction data from macromolecules. Methods Enzymol. 277, 590–629 (1997). 33. Tong, L. PATSOL - v1.2. (1993). 34. de La Fortelle, E. & Bricogne, G. Maximum-likelihood heavy-atom parameter refinement for multiple isomorphous replacement and multiwavelength anomalous diffraction methods. Methods Enzymol. 276, 472–494 (1997). 35. Collaborative Computational Project Number 4. CCP4 Suite: programs for protein crystallography. Acta Crystallogr. D 50, 760–763 (1994). 36. Jones, T.A., Zou, J.Y., Cowan, S.W. & Kjeldgaard, M. Improved methods for building protein models in electron density maps and the location of errors in these models. Acta Crystallogr. A 47, 110–119 (1991). 37. Adams, P.D., Pannu, N.S., Read, R.J. & Brunger, A.T. Cross-validated maximum likelihood enhances crystallographic simulated annealing refinement. Proc. Natl. Acad. Sci. USA 94, 5018–5023 (1997). 38. Kraulis, P.J. MOLSCRIPT: a program to produce both detailed and schematic plots of protein structures. J. Appl. Crystallogr. 24, 946–950 (1991). 39. Merritt, E.A. & Bacon, D.J.
Raster3D photorealistic molecular graphics. Methods Enzymol. 277, 505–524 (1997). 40. Nicholls, A., Sharp, K. & Honig, B. Protein folding and association: insights from the interfacial and thermodynamic properties of hydrocarbons. Proteins Struct. Funct. Genet. 11, 281–296 (1991). 41. Lavery, R. & Sklenar, H. The definition of generalized helicoidal parameters and of axis curvature for irregular nucleic acids. J. Biomol. Struct. Dyn. 6, 63–91 (1988).

Coordinates. Coordinates of both structures have been deposited in the Protein Data Bank (accession numbers 1D2I and 1DFM for the 1.7 and 1.5 Å structures, respectively).

Acknowledgments
We are grateful to P. Weber and C. Lesburg (Schering-Plough Research Institute) for use of APS synchrotron time and for assistance with data collection, respectively. We thank L. Berman and H. Lewis for facilitating data collection at NSLS. We also thank H. Viadiu for stimulating discussion and analysis. A.K.A. is supported by a grant from the NIH, and C.M.L. is supported by the Damon Runyon-Walter Winchell Cancer Research Fund.

Received 5 November, 1999; accepted 28 December, 1999.

nature structural biology • volume 7 number 2 • february 2000