Download Degenerate PCR - Yale School of Medicine

Degenerate PCR
by Michael Koelle
The identification of novel members of gene families by PCR using degenerate primers
has been considered more of an art than a science, so much so that the methods books
I've come across have been too timid to discuss the considerations that go into the design
of this experiment, much less give a protocol for its execution. At the risk of leading my
readers on wild goose chases, I'm committing my methods to paper. The following is
based on my reading of the recent literature (e.g. Buck and Axel, Cell 65: 175-187, 1991;
Riddle et al., Cell 75: 1401-1416, 1993; Krauss et al., Cell 75:1431-1444, 1993),
discussions with several other successful practitioners of the art, and my own experience
isolating vertebrate homologs of the C. elegans egl-10 gene.
Primer design
This is the most important factor in the success of the experiment, and deserves careful
deliberation. I suggest diagramming out an alignment of the existing members of your
gene family, highlighting conserved residues, and labeling each important position in the
alignment with the number of codons that encode the amino acid(s) at that position. An
example based on the original members of the egl-10 gene family is included below, and
cited in the following discussion.
Primer degeneracy
In the early days of degenerate PCR some novel genes were successfully amplified using
primer pools that were over 1000-fold degenerate. However, primers of such high
degeneracy appear not to have been generally successful (with some exceptions, e.g.
Giovane et al., Genes Dev. 8: 1502-1513, 1994), and most recent successes have come
using primer pools of 100-fold degeneracy or less. Five methods for reducing the
degeneracy of the primers are discussed below:
1) judicious selection of the primer sites
The positioning of the primers is a compromise between placing them at the codons for
the most conserved amino acids, and placing them at the codons for less conserved amino
acids whose degeneracy may be lower. Consider the case of the 3' primers ("3T" and "3A")
in the example shown below for the egl-10 gene family. At first it might seem more
sensible to place these primers 3 codons to the right, where there is a block of 5 out of 6
absolutely conserved amino acids: SY(P/Q)RFL. Unfortunately, this block of amino
acids is encoded by 5184 different DNA sequences. The actual primers used were placed
3 codons to the left. At this position only 3 out of 6 amino acids are absolutely
conserved: M(E/K)(K/N)(D/N)SY. However, this block of amino acids is encoded by
only 768 different DNA sequences.
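The two counts above can be checked by multiplying, position by position, the number of codons that encode the allowed amino acids. The following is a minimal sketch of that arithmetic using the standard genetic code; the function and variable names are illustrative and not part of the original protocol.

```python
# Number of codons per amino acid in the standard genetic code.
CODON_COUNT = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def block_degeneracy(positions):
    """Count the DNA sequences that encode a block of aligned residues.

    positions: list of strings; each string lists the amino acids
    (one-letter codes) allowed at that position of the alignment.
    """
    total = 1
    for allowed in positions:
        total *= sum(CODON_COUNT[aa] for aa in allowed)
    return total


right_block = ["S", "Y", "PQ", "R", "F", "L"]    # SY(P/Q)RFL
left_block = ["M", "EK", "KN", "DN", "S", "Y"]   # M(E/K)(K/N)(D/N)SY

print(block_degeneracy(right_block))  # 5184
print(block_degeneracy(left_block))   # 768
```

The better-conserved block loses because serine, arginine, and leucine each have six codons, while the less conserved block is built mostly from two-codon amino acids.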
2) the use of inosine as a "neutral" base
Inosine is a purine (which occurs naturally in tRNAs) that can form base pairs with
cytidine, thymidine, and adenosine (although the inosine:adenosine pairing presumably
doesn't fit quite correctly in double stranded DNA, so there may be an energetic penalty
to pay when the helix bulges out at this purine:purine pairing). Recently, most people
have been using inosine in their primers at positions where any of the four bases might be
required. Each use of inosine thus reduces the degeneracy of the primer pool 4-fold.
However, you risk the occurrence of I:G mismatches, and therefore must assume that
exact base pairing at other positions in the primer will overcome such a problem. Most
oligo synthesis facilities will make inosine-containing oligos, no problem. I had excellent
luck with inosine-containing primers with the egl-10 gene family, except in the case of the
primer "5out", a 20mer containing 5 inosines (including 2 near the 3' end of the primer),
which failed to amplify products even from a cloned egl-10 cDNA. So, perhaps 5 out of
20 inosines is too many.
Using inosine in the primers requires that the DNA polymerase used in the PCR reaction
be capable of synthesizing DNA over an inosine-containing template. Taq polymerase is
capable of doing this, but some others (e.g. Vent) appear not to be able to.
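The 4-fold saving from each inosine can be seen by computing the degeneracy of a pool written in IUPAC nucleotide codes, counting inosine ("I") as a single synthesized base. This is a rough sketch: the two primer strings are made-up illustrations, not the actual egl-10 primers, and the names are hypothetical.

```python
# Number of bases synthesized at a position for each IUPAC symbol.
IUPAC_CHOICES = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
    "I": 1,  # inosine: one base in the oligo, standing in for all four
}


def pool_degeneracy(primer):
    """Return the number of distinct oligos in a degenerate primer pool."""
    total = 1
    for base in primer.upper():
        total *= IUPAC_CHOICES[base]
    return total


# Hypothetical 19mers, identical except for N versus inosine at one position.
with_n = "ATGGARAARGAYTCNTAYC"
with_i = "ATGGARAARGAYTCITAYC"

print(pool_degeneracy(with_n))  # 64
print(pool_degeneracy(with_i))  # 16
```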
3) using multiple separate oligo pools at a single position
In an effort to use primer pools with the lowest possible degeneracy, it is sometimes
useful to synthesize primers over a particular stretch of codons as two or more separate
pools, each of which will have lower degeneracy than you would get by synthesizing a
single pool including all of the same codons. The pools are then used separately to carry
out PCR reactions. For example, the primer pools "3T" and "3A" in the egl-10 example
below are identical, except at their serine codons. Sadly, serine is encoded by 6 different
codons, TC(A/G/C/T) and AG(T/C). Synthesizing a single pool covering all these
possibilities would require higher degeneracy and would necessarily include some non-serine
codons. By splitting into two pools (one nondegenerate containing an inosine, the
other 2-fold degenerate) I was able to keep the degeneracy low, and avoid all non-serine
codons. Another example is shown in the case of primers "5inE" and "5inR", which again
are identical except at one codon.
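To see why a single serine pool is a poor choice, one can enumerate the codons it would contain. The sketch below assumes the single pool is written as (T/A)(C/G)N, the smallest pool that covers all six serine codons, and uses the standard genetic code; the names are illustrative only.

```python
from itertools import product

# Standard genetic code in the usual compact order (bases T, C, A, G).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def expand(pool):
    """pool: three strings giving the bases allowed at each codon position."""
    return ["".join(codon) for codon in product(*pool)]


# One pool covering every serine codon: (T/A)(C/G)N.
single_pool = expand(["TA", "CG", "TCAG"])
off_target = sorted({CODON_TABLE[c] for c in single_pool} - {"S"})

# Two separate pools: TCN (synthesized as TCI) and AG(T/C).
pool_t = expand(["T", "C", "TCAG"])
pool_a = expand(["A", "G", "TC"])

print(len(single_pool))  # 16
print(off_target)        # ['*', 'C', 'R', 'T', 'W']
print(all(CODON_TABLE[c] == "S" for c in pool_t + pool_a))  # True
```

The single pool is 16-fold degenerate and 10 of its 16 codons encode threonine, cysteine, tryptophan, arginine, or a stop. The split pools together contain only the six serine codons.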
4) including partial codons at the ends of the primers
The various codons encoding an amino acid or a set of similar amino acids are often
identical at their first (and maybe second) positions, but different at their third position.
You can take advantage of this by synthesizing only the first or first and second positions
of the 3' most codon covered by your primer pools, thus giving you one or two extra
positions of exact match base pairs without adding any degeneracy. In the egl-10
example, the primer pools "3T" and "3A" cover a stretch in which the last codon must
encode proline or glutamine. The codons for these two amino acids all start with C, but
their last two positions are degenerate. Therefore, only the nondegenerate C was included
in the primer pools.
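The partial-codon trick amounts to finding the prefix shared by every codon allowed at the last position. Here is a minimal sketch using the standard genetic code; the helper names are hypothetical.

```python
import os
from itertools import product

# Standard genetic code in the usual compact order (bases T, C, A, G).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codons_for(amino_acids):
    """Return all codons encoding any of the given one-letter amino acids."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa in amino_acids)


def shared_prefix(amino_acids):
    """Bases that can end a primer without adding degeneracy."""
    # os.path.commonprefix compares strings character by character.
    return os.path.commonprefix(codons_for(amino_acids))


print(shared_prefix("PQ"))  # C   (proline CCN, glutamine CA(A/G))
print(shared_prefix("Y"))   # TA  (tyrosine TA(T/C))
```

An empty result, as for aspartate or asparagine (GAY versus AAY), means no partial codon can be added for free at that position.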
5) use of codon bias
Some organisms have strong biases for using particular codons to encode certain amino
acids. In theory you could reduce the degeneracy of a primer pool by only including these
most common codons, and taking the risk that the gene(s) you are looking for will follow
the organism's general codon bias enough to allow such primer pools to work. I haven't
heard of anyone actually using this codon bias strategy in a successful degenerate PCR
experiment, but you might try it if you're desperate.
Other considerations in primer design
1) primer length
In the example below the short stretches of sequence similarity among the egl-10 family
members forced me to use primers only 19-21 bases long. These are shorter than the
primers I have heard of people using in other successful experiments. For example, Linda
Buck's primers were 31-33mers.
2) 3' end
People I talked to emphasized the special importance of having an exact match between
the primer and template near the 3' end of the primer, although I'm not aware of specific
data supporting this idea. For egl-10 I tried to avoid having any inosines near the 3' ends
of the primers (except for primer "5out", which in fact failed to give any products), and
also anchored the primers when possible with a nondegenerate codon at their 3' ends, so
that 100% of the primers in the pool would be able to pair perfectly with the correct
template over these last few bases.
3) nested primers
If the sequence similarity in your gene family permits, it is a good idea to make nested
sets of PCR primers. That way one round of PCR can be performed using the outside
primers, and individual products (or the whole mix) can then be reamplified using the
inside primers. Products amplified through both rounds are more likely to be the desired
new gene family members, and less likely to be spurious products from sequences that
happen to contain a couple of primer annealing sites by chance.
Determining optimal reaction conditions
A number of parameters can be varied to optimize reaction conditions for degenerate
PCR. These include: primer concentration, magnesium concentration, template
concentration, number of cycles of amplification, and the temperatures and times of each
step in the amplification cycle. If each of these parameters is to be independently varied,
the number of possibilities quickly reaches mind boggling proportions. My philosophy
has been to fix almost all these parameters at the standard levels that have been
successful for other people, and to vary only the one parameter that I think is the most
crucial: the temperature of the annealing step during amplification.
My standard PCR reactions are as follows:
1.5 µl template DNA (2-300 ng)
5 µl 10X PCR buffer (10X buffer = 100 mM Tris pH 8.3, 500 mM KCl, 15 mM
MgCl2, 0.01% gelatin)
8 µl dNTP mix (1.25 mM each dNTP)
0.2 µl "ampliTaq" polymerase (5 U/µl)
25 µl dH2O
5 µl each primer pool at 20 µM each
total volume 50 µl
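For readers who want the final concentrations implied by this recipe, they follow from stock concentration times volume added, divided by the 50 µl total. The sketch below does only that arithmetic; the helper name is hypothetical, and 50 µl is treated as the nominal volume even though the listed volumes add up to 49.7 µl.

```python
TOTAL_UL = 50.0  # nominal reaction volume from the recipe


def final_conc(stock, volume_ul, total_ul=TOTAL_UL):
    """Dilute a stock: the result is in the same units as `stock`."""
    return stock * volume_ul / total_ul


mgcl2_mM = final_conc(15.0, 5.0)    # 15 mM MgCl2 in the 10X buffer
tris_mM = final_conc(100.0, 5.0)    # 100 mM Tris in the 10X buffer
kcl_mM = final_conc(500.0, 5.0)     # 500 mM KCl in the 10X buffer
dntp_mM = final_conc(1.25, 8.0)     # each dNTP
primer_uM = final_conc(20.0, 5.0)   # each primer pool
taq_units = 5.0 * 0.2               # 5 U/µl x 0.2 µl

# Sum of the volumes as listed: template, buffer, dNTPs, Taq, water, 2 primers.
listed_volume_ul = sum([1.5, 5.0, 8.0, 0.2, 25.0, 5.0, 5.0])

print(mgcl2_mM, tris_mM, kcl_mM)      # 1.5 10.0 50.0
print(round(dntp_mM, 3), primer_uM)   # 0.2 2.0
print(taq_units)                      # 1.0
```

So each reaction contains 1.5 mM MgCl2, 0.2 mM of each dNTP, 2 µM of each primer pool, and 1 U of polymerase.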
In practice, the reactions are set up by placing the primers and template into a 0.5 ml
tube, then adding two drops of mineral oil from a blue tip, and adding on top of the oil
38.5 µl of a premix containing all the other components. In this way, it is easy to set up
many different primer/template combinations at once. The tubes are then briefly spun in a
microfuge to combine the two aqueous phases, and the tubes are immediately placed in
the PCR block preheated to 95° for a "hot start".
My amplification program:
95° X 3 min. (hot start)
??° X 1 min. (this annealing temperature is varied to optimize the amplification)
72° X 2 min.
94° X 45 sec.
40 cycles of the above 3 steps
72° X 5 min.
hold at 4°
This takes ~4.5 hours to run on an MJ Research machine.
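As a rough check on the run time, the program can be written out as data and the programmed hold times summed. This is only a sketch with hypothetical names; attributing the remainder of the ~4.5 hours to ramping between temperatures is an assumption, not something stated in the protocol.

```python
# Each step is (name, temperature in °C, hold time in minutes).
# The annealing temperature is left as None because it is the variable
# being optimized.
HOT_START = [("hot start", 95, 3.0)]
CYCLE = [
    ("anneal", None, 1.0),
    ("extend", 72, 2.0),
    ("denature", 94, 0.75),
]
N_CYCLES = 40
FINAL = [("final extension", 72, 5.0)]


def programmed_minutes():
    """Total hold time, excluding ramping and the final 4° hold."""
    per_cycle = sum(minutes for _, _, minutes in CYCLE)
    fixed = sum(minutes for _, _, minutes in HOT_START + FINAL)
    return fixed + N_CYCLES * per_cycle


print(programmed_minutes())  # 158.0
```

The holds add up to about 2.6 hours, so on a machine that takes ~4.5 hours a large share of the run would be spent changing temperature.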
To test the primers and optimize the conditions, I do a series of amplification runs
starting with an annealing temperature of 25°, and increasing in 5° increments until
amplification fails to occur. Typically for each primer pair being tested, at each
temperature, I run 3 reactions containing different templates: 1) a positive control
containing 2 ng of a cloned member of the gene family of interest as template; 2) a
negative control containing no template (this is very important: you don't want to get
fooled by contaminants); 3) an experimental reaction containing a complex template such
as genomic DNA or total cDNA. For total C. elegans genomic DNA I've been using 300
ng as a template. Using rat brain cDNA as a template I amplified off of only 2 ng.
However, this was only because I didn't have very much cDNA. If possible, it would be
better to use ~200 ng of cDNA as a template, as Linda Buck did to amplify the odorant
receptors.
Choice of template
Genomic DNA has the advantage that all members of your gene family are present in
equimolar amounts, and genomic DNA is probably readily available. The obvious
disadvantage is that introns may disrupt the primer sites, or may cause the amplification
product to be so long that it is not amplified efficiently.
cDNA templates, though harder to obtain, overcome this problem. A big advantage of
cDNA is that the desired amplification products should be of a known size, and you can
therefore easily pick them out from among spurious products of other sizes. Remember
that the "correct" sized band amplified off a cDNA template may be a complex mixture
of products from many gene family members, so you may have to analyze many clones
generated from such a band to assess its complexity. Linda Buck used random primed
cDNA for her template, presumably to avoid biasing the cDNA towards the 3' ends of
transcripts. In my case, I knew that the region I was amplifying should be at the extreme
3' end of the coding sequence, so I used oligo-dT primed cDNA.
For the lazy and rich, Clontech sells oligo-dT primed cDNA prepared from various
tissues of many species to use as templates for PCR.
Analysis of PCR products
After amplification run 20 µl of each reaction out on an agarose gel. I use 2% 3:1
Nusieve:SeaKem LE agarose (you can buy this premixed from FMC) in 1X TAE. This
gel is not very low melting, and thus isn't very suitable for cloning directly from the gel,
but it gives very nice resolution. I use the 123 bp ladder from Gibco as a size standard.
Obviously you expect to get products off the positive control, and not to get them off the
negative control. Using the complex template, you will probably get a smear at the lower
annealing temperatures, which will resolve into a small number of bands as the annealing
temperature rises. I pick an annealing temperature that gives a modest number of bands,
and then clone all these bands and sequence them.
If no products are evident in the experimental samples, a good trick to try is to use 2 µl of
the apparently failed reaction as a template, and reamplify under the same conditions.
This often gives visible products.
If you want to clone products that are only barely visible, you can get more of them by
just reamplifying the original reaction as described above. Another way to amplify
individual products separately is to cut the bands out of the gel that was used to analyze
the original reactions (actually I take a bore out of the gel with a Pasteur pipette), melt the
DNA containing agarose, and use 2 µl of it as a template to reamplify under the same
conditions.
The above described method for reamplifying specific bands can (and should) be used to
test amplified products to see if they are single primer artefacts. Use 2 µl of an agarose
gel bore to set up each of three PCR reactions, containing either individual primer or both
together. Obviously, you're only interested in products that require both primers in order
to be amplified.
I clone PCR products by running the PCR reaction out on a low melt agarose gel (2%
Nusieve agarose in 1X TAE). I cut the desired band out, melt it at 70°, mix well by
pipetting up/down, and use 5 µl of the melted agarose directly in a ligation reaction with
a dT tailed vector.
This vector DNA is prepared as follows: cut 1 µg bluescript SK with EcoRV in a 20 µl
reaction. Add 20 µl 1X PCR buffer, and 2 µl 2 mM dTTP. Add 0.5 µl "ampliTaq"
polymerase (2.5 U), and incubate at ~72° for 20 min. Run the DNA out on a 0.8%
Seaplaque agarose gel in 1X TAE, cut out the band, melt at 70°, mix well, and use 5 µl in
a ligation reaction. It turns out that only about 50% of the colonies obtained after
transformation of this type of reaction may have inserts; the rest are vector reclosures.
However, if blue/white selection is used, virtually all the white colonies have inserts.