Exome Sequencing and Data Analysis
Genotyping with RAD and ddRAD Sequencing
Background
Genotyping requires thousands of genomes to be compared in a reliable, consistent way.
Restriction site associated DNA sequencing (RAD-Seq) interrogates a fraction of the genome across
many individuals, which makes it well suited to genotyping.1 By using restriction enzyme digestion and
sequencing the regions adjacent to restriction sites, researchers can examine the same subset of
genomic regions for thousands of individuals and identify many genetic markers along the
genome.2 Other NGS methods examine a larger portion of the genome and offer more data, but they
are too costly to apply to the thousands of individuals required for genotyping.
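As a rough illustration of the idea above, the following Python sketch finds every occurrence of one recognition site in two toy sequences and compares the bases that follow each site. GAATTC is the EcoRI recognition sequence, but the choice of enzyme, the toy sequences, the 8-base tag length and the function names are all assumptions made for this example. The sketch ignores the reverse strand, indels and sequencing error.

```python
import re

ECORI_SITE = "GAATTC"  # EcoRI recognition sequence; the enzyme choice is an assumption


def rad_tags(genome, site, tag_len):
    """Return (site_position, tag) pairs: the site plus tag_len downstream bases."""
    tags = []
    # A lookahead pattern also finds overlapping occurrences of the site.
    for match in re.finditer(f"(?={re.escape(site)})", genome):
        start = match.start()
        tag = genome[start:start + len(site) + tag_len]
        if len(tag) == len(site) + tag_len:  # skip tags truncated by the sequence end
            tags.append((start, tag))
    return tags


def compare_tags(tags_a, tags_b):
    """List (site_position, offset_in_tag, base_a, base_b) for mismatching bases.

    Tags are paired by site position, which only works for this toy case
    where both sequences share coordinates (no indels).
    """
    by_pos_b = dict(tags_b)
    differences = []
    for pos, tag_a in tags_a:
        tag_b = by_pos_b.get(pos)
        if tag_b is None:
            continue  # site absent in the second individual
        for offset, (base_a, base_b) in enumerate(zip(tag_a, tag_b)):
            if base_a != base_b:
                differences.append((pos, offset, base_a, base_b))
    return differences


# Two hypothetical individuals that differ by a single base near the first site.
individual_a = "ACGTGAATTCAGCTTAGGCCCCGAATTCTTGACCATGG"
individual_b = "ACGTGAATTCAGCTCAGGCCCCGAATTCTTGACCATGG"

tags_a = rad_tags(individual_a, ECORI_SITE, 8)
tags_b = rad_tags(individual_b, ECORI_SITE, 8)
snps = compare_tags(tags_a, tags_b)
print(tags_a)
print(snps)
```

Because both individuals are sampled at the same restriction sites, the tags line up and the single differing base shows up as a candidate marker.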
In its early applications, RAD-Seq was used for genetic marker discovery in threespine
stickleback using microarray technology.3 Since then, it has been adapted for NGS technology to
genotype a variety of organisms with or without prior genomic information, including perennial
ryegrass, eucalyptus, eggplant, sorghum and barley.4-8
RAD-Seq applications include:
-Genetic marker discovery1-13
-Local genome assembly9
-QTL mapping4, 8
-Linkage mapping10
Figure 1. RAD-Seq can be used for various studies.11
ddRAD-Seq at SciGenom
SciGenom uses double digest RAD-Seq (ddRAD-Seq), a variation of RAD-Seq, for genotyping.
Traditional RAD-Seq uses one restriction enzyme and random shearing to generate fragments from
genomic DNA. However, these steps lose a large amount of DNA and offer little control over which
fragments are sequenced. For organisms without a reference genome, a significant portion of RAD-Seq
data has been discarded because of sequence read errors and the presence of variable sites.4, 12, 13
ddRAD-Seq was designed to address the shortcomings of RAD-Seq. In ddRAD-Seq, genomic DNA is
digested with two restriction enzymes, and the resulting fragments undergo adaptor ligation and
precise size selection before sequencing. Only a very small fraction of the fragments is sequenced,
and these fragments tend to come from the same genomic regions across individuals.
Whitepaper |
Scigenom.com
Further, ddRAD requires half as many reads to achieve high-confidence SNP calling, because the
chance of obtaining duplicate reads from the same restriction site is very low. Due to these
modifications, ddRAD has become a more economical method to genotype thousands of individuals,
and has been used for SNP discovery between two Peromyscus species that have no reference
sequence.11
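The double digestion and size selection described above can be sketched in silico. This is a minimal Python illustration, not SciGenom's protocol: the enzyme pair (EcoRI, which cuts G^AATTC, and MspI, which cuts C^CGG), the toy sequence, the size window and the function names are assumptions chosen for the example. The sketch also assumes that only fragments with a different enzyme at each end are kept, and it models one strand only.

```python
import re

# (recognition site, cut offset within the site); the enzyme pair is an assumption.
ECORI = ("GAATTC", 1)  # cuts G^AATTC
MSPI = ("CCGG", 1)     # cuts C^CGG


def cut_positions(genome, site, offset):
    """Return the coordinates at which an enzyme cuts the sequence."""
    return [m.start() + offset
            for m in re.finditer(f"(?={re.escape(site)})", genome)]


def double_digest(genome, enzyme_a, enzyme_b, min_len, max_len):
    """Return (start, end) of fragments flanked by both enzymes and inside the size window."""
    cuts = sorted(
        [(pos, "A") for pos in cut_positions(genome, *enzyme_a)]
        + [(pos, "B") for pos in cut_positions(genome, *enzyme_b)]
    )
    kept = []
    # Each pair of adjacent cuts bounds one fragment of the complete digest.
    for (start, left), (end, right) in zip(cuts, cuts[1:]):
        if left != right and min_len <= end - start <= max_len:
            kept.append((start, end))
    return kept


# Hypothetical sequence: runs of T act as filler between recognition sites.
toy_genome = (
    "T" * 5 + "GAATTC" + "T" * 20 + "CCGG" + "T" * 3 + "GAATTC"
    + "T" * 50 + "CCGG" + "T" * 10 + "CCGG" + "T" * 5
)

all_mixed = double_digest(toy_genome, ECORI, MSPI, 1, 1000)
selected = double_digest(toy_genome, ECORI, MSPI, 20, 40)
print(all_mixed)
print(selected)
```

Narrowing the size window from 1-1000 to 20-40 bases reduces the three mixed-end fragments to one, which mirrors how size selection restricts sequencing to a small subset of fragments that is the same for any individual sharing those sites.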
SciGenom uses the following steps for ddRAD:
Figure 2 Steps for ddRAD Sequencing.
Bioinformatic Analysis
SciGenom employs a version of the Stacks pipeline to analyze ddRAD-Seq data. The Stacks pipeline
uses RAD-Seq data to create genetic maps and conduct population analysis. It assembles loci de novo
from an individual’s sequence reads or by using a reference sequence. These loci are catalogued and
compared against other individuals’ loci to create a map of alleles. Stacks can identify thousands of
markers and use this information to study genomic structure and assembly. Stacks can export data to
JoinMap, R/qtl and VCF formats.14
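The de novo idea of grouping reads into loci can be illustrated with a toy Python sketch. It is not the Stacks implementation and does not call Stacks: the reads, the minimum depth of 2, the one-mismatch threshold and all function names are assumptions made for the example.

```python
from collections import Counter


def build_stacks(reads, min_depth):
    """Group identical reads and keep sequences seen at least min_depth times."""
    counts = Counter(reads)
    return {seq: depth for seq, depth in counts.items() if depth >= min_depth}


def hamming(a, b):
    """Count mismatching positions between two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))


def merge_into_loci(stacks, max_mismatch):
    """Greedily merge stacks that differ by at most max_mismatch bases into one locus."""
    loci = []
    for seq in sorted(stacks):  # sorted for a reproducible result
        for locus in loci:
            if any(len(seq) == len(other) and hamming(seq, other) <= max_mismatch
                   for other in locus):
                locus.append(seq)  # treat as another allele of this locus
                break
        else:
            loci.append([seq])  # no similar stack found: start a new locus
    return loci


# Hypothetical reads from one individual; the singleton mimics a sequencing error.
reads = (
    ["GAATTCAGCTTAGG"] * 3
    + ["GAATTCAGCTCAGG"] * 3
    + ["GAATTCTTGACCAT"] * 4
    + ["GAATTCTTGACGAT"] * 1
)

stacks = build_stacks(reads, min_depth=2)
loci = merge_into_loci(stacks, max_mismatch=1)
print(loci)
```

In this toy case the low-depth read is dropped, the two similar stacks are merged into one locus with two alleles, and the third stack forms a separate locus.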
In addition to Stacks, SciGenom can use GATK, MUSCLE, MCL and BLAST in the
analysis pipeline.
References
1. Davey JW, Hohenlohe PA, Etter PD, Boone JQ, Catchen JM, Blaxter ML (2011) Genome-wide genetic
marker discovery and genotyping using next-generation sequencing. Nature Reviews Genetics 12: 499-510
2. Baird NA, Etter PD, Atwood TS, Currey MC, Shiver AL, et al. (2008) Rapid SNP Discovery and Genetic
Mapping Using Sequenced RAD Markers. PLoS ONE 3(10): e3376
3. Miller MR, Dunham JP, Amores A, Cresko WA, Johnson EA (2007) Rapid and cost-effective
polymorphism identification and genotyping using restriction site associated DNA (RAD) markers.
Genome Research 17(2): 240-248
4. Pfender WF, Saha MC, Johnson EA, Slabaugh MB (2011) Mapping with RAD (restriction-site
associated DNA) markers to rapidly identify QTL for stem rust resistance in Lolium perenne. Theor Appl
Genet 122(8): 1467-80
5. Grattapaglia D, Alencar S, Pappas G (2011) Genome-wide genotyping and SNP discovery by ultra-deep
Restriction-Associated DNA (RAD) tag sequencing of pooled sample of E. grandis and E. globulus.
BMC Proceedings 5(Suppl 7): P45
6. Barchi L, Lanteri S, Portis E, et al. (2011) Identification of SNP and SSR markers in eggplant using
RAD tag sequencing. BMC Genomics 12: 304
7. Nelson JC, Wang S, Wu Y, et al. (2011) Single-nucleotide polymorphism discovery by high-throughput
sequencing in sorghum. BMC Genomics 12: 352
8. Chutimanitsakun Y, Nipper RW, Cuesta-Marcos A, et al. (2011) Construction and application for QTL
analysis of a Restriction Site Associated DNA (RAD) linkage map in barley. BMC Genomics 12: 4
9. Etter PD, Preston JL, Bassham S, Cresko WA, Johnson EA (2011) Local De Novo Assembly of RAD
Paired-End Contigs Using Short Sequencing Reads. PLoS ONE 6(4): e18561
10. Baxter SW, Davey JW, Johnston JS, Shelton AM, Heckel DG, et al. (2011) Linkage Mapping and
Comparative Genomics Using Next-Generation RAD Sequencing of a Non-Model Organism. PLoS ONE
6(4): e19315
11. Peterson BK, Weber JN, Kay EH, Fisher HS, Hoekstra HE (2012) Double Digest RADseq: An
Inexpensive Method for De Novo SNP Discovery and Genotyping in Model and Non-Model
Species. PLoS ONE 7(5): e37135
12. Emerson KJ, Merz CR, Catchen JM, Hohenlohe PA, Cresko WA, et al. (2010) Resolving postglacial
phylogeography using high-throughput sequencing. Proceedings of the National Academy of Sciences
107: 16196
13. Hohenlohe PA, Amish SJ, Catchen JM, Allendorf FW, Luikart G (2011) Next-generation RAD
sequencing identifies thousands of SNPs for assessing hybridization between rainbow and westslope
cutthroat trout. Molecular Ecology Resources 11: 117–122
14. Catchen J, Amores A, Hohenlohe P, Cresko W, Postlethwait J (2011) Stacks: building and genotyping
loci de novo from short-read sequences. G3: Genes, Genomes, Genetics, 1:171-182