Exome Sequencing and Data Analysis

Genotyping with RAD and ddRAD Sequencing

Background

Genotyping requires thousands of genomes to be compared in a reliable, consistent way. Restriction site associated DNA sequencing (RAD-Seq) interrogates a fraction of the genome across many individuals, which makes it well suited to genotyping. [1] By using restriction enzyme digestion and sequencing the regions adjacent to restriction sites, researchers can examine the same subset of genomic regions for thousands of individuals and identify many genetic markers along the genome. [2] Other NGS methods examine a larger portion of the genome and offer more data, but they are costly and cannot be used to study the thousands of individuals required for genotyping.

In its early applications, RAD-Seq was used for genetic marker discovery in threespine stickleback using microarray technology. [3] Since then, it has been adapted to NGS technology to genotype a variety of organisms with or without prior genomic information, including perennial ryegrass, eucalyptus, eggplant, sorghum and barley. [4-8]

RAD-Seq applications include:
- Genetic marker discovery [1-13]
- Local genome assembly [9]
- QTL mapping [4, 8]
- Linkage mapping [10]

ddRAD-Seq at SciGenom

Figure 1. RAD-Seq can be used for various studies. [11]

SciGenom uses double digest RAD-Seq (ddRAD-Seq), a variation of RAD-Seq, for genotyping. Traditional RAD-Seq uses one restriction enzyme and random shearing to generate fragments from genomic DNA. However, these steps incur high DNA loss and offer little control over which fragments are sequenced. For organisms without a reference genome, a significant portion of the RAD-Seq data has been discarded due to sequence read errors and the presence of variable sites. [4, 12, 13]

ddRAD-Seq was designed to address these RAD-Seq shortcomings. In ddRAD-Seq, genomic DNA is digested with two restriction enzymes, and the resulting fragments undergo adaptor ligation and precise size selection before sequencing.
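The double digest and size selection described above can be illustrated with a small in silico sketch. This is not SciGenom's protocol: the enzyme pair (EcoRI, G^AATTC, and MspI, C^CGG), the function names, the toy sequence and the 30-40 bp size window are all assumptions chosen for illustration, since the text does not name the enzymes or the size range used. The sketch also simplifies the chemistry by scanning one strand only and by treating adaptor ligation as "keep fragments whose two ends were cut by different enzymes".

```python
import re

# Hypothetical enzyme pair for illustration: name -> (recognition site, cut offset).
# EcoRI cuts G^AATTC and MspI cuts C^CGG, so both offsets are 1.
ENZYMES = {"EcoRI": ("GAATTC", 1), "MspI": ("CCGG", 1)}


def cut_positions(seq, site, offset):
    """Return the coordinates at which one enzyme cuts the sequence."""
    # A lookahead pattern reports every occurrence, including overlapping ones.
    pattern = "(?=" + re.escape(site) + ")"
    return [m.start() + offset for m in re.finditer(pattern, seq)]


def double_digest(seq, enzymes):
    """Return fragments as (start, end, left_enzyme, right_enzyme) tuples.

    Only pieces bounded by a cut on both sides are returned; the two
    terminal pieces of the input sequence are ignored.
    """
    cuts = sorted(
        (pos, name)
        for name, (site, offset) in enzymes.items()
        for pos in cut_positions(seq, site, offset)
    )
    return [
        (a, b, left, right)
        for (a, left), (b, right) in zip(cuts, cuts[1:])
        if b > a
    ]


def select_fragments(fragments, min_len, max_len):
    """Keep fragments with two different enzyme ends inside the size window."""
    return [
        f for f in fragments
        if f[2] != f[3] and min_len <= f[1] - f[0] <= max_len
    ]


# Toy sequence (an assumption, not real data): two EcoRI and two MspI sites.
toy = "A" * 10 + "GAATTC" + "T" * 20 + "CCGG" + "A" * 5 + "CCGG" + "T" * 30 + "GAATTC" + "A" * 10

fragments = double_digest(toy, ENZYMES)
selected = select_fragments(fragments, min_len=30, max_len=40)
print(len(fragments), "fragments after digestion;", len(selected), "kept after size selection")
```

Because the cut sites depend only on the sequence, running the same digest and size window on another individual's DNA would recover fragments from the same regions wherever the restriction sites are conserved.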
Only a very small fraction of the fragments will be sequenced. Because the same restriction sites are cut in every sample, these fragments come from the same genomic regions across individuals. Further, ddRAD requires half as many reads to achieve high-confidence SNP calling, because the chance of obtaining duplicate reads from the same restriction site is very low. Due to these modifications, ddRAD has become a more economical method to genotype thousands of individuals, and it has been used for SNP discovery between two Peromyscus species that have no reference sequence. [11]

SciGenom uses the following steps for ddRAD:

Figure 2. Steps for ddRAD Sequencing.

Bioinformatic Analysis

SciGenom employs a version of the Stacks pipeline to analyze ddRAD-Seq data. The Stacks pipeline uses RAD-Seq data to create genetic maps and conduct population analysis. It assembles loci either de novo from an individual's sequence reads or by aligning them to a reference sequence. These loci are catalogued and compared against other individuals' loci to create a map of alleles. Stacks can identify thousands of markers and use this information to study genomic structure and assembly. Stacks can export data to JoinMap, R/qtl and VCF formats. [14] In addition to Stacks, SciGenom can use GATK, MUSCLE, MCL and BLAST in the analysis pipeline.

References

1. Davey JW, Hohenlohe PA, Etter PD, Boone JQ, Catchen JM, Blaxter ML (2011) Genome-wide genetic marker discovery and genotyping using next-generation sequencing. Nature Reviews Genetics 12: 499-510
2. Baird NA, Etter PD, Atwood TS, Currey MC, Shiver AL, et al. (2008) Rapid SNP Discovery and Genetic Mapping Using Sequenced RAD Markers. PLoS ONE 3(10): e3376
3. Miller MR, Dunham JP, Amores A, Cresko WA, Johnson EA (2007) Rapid and cost-effective polymorphism identification and genotyping using restriction site associated DNA (RAD) markers. Genome Research 17(2): 240-248
4. Pfender WF, Saha MC, Johnson EA, Slabaugh MB (2011) Mapping with RAD (restriction-site associated DNA) markers to rapidly identify QTL for stem rust resistance in Lolium perenne. Theor Appl Genet 122(8): 1467-80
5. Grattapaglia D, Alencar S, Pappas G (2011) Genome-wide genotyping and SNP discovery by ultradeep Restriction-Associated DNA (RAD) tag sequencing of pooled sample of E. grandis and E. globulus. BMC Proceedings 5(Suppl 7): P45
6. Barchi L, Lanteri S, Portis E, et al. (2011) Identification of SNP and SSR markers in eggplant using RAD tag sequencing. BMC Genomics 12: 304
7. Nelson JC, Wang S, Wu Y, et al. (2011) Single-nucleotide polymorphism discovery by high-throughput sequencing in sorghum. BMC Genomics 12: 352
8. Chutimanitsakun Y, Nipper RW, Cuesta-Marcos A, et al. (2011) Construction and application for QTL analysis of a Restriction Site Associated DNA (RAD) linkage map in barley. BMC Genomics 12: 4
9. Etter PD, Preston JL, Bassham S, Cresko WA, Johnson EA (2011) Local De Novo Assembly of RAD Paired-End Contigs Using Short Sequencing Reads. PLoS ONE 6(4): e18561
10. Baxter SW, Davey JW, Johnston JS, Shelton AM, Heckel DG, et al. (2011) Linkage Mapping and Comparative Genomics Using Next-Generation RAD Sequencing of a Non-Model Organism. PLoS ONE 6(4): e19315
11. Peterson BK, Weber JN, Kay EH, Fisher HS, Hoekstra HE (2012) Double Digest RADseq: An Inexpensive Method for De Novo SNP Discovery and Genotyping in Model and Non-Model Species. PLoS ONE 7(5): e37135
12. Emerson KJ, Merz CR, Catchen JM, Hohenlohe PA, Cresko WA, et al. (2010) Resolving postglacial phylogeography using high-throughput sequencing. Proceedings of the National Academy of Sciences 107: 16196
13. Hohenlohe PA, Amish SJ, Catchen JM, Allendorf FW, Luikart G (2011) Next-generation RAD sequencing identifies thousands of SNPs for assessing hybridization between rainbow and westslope cutthroat trout. Molecular Ecology Resources 11: 117-122
14. Catchen J, Amores A, Hohenlohe P, Cresko W, Postlethwait J (2011) Stacks: building and genotyping loci de novo from short-read sequences. G3: Genes, Genomes, Genetics 1: 171-182

Whitepaper | Scigenom.com