* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download Candidate gene screening using long-read sequencing
Human genome wikipedia , lookup
Oncogenomics wikipedia , lookup
Zinc finger nuclease wikipedia , lookup
Frameshift mutation wikipedia , lookup
Epigenomics wikipedia , lookup
History of genetic engineering wikipedia , lookup
Transposable element wikipedia , lookup
Behavioral epigenetics wikipedia , lookup
Cell-free fetal DNA wikipedia , lookup
Primary transcript wikipedia , lookup
Pharmacogenomics wikipedia , lookup
Public health genomics wikipedia , lookup
Point mutation wikipedia , lookup
Genetic engineering wikipedia , lookup
Genome (book) wikipedia , lookup
Saethre–Chotzen syndrome wikipedia , lookup
DNA sequencing wikipedia , lookup
Epigenetics of diabetes Type 2 wikipedia , lookup
Neuronal ceroid lipofuscinosis wikipedia , lookup
No-SCAR (Scarless Cas9 Assisted Recombineering) Genome Editing wikipedia , lookup
Gene expression profiling wikipedia , lookup
Epigenetics of neurodegenerative diseases wikipedia , lookup
Gene expression programming wikipedia , lookup
Vectors in gene therapy wikipedia , lookup
Genome evolution wikipedia , lookup
Gene therapy of the human retina wikipedia , lookup
Gene therapy wikipedia , lookup
Bisulfite sequencing wikipedia , lookup
Copy-number variation wikipedia , lookup
Pathogenomics wikipedia , lookup
Genomic library wikipedia , lookup
Gene desert wikipedia , lookup
Nutriepigenomics wikipedia , lookup
Genome editing wikipedia , lookup
Site-specific recombinase technology wikipedia , lookup
Gene nomenclature wikipedia , lookup
Whole genome sequencing wikipedia , lookup
Therapeutic gene modulation wikipedia , lookup
Microevolution wikipedia , lookup
Helitron (biology) wikipedia , lookup
Designer baby wikipedia , lookup
Exome sequencing wikipedia , lookup
Metagenomics wikipedia , lookup
Candidate gene screening using long-read sequencing Jenny Ekholm, Ting Hon, Yu-Chih Tsai, David Greenberg, Tyson A. Clark, Steve Kujawa PacBio, Menlo Park, CA USA SMRT Sequencing overview We have developed several candidate gene screening applications for both Neuromuscular and Neurological disorders. The power behind these applications comes from the use of longread sequencing. It allows us to access previously unresolvable and even unsequencable genomic regions. SMRT Sequencing offers uniform coverage, a lack of sequence context bias, and very high accuracy. In addition, it is also possible to directly detect epigenetic signatures and characterize full-length gene transcripts through assembly-free isoform sequencing. Alzheimer's disease (AD): TOMM40 gene Gene isoform characterization Experimental pipeline Size Selection & PCR amplification cDNA synthesis with adapters 1 b a Experimental Pipeline 2 SMRTbell ligation polyA mRNA 3 Poly(A) mRNA AAAAA AAAAA TTTTT AAAAA 5PacBio raw sequence reads 4 AAAAA AAAAA AAAAA TTTTT AAAAA TTTTT AAAAA AAAAA AAAAA TTTTT AAAAA TTTTT AAAAA Informatics Sequel/PacBio RS II pipeline Sequencing AAAAA TTTTT cDNA synthesis with adapters AAAAA TTTTT AAAAA TTTTT AAAAA Remove adapters Remove artifacts AAAAA Clean sequence reads AAAA TTTT AAAA TTTT Reads clustering AAAA TTTT Coding sequence 5’ primer Raw Size partitioning & SMRT adapter PCR amplification (AAA)nn SampleNet: Iso-Seq Method with Clonetech cDNA Synthesis Kit A variable poly-T repeat at the rs10524523 SNP within intron 6 of the TOMM40 gene that in combination with APOE3 allele will affect the Poly-T age of onset of AD1) Poly(A) tail 3’AAAA primer TTTT (TTT)n AAAA Isoform clusters SMRT adapter (TTT)n TTTT Consensus calling AAAA TTTT Nonredundant transcript isoforms AAAA TTTT AAAA TTTT Quality filtering SMRTbell ligation (AAA)nn Informatics Pipeline 6 Final isoforms 7 PacBio raw sequence reads Reads of Insert 8 RS sequencing 9 Classify sequence reads Isoform clusters Evidenced-based to reference genome gene Map models 10 Nonredundant transcript isoforms Final isoforms Evidence-based gene models Remove adapters Remove artifacts Reads clustering Consensus calling Quality filtering Map to reference genome DevNet: Iso-Seq wiki page Figure 1 gDNA & Transcripts from SK-BR-3 Cell Line Captured with NimbleGen Oncology Panel - example AURKA gene Sample 1 All Sample 1 Phase 0 Sample 1 Phase I Long reads and heterozygous SNPs allow you to phase reads into two haplotypes Phased Transcripts reveal retained introns and skipped exons In addition to calling the bases, SMRT Sequencing uses the kinetic information from each nucleotide to distinguish between modified and native bases. A G C mA T G T T Template strand We successfully captured and sequenced the associated poly-T repeat in the TOMM40 gene and were able to determine that the sample had one short (15) allele and one Very Long (34) allele. Schizophrenia: C4 gene C GA G C T AG TTC A T G T Determines if 2 aa difference Long or Short in exon 26 determines C4A or C4B isotype Template strand Different possible combinations (L/S/C4A/C4B) Amplification-free targeted sequencing using CRISPR/Cas9 Example: N6-methyladenine Long-read sequencing data properties Accuracy Read Length - Two functionally distinct genes (isotypes); C4A and C4B - Both isotypes can have 1 - 3 functional copies - A human endogenous retroviral (HERV) insertion in intron 9 changes the length of the gene Data generated with 20 kb size-selected human library using 6 hour movies with P6-C4 chemistry using PacBio RS II, analyzed with SMRT® Analysis v 2.3. Each SMRT Cell generates ~55,000. reads. E. coli 20 kb-insert library, SMRT Analysis v2.3 Targeted sequencing and multiplexing Targeted sequencing workflow The C4 structural variant is highly complex:2) Barcoding options 1. Barcoded adapters C4 plays a role in signaling which connections between neurons should be “pruned” or removed, as the brain develops after childhood. And the more C4 was present, the higher the risk of developing schizophrenia. Certain versions of the C4 gene seem to increase people’s risk for developing schizophrenia by 27 to 50 percent. Repeat expansion disorders are challenging to interrogate due to the long repetitive regions. Using CRISRP/Cas9 we are able to access the repeat counts, interruption sequences as well as epigenetic information without introducing PCR HTT gene in HD sample bias. We successfully captured and sequenced the C4 gene as part of a MHC capture panel. We were able to see the two different C4 isotypes (C4A and C4B) as well as seeing the 7 kb HERV insertion in intron 9. Phase 0 2. Barcoded Universal Primer Direct methylation detection of FMR1 pre-mutation sample Phase 1 Example: Isotype C4A/L was successfully called for a sample using long-read sequencing The Huntingtin gene CAG repeat counts in HD patients Phase 0 Phase 1 CGG repeat region appears to be heavily methylated (5mC) References Intron 9 1) 2) Roses AD, et al. (2010). A TOMM40 variable-length polymorphism predicts the age of late-onset Alzheimer’s disease. Pharmacogenomics J. 10(5): 375-84 Sekar A, et al . (2016). Schizophrenia risk from complex variation of complement component 4. Nature. 530(7589):177-83 For Research Use Only. Not for use in diagnostics procedures. © Copyright 2016 by Pacific Biosciences of California, Inc. All rights reserved. Pacific Biosciences, the Pacific Biosciences logo, PacBio, SMRT, SMRTbell, Iso-Seq, and Sequel are trademarks of Pacific Biosciences. BluePippin and SageELF are trademarks of Sage Science. NGS-go and NGSengine are trademarks of GenDx. All other trademarks are the sole property of their respective owners.