Detection of Inherited Mutations for Breast and Ovarian
Using the Bravo Liquid-Handling System for Next Generation Sequencing Sample Prep
Tom Walsh, PhD
Division of Medical Genetics
University of Washington

Next generation sequencing
• Sanger sequencing: gold standard for over 30 years
• Next generation sequencing is massively parallel
• Millions of short reads, each 50-100 bp
• 1,000-10,000 fold more sequence data
• Very low cost per base

Target enrichment
• Current sequence capacity enables whole genome sequencing (3000 Mb) or whole exome sequencing (coding regions, 40 Mb)
• Target enrichment allows specific capture and sequencing of only the genes associated with a particular disease/phenotype
• Smaller sequencing target = reduced cost/higher sample throughput

Applying target enrichment and sequencing to the detection of mutations that predispose women to developing breast and ovarian cancer

BRCA1 and BRCA2
• Inherited mutations in BRCA1 and BRCA2 predispose to high risks of developing breast and ovarian cancer
• Clinical recommendations for women with BRCA1 and BRCA2 mutations include increased surveillance and risk-reducing surgical removal of the ovaries and fallopian tubes after child-bearing is complete
• Advent of PARP inhibitors, which preferentially kill BRCA1 and BRCA2 mutated cancers, has increased the clinical incentive to identify mutation carriers

Family 1. BRCA1 c.2800∆AA
[Pedigree figure: family members labeled with cancer type and age (Br, Ov, Pr, Pa, Co) and genotype (VN or NN)]
• Two hits: inherited mutation + somatic loss of wildtype allele
• Somatic mutation generally chromosomal deletion

Family 16. BRCA2 c.1310∆AAGA
BRCA2 1529 del AAAG -> 456 stop
[Pedigree figure: family members labeled with cancer type and age (Br, Pa, Es, Pr) and genotype (VN or NN)]

Large genes, each with >1000 different cancer-predisposing mutations
[Figure: BRCA1 and BRCA2. Source: NHGRI, Breast Cancer Information Core]
• Mutation spectrum also includes large deletions and duplications not detectable by PCR

Genetic testing of BRCA1 and BRCA2
• In the U.S., testing is carried out almost exclusively by Myriad
• Protocol is based on PCR amplification of individual exons followed by Sanger sequencing on capillary instruments
• Large deletions and duplications are detected by a second test (BART, added in 2007) which measures copy number at exons
• Our goal: develop a comprehensive 'next generation sequencing' approach for research testing of all breast cancer susceptibility genes

1. Rare multi organ cancer syndromes
• Li-Fraumeni: sarcomas, leukemias, breast (p53)
• Cowden: thyroid, endometrial, breast (PTEN)
• Diffuse gastric cancer: gastric and breast (CDH1)
• Peutz-Jeghers: colon and breast (STK11)
• Lynch: colon, endometrial, ovarian (mismatch repair genes)

2. Moderate risk breast cancer genes
[Pathway figure: BRCA-Fanconi Anemia complex, showing ATM, p53, BARD1, BRIP1, RAD51C, BRCA2, PALB2, NBS1, BRCA1, FANCD2, RAD51, PTEN, CHEK2, MRE11 and RAD50, with P and Ub modification marks]
• Mutations in 9 genes lead to 2-4 fold increased risk of developing breast cancer
• Clinically relevant level of risk
• Lower risk than BRCA1 and BRCA2 but still >25% lifetime risk

Capturing 21 breast cancer genes
• Capture exons, introns, untranslated regions and 10 kb up/downstream
• Total capture size = 939 kb

Risk category     Genes
High risk         BRCA1, BRCA2
Moderate risk     PALB2, CHEK2, BRIP1, NBS1, RAD50, MRE11, ATM, RAD50/51C
Rare syndromes    p53, CDH1, PTEN, STK11
Lynch syndrome    MLH1, MSH2, PMS1, PMS2, MUTYH

Capture design
• In solution capture with cRNA 120mer oligo baits (SureSelect)
• Repeat masked but allow 20 bp overlap where exons are closely flanked by Alu repeats (BRCA1)
• Tile through segmentally duplicated genes (CHEK2, PMS2, PTEN)
[Figure: cRNA baits (3x tiling) across BRCA1 and an adjacent repeat]

Developing a 'one stop' genetic test
• Simultaneously capture and sequence 21 genes known to predispose to breast and/or ovarian cancer
• Detect all mutation classes
  Small: single base substitutions and indels
  Large: exon deletions and duplications
• Proof of principle: test accuracy, sensitivity and specificity with 21 previously identified mutations from 10 genes

Capture and Sequencing
1. Sonicate (3 µg DNA)
2. Paired-end library (200 bp)
3. Hybridize to biotinylated capture bait oligos (21 gene regions)
4. Purify with streptavidin beads
5. 2x76 bp reads (9 days)
• Identify SNPs and indels (MAQ and BWA)
• Compare to dbSNP, mutation databases
• Identify CNVs (depth of coverage)

Test series results - small mutations
• 15/15 small mutations from 10 different genes accurately identified
• Nonsense, splice site, missense and indels (1 to 19 bp)
• Zero false positive calls of mutations in any gene or any sample

Mutation detection within duplicated regions
• Ratio of wildtype to mutation-containing reads ~50/50
• One exception: CHEK2 1100delC, approximately 15% mutant reads
• Partial CHEK2 pseudogenes are located on chromosomes 15 and 16
• 4 extra copies of the target region reduce the mutant to wildtype signal
[Figure: chr22:29,091,857 segmental duplications]

Test series results - large mutations
• 6/6 large mutations in BRCA1 and BRCA2 were accurately identified by depth of coverage ratios normalized for bait coverage and GC content

BRCA1                   Ratio
Deletion exons 14-20    0.52
Deletion exon 17        0.49
Duplication exon 13     1.58
Deletion exons 1-15     0.51

Summary of proof of principle study
• DNA capture and sequencing is accurate and sensitive for detecting inherited mutations of clinically important genes
• Simultaneously evaluates all known breast and ovarian cancer genes
• Detects single base substitutions, indels and CNVs
• Accurate mutation detection in non-unique regions of the genome

Increasing throughput by barcoding samples
• Sequence coverage is very high (1000x) with one sample per lane
• Barcoding and pooling samples reduces sequencing costs
• Hybridize individual samples to SureSelect baits, then add a unique 6 bp barcoded primer after capture by PCR amplification
• Sequence barcode, demultiplex samples, analyze samples individually
• Current throughput: 12 samples per lane, 96 per flow cell (GAIIx)

Multiplexing 96 barcoded samples per flow cell
• Median coverage is 350x
• 97% of targeted bases >100x minimum coverage

Data from multiplexing 96 barcoded samples
Within pool of 12 samples per lane

Barcode   Gene    Mutation        Location (hg19)               Wildtype   Variant
TGACCA    BRCA1   4510del3insTT   chr17:41,228,596-41,228,597   140        136
CAGATC    BRCA1   5382insC        chr17:41,228,596-41,228,597   89         80
TGACCA    BRCA2   9179G>C         chr13:32,953,650-32,953,650   212        201
GGCTAC    BARD1   1210del21       chr2:215,645,503-215,645,524  145        112
CGATGT    ATM     1027delGAAA     chr2:215,645,503-215,645,524  132        145
[Overlay label on later builds of this slide: BRCA2, next to 1027delGAAA]

• 100x coverage enables accurate detection of all mutation classes

Library prep is now the bottleneck
• Prep time for 96 sequence ready libraries is 3 weeks with 3 FTEs
• Most labor intensive part is magnetic bead (SPRI) clean ups
• Pre capture library prep (SPRI clean up x5)
• Capture hybridization x1
• Post capture washes and amplification (SPRI clean up x2)

Increasing throughput by automation
[Figure labels: x96 individual sample handling; x1 96 sample handling]
• All liquid handling, enzymatic incubations and post capture washes are performed on the deck
• 96 well magnet allows plate based SPRI clean up
• Protocols can be edited easily and elution volumes changed
• Post capture 'off bead' PCR amplification incorporated and validated

Summary
• Sample throughput increased with barcoding and automation
• With standalone Bravo: 96 sample sequence-ready library preps are no longer the bottleneck
• Reduced from 3 weeks with 3 FTEs to 3 days with 1 FTE
• Manual tip box replacement (not so bad)
• Complete walk away automated system (on wish list)

Ongoing projects
• Ovarian cancer: sequenced 21 genes in 384 patients
  All libraries prepped on Bravo, sequenced on 4 flowcells (GAIIx)
• Breast cancer: 1900 high risk families (King Lab collection)
  Running 96 samples in a single lane of a HiSeq
  96 post capture barcodes (early access from Agilent R+D)
• Testing Bravo for exome capture with NimbleGen EZ cap oligos

Acknowledgments
• Ming Lee, PhD: bioinformatics pipeline
• Alex Nord: CNV and breakpoint algorithms
• Anne Thornton, Chris Pennil, Silvia Casadei, PhD: library prep
• Mary-Claire King, PhD and Elizabeth Swisher, MD
• National Cancer Institute, Dept of Defense, Komen for the Cure
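
Worked example: depth of coverage ratios
The large-mutation results above report normalized depth of coverage ratios near 0.5 for heterozygous deletions and near 1.5 for a duplication. The Python sketch below shows how such ratios could be turned into calls. The function names and the 0.7 and 1.3 cutoffs are illustrative assumptions, not values from the talk, and the talk's normalization for bait coverage and GC content is not reproduced here.

```python
def depth_ratio(sample_depth, reference_depth):
    """Ratio of a sample's normalized depth to a reference depth for one exon."""
    if reference_depth <= 0:
        raise ValueError("reference depth must be positive")
    return sample_depth / reference_depth


def classify_copy_ratio(ratio, del_max=0.7, dup_min=1.3):
    """Call a region from its depth ratio.

    del_max and dup_min are hypothetical cutoffs chosen for illustration:
    one lost copy out of two gives a ratio near 0.5, one gained copy gives
    a ratio near 1.5.
    """
    if ratio <= del_max:
        return "deletion"
    if ratio >= dup_min:
        return "duplication"
    return "normal"


# Ratios reported for the BRCA1 test series in the talk.
reported = {
    "Deletion exons 14-20": 0.52,
    "Deletion exon 17": 0.49,
    "Duplication exon 13": 1.58,
    "Deletion exons 1-15": 0.51,
}

calls = {name: classify_copy_ratio(ratio) for name, ratio in reported.items()}
```

With these assumed cutoffs, each reported ratio falls in the class named by its label.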
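
Worked example: mutant read fraction in duplicated regions
The CHEK2 1100delC result above (about 15% mutant reads instead of about 50%) is consistent with simple copy counting. The sketch below assumes that reads from the pseudogene copies are captured and aligned as efficiently as reads from CHEK2 itself and that every pseudogene copy carries the wildtype sequence at this position. Both assumptions, and the function names, are illustrative and not stated in the talk.

```python
def expected_mutant_fraction(extra_copies, gene_copies=2, mutant_copies=1):
    """Expected fraction of mutant reads for a heterozygous mutation.

    extra_copies counts additional wildtype copies of the target region
    (for example pseudogene copies) whose reads land on the same position.
    """
    total_copies = gene_copies + extra_copies
    if total_copies <= 0:
        raise ValueError("total copy number must be positive")
    return mutant_copies / total_copies


def observed_variant_fraction(wildtype_reads, variant_reads):
    """Fraction of reads carrying the variant at one site."""
    total = wildtype_reads + variant_reads
    if total == 0:
        raise ValueError("no reads at this site")
    return variant_reads / total


unique_region = expected_mutant_fraction(extra_copies=0)    # 1/2
chek2_region = expected_mutant_fraction(extra_copies=4)     # 1/6

# First row of the multiplexing table: 140 wildtype and 136 variant reads.
brca1_fraction = observed_variant_fraction(140, 136)
```

Under these assumptions, four extra copies give an expected mutant fraction of 1/6, roughly 17%, which is close to the approximately 15% reported.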
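
Worked example: demultiplexing barcoded samples
The barcoding slides above describe sequencing a 6 bp barcode and then separating pooled reads by sample. The sketch below shows the idea with exact barcode matching. The barcodes are those listed in the data table, while the sample names, the read tuples and the function name are hypothetical. Real pipelines usually also tolerate sequencing errors in the barcode, which is omitted here.

```python
from collections import defaultdict

# Barcodes taken from the talk's data table; sample names are hypothetical.
BARCODE_TO_SAMPLE = {
    "TGACCA": "sample_A",
    "CAGATC": "sample_B",
    "GGCTAC": "sample_C",
    "CGATGT": "sample_D",
}


def demultiplex(reads, barcode_to_sample):
    """Group (barcode, sequence) pairs by sample using exact barcode matches.

    Reads whose barcode is not in the table go to the "unassigned" bin.
    """
    bins = defaultdict(list)
    for barcode, sequence in reads:
        sample = barcode_to_sample.get(barcode, "unassigned")
        bins[sample].append(sequence)
    return dict(bins)


# Hypothetical pooled reads from one lane: (barcode read, insert read).
pooled_reads = [
    ("TGACCA", "ACGTACGT"),
    ("CAGATC", "TTGACCGA"),
    ("TGACCA", "GGCATTAC"),
    ("NNNNNN", "CCCCAAAA"),
]

by_sample = demultiplex(pooled_reads, BARCODE_TO_SAMPLE)
```

Each sample's reads can then be aligned and analyzed individually, as the slides describe.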