Download Detection of Inherited Mutations for Breast and Ovarian

Using the Bravo Liquid-Handling System for
Next Generation Sequencing Sample Prep
Tom Walsh, PhD
Division of Medical Genetics
University of Washington
Next generation sequencing
Sanger sequencing gold standard for over 30 years
Next Generation Sequencing is massively parallel
Millions of short reads, each 50-100bp
1,000-10,000 fold more sequence data
Very low cost per base
Target enrichment
Current sequence capacity enables whole genome sequencing
(3000Mb) or whole exome – coding regions (40Mb)
Target enrichment allows specific capture and sequencing of only
the genes associated with a particular disease/phenotype
Smaller sequencing target = reduced cost/higher sample
throughput
Applying target enrichment and sequencing to the detection of
mutations that predispose women to developing breast and
ovarian cancer
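The throughput argument above can be illustrated with a rough calculation. This is only a sketch: the target sizes (3000 Mb genome, 40 Mb exome) come from the slide, while the run yield, the coverage depth, and the function name `samples_per_run` are hypothetical values chosen for illustration. It also assumes, unrealistically, that every sequenced base lands on target.

```python
# Target sizes in megabases, taken from the slide above.
TARGET_MB = {"whole genome": 3000.0, "whole exome": 40.0}


def samples_per_run(run_yield_mb, target_mb, mean_coverage):
    """Number of samples one run can cover at the requested mean depth.

    Idealized: assumes all sequenced bases fall on the target region.
    """
    bases_needed_per_sample = target_mb * mean_coverage
    return int(run_yield_mb // bases_needed_per_sample)


# Hypothetical run yield (30,000 Mb) and depth (30x), for illustration only.
for name, size_mb in TARGET_MB.items():
    print(name, samples_per_run(30_000.0, size_mb, 30))
```

Under these assumed numbers, a smaller target translates directly into more samples per run, which is the point of the slide.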
BRCA1 and BRCA2
Inherited mutations in BRCA1 and BRCA2 predispose to high
risks of developing breast and ovarian cancer
Clinical recommendations for women with BRCA1 and
BRCA2 mutations include increased surveillance and risk
reducing surgical removal of the ovaries and fallopian tubes
after child-bearing is complete
Advent of PARP inhibitors, which preferentially kill BRCA1
and BRCA2 mutated cancers, has increased the clinical
incentive to identify mutation carriers
Family 1. BRCA1 c.2800∆AA
[Pedigree figure: relatives labeled with codes such as Br 36, Ov 48, Pr 79, Pa 66, Co 54 and with genotypes VN or NN]
Two hits: Inherited mutation + somatic loss of wildtype allele
Somatic mutation generally chromosomal deletion
Family 16. BRCA2 c.1310∆AAGA
BRCA2 1529 del AAAG -> 456 stop
[Pedigree figure: relatives labeled with codes such as Br 65, Pa 73, Es 68, Pr 72 and with genotypes VN or NN]
Large genes, each with >1000 different cancer-predisposing mutations
[Figure: BRCA1 and BRCA2. Source: NHGRI, Breast Cancer Information Core]
Mutation spectrum also includes large deletions and duplications not detectable by PCR
Genetic testing of BRCA1 and BRCA2
In the U.S., testing is carried out almost exclusively by Myriad
Protocol is based on PCR amplification of individual exons
followed by Sanger sequencing on capillary instruments
Large deletions and duplications are detected by a second test
(BART added in 2007) which measures copy number at exons
Our goal: develop a comprehensive ‘next generation
sequencing’ approach for research testing of all breast cancer
susceptibility genes
1. Rare multi-organ cancer syndromes
• Li-Fraumeni (p53): sarcomas, leukemias, breast
• Cowden (PTEN): thyroid, endometrial, breast
• Diffuse gastric cancer (CDH1): gastric and breast
• Peutz-Jeghers (STK11): colon and breast
• Lynch (mismatch repair genes): colon, endometrial, ovarian
2. Moderate risk breast cancer genes
BRCA-Fanconi Anemia complex
Mutations in 9 genes lead to 2-4 fold increased risk of developing breast cancer
Clinically relevant level of risk
Lower risk than BRCA1 and BRCA2 but still >25% lifetime risk
[Pathway figure labels: ATM, p53, BARD1, BRIP1, RAD51C, BRCA2, PALB2, NBS1, BRCA1, FANCD2, RAD51, PTEN, CHEK2, MRE11, RAD50]
Capturing 21 breast cancer genes
Capture exons, introns, untranslated regions and 10kb up/downstream
Total capture size = 939kb
High risk: BRCA1, BRCA2
Moderate risk: PALB2, CHEK2, BRIP1, NBS1, RAD50, MRE11, ATM, RAD50/51C
Rare syndromes: p53, CDH1, PTEN, STK11
Lynch syndrome: MLH1, MSH2, PMS1, PMS2, MUTYH
Capture design
In-solution capture with cRNA 120mer oligo baits (SureSelect), 3x tiling
Repeat masked, but allow 20bp overlap where exons are closely flanked by Alu repeats (BRCA1)
Tile through segmentally duplicated genes (CHEK2, PMS2, PTEN)
[Figure: cRNA baits (3x tiling) across BRCA1 and a flanking repeat]
Developing a ‘one stop’ genetic test
Simultaneously capture and sequence 21 genes known to
predispose to breast and/or ovarian cancer
Detect all mutation classes
Small: single base substitutions and indels
Large: exon deletions and duplications
Proof of principle: Test accuracy, sensitivity and specificity with
21 previously identified mutations from 10 genes
Capture and Sequencing
1. Sonicate (3µg DNA)
2. Paired-end library (200bp)
3. Hybridize to biotinylated capture bait oligos (21 gene regions)
4. Purify with streptavidin beads
5. 2x76bp reads (9 days)
• Identify SNP and indels (MAQ and BWA)
• Compare to dbSNP, mutation databases
• Identify CNVs (depth of coverage)
Test series results - small mutations
15/15 small mutations from 10 different genes accurately identified
Nonsense, splice site, missense and indels (1 to 19bp)
Zero false positive calls of mutations in any gene or any sample
Mutation detection within duplicated regions
Ratio of wildtype to mutation-containing reads ~ 50/50
One exception: CHEK2 1100delC (chr22:29,091,857), approximately 15% mutant reads
Partial CHEK2 pseudogenes are located on chromosomes 15 and 16 (segmental duplications)
4 extra copies of the target region reduce the mutant to wildtype signal
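The dilution described above can be checked with simple arithmetic. The sketch below assumes a heterozygous carrier (one mutant copy out of two at the true locus) and assumes that reads from the extra pseudogene copies align to the target as readily as reads from the gene itself. The function name `expected_mutant_fraction` is made up for this illustration.

```python
def expected_mutant_fraction(mutant_copies, target_copies, extra_copies):
    """Expected share of reads carrying the mutation.

    Assumes every copy (gene or pseudogene) contributes reads equally.
    """
    return mutant_copies / (target_copies + extra_copies)


# Unique region: 1 mutant allele out of 2 copies.
unique = expected_mutant_fraction(1, 2, 0)

# CHEK2 1100delC region: the same 2 copies plus 4 extra pseudogene copies.
duplicated = expected_mutant_fraction(1, 2, 4)

print(f"unique region:     {unique:.1%}")
print(f"duplicated region: {duplicated:.1%}")
```

One mutant copy out of six gives roughly 17%, in line with the approximately 15% mutant reads reported for CHEK2 1100delC.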
Test series results - large mutations
6/6 large mutations in BRCA1 and BRCA2 were accurately identified by
depth of coverage ratios normalized for bait coverage and GC content
BRCA1 mutation            Ratio
Deletion exons 14-20      0.52
Deletion exon 17          0.49
Duplication exon 13       1.58
Deletion exons 1-15       0.51
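A minimal sketch of how depth-of-coverage ratios like those above might be turned into calls. It is not the pipeline used in the talk: the thresholds (0.75 and 1.25) and the function names are hypothetical, and the normalization shown only scales by total depth, whereas the slide also normalizes for bait coverage and GC content.

```python
def depth_ratio(sample_exon, sample_total, ref_exon, ref_total):
    """Exon read depth in a sample relative to a reference, scaled by total depth."""
    return (sample_exon / sample_total) / (ref_exon / ref_total)


def classify_ratio(ratio, loss_below=0.75, gain_above=1.25):
    """Label a ratio; a heterozygous deletion is expected near 0.5, a duplication near 1.5.

    The cutoffs are illustrative assumptions, not values from the talk.
    """
    if ratio < loss_below:
        return "deletion"
    if ratio > gain_above:
        return "duplication"
    return "normal"


# Ratios reported on the slide.
for ratio in (0.52, 0.49, 1.58, 0.51):
    print(ratio, classify_ratio(ratio))
```

With two copies as the baseline, losing one copy halves the depth and gaining one raises it by half, which is why the reported ratios cluster near 0.5 and 1.5.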
Summary of proof of principle study
DNA capture and sequencing is accurate and sensitive for detecting
inherited mutations of clinically important genes
Simultaneously evaluates all known breast and ovarian cancer genes
Detects single base substitutions, indels and CNVs
Accurate mutation detection in non unique regions of the genome
Increasing throughput by barcoding samples
Sequence coverage is very high (1000x) with one sample per lane
Barcoding and pooling samples reduces sequencing costs
Hybridize individual samples to SureSelect baits then add unique 6bp
barcoded primer after capture by PCR amplification
Sequence barcode, demultiplex samples, analyze samples individually
Current throughput: 12 samples per lane, 96 per flow cell (GAIIx)
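The demultiplexing step can be sketched as a lookup from barcode to sample. The four 6bp barcodes below appear in the data table later in the talk; the sample names, the `(barcode, sequence)` input format, and the exact-match rule are assumptions made for this example.

```python
from collections import defaultdict

# Barcodes taken from the talk's data table; sample names are hypothetical.
BARCODE_TO_SAMPLE = {
    "TGACCA": "sample_A",
    "CAGATC": "sample_B",
    "GGCTAC": "sample_C",
    "CGATGT": "sample_D",
}


def demultiplex(reads):
    """Group reads by sample using an exact match on the 6bp barcode.

    `reads` is an iterable of (barcode, sequence) pairs. Reads whose
    barcode is not recognized are collected under "undetermined".
    """
    bins = defaultdict(list)
    for barcode, sequence in reads:
        sample = BARCODE_TO_SAMPLE.get(barcode, "undetermined")
        bins[sample].append(sequence)
    return dict(bins)


pooled = [("TGACCA", "ACGT"), ("CAGATC", "TTGA"), ("TGACCA", "GGCA"), ("NNNNNN", "CCCC")]
print(demultiplex(pooled))
```

Once the reads are binned, each sample can be analyzed on its own, as the slide describes.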
Multiplexing 96 barcoded samples per flow cell
Median coverage is 350x
97% of targeted bases >100x minimum coverage
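Coverage figures like these can be computed from per-base depths. The sketch below uses a made-up list of depths purely to show the calculation; the function name `coverage_summary` is hypothetical and the numbers are not data from the talk.

```python
from statistics import median


def coverage_summary(depths, min_depth=100):
    """Return (median depth, fraction of bases with depth above min_depth)."""
    covered = sum(1 for depth in depths if depth > min_depth)
    return median(depths), covered / len(depths)


# Hypothetical per-base depths for ten targeted positions.
example_depths = [80, 120, 350, 400, 360, 340, 500, 150, 330, 370]
median_depth, fraction_covered = coverage_summary(example_depths)
print(median_depth, fraction_covered)
```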
Data from multiplexing 96 barcoded samples
Within pool of 12 samples per lane

Barcode  Gene   Mutation        Location (hg19)                Wildtype  Variant
TGACCA   BRCA1  4510del3insTT   chr17:41,228,596-41,228,597    140       136
CAGATC   BRCA1  5382insC        chr17:41,228,596-41,228,597    89        80
TGACCA   BRCA2  9179G>C         chr13:32,953,650-32,953,650    212       201
GGCTAC   BARD1  1210del21       chr2:215,645,503-215,645,524   145       112
CGATGT   ATM    1027delGAAA     chr2:215,645,503-215,645,524   132       145
100x coverage enables accurate detection of all mutation classes
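The wildtype and variant read counts in the table above can be converted to variant read fractions. For a heterozygous mutation in a unique region, about half the reads are expected to carry the variant. The counts below are copied from the table; the 0.3 to 0.7 window used to flag a heterozygous call is an illustrative assumption, not a threshold given in the talk.

```python
# (gene, mutation, wildtype reads, variant reads), as listed in the table.
COUNTS = [
    ("BRCA1", "4510del3insTT", 140, 136),
    ("BRCA1", "5382insC", 89, 80),
    ("BRCA2", "9179G>C", 212, 201),
    ("BARD1", "1210del21", 145, 112),
    ("ATM", "1027delGAAA", 132, 145),
]


def variant_fraction(wildtype_reads, variant_reads):
    """Share of reads at the site that carry the variant."""
    return variant_reads / (wildtype_reads + variant_reads)


def looks_heterozygous(fraction, low=0.3, high=0.7):
    """Hypothetical window around the 50% expected for a heterozygote."""
    return low <= fraction <= high


fractions = [variant_fraction(wt, var) for _, _, wt, var in COUNTS]
for (gene, mutation, _, _), fraction in zip(COUNTS, fractions):
    print(f"{gene} {mutation}: {fraction:.2f}")
```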
Library prep is now the bottleneck
Prep time for 96 sequence ready libraries is 3 weeks with 3 FTEs
Most labor intensive part is magnetic bead (SPRI) clean ups
Pre-capture library prep (SPRI clean up x5)
Capture hybridization
Post-capture washes and amplification (SPRI clean up x2)
Increasing throughput by automation
Individual sample handling (x1) vs. 96 sample handling (x96)
All liquid handling, enzymatic incubations and post capture washes are performed on the deck
96 well magnet allows plate based SPRI clean up
Protocols can be edited easily and elution volumes changed
Post capture ‘off bead’ PCR amplification incorporated and validated
Summary
Sample throughput increased with barcoding and automation
With standalone Bravo:
Sequence-ready library prep for 96 samples is no longer the bottleneck
Reduced from 3 weeks with 3 FTEs to 3 days with 1 FTE
Manual tip box replacement (not so bad)
Complete walk away automated system (on wish list)
Ongoing projects
Ovarian cancer – sequenced 21 genes in 384 patients
All libraries prepped on Bravo, sequenced on 4 flowcells (GAIIx)
Breast cancer – 1900 high risk families (KingLab collection)
Running 96 samples in single lane of a HiSeq
96 post capture barcodes (early access from Agilent R+D)
Testing Bravo for exome capture with NimbleGen EZ cap oligos
Acknowledgments
Ming Lee, PhD – Bioinformatics pipeline
Alex Nord – CNV and breakpoint algorithms
Anne Thornton, Chris Pennil, Silvia Casadei, PhD – library prep
Mary-Claire King, PhD and Elizabeth Swisher, MD
National Cancer Institute, Dept of Defense, Komen for the Cure