Improving coverage of poorly sequenced regions in clinical exomes
Eric J. White, Edgardo Lopez, Christopher Gault, Niru Chennagiri, Alexander Frieden, Rebecca
Batorsky, Anastasia Nikiforov, Tristen Ross and John F. Thompson
Abstract
We have previously reported an orthogonal sequencing approach for clinical whole
exome sequencing in which results of two next-generation sequencing platforms are
combined for rapid variant confirmation. This both reduces the Sanger sequencing
confirmation burden by ~95% and increases overall assay sensitivity, since each
platform uniquely sequences thousands of exons. In the current orthogonal approach,
we sequence Agilent Clinical Research Exome (CRE) libraries on the Illumina
NextSeq and combine the resulting variants with those identified from AmpliSeq Exome
libraries sequenced on the Ion Torrent Proton. Although the orthogonal approach
increases exome variant sensitivity, poorly covered regions remain and may result in
missed pathogenic variants. To minimize this problem, we have designed new sets of
primers for low-coverage AmpliSeq amplicons and amplified these independently at lower
multiplicity than the highly multiplexed standard amplicons. Independent pools are
designed for up to hundreds of genes in a phenotypically driven manner. These are
used as a supplement to the standard amplicons and sequenced together with them.
We find that coverage of many of the low-coverage regions is enhanced to the point
that variants can be called, and sensitivity in those regions is substantially improved,
so that patients receive a higher-quality analysis.
Booster Amplicons Improve Gene Coverage
The Claritas Clinical Exome forms the backbone of the phenotype-specific Region of
Interest (ROI) tests offered by Claritas Genomics. While the orthogonal sequencing
approach improves coverage and sensitivity relative to either platform alone, small
regions of many genes remain poorly amplified or sequenced. AmpliSeq primers
were designed in an ROI-based strategy to increase coverage of genes in the Ion Torrent
Proton AmpliSeq Exome sequencing run. Graphs below show cumulative numbers of
genes covered at 20x or greater in the individual ROI tests (NA12878). Orange lines
show gene coverage with Illumina only. Blue lines show gene coverage with Illumina and
Ion Torrent orthogonally combined. Green lines show coverage of orthogonally combined
runs with the addition of booster amplicons.
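The quantity plotted in these graphs, the number of genes with at least a given percentage of target bases covered at 20x or more, can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline used for the poster. The function names, the toy gene names and depth values, and the rule used to combine platforms (taking the per-base maximum depth) are all assumptions made for the example.

```python
from typing import Dict, List, Sequence

MIN_DEPTH = 20  # the 20x threshold used in the graphs


def combine_tracks(*tracks: Sequence[int]) -> List[int]:
    """Per-base maximum depth across platforms (an assumed combination rule)."""
    return [max(depths) for depths in zip(*tracks)]


def percent_covered(depths: Sequence[int], min_depth: int = MIN_DEPTH) -> float:
    """Percentage of a gene's target bases with depth >= min_depth."""
    if not depths:
        raise ValueError("gene has no target bases")
    covered = sum(1 for d in depths if d >= min_depth)
    return 100.0 * covered / len(depths)


def genes_meeting(per_gene: Dict[str, Sequence[int]], percent: float) -> int:
    """Number of genes whose covered percentage is at least `percent`."""
    return sum(1 for depths in per_gene.values()
               if percent_covered(depths) >= percent)


# Hypothetical per-base depths for two made-up genes on each platform.
illumina = {"GENE_A": [30, 25, 5, 0], "GENE_B": [40, 40, 40, 40]}
proton = {"GENE_A": [10, 22, 21, 3], "GENE_B": [0, 50, 12, 33]}
combined = {gene: combine_tracks(illumina[gene], proton[gene])
            for gene in illumina}

for label, data in (("Illumina", illumina), ("Combined", combined)):
    counts = {pct: genes_meeting(data, pct) for pct in (50, 75, 100)}
    print(label, counts)
```

Evaluating `genes_meeting` over a range of percentages for each data set gives one curve per line color in the graphs; booster amplicon reads would enter as an additional depth track.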
[Figures: Immunology ROI, Nephrology ROI, Neurology ROI and Bone Marrow Failure ROI.
Views: full scale and 95-100% range. X-axis: Percent of Gene Covered ≥20x. Y-axis:
Number of Genes.]

Orthogonal Sequencing Improves Data Quality
Orthogonal sequencing using independent target enrichment and sequencing methods
leads to nearly 100% PPV with NA12878 for variants found on both platforms and
increases exome assay sensitivity to >98%. Variants found on only one platform are
confirmed prior to reporting.

[Figure: Number of Genes vs. Percent of Gene Covered at ≥20x, for Illumina and
+IonTorrent.]
The addition of the Ion Torrent AmpliSeq Exome to the Agilent Clinical Research Exome
increases the number of fully covered genes by 42%.

Category               | Definition                                                               | % of Total | # FP | # TP  | PPV     | # FPs Fixed by Sanger
Orthogonally Confirmed | Proton call matches NextSeq Pass call                                    | 94.4%      | 1    | 49167 | 99.998% | 0
Reliable               | Proton call matches NextSeq NoPass                                       | 0.3%       | 0    | 134   | 97.81%  | 0
Likely True Positives  | Singleton NextSeq call or Singleton Proton call with no NextSeq coverage | 4.6%       | 124  | 2249  | 94.77%  | ND
Likely False Positives | Singleton NextSeq NoPass or Singleton Proton with NextSeq coverage       | 0.8%       | 346  | 79    | 18.59%  | ND

Region: RefSeq intersection NIST intersection CRE intersection AmpliSeq
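The four categories above, and the PPV figures reported for them, can be illustrated with a short Python sketch. It is a hedged reading of the category definitions, not the authors' implementation. The `SiteEvidence` record and its fields are hypothetical, and the sketch assumes that a "Singleton NextSeq call" in the Likely True Positives row means a NextSeq call with a Pass filter. PPV is computed in the usual way as TP / (TP + FP).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SiteEvidence:
    """Evidence for one variant site on both platforms (hypothetical record)."""
    proton_called: bool            # variant called on the Ion Torrent Proton
    nextseq_filter: Optional[str]  # "Pass", "NoPass", or None if no NextSeq call
    nextseq_covered: bool          # NextSeq had read coverage at the site


def classify(site: SiteEvidence) -> str:
    """Assign a site to one of the four categories in the table."""
    if site.nextseq_filter not in ("Pass", "NoPass", None):
        raise ValueError(f"unexpected filter value: {site.nextseq_filter!r}")
    if site.proton_called:
        if site.nextseq_filter == "Pass":
            return "Orthogonally Confirmed"
        if site.nextseq_filter == "NoPass":
            return "Reliable"
        # Singleton Proton call: trust it only where NextSeq had no coverage.
        if site.nextseq_covered:
            return "Likely False Positives"
        return "Likely True Positives"
    # Singleton NextSeq call (assumption: Pass calls are the likely true ones).
    if site.nextseq_filter == "Pass":
        return "Likely True Positives"
    if site.nextseq_filter == "NoPass":
        return "Likely False Positives"
    raise ValueError("no variant call on either platform")


def ppv(true_positives: int, false_positives: int) -> float:
    """Positive predictive value, TP / (TP + FP)."""
    total = true_positives + false_positives
    if total == 0:
        raise ValueError("no calls to evaluate")
    return true_positives / total


# TP and FP counts taken from three rows of the table.
print(round(100 * ppv(49167, 1), 3))   # Orthogonally Confirmed
print(round(100 * ppv(2249, 124), 2))  # Likely True Positives
print(round(100 * ppv(79, 346), 2))    # Likely False Positives
```

With the counts from those three rows, the computed values round to the PPV percentages given in the table (99.998%, 94.77% and 18.59%).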