Medical sequencing
Principle and application of exome sequencing
Yu Sun
• Sanger sequencing of known disease-causing genes
• For MR/MCA: SNP array
Exome sequencing
Genome sequencing
2
Outline
• Introduction of exome sequencing
• Technique, workflow, strategy
• An example case solved by exome sequencing following the standard strategy
• SCAR7 and TPP1
• What to do when the standard strategy fails
• “Silent” variation can be disease causing
• Exome sequencing might catch branch point mutations
• Genome sequencing
3
Exome sequencing

• Exome: all exons in a region
• Procedure
  • Exome capture: SureSelect, Nimblegen, Truseq, etc.
  • NGS sequencing: Illumina, 454, etc.
• Usage
  • Find the mutation for Mendelian disease
  • Rare allele finding for complex disease
4
Sporadic
Familial
de novo
5
Bamshad, Nature Reviews Genetics 12, 745-755
Identify Mendelian disease genes
• By exome sequencing
• a simple “sample-sequence-compare” loop
6
http://www.genomics.agilent.com/GenericA.aspx?PageType=Custom&SubPageType=Custom&PageID=2098
7
Data analysis of exome sequencing
• Short reads
• Find the genomic location of each short read
• Genomic location, nucleotide changes, genotype, etc. (no biological info yet)
• Add biological info: gene, variant function, conservation, etc.
• Find the disease genes
8
9
10
Standard filter and comparison strategy
• Tens of thousands of variants are found in each exome
• Filtering and comparison shorten the candidate list
11
Example case
SCAR7 and TPP1
Human Mutation, 2013, 34 (5), 706-713
STANDARD STRATEGY
12
SCAR7
• Autosomal recessive spinocerebellar ataxia 7 (SCAR7), OMIM 609207
• Phenotypes:
  • difficulty walking and writing
  • dysarthria
  • limb ataxia
  • cerebellar atrophy
13
Method
• SureSelect 50 Mb all exons + Illumina HiSeq
• 100 bp paired-end sequencing
• Filtering
  • In linkage region 11p15
  • Low frequency (<5%) in NHLBI ESP exomes
  • Homozygous/compound heterozygous
  • Nonsense, missense, frameshift, coding indel, splice site
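The filtering steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the `Variant` record, its field names, the region coordinates, the gene names, and all toy values are invented for the example.

```python
from collections import defaultdict
from dataclasses import dataclass

# Functional classes kept by the filter (taken from the slide).
DAMAGING = {"nonsense", "missense", "frameshift", "coding-indel", "splice-site"}


@dataclass(frozen=True)
class Variant:
    """Hypothetical record for one annotated variant call."""
    gene: str
    chrom: str
    pos: int
    zygosity: str    # "hom" or "het" (assumed encoding)
    function: str    # one of the classes above, or e.g. "silent"
    pop_freq: float  # frequency in a reference panel; 0.0 if never seen


def standard_filter(variants, chrom, start, end, max_freq=0.05):
    """Return {gene: [variants]} for genes that pass every filter step."""
    kept = [
        v for v in variants
        if v.chrom == chrom and start <= v.pos <= end  # in linkage region
        and v.pop_freq < max_freq                      # low frequency
        and v.function in DAMAGING                     # predicted function
    ]
    by_gene = defaultdict(list)
    for v in kept:
        by_gene[v.gene].append(v)
    # Recessive model: one homozygous variant, or at least two heterozygous
    # variants in the same gene (a compound-heterozygous candidate; phase
    # still has to be confirmed, e.g. by segregation in the family).
    return {
        gene: vs for gene, vs in by_gene.items()
        if any(v.zygosity == "hom" for v in vs)
        or sum(v.zygosity == "het" for v in vs) >= 2
    }


# Toy input: gene names, positions, and frequencies are invented.
calls = [
    Variant("GENE_A", "11", 5_000_100, "het", "missense", 0.0),
    Variant("GENE_A", "11", 5_002_300, "het", "splice-site", 0.0),
    Variant("GENE_B", "11", 5_500_000, "het", "missense", 0.001),  # single het
    Variant("GENE_C", "11", 6_000_000, "hom", "missense", 0.20),   # too common
    Variant("GENE_D", "11", 6_200_000, "hom", "silent", 0.0),      # silent
    Variant("GENE_E", "2", 1_000_000, "hom", "nonsense", 0.0),     # off region
]
candidates = standard_filter(calls, "11", 4_000_000, 7_000_000)
print(sorted(candidates))
```

Two heterozygous variants in one gene are only a candidate for compound heterozygosity, so a cosegregation check in the family is still needed afterwards.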
14
What’s known?
11p15, 5.8cM, >200 genes
Breedveld, J Med Genet 2004, 858-866
15
Candidate list
Gene      Chr  Position  Ref base  Sample genotype  HGVS nomenclature           Function (GVS)  Note
TPP1      11   6636430   A         A/C              NM_000391.3:c.1397T>G       missense
TPP1      11   6638385   C         C/G              NM_000391.3:c.509-1G>C      splice-3
DCHS1     11   6645264   G         A/G              NM_003737.2:c.7643C>T       missense        Does not cosegregate
DCHS1     11   6662466   C         C/T              NM_003737.2:c.379G>A        missense        Does not cosegregate
C11orf40  11   4594558   -         -/G              NM_144663.1:c.286_287insC   frameshift      Does not cosegregate
C11orf40  11   4598956   C         C/T              NM_144663.1:c.95G>A         nonsense        Does not cosegregate
TPP1 encodes a lysosomal serine protease with tripeptidyl-peptidase 1 activity
16
TPP1 variants
17
TPP1 variants
Both variants cosegregate with the disease within families A and B
[Figure panels: wild type, c.1397T>G, c.509-1G>C]
18
The first example
TOD and FLNA
“Silent” variation can be disease causing
The American Journal of Human Genetics, 2010, 87(1), 146-153
WHAT IF THE STANDARD STRATEGY DOES NOT WORK
19
Terminal Osseous Dysplasia
• Terminal Osseous Dysplasia, TOD (OMIM 300244)
• Rare, all female
• X-linked male lethal dominant disease
• Phenotypes:
  • pigmentary anomalies of the skin
  • skeletal abnormalities of the limbs
  • recurring digital fibroma
20
Xq
• Zhang et al, 2000
• Linkage analysis Xq27.3q28
• 8.7Mb in total
• 219 genes
21
Method
• Probands of the Dutch and Italian families
• SureSelect X-exome (Agilent)
• Sequencing on Genome Analyzer II (Illumina)
[Figure: pedigrees of the Dutch family (1I:1, 1I:2, 1II:1, 1II:2 (proband, P), 1III:1, 1III:2) and the Italian family (2I:1, 2I:2, 2II:1, 2II:2, 2III:1)]
22
Variant Filtering
• In Xq27.3-q28
• Heterozygous
• Low frequency in the European population
• Missense, nonsense, frameshift, inframe indel, splice site
• Include silent mutation
• 1 gene 1 variant in common
• c.5217G>A in FLNA
– Not in dbSNP; not reported before; not present in the 1000 Genomes Project; not found in >400 control X chromosomes
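The effect of including silent mutations can be illustrated with a short Python sketch. It assumes each proband's variants are stored as `(gene, cDNA change, function)` tuples; the function name `shared_variants`, the sample labels, and the `GENE_X`/`GENE_Y` entries are invented placeholders, and only the FLNA variant comes from the slide.

```python
# Functional classes kept by the standard filter (from the slide).
STANDARD = {"missense", "nonsense", "frameshift", "inframe-indel", "splice-site"}


def shared_variants(probands, allowed_functions):
    """Return the variants that every proband carries after the function filter.

    probands: {sample name: set of (gene, cDNA change, function) tuples}
    """
    filtered = [
        {v for v in variants if v[2] in allowed_functions}
        for variants in probands.values()
    ]
    return set.intersection(*filtered) if filtered else set()


# Toy input: GENE_X and GENE_Y and their variants are invented placeholders.
probands = {
    "Dutch proband": {
        ("FLNA", "c.5217G>A", "silent"),
        ("GENE_X", "c.100A>G", "missense"),
    },
    "Italian proband": {
        ("FLNA", "c.5217G>A", "silent"),
        ("GENE_Y", "c.200C>T", "missense"),
    },
}

strict = shared_variants(probands, STANDARD)
relaxed = shared_variants(probands, STANDARD | {"silent"})
print(strict)   # nothing shared under the standard function filter
print(relaxed)  # the silent FLNA variant is shared once "silent" is allowed
```

In this toy setup the standard filter leaves no variant in common, while the relaxed filter recovers the single shared variant.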
23
Sanger sequencing confirmation c.5217G>A
[Figure: pedigrees of the Dutch, Italian, and Israel-Arab families with Sanger genotypes at c.5217. Wild type: G/G or G; mutation: G/A]
3 sporadic cases: G/A
24
FLNA summary
Cytoskeletal protein filamin A
Flanking 26kb
NM_001456: 47 exons, 2639 a.a.
NM_001110556: 48 exons, 2647 a.a.
25
Altered splicing
• Fibroma cells from 1III:2
• Altered splicing
[Figure: pedigree of family 1 (1I:1, 1I:2, 1II:1, 1II:2 (proband, P), 1III:1, 1III:2)]
26
The second example
Aarskog Scott Syndrome and FGD1
Exome sequencing can detect branch point mutation
Human Mutation, 2012, 34 (3), 430-434
WHAT IF THE STANDARD STRATEGY DOES NOT WORK
27
Aarskog Scott Syndrome
• OMIM 305400, most cases X-linked inheritance; females mildly affected
• FGD1 gene (Xp11.22), 19 exons, 100 kb
• Mutation detection rate approximately 20%
• Phenotypes
  • Short stature (-1 to -2 SD)
  • Facial dysmorphism
  • Small hands and feet
  • Shawl scrotum
  • Mental retardation rare
28
Method
• Two affected boys, negative for FGD1 mutations by Sanger sequencing
• SureSelect all exons (Agilent)
• Sequencing on Genome Analyzer II
29
Candidate list
Sample  Variant                     Gene     Function    Depth  dbSNP  OMIM  Note
III-1   NM_018325.2:c.607G>C        C9orf72  Missense    9      none   No    Associated with FTLD/ALS
III-2   NM_018325.2:c.607G>C        C9orf72  Missense    9      none   No    Associated with FTLD/ALS
III-1   NM_080818.3:c.512_513insA   OXGR1    Frameshift  43     none   No    Does not cosegregate
III-2   NM_080818.3:c.512_513insA   OXGR1    Frameshift  67     none   No    Does not cosegregate
FTLD/ALS = frontotemporal dementia and/or amyotrophic lateral sclerosis, not similar to the ASS phenotype
Neither seems to be the causative variant
Extend the region 50 nt into the intron
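Extending the filter into the intron amounts to changing one threshold. The sketch below is a simplified illustration, assuming variants are described in HGVS coding-DNA notation and that only the first position of a range matters; the helper names `intronic_offset` and `in_target` are hypothetical, and the variants listed are the three named on the candidate slides of this example.

```python
import re

# Matches the first position of an HGVS coding-DNA description and captures
# an optional intronic offset such as "-35" or "+5".
_POSITION = re.compile(r"c\.[-*]?\d+([+-]\d+)?")


def intronic_offset(hgvs):
    """Return the signed distance to the nearest exon, or None if exonic."""
    match = _POSITION.search(hgvs)
    if match is None:
        raise ValueError(f"not a coding-DNA description: {hgvs!r}")
    offset = match.group(1)
    return int(offset) if offset is not None else None


def in_target(hgvs, flank):
    """True if the variant is exonic or within `flank` nt of an exon."""
    offset = intronic_offset(hgvs)
    return offset is None or abs(offset) <= flank


variants = [
    "NM_018325.2:c.607G>C",
    "NM_080818.3:c.512_513insA",
    "NM_004463.2:c.2016-35delA",
]
splice_site_only = [v for v in variants if in_target(v, flank=2)]
extended = [v for v in variants if in_target(v, flank=50)]
print(splice_site_only)
print(extended)
```

With a flank of 2 nt (canonical splice sites only) the intronic deletion at position -35 is dropped; with a flank of 50 nt it is kept.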
30
Sample  Variant                     Gene     Function    Depth  dbSNP  OMIM  Note
III-1   NM_018325.2:c.607G>C        C9orf72  Missense    9      none   No    Associated with FTLD/ALS
III-2   NM_018325.2:c.607G>C        C9orf72  Missense    9      none   No    Associated with FTLD/ALS
III-1   NM_080818.3:c.512_513insA   OXGR1    Frameshift  43     none   No    Does not cosegregate
III-2   NM_080818.3:c.512_513insA   OXGR1    Frameshift  67     none   No    Does not cosegregate
III-1   NM_004463.2:c.2016-35delA   FGD1     intron      11     none   Yes
III-2   NM_004463.2:c.2016-35delA   FGD1     intron      11     none   Yes
31
Sanger Sequencing
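[Figure: Sanger sequencing results; genotype labels delA (three individuals) and delA/A (one individual)]
• Not present in controls or in the 1000 Genomes Project
• Not reported before in the literature
• Prediction: disrupts the splicing branch site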
32
FGD1
[Figure: FGD1 exons 12 to 14; samples III-1, III-2, II-2, I-1, and controls]
33
Exome sequencing
Pros
• Successes in disease gene detection
• High throughput
• Exonic, easier to understand the effect
• Cheaper and smaller data compared to genome sequencing
Cons
• Only exonic regions, misses other genetic information
• Capture bias, might cause false negatives
• High deviation in depth, hard to detect copy number changes
• Unsolved cases
34
                     Exome sequencing                    Genome sequencing
Sample prep                                              Easier (no capture)
Capture bias         Yes                                 No
Genetic information  Around selected exons               Whole genome
Data size            Much smaller (1~2% of the genome)
Depth                More deviation                      Less deviation
Price                Cheaper                             More expensive
35
[Figure: read depth, exome vs. genome]
Less depth deviation in genome sequencing than in exome sequencing
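One common way to put a number on depth deviation is the coefficient of variation (standard deviation divided by mean). The sketch below is only an illustration of that measure, not necessarily the one behind the figure; the function name `depth_cv` and all depth values are invented.

```python
from statistics import mean, pstdev


def depth_cv(depths):
    """Coefficient of variation (standard deviation / mean) of read depth."""
    return pstdev(depths) / mean(depths)


# Invented per-target depths, for illustration only.
exome_depths = [5, 120, 40, 200, 15]
genome_depths = [28, 32, 30, 35, 27]

print(f"exome CV:  {depth_cv(exome_depths):.2f}")
print(f"genome CV: {depth_cv(genome_depths):.2f}")
```

A higher value means more uneven coverage, which is what makes depth-based copy number detection harder in exome data.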
36
[Figure: exome vs. genome]
Two variants missed by exome sequencing, possibly due to capture failure
37
Summary
• Exome sequencing is a high-throughput technique for finding causative mutations of Mendelian diseases.
• Sample choice
  De novo mutation: family trio
  Inherited mutation: some sporadic cases, distantly related family members
• The standard filter/comparison strategy gives a general solution for disease gene detection by exome sequencing.
38
Summary
• Step-wise filtering and comparison strategy:
  Inheritance pattern: dominant – heterozygous; recessive – homozygous/compound heterozygous
  Predicted function: nonsense, splice site, missense, frameshift, coding indel
  Novelty: databases, etc.
  Comparison: familial – variant level; sporadic – gene level
• The standard strategy does not always work.
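The two comparison levels can be sketched in Python as follows. This is a hedged illustration: the function names `shared_in_family` and `recurrent_genes`, the sample labels, and every gene and variant in the toy data are invented, and variants are assumed to be `(gene, cDNA change)` tuples.

```python
from collections import Counter


def shared_in_family(members):
    """Variant-level comparison: variants carried by all affected relatives."""
    sets = list(members.values())
    return set.intersection(*sets) if sets else set()


def recurrent_genes(cases, min_cases=2):
    """Gene-level comparison: genes hit in at least `min_cases` unrelated cases."""
    counts = Counter()
    for variants in cases.values():
        counts.update({gene for gene, _change in variants})  # once per case
    return {gene for gene, n in counts.items() if n >= min_cases}


# Toy data: all names and variants are invented placeholders.
family = {
    "affected 1": {("GENE_A", "c.10A>G"), ("GENE_B", "c.20C>T")},
    "affected 2": {("GENE_A", "c.10A>G"), ("GENE_C", "c.30G>A")},
}
sporadic = {
    "case 1": {("GENE_D", "c.5C>T"), ("GENE_E", "c.7del")},
    "case 2": {("GENE_D", "c.99G>A")},
    "case 3": {("GENE_F", "c.1A>T")},
}

print(shared_in_family(family))   # same variant in every affected relative
print(recurrent_genes(sporadic))  # same gene, possibly different variants
```

Relatives are expected to share the identical variant, whereas unrelated sporadic cases may carry different variants in the same gene, which is why the comparison moves from the variant level to the gene level.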
39
Summary
• Using less stringent filtering helps to recover some data (allow silent mutations, extend into the intronic regions, etc.).
• RNA-seq tools can be applied to find large indels from the short reads.
• RNA provides valuable proof of a variant's effect with simple experiments.
• Some genetic information is missed by exome sequencing. Genome sequencing might overcome this.
• Some imagination and luck are needed.
40