Download SNP Discovery and Genotyping Workshop (PowerPoint)

Document related concepts

Genetic testing wikipedia , lookup

Genetic drift wikipedia , lookup

Copy-number variation wikipedia , lookup

Nutriepigenomics wikipedia , lookup

Heritability of IQ wikipedia , lookup

Human genome wikipedia , lookup

Genome (book) wikipedia , lookup

Gene expression profiling wikipedia , lookup

Metagenomics wikipedia , lookup

Designer baby wikipedia , lookup

Artificial gene synthesis wikipedia , lookup

Gene desert wikipedia , lookup

Pathogenomics wikipedia , lookup

Site-specific recombinase technology wikipedia , lookup

Gene expression programming wikipedia , lookup

Behavioural genetics wikipedia , lookup

Population genetics wikipedia , lookup

Hardy–Weinberg principle wikipedia , lookup

Microevolution wikipedia , lookup

Public health genomics wikipedia , lookup

Genealogical DNA test wikipedia , lookup

Human genetic variation wikipedia , lookup

RNA-Seq wikipedia , lookup

Pharmacogenomics wikipedia , lookup

Warfarin wikipedia , lookup

HLA A1-B8-DR3-DQ2 wikipedia , lookup

Molecular Inversion Probe wikipedia , lookup

A30-Cw5-B18-DR3-DQ2 (HLA Haplotype) wikipedia , lookup

Haplogroup G-M201 wikipedia , lookup

SNP genotyping wikipedia , lookup

Tag SNP wikipedia , lookup

Transcript
SNP Discovery and Analysis:
Application to Association Studies
Mark J. Rieder, PhD
Dana Crawford, PhD
Deborah Nickerson, PhD
SeattleSNPs PGA
July 19-20, 2005
Practical Aspects of SNP Association Studies
1. SNP Discovery:
Where do I find SNPs to use in my association studies?
(e.g. databases, direct resequencing)
2. SNP Selection:
How do I choose SNPs that are informative?
(i.e. assessing SNP correlation - linkage disequilibrium)
3. SNP Associations:
What analyses can I perform after genotyping these SNPs?
(e.g. single SNP data, haplotype data)
4. SNP Replication/Function:
How is function predicted or assessed.
(e.g. nonsynonymous SNPs, conserved non-coding regions (CNS)
transcription factor binding sites, gene expression)
SeattleSNPs Program for Genomic Applications: Overview
Aim 1: To establish a variation discovery resource capable of
comprehensive resequencing of candidate genes related to HLBS.
Biological Focus: Inflammation
Genes and Pathways: Coagulation, Complement, Cytokines
Interacting Partners
SNPs in Candidate Genes
SeattleSNPs
Average Gene Size - 26.5 kb ~ Compare 2 haploid - 1 in 1,200 bp
~130 SNPs
(200 bp) - 15,000,000 SNPs
~ 44 SNPs > 0.05 MAF
(600 bp) - 6,000,000 SNPs
SeattleSNPs PGA: Candidate Gene SNP Resource
•
•
•
•
•
4.9 Mb in 47 individuals = 230 Mb total sequence
Define sequence diversity - catalogue all SNPs
Select “optimal” tagSNPs sets
Determine haplotype structure
Provide necessary baseline data for association studies
Warfarin Pharmacogenetics
1. Background
• Warfarin characteristics
• Pharmacokinetics/Pharmacodynamics
• Discovery of VKORC1
2. VKORC1 - SNP Discovery
3. VKORC1 - SNP Selection (tagSNPs)
4. VKORC1 - SNP Testing
• SNP/Haplotype Inference
• Haplotype Inference, Testing
5. VKORC1 - SNP Replication/Function
Pharmacogenomics as a Model for Association Studies
Clear genotype-phenotype link
intervention
variable response
Pharmacokinetics - 5x variation
Quantitative intervention and response
drug dose, response time, metabolism rate, etc.
Target/metabolism of drug generally known
gene target that can be tested directly with response
Reduce variability and identify outliers.
Prospective testing
Personalized Medicine
Warfarin Background
• Commonly prescribed oral anti-coagulant
• In 2003, 21.2 million prescriptions were written for
warfarin (Coumadin)
• Prescribed following MI, atrial fibrillation, stroke,
venous thrombosis, prosthetic heart valve replacement,
and following major surgery
• Difficult to determine effective dosage
- Narrow therapeutic range
- Monitoring of prothrombin time (INR) - 2.0 - 3.0
- Large inter-individual variation
50
Ave: 5.2 mg/d
n = 186
European-American
No. of patients
40
30x dose variability
30
20
Add warfarin dose distribution
10
0
0
2
4
6
8
10
12
14
16
Warfarin Dose (mg/d)
Patient/Clinical/Environmental Factors
Pharmacokinetic/Pharmacodynamic - Genetic
Warfarin inhibits the vitamin K cycle
Warfarin
Epoxide
Reductase
CYP2C9
Inactivation
Pharmacokinetic
 -Carboxylase
(GGCX)
Vitamin K-dependent clotting factors
(FII, FVII, FIX, FX, Protein C/S/Z)
Warfarin Metabolism (Pharmacokinetics)
Major pathway for termination of pharmacologic effect
is through metabolism of S-warfarin in the liver by CYP2C9
• CYP2C9 SNPs alter warfarin metabolism:
CYP2C9*1 (WT) - normal
CYP2C9*2 (Arg144Cys) - low/intermediate
CYP2C9*3 (Ile359Leu) - low
• CYP2C9 alleles occur at a significant minor allele frequency
European: *2 - 10.7% *3 - 8.5 %
Asian: *2 - 0% *3 - 1-2%
African-American: *2 - 2.9% *3 - 0.8%
Effect of CYP2C9 Genotype on Anticoagulation-Related Outcomes
(Higashi et al., JAMA 2002)
WARFARIN MAINTENANCE DOSE
TIME TO STABLE ANTICOAGULATION
CYP2C9-WT ~90 days
CYP2C9-Variant ~180 days
*2 or *3 carriers take longer to
reach stable anticoagulation
N
127
28
4
18
3
5
- Variant alleles have significant clinical impact
- Still large variability in warfarin dose (15-fold) in *1/*1 “controls”?
Analysis of Independent Predictors of Warfarin Dose
Adapted from Gage et al., Thromb Haemost, 2004
Variable
Change in Warfarin Dose
P value
Target INR, per 0.5 increase
BMI, per SD
Ethnicity (African-American, [Asian])
21%
14%
13%, [ 10-15%]
<0.0005
<0.0001
0.003
Age, per decade
Gender, Female
Drugs (Amiodarone)
CYP2C9*2, per allele
CYP2C9*3, per allele
13%
12%
24%
19%
30%
<0.0001
<0.0001
0.007
<0.0001
<0.0001
~ 30% of the variability in warfarin dose is explained by these factors
What other candidate genes are influencing warfarin dosing?
Warfarin acts as a vitamin K antagonist
Pharmacodynamic
Epoxide
Reductase
Warfarin
CYP2C9
Inactivation
 -Carboxylase
(GGCX)
Vitamin K-dependent clotting factors
(FII, FVII, FIX, FX, Protein C/S/Z)
New Target Protein for Warfarin
Epoxide
Reductase
(VKORC1)
 -Carboxylase
(GGCX)
Clotting Factors
(FII, FVII, FIX, FX, Protein C/S/Z)
5 kb - chr 16
Rost et al. & Li, et al., Nature (2004)
Warfarin Resistance VKORC1 Polymorphisms
Rost, et. al. Nature (2004)
• Rare non-synonymous mutations in VKORC1 causative for warfarin resistance (15-35 mg/d)
• NO non-synonymous mutations found in ‘control’ chromosomes (n = ~400)
Frequency
Inter-Individual Variability in Warfarin Dose: Genetic Liabilities
SENSITIVITY
CYP2C9 coding
SNPs - *3/*3
Common
VKORC1
non-coding
SNPs?
RESISTANCE
VKORC1
nonsynonymous
coding
SNPs
0.5
5
15
Warfarin maintenance dose (mg/day)
SNP Discovery: Resequencing VKORC1
• PCR amplicons --> Resequencing of the complete genomic region
• 5 Kb upstream and each of the 3 exons and intronic segments; ~11 Kb
• SeattleSNPs PGA - pga.gs.washington.edu (24 African-Am./23 Europeans)
• Warfarin treated clinical patients (UWMC): 186 European
• Other populations: 96 European, 96 African-Am., 120 Asian
SNP Discovery: Resequencing Results
Summary of PGA samples (European, n = 23)
Total: 13 SNPs identified
10 common/3 rare (<5% MAF)
Clinical Samples (European patients n = 186)
Total: 28 SNPs identified
10 common/18 rare (<5% MAF)
15 - intronic/regulatory
7 - promoter SNPs
2 - 3’ UTR SNPs
3 - synonymous SNPs
1 - nonsynonymous
- single heterozygous indiv. - highest warfarin dose = 15.5 mg/d
How does the comprehensive SNP discovery compare to
what was known for this gene?
SNP Discovery: dbSNP database
dbSNP
-NCBI SNP database
SNP Discovery: dbSNP database (VKORC1)
SeattleSNPs Resequencing
28 SNPs --> 15 SNPs gene region
10 dbSNPs
• 8/10 confirmations
• 3 frequency/genotype data
• 7 new dbSNP entries generated
by SeattleSNPs resequencing
• 8 dbSNPs/15 SNPs (~50%)
SNP Discovery: dbSNP database
Nickerson and Kruglyak, Nature Genetics, 2001
Mar 2005 - 5.0 million (validated - 1/600 bp)
5.0/10.0 = 50% of all common SNPs (validated)!
SNP discovery is dependent on your sample population size
Fraction of SNPs Discovered
2 chromosomes
{
GTTACGCCAATACAGGATCCAGGAGATTACC
GTTACGCCAATACAGCATCCAGGAGATTACC
1.0
96
48
24
16
8
0.5
2
0.0
0.0
0.1
0.2
0.3
Minor Allele Frequency (MAF)
0.4
0.5
SNP Discovery: dbSNP database
dbSNP (Perlegen/HapMap)
SeattleSNPs
25%
Minor Allele Freq. (MAF)
{
50%
75%
Minor Allele Freq. (MAF)
Rarer and population specific SNPs are found by resequencing
dbSNP: Increasing numbers of SNPs now have genotype data
6.00
Validated SNPs
SNPs with Genotypes
SNPs(millions)
5.00
Perlegen
Data
4.00
3.00
2.00
1.00
0.00
Jan-03
Mar-03
Jun-03
Aug-03
Oct-03
Jan-04
Mar-04
dbSNP Release
Jun-04
Sep-04
Jan-05
Mar-05
HapMap
Phase II
Perlegen
Current State of dbSNP
12.00
Total Reference SNPs
Validated SNPs
Genotyped SNPs
SNPs(millions)
10.00
8.00
6.00
4.00
2.00
0.00
Jan-03
Mar-03 Jun-03 Aug-03
Oct-03
Jan-04
Mar-04 Jun-04
Sep-04
Jan-05
Mar-05
dbSNP Release
Many SNPs left to validate and characterize.
Development of a genome-wide SNP map: How many SNPs?
Nickerson and Kruglyak, Nature Genetics, 2001
~ 10 million common SNPs (>1- 5% MAF) - 1/300 bp
Mar 2005 - 5.0 million (validated - 1/600 bp)
5.0/10.0 = 50% of all common SNPs validated!
Coming Soon! 5.0 million validated SNPs with genotypes!
SNP Discovery: dbSNP database
dbSNP Issues:
Not comprehensive catalog (50% of SNPs)
Is the data confirmed? (50% are validated)
Information about allele frequency/population (50%)
No information about SNP correlations (linkage disequilibrium)
genotyping efficiency
SNP Selection: Using Linkage Disequilibrium
Frequency
• Common SNPs
• VKORC1 - 28 total - 10 SNPs > 10% MAF
Warfarin Dose (mg/d)
Does common variation in VKORC1 have a role in determining
warfarin dose?
• Evaluate linkage disequilibrium (non-random association of
genotype data)
SNP Selection: Using Linkage Disequilibrium
Site 1
Site 2
C : 50%
A : 50%
Maternal
T : 50%
G : 50%
Paternal
Possible
2-site comb.
C
C
T
T
A
G
A
G
Expected Freq.
0.5 X 0.5 = 0.25
0.5 X 0.5 = 0.25
0.5 X 0.5 = 0.25
0.5 X 0.5 = 0.25
Site 1
C
T
Observed Freq.
0.50 *
0.01
0.01
0.48 *
* Sites Correlated
Site 2
A
G
SNP Selection: Using Linkage Disequilibrium
• SNP discovery data (i.e. population of samples with genotypes)
• Find all correlated SNPs to minimize the total number of SNPs
• Maintains genetic information (correlations) for that locus
LD_Select - SNP tagging/binning algorithm - based on LD (r2), not haplotypes
Carlson, et al. AJHG (2004)
SNP Selection: VG/LD_Select on the Web
pga.gs.washington.ed/VG2
SNP Selection: tagSNP Data
SNP Selection: VKORC1 tagSNPs
SNP Testing: VKORC1 tagSNPs
Five Bins to Test
1.
2.
3.
4.
5.
381, 3673, 6484, 6853, 7566
2653, 6009
861
5808
e.g. Bin 1 - SNP 381
9041
Bin 1 - p < 0.001
Bin 2 - p < 0.02
Bin 3 - p < 0.01
Bin 4 - p < 0.001
Bin 5 - p < 0.001
C/C C/T T/T
SNP x SNP interactions - haplotype analysis?
VKORC1 Summary: SNP Discovery/SNP Selection
1. VKORC1 candidate gene for warfarin dose response
2. SNP discovery performed using PCR/resequencing to
catalog common SNPs
• 28 SNPs found
• 10 common SNPs
3. SNP discovery using dbSNP
• 8/10 dbSNPs confirmed
• 7 new SNPs added
4. SNP Selection using linkage disequilibrium
• 10 common SNPs (> 10% MAF)
• 5 informative SNPs for genotyping
Haplotypes in Genetic Association Studies
Two main approaches with haplotypes:
Haplotypes
Pick tagSNPs
Pick tagSNPs
Infer haplotypes
Genotype samples
Test for association
Haplotypes in Genetic Association Studies
1. How can you get haplotypes?
2. What information do you get from haplotypes?
3. How do you use haplotypes to find tagSNPs?
4. How do you use haplotypes to test for
associations?
Haplotypes – The Definition
“…a unique combination of genetic markers present
in a chromosome.” pg 57 in Hartl & Clark, 1997
Constructing Haplotypes
Collect pedigrees
Somatic cell hybrids
C/C, A/G
C/T, A/A
Rodent
Human
TT
GG
T/T, G/G
C/C, A/G
CC
AG
Hybrid
C/T, A/G
CT
AG
Allele-specific PCR
SNP 1
SNP 2
C/T
A/G
Constructing Haplotypes
Examples of Haplotype Inference Software:
EM Algorithm
Haploview
http://www.broad.mit.edu/mpg/haploview/index.php
Arlequin
http://lgb.unige.ch/arlequin/
PHASE v2.1
http://www.stat.washington.edu/stephens/software.html
HAPLOTYPER
http://www.people.fas.harvard.edu/~junliu/Haplo/docMain.htm
Haplotypes in SeattleSNPs
• >200 genes re-sequenced in inflammation response
• 2 populations: European- and African-Americans
• PHASEv2.0 results posted on website
• Interactive tool (VH1) to visualize and sort haplotypes
http://pga.gs.washington.edu
Haplotypes in SeattleSNPs
Haplotypes in SeattleSNPs
Haplotypes in SeattleSNPs
Haplotypes in SeattleSNPs
Haplotypes in SeattleSNPs
Haplotypes in SeattleSNPs
Haplotypes in SeattleSNPs
Haplotypes in SeattleSNPs
Haplotypes in SeattleSNPs
Haplotypes in SeattleSNPs
Haplotypes in SeattleSNPs
Haplotypes in Genetic Association Studies
Two main approaches with haplotypes:
Haplotypes
Pick tagSNPs
Genotype samples
Recombination
Natural selection
Haplotype block definition
Population history
Population demography
Pick tagSNPs
Infer haplotypes
Test for association
Measuring Pair-wise SNP Correlations
• SNP correlation described by linkage disequilibrium (LD)
• Pair-wise measures of LD: D´ and r2
D = pAB - pApB; D´ = D/Dmax
r2 =
D2
f(A1)f(A2)f(B1)f(B2)
Recombination
Power
Using LD and Haplotypes to Pick tagSNPs
• r2 is inversely related to power
1/r2
1,000 cases
1,000 controls
r2=1.0
1,250 cases
1,250 controls
Example: LDSelect
• D´ is related to recombination history
D´ = 1
D´ < 1
no recombination
historical recombination
Example: Haplotype “blocks”
r2 = 0.80
Haplotype “Blocks”
Daly etet
al 2001
Daly
al Nat. Genet. (2001)
Strong LD
Few Haplotypes
Represent most chromosomes
Block Definitions
Daly etet
al 2001
Daly
al Nat. Genet. (2001)
D´ [Gabriel et al Science (2002)]
Block Definitions
Four-gamete test:
A
B
a
b
A
B
a
A
b
b
a
B
<4 haplotypes, D´=1
block
4 haplotypes, D´<1
boundary
Haplotype Blocks and tagSNPs
Identifying blocks and tagSNPs:
• Manually
• Algorithms
– Haploview
Haplotype Blocks and tagSNPs
tagSNPs
IL1B:
19 SNPs (MAF >5%)
4 “common” haplotypes
Haplotype Blocks and tagSNPs
Identifying blocks and tagSNPs:
• Manually
• Algorithms
– HaploView
LD and tagSNPs using Haploview
VKORC1
European-Americans
PHASEv2.1 data
Minimal set of tagSNPs based on r2
Where to Find Tagging Software
HaploBlockFinder
Haploview
LDSelect
SNPtagger
TagIT
tagSNPs
http://cgi.uc.edu/cgi-bin/kzhang/haploBlockFinder.cgi
http://www.broad.mit.edu/personal/jcbarret/haplo/
http://droog.gs.washington.edu/ldSelect.html
http://www.well.ox.ac.uk/~xiayi/haplotype/index.html
http://popgen.biol.ucl.ac.uk/software.html
http://www-rcf.usc.edu/~stram/tagSNPs.html
Haplotypes, TagSNPs, and Caveats
• Haplotypes are inferred
• Block-like structure assumed for some software
• Different block definitions
• Block boundaries sensitive to marker density
• Genotype savings may not be great (recombination)
Haplotypes in Genetic Association Studies
Two main approaches with haplotypes:
Haplotypes
Pick tagSNPs
Pick tagSNPs
Infer haplotypes
Genotype samples
Test for association
Genetic diversity of sample
Multi-SNP analysis
Multi-SNP testing: Haplotypes
Five tagSNPs (10 total SNPs)
186 warfarin patients (European)
PHASE v2.1
9 haplotypes/5 common (>5%)
Multi-SNP testing: Haplotypes
Test for association between haplotype and warfarin
dose using multiple linear regression
Adjusted for all significant covariates: age, sex, amiodarone, CYP2C9 genotype
Multi-SNP testing: Haplotypes
Explore the evolutionary relationship across haplotypes
(381, 3673, 6484, 6853, 7566)
5808
861
9041
CCGATCTCTG-H1
CCGAGCTCTG-H2
A
TCGGTCCGCA-H7
TAGGTCCGCA-H8
TACGTTCGCG-H9
B
VKORC1 haplotypes cluster into divergent clades
Patients can be assigned a clade diplotype:
e.g. Patient 1 - H1/H2 = A/A
Patient 2 - H1/H7 = A/B
Patient 3 - H7/H9 = B/B
VKORC1 clade diplotypes show a strong association with warfarin dose
Low
High
Warfarin Dose (mg/d)
8
†
A/A
A/B
B/B
†
6
*
*
*
4
2
0
AA
BB
AB
All patients
AA
AB
BB
2C9 WT patients
AA
AB
BB
2C9 VAR patients
(n = 181)
(n = 124)
(n = 57)
Independent of INR levels across all groups
Multi-SNP testing: Haplotypes
• European - mean ~ 5 mg/d
• African-American - higher ~ 6.0-7.0 mg/d
• Asian - lower ~ 3.0-3.5 mg/d
Hypothesis: VKORC1 haplotypes contribute to racial
variability in warfarin dosing.
•
“Control” populations: 120 Europeans
96 African-Americans
120 Asian
Multi-SNP testing: Haplotypes
Explore the evolutionary relationship across populations
B
(11%)
B
(58%)
A
(37%)
European (CEPH)
Clade Distribution
Clade A = Low
Clade B = High
A
(89%)
Asian (Han)
Clade Distribution
Low dose phenotype
Other
(39%)
A
(14%)
B
(47%)
African-American
Clade Distribution
High dose phenotype
Common Errors in Association Studies
• Small sample size
Bell and Cardon (2001)
• Subgroup analysis and multiple testing
• Random error
• Poorly matched control group
• Failure to attempt study replication
e.g., Second case/control study
Gene expression studies
• Failure to detect LD with adjacent loci
• Overinterpreting results and positive publication bias
• Unwarranted ‘candidate gene’ declaration after identifying
association in arbitrary genetic region
SNP Replication: VKORC1
8
Warfarin Dose (mg/d)
Univ. of Washington
n = 185
†
†
6
*
*
*
4
2
0
AA AB BB
All patients
Brian Gage
Howard McCleod
Charles Eby
21% variance in
dose explained
8
Warfarin Dose (mg/d)
Washington University
n = 386
AA AB BB
2C9 WT patients
†
†
6
*
AA AB BB
2C9 VAR patients
*
†
4
2
0
AA AB BB
All patients
AA AB BB
2C9 WT patients
AA AB BB
2C9 VAR patients
SNP Function: VKORC1 Expression
mechanism
No nonsynonymous SNPs
Several SNPs are present in evolutionarily conserved non-coding regions
- mRNA expression in human liver cell lines
SNP Function: VKORC1 Expression
Expression in human liver tissue (n = 53) shows a graded
change in expression.
VKORC1 SNP alters liver-specific binding site
SNP Discovery and Analysis
Application to Association Studies
Summary
• Databases and resources available for SNP discovery
• Software for tagSNP selection available
• Both single and multi-SNP analysis are useful
• Replication required by several journals
SeattleSNPs
Genotyping Service
• Free genotyping (BeadArray or SNPlex)
• Emphasis on young investigators
• Research related to heart, lung, blood, or sleep disorders
• Moderate to large population samples
• Apply at pga.gs.washington.edu
• Due: October 15th, 2005
SNP Typing Formats
Scale
Microtiter
Plates - Fluorescence
Low
eg. Taqman - Good for a few markers - lots of
samples - PCR prior to genotyping
Size Analysis by Electrophoresis
Medium
eg. SNPlex - Intermediate Multiplexing
reduces costs - Genotype directly on
genomic DNA - new paradigm for high throughput
Arrays - Custom or Universal
eg. Illumina, ParAllele, Affymetrics - Highly multiplexed
- 1,500 SNPs and beyond (500K+)
High
Taqman
Genotyping with fluorescence-based homogenous assays
(single-tube assay) = 1 SNP/ tube
SNP Typing Formats
Scale
Microtiter
Plates - Fluorescence
Low
eg. Taqman - Good for a few markers - lots of
samples - PCR prior to genotyping
Size Analysis by Electrophoresis
Medium
eg. SNPlex - Intermediate Multiplexing
reduces costs - Genotype directly on
genomic DNA - new paradigm for high throughput
Arrays - Custom or Universal
eg. Illumina, ParAllele, Affymetrics - Highly multiplexed
- 1,500 SNPs and beyond (500K+)
High
Technological Leap - No advance PCR
Universal PCR after preparing multiple regions for analysis Several based on primer specific on genomic DNA followed by
PCR of the ligated products - different strategies
and different readouts.
SNPlex, Illumina, Parallele
Also, reduced representation - Affymetrix
- cut with restriction enzyme, then ligate linkers
and amplify from linkers and follow by chip
hybridization to read out.
Detection
9. Characterize on Capillary Sequencer
SNP 1
SNP 2
SNP Typing Formats
Scale
Microtiter
Plates - Fluorescence
Low
eg. Taqman - Good for a few markers - lots of
samples - PCR prior to genotyping
Size Analysis by Electrophoresis
Medium
eg. SNPlex - Intermediate Multiplexing
reduces costs - Genotype directly on
genomic DNA - new paradigm for high throughput
Arrays - Custom or Universal
eg. Illumina, ParAllele, Affymetrics - Highly multiplexed
- 1,500 SNPs and beyond (500K+)
High
Multiplexed Genotyping
C
- Universal Tag Readouts
G
A
T
Locus 2 Specif ic Sequence
Locus 1 Specific Sequence
Tag1 sequence
Tag2 sequence
cTag1 sequence
cTag2 sequence
Subst rat e
Bead or Chip
Subst rat e
Bead or Chip
Bead Array
Chip Array
Tag 1
Tag 2
Tag 3
Tag 4
Illumina
Multiplex ~1,000 SNPs
ParAllele
Not dependent on primary PCR
Affymetrics
Illumina Platform
96 Multi-array Matrix matches standard microtiter
plates
~ 1,500 SNPs typed per matrix for 96 samples
Affymetrix’s 100K Chip
Optimized for
250-2000bp
http://www.affymetrix.com/products/arrays/specific/100k.affx
High Throughput Chip Formats
Defining the scale of the genotyping project is key to selecting
an approach:
1000 individuals
5 to 10 SNPs in a candidate gene - Many approaches
(expensive ~ 0.60 per SNP/genotype)
$6,000
48 ( to 96) SNPs in a handful of candidate genes
(~ 0.25 per SNP/genotype)
$12,000
384 0 1,536 SNPs
(~0.15 - 0.08 per SNP/genotype)
$57,600-122,880
10,000 cSNPs - defined format
(~0.05 per SNP/genotype)
$500,000
100,000 Genic SNPs - defined format
(~0.005 per SNP/genotype
$500,000
500,000 SNPs defined format
(~0.004 per SNP/ genotype)
$2,000,000
Acknowledgements
Allan Rettie, Medicinal Chemistry
Alex Reiner
Dave Veenstra
Dave Blough
Ken Thummel
Washington University
Brian Gage
Howard McLeod
Charles Eby
Joyce You - Hong Kong
Noel Hastings
Maggie Ahearn
Josh Smith
Chris Baier
Peggy Dyer-Robertson