Single nucleotide polymorphisms
Usman Roshan
SNPs
• DNA sequence variations that occur when a
single nucleotide is altered.
• Must be present in at least 1% of the
population to be a SNP.
• Occur every 100 to 300 bases along the 3
billion-base human genome.
• Many have no effect on cell function but some
could affect disease risk and drug response.
Toy example
SNPs on the chromosome
[Figure labels: SNP, Chromosome, Gene]
Perl exercise
• Determining SNPs from a pairwise
genome alignment:
– Can we solve this problem with a Perl
script?
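One possible approach, sketched here in Python rather than Perl: walk the two aligned sequences column by column and report every position where the bases differ. The function name `find_snps`, the gap character `-`, and the sample sequences are assumptions made for illustration. Two genomes alone can only yield candidate sites; the 1% population frequency requirement cannot be checked this way.

```python
def find_snps(seq_a, seq_b, gap="-"):
    """Return (position, base_a, base_b) for each single-base mismatch.

    Assumes seq_a and seq_b are the two rows of one pairwise alignment,
    so they have equal length. Columns containing a gap are skipped
    because they are insertions/deletions, not substitutions.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    snps = []
    for pos, (x, y) in enumerate(zip(seq_a.upper(), seq_b.upper())):
        if x != y and gap not in (x, y):
            snps.append((pos, x, y))
    return snps


# Made-up toy alignment, 0-based positions.
print(find_snps("ACGTTAG-CA", "ACATTGGTCA"))  # [(2, 'G', 'A'), (5, 'A', 'G')]
```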
Bi-allelic SNPs
• Most SNPs have one of two nucleotides
at a given position
• For example:
– A/G denotes the varying nucleotide as
either A or G. We call each of these an
allele
– Most SNPs have two alleles (bi-allelic)
Perl exercise
• Determining SNP type from a multiple
genome alignment.
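A minimal sketch of this exercise, in Python rather than Perl: for every column of the multiple alignment, collect the distinct nucleotides and report columns with two or more as candidate SNPs in the A/G notation used above. The name `snp_types` and the toy alignment are hypothetical.

```python
def snp_types(alignment, gap="-"):
    """Map each variable column index to its alleles, e.g. {2: "A/G"}.

    `alignment` is a list of equal-length aligned sequences.
    Gap characters are ignored when collecting alleles.
    """
    if len({len(row) for row in alignment}) > 1:
        raise ValueError("aligned sequences must have equal length")
    result = {}
    for col, bases in enumerate(zip(*alignment)):
        alleles = sorted({b.upper() for b in bases} - {gap})
        if len(alleles) >= 2:  # exactly 2 alleles means bi-allelic
            result[col] = "/".join(alleles)
    return result


# Made-up alignment of four genomes.
rows = ["ACGTA", "ACATA", "ACGTC", "ACATT"]
print(snp_types(rows))  # {2: 'A/G', 4: 'A/C/T'}
```

Here column 2 is bi-allelic (A/G), while column 4 carries three alleles.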
SNP genotype
• We inherit two copies of each chromosome
(one from each parent)
• For a given SNP the genotype defines the
type of alleles we carry
• Example: for the SNP A/G one’s genotype may be
– AA if both copies of the chromosome have A
– GG if both copies of the chromosome have G
– AG or GA if one copy has A and the other has G
– The first two cases are called homozygous and the latter two are heterozygous
SNP genotyping
Perl exercise
• SNP encoding:
– Convert SNP genotype from a character
sequence to numeric one
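The slide does not say which numeric code to use. One common choice, assumed here, is to count the copies of a chosen allele, giving 0, 1 or 2 per SNP. A sketch in Python rather than Perl, with hypothetical names `encode_genotype` and `encode_sequence`:

```python
def encode_genotype(genotype, counted_allele):
    """Encode a two-letter genotype as the number of copies of
    `counted_allele` it carries: 0, 1 or 2 (an assumed encoding)."""
    genotype = genotype.upper()
    if len(genotype) != 2:
        raise ValueError("genotype must have exactly two alleles")
    return genotype.count(counted_allele.upper())


def encode_sequence(genotypes, counted_alleles):
    """Encode one subject's genotypes, one counted allele per SNP."""
    return [encode_genotype(g, a) for g, a in zip(genotypes, counted_alleles)]


# For the SNP A/G, counting G: AA -> 0, AG and GA -> 1, GG -> 2.
print(encode_sequence(["AA", "AG", "GA", "GG"], ["G", "G", "G", "G"]))  # [0, 1, 1, 2]
```

With this encoding the two heterozygous spellings AG and GA map to the same value.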
Real SNPs
• SNP consortium: snp.cshl.org
• SNPedia: www.snpedia.com
Application of SNPs:
association with disease
• Experimental design to detect cancer
associated SNPs:
– Pick random humans with and without
cancer (say breast cancer)
– Perform SNP genotyping
– Look for associated SNPs
– Also called genome-wide association study
Case-control example
• Study of 100 people:
– Case: 50 subjects with cancer
– Control: 50 subjects without cancer
• Count number of dominant and recessive alleles and form a contingency table

           #Recessive alleles   #Dominant alleles
Case       10                   40
Control    2                    48
Perl exercise
• Contingency table:
– Compute contingency table given case and
control SNP genotype data
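A possible sketch in Python rather than Perl, for a single SNP. It assumes the recessive allele is known and passed in, and it counts two alleles per subject. That is one reading of the exercise; the table in the case-control example sums to 50 per row, so the slides may be counting per subject instead. All names and the genotype data are made up.

```python
def allele_counts(genotypes, recessive):
    """Return (recessive_count, dominant_count) over all alleles."""
    rec = sum(g.upper().count(recessive.upper()) for g in genotypes)
    total = sum(len(g) for g in genotypes)
    return rec, total - rec


def contingency_table(case_genotypes, control_genotypes, recessive):
    """Rows: case, control. Columns: #recessive, #dominant alleles."""
    return [
        list(allele_counts(case_genotypes, recessive)),
        list(allele_counts(control_genotypes, recessive)),
    ]


# Toy data for the SNP A/G, treating G as the recessive allele.
cases = ["AG", "GG", "AA", "AG"]
controls = ["AA", "AA", "AG", "AA"]
print(contingency_table(cases, controls, recessive="G"))  # [[4, 4], [1, 7]]
```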
Odds ratio
• Odds of recessive in cancer = a/b = e
• Odds of recessive in no-cancer = c/d = f
• Odds ratio of recessive in cancer vs no-cancer = e/f

            #Recessive alleles   #Dominant alleles
Cancer      a                    b
No cancer   c                    d
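As a quick check of the definition, the following Python sketch computes the odds ratio from the four cells a, b, c, d and applies it to the counts of the case-control example (10, 40, 2, 48). The function name `odds_ratio` is an assumption.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio (a/b) / (c/d) for the table [[a, b], [c, d]].

    Computed as (a*d) / (b*c), which is algebraically the same.
    """
    if b == 0 or c == 0:
        raise ZeroDivisionError("odds ratio undefined when b or c is 0")
    return (a * d) / (b * c)


# Case: 10 recessive, 40 dominant. Control: 2 recessive, 48 dominant.
print(odds_ratio(10, 40, 2, 48))  # 6.0
```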
Risk ratio (Relative risk)
• Probability of recessive in cancer = a/(a+b) = e
• Probability of recessive in no-cancer = c/(c+d) = f
• Risk ratio of recessive in cancer vs no-cancer = e/f

            #Recessive alleles   #Dominant alleles
Cancer      a                    b
No cancer   c                    d
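The same kind of check for the risk ratio, again as a small Python sketch with an assumed function name `risk_ratio`, applied to the counts of the case-control example:

```python
def risk_ratio(a, b, c, d):
    """Risk ratio e/f with e = a/(a+b) and f = c/(c+d)."""
    e = a / (a + b)  # probability of recessive in cancer
    f = c / (c + d)  # probability of recessive in no-cancer
    return e / f


# Case: 10 recessive, 40 dominant. Control: 2 recessive, 48 dominant.
print(round(risk_ratio(10, 40, 2, 48), 2))  # 5.0
```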
Odds ratio vs Risk ratio
• Risk ratio has a natural interpretation
since it is based on probabilities
• In a case-control model we cannot
calculate the probability of cancer given
recessive allele. Subjects are chosen
based on disease status and not allele type
• Odds ratio shows up in logistic
regression models
Example
• Odds of recessive in case = 15/35
• Odds of recessive in control = 2/48
• Odds ratio of recessive in case vs control = (15/35)/(2/48) = 10.3
• Risk of recessive in case = 15/50
• Risk of recessive in control = 2/50
• Risk ratio of recessive in case vs control = (15/50)/(2/50) = 15/2 = 7.5

           #Recessive alleles   #Dominant alleles
Case       15                   35
Control    2                    48
Odds ratios in genome-wide
association studies
• Higher odds ratio means stronger
association
• Therefore SNPs with highest odds
ratios should be used as predictors or
risk estimators of disease
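• Odds ratio is generally further from 1 than the risk ratio (higher when both are above 1)
• Both are similar when the odds are small, i.e. when the recessive allele is rare in both groups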
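The relation between the two ratios can be worked out from the a, b, c, d cells used on the odds ratio and risk ratio slides. This is a short derivation added for illustration, not part of the original slides:

```latex
\[
\mathrm{OR} = \frac{a/b}{c/d}, \qquad
\mathrm{RR} = \frac{a/(a+b)}{c/(c+d)}, \qquad
\frac{\mathrm{OR}}{\mathrm{RR}} = \frac{(a+b)/b}{(c+d)/d} = \frac{1 + a/b}{1 + c/d}
\]
```

When the odds a/b and c/d are both small, the factor on the right is close to 1 and the two ratios nearly agree. When a/b > c/d (odds ratio above 1), the factor exceeds 1, so the odds ratio is larger than the risk ratio. For the example with 15/35 and 2/48, the factor is (1 + 15/35)/(1 + 2/48) ≈ 1.37, and 7.5 × 1.37 ≈ 10.3.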
Statistical test of association
(P-values)
• P-value = probability of the observed data (or
more extreme data) under the null hypothesis
• Example:
– Suppose we are given a series of coin tosses
– We feel that a biased coin produced the tosses
– We can ask the following question: what is the probability
that a fair coin would produce tosses at least as extreme as those observed?
– If this probability is very small then the observed tosses
would be unlikely under a fair coin, which is evidence against it.
– In this example the null hypothesis is the fair coin and the
alternative hypothesis is the biased coin
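To make the coin example concrete, here is a minimal Python sketch of an exact two-sided p-value under the fair-coin null hypothesis. It sums the binomial probabilities of every head count at least as far from the expected half as the observed one. The function name `fair_coin_p_value` and the choice of a two-sided test are assumptions; the slides do not fix either.

```python
from math import comb


def fair_coin_p_value(heads, tosses):
    """Probability, under a fair coin, of a head count at least as far
    from tosses/2 as the observed one (exact two-sided binomial test)."""
    if not 0 <= heads <= tosses:
        raise ValueError("heads must be between 0 and tosses")
    observed_dev = abs(heads - tosses / 2)
    extreme = sum(
        comb(tosses, k)
        for k in range(tosses + 1)
        if abs(k - tosses / 2) >= observed_dev
    )
    return extreme / 2 ** tosses


# 9 heads in 10 tosses: outcomes 0, 1, 9 and 10 heads are as extreme.
print(fair_coin_p_value(9, 10))  # 0.021484375
```

A value this small suggests that 9 heads in 10 tosses would be unusual for a fair coin, while 5 heads in 10 gives a p-value of 1.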