Human Genetics, part I
Liisa Kauppi (Keeney lab)
Mapping Mendelian and complex diseases
- Linkage mapping in pedigrees
- Association mapping in populations
Genes and Environment
“Natural” mutants only
Heritability: first-degree relatives of a patient are at greater risk
For type I diabetes, relative risk λ = 15 (6% risk in relatives / 0.4% in the general population)
Twin studies:
Adopted (separated in infancy)
Fraternal vs. identical twins
Biological vs. non-biological siblings
Genes and Human Disease
MENDELIAN
High penetrance,
Single gene
COMPLEX/MULTIFACTORIAL
DISEASE
Polygenic,
Reduced penetrance
Osteoporosis
Schizophrenia
“Common disease”
Cystic fibrosis
Asthma
Blood type
Height
EASY
Pure environment
Infectious disease
Snakebite
Body weight
HARD
Language
Polymorphic markers are needed for disease mapping
Microsatellites
Tandem arrays of simple repeats, for example (CA)n, n=15…27
MULTIALLELIC
Single nucleotide polymorphisms (SNPs), e.g. A/G
Abundant, perhaps 1 every 300 bp
- RFLPs
Mostly non-coding
BI-ALLELIC
Genotype frequencies: Hardy-Weinberg equation
B allele has frequency p
b allele has frequency q
p + q = 1

           p (B)       q (b)
p (B)      p^2 (BB)    pq (Bb)
q (b)      pq (Bb)     q^2 (bb)

p^2 (BB) + 2pq (Bb) + q^2 (bb) = 1
Hardy-Weinberg equilibrium
How are recessive traits maintained in a population?
HWE of allele frequencies: p^2 + 2pq + q^2 = 1
Hypothetical example:
in Sardinia, 1 in 5 individuals have straight hair
This trait is determined by a single gene and it is recessive.
S allele = curly hair, s allele = straight hair
Frequency of s/s homozygotes is 0.2
Frequency of s allele is √0.2 ≈ 0.45
Frequency of S allele is 1 - 0.45 = 0.55
Gametes for next generation:

           S (0.55)               s (0.45)
S (0.55)   0.55^2 = 0.3           0.55 x 0.45 = 0.25
s (0.45)   0.55 x 0.45 = 0.25     0.45^2 = 0.2
Frequencies of genotypes and alleles remain unchanged from one generation to the next.
HWE allows calculations of carrier frequencies for recessive traits
(with caution)
Example: Cystic fibrosis, alleles CF and cf
Incidence 1/2000 births
p^2 + 2pq + q^2 = 1
Frequency of cf/cf homozygotes is 1/2000 = 0.0005
Frequency of cf allele is √0.0005 ≈ 0.022
Frequency of CF allele is 1 - 0.022 = 0.978
Frequency of CF/cf heterozygotes is 2 x 0.978 x 0.022 = 0.043
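The carrier calculation for a recessive trait can be sketched in a few lines of Python. This is only an illustration under the slide's assumptions (Hardy-Weinberg equilibrium, a single fully recessive allele); the function name `carrier_frequency` is made up for this example.

```python
import math

def carrier_frequency(incidence):
    """Return (q, p, carriers) for a recessive trait, assuming HWE.

    incidence is the frequency of affected homozygotes (q^2).
    """
    q = math.sqrt(incidence)   # recessive allele frequency
    p = 1.0 - q                # frequency of the other allele
    carriers = 2.0 * p * q     # heterozygote (carrier) frequency
    return q, p, carriers

# Cystic fibrosis example from the slide: incidence 1/2000 births
q, p, carriers = carrier_frequency(1 / 2000)
print(f"q = {q:.3f}, p = {p:.3f}, carriers = {carriers:.3f}")
```

Without intermediate rounding the carrier frequency comes out at about 0.044 (roughly 1 in 23); the 0.043 on the slide results from rounding the allele frequencies first.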
So what if genotypes at a locus are not in HWE?
p^2 + 2pq + q^2 = 1
Suggests that assumptions are not met
Example: heterozygote deficit could arise from recent admixture
                                  Population 1      Population 2      Mixed sample
n                                 1000              1000              2000
B freq                            0.9               0.1               0.5
b freq                            0.1               0.9               0.5
HWE frequencies (BB+Bb+bb)        0.81+0.18+0.01    0.01+0.18+0.81    0.25+0.5+0.25
HWE expected counts (BB+Bb+bb)    810+180+10        10+180+810        500+1000+500
Observed counts (BB+Bb+bb)                                            820+360+820
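The admixture numbers can be checked with a short Python sketch. It assumes each source population is itself in HWE and that the two are pooled without interbreeding; the chi-square statistic is an addition for illustration and is not on the slide.

```python
def hwe_counts(p, n):
    """Expected BB, Bb, bb counts for B allele frequency p in n individuals."""
    q = 1.0 - p
    return [p * p * n, 2.0 * p * q * n, q * q * n]

pop1 = hwe_counts(0.9, 1000)   # about 810, 180, 10
pop2 = hwe_counts(0.1, 1000)   # about 10, 180, 810

# Pooling the two samples gives the observed genotype counts
observed = [a + b for a, b in zip(pop1, pop2)]
n = sum(observed)

# B allele frequency in the mixed sample, then its HWE expectation
p_mixed = (2.0 * observed[0] + observed[1]) / (2.0 * n)
expected = hwe_counts(p_mixed, n)

# Goodness-of-fit statistic comparing observed with HWE-expected counts
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

print([round(x) for x in observed], [round(x) for x in expected], round(chi2, 1))
```

The mixed sample has 360 heterozygotes where HWE predicts 1000, so the statistic is very large (about 819), even though each source population is in equilibrium on its own.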
Departure from HWE (heterozygote excess):
the Prion protein gene and human disease
• PRNP gene linked to prion diseases e.g. CJD, kuru
• A common polymorphism, M129V, influences the course of
these diseases: the MV heterozygous genotype is protective
• Kuru acquired from ritual cannibalism was reported (1950s) in
the Fore people of Papua New Guinea, where it caused up to
1% annual mortality
• Departure from Hardy-Weinberg equilibrium for the M129V
polymorphism is seen in Fore women over 50 (23/30
heterozygotes, P = 0.01)
Linkage studies - recombination in a family
how often are 2 loci separated by meiotic recombination?
[Pedigree figure, generations I-III: 2 loci on same chromosome; informative and uninformative meioses]
Offspring scored NR, NR, R, NR, NR, R
Recombination fraction θ is 2/6 = 0.33
Recognizing recombinants
does the disease segregate with this marker?
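[Pedigree figure, generations I-III. Marker genotypes as extracted: I: 1, 25, 16; II: 6, 21, 34; III: 31, 32, 41, 41, 42, 32]
Offspring scored NR, NR, NR, NR, NR, R
Recombination fraction θ is 1/6 = 0.167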
Recognizing recombinants
Often samples are missing
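[Pedigree figure, generations I-III, generation I not sampled. Marker genotypes: II: 21, 34; III: 31, 32, 41, 41, 42, 32]
With the phase unknown, each offspring scores NR under one phase and R under the other (NR or R for five offspring, R or NR for the sixth)
Recombination fraction θ is 1/6 = 0.167 or 5/6 = 0.833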
Recognizing recombinants
Tracing additional family members can help
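[Pedigree figure, generations I-III. Marker genotypes as extracted: I: 15, 16; II: 56, 21, 34; III: 31, 32, 41, 41, 42, 32]
Offspring scored NR, NR, NR, NR, NR, R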
But are these identical by descent?
Which marker is the disease locus closest to?
Lod scores
Logarithm of odds (Lod) score Z
Z = log10 (likelihood of loci being linked / likelihood of loci not being linked)
For the example pedigree with 1/6 recombinants:
Z = log10 [ (1 - 0.167)^5 x 0.167 / (0.5)^6 ] = 0.632
Lod scores between -2 and +3 are inconclusive
Below -2 -> exclusion
Above +3 -> linkage
Requires a precise genetic model
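The two-point lod score above can be reproduced with a minimal Python sketch. It assumes fully informative, phase-known meioses, as in the example pedigree; the function name `lod_score` and the grid search over θ are illustrative choices, not part of the slides.

```python
import math

def lod_score(recombinants, meioses, theta):
    """Two-point lod score for phase-known meioses at recombination fraction theta."""
    nonrecombinants = meioses - recombinants
    linked = (1.0 - theta) ** nonrecombinants * theta ** recombinants
    unlinked = 0.5 ** meioses          # theta = 0.5 means no linkage
    if linked == 0.0:
        return float("-inf")           # e.g. theta = 0 with a recombinant observed
    return math.log10(linked / unlinked)

# Example pedigree: 1 recombinant in 6 meioses, evaluated at theta = 1/6
z = lod_score(1, 6, 1 / 6)
print(round(z, 3))

# Scan theta from 0.01 to 0.50 to find where the lod score peaks
grid = [i / 100 for i in range(1, 51)]
best_theta = max(grid, key=lambda t: lod_score(1, 6, t))
print(best_theta)
```

The score peaks near the observed recombination fraction, and with only six meioses it stays well inside the inconclusive range.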
Which marker is the disease locus closest to?
Multi-point lod scores
chr 3p12-14
Waardenburg syndrome type 2
After Hughes et al. (1994) Nature Genet 7, 509-512
Multifactorial diseases (no simple Mendelian inheritance pattern)
Sib-pair analysis
Parents: 21 x 34; first sib: 32

Second sib    Number of shared parental alleles    Probability
32            2                                    1/4
31 or 42      1                                    1/2
41            0                                    1/4
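The 1/4, 1/2, 1/4 sharing probabilities follow from enumerating all sib pairs. A small Python sketch, assuming the parental genotypes 21 and 34 from the slide (all four alleles distinguishable) and Mendelian segregation; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

parent_a = (2, 1)   # marker alleles of one parent
parent_b = (3, 4)   # marker alleles of the other parent

# Each child receives one allele from each parent: 4 equally likely genotypes
children = list(product(parent_a, parent_b))

# Count shared parental alleles over all 16 equally likely sib pairs
counts = Counter()
for sib1, sib2 in product(children, repeat=2):
    shared = int(sib1[0] == sib2[0]) + int(sib1[1] == sib2[1])
    counts[shared] += 1

total = sum(counts.values())
probabilities = {k: Fraction(v, total) for k, v in sorted(counts.items())}
print(probabilities)
```

Affected sib-pair analysis looks for loci where the observed sharing is shifted away from this null distribution towards 1 and 2 shared alleles.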
Affected sib-pairs
Which loci do the affected sibs share more often than expected by chance?
[Figure: two families, each with parents 21 x 34, showing affected sibs and the number of shared parental alleles. Family 1: sibs 32, 32, 31 (sharing 2 and 1); family 2: sibs 32, 32 (sharing 2)]
Detecting linkage in pedigrees can be complicated…
… and you need lots of meioses!
Association mapping in a population
Cases vs. controls
HLA-DR4 allele (UK):
General population               36%
Rheumatoid arthritis patients    78%
Seek correlation between genotype and phenotype
Allele B is associated with disease D if people who have D
also have B more often than predicted from B’s frequency
To test every polymorphism is too expensive
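The strength of the HLA-DR4 association can be expressed as an odds ratio. The sketch below assumes that the two percentages are the proportions of DR4-positive individuals and that the general population stands in for the controls; neither assumption is stated on the slide, and `odds_ratio` is a name chosen for this example.

```python
def odds_ratio(freq_cases, freq_controls):
    """Odds ratio from the marker frequency in cases and in controls."""
    odds_cases = freq_cases / (1.0 - freq_cases)
    odds_controls = freq_controls / (1.0 - freq_controls)
    return odds_cases / odds_controls

# HLA-DR4: 78% of rheumatoid arthritis patients vs 36% of the general population
or_dr4 = odds_ratio(0.78, 0.36)
print(round(or_dr4, 1))
```

Under these assumptions the odds of carrying DR4 are roughly six times higher in patients than in the general population. A real analysis would also need the sample sizes to attach a confidence interval or P value.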
Linkage disequilibrium (LD) measures association between two alleles
Mutation creates new variants
[Figure: haplotypes before and after a new mutation]
Initially, the new allele is in LD with nearby alleles
LD value = 1
Recombination reshuffles existing variation
[Figure: haplotypes before and after a crossover (X)]
LD diminishes
If enough crossovers take place, the loci are in “free association”
Commonly used LD measures: D' and r^2
Haplotypes are sets of markers inherited as a “package”
Meiotic recombination creates novel haplotypes
Markers form haplotype blocks in the population
LD is a measure of allelic association in a population
2 SNP loci on the same chromosome: C/G and A/T
[Figure: haplotype combinations observed at the two SNPs]
Fewer than 4 combinations -> LD
Conversely: all 4 combinations -> low or no LD
But also: population history, drift, selection…
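D' and r^2 can be computed directly from phased haplotypes. The sketch below uses the standard definitions (D = p_AB - p_A p_B, D' = |D| / D_max, r^2 = D^2 / (p_A q_A p_B q_B)); the haplotype counts are invented for illustration and the function name `ld_measures` is hypothetical.

```python
def ld_measures(haplotypes, a="C", b="A"):
    """Return (D, D', r^2) for two bi-allelic SNPs.

    haplotypes is a list of (allele at SNP 1, allele at SNP 2) tuples;
    a and b are the reference alleles at SNP 1 and SNP 2.
    """
    n = len(haplotypes)
    p_a = sum(h[0] == a for h in haplotypes) / n
    p_b = sum(h[1] == b for h in haplotypes) / n
    p_ab = sum(h == (a, b) for h in haplotypes) / n
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2

# Invented example samples for SNP 1 (C/G) and SNP 2 (A/T)
two_haplotypes = [("C", "A")] * 5 + [("G", "T")] * 5
three_haplotypes = [("C", "A")] * 4 + [("G", "T")] * 4 + [("G", "A")] * 2
four_haplotypes = [("C", "A"), ("C", "T"), ("G", "A"), ("G", "T")] * 3

for sample in (two_haplotypes, three_haplotypes, four_haplotypes):
    print([round(x, 3) for x in ld_measures(sample)])
```

With only two haplotypes both measures equal 1. With three, D' stays at 1 while r^2 drops, which is why D' is often read as "no evidence of recombination" and r^2 as "how well one SNP predicts the other". With all four haplotypes at equal frequency both are 0.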
Disease haplotypes shorten from one generation to the next
Recombination hotspots are key in shaping haplotype blocks
Perhaps at least 90% of crossovers take place at highly localized hotspots
[Figure: HLA class II region, recombination activity and haplotype blocks]
Kauppi et al. (2004) Nat Rev Genet 5, 413-424
How do you extract haplotypes from genotype data?
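Blood DNA gives genotypes A/T and C/G
Haplotypes: A-C with T-G, or A-G with T-C?
Phase can be resolved using:
Other family members
Other individuals in population
[Figure: haplotypes observed in other individuals]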
Data just released:
A haplotype map of the human genome, Nature 437, 1299-1320
HapMap project
Examines haplotypes in four populations
DNA samples: 270 people in total
Yoruba (Nigeria): 30 parent-child trios
Whites with North and West European ancestry (USA): 30 trios
Japan: 45 unrelated individuals
China: 45 unrelated individuals
Identify “haplotype tag SNPs” to minimize genotyping effort
>3,500,000 SNPs typed in total
Limited within-block diversity
Example: an 8.5-kb block on chr 2, 36 SNPs typed
In principle, could give rise to 2^36 (about 6.9 x 10^10) different haplotypes
Only seven different haplotypes found among 120 European chromosomes
Recombination hotspots are widespread and account for LD structure
[Figure: 7q21; The International HapMap Consortium]
Pairwise tagging

SNP   Alleles   Haplotypes
1     A/T       A A T T
2     G/A       G G A A
3     G/C       G C G C
4     T/C       T C C C
5     G/C       G C G C
6     A/C       A C C C

High r^2: SNPs 1 and 2; SNPs 3 and 5; SNPs 4 and 6
After Carlson et al. (2004) AJHG 74:106
Tags: SNP 1, SNP 3, SNP 6 (3 in total)
Test for association: SNP 1, SNP 3, SNP 6
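Pairwise tagging can be sketched as a greedy selection over r^2 values. This is a simplified illustration in the spirit of the approach cited above, not the published algorithm itself. The haplotypes are read off the slide (with SNP 5 taken to be G C G C); the 0.8 threshold and the function names are assumptions made for the example.

```python
def r_squared(x, y):
    """r^2 between two bi-allelic SNPs given as allele lists over the same haplotypes."""
    n = len(x)
    a, b = x[0], y[0]                      # arbitrary reference alleles
    p_a = sum(v == a for v in x) / n
    p_b = sum(v == b for v in y) / n
    p_ab = sum(u == a and v == b for u, v in zip(x, y)) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    return 0.0 if denom == 0 else (p_ab - p_a * p_b) ** 2 / denom

def greedy_tags(snps, threshold=0.8):
    """Pick tag SNPs until every SNP has r^2 >= threshold with some tag."""
    untagged = set(snps)
    tags = []
    while untagged:
        def covered(s):
            return {t for t in untagged
                    if r_squared(snps[s], snps[t]) >= threshold}
        # Choose the SNP covering the most untagged SNPs (lowest number on ties)
        best = max(sorted(untagged), key=lambda s: len(covered(s)))
        tags.append(best)
        untagged -= covered(best) | {best}
    return tags

snps = {
    1: list("AATT"),
    2: list("GGAA"),
    3: list("GCGC"),
    4: list("TCCC"),
    5: list("GCGC"),
    6: list("ACCC"),
}
tags = greedy_tags(snps)
print(tags)
```

Three tags are enough for the six SNPs. SNPs 4 and 6 carry the same information, so this sketch picks SNP 4 where the slide picks SNP 6; either choice covers the pair.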
The Common Disease-Common Variant Hypothesis
• Says that disease-predisposing variants
– exist at relatively high frequency (i.e. >1%) in the population.
– are ancient alleles occurring on specific haplotypes.
– are detectable in a case-control study using tagging SNPs.
• Alternative hypothesis says
– disease-predisposing alleles are sporadic new mutations, perhaps around the same genes, on different haplotypes.
– families with a history of the same disease owe their condition to different mutation events.
Does same phenotype mean same genotype?
Coding SNPs, nonsynonymous or synonymous
“Regulatory” SNPs
Common Gene Variation in Complex Disease
• Case-control studies comparing the frequencies of common gene variants can identify susceptibility and protective alleles
• Some phenotypes have multiple identified genes (*)

Phenotype                 Gene     Variant
IDDM*                     HLA      DR3,4
Alzheimer dementia        APOE     E4
Deep venous thrombosis    F5       Leiden
Colorectal cancer         APC      3920A
NIDDM                     PPARγ    12A
Other types of variation may also have a role in complex disease
common copy number polymorphisms
large scale rearrangements, deletions and insertions
microsatellite expansions, small insertion/deletions etc.