Download Lecture 3 Human Genetics

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Tay–Sachs disease wikipedia , lookup

Gene desert wikipedia , lookup

Heritability of IQ wikipedia , lookup

Gene therapy of the human retina wikipedia , lookup

Pharmacogenomics wikipedia , lookup

Therapeutic gene modulation wikipedia , lookup

Gene therapy wikipedia , lookup

Biology and consumer behaviour wikipedia , lookup

Genetic engineering wikipedia , lookup

Fetal origins hypothesis wikipedia , lookup

SNP genotyping wikipedia , lookup

Epistasis wikipedia , lookup

Gene expression profiling wikipedia , lookup

Gene wikipedia , lookup

Mutation wikipedia , lookup

History of genetic engineering wikipedia , lookup

Genetic drift wikipedia , lookup

Dominance (genetics) wikipedia , lookup

Genome evolution wikipedia , lookup

Nutriepigenomics wikipedia , lookup

Hardy–Weinberg principle wikipedia , lookup

Frameshift mutation wikipedia , lookup

Site-specific recombinase technology wikipedia , lookup

Gene expression programming wikipedia , lookup

Neuronal ceroid lipofuscinosis wikipedia , lookup

Quantitative trait locus wikipedia , lookup

Human genetic variation wikipedia , lookup

Artificial gene synthesis wikipedia , lookup

Point mutation wikipedia , lookup

RNA-Seq wikipedia , lookup

Epigenetics of neurodegenerative diseases wikipedia , lookup

Population genetics wikipedia , lookup

Genome-wide association study wikipedia , lookup

Designer baby wikipedia , lookup

Genome (book) wikipedia , lookup

Public health genomics wikipedia , lookup

HLA A1-B8-DR3-DQ2 wikipedia , lookup

A30-Cw5-B18-DR3-DQ2 (HLA Haplotype) wikipedia , lookup

Microevolution wikipedia , lookup

Tag SNP wikipedia , lookup

Transcript
Note that the genetic map
is different for men and women
Recombination frequency is higher in meiosis
in women
How do we extend the map to identifying disease genes?
• The CEPH families were instrumental in constructing the map
• But our goal is to map human diseases
•You rarely get large multi-generational highly informative families
• How do we get to a lod score of 3 with small families?
That’s the awesome power of logarithms
Science (2006) 312:279-282
Recall: The small family (5 kids) and Mom (informative) was either:
Dd 12
dd 22
D
d
D
d
1
2
OR
2
Dd 12
dd 12
Dd 22
Dd 22
1
dd 12
What if there were no crossovers?
L =
[(0.9)5 + (0.1)5 ]/ 2
(0.5)5
Z = log10 L = 0.97
(For one crossover, Z = 0.021, the crossover penalty)
Add the lod scores from different families
This is the same as multiplying probabilities
What is the probability of two coin flips and getting two heads?
0.5 x 0.5 = 0.25 (product rule in statistics)
If the same markers are in two different families, then they are independent
4 or 5 small families, and a small number of crossovers, should suffice
Works extremely well for DNA markers, more problematic for diseases
If the disease (phenotype) is caused mutations in either of two genes,
then mixing lod scores will confound the analysis (called heterogeneity)
Autosomal Recessive
Use IBD Mapping
Look for homozygous region in affected
individuals, Not homozygous in
unaffected individuals
IDB preserves the haplotype
Similar principle as Linkage Disequilibrium
Except it is with individuals, not populations
Disequilibrium Mapping
A way to map genes using populations
Instead of using pedigress, use all of the patients
We are interested in haplotypes
Haplotype 1
2 alleles
2 alleles
2 alleles
C
T
A
C
A
A
C
G
CTTCC[1396bp]GAAGCTCAGAAAGG
GAAAGGAAAAGAAGATTT
G
GATAATATAAAAAATAT[2502bp]TTGGGAATTTACA
AATAC
Haplotype 2
CTTCC[1396bp]GAAGCTCAGAAAGG
GAAAGGAAAAGAAGATTT
GATAATATAAAAAATAT[2502bp]TTGGGAATTTACA
AATAC
Haplotype 3
GAAAGGAAAAGAAGATTT
CTTCC[1396bp]GAAGCTCAGAAAGG
GATAATATAAAAAATAT[2502bp]TTGGGAATTTACA
AATAC
Consider five loci each with two alleles
A
B
C
D
E
A1
B1
C1
D1
E1
A2
B2
C2
D2
E2
How Many Haploytpes?
Individual =
A1 B2 C1 D2 E1
A2 B2 C1 D1 E2
Two haplotypes
In theory there are 25 (32) possibilities IF the combinations are independent
In practice, far fewer (5-10 in sub-Mb distances)
WHY?
Some SNPs are “old”
Example A1 and A2, D1 and D2
If they are in Hardy Weinberg equilibrium,then 4 haplotypes
A1
A1
A2
A2
A
B
C
D
E
A1
B1
C1
D1
E1
A2
B2
C2
D2
E2
D1
D2
D1
D2
A new SNP arises (B2), but in just one haplotype
A1
B2
D1
A1
A1
A2
A2
B1
B1
B1
B1
D1
D2
D1
D2
New Haplotype
Even later, two new SNPs arise (C2 and E2)
A1 B2 C1 D1 E1
A1 B1 C1 D1 E1
A1 B1 C1 D2 E1
A2 B1 C1 D1 E1
A2 B1 C1 D2 E1
A1 B1 C2 D1 E1
A2 B1 C1 D1 E2
So we end up with a total of 7 haplotypes for 5 SNPs
There is a possibility of recombination between SNPs
However, this is very slow and improbable, especially for short distances
Now consider that a disease mutation arises between C and D
Just like the SNPs, it is likely to have arisen once
And it is in only one of the common 7 haplotypes
Therefore the SNP alleles in that haplotype are correlated with the mutation
This is the principle of DISEQUILIBRIUM MAPPING
It depends on:
1. Age of the mutation
2. Age of the SNPs in the haplotype
3. Age of the population
4. Frequency of recombination (distance between) SNPs
Disequilibrium mapping is particularly useful when:
There is a relatively new disease mutation
Relatively isolated (and hopefully new) population (Finland)
Population
Allele
Frequencies
1
2
A
B
0.3
0.7
0.2
0.8
C
0.8
0.2
*
D
E
0.7
0.3
0.2
0.8
If equilibrium, patients should have same allele frequencies
If disequilibrium, patients should have increased frequencies near the disease gene
The degree of deviation should be maximal near the disease gene
Simple Case:
Autosomal dominant disease arises between C and D of a particular genotype
A1
B2
C1
A Few Generations Later:
*
D2
E1
Allele Frequencies
Population
Patients
0.3
0.8
0.3
0.3
0.2
1
1
1
1
1
0.6
0.9
1
1
0.9
0.4
0.8
1
1
Over time:
(Patients only)
Later
0.7
Deviation from
Population
Frequency
Distance along Chromosome
Disease Gene
So this is it……………….
How do we find the gene and the mutation?
We need to make the correlation with the genetic map
(for example distance in cM) to the physical map (DNA)
Most important to have the physical map annotated
All methods give a map location
(Maximum Likelihood)
Distance along Chromosome
Disease Gene
Point your browser to genome.ucsc.edu
Identify the genes in the interval
Look for best candidate
Expression data (is the gene expressed in affected tissue?)
Is expression of the gene affected in patients?
Ultimately we must search for mutations
DNA sequencing is best (SSCP is usually done first)
Does the mutation make sense
For example, recessive= loss of function
SNP chips Lots of possibilities
Great Dane x Mexican Chihuahua
F1 Big (Great Danes)
3 Big : 1 Small?
Not Likely………….The sum total of many gene….multigenic
Many human disorders, conditions and predispositions are multigenic
Twin studies where identical twins are raised together or raised apart
Look at complex behaviors and ask if they are genetic or environment
Answer: For almost every single behavior…..it’s a little of both
“Heritability” or the fraction of the condition that is genetic
But how many genes?
Association studies…..use SNP chips and the awesome power of
Computational Biology