Lecture 4-POSTED-BISC441-2012

Trinucleotide satellite lengths and AR transcriptional activity
The androgen receptor gene contains two polymorphic trinucleotide
microsatellites in exon 1. The first microsatellite (nearest the 5' end)
contains 8 to 60 repetitions of the glutamine codon "CAG"
and is thus known as the polyglutamine tract. The average number of
repetitions varies by ethnicity: Caucasians average 21 CAG repeats
and Blacks average 18.
In men, disease states are associated with extremes in polyglutamine tract
length: prostate cancer, hepatocellular carcinoma, and mental
retardation are associated with too few repetitions, while spinal and
bulbar muscular atrophy is associated with a CAG repetition
length of 40 or more. Some studies indicate that the length of the
polyglutamine tract is inversely correlated with the transcriptional activity of
the AR protein, and that longer polyglutamine tracts may be associated
with male infertility and undermasculinized genitalia. A
comprehensive meta-analysis of the subject published in 2007 supports
the existence of the correlation.
How it works…
Population genetics, health and
disease
(1) Why population genetics is important
for human health and disease
(2) Basics of population genetics: the
main forces, and examples
(3) How genes can contribute to disease
Population genetics, health and disease
(1) Why population genetics is important for
human health and disease
(a) Human evolution has been underlain by
adaptive and non-adaptive changes in allele
frequencies
(b) Diseases are commonly due to effects of alleles,
and alleles interacting with environments - there is a
spectrum from single-locus disorders to
polygenic disorders
Some genetically-based traits that evolved
in the human lineage
(1) Hair that keeps growing
(2) Blue eyes
(3) Blond hair
(4) Ability to digest milk after infancy
(5) Highly-articulate speech
(6) Schizophrenia
(7) Liability to Alzheimer’s
(8) Menopause
(9) Big brains
etc etc
(2) Basics of population genetics: the main forces, and examples
Gregor Mendel
(1822-84)
• discovered the laws
of heredity in 7
hybridization
experiments on
19,959 pea plants
• published his results
in 1865, but they
were ignored until
1900
Mendel’s peas
Mendel investigated
inheritance of these
characteristics:
height (tall/short)
pea colour (green/yellow)
pea shape
(round/angular)
pod shape (full/pinched)
pod colour
(green/yellow)
flower colour
(purple/white)
Peas, disease: it’s all genetically the same for single-locus phenotypes
What’s important about
Mendel
• He provided evidence that inheritance is
particulate, not blending
• He established the distinction between
inheritance and expression of genes, with
regard to dominance and recessiveness
(which are caused by evolved physiological-developmental effects of alleles)
Hardy-Weinberg Equilibrium: A Null Model to compare to your Real Data
ASSUMES:
Pretty much: No mutation, no selection, no migration (gene flow),
random mating by genotype, large population size (no random
changes due to sampling error [= genetic drift])
PREDICTS:
If allele frequencies are p and q, then genotype frequencies are $p^2$, $2pq$, $q^2$
alleles: A, B
genotypes: AA, AB, BB
Essence of Hardy-Weinberg Equilibrium:
NO POPULATION-GENETIC FORCE: NOTHING HAPPENS
like black and white marbles in a jar, in pairs, then single….
Hardy-Weinberg equilibrium
If the frequency of one allele (A) is p and that of the other allele (B) is q,
random mating is random combining of gametes, which leads to
$(p + q)^2 = p^2 + 2pq + q^2$
Freq AA = $p^2$, Freq AB = $2pq$, Freq BB = $q^2$
Gametes | A (p) | B (q)
A (p) | $p^2$ | $pq$
B (q) | $pq$ | $q^2$
Of what USE is the HW Equilibrium?
NO POPULATION-GENETIC FORCE: NOTHING HAPPENS
-can predict genotype frequencies ($p^2$, $2pq$, $q^2$)
from allele frequencies (p, q)
IF THERE IS A POPULATION-GENETIC FORCE:
SOMETHING SPECIFIC HAPPENS, AND WE CAN SEEK
TO INFER WHAT CAUSED IT
For example:
Selection: changes relative frequencies of one or two of the
genotypes, due to differences in relative fitness (eg survival)
Inbreeding leads to more homozygotes than predicted
EXAMPLE
Observe genotypes (number of individuals)
AA: 8
AB: 64
BB: 128
Calculate observed genotype frequencies. NOW! I MEAN IT!
Calculate observed allele frequencies. NOW! I MEAN IT!
What are the genotype frequencies expected under
Hardy-Weinberg Equilibrium? EASY PEASY!
Compare observed with expected genotype frequencies
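The four steps of this exercise can be sketched in a few lines of Python. This is only an illustration: the function name `hardy_weinberg` and its return layout are choices made here, not part of the lecture.

```python
def hardy_weinberg(n_AA, n_AB, n_BB):
    """Return observed genotype freqs, allele freqs, and HW-expected genotype freqs."""
    n = n_AA + n_AB + n_BB
    observed = (n_AA / n, n_AB / n, n_BB / n)
    # Each individual carries two alleles; heterozygotes carry one copy of each.
    p = (2 * n_AA + n_AB) / (2 * n)
    q = 1 - p
    expected = (p * p, 2 * p * q, q * q)
    return observed, (p, q), expected


observed, (p, q), expected = hardy_weinberg(8, 64, 128)
print("observed genotype frequencies:", observed)
print("allele frequencies (p, q):    ", (p, q))
print("expected under HW:            ", expected)
```

For these counts the observed and expected genotype frequencies agree (about 0.04, 0.32, 0.64), so this sample shows no departure from Hardy-Weinberg equilibrium.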
ANOTHER EXAMPLE
Observe genotypes (number of individuals)
AA: 20
AB: 160
BB: 20
Calculate observed genotype frequencies.
Calculate observed allele frequencies.
What are the genotype frequencies expected under
Hardy-Weinberg Equilibrium?
Compare observed with expected genotype frequencies
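Worked through, the arithmetic for this second example looks like the following (a sketch; $N$ for the total number of individuals is a symbol introduced here, not on the slide):

```latex
\begin{align*}
N &= 20 + 160 + 20 = 200 \\
\text{observed genotype frequencies: } & \tfrac{20}{200} = 0.1,\ \tfrac{160}{200} = 0.8,\ \tfrac{20}{200} = 0.1 \\
p &= \frac{2(20) + 160}{2(200)} = 0.5, \qquad q = 1 - p = 0.5 \\
\text{expected under HW: } & p^2 = 0.25,\ 2pq = 0.5,\ q^2 = 0.25 \quad (50,\ 100,\ 50 \text{ individuals})
\end{align*}
```

Heterozygotes are observed at a frequency of 0.8 against an expected 0.5, so this sample departs from Hardy-Weinberg proportions.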
Hm-m. What next?
HOW unlikely is it to get such results BY CHANCE?
TEST vs CHANCE using chi-square ($\chi^2$) test
Genotype | Obs | Exp (number of individuals)
AA | 10 | 25
AB | 80 | 50
BB | 10 | 25
$\chi^2 = \sum \frac{(\text{Obs} - \text{Exp})^2}{\text{Exp}} = \frac{(10-25)^2}{25} + \frac{(80-50)^2}{50} + \frac{(10-25)^2}{25} = 36$
Using the $\chi^2$ distribution with 1 df, p < 0.001, so
odds of getting this result by chance are less
than 1 in 1000
What if your sample sizes were smaller? (AA: 1, AB: 8, BB: 1)
Still significant?
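The chi-square test and the smaller-sample question can both be checked with the sketch below. It assumes 1 degree of freedom, as on the slide, and uses the identity that for 1 df the upper-tail probability equals `erfc(sqrt(chi2 / 2))`, which avoids any third-party library. The function name `chi_square_1df` is illustrative.

```python
import math


def chi_square_1df(observed, expected):
    """Chi-square statistic and its p-value, assuming 1 degree of freedom."""
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For 1 df, the upper-tail probability is erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Counts from the slide: 100 individuals, expected 25 / 50 / 25.
stat, p_value = chi_square_1df([10, 80, 10], [25, 50, 25])

# Same proportions with only 10 individuals, expected 2.5 / 5 / 2.5.
small_stat, small_p = chi_square_1df([1, 8, 1], [2.5, 5, 2.5])

print(stat, p_value)
print(small_stat, small_p)
```

With 100 individuals the statistic is 36 and p is far below 0.001. With the same proportions in 10 individuals the statistic drops to 3.6 and p rises above 0.05, so the result is no longer significant at the usual threshold. Expected counts this small also strain the chi-square approximation itself.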
What if, in a different study, your sample sizes were very
large, such as 1,000,000, compared to moderate (such as 100)?
What if the same study has been conducted 19 times
previously with non-significant results, but you find p = 0.0499?
-false positives
-false negatives
-the file drawer problem
-statistical compared to biological ‘significance’
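The "19 previous studies" question above is a multiple-testing problem. As a rough sketch, assuming the 20 studies are independent tests of a true null hypothesis at a threshold of 0.05, the chance that at least one of them comes out "significant" can be computed directly. The function name is illustrative.

```python
def prob_at_least_one_false_positive(n_tests, alpha=0.05):
    """Chance that at least one of n independent tests of a true null gives p < alpha."""
    return 1 - (1 - alpha) ** n_tests


familywise = prob_at_least_one_false_positive(20)
print(familywise)
```

Under these assumptions the chance is roughly 0.64, so a single p = 0.0499 after 19 non-significant attempts is about what chance alone would produce.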
Statistics and clinical trials: publish trial design, planned
statistics in advance; consequences of false positives, negatives
(& double-blind design, placebo effects, conflicts of interest...)
STATISTICS
ARE ‘TRUTH’
Albinism
• inheritance: recessive alleles at 2 loci
• incidence: 1-in-40,000 births in Europe
• symptoms: no body colour, visual deficits
• cause: lack of tyrosinase means that melanin can’t be synthesized
Doing some sums: albinism
• autosomal recessive incidence of 1-in-40,000
• if dominant is p, and recessive albinism is q
• $q^2$ = 1/40,000, or 0.000025
• q = √0.000025 = 0.005
• since p+q = 1, p = 1 - 0.005 = 0.995
• 2pq = 2 x 0.995 x 0.005 = 0.00995 or 0.995%
• ie, about 1-in-100 is a carrier of the albinism allele
• chance of two carriers mating: 1-in-100 x 1-in-100 = 1-in-10,000
• chance of a homozygous child from two carriers: 1-in-4, so 1-in-10,000 x 1-in-4 = 1-in-40,000
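The same sums can be run for any recessive condition. This is a minimal sketch that treats the condition as a single locus in Hardy-Weinberg equilibrium, as the slide's arithmetic does. The function name `carrier_frequency` is illustrative.

```python
import math


def carrier_frequency(incidence):
    """Given the incidence of a recessive condition (q^2), return q, p and 2pq under HW."""
    q = math.sqrt(incidence)
    p = 1 - q
    return q, p, 2 * p * q


q, p, carriers = carrier_frequency(1 / 40_000)
print("q =", q)
print("p =", p)
print("carrier frequency 2pq =", carriers)
```

For an incidence of 1-in-40,000 this gives q of about 0.005 and a carrier frequency just under 1%, which the slide rounds to 1-in-100.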
The real genetic and genomic world is not A’s and peas:
Human genome: about 3 billion nucleotides, with about 3 million of them
variable among any two random humans (99.9% identity);
most variants probably have no phenotypic effects (are ‘neutral’)
Human Genome Project has provided the sequence (all online)
of one human, but the most interesting and important data as
regards health is the variation among humans, analyzed using the:
HapMap (Haplotype Map) project has characterized genetic
variation among three major populations, one African, one Asian,
one Caucasian (one or more common SNP genotyped at least every
5000 base pairs); > 1 million SNPs overall
1000 Genomes project: full sequences of 1000 humans -> rare variants
SNP - single nucleotide polymorphism (2 or more bases at a locus)
Haplotype - linear combination of SNPs or other markers on a chromosome
such as C...C....A.T (haplotype 1), C...G....A.T (haplotype 2); sets of
linked bases tend to be inherited together -- form flanked ‘blocks’
Microsatellites - repetitive elements with variable numbers of short repeats
such as CAGCAGCAG... or ATATAT - used as markers, and underlie
some diseases
Copy number variation - variation in number of copies of large sections of
genome, including one or more genes (large deletions, duplications)
Some important findings from HapMap project (and earlier
studies using other genetic markers)
(1) About 10-15% of total human genetic variation is among
populations; rest is within populations
(2) Africa harbours substantially higher levels of human genetic
variation than other regions
(3) Patterns of natural selection ‘for’ given alleles (positive
selection) vary substantially among populations
-> local adaptation
More important facts about mutation:
-Types of mutation: somatic vs germline; single base pairs,
insertions/deletions, repeats, rearrangements, copy number variation;
in coding, non-coding, regulatory DNA
-Mutations have deleterious effects in the great majority of cases,
so selection should minimize the mutation rate, subject to
constraints, tradeoffs (repair ability, time constraints in replication)
-Many human diseases are caused by de novo mutations
(eg about 10% of cases of autism may be due to de novo
germ-line mutations) - these diseases can persist under a balance
between mutation and selection - see OMIM for human
knowledge in this area
http://www.ncbi.nlm.nih.gov/sites/entrez?db=omim
-The larger the population, the greater the scope and
potential for mutations to turn out to be adaptive
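The balance between mutation and selection mentioned above has standard textbook approximations for the equilibrium frequency of a deleterious allele. The symbols are assumptions introduced here, not from the slides: $\mu$ is the mutation rate per generation and $s$ is the selection coefficient against the affected genotype.

```latex
\hat{q}_{\text{recessive}} \approx \sqrt{\mu / s},
\qquad
\hat{q}_{\text{dominant}} \approx \mu / s
```

As a purely illustrative case, $\mu = 10^{-6}$ and $s = 1$ give $\hat{q} \approx 10^{-3}$ for a recessive allele but only $10^{-6}$ for a dominant one: recessive disease alleles sit at much higher equilibrium frequencies because most copies are in unaffected heterozygotes.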
Natural Selection, for mendelian loci
(1) Only force that can cause adaptation (can also result in
maladaptations)
World of ‘things’, which vary in size
Live in an environment, have a niche
oOooOOoOOoo
UuUuuUUu
Reproduce, there is heritable variation o -> o , O -> O
Change in environment,
selection for smaller size
UuUuuUUu
Evolutionary change, and
adaptation
oo ooooooo
Natural selection, a simple, general example
Resistance to antibacterial soap
Generation 1: 1.00 not resistant, 0.00 resistant
Generation 2: 0.96 not resistant, 0.04 resistant (mutation!)
Generation 3: 0.76 not resistant, 0.24 resistant
Generation 4: 0.12 not resistant, 0.88 resistant
Rapid evolution of adaptation by natural selection - genetic basis?
Natural Selection, for mendelian loci
(1) Only force that can cause adaptation (remember the ‘things’!)
(2) Common situation for functional sites is usually to have one
allele/haplotype common (ancestral), rare mutant (derived) alleles
selected against (‘purifying selection’), since mutations are
usually bad - genetic situation stays same
(3) Various forms of selection at one locus (AA,Aa,aa)
(a) for recessive mutation, fitnesses: aa > Aa, AA
(b) for dominant mutation, fitnesses: Aa, AA > aa
It takes a LONG TIME for advantageous mutations to
reach fixation (dozens to hundreds to thousands of generations)
(all individuals, lineages with disadvantageous alleles must die)
->leads to MISMATCHES as environments, selection changes
More on various forms of selection at one locus (AA,Aa,aa)
(c) for heterozygous genotype
Aa > AA, aa
-due to nature of inheritance (a constraint), maladapted
homozygotes are generated every generation
(d) against recessive genotype
aa < AA,Aa
-due to fact that vast majority of ‘a’ alleles are in
heterozygotes it is exceedingly difficult for selection to
remove deleterious ‘a’ allele from population
For example, with p=0.95, q=0.05; $q^2$=0.0025, 2pq=0.095
- very little variation is ‘visible’ to selection
(e) against dominant genotype
AA, Aa < aa
-very effective at removing ‘A’ allele, UNLESS phenotypic
effects manifest after age of reproduction
(eg Huntington’s disease)
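Point (d) can be made concrete with a small sketch. It assumes the most extreme case, in which every aa individual fails to reproduce (fitness 0), plus random mating. Under that assumption the allele frequency follows q' = q / (1 + q) each generation. The function name and the 20-generation horizon are illustrative.

```python
def select_against_recessive(q, generations):
    """Frequency of allele 'a' over time when aa individuals never reproduce."""
    history = [q]
    for _ in range(generations):
        # Survivors are AA and Aa only, so 'a' alleles come from heterozygotes:
        # q' = pq / (p^2 + 2pq) = q / (1 + q)
        q = q / (1 + q)
        history.append(q)
    return history


p, q = 0.95, 0.05
# Fraction of all 'a' copies that sit in heterozygotes (one copy each)
# rather than in aa homozygotes (two copies each).
share_hidden = (2 * p * q) / (2 * p * q + 2 * q * q)

history = select_against_recessive(q, 20)
print(share_hidden)
print(history[-1])
```

With p = 0.95 and q = 0.05, 95% of the 'a' copies are in heterozygotes, and even complete selection against aa only halves the allele frequency (0.05 to 0.025) over 20 generations.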
Rare Alleles and Eugenics
• A popular idea early in the 20th century was “eugenics”,
improving the human population through selective
breeding. The idea has been widely discredited, largely
due to the evils of “forced eugenics” practiced in certain
countries before and during World War 2. We no longer
force “genetically defective” people to be sterilized.
• However, note that positive eugenics, encouraging people
to breed with superior partners, is still practiced in places.
• The problem with sterilizing “defectives” is that most
genes that produce notable genetic diseases are
recessive: only expressed in homozygotes. If you only
sterilize the homozygotes, you are missing the vast
majority of people who carry the allele.
• For example, assume that the frequency of a gene for a
recessive genetic disease is 0.001, a very typical figure.
Thus p = 0.999 and q = 0.001. Thus $p^2$ = 0.998, 2pq =
0.002, and $q^2$ = 0.000001. The ratio of heterozygotes
(undetected carriers) to homozygotes (people with the
disease) is 2000 to 1: you are sterilizing only 1/2000 of
the people who carry the defective allele. This is simply
not a workable strategy for improving the gene pool.
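The carrier-to-affected ratio in this example can be computed for any allele frequency. This is a minimal sketch assuming Hardy-Weinberg proportions. The function name `carriers_per_affected` is illustrative.

```python
def carriers_per_affected(q):
    """Ratio of heterozygous carriers (2pq) to affected homozygotes (q^2) under HW."""
    p = 1 - q
    return (2 * p * q) / (q * q)


ratio = carriers_per_affected(0.001)
print(ratio)
```

For q = 0.001 the ratio is just under 2000, matching the rounded figure in the text. Because the ratio simplifies to 2p/q, it grows as the allele gets rarer.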
Recessive deleterious (and advantageous) alleles are present
mainly in heterozygotes, ‘hidden’ from selection; most people
have multiple, rare, homozygous-lethal alleles in their genome
Eradicating dominant disorders
• Huntington’s—and any other dominant
disorder—could in principle be eliminated
in one generation by aborting every foetus
carrying the gene
• however, this would not prevent
spontaneous mutations occurring (in
Huntington’s, ±1 in 100,000)
unless the entire population was screened for them
Eradicating (as best possible) recessive
inherited disorders in genetic isolates
Ashkenazi heritable disorders
• Gaucher disease: ranges
from mild to severe, sometimes treatable
• Cystic Fibrosis: average life expectancy ±30
• Fanconi anemia: developmental and mental
retardation, proneness to cancer
• Niemann-Pick disease: fatal by age 4
• Bloom syndrome: fatal cancers by age 30
• Canavan disease: similar to Tay-Sachs
Dor Yeshorim
• Committee for the Prevention
of Jewish Genetic Diseases
• founded by Rabbi Josef Ekstein after losing
4 children to Tay-Sachs
• community at first in denial, but later
testing became widely accepted
• 170,000 tested now for 9-10 diseases, ~1-in-100 couples ‘incompatible’
How it works
• undisclosed tests
carried out at school
• if testees consider a relationship, they can
enquire about compatibility
• if only 1 is a carrier, no disclosure, but if both
are, advised ‘incompatible’ and counselled
• Tay-Sachs cases now almost eliminated
Natural selection - Lactase gene in humans
(1) Origin of animal husbandry and animal
(2) Milk as food source, less than 7000 years ago
(2) Selection for lactase persistence, ability to digest milk
after weaning, selects for allelic variants of LCT gene
(lactase-phlorizin hydrolase)(intolerant: gassy, farty, nauseous)
(3) Geographic distribution of lactase persistence matches
distribution of dairy farming (gene-culture ‘coevolution’)
(4) Two SNP polymorphisms in LCT gene are associated with
lactase persistence, have been selected for
BUT takes hundreds, thousands of years for selected SNPs to
spread through populations, not yet fixed
Natural selection is often geographically-restricted
Sickle cell anemia - red blood cell protein polymorphism
SS homozygotes - sickle cell disease, early death
AS heterozygotes - relatively resistant to malaria
AA homozygotes - relatively susceptible to malaria
S allele is only favored in malarial area
Other red blood cell genes show similar patterns of heterozygote advantage
The frequencies of anti-malarial alleles are highest in malarial areas
[Figure: maps of malaria, Hb S allele, and G6PD deficiency allele distributions]
Here, we see ‘fit’ between alleles and environments, and
variation is maintained locally by heterozygote advantage
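How heterozygote advantage maintains both alleles can be sketched with a one-locus selection model. The fitness values are hypothetical, chosen only for illustration: AA = 1 - s (malaria susceptibility), AS = 1, SS = 1 - t (sickle cell disease). Under this model the S allele settles at the standard equilibrium frequency s / (s + t).

```python
def next_freq(q, s, t):
    """One generation of selection with fitnesses AA = 1 - s, AS = 1, SS = 1 - t.

    q is the frequency of the S allele; returns its frequency after selection.
    """
    p = 1 - q
    mean_fitness = p * p * (1 - s) + 2 * p * q + q * q * (1 - t)
    # S alleles come from AS heterozygotes (half their alleles) and SS homozygotes.
    return (p * q + q * q * (1 - t)) / mean_fitness


s, t = 0.1, 0.8  # hypothetical selection coefficients, for illustration only
q = 0.01         # S allele starts rare
for _ in range(500):
    q = next_freq(q, s, t)

equilibrium = s / (s + t)
print(q, equilibrium)
```

With these illustrative values the S allele rises from 1% and stabilizes near 11%, even though SS homozygotes have very low fitness. Setting s to 0 (no malaria) removes the advantage, and selection then only lowers the S allele frequency.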
Most-polymorphic loci known in humans are HLA loci, which
are involved in immune responses to pathogens
There is a positive correlation, among human populations, between
HLA heterozygosity levels and virus species richness,
suggesting that viruses impose selection for
maintenance of genetic variation at immune system loci
Worobey et al. (2008), Annual Reviews of Ecology, Evolution
and Systematics:
There was a severe bottleneck at the out-of-Africa point
for modern humans
Bottleneck, selection due to eruption
of supervolcano Toba about 70,000
years ago? Coincide with out of Africa?
There was a severe bottleneck at the out-of-Africa point
for modern humans
Consequences?
-Notably higher levels of genetic variation in Africa than elsewhere;
declines in heterozygosity as one goes further from Africa
-Some alleles may reach high frequency in non-African
regions by drift, such as, possibly, cystic fibrosis in northwest Europe
-Some disease alleles that are very rare in Africa, common
elsewhere, such as alleles for myotonic dystrophy
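The loss of heterozygosity under drift can be sketched with the standard expectation for an idealized population of constant size N, in which heterozygosity shrinks by a factor of (1 - 1/(2N)) each generation. The population sizes and time span below are hypothetical, chosen only to contrast a bottlenecked population with a large one.

```python
def expected_heterozygosity(h0, pop_size, generations):
    """Expected heterozygosity after drift in an idealized population of constant size."""
    return h0 * (1 - 1 / (2 * pop_size)) ** generations


# Hypothetical numbers: same starting heterozygosity, 100 generations.
after_bottleneck = expected_heterozygosity(0.5, 50, 100)
large_population = expected_heterozygosity(0.5, 10_000, 100)
print(after_bottleneck, large_population)
```

In this sketch the small population keeps well under half of its starting heterozygosity, while the large one loses almost none. That is the pattern expected if successive founder groups leaving Africa were small.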
Typical of the derisive labeling experience of many religious groups, they were called Dunkers by
outsiders because they fully immersed or “dunked” their baptismal candidates in nearby streams,
three complete dunkings; a particular method of baptism that completely distinguished them from the
“sprinkling” Lutherans and Methodists, their kindred “pouring” Mennonites, and even single dunk Baptists.
OMIM entry for myotonic dystrophy
Adult-onset, autosomal dominant disorder
More on ABO blood groups & tradeoffs in disease resistance
Losses of genetic variation due to drift can lead to
genetically-based vulnerability to pathogens and
parasites - especially salient for immune system loci
EXAMPLE: colonizations of New World
Population bottleneck in first
colonization of New World
led to loss of immune system
alleles
PARALLEL SITUATION:
West Africa - catch measles
from family member,
about twice as likely to die,
compared to from non-family
member
(Garenne and Aaby 1990)
PNAS, 2010
Assortative mating as a cause of autism?
Both mothers and fathers
of children with autism tend
to be in systemizing occupations
(such as science and engineering),
score highly on tests related to
autistic traits
Consanguinity?
Effect of MIGRATION (gene flow):
Homogenize gene, genotype frequencies
Effect of breakup: losses of local adaptation to climate, etc
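The homogenizing effect of gene flow can be sketched with a simple one-way migration model. The model is an assumption made here, not taken from the slides: each generation a fraction m of the local population is replaced by migrants from a source population with a fixed allele frequency. All numbers are hypothetical.

```python
def migrate(p_local, p_source, m, generations):
    """Local allele frequency after repeated one-way gene flow at rate m."""
    for _ in range(generations):
        # A fraction m of alleles now come from the source population.
        p_local = (1 - m) * p_local + m * p_source
    return p_local


# Hypothetical: a locally favoured allele at 0.9, source population at 0.1.
p_after = migrate(p_local=0.9, p_source=0.1, m=0.05, generations=50)
print(p_after)
```

With 5% migration per generation and no selection opposing it, the local frequency moves most of the way to the source value within 50 generations. This is how gene flow can erase local adaptation.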
Data for metabolic-disease genes RAPTOR and PON1
2008
HOW GENES CAUSE DISEASE: Roles of the
population-genetic forces
Mutation - de novo mutations, mutation-selection balance
Selection -takes a long time to fix good alleles
-hard for recessive bad alleles or late-acting dominant alleles to be
removed
-heterozygote advantage generates maladapted homozygotes
-selection leads to resistance in human pathogens, cancer progression,
with tradeoffs
Drift, inbreeding - loss of alleles, loss of heterozygosity, fixation or
higher frequency of deleterious alleles
Non-random mating - can lead to increase in disease expression if
mating is assortative; disassortative can maintain variation (eg HLA)
Migration/gene flow -loss of local adaptation
-introductions of novel diseases (eg SARS, HIV)
HOW GENES CAUSE DISEASE: Roles of different parties
Beneficiaries of alleles that increase disease risk
(1) No one: rare mutations especially in large genes, recessive and
dominant alleles that are hard to eliminate, drift & founder effects
(2) Individual with the gene:
-benefits only in some environments, geographic areas,
-benefits in ancestral environment only, not in current one,
-benefits outweigh costs overall
(eg benefits early, costs late; pleiotropy (eg APOE4), linkage) --->
(3) Other individuals:
-heterozygote advantage (b to hets, c to others),
-benefit to fetus and cost to mother,
-benefit to one parent, cost to other (genomic imprinting),
-sexual antagonism (eg androgen receptor)
(4) Gene, at expense of unlinked genes, individual (meiotic drive)
Pleiotropy and linkage effects
[Figure residue: haplotype diagram with + and - alleles; labels ‘at very low frequency’, ‘replaced by’, ‘high frequency’]
new, beneficial mutation ‘drags’ disadvantageous
allele to high frequency, due to tight linkage
Models of evolution of alleles underlying disease risk
(1) Mutation-selection balance: ancestral alleles (tell ancestral by
comparison with chimp or Neanderthal) are adaptive, derived
alleles are rare, increase risk, and are selected against
(2) Ancestral susceptibility: Ancestral alleles good in ancestral
environments, bad in current environment; derived alleles good
in current environment
Old envt | New envt
Ancestral good, maintained | Ancestral bad; derived allele selected for
APOE E4 alleles | APOE E3 allele good
Thrifty alleles good | Thrifty alleles bad (type 2 diabetes, obesity)
Salt-retaining good | Salt-retaining leads to hypertension
Examples of genes showing evidence of ancestral susceptibility
Relation of population-genetic factors
to ultimate causes of health and disease
(a) Novel environments
(b) Novel genes, genotypes (via mutation, drift, inbreeding,
gene flow, selection)
(c) Tradeoffs between opposing selective pressures
(d) Conflicts within and between species
(e) Constraints on optimization (evolutionary legacies)
(f) Trait involves benefits to own reproduction, or to kin, that
offset costs to phenotype (genes that increase reproduction
spread even if they decrease health, happiness or longevity)
(g) Trait is not a disease but a beneficial protective response
(eg cough, fever, pain, nausea, vomiting, anxiety, fatigue)
Kidd & Kidd (2007) claim that whether a disease-associated
genetic variant is common due to drift or selection is not
relevant from a medical standpoint
DO YOU AGREE?
Drift: variant is just a ‘bad gene’, common due to chance events
Selection: variant may have some unknown benefit in
all environments, or may be useful in some environments
(as a function of geographic variation); or variant may be linked
to allele under positive selection; or heterozygote advantage
may be involved
GWAS….