Lecture 4-POSTED-BISC441-2012
Population genetics, health and disease

(1) Why population genetics is important for human health and disease
(2) Basics of population genetics: the main forces, and examples
(3) How genes can contribute to disease

Population genetics, health and disease

(1) Why population genetics is important for human health and disease
(a) Human evolution has been underlain by adaptive and non-adaptive changes in allele frequencies
(b) Diseases are commonly due to effects of alleles, and of alleles interacting with environments - there is a spectrum from single-locus disorders to polygenic disorders

Some genetically-based traits that evolved in the human lineage
(1) Hair that keeps growing
(2) Blue eyes
(3) Blond hair
(4) Ability to digest milk after infancy
(5) Highly-articulate speech
(6) Schizophrenia
(7) Liability to Alzheimer’s
(8) Menopause
(9) Big brains
etc etc

(2) Basics of population genetics: the main forces, and examples

Gregor Mendel (1822-84)
• discovered the laws of heredity in 7 hybridization experiments on 19,959 pea plants
• published his results in 1865, but they were ignored until 1900

Mendel’s peas
Mendel investigated inheritance of these characteristics:
height (tall/short)
pea colour (green/yellow)
pea shape (round/angular)
pod shape (full/pinched)
pod colour (green/yellow)
flower colour (purple/white)
Peas, disease: it’s all genetically the same for single-locus phenotypes

What’s important about Mendel
• He provided evidence that inheritance is particulate, not blending
• He established the distinction between inheritance and expression of genes, with regard to dominance and recessiveness (which are caused by evolved physiological-developmental effects of alleles)

Hardy-Weinberg Equilibrium: A Null Model to compare to your Real Data
ASSUMES: Pretty much: no mutation, no selection, no migration (gene flow), random mating by genotype, large population size (no random changes due to sampling error [= genetic drift])
PREDICTS: If allele frequencies are p and q, then genotype frequencies are $p^2$, $2pq$, $q^2$
alleles: A, B
genotypes: AA, AB, BB

Essence of Hardy-Weinberg Equilibrium: NO POPULATION-GENETIC FORCE: NOTHING HAPPENS
like black and white marbles in a jar, in pairs, then single….

Hardy-Weinberg equilibrium
If the frequency of one allele (A) is p and that of the other allele (B) is q, random mating is random combining of gametes, which leads to

$(p + q)^2 = p^2 + 2pq + q^2$
(Freq AA = $p^2$, Freq AB = $2pq$, Freq BB = $q^2$)

Gametes     A (p)      B (q)
A (p)       $p^2$      $pq$
B (q)       $pq$       $q^2$

Of what USE is the HW Equilibrium?
NO POPULATION-GENETIC FORCE: NOTHING HAPPENS
-can predict genotype frequencies ($p^2$, $2pq$, $q^2$) from allele frequencies (p, q)
YES, THERE IS A POPULATION-GENETIC FORCE: THEN SOMETHING SPECIFIC HAPPENS, AND WE CAN SEEK TO INFER WHAT CAUSED IT
For example:
Selection: changes relative frequencies of one or two of the genotypes, due to differences in relative fitness (eg survival)
Inbreeding: leads to more homozygotes than predicted

EXAMPLE
Observe genotypes (number of individuals): AA 8, AB 64, BB 128
Calculate observed genotype frequencies. NOW! I MEAN IT!
Calculate observed allele frequencies. NOW! I MEAN IT!
What are the genotype frequencies expected under Hardy-Weinberg Equilibrium? EASY PEASY!
Compare observed with expected genotype frequencies

ANOTHER EXAMPLE
Observe genotypes (number of individuals): AA 20, AB 160, BB 20
Calculate observed genotype frequencies.
Calculate observed allele frequencies.
What are the genotype frequencies expected under Hardy-Weinberg Equilibrium?
Compare observed with expected genotype frequencies
Hm-m. What next? HOW unlikely is it to get such results BY CHANCE?

TEST vs CHANCE using the chi-square ($\chi^2$) test

Genotype    Obs    Exp    (number of individuals)
AA          10     25
AB          80     50
BB          10     25

$\chi^2 = \sum \frac{(\mathrm{Obs} - \mathrm{Exp})^2}{\mathrm{Exp}} = \frac{(10-25)^2}{25} + \frac{(80-50)^2}{50} + \frac{(10-25)^2}{25} = 36$

Using the $\chi^2$ distribution with 1 df, p < 0.001, so the probability of getting a result this extreme by chance is less than 1 in 1000

What if your sample sizes were smaller? (AA: 1, AB: 8, BB: 1) Still significant?
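The steps in the examples above (genotype counts, then allele frequencies, then Hardy-Weinberg expectations, then the chi-square statistic) can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `hardy_weinberg_test` is made up for this example, and the p-value uses the fact that, for 1 df, the chi-square tail probability equals `erfc(sqrt(x / 2))`.

```python
import math

def hardy_weinberg_test(n_aa, n_ab, n_bb):
    """Chi-square test of genotype counts against Hardy-Weinberg expectations.

    Hypothetical helper for illustration; assumes both alleles are present,
    so that no expected count is zero.
    """
    n = n_aa + n_ab + n_bb
    # Each AA individual carries two A alleles, each AB individual carries one.
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1 - p
    observed = [n_aa, n_ab, n_bb]
    expected = [p * p * n, 2 * p * q * n, q * q * n]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Tail probability of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return p, expected, chi2, p_value

# The three data sets from the slides, plus the small-sample question.
for counts in [(8, 64, 128), (20, 160, 20), (10, 80, 10), (1, 8, 1)]:
    p, expected, chi2, p_value = hardy_weinberg_test(*counts)
    print(counts, "p(A) =", round(p, 3), "chi2 =", round(chi2, 2), "p-value =", p_value)
```

For the counts 10, 80, 10 this reproduces the chi-square value of 36 worked out above. For the small sample (1, 8, 1) the genotype proportions are identical, but the statistic drops to 3.6 and the p-value is about 0.06, so the same pattern is no longer significant at the usual 0.05 level. With expected counts this small the chi-square approximation is itself rough, so an exact test would probably be a safer choice.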
What if, in a different study, your sample sizes were very large, such as 1,000,000, compared to moderate (such as 100)?
What if the same study has been conducted 19 times previously with non-significant results, but you find p = 0.0499?
-false positives
-false negatives
-the file drawer problem
-statistical compared to biological ‘significance’
Statistics and clinical trials: publish trial design, planned statistics in advance; consequences of false positives, negatives (& double-blind design, placebo effects, conflicts of interest...)
STATISTICS ARE ‘TRUTH’

Albinism
• inheritance: recessive alleles at 2 loci
• incidence: 1-in-40,000 births in Europe
• symptoms: no body colour, visual deficits
• cause: lack of tyrosinase means that melanin can’t be synthesized

Doing some sums: albinism
• autosomal recessive
• incidence of 1-in-40,000
• if dominant is p, and recessive albinism is q
• $q^2$ = 1/40,000, or 0.000025
• q = √0.000025 = 0.005
• since p + q = 1, p = 1 - 0.005 = 0.995
• 2pq = 2 x 0.995 x 0.005 = 0.00995 or 0.995%
• ie, about 1-in-100 is a carrier of the albinism allele
• chance of two carriers mating: 1-in-10,000; chance of homozygosity in their child: 1-in-4; overall = 1-in-40,000

The real genetic and genomic world is not A’s and peas:
Human genome: about 3 billion nucleotides, with about 3 million of them variable between any two random humans (99.9% identity); most variants probably have no phenotypic effects (are ‘neutral’)
The Human Genome Project has provided the sequence (all online) of one human, but the most interesting and important data as regards health is the variation among humans, analyzed using:
HapMap (Haplotype Map) project: has characterized genetic variation among three major populations, one African, one Asian, one Caucasian (one or more common SNP genotyped at least every 5000 base pairs); > 1 million SNPs overall
1000 Genomes project: full sequences of 1000 humans -> rare variants

SNP - single nucleotide polymorphism (2 or more bases at a locus)
Haplotype - linear combination of SNPs or other markers on a chromosome, such as C...C....A.T (haplotype 1), C...G....A.T (haplotype 2); sets of linked bases tend to be inherited together and form flanked ‘blocks’
Microsatellites - repetitive elements with variable numbers of short repeats such as CAGCAGCAG... or ATATAT - used as markers, and underlie some diseases
Copy number variation - variation in number of copies of large sections of genome, including one or more genes (large deletions, duplications)

Some important findings from HapMap project (and earlier studies using other genetic markers)
(1) About 10-15% of total human genetic variation is among populations; the rest is within populations
(2) Africa harbours substantially higher levels of human genetic variation than other regions
(3) Patterns of natural selection ‘for’ given alleles (positive selection) vary substantially among populations -> local adaptation

More important facts about mutation:
-Types of mutation: somatic vs germline; single base pairs, insertions/deletions, repeats, rearrangements, copy number variation; in coding, non-coding, regulatory DNA
-Mutations have deleterious effects in the great majority of cases, so selection should minimize the mutation rate, subject to constraints, tradeoffs (repair ability, time constraints in replication)
-Many human diseases are caused by de novo mutations (eg about 10% of cases of autism may be due to de novo germ-line mutations) - these diseases can persist under a balance between mutation and selection - see OMIM for human knowledge in this area http://www.ncbi.nlm.nih.gov/sites/entrez?db=omim
-The larger the population, the greater the scope and potential for mutations to turn out to be adaptive

Natural Selection, for mendelian loci
(1) Only force that can cause adaptation (can also result in maladaptations)

World of ‘things’, which vary in size: oOooOOoOOoo
Live in an environment, have a niche: UuUuuUUu
Reproduce, there is heritable variation: o -> o , O -> O
Change in environment, selection for smaller size: UuUuuUUu
Evolutionary change, and adaptation: oo ooooooo

Natural selection, a simple, general example: resistance to antibacterial soap
Generation 1: 1.00 not resistant, 0.00 resistant
Generation 2: 0.96 not resistant, 0.04 resistant (mutation!)
Generation 3: 0.76 not resistant, 0.24 resistant
Generation 4: 0.12 not resistant, 0.88 resistant
Rapid evolution of adaptation by natural selection - genetic basis?

Natural Selection, for mendelian loci
(1) Only force that can cause adaptation (remember the ‘things’!)
(2) The common situation for functional sites is to have one allele/haplotype common (ancestral), with rare mutant (derived) alleles selected against (‘purifying selection’), since mutations are usually bad - genetic situation stays the same
(3) Various forms of selection at one locus (AA, Aa, aa)
(a) for recessive mutation, fitnesses: aa > Aa, AA
(b) for dominant mutation, fitnesses: Aa, AA > aa
It takes a LONG TIME for advantageous mutations to reach fixation (dozens to hundreds to thousands of generations) (all individuals, lineages with disadvantageous alleles must die)
-> leads to MISMATCHES as environments, selection change

More on various forms of selection at one locus (AA, Aa, aa)
(c) for heterozygous genotype: Aa > AA, aa
-due to the nature of inheritance (a constraint), maladapted homozygotes are generated every generation
(d) against recessive genotype: aa < AA, Aa
-because the vast majority of ‘a’ alleles are in heterozygotes, it is exceedingly difficult for selection to remove a deleterious ‘a’ allele from the population
For example, with p = 0.95, q = 0.05: $q^2$ = 0.0025, 2pq = 0.095 - very little variation is ‘visible’ to selection
(e) against dominant genotype: AA, Aa < aa
-very effective at removing the ‘A’ allele, UNLESS phenotypic effects manifest after the age of reproduction (eg Huntington’s disease)

Rare Alleles and Eugenics
• A popular idea early in the 20th century was “eugenics”, improving the human population through selective breeding.
• The idea has been widely discredited, largely due to the evils of “forced eugenics” practiced in certain countries before and during World War 2. We no longer force “genetically defective” people to be sterilized.
• However, note that positive eugenics, encouraging people to breed with superior partners, is still practiced in places.
• The problem with sterilizing “defectives” is that most genes that produce notable genetic diseases are recessive: only expressed in homozygotes. If you only sterilize the homozygotes, you are missing the vast majority of people who carry the allele. For example, assume that the frequency of an allele for a recessive genetic disease is 0.001, a very typical figure. Thus p = 0.999 and q = 0.001. Thus $p^2$ = 0.998, 2pq = 0.002, and $q^2$ = 0.000001. The ratio of heterozygotes (undetected carriers) to homozygotes (people with the disease) is 2000 to 1: you are sterilizing only 1/2000 of the people who carry the defective allele. This is simply not a workable strategy for improving the gene pool.

Recessive deleterious (and advantageous) alleles are present mainly in heterozygotes, ‘hidden’ from selection; most people have multiple, rare, homozygous-lethal alleles in their genome

Eradicating dominant disorders
• Huntington’s (and any other dominant disorder) could in principle be eliminated in one generation by aborting every foetus carrying the gene
• however, this would not prevent spontaneous mutations occurring (in Huntington’s, about 1 in 100,000) unless the entire population was screened for them

Eradicating (as best possible) recessive inherited disorders in genetic isolates

Ashkenazi heritable disorders
• Gaucher disease: ranges from mild to severe, sometimes treatable
• Cystic Fibrosis: average life expectancy about 30
• Fanconi anemia: developmental and mental retardation, proneness to cancer
• Niemann-Pick disease: fatal by age 4
• Bloom syndrome: fatal cancers by age 30
• Canavan disease: similar to Tay-Sachs

Dor Yeshorim
• Committee for the Prevention of Jewish Genetic Diseases
• founded by Rabbi Josef Ekstein after losing 4 children to Tay-Sachs
• community at first in denial, but later testing became widely accepted
• 170,000 tested now for 9-10 diseases, ~1-in-100 couples ‘incompatible’

How it works
• undisclosed tests carried out at school
• if testees consider a relationship, they can enquire about compatibility
• if only 1 is a carrier, no disclosure, but if both are, advised ‘incompatible’ and counselled
• Tay-Sachs cases now almost eliminated

Natural selection - Lactase gene in humans
(1) Origin of animal husbandry and of animal milk as a food source, less than 7000 years ago
(2) Selection for lactase persistence, the ability to digest milk after weaning, selects for allelic variants of the LCT gene (lactase-phlorizin hydrolase) (intolerant: gassy, farty, nauseous)
(3) Geographic distribution of lactase persistence matches distribution of dairy farming (gene-culture ‘coevolution’)
(4) Two SNP polymorphisms in the LCT gene are associated with lactase persistence and have been selected for, BUT it takes hundreds or thousands of years for selected SNPs to spread through populations; they are not yet fixed

Natural selection is often geographically-restricted
Sickle cell anemia - red blood cell protein polymorphism
SS homozygotes - sickle cell disease, early death
AS heterozygotes - relatively resistant to malaria
AA homozygotes - relatively susceptible to malaria
S allele is only favored in malarial areas
Other red blood cell genes show similar patterns of heterozygote advantage

The frequencies of anti-malarial alleles are highest in malarial areas
[Maps: Malaria; Hb S allele; G6PD deficiency allele]
Here, we see ‘fit’ between alleles and environments, and variation is maintained locally by heterozygote advantage

The most-polymorphic loci known in humans are HLA loci, which are involved in immune responses to pathogens
There is a positive correlation, among human populations, between HLA heterozygosity levels and virus species richness, suggesting that viruses impose selection for maintenance of genetic variation at immune system loci
Worobey et al. (2008), Annual Reviews of Ecology, Evolution and Systematics

There was a severe bottleneck at the point of out of Africa, for modern humans
Bottleneck, selection due to eruption of supervolcano Toba about 70,000 years ago? Did it coincide with out of Africa?

There was a severe bottleneck at the point of out of Africa, for modern humans
Consequences?
-Notably higher levels of genetic variation in Africa than elsewhere; heterozygosity declines as one goes further from Africa
-Some alleles may reach high frequency in non-African regions by drift, such as, possibly, cystic fibrosis in northwest Europe
-Some disease alleles are very rare in Africa but common elsewhere, such as alleles for myotonic dystrophy

Typical of the derisive labeling experience of many religious groups, they were called Dunkers by outsiders because they fully immersed or “dunked” their baptismal candidates in nearby streams, three complete dunkings; a particular method of baptism that completely distinguished them from the “sprinkling” Lutherans and Methodists, their kindred “pouring” Mennonites, and even single dunk Baptists.

OMIM entry for myotonic dystrophy
Adult-onset, autosomal dominant disorder

More on ABO blood groups & tradeoffs in disease resistance

Losses of genetic variation due to drift can lead to genetically-based vulnerability to pathogens and parasites - especially salient for immune system loci
EXAMPLE: colonizations of New World
Population bottleneck in first colonization of New World led to loss of immune system alleles
PARALLEL SITUATION: West Africa - someone who catches measles from a family member is about twice as likely to die as someone who catches it from a non-family member (Garenne and Aaby 1990)
[Figure: PNAS, 2010]

Assortative mating as a cause of autism?
Both mothers and fathers of children with autism tend to be in systemizing occupations (such as science and engineering), and to score highly on tests related to autistic traits
Consanguinity?
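To see why consanguinity matters for recessive disease, one can sketch how inbreeding shifts genotype frequencies away from Hardy-Weinberg proportions. The sketch below is not from the slides: it assumes the standard textbook model with an inbreeding coefficient F (F = 0 gives Hardy-Weinberg), assumes F = 1/16 for children of first cousins, and borrows q = 0.005 from the albinism sums earlier in the lecture. The function name is made up for illustration.

```python
def genotype_freqs_with_inbreeding(q, F):
    """Genotype frequencies at a two-allele locus with inbreeding coefficient F.

    Hypothetical helper for illustration; F = 0 reduces to p^2, 2pq, q^2.
    """
    p = 1 - q
    freq_AA = p * p + F * p * q        # extra homozygotes
    freq_Aa = 2 * p * q * (1 - F)      # fewer heterozygotes
    freq_aa = q * q + F * p * q        # extra homozygotes
    return freq_AA, freq_Aa, freq_aa

q = 0.005  # recessive allele frequency taken from the albinism example
random_mating = genotype_freqs_with_inbreeding(q, 0.0)
first_cousins = genotype_freqs_with_inbreeding(q, 1 / 16)  # assumed F for children of first cousins

# How many times more often affected (aa) children are expected.
ratio = first_cousins[2] / random_mating[2]
print("aa under random mating:", random_mating[2])
print("aa with F = 1/16:      ", first_cousins[2])
print("ratio:", round(ratio, 2))
```

Under these assumptions the expected frequency of affected homozygotes rises roughly thirteen-fold, even though allele frequencies have not changed at all. The rarer the allele, the larger this relative effect.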
Effect of MIGRATION (gene flow): homogenizes gene and genotype frequencies
Effect of breakup: losses of local adaptation to climate, etc
[Figure: data for metabolic-disease genes RAPTOR and PON1, 2008]

HOW GENES CAUSE DISEASE: Roles of the population-genetic forces
Mutation
-de novo mutations, mutation-selection balance
Selection
-takes a long time to fix good alleles
-hard for recessive bad alleles or late-acting dominant alleles to be removed
-heterozygote advantage generates maladapted homozygotes
-selection leads to resistance in human pathogens, cancer progression, with tradeoffs
Drift, inbreeding
-loss of alleles, loss of heterozygosity, fixation or higher frequency of deleterious alleles
Non-random mating
-can lead to increase in disease expression if mating is assortative; disassortative mating can maintain variation (eg HLA)
Migration/gene flow
-loss of local adaptation
-introductions of novel diseases (eg SARS, HIV)

HOW GENES CAUSE DISEASE: Roles of different parties
Beneficiaries of alleles that increase disease risk
(1) No one: rare mutations especially in large genes, recessive and dominant alleles that are hard to eliminate, drift & founder effects
(2) Individual with the gene:
-benefits only in some environments, geographic areas,
-benefits in ancestral environment only, not in current one,
-benefits outweigh costs overall (eg benefits early, costs late; pleiotropy (eg APOE4), linkage)
(3) Other individuals:
-heterozygote advantage (benefit to heterozygotes, cost to others),
-benefit to fetus and cost to mother,
-benefit to one parent, cost to other (genomic imprinting),
-sexual antagonism (eg androgen receptor)
(4) Gene, at expense of unlinked genes, individual (meiotic drive)

Pleiotropy and linkage effects
[Diagram of linked + and - alleles: one haplotype at high frequency, the + - haplotype at very low frequency]
A new, beneficial mutation ‘drags’ a disadvantageous allele to high frequency, due to tight linkage

Models of evolution of alleles underlying disease risk
(1) Mutation-selection balance: ancestral alleles (tell ancestral by comparison with chimp or Neanderthal) are adaptive; derived alleles are rare, increase risk, and are selected against
(2) Ancestral susceptibility: ancestral alleles good in ancestral environments, bad in current environment; derived alleles good in current environment

Old envt                        New envt
Ancestral good, maintained      Ancestral bad; derived allele selected for
APOE E4 alleles                 APOE E3 allele good
Thrifty alleles good            Thrifty alleles bad (type 2 diabetes, obesity)
Salt-retaining good             Salt-retaining leads to hypertension
Examples of genes showing evidence of ancestral susceptibility

Relation of population-genetic factors to ultimate causes of health and disease
(a) Novel environments
(b) Novel genes, genotypes (via mutation, drift, inbreeding, gene flow, selection)
(c) Tradeoffs between opposing selective pressures
(d) Conflicts within and between species
(e) Constraints on optimization (evolutionary legacies)
(f) Trait involves benefits to own reproduction, or to kin, that offset costs to phenotype (genes that increase reproduction spread even if they decrease health, happiness or longevity)
(g) Trait is not a disease but a beneficial protective response (eg cough, fever, pain, nausea, vomiting, anxiety, fatigue)

Kidd & Kidd (2007) claim that whether a disease-associated genetic variant is common due to drift or selection is not relevant from a medical standpoint
DO YOU AGREE?
Drift: variant is just a ‘bad gene’, common due to chance events
Selection: variant may have some unknown benefit in all environments, or may be useful in some environments (as a function of geographic variation); or variant may be linked to an allele under positive selection; or heterozygote advantage may be involved

GWAS….