
Medical Genetics: Complex disorders

Lecturer: David Saffen, Ph.D.
Laboratory for Molecular Neuropsychiatric Genetics
Department of Cellular and Genetic Medicine
School of Medicine, Fudan University
[email protected]

Outline
A. Historical background
B. Phenotypes in populations
C. Genes in populations
D. Mapping disease genes
E. Complex disorders

A. Historical background
• Francis Galton: normal distributions of quantitative traits
• Ronald A. Fisher: polygenic models for quantitative traits

Biometrics and Mendelian genetics

Francis Galton was a pioneer in using statistical methods to quantify human traits and behaviors. For example, he recognized that the distributions of many traits, such as height, weight and intelligence, closely approximate the “normal” (aka “Gaussian”) distribution. He also recognized that inherited traits tend to move toward average values, a phenomenon he termed “regression to the mean.”

Sir Francis Galton (1822-1911): English Victorian polymath; cousin of Charles Darwin; biometrician; eugenicist

Most of Galton’s work on inheritance was carried out before the rediscovery of Mendel’s experiments. The paradigm under which Galton and other “biometricians” worked was that inheritance of human traits involved the mixing or blending of factors present in the parents. This picture is very different from that obtained from Mendel’s experiments, which implied that inherited traits are determined by discrete factors that remain unchanged from generation to generation.

[Figure: Normal distributions of quantitative traits and “regression to the mean”; labels: 45° slope, mean height, SD = standard deviation]

Unification of biometrics and Mendelian genetics

RA Fisher was a 20th century genius who made fundamental contributions to the fields of statistics and biology.
In statistics, he developed analysis of variance (ANOVA), the maximum likelihood method for estimating the values of parameters from experimental data, permutation testing to estimate statistical significance (P-values), and exact tests for estimating statistical significance in small samples.

Sir Ronald A. Fisher (1890-1962): English statistician, evolutionary biologist, geneticist, eugenicist; published “The Correlation between Relatives on the Supposition of Mendelian Inheritance” in 1918.

In biology, Fisher (together with Sewall Wright and J.B.S. Haldane) is considered one of the founders of the “Modern Evolutionary Synthesis,” which unified Darwin’s theory of natural selection with Mendel-inspired concepts of modern genetics. Among many contributions, Fisher was the first to propose the idea of heterozygote advantage to explain the persistence of harmful genetic variants in certain populations.

[Figure: The combined effects of multiple genes can produce normal distributions for quantitative traits]

B. Phenotypes in populations
• Familial aggregation of disease
• Disease risk as a quantitative trait
• Factors that obscure patterns of inheritance

Familial aggregation of disease
• Relative risk ratio:
  λr = (prevalence of the disease among the relatives of an affected person) / (prevalence of the disease in the general population)
• Concordance and allele sharing among relatives

Allele sharing at a single locus (the probabilities shown are those for a pair of full siblings):
0.25 × (2 shared alleles) + 0.5 × (1 shared allele) + 0.25 × (0 shared alleles) = 1 shared allele (average).
One of two alleles shared = 50% shared alleles.
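The relative risk ratio and the average allele sharing described above can be checked numerically. The following is a minimal Python sketch, not part of the lecture: the function names are hypothetical, and any prevalence values passed to `relative_risk` are the caller's own illustrative numbers.

```python
def relative_risk(prevalence_in_relatives, prevalence_in_population):
    """Relative risk ratio: prevalence among relatives of an affected person
    divided by prevalence in the general population."""
    if prevalence_in_population <= 0:
        raise ValueError("population prevalence must be positive")
    return prevalence_in_relatives / prevalence_in_population


def expected_shared_alleles(sharing_probs):
    """Average number of shared alleles at one locus.

    sharing_probs maps the number of shared alleles (0, 1 or 2)
    to the probability of sharing that many.
    """
    total = sum(sharing_probs.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("sharing probabilities must sum to 1")
    return sum(k * p for k, p in sharing_probs.items())


# Sharing probabilities taken from the slide: 0.25 / 0.5 / 0.25
SHARING = {2: 0.25, 1: 0.5, 0: 0.25}

mean_shared = expected_shared_alleles(SHARING)  # average shared alleles
fraction_shared = mean_shared / 2               # out of two alleles per locus

if __name__ == "__main__":
    print("average shared alleles:", mean_shared)
    print("fraction of alleles shared:", fraction_shared)
```

A relative risk ratio well above 1 indicates familial aggregation; it does not by itself separate genetic causes from a shared environment.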
Correlations between risk of developing schizophrenia and the degree of relatedness among relatives
[Figure: schizophrenia risk by class of relative; relatedness classes labeled (1/8), (1/4), (1/2)]

Disease risk as a quantitative trait
Susceptibility and protective genes influence the liability (risk) of developing a polygenic disease.
[Figure: Model for schizophrenia liability]

Factors that obscure patterns of disease inheritance
• Phenocopy
• Variable penetrance and expressivity
• Locus heterogeneity
• Allelic heterogeneity
• Environmental influences

C. Genes in populations
• DNA polymorphisms
• Allele and genotype frequencies
• Ethnic differences in allele frequencies and susceptibility to disease
• Out of Africa

DNA polymorphisms
• Single nucleotide polymorphisms (SNPs)
• Microsatellites
• Minisatellites
Technically, a polymorphism is a genetic variant that is present in a population at a frequency of > 1%.

SNPs and haplotypes
rs384756 (A > C) or (T > G) [0.67/0.33]

Chromosome 1   AATGCCGATTCAGGGCTTAACG
               TTACGGCTAAGTCCCGAATTGC
Chromosome 2   AATGCCGATTCCGGGCTTAACG
               TTACGGCTAAGGCCCGAATTGC

Haplotype   SNP1 (A/C)   SNP2 (G/T)   SNP3 (C/A)   SNP4 (T/C)
H1          A            G            C            T
H2          A            T            C            T
H3          C            T            A            C
H4          C            G            A            T

Allele and genotype frequencies in populations: the Hardy-Weinberg equilibrium (HWE)
Let:
PA = population frequency of A-allele = p
Pa = population frequency of a-allele = q
PA + Pa = p + q = 1

                     Sperm A-allele (p)   Sperm a-allele (q)
Egg A-allele (p)     AA (p²)              aA (qp)
Egg a-allele (q)     Aa (pq)              aa (q²)

Genotypes:   AA + 2Aa + aa
Frequencies: p² + 2pq + q² = 1
Assumptions: large population; random mating; no new mutations, selection or migration

Frequency of ΔCCR5 alleles in Europe, the Middle East and India
[Figure: map of ΔCCR5 allele frequencies]
ΔCCR5 is a 32 bp deletion within the C-C chemokine receptor type 5 (CCR5) gene. (Individuals homozygous for ΔCCR5 are resistant to some strains of HIV.)

Out of Africa
[Figure: Out of Africa]

D. Mapping disease genes
• Linkage
• Linkage analysis within pedigrees
• Linkage analysis within populations
• Whole exome or genome sequencing
• The “architecture” of complex disorders

Linkage
[Figure: a “marker” locus (alleles M/m) and a disease risk locus (alleles D/d; a = liability allele) on chromosomes 1 and 2, with recombination producing gametes that carry the M-D, M-d, m-D and m-d combinations]
The alleles of chromosomal markers located in close proximity to a disease risk locus tend not to be separated by recombination and therefore co-assort with the risk and non-risk alleles. In the case above, the “m” marker allele co-assorts with the “a” risk allele. For this reason, the marker tends to be co-inherited with the disease.

Linkage analysis within pedigrees
• Examines the coinheritance of a disease with chromosomal markers within multiple extended families
• SNPs, minisatellites or microsatellites are often used as markers
• Usually allows localization to only a general region of a particular chromosome (e.g. within several million bp)
• Additional mapping is required to identify the disease gene
• Relatively insensitive to allelic heterogeneity
[Figure: Example]

Linkage analysis within populations: case-control association studies
• Depends upon “linkage disequilibrium” between genetic markers and disease risk variants
• Genetic markers are usually SNPs or CNVs. Commercially available genotyping platforms allow over 1 million SNPs to be examined for association with disease in each individual tested.
• Sensitive to allelic heterogeneity, but allows localization of susceptibility alleles to within 10-100 kb.

Linkage disequilibrium (LD)
SNP1 (A/a), SNP2 (B/b); possible haplotypes: AB, Ab, aB, ab

D’ = [PAB - PA·PB] / DMAX

PA = frequency of allele A in the population
PB = frequency of allele B in the population
PAB = frequency of haplotypes carrying both allele A and allele B
DMAX = maximal value of |PAB - PA·PB| that is possible for the given allele frequencies
D’ = 0 when there is no LD; ±1 for complete association or dissociation.
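The D’ definition above can be made concrete with a short calculation. The following is a minimal Python sketch under stated assumptions: the function name `d_prime` is hypothetical, DMAX is computed with the usual bounds implied by the allele frequencies, and returning 0 when a locus is not variable is a convention chosen here, not something stated in the lecture.

```python
def d_prime(p_a, p_b, p_ab):
    """Signed D' for two biallelic loci.

    p_a  : frequency of allele A at SNP1
    p_b  : frequency of allele B at SNP2
    p_ab : frequency of haplotypes carrying both A and B
    """
    for name, value in (("p_a", p_a), ("p_b", p_b), ("p_ab", p_ab)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie between 0 and 1")

    # Raw disequilibrium: observed AB haplotype frequency minus the
    # frequency expected if the two loci were independent.
    d = p_ab - p_a * p_b

    # Largest |D| that the allele frequencies allow, which depends on the sign of D.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))

    if d_max == 0:
        # Assumption: D' is undefined when a locus has only one allele;
        # this sketch reports 0 in that case.
        return 0.0
    return d / d_max


if __name__ == "__main__":
    print(d_prime(0.5, 0.5, 0.5))   # A always travels with B
    print(d_prime(0.5, 0.5, 0.25))  # independent loci
    print(d_prime(0.5, 0.5, 0.0))   # A never travels with B
```

Dividing by DMAX rescales the raw disequilibrium so that values are comparable between SNP pairs with different allele frequencies.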
TPH2 haplotypes
[Figure: Linkage-disequilibrium (D’) map of the tryptophan hydroxylase 2 (TPH2) gene]

Case-control association analysis (population-based method)
[Figure: case-control association analysis]

Whole exome sequencing and whole genome sequencing
Lupski JR et al., Whole-genome sequencing in a patient with Charcot-Marie-Tooth neuropathy, The New England Journal of Medicine 362, 1181-1191, 2010

The “architecture” of complex diseases
Common disease-common variant (CDCV) model vs. common disease-rare variant (CDRV) model
Case-control association studies are an effective tool for identifying common susceptibility alleles of moderate effect. By contrast, large-scale DNA sequencing is more efficient for detecting rare genetic variants of large effect.

Sequenced genetic variants in an individual (Ref: Lupski JR et al., New England J. Medicine 362, 2010)
[Table: sequenced genetic variants in an individual]
*In this table, “SNPs” also includes small indels and other possible duplications or deletions.
Total “SNPs” = 2,858,587 known + 561,719 novel = 3,420,306.
Note: this study also identified 234 CNVs ranging in size from 1,690 to 1,627,813 bp; 220 of these overlap with known CNVs.

E. Selected complex disorders
• Digenic retinitis pigmentosa
• Venous thrombosis
• Hirschsprung disease
• Coronary artery disease
• Alzheimer’s disease

Digenic retinitis pigmentosa
[Figure: digenic retinitis pigmentosa] Ref: Kajiwara K et al., Science 264, 1994

Venous thrombosis (risk influenced by two genes + “environmental” factor)

Factor V: Arg506Gln; WT (Arg), Leiden (Gln)
• Factor V Leiden (FVL) is more stable than wild-type Factor V
• Risk for thrombosis is increased 7-fold for heterozygotes (80-fold for homozygotes)
• ~5% of Caucasians are heterozygous for FVL

Prothrombin: 3’-UTR SNP G20210A (rs1799963)
• The A-allele increases levels of prothrombin mRNA
• ~2.9% of Caucasians are heterozygous at this locus

[Figure legend: OC = oral contraceptives, which increase expression levels of Factor X and prothrombin; a = activated form]
Note: OC use + rs1799963-A increases risk of cerebral vein thrombosis 30- to 150-fold!
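The heterozygote frequencies quoted above (~5% for Factor V Leiden, ~2.9% for the prothrombin A-allele) can be related to allele and homozygote frequencies using the Hardy-Weinberg relation p² + 2pq + q² = 1 from the section on genes in populations. The following is a minimal Python sketch; the function names are hypothetical, and applying HWE to these loci is an assumption made for illustration, since the lecture does not state that they are in equilibrium.

```python
import math


def allele_freq_from_het(het_freq):
    """Frequency q of the rarer allele, solving 2*q*(1 - q) = het_freq.

    Assumes Hardy-Weinberg equilibrium. The heterozygote frequency
    cannot exceed 0.5 under HWE.
    """
    if not 0.0 <= het_freq <= 0.5:
        raise ValueError("heterozygote frequency must lie between 0 and 0.5")
    return (1.0 - math.sqrt(1.0 - 2.0 * het_freq)) / 2.0


def hwe_genotype_freqs(q):
    """Return (common homozygote, heterozygote, rare homozygote) frequencies."""
    if not 0.0 <= q <= 1.0:
        raise ValueError("allele frequency must lie between 0 and 1")
    p = 1.0 - q
    return p * p, 2.0 * p * q, q * q


if __name__ == "__main__":
    # Heterozygote frequencies as quoted in the lecture text.
    for label, het in (("Factor V Leiden", 0.05), ("Prothrombin G20210A", 0.029)):
        q = allele_freq_from_het(het)
        _, _, rare_hom = hwe_genotype_freqs(q)
        print(f"{label}: q = {q:.4f}, expected homozygotes = {rare_hom:.6f}")
```

Under these assumptions, homozygotes are expected to be far rarer than heterozygotes, so most of the population-level risk from such variants is carried by heterozygotes.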
Hirschsprung disease [HSCR] (congenital aganglionic megacolon)

Incidence: ~1/5000 children; males are affected 2-4 times as frequently as females.
Cause: incomplete development of the enteric nervous system (myenteric plexus) in one or more segments of the colon. Lack of these nerves prevents the colon from relaxing, resulting in intestinal blockage.
To date, at least ten genes have been implicated in HSCR. Among these, the receptor tyrosine kinase RET has been identified as the major disease-causing gene.

Proportion of cases
Syndromic                                18%
Associated with abnormal chromosomes     12%
Sporadic                                 70%

Inheritance
Short segment (S-HSCR)     Recessive or multigenic
Long segment (L-HSCR)      Dominant; low penetrance

Common SNPs that disrupt the binding of transcription factors to an enhancer element located within the first intron of RET reduce RET mRNA expression and are highly associated with sporadic HSCR.

Coronary artery disease [CAD]

Coronary artery disease kills about 450,000 people every year in the US.
[Figure: Cast of coronary arteries (yellow = right; red = left arterial trees)]
[Figure: Steps leading to CAD]

Genetics of CAD

Familial CAD
Familial hypercholesterolemia: autosomal dominant disorder caused by inactivating mutations of the low-density lipoprotein receptor (LDLR) gene located at 19p13.2.
Familial aggregation

Proband with CAD          Recurrence risk*
Sister                    7-fold for brothers
Brother                   2.5-fold for sisters
< 55 years old            11.5-fold for siblings
Male < 55 with MI         6- to 8-fold for MZ twin**; 3-fold for male DZ twin**
Female < 55 with MI       15-fold for MZ twin**; 2.6-fold for DZ twin**

*Compared to the general population; **after controlling for risk factors including diabetes, hypertension and smoking

Idiopathic CAD
Genetic risk factors: hypertension, obesity, diabetes mellitus (each a disease with complex genetic components)
Non-genetic risk factors: age, sex (male > female), smoking, physical inactivity, stress
GWAS have identified candidate CAD risk genes that function within biological pathways related to serum lipid transport and metabolism, vasoactivity, blood coagulation, inflammatory and immune pathways, and arterial wall components.

Alzheimer’s disease [AD]
• Progressive, incurable neurodegenerative disease that leads to dementia and death
• Symptoms usually appear after the age of 65; early-onset forms are also known
• Familial AD (FAD) accounts for ~5% of cases; sporadic AD (late-onset AD: LOAD) accounts for the remaining ~95%
• Currently thought to affect 5 M individuals in the US and 35 M individuals worldwide; this number may increase to > 115 M worldwide by 2050!
Brain pathology in AD (1)
[Figure: AD brain vs. normal brain]

Brain pathology in AD (2)
[Figure: AD brain vs. normal brain]

Brain pathology in AD (3): amyloid plaques and neurofibrillary tangles
[Figure: (A) low-power: amyloid plaques; (B) high-power: amyloid plaque; (C) neurofibrillary tangles (NFT), silver stained; (D) electron micrograph of neurofibrillary tangles composed of hyperphosphorylated tau]
[Figure: Co-staining of amyloid plaques & NFT]

Pathways of amyloid protein precursor (APP) proteolysis
[Figure] Li H, Wolfe MS and Selkoe DJ, Structure 17, 2009

Genetics of AD

                          Familial AD (FAD) /        Sporadic AD (SAD) /
                          early-onset AD (EOAD)      late-onset AD (LOAD)
Proportion of AD cases    ~5%                        ~95%
Age of onset              < 65                       > 65
Liability genes           APP, PSEN1, PSEN2          ApoE4 + many additional genes

The effects of ApoE genotypes on AD risk
[Figure: effects of ApoE genotypes on AD risk]

[Figure: genes implicated in AD, arranged by pathway. Recoverable content:
APP proteolysis:
• α-secretase (ADAM9, ADAM10, ADAM17): C83 + sAPPα
• β-secretase (BACE1, BACE2): C99 + sAPPβ
• γ-secretase (PSEN1, PSEN2, PSENEN, NCSTN, APH1A, APH1B): p3 + AICD, or Aβ40/42 + AICD
Other process labels: Aβ secretion and uptake; Aβ proteolysis and clearance (extracellular, intracellular, serum, liver); deposition of Aβ monomers, oligomers and plaques; blood vessel (BBB); lipid homeostasis; blood pressure; blood vessel pathology; neuroinflammation; activation of microglia; cytokine secretion; phagocytosis & complement-mediated clearance (in liver); free-radical production; synapse dysfunction; cholinergic neurotransmission; neurofibrillary tangles (hyperphosphorylated tau); mitochondria dysfunction; neuronal cell death; AD
Other genes and RNAs shown (placement within the figure not recoverable): CHRM1, PRKCA, APP, miR-107, miR-9, mir29a/b-1, BACE1-AS RNA, CALHM1, ATXN1, IL33, PION, TNFRSF1, 2, TNF, GSK3A, C3, CR1*, LRP1, LRP2, LDLR, CLU, ABCA1, APOEε2, APOEε3, APOEε4, IDE, NEP, PLAU*, RAGE, sRAGE, CST3, SORL1, SORC1, PICALM*, FPRL1, CHRNA7, UCHL1, PARK2, CETP, ACE, APBA1, APAB2, LRP1B, VEGF, PIN1, APBB1, IL1,6, 8, TNK1, MAPK1,3,14, PTPRC, CHRNB2, MAPT, GSK3B, CDK5, CDK5R1, GAB2, DAPK11, NGF, BDNF, WWC1, TFAM]

References and further reading
RL Nussbaum, RR McInnes and HF Willard, “Thompson & Thompson Genetics in Medicine, 7th Edition,” 2007, Saunders Elsevier, Philadelphia, PA; ISBN: 978-1-4160-3080-5 (Chapters 8-10)
T Strachan and A Read, “Human Molecular Genetics, 4th Edition,” 2011, Garland Science, New York, NY; ISBN: 978-0-815-34149-9 (Chapters 3, 14 & 15)

Appendix
• Normal (Gaussian) distributions
• Punnett squares for multiple loci (genes)

Normal (Gaussian) distributions
[Figure: normal density f(x), vertical axis from 0.0 to 0.4; ~95% of the total area under the curve lies within two standard deviations of the mean]
The standard deviation, σ, is a measure of the “dispersion” of the measured quantity about the average or “mean” value. It is the square root of the variance (v): σ = √v.

Carl Friedrich Gauss (1777-1855): German mathematician and physical scientist; Professor, University of Göttingen

Punnett squares for multiple loci (genes)

One locus (columns: sperm; rows: eggs):

        A     a
A       AA    aA
a       Aa    aa

(a + A)² = aa + 2Aa + AA (3 terms)

Two loci (columns: sperm; rows: eggs):

        AB      Ab      aB      ab
AB      ABAB    AbAB    aBAB    abAB
Ab      ABAb    AbAb    aBAb    abAb
aB      ABaB    AbaB    aBaB    abaB
ab      ABab    Abab    aBab    abab

(a + A)² (b + B)² = aabb + aaBB + 2aaBb + 2Aabb + 4AaBb + 2AaBB + 2AABb + AAbb + AABB (9 terms)

Three loci (columns: sperm; rows: eggs):

        ABC       AbC       aBC       abC       ABc       Abc       aBc       abc
ABC     ABCABC    AbCABC    aBCABC    abCABC    ABcABC    AbcABC    aBcABC    abcABC
AbC     ABCAbC    AbCAbC    aBCAbC    abCAbC    ABcAbC    AbcAbC    aBcAbC    abcAbC
aBC     ABCaBC    AbCaBC    aBCaBC    abCaBC    ABcaBC    AbcaBC    aBcaBC    abcaBC
abC     ABCabC    AbCabC    aBCabC    abCabC    ABcabC    AbcabC    aBcabC    abcabC
ABc     ABCABc    AbCABc    aBCABc    abCABc    ABcABc    AbcABc    aBcABc    abcABc
Abc     ABCAbc    AbCAbc    aBCAbc    abCAbc    ABcAbc    AbcAbc    aBcAbc    abcAbc
aBc     ABCaBc    AbCaBc    aBCaBc    abCaBc    ABcaBc    AbcaBc    aBcaBc    abcaBc
abc     ABCabc    AbCabc    aBCabc    abCabc    ABcabc    Abcabc    aBcabc    abcabc

(a + A)² (b + B)² (c + C)²
= [aabb + aaBB + 2aaBb + 2Aabb + 4AaBb + 2AaBB + 2AABb + AAbb + AABB] [cc + 2Cc + CC]
= [aabbcc + aaBBcc + 2aaBbcc + 2Aabbcc + 4AaBbcc + 2AaBBcc + 2AABbcc + AAbbcc + AABBcc
 + 2aabbCc + 2aaBBCc + 4aaBbCc + 4AabbCc + 8AaBbCc + 4AaBBCc + 4AABbCc + 2AAbbCc + 2AABBCc
 + aabbCC + aaBBCC + 2aaBbCC + 2AabbCC + 4AaBbCC + 2AaBBCC + 2AABbCC + AAbbCC + AABBCC] (27 terms)
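The term counts and coefficients in the expansions above (3, 9 and 27 terms) can be reproduced by enumeration. The following is a minimal Python sketch, not part of the lecture: the function names are hypothetical, and it assumes a cross in which both parents are heterozygous at every locus, as in the Punnett squares shown.

```python
from itertools import product
from math import comb


def genotype_classes(loci):
    """Map each genotype class (e.g. 'AaBb') to (cells, dosage).

    cells  : number of Punnett-square cells with that genotype,
             i.e. its coefficient in the expansion
    dosage : number of upper-case alleles in the genotype
    """
    per_locus = []
    for letter in loci:
        upper, lower = letter.upper(), letter.lower()
        # (label, coefficient, upper-case allele count) for one locus
        per_locus.append([
            (lower + lower, 1, 0),
            (upper + lower, 2, 1),
            (upper + upper, 1, 2),
        ])

    classes = {}
    for combo in product(*per_locus):
        label = "".join(genotype for genotype, _, _ in combo)
        cells = 1
        dosage = 0
        for _, weight, upper_count in combo:
            cells *= weight      # coefficients multiply across loci
            dosage += upper_count
        classes[label] = (cells, dosage)
    return classes


def dosage_distribution(loci):
    """Punnett-square cells carrying k upper-case alleles, for k = 0..2n."""
    counts = [0] * (2 * len(loci) + 1)
    for cells, dosage in genotype_classes(loci).values():
        counts[dosage] += cells
    return counts


if __name__ == "__main__":
    for loci in ("A", "AB", "ABC"):
        classes = genotype_classes(loci)
        print(loci, "terms:", len(classes), "dosage counts:", dosage_distribution(loci))
```

For n loci the number of genotype classes is 3 to the power n, and the cell counts by number of upper-case alleles equal the binomial coefficients C(2n, k). If each upper-case allele is assumed to add an equal increment to a trait, this bell-shaped distribution of cell counts illustrates how the combined effects of multiple genes can approach a normal distribution.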
