Lecture Slides - McMaster University's Faculty of Health Sciences
Betwixt and Between; Common and Rare Genetic Variants in Human Disease
Peter Szatmari MD
Offord Centre for Child Studies
McMaster University
McMaster Children’s Hospital

Financial Disclosure
The Canadian Institutes of Health Research
Autism Speaks
Sinneave Family Foundation
Ontario Research Fund
Royalties from Guilford Press
No other sources of funding (stocks, industry, Big Pharma etc)

Objectives
What have we learned about the genetic architecture of ASD?
Focus on explanatory power of common and rare variants
Copy Number Variants as examples of rare risk factors
Neither story provides much explanatory power
So we are “betwixt and between”; what does the future hold? WGS?

What is Genetic Epidemiology?
The study of inherited factors in disease
Combination of epidemiology and statistical genetics
Uses a variety of study designs to meet its objectives

Steps in Genetic Epidemiology
Is the disorder familial? - family studies
Is the familiality due to genetic factors? - twin and adoption studies
Can candidate genes be identified?
Can chromosomal susceptibility regions be identified? - GW linkage and association studies
Exome and whole genome sequencing?
A disease can be genetic without being inherited
The history of autism genetics thru these steps

Autism spectrum disorders
A heterogeneous ‘spectrum’ disorder involving deficits in 3 domains of function
Social communication deficits
0.6% to 1% prevalence
4 to 1 sex ratio, more females with severe ID
Changing epidemiology; more non-autism ASD
Changing epidemiology; less frequent ID
Increasing prevalence due to better case finding
Diagnostic substitution occurring
Medical comorbidities 25-40%
[Diagram labels: Strict autism; Spectrum]

Family Studies
RR to sibs; 5% but based on old data collected retrospectively
Stoppage rules; when taken into account, sib RR increases to 10%
Baby sibs studies; RR now 19%
Intermediate phenotypes in another 20%

Twin Studies
Twin studies; traits in general population and in diagnosed twins
MZ vs DZ concordance
Older studies of ASD twins: MZ vs DZ = .65 vs .05; heritability >90%
Hallmayer et al 2011: males .58 vs .21; females .60 vs .27; greater role for shared environmental factors (55%) than genetic (37%)

The Genetic Architecture of ASD
Some single gene disorders; TS, FraX, NF, etc (5%)
Chromosomal abnormalities spread throughout the genome (5%)
Kelleher III R.J. and Bear M.F. (2008) Cell 135, October 31, 2008

391 cytogenetically-visible breakpoints in autism
Source: http://projects.tcag.ca/autism/
[Figure: breakpoints by chromosome (1-22, X, Y). Translocation (n=126), Deletion (n=128), Inversion (n=37), Duplication (n=100)]

What About the Other 90%?
Little family history of autism, low risk to sibs and twins
Like other genetically “complex” disorders such as CVD, epilepsy, obesity, diabetes, etc
Except that effect on fertility is greater

Two models of genetic complexity
Common disease-common variant
Common disease-rare variant

The Common disease-common variant model; finding genes
Candidate gene studies
Genome wide linkage
GWAS

Common Disease-Common Variant model
Non-syndromic, non-Mendelian ASD is a common disease, therefore it might be caused by common genetic variants
Polygenic multifactorial model; each gene has a small to moderate effect size
Many different variants with an additive effect

The London Underground; de Vries Nature Medicine 15 (8) August 2009

Candidate Gene Studies
ASD considered to be “caused” by neurotransmitters; 5HT, dopamine, NE
Focus on genes associated with regulating those proteins
Hundreds of positive results
Hundreds of non-replications
Small sample sizes, multiple testing of different alleles, marker density, population stratification etc

Linkage Studies
Common variants of moderate to large effect size
Genetic (locus) homogeneity
Focus on affected sib pairs and nonparametric models

Linkage; Parametric Methods
Based on non-independent segregation of genetic markers and disease alleles
Developed for Mendelian disorders
“Log of the odds” of linkage vs no linkage (>3.0 is significant)
Need dense families
Accurate classification is essential
Must specify a genetic model (gene frequency, mode of transmission, penetrance)

Non Parametric methods
Degree of allele sharing among affected relatives, most commonly sibs
Sibs share 0, 1 and 2 alleles at 25%, 50% and 25%
Is there distortion in allele sharing?
Model free, less vulnerable to misclassification
Major challenge is power; esp when there is genetic (locus) heterogeneity!
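The allele-sharing idea on the non-parametric slide can be made concrete with a small sketch. Under no linkage, affected sib pairs are expected to share 0, 1 and 2 alleles identical by descent in a 25/50/25 ratio, so one simple way to ask "is there distortion in allele sharing?" is a chi-square goodness-of-fit test against those proportions. The function name and the counts below are hypothetical and chosen only for illustration; real linkage packages use more refined statistics than this.

```python
import math


def ibd_sharing_chi2(n0, n1, n2):
    """Goodness-of-fit test of sib-pair sharing counts against 25/50/25.

    n0, n1, n2 are the numbers of affected sib pairs sharing 0, 1 and 2
    alleles identical by descent at a marker (hypothetical inputs).
    """
    total = n0 + n1 + n2
    expected = (0.25 * total, 0.50 * total, 0.25 * total)
    observed = (n0, n1, n2)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # With 2 degrees of freedom the chi-square survival function is exp(-x/2).
    p_value = math.exp(-chi2 / 2.0)
    return chi2, p_value


# Hypothetical data: 200 affected sib pairs with excess sharing of 2 alleles.
chi2, p = ibd_sharing_chi2(n0=35, n1=100, n2=65)
print(f"chi2 = {chi2:.2f}, p = {p:.3f}")  # chi2 = 9.00, p = 0.011
```

Because the test only looks at sharing among affected relatives, no genetic model has to be specified, which is the "model free" property the slide refers to.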
Common Disorder/Common Variant Linkage Studies in ASD
Many genome wide linkage studies using affected sib pairs (using non-parametric methods)
Each with sample size 50 to 400
Many significant linkage peaks but few are replicable
Conclusion; disorder is so heterogeneous and effect of common variant so small we need very large sample sizes

Autism Genome Project
Phase I
Affymetrix 10k SNP genotype data
Linkage analysis in 1146 multiplex autism families
Initial scan for CNV
Phase II
Illumina 1M SNP genotype data
High-resolution scan for de novo and inherited CNV
Genome-wide association analysis
Molecular studies of candidate loci

Linkage Peaks Stratified by Sex

Problems with Linkage for Complex Disorders
Very sensitive to locus heterogeneity
Low power for loci of small to moderate effects
Very sensitive to misclassification of phenotype
Turn to GWAS; much greater power than linkage for alleles of small effect

Genome Wide Association Studies (GWAS)
1 million genetic markers (SNPs are biallelic markers)
Which markers in which genes are more common in children with ASD than expected?
Trio based or case-control
Are those markers located in genes (or in LD with genes) that are expressed in brain?

GWAS
Very successful if MAF>5%
500 SNPs (genetic markers) associated with many common diseases
Eg Type 2 diabetes; 5000 cases and 5000 controls
18 SNPs associated with type 2 diabetes (OR=1.09 to 1.37)
Explain 6% of the heritability
Actual causal variant not discovered

GWAS
Wang et al (2009); cadherin genes at 5p14
Ma et al (2009); also at 5p14 but only in secondary analysis
Weiss et al (2009) 5p15 at SEMA5A
Anney et al (2010) MACROD2
[Figure panels: MACROD2, All Ancestry, Autism Dx, Additive Model; MACROD2, All Ancestry, ASD Dx, Additive Model]

Bottom Line of GWAS?
One SNP barely reaches GWS
No subtype or ASD quantitative trait reaches GWS (especially if correct for multiple testing)
None of the other results can be replicated
But beware of the “Winner’s Curse”!
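Two of the quantities on the GWAS slides are simple arithmetic and can be sketched in a few lines. The slides do not state a significance threshold; the sketch below assumes a plain Bonferroni correction of a 0.05 error rate over the 1 million markers mentioned above, which gives the conventional genome-wide level of 5e-8. The allele counts are hypothetical, picked so that the odds ratio falls inside the 1.09 to 1.37 range quoted for type 2 diabetes, and the confidence interval uses the standard log odds ratio (Woolf) approximation.

```python
import math


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level after Bonferroni correction."""
    return alpha / n_tests


def allelic_odds_ratio(case_risk, case_other, ctrl_risk, ctrl_other):
    """Odds ratio and approximate 95% CI from a 2x2 table of allele counts."""
    odds_ratio = (case_risk * ctrl_other) / (case_other * ctrl_risk)
    # Standard error of log(OR) from the four cell counts (Woolf method).
    se_log_or = math.sqrt(
        1 / case_risk + 1 / case_other + 1 / ctrl_risk + 1 / ctrl_other
    )
    lower = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
    upper = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)
    return odds_ratio, lower, upper


# Assumed: 0.05 family-wise error rate over 1 million SNPs.
threshold = bonferroni_threshold(0.05, 1_000_000)

# Hypothetical allele counts: 5000 cases and 5000 controls, 2 alleles each.
or_, lo, hi = allelic_odds_ratio(3000, 7000, 2700, 7300)
print(f"threshold = {threshold:.1e}")
print(f"OR = {or_:.2f} (95% CI {lo:.2f} to {hi:.2f})")
```

An odds ratio this small needs thousands of cases before its p-value approaches the corrected threshold, which is one way to read the "very large sample sizes" conclusion above.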
GWAS very sensitive to allele frequency and allelic heterogeneity

[Figure: power curves. Power plotted against risk allele frequency, one curve per odds ratio (1.2, 1.4, 1.6, 1.8, 2.0). Largest sample evaluated in Stage 1: N = 1385 ASD subjects]

The Argument for the Common Variant Model
We should be studying more “familial cases”
We should be using intermediate phenotypes, quantitative traits
We should be looking at gene X gene, gene X environment interactions
We should be looking at parent of origin effects
We should ignore p-values and instead rank order SNPs
All true, next generation of GWAS

The Argument Against the Common Variant Model
ASD is associated with reduced fertility
New variants must arise de novo that are risk factors to keep prevalence stable
If they are new they are rare
Each person carries on average 175 de novo mutations, deletions, duplications that are mostly benign
If a deleterious variant occurs in a brain expressed gene? Might cause ASD

Is ASD a Common Disease/Rare Variant?
ASD a disorder with reduced fertility
De novo mechanisms of causation (like a spontaneous mutation)
These will necessarily be rare until they diffuse thru the population

What is a Rare Event?
Frequency of risk factor <1%
Variation in DNA sequence that affects protein coding
SNP; biallelic marker (by itself or in LD with a DNA sequence)
Structural variant; chromosomal abnormality (ie a CNV, insertions, duplications, translocations etc)
But they might have a big effect size
Slide courtesy of Dr. C. Marshall

The Boston Underground; de Vries Nature Medicine 15(8) August 2009

What are Copy Number Variants (CNVs)?
Variations in DNA segments >1kb
Deletions, insertions, duplications, others
Rare or common; inherited from parents or arise de novo?
If CNV overlaps a gene expressed in brain, AND it disrupts the function of that gene, it could lead to ASD

Copy Number Variation (CNV)
Deletion
Duplication
“CNV refers to DNA segments for which copy number differences have been observed in the comparison of two or more genomes”
Slide courtesy of Dr. C. Marshall
Lee and Scherer, Expert Reviews in Mol. Med. 2010

Copy Number Variations (CNVs)
• We all have them!
• Most of them do not harm us
• Most of them we inherited from our parents
Slide courtesy of Julie Cohen, ScM, CGC, Kennedy Krieger Institute

Rare Variants in ASD
What is the evidence that rare variants, as measured by CNVs, play a role in ASD?
Simple comparison of “global burden” of brain expressed CNVs or previously implicated CNVs in ASD vs controls

Autism Genome Project
Collaboration of 13 research groups
Pooling of families (1500 families)
Common genotyping (1M SNPs) and clinical measures (ADI/ADOS) for all affected sib pairs
Funded by Autism Speaks, CIHR, Genome Canada, UK MRC, HRB (Ireland)

Global burden for rare CNVs in cases vs. controls
3 measures:
• CNV rate
• Estimated size
• CNV location and # of genes affected
* PLINK v. 1.07, genome-wide P values, one-sided tests, 100,000 permutations
* Pcorr, controlled for global case-control differences, logistic regression

CNV burden in known ASD and/or ID genes
[Figure: group sizes n=46, n=127, n=103]
Enrichment of genic-CNVs in known ASD and ID loci (1.69 fold, P = 3.4 x 10^-4)

Genes in which CNVs have been replicated
Neuroligin 3 and 4
Neurexin
Shank2 and Shank3
Contactin associated protein 2
PTCHD1
Large region on chromosome 16p11
New ones reported each week!
Each one seen in <1% of cases

Range of effects; linked in common networks, Walsh C.A., Morrow E.M.
and Rubenstein J.L. (2008) Cell 135, October 31, 2008

Functional Enrichment Gene-set Map for ASD
ASD and ID risk genes may be linked in a connected pathway

Familial segregation - examples
[Pedigree figures for families 5444, 5298 and 5290; recoverable annotations follow]
2 adjacent 17q25.3 de novo CNVs (829 kb dup, 64 kb del)
de novo del 17q25.3: SLC16A3, CSNK1D
de novo dup 17q25.3, 829 kb, 37 genes
de novo CNV dup 8p23.3, 791 kb, disrupts DLGAP2
maternal Xp22.11 del in males (121 kb del), DDX53/PTCHD1AS (non-coding RNA for PTCHD1)
maternal missense mutation Xp21.3, IL1RAPL1 (A117S, 349G>T); 1/325 cases; 0/250 controls
*In red if there is previous evidence suggesting gene involvement in ASD or ID

MM0088 – MPX family. Proband has 676 kb de novo loss at 16p11.2
SK0102 – SPX family. Proband has 432 kb de novo gain at 16p11.2
SK0019 – SPX family. Proband has 676 kb de novo loss at 16p11.2

III. What does a de novo change mean in a complex disorder?
MPX #62346: De novo 1.2 Mb deletion at 3p25.1, 3.4 Mb deletion at 5p15, t(5;7)(p15;p13)
SPX #HSC0215: De novo 1 Mb deletion at 1p21.3, Inherited t(19;21)(p13.q22.1)
[Pedigree labels: PDD; AD; del 1p21.3; t(19;21); del 3p25.1; del 5p15]
Prefer multiple lines of evidence supporting locus involvement

MM0160/MM1470-72 [SHANK1 deletion]
[Pedigree figure: family members genotyped from blood, saliva or lymphocyte DNA, each labelled either "64kb del" or "no del"; one deletion carrier is labelled Asperger; one member refused blood collection]

CNVs in ASD
More de novo CNVs in genes implicated in ASD and ID, OR=1.69; 7% of cases vs 4% of controls
Population attributable risk is 3%
Discovered functional networks of genes
In ASD, a shift from neurotransmitters to synaptic genes
Same CNVs seen in ID, epilepsy, ADHD, schizophrenia, BAD (?)

Next Generation of Studies
Search for rare inherited variants thru linkage
CNVs smaller than 1 kb
More complicated structural rearrangements
Whole exome and whole genome sequencing
Current efforts at WGS in ASD identifying variants in another 10-15%?
Rare mutations common in unaffected controls as well

Challenges
Annotation of functional significance of variants
Determination of “causation” when risk factor is rare and disorder is multifactorial
Are the health benefits of identifying rare genetic variants worth the cost?
Diagnostics and therapeutics?
Heterogeneity is the main obstacle

Recent findings from WGS
Rare variants are common; due to population overgrowth and weak purifying selection
Most SNVs in the genome are rare
>90% of SNVs detected to be functionally relevant were rare
But it will take huge sample sizes to detect the majority of rare variants involved in disease mechanisms.

A final twist!
Conclusions
Data a mix of many genetic subgroups
Draw a sample
May get lucky and catch lots of the "orange" type
Draw a new sample = reshuffling the mix
And now the disorder looks green

Conclusions
ASD is a complex genetic disorder with more complexity than previously imagined
Many rare, de novo, variants account for an increasing proportion of cases
Low hanging fruit in ASD genetics
Common vs rare variant models an oversimplification
Many unanswered questions remain
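The "draw a sample, draw a new sample" point can be illustrated with a small simulation. This is only a sketch under invented assumptions: the subgroup names echo the colours on the slide, and the frequencies and sample size are hypothetical. It draws two independent samples from the same mixture of genetic subgroups and counts how many cases of each type land in each sample.

```python
import random
from collections import Counter


def draw_sample(subgroup_freqs, n, rng):
    """Draw n cases from a mixture of genetic subgroups and count each type."""
    names = list(subgroup_freqs)
    weights = [subgroup_freqs[name] for name in names]
    return Counter(rng.choices(names, weights=weights, k=n))


# Hypothetical mixture: three small subgroups plus everything else.
subgroups = {"orange": 0.10, "green": 0.10, "blue": 0.10, "other": 0.70}

rng = random.Random(0)  # fixed seed so the run is repeatable
first = draw_sample(subgroups, 50, rng)
second = draw_sample(subgroups, 50, rng)

for label, sample in (("first sample", first), ("second sample", second)):
    counts = ", ".join(f"{name}={sample.get(name, 0)}" for name in subgroups)
    print(f"{label}: {counts}")
```

With small samples the count for any one rare subgroup can swing widely between draws, so a signal driven by one subgroup in the first sample may not appear in the second. That is one reading of why findings fail to replicate when heterogeneity is high.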