De novo mutations in psychiatric disorders: a New Paradigm
Simon L. Girard
Université de Montréal

Schizophrenia

Genetics of Schizophrenia
Girard et al. COGEDE 2011

Reduced reproductive fitness
Rates of reproduction are significantly reduced in SCZ. This amounts to negative selection, which should reduce the number of mutant alleles in the population. However, SCZ has been maintained at a constant high prevalence worldwide.
Two possible explanations:
• There is strong positive selection
• New disease alleles are continuously generated through de novo mutations
The relatively uniform, high worldwide incidence of SCZ across a wide range of environments argues against drift or positive selection. De novo mutations, which continually add disease alleles to the population, provide a possible explanation.

Our hypothesis
• Common SNPs don't work
• De novo (rare) CNVs do work
• Why don't we look for small de novo (rare) DNA polymorphisms (DNAp)?

S2D Project Overview
• Pool of available patients: 1,370 SCZ, 440 ASD, 731 MR
• Gene selection (databases, PubMed, selection criteria): 1,000 synaptic genes (12 fragments/gene)
• Selected patients: 143 SCZ, 142 ASD, 95 NSMR
• PCR: 380 patients + 4 controls
• Direct re-sequencing: 4,560,000 fragments
• Variant detection
• Genetic validation
• Biological (functional) validation: worm, fly, fish, mouse
• Validated genes
• 23 genes

De novo mutations in Schizophrenia
Gene     Mutation type   Mutation location   AA change
NRXN1    Indel           Coding              G140DfsX29
MAP2K1   Intronic        Intronic            Within intron
SHANK3   Nonsense        Coding              R1117X
SHANK3   Missense        Coding              R536W
KIF17    Nonsense        Coding              Y575X
BSN      Silent          Coding              V1665V
ATP2B4   Silent          Coding              N195N

Small DNAp de novo study
• Population design: family trios
• Rationale: look for all variants present in the proband but absent from both parents
• Case selection: sporadic schizophrenia
• Proband: DSM-IV criteria for schizophrenia (DIGS)
• Parents: clear of any mental disorders (FIGS)
• Population: all patients were recruited in France, through a consortium (MO Krebs)
• In total: 14 trios (42 individuals)
• Probands: 7 M / 7 F

Experimental Design
• High-throughput sequencing
  • Exome capture (Agilent SureSelect 38 Mb)
  • Sequencing on GAIIx (one sample per lane)
• Bioinformatics analysis
  • Read mapping and storage: BWA and Samtools
  • SNP calling: Varscan
    • Low stringency for parents
    • High stringency for probands
  • Annotation: Annovar
  • Segregation analysis
  • Prioritization
• In total, 73 variants were kept for validation (Sanger sequencing)
Girard et al. Nat Gen (2011)

Technical challenge: the high number of false positives
De novo mutations are sporadic events seen in only one individual; they are usually mistaken for false positives.
[Figure: two bar charts, "Fraction of SNVs found in the 1000 Genomes Project" and "Fraction of SNVs with a coverage > 4x", each plotted against the number of individuals carrying the mutation (1 to 5)]
It is very important to set an appropriate threshold in order to restrict the number of candidate de novo mutations to validate.

Technical challenge: use of an appropriate control dataset
Because of technical errors (false negatives in parents), it is important to use an external control dataset.
[Figure: breakdown of candidate variants]
• Found in parent: 92%
• Additional control: 5%
• Low-quality variant: 3%
• False positive: 0.002%
• True DNM: 0.0003%

Systematic challenge: how to distinguish between a benign and a pathogenic de novo mutation
Once true de novo mutations are identified, many challenges remain, notably how to select which mutations are linked to disease.
Several approaches have been suggested:
• Establish a mutation prediction profile using amino acid changes and compare it against a neutral database (Vissers et al. Nat Gen 2010)
• Compare the mutations against a simulated profile built from control exomes (O'Roak et al. Nat Gen 2011)
• Compare the ratio of protein-truncating variants against a neutral database and a pathogenic database (Girard et al. Nat Gen 2011, based on Awadalla et al. AJHG 2010)
Additional approaches could include:
• Systems biology approach: network of genes harboring de novo mutations
• Additional screening of each gene harboring de novo mutations in a disease population
Girard et al. Nature Genetics 2011

The de novo mutation rate in SCZ

The DNM rate amongst SCZ patients
• Reason #1: the DNM rate
• DNMr = 1.1 × 10^-8
• To estimate our DNMr:
  • Cross-referenced regions from the Agilent probe sheet with the CCDS
  • ~31 Mb per individual
  • A total of 289 Mb screened in 14 individuals
• Using the standard DNMr, we would expect ~6.87 DNM
• SCZ cohort DNMr: 2.42 × 10^-8
• A binomial test indicates that the number of DNM observed in our study differs significantly from the expected number
  • p-value = 0.007736
  • CI 95% = 2.6427 × 10^-8 to 8.1103 × 10^-8
• Conclusion #1: the DNM rate is significantly higher in our cohort of SCZ patients

Why is this interesting?
• Reason #2: the number of nonsense variants
• 4 nonsense mutations among 14 total DNM
• A 4/14 ratio of NS to MS mutations is significantly higher than the expected ratio of 1/20, as calculated by Kryukov et al. (p-value = 0.004173 using a binomial test, CI 95% = 0.0838 to 0.5810)
• Amongst all mutations reported to cause Mendelian diseases (HGMD), the ratio of NS to MS mutations is roughly 1/4, which is not significantly different from the 4/14 ratio observed in our study
• Conclusion #2: the high number of NS mutations suggests that at least some of them are causative
[Figure: proportions of nonsense and missense mutations, observed (SCZ) versus expected (dbSNP)]

Validation is The Challenge
• Many genes will be identified; rapid methods are needed to flag those that are causative
• Screen more trios to find multiple de novo mutations in the same gene
• Genetic validation of the genes by sequencing additional cases (because the variants are rare, many cases must be sequenced)
• Bioinformatic analysis to identify pathways
• Biological validation of genes and pathways

Epic Quote
"In the past two years, we have sequenced thousands of human genomes. However, not a single one of those reaches the quality of the only one we did in 2005."
E. Eichler, Genome Informatics 2011

Acknowledgements
Université de Montréal
• Guy Rouleau
• Patrick Dion
• Julie Gauthier
• Anne Noreau
• Lan Xiong
• Alexandre Dionne-Laporte
• Dan Spiegelman
• Edouard Henrion, M.Sc.
• Ousmane Diallo
• Loubna Jouan
• Sirui Zhou
• Marie-Pierre Dubé
RQCHP (Quebec's High-Performance Computation group)
• Jonathan Ferland
• Suzanne Talon
INSERM
• Marie-Odile Krebs
Hong Kong
• Si Lok
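
Worked check of the quoted statistics
The figures given under Reason #1 and Reason #2 can be checked with a few lines of Python. This is a minimal sketch, not the authors' analysis code, and it rests on two assumptions that are not stated on the slides: the expected 1/20 ratio is treated as a per-mutation probability of 0.05 that a coding DNM is nonsense, tested with a one-sided exact binomial tail, and the cohort rate is computed per diploid genome, so the 289 Mb screened is counted twice. The function names are illustrative only.

```python
from math import comb


def binom_upper_tail(k: int, n: int, p: float) -> float:
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def per_base_rate(n_mutations: int, bases_screened: float, copies: int = 2) -> float:
    """Mutations per base per generation.

    copies=2 is an assumption: each screened base is present on two
    chromosomes in a diploid proband.
    """
    return n_mutations / (copies * bases_screened)


# Reason #2: 4 nonsense mutations among 14 DNM, expected proportion 1/20 (assumed p = 0.05)
p_nonsense = binom_upper_tail(4, 14, 1 / 20)

# Reason #1: 14 DNM over 289 Mb screened across the 14 probands
rate = per_base_rate(14, 289e6)

print(f"P(>= 4 nonsense of 14 | p = 0.05) = {p_nonsense:.6f}")  # 0.004173
print(f"observed DNM rate = {rate:.3g} per base per generation")  # 2.42e-08
```

Under these assumptions the output agrees with the p-value of 0.004173 and the cohort rate of 2.42 × 10^-8 quoted above. The sketch does not attempt to reproduce the expected count of ~6.87 or the p-value and confidence interval given for the DNM rate, since the slides do not say how those were derived.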