Presented by: Andrew McMurry
Boston University Bioinformatics
Children’s Hospital Informatics Program
Harvard Medical School Center for BioMedical Informatics

This presentation is available at: http://pixelshelf.com/~justandy/f-snp.ppt

Outline
• Incidental Findings and Disconnected Patient Cohorts
• Disease Association Studies Using SNPs
• How SNPs cause disease
• Computationally predict the effect of SNPs within introns, exons, and regulatory regions
• The Future Is Now: SNPs, Personalized Medicine, and Translational Research

Incidental Findings and Disconnected Patient Cohorts
• IF the central dogma of biology is “From DNA -> RNA -> Protein”, THEN where is the patient data for association studies?
• Very little patient data spans DNA/RNA/protein/phenotype across a single cohort
• Need to obtain “robust” sample sizes to avoid incidental findings due to multiple testing [1]

[1] Isaac Kohane, Daniel Masys, and Russ Altman. "The Incidentalome: A Threat to Genomic Medicine." JAMA 296(2): 212-215. July 12, 2006.

Disease Association Studies Using SNPs
• DNA sequencing technologies are still very expensive
  • Stunningly few patients
  • Minimal sequence coverage
  • Could change in time with Solexa/454
  • Even with Solexa/454, piecing together the results is a massive task (the maximum sequence read is often shorter than a single repeated gene)
• Rate-limiting step: adoption rate of DNA sequencing
• Use what is available in abundance! SNP chips
  • Abundance of SNP chips in public repositories on many diseases
  • Whole genome coverage
  • 500k SNPs for $250

Disease Association Studies Using SNPs: DNA to RNA to Protein
• Associating DNA & RNA
  • GEO alone holds well over 100k gene expression arrays
  • What if we could correlate SNPs with their effect on gene expression?
• Associating DNA & gene product (protein)
  • Countless public protein databases
  • What if we could correlate SNPs with their effect on protein coding?
• Association studies involving multiple genomic measurements
  • What are the existing studies and models (HMMs/Bayes nets) that could be strengthened with evidence from SNP chips?

How SNPs cause disease
• Protein Coding
  • Synonymous: same amino acid
  • Non-synonymous (missense): different amino acid
  • Nonsense: premature STOP
• Splicing Regulation
  • Incorrect final mRNA transcript
• Transcriptional Regulation
  • Differential gene expression
• Post Translational
  • Protein phosphorylation
• Intron
  • Likely no effect

So how do we measure all these effects of SNPs?

F-SNP: integrated approach
1. Classify SNP site using dbSNP
  • Intron
  • Coding Region
  • Splice Site
  • TF Binding Site
  • Post-Translational Site
2. Evaluate using the specialized algorithms/dbs
  • Coding Region (missense/nonsense mutations)
  • Splice Site (intronic/exonic sites)
  • TF Binding Site (promoter/repressor/etc.)
  • Post-Translational Site (Phospho/Tyrosine/O-glycosylation)
3. “Majority Vote” across algorithms

F-SNP decision procedure for functional SNPs

F-SNP: User Interfaces & Data Download
• Public web site
• Federated query = entire database cannot be downloaded
• Currently:
  • No SOAP (web service) support
  • No RSS support
  • No source code available
• However: the paper gives explicit instructions on how to reproduce the algorithm and construct the database using dbSNP, OMIM, etc.
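
Step 3 of the integrated approach, the “Majority Vote” across algorithms, can be illustrated with a minimal sketch. It assumes each tool returns one label per SNP and that a label wins only with a strict majority; the function name, tool names, and labels are hypothetical, and the published F-SNP decision procedure may combine predictions differently.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label chosen by more than half of the tools, else None.

    `predictions` maps a tool name to its predicted label. The tool names
    and labels are hypothetical; this is not the published F-SNP procedure.
    """
    if not predictions:
        return None
    # most_common(1) gives the single most frequent label and its count.
    label, count = Counter(predictions.values()).most_common(1)[0]
    # Require a strict majority so that ties yield no decision.
    return label if count > len(predictions) / 2 else None


# Hypothetical votes for one SNP in the protein-coding category.
votes = {"tool_a": "benign", "tool_b": "benign", "tool_c": "deleterious"}
print(majority_vote(votes))
```

Returning None on a tie keeps the sketch from silently favoring whichever tool happens to be listed first.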
“Large N Study” using F-SNP

Functional Category          # of Assessed SNPs   # of Functional SNPs
Protein Coding               154,140              66,899
Splicing Regulation          73,051               8,075
Transcriptional Regulation   453,710              78,296
Post Translation             64,736               4,477
Total                        559,322              115,356

Evaluate Individual SNP (rs28897699)
• SNP summary and functional predictions

SNP Primary Information (rs28897699)
• Locus
• Alleles
• Ancestral Allele
• Validation (if any)
• Region
• Link to References

F-SNP: Functional Predictions

F-SNP Prediction Detail: PolyPhen = benign effect on protein coding

F-SNP Prediction Detail: SNPs3D = deleterious to protein coding

NCBI Gene Information
Product: breast cancer 1, early onset
Other names: BRCA1, BRCAI, BRCC1, IRIS, PSCP, RNF53
NCBI Entrez Gene Summary: This gene encodes a nuclear phosphoprotein that plays a role in maintaining genomic stability and acts as a tumor suppressor. (…) Mutations in this gene are responsible for approximately 40% of inherited breast cancers and more than 80% of inherited breast and ovarian cancers. Alternative splicing plays a role in modulating the subcellular localization and physiological function of this gene. Many alternatively spliced transcript variants have been described for this gene but only some have had their full-length natures identified.
(…)

F-SNP functional prediction
• On Protein Coding: 2 votes benign, 1 deleterious, 1 nonsynonymous
• On Splicing Regulation: predicted functional impact (by majority vote)

Gene level view of BRCA1
• Query by gene name = “BRCA1”
• Returns list of SNPs in BRCA1
• Returns list of cancers associated with BRCA1

Gene level view of BRCA1
• Our SNP has functional impact
• Our SNP has neighboring functional SNPs

Disease Level View: Breast Cancer
• Show all disease genes associated with breast cancer
• Denote if SNPs are present in those genes (5k up/downstream)

Recap of Disease Level View

The Future Is Now: SNPs, Personalized Medicine, and Translational Research
• SNP profiling becoming part of routine care [2]
• Increase in # of clinically annotated SNP chips
• Increase in # of disease association studies using SNPs
• Increase in NIH focus on “translational research” that bridges routine care delivery with research efforts
• Genome Wide Association Studies (GWAS) that actually get funded

[2] Kohane IS, Mandl KD, Taylor PL, Holm IA, Nigrin DJ, Kunkel LM. "Medicine. Reestablishing the researcher-patient compact." Science. 2007 Nov 16;318(5853):1068.

F-SNP Summary

Incidental Findings and Disconnected Patient Cohorts
• Central dogma of biology is DNA -> RNA -> Protein, yet we lack a cohort that spans all measurements
• Using a limited sample size will inevitably lead to incidental outcomes

Disease Association Studies Using SNPs
• Don’t wait for DNA sequencing to become widespread
• SNPs are becoming an abundant resource and are not going to disappear

How SNPs cause disease
• Protein Coding
• Splicing Regulation
• Transcription Regulation
• Post Translation

Computationally predict the effect of SNPs within introns, exons, and regulatory regions
• Multitude of existing SNP analysis tools and resources
• F-SNP provides a single web-based resource to mine SNP disease associations
• Query and analysis by SNP, Gene, Disease

The role of SNPs in Personalized Medicine and Translational Research
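
The summary's warning about incidental outcomes from multiple testing can be made concrete with simple arithmetic. The sketch below assumes a chip of 500,000 SNPs (the figure quoted in the slides), each tested once, with no true associations, at a per-test significance level of 0.05, which is an assumed conventional value. The Bonferroni correction shown is a standard textbook adjustment, not something the slides prescribe, and the function names are illustrative.

```python
def expected_false_positives(n_tests, alpha):
    """Expected number of false positives if every null hypothesis is true."""
    return n_tests * alpha


def bonferroni_threshold(n_tests, alpha):
    """Per-test p-value cutoff that keeps the family-wise error rate <= alpha."""
    return alpha / n_tests


n_snps = 500_000  # chip size quoted in the slides
alpha = 0.05      # assumed conventional significance level

print(expected_false_positives(n_snps, alpha))
print(bonferroni_threshold(n_snps, alpha))
```

Under these assumptions roughly 25,000 SNPs would look significant by chance alone, which is why a small cohort combined with a whole-genome chip tends to produce incidental findings.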