Post-Analytic Clinical Informatics for Molecular and Genomic Medicine
Federico A. Monzon M.D.
Pathology Informatics Summit 2014

Disclosures
• Employment
  − Cancer Genetics Laboratory, Baylor College of Medicine
    o Up to Aug 2013
  − Invitae Corporation
    o Since Sept 2013
• Advisory Board
  − Complete Genomics (April 2012 – June 2013)

Agenda
• What goes into interpretation of NGS data?
• Information challenges for interpretation of NGS results
• Challenges and opportunities for reporting and delivery of NGS results

Pathologists
https://www.invitae.com/en/news/2013/07/03/invitae-blog-jill-hagenkord-are-physicians-ready/

You can ride the Genetics/Genomics wave!!! And Informatics is your surf board!
http://photofunnypicture.com/hd-nature-desktop-wallpapers/surfing-group-wallpaper-hd/

BCM Mercury pipeline for Cancer Exomes
[Figure: pipeline diagram; label: Variant call format (.vcf)]
Jeffrey Reid

Review and Interpretation Pipeline
[Figure: pipeline diagram; labels: QC, Interpret and Rank; Confirmation: Sanger seq, AmpliSeq, CAST PCR; Report]
Jeffrey Reid

Tracking test/analysis handoffs
• Excel, email, etc…
• Temporary solution for low volume but is NOT scalable
Paul Lurix

Bird's Eye View of NGS Data Workflow
Reece Hart

Interpretation
• Determine the significance of the mutation in the context of:
  − Quality of sequencing data
  − Frequency of variant call
  − Patient’s phenotype
  − Family history
  − Mutated gene(s) (assoc with disease/phenotype)
  − Type of mutation (silent vs nonsense/missense/splicing)
  − Mutation previously reported in disease/tumor
  − Location of mutation in the protein
• Automated vs manual review/curation approaches
• Apply interpretation guidelines (ACMG/CAP/AMP)

Variant Call Format (VCF) File
• Stores variation information for one or more samples based on genomic location
• Important! Version of the genome reference is essential. Currently GRCh37 or hg19 (UCSC) for clinical use, but GRCh38/hg38 is now available

External and Internal Databases
6/13/2014 | Copyright © InVitae, Inc.
All Rights Reserved | CONFIDENTIAL

Annotation databases (examples)
• dbSNP: reported yes/no, clinical significance – broad / less specific
• HGMD: focused on association with inherited disease
• COSMIC: focused on association with cancer (mostly somatic)
• BIC: focused on breast cancer genes
• My Cancer Genome: focused on therapeutic significance
All of these have caveats!!!

Annotated VCF
• Variant frequency and quality, prediction algorithms
• Frequency and association with disease: dbSNP, HGMD, OMIM, etc.
• Other external annotations: COSMIC, ClinVar, etc.
• Internal annotations: Seen before? What did we call it?

How to present the information for interpretation?

Cancer Exome Dashboard at BCM

Variant Review and Reporting @ Invitae

Alamut

Integrative Genomics Viewer (IGV)
http://www.broadinstitute.org/igv/
Commercial software available

Things to consider
• Normal and Tumor for cancer applications
• Quality Metrics are important
• Calling mutation
  − Which transcript to use?

Mapping & Alignment Issues in a Nutshell
➊ Gap – Transcript ≠ Reference
➋ InDel – downstream coordinates shifted
➌ Exon coordinate discrepancies across sources (NCBI, UCSC)
➍ Historical transcripts no longer available
Reece Hart

Example: rs11340767 (RND3)
Source    AC            Reference     Exons
EUtils    NM_005168.3   GRCh37.p10    1146 / 125 / 320 / 1998
          NM_005168.4   NG_008492.1   1398 / 125 / 320 / 1998
seqgene   NM_005168.3   GRCh37.p10    102 / 1046 / 125 / 321 / 143 / 1855
UCSC      NM_005168.4   hg19          1398 / 135 / 244 / 76 / 1997
Reece Hart

Example: PECAM1 Coordinate Discrepancy
• UCSC and NCBI coordinates for exon 1 of PECAM1 differ by 2.5kb
• Ensembl says the whole gene is in a patch 16kb away
Reece Hart

Shared tools to deal with some of these issues

Universal Transcript Archive
An archive of all (recent) versions of transcripts, from multiple sources, and multiple alignment methods.
http://bitbucket.org/invitae/uta/

HGVS Parser, Mapper, Validator, Formatter
Python tools for manipulating HGVS, including mapping between transcripts and reference, inferring protein consequence, and lifting over between transcripts.
http://bitbucket.org/invitae/hgvs/
Reece Hart

Interpretation Issues
• Limited evidence on the clinical utility of a specific mutation or gene
  − Genes with mild/moderate penetrance: magnitude of risk? What to do?
  − Few mutations are listed in consensus management guidelines (NCCN, ASCO, etc.)
  − Few institutional and national efforts to gather and curate evidence (e.g. www.mycancergenome.org)
• Clinical significance of well-studied mutations in a different tumor type
  − BRAF V600E mutation – key in metastatic melanoma. Significance in breast cancer?
  − Requires pathologists to research the evidence of clinical utility in order to issue a clinically relevant interpretation.
• Clinical significance for novel mutations in targetable genes
  − Unreported mutation in EGFR
  − Do they confer susceptibility to Gefitinib/Erlotinib?

Limitations of Current LIS/EMR structure
Assay Information
• LISs are not ready to deal with genomic information.
  − Single analyte vs. multiparametric assays
  − Performance characteristics of the assay.
• For example, for a specific reported mutation one should store: sequence information, sequencing depth and quality, location, genome build, gene transcript evaluated, technology used, etc.
• For a negative result, one should store: sequencing depth/quality, regions interrogated (or not)

Limitations of Current LIS/EMR structure
Reporting
• Our information communication standards do not support data formatting or metadata (data associated with the result), so we need to “dumb down” the result into text files and/or tables in order to report it.
• This limits the ability to convey information in a graphical or interactive manner and to provide access to additional information about the assay or the clinical relevance of the result.
• As it currently stands, text reporting of NGS-based assay results is suboptimal.

Report Example – Leukemia Panel
• What did you find?
• What does this mean for my patient?
• What can I do next?
• What is the evidence?

NGS Reporting Wish List
• Ability to deliver report metadata
  − What was covered and how well?
    o Assay design information, historical performance, individual run performance
  − What was missed?
    o Was gene XYZ adequately covered?
  − What is the evidence for the interpretation?
    o What are the sources? How many patients studied? Similar patients to mine? What were the outcome measures?
  − Is there new information about a reported change?
    o Access to “just in time” information
• HIS that can handle molecular and genomic data!

Is my gene of interest covered? How well?

Approaches with the current limitations of LIS/EMR structure
• Summarize key aspects of the results and provide high-level background and “on request” access to more detailed information
  − Short concise report with key findings (as in the example before)
  − Feedback from oncologists indicates this is preferred
• Report all available and potentially useful information
  − Long report with extensive graphical and bibliographic information
• Do something new
  − The eReport

BCM iPad Exome Reporting app

Integration to Pathology Report
• In Diagnostic section, Comment or Addendum
  − Does it change tumor classification?
  − Does it impact clinical management?
• Currently at Texas Children’s Hospital the Cancer Exome Report is a stand-alone report in the EMR
  − Pathologists may or may not issue a comment/addendum
• How to best fit this information into pathology reports?
• AP-LISs will need modifications to accommodate reports from NGS data

Raw Data Release
• Controversial point
• Patient owns their genome?
• Can physicians have access to all raw data regardless of their capacity to analyze and interpret?
• Clash of models:
  – Patient only has access to interpreted results through physician (physician knows best)
  – Patient owns their data and can decide on how best to use it (patient knows best)
• Ethical, social and legal implications

In Closing
• Challenges in bioinformatics pipeline for consistent sequence re-alignments and cross-check of T/N sequences
• Challenges in the access of relevant information for gene/mutation
• Challenges in the amount of clinical evidence for interpretation
• Challenges in our understanding of biology to make sense of all observed variation.
• Commercial and open source tools
  − Most have been developed for research use
  − Increasing number of tools suitable for clinical use
• These are not new challenges; we encounter these issues in all “genomic” technologies (panels, arrays, etc.)
• These are actually opportunities to improve our management of genomic data

Conclusions
• Interpretation and reporting of NGS data requires the evaluation and availability of different sources of information
  − This information needs to be available to the clinical care team with informatics tools that conform to the clinical environment
  − Need to develop tools that provide the geneticist/pathologist with the information needed to provide adequate interpretation of genetic variants
• Web enabled reporting technologies are a potential solution to enable graphical and interactive display of NGS results
  − Implementation of these solutions in a HIPAA compliant manner
  − Interactive reports can present the information in different ways to members of the healthcare team and patient
  − Links to internal and external information sources that allow members of the healthcare team to further explore the results and the evidence used to guide the interpretation
• As adoption of clinical sequencing kicks into high gear, molecular diagnosticians will be faced with managing
genomic information and producing high-content reports that support clinical decision making

Acknowledgements
• Senior Leadership – Arthur Beaudet, Richard Gibbs, Jim Lupski, Sharon Plon, David Wheeler, Kent Osborne, Martha Mims, Jim Versalovic, Tom Wheeler
• BASIC3 Leadership – Sharon Plon, Will Parsons, Amy McGuire...
• CGL, TCH Pathology – Marilyn Li, Liu Liu, Angshumoy Roy, Lola Lopez-Terrada…
• Whole Genome Laboratory / HGSC / MGL – Yaping Yang, Donna Muzny, Christine Eng, Jeff Reid, Matthew Bainbridge, Peter Pham, Doreen Ng…

InVitae Team
CROP Team: Jill Hagenkord, Scott Topper, Jon Sorenson, Tim Chiu, Steven Ciraolo, Geoff Nilsen, David Pirkle, and Emily Hare

Questions?
invitae.com
[email protected]
415.374.7782
Federico A. Monzon M.D., FCAP
[email protected]
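
Supplementary sketch: structured result metadata
The "Limitations of Current LIS/EMR structure" slides list what should be stored with a reported mutation (sequence information, depth and quality, location, genome build, transcript, technology) and with a negative result (depth/quality, regions interrogated or not). The Python sketch below shows one possible way to hold those items as structured records instead of free text. It is an illustration only: the class names, field names, and the `poorly_covered` helper are hypothetical and do not come from the talk or from any LIS product.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ReportedVariant:
    """Items to keep with one reported mutation (hypothetical field names)."""
    gene: str
    transcript: str       # transcript accession, including its version
    hgvs_c: str           # coding-level description on that transcript
    genome_build: str     # reference version, e.g. "GRCh37"
    chrom: str
    pos: int              # position on the stated genome build
    ref: str
    alt: str
    depth: int            # reads covering the position
    alt_fraction: float   # fraction of reads supporting the variant
    quality: float        # quality score reported by the variant caller
    technology: str       # sequencing platform or assay name


@dataclass
class RegionCoverage:
    """Coverage summary for one interrogated region; qualifies a negative result."""
    gene: str
    region: str
    mean_depth: float
    adequately_covered: bool


def poorly_covered(regions: List[RegionCoverage]) -> List[str]:
    """List the regions that failed the coverage check ("What was missed?")."""
    return [f"{r.gene} {r.region}" for r in regions if not r.adequately_covered]
```

With records like these, a question from the reporting wish list such as "Was gene XYZ adequately covered?" becomes a lookup over stored coverage data rather than a sentence buried in a text report.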