National Cancer Institute, National Institutes of Health, U.S. Department of Health and Human Services

Importance of Semantics in Precision Oncology at NCI
Sherri de Coronado, MS, MBA
NCI CBIIT
May 15, 2015

Mind Map of Precision Oncology Space (May 12, 2015)
[Figure: mind map; surviving labels: Reusable, BD2K, Semantics Related Opportunities]

New Genomic Data Sharing Policy
• The new Genomic Data Sharing (GDS) Policy was released in draft form in September 2013 (NOT-OD-13-119)
• The draft policy was published in the Federal Register for a 60-day public comment period
• November 2013: public comments collected by the Office of Science Policy. Policy modified with feedback from the IC Directors and the NIH GWAS data sharing governance committees (TSDS, PPDM, SOC)
• The final Genomic Data Sharing (GDS) Policy was released August 27, 2014 (NOT-OD-14-124)

Trans-NCI Data Sharing WG
• Responsible for the activities necessary for the Institute to implement and maintain the GDS policy framework
  – Develop a plan and recommend any resources needed
  – Propose governance needs
  – Develop and disseminate materials for implementation
• Focus Areas
  – Data Standards: Define baseline expectations, including data types and timelines.
  – Process: Develop processes and resources to facilitate implementation and compliance.
  – Resources: Consider all resource needs to implement and oversee policy expectations.
  – Governance: Consider governance needs and procedures for adjudication of implementation issues, and oversight.

Extending Genomic Data Sharing Policies
• Scope
  – GWAS Policy: Applies to human GWAS data
  – GDS Policy: Applies to all genomic data types, human and non-human
• Consent Standard, Existing* Collections (*before the effective date of the GDS policy)
  – GWAS Policy: If research consent, IRB reviews for consistency. If no research consent exists, data may still be submitted to NIH databases.
  – GDS Policy: Same
• Consent Standard, Future* Collections (*after the effective date of the GDS policy)
  – GWAS Policy: N/A
  – GDS Policy: Samples or cell lines should be consented for research use and broad data sharing. Exceptions can be requested.
• Data Submission
  – GWAS Policy: Data submitted as soon as quality control procedures are completed
  – GDS Policy: Timelines vary by data type, but generally as soon as quality control procedures are complete
• Data Release
  – GWAS Policy: Immediate data release. 12-month publication embargo
  – GDS Policy: 6-month deferral of data release. No publication embargo

New NCI MATCH Trial
"Precision Medicine uses genetic information from a person's cancer to determine a patient's treatment with a treatment targeted to that particular genetic abnormality."

NCI MATCH trial
• Question: Can molecular markers predict response to targeted therapies in patients with advanced cancer resistant to standard treatment?
• Biopsies from tumors from up to 3,000 patients to undergo DNA/RNA extraction; assay workflow to identify actionable mutations.
• ECOG-ACRIN leading the study with NCI; multiple arms, each matching a particular molecular profile to specific available drugs.
• Objectives: Assess response and time to progression based on tumor profile, regardless of tumor origin.

TCGA History
• About three years after the Human Genome Project: large-scale tumor profiling in a systematic way.
• Initiated in 2005; pilots in 2006; extended in 2009
• Collaboration of NHGRI and NCI to examine GBM, lung, and ovarian cancer using genomic techniques in 2006.
• Expanded to 20+ tumor types

TCGA Drivers
• Provide high-quality reference sets for 20+ tissue types
• Provide a platform for systems biology and hypothesis generation
• Provide a test bed for understanding the real-world implications of consent and data access policies on genomic and clinical data.
• Data collection is now over, but there are MANY users and many pan-cancer and other papers (>2700).
• The kinds of questions we want to ask and CAN ask have changed and grown.

Genomic Data Commons (GDC)
• In transition from The Cancer Genome Atlas (TCGA) to the GDC, a Commons to host TCGA, TARGET, and other future genomic data sets
• University of Chicago and NCI are collaborating to initiate the Genomic Data Commons (GDC) (Robert Grossman, Dir.)
• To enable any researcher to test their ideas and to bring their analytics to the data.

NCI Cancer Genomics Data Commons
[Figure; surviving labels: genomic + clinical data, GDC, cancer information donor, NCI Genomics Data Commons]

NCI Genomic Data Commons
• Unified repository for cancer genomics data
  – Accept data from both the NCI Center for Cancer Genomics (CCG) and external projects
  – Including submissions from small laboratories
• Unifying repository for cancer genomics data
  – Perform reproducible, consistent bioinformatics pipelines to generate standard higher-level data (e.g., tumor variant calls)
  – Pipelines designed and updated with community input to represent the best practices of the field
• The availability of genomic data will make it possible for researchers to better classify disease.

GDC Context
[Figure; from Mark Jensen, GDC]

GDC ConOps
[Figure; from Mark Jensen, GDC]

Clinical Data at GDC
• Key issues:
  – Low barriers to data submission
    • Minimal number of required data elements
  – Ongoing curation and semantic assignment
    • Balance acceptance of submitter-provided semantic information with GDC curation
  – Provide cross-project searches over clinical data elements to filter genomic data
    • Allow users to acquire data intuitively, but also provide semantic sources and IDs as available
• Ideal:
  – Expose clinical data intuitively, but manage it with rigorous semantic information

Cancer Genome Cloud Pilots
Three pilots, initiated Fall 2014, to be public "cancer knowledge clouds" in which data repositories would be co-located with advanced computing resources.
• Broad Institute, UCSC, UC Berkeley
• ISB-led team, Google, SRA
• Seven Bridges Genomics
• Begin piloting components and gathering feedback required by Jan 2016

Cancer Genome Cloud Pilots
• Goals:
  – democratize access to large-scale data repositories and computational infrastructure
  – co-locate data and compute to minimize unnecessary data transfer
  – integrate public and private datasets
  – allow web-based exploration of hosted data
  – transform and accelerate collaborative cancer research

Cancer Genome Cloud Pilots
• People can register at any or all of these sites if they are interested in getting involved:
  – Seven Bridges: cancergenomicscloud.org
  – Broad: Firecloud.org
  – Institute for Systems Biology: cgc.systemsbiology.net

Precision Medicine Opportunities involve Semantics
"The era of precision medicine and precision oncology is predicated on the integration of research, care, and molecular medicine and the availability of data for modeling, risk analysis, and optimal care." (Warren Kibbe)
"The promise of precision medicine will only be fully realized if the research community can adapt its clinical trials methodology to study molecularly characterized tumors instead of the traditional histologic classification." (Abrams et al., National Cancer Institute's Precision Medicine Initiatives for the New National Clinical Trials Network, 2014)

Semantic Opportunities: Heard from this meeting and beyond
• Imaging
  – Pathology imaging ontology gaps: terms and formal definitions to characterize histopathology images and algorithms.
  – NLP effort to automate image annotation with ontologies, creating metadata for large image collections by training classifiers.
  – QHIO: terms/relationships for the whole lifecycle of images
• Proteomics (Chris Kinsinger, CPTAC)
  – Better clinical biospecimen annotation
• Cancer Phenotypes
  – Cohorts / finding patients
  – Cancer Pathology Protocol changes
• Modeling tumor microenvironments
  – Integration of multiscale cancer data
  – Effort to model cancer state as an ecological problem
• Cancer classification
• Data Needs vs. Ontological Classification
• Pan-cancer analyses can be improved using DO (Hive)

Semantic Opportunities (2): Heard from this meeting and beyond
• Tools / Resources / Standards
  – Getting usable, effective, efficient software into people's hands will increase uptake of semantically well-described metadata, terms, and ontologies, and improve integration of metadata and terminology
  – Integrated use of a variety of ontologies
  – Ways to manage research and clinical data streams, bridge
  – Tools to help harmonize and use metadata and terminology
  – Provenance: use of checklists early on; bottom up
  – Research Commons

Thank you
Sherri de Coronado
Thanks to content contributors: Gilberto Fragoso, Mark Jensen, Warren Kibbe, Juli Klemm, Elizabeth Gillanders, and others.