Interoperability of large scale image data sets from different biological scales
BioMedBridges Annual General Meeting/Mid-term Review
11 March 2014, Florence
Jan Ellenberg (WP Leader) and Tanja Ninkovic, on behalf of use case partners Gabriella Rustici, Simon Jupp, Frauke Neff and Johan Lundin
Collaborating BMS RIs

Scientific problem
Cell
✓ Cellular phenotype
✓ Genetic information
✓ Molecular mechanism
Mouse
✓ Tissue phenotype
✓ Genetic information
Human
✓ Tissue phenotype
[Figure labels: Genome, Imaging]
By linking these three different types of data sets, we can better understand diseases and predict novel drug targets and biomarkers.

[Diagram: Cell (gene knockdown) and Human (disease); make data interoperable; predict disease gene/biomarker]

Matching phenotypes in cells and tissues
[Figure panels: prometaphase, metaphase, anaphase, graped micronucleus; cell line – gene knockdown; human cancer tissue]
State of the art: finding a match by chance

To compare and integrate image data we need interoperable standards
[Diagram labels: Sample, Images, Assay]
• Different file formats: Volocity MVD2, Olympus OIB, Olympus OIF, JPEG, OME-TIFF, PNG, Zeiss LSM, Leica LIF, NDP, DeltaVision DV, HDF5
• No consistent phenotype annotation/ontology
• Different image metadata
Automated comparative analysis of image data sets was impossible

To compare and integrate image data we need interoperable standards
• Different file formats: inventory of image file formats; defined standard tools for interconversion
• Different image metadata: inventory of image metadata formats; defined standard tools for interconversion
• No consistent phenotype annotation/ontology: ?

What ontologies are already available?
• Cultured human cells: Gene Ontology, Cell cycle ontology, Cell line ontology, Cell ontology, Cell culture ontology, Phenotypic Quality Ontology, Mammalian Phenotype Ontology, Fission Yeast Phenotype Ontology, Human Phenotype Ontology
• Mouse histology tissue samples: Gene Ontology (BP), Cell ontology, Phenotypic Quality Ontology, Mammalian Phenotype Ontology, Mammalian pathology ontology (MPATH-Pathbase), Adult Mouse Anatomy Dictionary
• Human histology tissue samples: Human Phenotype Ontology, Terminologia Histologica, Terminologia Embryologica, Human Developmental Anatomy, International Classification of Diseases (ICD), SNOMED CT, BRENDA Tissue Ontology

Existing ontologies are not enough
• Existing ontologies either lack coverage or are too incomplete to describe cellular scale phenotypes
• There is no species neutral ontology for cellular phenotypes
• Such an ontology is needed for data interoperability
→ WP6 developed the Cellular Microscopy Phenotype Ontology (CMPO)

Building CMPO
Cellular phenotypes: entities, processes and qualities
• Cellular component: Gene Ontology – Cellular Component
• Cell types: Cell type ontology (CTO)
• Biological processes: Gene Ontology – Biological process
• Qualities (size, shapes, abnormal, temporal quality, absent): Phenotype and trait ontology (PATO)

Building CMPO
Composing a phenotype description
Entity (a bearer of some quality) + Quality (a characteristic of the entity)
Examples:
• Phenotype: “Large nucleus”
  • Entity: nucleus (GO_000xxxx)
  • Quality: large (PATO_000xxxx)
• Phenotype: “Cells stuck in metaphase due to metaphase arrest”
  • Entity: mitotic metaphase (GO_0000089)
  • Quality: arrested (PATO_0000297)

Cellular Microscopy Phenotype Ontology (CMPO)
• Species neutral ontology
• Relates to the whole cell, cellular components, cellular processes and cell populations
• Compatible with related ontology efforts (Fission Yeast Phenotype Ontology, Ascomycete Phenotype Ontology, Mammalian Phenotype Ontology), allowing for future cross species integration of phenotypic data
• Released in October 2013
• Can be browsed at the Ontology Lookup Service [1], BioPortal [2] and GitHub [3]
[1] http://www.ebi.ac.uk/ontology-lookup/browse.do?ontName=CMPO
[2] http://bioportal.bioontology.org/ontologies/CMPO?p=classes
[3] https://github.com/EBISPOT/CMPO

Enabling standardised data generation
Phenotator: user-friendly ontology annotation of image data
[Screenshot: original phenotypic description alongside ontology based annotations]
http://wwwdev.ebi.ac.uk/fgpt/phenotator/

[Diagram: Cell (gene knockdown) and Human (disease); integrate file formats; integrate metadata; apply phenotype ontology; both sides annotated with the CMPO term graped micronucleus (CMPO_0000156); predict disease gene/biomarkers]

Ontology building using Phenotator
[Diagram: Users 1 to 4, annotation tool, ontology terms]
1. Distribute tool to consortium members for phenotype annotation
2. Workshop on ontology development with WP6 partners
3. One to one sessions with data producers
Collect phenotype-ontology mappings provided by the users
Build the Cellular Microscopy Phenotype Ontology (CMPO)

Future plans
Phenotator: Automation
• Semi-automated mapping of cellular phenotypes to CMPO terms
[Figure: list of phenotypes provided by the user; Zooma [1] mappings to CMPO]
[1] http://www.ebi.ac.uk/fgpt/zooma

Future plans
CMPO: integration in existing applications
• Widgets [1] (in collaboration with WP4)
  • deployable in existing web applications
  • autocomplete search boxes
  • ontology terms are readily available in user-facing applications
• Integrating CMPO into FIMM’s Webmicroscope Portal [2] and EMBL’s CellBase.
Data producers can utilise ontologies for annotating their data sets already at the data production stage
[1] http://www.ebi.ac.uk/Tools/biojs/registry/
[2] http://biomedbridges.webmicroscope.net/

Future plans
Scientific Use Case: Correlative analysis and biomarkers prediction
Cellular image datasets [1], mouse image datasets [2], human image datasets [3]
Data hosted by EBI Cellular Phenotype Database / EMBL CellBase
Data hosted by Webmicroscope
• Annotate cellular, mouse and human image datasets using CMPO
• Correlative analysis of now interoperable cell and tissue image datasets to predict novel biomarker candidates
• Novel candidate biomarker prediction
Focus on cell cycle and cell division control genes
[1] Mitocheck, including genetic information; www.mitocheck.org
[2] Helmholtz’s mouse lines, cancer models by GMC, PREDECT, International Mouse Phenotyping Consortium
[3] Webmicroscope cancer tissue collection

Correlative analysis and biomarker prediction
Candidate biomarker genes from cellular tumor suppressor screens have been identified:
• MLL3
• PAPPA
• SF3B1
• PRPF8
• CENPE
• CIT
• ASPM
• ESPL1
• DYNC1H1
• ASCC3
• KIF4A
Mouse and human tissue: WP6 partners are looking for and/or generating the data relevant to these genes to be used for analysis.

Correlative analysis and biomarker prediction
Promising gene candidates from cellular screens: MLL3, PAPPA, SF3B1, PRPF8, CENPE, CIT, ASPM, ESPL1, DYNC1H1, ASCC3, KIF4A
[Figure: cell line ASPM knockdown, ASPM Mut mouse and ASPM WT mouse; panels labelled “polylobed nucleus” (CMPO_0000157)]
Mouse and human tissue: WP6 partners are looking for and/or generating the data relevant to these genes to be used for analysis.
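
The correlative step described above can be sketched in a few lines of Python, assuming every image data set has already been annotated with CMPO term IDs. This is only an illustration of the idea of matching cell and tissue phenotypes through a shared term, not the method used by WP6. The function name, the record layout, the sample name and the placeholder gene GENE_X are assumptions made for the example; only the two CMPO IDs and the ASPM gene name come from the slides.

```python
from collections import defaultdict


def shared_phenotype_candidates(cell_records, tissue_records):
    """Return {gene: sorted CMPO term IDs} for cellular knockdown phenotypes
    that were also annotated in at least one tissue sample."""
    # Every CMPO term seen in any tissue annotation
    tissue_terms = {term_id for _sample, term_id in tissue_records}
    candidates = defaultdict(set)
    for gene, term_id in cell_records:
        if term_id in tissue_terms:
            candidates[gene].add(term_id)
    return {gene: sorted(terms) for gene, terms in candidates.items()}


# Hypothetical records, for illustration only (not project data).
# Cell records: (knocked-down gene, CMPO term ID)
cell_records = [
    ("ASPM", "CMPO_0000157"),    # polylobed nucleus
    ("GENE_X", "CMPO_0000156"),  # graped micronucleus; GENE_X is a placeholder
]
# Tissue records: (sample name, CMPO term ID)
tissue_records = [
    ("mouse_sample_1", "CMPO_0000157"),  # hypothetical sample name
]

candidates = shared_phenotype_candidates(cell_records, tissue_records)
print(candidates)
```

Joining on the term ID rather than on free-text phenotype descriptions is what a shared ontology makes possible: two data sets only need to agree on the identifier, not on wording.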
Making large scale image data sets from different biological scales interoperable
01/2013 Start of WP6
03/2013 Identification of standards and ontologies used for cellular/mouse/human image data sets
12/2013 -> Inventory of image file formats and ontologies
        -> Defined future standards
        Mapping of standards and ontologies between the different image reference data sets
12/2014 -> Cellular Phenotype Ontology and annotation tool
01/2015 Set of predicted biomarkers
12/2015 -> Predict new biomarker genes
Deadline for the deliverable

BMS RI partners
Research infrastructures: Euro-BioImaging, Elixir, Infrafrontier, BBMRI
People: Jan Ellenberg, Tanja Ninkovic, Gabriella Rustici, Jean-Karim Heriche, Simon Jupp, Frauke Neff, Wolfgang Huber, Philipp Gormanns, Johan Lundin, Mikael Lundin

Acknowledgments
• WP6 partners
• James Malone, Tony Burdett and Helen Parkinson, EMBL-EBI
• In particular, we wish to thank:
  • Anna Melidoni, Ruth Lovering and Jennifer Rohn (UCL)
  • Beate Neumann and Jean Karim Heriche (EMBL)
  • Bob Van De Water (U. Leiden)
  • Bram Herpers (OcellO)
  • Claudia Lukas (U. Copenhagen)
  • Greg Pau (Genentech)
  • Sylvia Le Dévédec (LUMC)
  • Thomas Walter (Institut Curie)
  • Wies Roosmalen (U. Twente)
  • Zvi Kam (Weizmann Institute)

Thank you for your attention.