Interoperability of large scale image data sets from different biological scales
BioMedBridges Annual General Meeting/Mid-term Review
11 March 2014, Florence
Jan Ellenberg (WP Leader) and Tanja Ninkovic
on behalf of use case partners
Gabriella Rustici
Simon Jupp
Frauke Neff
Johan Lundin
Collaborating BMS RIs
Scientific problem
Cell
✓ Cellular phenotype
✓ Genetic information
✓ Molecular mechanism
Mouse
✓ Tissue phenotype
✓ Genetic information
Human
✓ Tissue phenotype
[Figure labels: Genome; Imaging (cell, mouse and human)]
By linking these three different types of data sets, we can better
understand diseases, predict novel drug targets and biomarkers
Cell: gene knockdown
Human: disease
• Make data interoperable
• Predict disease gene/biomarker
Matching phenotypes in cells and tissues
[Figure: cell line (gene knockdown) and human cancer tissue images. Panel labels: Prometaphase, Metaphase, Anaphase, Graped micronucleus]
State of the art: finding a match by chance
To compare and integrate image data we need interoperable standards
[Figure labels: Sample, Assay, Images]
• Different file formats: Volocity MVD2, Olympus OIB, Olympus OIF, JPEG, OME-TIFF, PNG, Zeiss LSM, Leica LIF, NDP, DeltaVision DV, HDF5
• Different image metadata
• No consistent phenotype annotation/ontology
Automated comparative analysis of image data sets was impossible
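As a rough illustration of what an inventory of image file formats could look like in practice, the sketch below counts files per format based on their file name suffix. The suffix-to-format table is an assumption based on common naming conventions (for example `.ndpi` for NDP), not something stated on the slides, and `detect_format` and `inventory` are hypothetical helper names.

```python
from collections import Counter

# Assumed mapping from file suffix to the formats named on the slide.
# The suffixes follow common conventions and are not taken from the slides.
SUFFIX_TO_FORMAT = {
    ".mvd2": "Volocity MVD2",
    ".oib": "Olympus OIB",
    ".oif": "Olympus OIF",
    ".jpg": "JPEG",
    ".jpeg": "JPEG",
    ".ome.tif": "OME-TIFF",
    ".ome.tiff": "OME-TIFF",
    ".png": "PNG",
    ".lsm": "Zeiss LSM",
    ".lif": "Leica LIF",
    ".ndpi": "NDP",
    ".dv": "DeltaVision DV",
    ".h5": "HDF5",
    ".hdf5": "HDF5",
}


def detect_format(filename: str) -> str:
    """Return the format name for a file name, or "unknown"."""
    name = filename.lower()
    # Try longer suffixes first so ".ome.tif" is preferred over shorter ones.
    for suffix in sorted(SUFFIX_TO_FORMAT, key=len, reverse=True):
        if name.endswith(suffix):
            return SUFFIX_TO_FORMAT[suffix]
    return "unknown"


def inventory(filenames):
    """Count how many files of each format appear in a list of file names."""
    return Counter(detect_format(name) for name in filenames)
```

Calling `inventory` on a directory listing gives a per-format count, which is one simple way to see which interconversion tools a data set would need.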
To compare and integrate image data we need interoperable standards
Images
• Different file formats (Volocity MVD2, Olympus OIB, Olympus OIF, JPEG, OME-TIFF, PNG, Zeiss LSM, Leica LIF, NDP, DeltaVision DV, HDF5)
  • Inventory of image file formats
  • Defined standard tools for interconversion
• Different image metadata
  • Inventory of image metadata formats
  • Defined standard tools for interconversion
• No consistent phenotype annotation/ontology: ?
What ontologies are already available?
Cultured human cells: Gene Ontology, Cell cycle ontology, Cell line ontology, Cell ontology, Cell culture ontology, Phenotypic Quality Ontology, Mammalian Phenotype Ontology, Fission Yeast Phenotype Ontology, Human Phenotype Ontology
Mouse histology (tissue samples): Gene Ontology (BP), Cell ontology, Phenotypic Quality Ontology, Mammalian Phenotype Ontology, Mammalian pathology ontology (MPATH-Pathbase), Adult Mouse Anatomy Dictionary
Human histology (tissue samples): Human Phenotype Ontology, Terminologia Histologica, Terminologia Embryologica, Human Developmental Anatomy, International Classification of Diseases (ICD), SNOMED CT, BRENDA Tissue Ontology
Existing ontologies are not enough
• Existing ontologies either lack coverage or are too incomplete to describe cellular scale phenotypes
• No species neutral ontology for cellular phenotypes
• Such an ontology is needed for data interoperability
-> WP6 developed the Cellular Microscopy Phenotype Ontology (CMPO)
Building CMPO
Cellular phenotypes: entities, processes and qualities
• Cellular component: Gene Ontology – Cellular Component
• Cell types: Cell type ontology (CTO)
• Biological processes: Gene Ontology – Biological process
• Qualities (size, shapes, abnormal, absent, temporal quality): Phenotype and trait ontology (PATO)
Building CMPO
Composing a phenotype description
Entity (a bearer of some quality) + Quality (characteristic of the entity)
Examples:
• Phenotype: “Large nucleus”
  • Entity: nucleus (GO_000xxxx)
  • Quality: large (PATO_000xxxx)
• Phenotype: “Cells stuck in metaphase due to metaphase arrest”
  • Entity: mitotic metaphase (GO_0000089)
  • Quality: arrested (PATO_0000297)
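The entity + quality pattern can be sketched as a small data structure. This is only an illustration of the composition idea, not how CMPO itself is encoded; the class names `Term` and `Phenotype` and the method `eq_statement` are hypothetical, while the identifiers are the ones given in the metaphase arrest example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    """An ontology term: an identifier plus a human-readable label."""
    term_id: str
    label: str


@dataclass(frozen=True)
class Phenotype:
    """A phenotype composed of an entity and a quality borne by that entity."""
    description: str
    entity: Term
    quality: Term

    def eq_statement(self) -> str:
        # Render the composition as "entity (ID) + quality (ID)".
        return (
            f"{self.entity.label} ({self.entity.term_id}) + "
            f"{self.quality.label} ({self.quality.term_id})"
        )


# The metaphase arrest example, using the identifiers given in the example above.
metaphase_arrest = Phenotype(
    description="Cells stuck in metaphase due to metaphase arrest",
    entity=Term("GO_0000089", "mitotic metaphase"),
    quality=Term("PATO_0000297", "arrested"),
)

print(metaphase_arrest.eq_statement())
```

Keeping the free-text description next to the composed terms preserves what the data producer originally wrote while still giving a machine-comparable form.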
Cellular Microscopy Phenotype Ontology (CMPO)
• Species neutral ontology
• Relating to the whole cell, cellular components, cellular processes and cell populations
• Compatible with related ontology efforts (Fission Yeast Phenotype Ontology, Ascomycete Phenotype Ontology, Mammalian Phenotype Ontology) allowing for future cross species integration of phenotypic data
• Released in October 2013
• Can be browsed at: the Ontology Lookup Service [1], BioPortal [2] and GitHub [3]
[1] http://www.ebi.ac.uk/ontology-lookup/browse.do?ontName=CMPO
[2] http://bioportal.bioontology.org/ontologies/CMPO?p=classes
[3] https://github.com/EBISPOT/CMPO
Enabling standardised data generation
Phenotator: user-friendly ontology annotation of image data
[Screenshot: original phenotypic description -> ontology based annotations]
http://wwwdev.ebi.ac.uk/fgpt/phenotator/
Cell: gene knockdown
Human: disease
• Integrate file formats
• Integrate metadata
• Apply phenotype ontology: CMPO term “graped micronucleus” (CMPO_0000156) for both cell and human images
• Predict disease gene/biomarkers
Ontology building using Phenotator
[Diagram: Users 1 to 4 -> annotation tool -> ontology terms]
1.  Distribute tool to consortium members for phenotype annotation
2.  Workshop on ontology development with WP6 partners
3.  One to one sessions with data producers
Collect phenotype-ontology mappings provided by the users
Build the Cellular Microscopy Phenotype Ontology (CMPO)
Future plans Phenotator: Automation
• Semi-automated mapping of cellular phenotypes to CMPO terms [1]
[Diagram: list of phenotypes provided by the user -> Zooma mappings to CMPO]
[1] http://www.ebi.ac.uk/fgpt/zooma
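To make the idea of semi-automated mapping concrete, the sketch below suggests candidate CMPO terms for a free-text phenotype using plain string similarity from Python's standard library. It is not Zooma's algorithm and does not call the Zooma service; it only illustrates the suggest-then-confirm workflow. The function name `suggest_terms` and the threshold are assumptions, and the term table holds only the two CMPO identifiers that appear in these slides.

```python
from difflib import SequenceMatcher

# The two CMPO terms mentioned in the slides; a real mapping would use
# the full ontology.
CMPO_TERMS = {
    "CMPO_0000156": "graped micronucleus",
    "CMPO_0000157": "polylobed nucleus",
}


def suggest_terms(description, terms=CMPO_TERMS, threshold=0.6):
    """Return (term_id, label, score) candidates, best match first.

    The threshold is an arbitrary illustrative cut-off; a curator would
    still confirm or reject each suggestion.
    """
    text = description.lower().strip()
    scored = []
    for term_id, label in terms.items():
        # ratio() returns a similarity between 0.0 and 1.0.
        score = SequenceMatcher(None, text, label).ratio()
        if score >= threshold:
            scored.append((term_id, label, round(score, 2)))
    return sorted(scored, key=lambda item: item[2], reverse=True)


for phenotype in ["Polylobed nuclei", "graped micronuclei"]:
    print(phenotype, "->", suggest_terms(phenotype))
```

Descriptions that fall below the threshold return an empty list, which is the point where manual annotation would take over.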
Future plans CMPO: integration in existing applications
• Widgets [1] (in collaboration with WP4)
  • deployable in existing web applications
  • autocomplete search boxes
  • ontology terms are readily available in user-facing applications
• Integrating CMPO into FIMM’s Webmicroscope Portal [2] and EMBL’s CellBase.
Data producers can utilise ontologies for annotating their data sets already at the data production stage
[1] http://www.ebi.ac.uk/Tools/biojs/registry/
[2] http://biomedbridges.webmicroscope.net/
Future plans Scientific Use Case: Correlative analysis and biomarker prediction
• Cellular image datasets [1]
• Mouse image datasets [2]
• Human image datasets [3]
Data hosted by EBI Cellular Phenotype Database/EMBL CellBase
Data hosted by Webmicroscope
Annotate cellular, mouse and human image datasets using CMPO
Correlative analysis of now interoperable cell and tissue image datasets to predict novel biomarker candidates
Novel candidate biomarker prediction
Focus on cell cycle and cell division control genes
[1] Mitocheck, including genetic information; www.mitocheck.org
[2] Helmholtz’s mouse lines, cancer models by GMC, PREDECT, International Mouse Phenotyping Consortium
[3] Webmicroscope cancer tissue collection
Correlative analysis and biomarker prediction
Candidate biomarker genes from cellular tumor suppressor screens have been identified:
• MLL3
• PAPPA
• SF3B1
• PRPF8
• CENPE
• CIT
• ASPM
• ESPL1
• DYNC1H1
• ASCC3
• KIF4A
Mouse and human tissue WP6 partners are looking for and/or generating the data relevant to these genes to be used for analysis.
Correlative analysis and biomarker prediction
Promising gene candidates from cellular screens
• MLL3
• PAPPA
• SF3B1
• PRPF8
• CENPE
• CIT
• ASPM
• ESPL1
• DYNC1H1
• ASCC3
• KIF4A
[Figure: ASPM example. Cell line, ASPM knockdown: polylobed nucleus. Mouse, ASPM Mut: polylobed nucleus. Mouse, ASPM WT. CMPO term: polylobed nucleus (CMPO_0000157)]
Mouse and human tissue WP6 partners are looking for and/or generating the data relevant to these genes to be used for analysis.
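Once cell and tissue images carry the same CMPO identifiers, finding matching phenotypes becomes a join on (gene, term) instead of a chance observation. The sketch below shows that join on a few made-up annotation records; the record layout, the dataset names, the placeholder `GENE_X` and the function `shared_phenotypes` are all hypothetical, and only the ASPM / CMPO_0000157 pairing mirrors the example on the slide.

```python
from collections import defaultdict

# Hypothetical annotation records: (dataset, gene, CMPO term ID).
# The ASPM entries mirror the slide example; GENE_X is a placeholder.
cell_annotations = [
    ("cell line knockdown", "ASPM", "CMPO_0000157"),
    ("cell line knockdown", "GENE_X", "CMPO_0000156"),
]
tissue_annotations = [
    ("mouse tissue", "ASPM", "CMPO_0000157"),
]


def shared_phenotypes(cell_records, tissue_records):
    """Return {(gene, term_id): [datasets]} for pairs seen at both scales."""
    # Index the cellular annotations by (gene, term) for fast lookup.
    cell_index = defaultdict(set)
    for dataset, gene, term_id in cell_records:
        cell_index[(gene, term_id)].add(dataset)

    matches = {}
    for dataset, gene, term_id in tissue_records:
        key = (gene, term_id)
        if key in cell_index:
            # Start from the cellular datasets, then add the tissue dataset.
            matches.setdefault(key, sorted(cell_index[key])).append(dataset)
    return matches


print(shared_phenotypes(cell_annotations, tissue_annotations))
```

Genes that show the same CMPO term in a knockdown screen and in tissue would then be the ones to inspect further as biomarker candidates.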
Making large scale image data sets from different biological scales interoperable
01/2013
Start of WP6
03/2013
Identification of standards and ontologies used for cellular/mouse/human image data sets
12/2013
-> Inventory of image file formats and ontologies
-> Defined future standards
Mapping of standards and ontologies between the different image reference data sets
12/2014
-> Cellular Phenotype Ontology and annotation tool
01/2015
Set of predicted biomarkers
12/2015
-> Predict new biomarker genes
Deadline for the deliverable
BMS RI partners
Euro-BioImaging, Elixir, Infrafrontier, BBMRI
Jan Ellenberg, Tanja Ninkovic, Gabriella Rustici, Jean-Karim Heriche, Simon Jupp, Frauke Neff, Wolfgang Huber, Philipp Gormanns, Johan Lundin, Mikael Lundin
Acknowledgments
• WP6 partners
• James Malone, Tony Burdett and Helen Parkinson, EMBL-EBI
• In particular, we wish to thank:
  • Anna Melidoni, Ruth Lovering and Jennifer Rohn (UCL)
  • Beate Neumann and Jean Karim Heriche (EMBL)
  • Bob Van De Water (U. Leiden)
  • Bram Herpers (OcellO)
  • Claudia Lukas (U. Copenhagen)
  • Greg Pau (Genentech)
  • Sylvia Le Dévédec (LUMC)
  • Thomas Walter (Institut Curie)
  • Wies Roosmalen (U. Twente)
  • Zvi Kam (Weizmann Institute)
Thank you for your attention.