Data analysis and integration
How to get from a pile of unprocessed data to knowledge:
The user’s perspective
Guido Jenster, Ph.D.
Professor of Experimental Urological Oncology
Department of Urology
Erasmus MC
[email protected]
Structure of Cancer Research Projects
Functional Research
Prevention Research
Technology & Protocols
Models & Biobanks
Marker Research
Therapy Research
Datasets
Bioinformatics & Statistics
Organization & Management; Education; Outreach
Prostate Cancer Molecular Medicine
[Diagram labels: Clinical Research, Imaging, Experimental Research, Biobanking; DATA GENERATION, DATA STORAGE, DATA PROCESSING, DATA INTEGRATION, DATA QUERY / VIEWING, NEW KNOWLEDGE]
Prostate Cancer Molecular Medicine
What do we want?
Use case:
Identify novel fusion genes from DNA and RNA sequencing data
PUSH TO START
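The use case above can be illustrated with a minimal sketch. It assumes that fusion candidates from RNA sequencing and structural-variant breakpoints from DNA sequencing have already been called by upstream tools, and it simply keeps the RNA candidates that have DNA support near both junction ends. All names (FusionCandidate, Breakpoint, supported_fusions), the coordinates, and the 100 kb window are hypothetical and are not taken from the project's own pipelines.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FusionCandidate:
    """A fusion transcript candidate from RNAseq (hypothetical record layout)."""
    gene5: str   # 5' partner gene
    gene3: str   # 3' partner gene
    chrom5: str
    pos5: int
    chrom3: str
    pos3: int


@dataclass(frozen=True)
class Breakpoint:
    """A structural-variant breakpoint pair from DNAseq (hypothetical record layout)."""
    chrom_a: str
    pos_a: int
    chrom_b: str
    pos_b: int


def _near(chrom_x, pos_x, chrom_y, pos_y, window):
    # Two positions match if they lie on the same chromosome within `window` bases.
    return chrom_x == chrom_y and abs(pos_x - pos_y) <= window


def supported_fusions(candidates, breakpoints, window=100_000):
    """Return RNA fusion candidates with a DNA breakpoint pair near both junction ends."""
    supported = []
    for c in candidates:
        for b in breakpoints:
            # A breakpoint pair has no inherent orientation, so test both pairings.
            forward = (_near(c.chrom5, c.pos5, b.chrom_a, b.pos_a, window)
                       and _near(c.chrom3, c.pos3, b.chrom_b, b.pos_b, window))
            reverse = (_near(c.chrom5, c.pos5, b.chrom_b, b.pos_b, window)
                       and _near(c.chrom3, c.pos3, b.chrom_a, b.pos_a, window))
            if forward or reverse:
                supported.append(c)
                break
    return supported


# Made-up example data, for illustration only.
rna = [
    FusionCandidate("GENE_A", "GENE_B", "chr1", 1_000_000, "chr1", 4_000_000),
    FusionCandidate("GENE_C", "GENE_D", "chr2", 500_000, "chr5", 900_000),
]
dna = [Breakpoint("chr1", 4_020_000, "chr1", 990_000)]

hits = supported_fusions(rna, dna)
print([f"{c.gene5}-{c.gene3}" for c in hits])
```

A real analysis would also have to deal with read support, annotation versions, and read-through artifacts, which is part of why a single "red button" is hard to build.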
Data analysis and integration
Where is the Red Button?
Why is it so difficult to make?
- Different types of data
- Different platforms and their limitations
- Different data analysis tools
- Limitations in storage and compute power
- Analysis and integration depend on the research question and the needs of the scientist
Markers and therapy targets for prostate cancer
Markers and therapy targets:
An inventory of the differences between normal and cancer cells:
DNA
RNA
Protein
Metabolite
Morphology
Cellular behavior
DNAseq Data Analysis
DNAseq data:
- Copy Number Aberrations
- SNVs / InDels
- TF Binding
- B-Allele Frequency
- Chromatin Interactions
- Methylation
- Structural Variations
- Active Chromatin
- Identify Integration Sites
- Read Barcode
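Two of the DNAseq read-outs listed above reduce to simple arithmetic on read counts. The sketch below is illustrative only: the function names, the pseudocount, and the choice to treat the non-reference allele as the B allele are assumptions, not a description of the tools used in this project.

```python
import math


def b_allele_frequency(ref_reads, alt_reads):
    """Fraction of reads at a SNP position that support the B (here: non-reference) allele."""
    total = ref_reads + alt_reads
    if total == 0:
        return None  # no coverage, so the frequency is undefined
    return alt_reads / total


def log2_ratio(tumor_depth, normal_depth, pseudocount=1.0):
    """Copy-number log2 ratio of tumor over normal read depth in one genomic bin."""
    # The pseudocount (an assumption of this sketch) avoids division by zero.
    return math.log2((tumor_depth + pseudocount) / (normal_depth + pseudocount))


print(b_allele_frequency(30, 30))   # balanced heterozygous position
print(log2_ratio(59, 29))           # tumor depth roughly double the normal depth
```

In practice both values need normalization for sequencing depth, GC content, and tumor purity before they can be interpreted.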
RNAseq Data Analysis
RNAseq data:
- Differential expression
- SNVs / InDels
- Novel Transcripts
- Alternative splicing & Promoters
- Read-Through & Fusion Transcripts
DNA and RNA analysis platforms
DNA level:
- Home-made array CGH
- 1M SNP arrays (Illumina)
- Ion Proton low pass DNAseq
- Ion Proton exome DNAseq
- Complete Genomics whole genome DNAseq
- FAIREseq, ChIPseq, MeDIPseq, Methylation arrays (Illumina)
RNA level:
- Home-made cDNA and oligo arrays
- Affymetrix Exon arrays
- Illumina RNAseq (small RNA and mRNA)
- Ion Proton RNAseq
Prostate Cancer Molecular Medicine
Where is the Red Button?
How to solve the issues?
[Diagram labels: Clinical Research, Imaging, Experimental Research, Biobanking; DATA GENERATION, DATA STORAGE, DATA PROCESSING, DATA INTEGRATION, DATA QUERY / VIEWING, NEW KNOWLEDGE]
TraIT subdivision into work packages
- Four data-generating work packages
- Data integration & analysis across the four platforms
- Shared hardware and professional training & support
The TraIT mansion requires good support
[Diagram labels: Phenotype Database, Chipster, Workflow, Galaxy, tEPIS, Logis, Keosys, coLIMS, tranSMART, TOPdesk, Alfresco, Website, Wiki, Jira, SurfConext, XNAT, Catalogue, TTP, OpenClinica, Data storage + CPU power, BMIA]
Data analysis and integration
Where is the Red Button?
How to solve the issues?
Adopt, Adapt, Create
DATA STORAGE & COMPUTE
- Own (external hard drives)
- Central: CSC, CCBC, GEO, ENA
- Commercial: Clouds
DATA PROCESSING
- Own pipelines and tools
- Commercial programs: CLCBio, etc.
- Central / Open Source tool platforms
DATA INTEGRATION
- Own (Access)
- Commercial (NextBio)
- Central: Oracle TRC, tranSMART
DATA MINING / VIEWING
Data Mining: Query & Viewing Tools
Platform: Where do I get my data from?
Level: Which level do I want to mine?
- Between-Study Level
- Study Level
- Patient/Sample Level
- Molecular Level
Tool: What is the best query & viewing tool?
cBioPortal
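The four mining levels above can be made concrete with a small sketch that asks a different question of the same records at each level. The record layout, the study, sample, and gene names, and the function names are all hypothetical; this is not how cBioPortal or any other tool named here stores or queries data.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical flat records: (study, sample, gene, expression value).
records = [
    ("study_1", "s1", "GENE_A", 8.0),
    ("study_1", "s2", "GENE_A", 6.0),
    ("study_1", "s1", "GENE_B", 2.0),
    ("study_2", "s3", "GENE_A", 4.0),
]


def molecular_level(records, gene):
    """Molecular level: every measurement of one gene."""
    return [r for r in records if r[2] == gene]


def sample_level(records, sample):
    """Patient/sample level: the profile of one sample."""
    return {gene: value for _, s, gene, value in records if s == sample}


def study_level(records, study, gene):
    """Study level: mean value of one gene within one study."""
    values = [v for st, _, g, v in records if st == study and g == gene]
    return mean(values) if values else None


def between_study_level(records, gene):
    """Between-study level: per-study mean of one gene, side by side."""
    by_study = defaultdict(list)
    for st, _, g, v in records:
        if g == gene:
            by_study[st].append(v)
    return {st: mean(vs) for st, vs in by_study.items()}


print(sample_level(records, "s1"))
print(between_study_level(records, "GENE_A"))
```

Comparing raw values across studies, as the last function does, is only meaningful once platforms and normalization have been harmonized.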
Andrew Stubbs
http://www.erasmusmc.nl/bioinformatica/
Harmen van de Werken
http://www.erasmusmc.nl/ccbc/
Please attend the monthly Bridge Meetings:
http://www.molmed.nl/
(MolMed Lectures)