The Road to
Personalized Medicine
is Paved with Data and Information
John Quackenbush
University of Rochester
Big Data Forum
October 5, 2012
Watson and Crick
DNA’s Structure reveals its properties
The Success of DNA and its Structure
•  Explained how genetic information could be reliably
passed from one generation to another.
•  Explained how polymorphisms can arise, giving rise
to genetic variation.
•  Provided a logical framework in which RNA could
mediate cellular chemistry.
•  Allowed us to begin to understand the molecular
basis for genetic disease and to begin to develop
tests and treatments.
Molecular Biology in 7 Words
Function
RNA
Protein
Folding
Regulation
Gene
Structure
Finding Disease Genes: A Difficult Task
•  There are 46 chromosomes:
22 autosome pairs and the
X and Y sex chromosomes
•  In the 22 autosomes plus the X and Y,
there are ~3,000,000,000 base-pairs of DNA
(That’s the number of seconds in 95 years.)
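The seconds-in-95-years comparison is easy to check. The short sketch below assumes an average year of 365.25 days, which is not stated on the slide.

```python
# Check the slide's comparison: ~3 billion base pairs vs. seconds in 95 years.
BASE_PAIRS = 3_000_000_000                 # approximate figure quoted on the slide
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60   # assumption: average year of 365.25 days

# Treat each base pair as one second and convert to years.
years = BASE_PAIRS / SECONDS_PER_YEAR
print(f"{BASE_PAIRS:,} seconds is about {years:.1f} years")
```

Counting one base pair per second, reading the genome end to end would take roughly a human lifetime.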
Completion of the Human Genome Announced June 26, 2000
The Genome Project Has Provided a
“Parts List” for a Human Cell
Different Cell Types Express Different Sets of Genes
Neuron
Thyroid Cell
Lung Cell
Cardiac Muscle
Pancreatic Cell
Kidney Cell
Skeletal Muscle
Skin Cell
Molecular Biology in 8 Words
Function
RNA
Network
Protein
Folding
Regulation
Gene
Structure
Molecular Biology in 9 Words
Function
RNA
Network
Protein
Folding
Regulation
Gene
Structure
Disease Progression and
Personalized Care
(Diagram: a timeline from Birth to Death contrasting the Natural History of Disease with Clinical Care. Labels: Environment + Lifestyle; Genetic Risk; Biomarkers; Early Detection; Patient Stratification; Disease Staging; Treatment Options; Treatment; Outcomes; Quality of Life.)
Turning the vision into a reality
•  Assure access to samples and rational consent
•  Develop a technology platform
•  Make information integration a central mission
•  Conduct research as a vital component
•  Present data and information to the local community
•  Enable research beyond your own
•  Engage corporate partners
•  Communicate the mission to the community.
Assure Access to Samples
Access, Research, Security
•  Patients want to be part of the process of curing disease
•  Informed consent needs to be structured to allow patients
to be partners in the research process
•  HIPAA requires both informed consent and that we assure
patient confidentiality
•  But “identifiability” is a moving target in a genomic age
•  With the <$1000 genome, in the age of Facebook, what
this means remains unclear
•  The new Genomics is a disruptive technology.
Develop a
Technology Platform
The cost decreases exponentially with
time
Illumina GAII
ABI SOLiD
Continuing the Regression:
Genomes for $100 in February 2014
The $1000 Genome:
October 2012
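The two dates on the regression line imply a rate of decline. As a rough sketch, assuming cost follows a pure exponential, cost(t) = c0 * 2^(-t/T), and treating the two slide values as exact points on that line, the halving time T can be computed as follows. The function names are illustrative only.

```python
import math
from datetime import date

# Two points read off the slide's regression line (assumed exact; only the
# year and month come from the slide, the day of month is a placeholder).
p1 = (date(2012, 10, 1), 1000.0)  # $1000 genome, October 2012
p2 = (date(2014, 2, 1), 100.0)    # $100 genome, February 2014

def months_between(a, b):
    """Whole calendar months from date a to date b."""
    return (b.year - a.year) * 12 + (b.month - a.month)

def halving_time_months(first, second):
    """Months for the cost to halve under cost(t) = c0 * 2**(-t / T)."""
    elapsed = months_between(first[0], second[0])
    return elapsed * math.log(2) / math.log(first[1] / second[1])

def projected_cost(first, halving_months, when):
    """Extrapolate the cost at a later date from the first point."""
    elapsed = months_between(first[0], when)
    return first[1] * 2 ** (-elapsed / halving_months)

T = halving_time_months(p1, p2)
print(f"Implied halving time: {T:.1f} months")
```

A tenfold drop over 16 months corresponds to the cost halving a little faster than every five months, under these assumptions.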
2010: Enabling a New Era in Genome
Analysis
Illumina HiSeq
100Gb (~30X genome
coverage)
150bp reads
Two samples/week
<$10,000 per genome
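The coverage figure follows from simple arithmetic: mean coverage is total sequenced bases divided by genome size. The sketch below assumes a 3-billion-base-pair genome and takes "100Gb" to mean 100 billion bases; the variable names are illustrative.

```python
GENOME_SIZE_BP = 3_000_000_000    # assumption: ~3 billion base pairs
OUTPUT_BP = 100_000_000_000       # 100 Gb of sequence, as quoted on the slide
READ_LENGTH_BP = 150              # read length quoted on the slide

def mean_coverage(total_bases, genome_size):
    """Average number of times each genome position is sequenced."""
    return total_bases / genome_size

coverage = mean_coverage(OUTPUT_BP, GENOME_SIZE_BP)
reads = OUTPUT_BP // READ_LENGTH_BP  # number of 150 bp reads in 100 Gb

print(f"Mean coverage: {coverage:.1f}X from {reads:,} reads")
```

The result is a little over 33X, consistent with the slide's rounded "~30X".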
Just Announced: The Life Technologies
Ion Torrent Proton
The Promise from LTI
A Genome in ~24 hours
for $1000
Promised in Q3 2012
Let the games begin!
The Oxford Nanopore MinION
The USB sequencer
The Challenge
•  New technologies inspired by the Human Genome
Project are transforming biomedical research from
a laboratory science to an information science
•  We need new approaches to making sense of the
data we generate
•  The winners in the race to understand disease are
going to be those best able to collect, manage,
analyze, and interpret the data.
Make information integration
a central mission
http://compbio.dfci.harvard.edu
(Diagram labels: Gene; RNA; Protein; Network; Patient; Gene Index Databases; TM4 Microarray Software; DNA Microarray Analysis; Resourcerer; MeSHer; ClusterMed; Bayesian Nets; Other Databases; Other tools; Central Warehouse. Network cycle: Predict Network; Candidate Gene(s); Perturb Network (RNAi); Assay Response (µA).)
Beating Information Overload
(Diagram labels: Central Warehouse; Clinical Data; Clinical Trials; Genomics; Cytogenomics; Epigenomics; Transcriptomics; Proteomics; Metabolomics; Chemical Biology; Etc.; PubMed; The HapMap; The Genome; Disease Databases (OMIM); Published Datasets; Drug Bank; misc; Improved Diagnostics; Individualized Therapies; More Effective Agents.)
Dana-Farber Research DB Conceptual Architecture
(Architecture diagram labels: Dana Farber Clinical Systems; Partners; OMICS; IDX; Rx; Lab; Clinical Trial; Dana Farber Lab; External; PubMed; GenBank; genomics; Enterprise Service Bus; Web Service Directory; BPEL; Rules Engine; Idm & Security; Security Auditing; Custom; De-identification; Terminology; EMPI; Mapping; HTB ODS; Facts (with A, B, C, D); Severity Score; Clinical Pathways; RFID; Web Center Portal; Portals; Dashboard; BAM; Business Intelligence. Build or Buy: Oracle; Existing.)
Conduct research as a vital
component
What can we learn from networks?
Normal Tissue
Network
Chemosensitive
Tumor
Chemoresistant
Tumor
Another Idea: Message Passing
Transcription Factor
The TF is Responsible for
communicating with its Target
Downstream Target
The Target must be Available
to respond to the TF
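The responsibility and availability idea can be illustrated with a toy update rule. This is only a sketch under stated assumptions, not the method presented in the talk: the matrices, the averaging rule, and all names and numbers below are hypothetical.

```python
# Toy message-passing update on a TF-to-target network (illustrative only).

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def message_passing_step(W, P, C):
    """Return updated TF-to-gene edge weights.

    W: TF x gene prior edge weights
    P: TF x TF cooperation weights
    C: gene x gene co-regulation weights
    """
    n_tf, n_gene = len(W), len(W[0])
    # Responsibility: support from the TF side. Edge (i, j) gains weight
    # when TFs that cooperate with TF i also point at gene j.
    R = matmul(P, W)
    # Availability: support from the target side. Edge (i, j) gains weight
    # when genes co-regulated with gene j are also targets of TF i.
    A = matmul(W, C)
    # Hypothetical combination rule: average the two size-normalized scores.
    return [[0.5 * (R[i][j] / n_tf + A[i][j] / n_gene)
             for j in range(n_gene)] for i in range(n_tf)]

# Made-up example: 2 TFs, 3 genes.
W = [[1.0, 0.0, 1.0],
     [0.0, 1.0, 0.0]]
P = [[1.0, 0.5],
     [0.5, 1.0]]
C = [[1.0, 0.8, 0.0],
     [0.8, 1.0, 0.0],
     [0.0, 0.0, 1.0]]

updated = message_passing_step(W, P, C)
for row in updated:
    print([round(x, 3) for x in row])
```

In this made-up example, the edge from the first TF to the second gene starts at zero but gains weight because the gene is co-regulated with a known target and is targeted by a cooperating TF.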
Inhaled Corticosteroids in Asthma
Sham
Dex
Present data and information
to the local community
LGRC Research Portal
LGRC Data Download
•  Browse by basic metadata
•  Browse by clinical /
phenotype attributes
•  Download ‘raw’ data
•  Secure transfer via single-use
‘tickets’, which give authorized
users access to the specified result
basket for a single session.
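One way to picture a single-use ticket is as a random token that is deleted the first time it is redeemed. The sketch below is a hypothetical in-memory design, not the LGRC portal's implementation; the class name, method names, and expiry policy are all assumptions.

```python
import secrets
import time

class TicketStore:
    """In-memory store of single-use download tickets (hypothetical design)."""

    def __init__(self, ttl_seconds=3600):
        self.ttl_seconds = ttl_seconds
        self._tickets = {}  # token -> (user, basket_id, expiry time)

    def issue(self, user, basket_id):
        """Create an unguessable token tied to one user and one result basket."""
        token = secrets.token_urlsafe(32)
        expires = time.monotonic() + self.ttl_seconds
        self._tickets[token] = (user, basket_id, expires)
        return token

    def redeem(self, token, user):
        """Return the basket id on first valid use; return None otherwise.

        The ticket is removed on any redemption attempt, so it can never
        be used a second time, even if the first attempt was rejected.
        """
        entry = self._tickets.pop(token, None)
        if entry is None:
            return None
        owner, basket_id, expires = entry
        if owner != user or time.monotonic() > expires:
            return None
        return basket_id

store = TicketStore()
ticket = store.issue("alice", "basket-42")
print(store.redeem(ticket, "alice"))  # first use succeeds
print(store.redeem(ticket, "alice"))  # second use is rejected
```

Consuming the ticket even on a failed attempt is a conservative choice; a real system would also need persistent storage and an authenticated transport.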
LGRC Gene Catalog
Page details
Search:
- Facets
- Search within results
- Keyword prompts
- Search history
Table:
- Paged results
- Sortable columns
Actions:
- Go to Gene detail page
- Add genes to ‘gene set’
LGRC Gene Detail
LGRC Cohort Selector
LGRC Cohort-Based Analysis
Engage corporate partners
We need to find the best tools
•  We received a $1M Oracle Commitment grant to
create our integrated clinical/research data warehouse
•  We’ve partnered with IDBS to create data portals
•  We are working with Illumina on a variety of projects
•  We are forging relationships with Thomson Reuters to
link genomic profiling data to drug, trial, and patent
information
•  We are building partnerships with Roche, Genomatix,
NEB, and others interested in entering the personal
genomics space.
Enable research beyond
your own
John Quackenbush, Director
Mick Correll, Associate Director
The Mission
The mission of the CCCB is to provide broad-based support for the
analysis and interpretation of ‘omic data and in doing so to further basic,
clinical and translational research. CCCB also will conduct research that
opens new ways of understanding cancer.
CCCB
Collaborative Consulting Model
Sequencing
IT Infrastructure
Consulting
1.  Initial meeting to understand project scope and objectives
2.  Development of an analysis plan and time/cost estimate
3.  During project execution, data and results are exchanged
through a secure, password-protected collaboration portal
4.  Available as an ad hoc service or through larger-scale support agreements
Communicate the mission to
the community.
The LGRC
Why Patient Involvement is
Essential
•  Patients want to be our partners in curing disease
•  The incentive structure in medical research is
skewed away from success
•  We all say, “We want to cure disease.”
•  We mean, “We want to cure disease, but only if
I am the one to cure disease.”
•  The only way to break the logjam is to have
patients involved in the process.
Genomics is here to stay
The future is here.
It's just not widely distributed yet.
- William Gibson
Acknowledgments
The Gene Index Team
Corina Antonescu
Valentin Antonescu
Fenglong Liu
Geo Pertea
Razvan Sultana
John Quackenbush
Array Software Hit Team
Katie Franklin
Eleanor Howe
John Quackenbush
Dan Schlauch
Raktim Sinha
Joseph White
Eskitis Institute
Christine Wells
Alan Mackay-Sim
Center for Cancer
Computational Biology
Mick Correll
Victor Chistyakov
Howie Goodell
Lan Hui
Lev Kuznetsov
Niall O'Connor
Jerry Papenhausen
Yaoyu Wang
John Quackenbush
http://cccb.dfci.harvard.edu
Gene Expression Team
Fieda Abderazzaq
Stefan Bentink
Aedin Culhane
Kathleen Fleming
Benjamin Haibe-Kains
Jessica Mar
Melissa Merritt
Megha Padi
Renee Rubio
(Former) Stellar Students
Martin Aryee
Kaveh Maghsoudi
Jess Mar
Systems Support
Stas Alekseev, Sys Admin
Priya Karanam, DBA
Administrative Support
Joan Coraccio
Julianna Coraccio
http://compbio.dfci.harvard.edu