Integrative Functional Genomics
Anil Jegga
Biomedical Informatics, CCHMC
Two Separate Worlds….. With Some Data Exchange…
Disease World: Medical Informatics
• Patient Records
• Clinical Trials
• Disease Database
• PubMed
• OMIM Clinical Synopsis
→Name
→Synonyms
→Related/Similar Diseases
→Subtypes
→Etiology
→Predisposing Causes
→Pathogenesis
→Molecular Basis
→Population Genetics
→Clinical findings
→System(s) involved
→Lesions
→Diagnosis
→Prognosis
→Treatment
→Clinical Trials……
Bioinformatics & the “omes”
• Genome
• Regulome
• Transcriptome
• miRNAome
• Proteome
• Interactome
• Metabolome
• Variome
• Pharmacogenome
• Physiome
• Pathome
>380 “omes” so far………
and there is “UNKNOME” too: genes with no function known
http://en.wikipedia.org/wiki/List_of_omics_topics_in_biology
http://omics.org/index.php/Alphabetically_ordered_list_of_omics
Motivation
To correlate diseases with anatomical parts
affected, the genes/proteins involved, and
the underlying physiological processes
(interactions, pathways, processes). In other
words, bringing the disciplines of Medical
Informatics (MI) and BioInformatics (BI)
together (Biomedical Informatics - BMI) to
support personalized or “tailor-made”
medicine.
How to integrate multiple types of genome-scale data
across experiments and phenotypes in order to find genes
associated with diseases and drug response?
Model Organism Databases: Common Issues
• Heterogeneous Data Sets - Data Integration
– From Genotype to Phenotype
– Experimental and Consensus Views
• Incorporation of Large Datasets
– Whole genome annotation pipelines
– Large scale mutagenesis/variation projects (dbSNP)
• Computational vs. Literature-based Data
Collection and Evaluation (MedLine)
• Data Mining
– extraction of new knowledge
– testable hypotheses (Hypothesis Generation)
Support Complex Queries
• Show me all genes involved in brain development
that are expressed in the Central Nervous
System.
• Show me all genes involved in brain development in
human and mouse that also show iron ion binding
activity.
• For this set of genes, what aspects of function
and/or cellular localization do they share?
• For this set of genes, what mutations are
reported to cause pathological conditions?
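Queries like these amount to intersecting annotation sets drawn from different sources. A minimal sketch, assuming a toy in-memory annotation table: the gene names, the field names and the `find_genes` helper are hypothetical and do not come from any real database.

```python
# Toy annotation table; gene names and values are made up for illustration.
ANNOTATIONS = {
    "GENE_A": {
        "process": {"brain development"},
        "function": {"iron ion binding"},
        "expressed_in": {"central nervous system"},
        "species": {"human", "mouse"},
    },
    "GENE_B": {
        "process": {"brain development"},
        "function": {"ATPase activity"},
        "expressed_in": {"liver"},
        "species": {"human"},
    },
    "GENE_C": {
        "process": {"purine metabolism"},
        "function": {"iron ion binding"},
        "expressed_in": {"central nervous system"},
        "species": {"mouse"},
    },
}


def find_genes(annotations, **required):
    """Return sorted gene names whose annotations contain every required value.

    Each keyword maps a field name to a set of values that must all be present.
    """
    hits = []
    for gene, fields in annotations.items():
        if all(values <= fields.get(field, set()) for field, values in required.items()):
            hits.append(gene)
    return sorted(hits)


# "Genes involved in brain development that are expressed in the CNS"
cns_brain = find_genes(
    ANNOTATIONS,
    process={"brain development"},
    expressed_in={"central nervous system"},
)

# "... in human and mouse that also show iron ion binding activity"
iron_brain = find_genes(
    ANNOTATIONS,
    process={"brain development"},
    species={"human", "mouse"},
    function={"iron ion binding"},
)
print(cns_brain, iron_brain)
```

The hard part in practice is not the intersection itself but getting every source to use the same gene identifiers and the same vocabulary for "process" or "function", which is what the following slides address.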
Bioinformatic Data - 1978 to present
• DNA sequence
• Gene expression
• Protein expression
• Protein Structure
• Genome mapping
• SNPs & Mutations
• Metabolic networks
• Regulatory networks
• Trait mapping
• Gene function analysis
• Scientific literature
• and others………..
Human Genome Project – Data Deluge
No. of Human Gene Records
currently in NCBI: ~30K
(excluding pseudogenes,
mitochondrial genes and obsolete
records).
Includes ~700 microRNAs
NCBI Human Genome Statistics – as of November 4, 2009
The Gene Expression Data Deluge
Till 2000: 413 papers on microarray!
Year    PubMed Articles
2001    834
2002    1557
2003    2421
2004    3508
2005    4400
2006    4824
2007    5108
2008    5884
2009    5207…..
Problems Deluge!
Allison DB, Cui X, Page GP,
Sabripour M. 2006. Microarray
data analysis: from disarray to
consolidation and consensus.
Nat Rev Genet. 7(1): 55-65.
Information Deluge…..
• 3 scientific journals in 1750
• Now - >120,000 scientific journals!
• >500,000 medical articles/year
• >4,000,000 scientific articles/year
• >16 million abstracts in PubMed
derived from >32,500 journals
A researcher would have to scan 130 different
journals and read 27 papers per day to follow a
single disease, such as breast cancer (Baasiri et al.,
1999 Oncogene 18: 7958-7965).
Data-driven Problems…..
What’s in a name!
Rose is a rose is a rose is a rose!
Gene Nomenclature
• Accelerin
• Antiquitin
• Bang Senseless
• Bride of Sevenless
• Christmas Factor
• Cockeye
• Crack
• Draculin
• Dickie’s small eye
• Fidgetin
• Gleeful
• Knobhead
• Lunatic Fringe
• Mortalin
• Orphanin
• Profilactin
• Sonic Hedgehog
Disease names
• Mobius Syndrome with Poland’s Anomaly
• Werner’s syndrome
• Down’s syndrome
• Angelman’s syndrome
• Creutzfeldt-Jakob disease
1. Generally, the names refer to some feature of the mutant phenotype
2. Dickie’s small eye (Theiler et al., 1978, Anat Embryol (Berl), 155: 81-86) is now Pax6
3. Gleeful: "This gene encodes a C2H2 zinc finger transcription factor with high sequence
similarity to vertebrate Gli proteins, so we have named the gene gleeful (Gfl)."
(Furlong et al., 2001, Science 293: 1632)
• How to name or describe proteins, genes, drugs, diseases and conditions consistently and
coherently?
• How to ascribe and name a function, process or location consistently?
• How to describe interactions, partners, reactions and complexes?
Some Solutions
• Develop/Use controlled or restricted vocabularies (IUPAC-like naming conventions,
HGNC, MGI, UMLS, etc.)
• Create/Use thesauruses, central repositories or synonym lists (MeSH, UMLS, etc.)
• Work towards synoptic reporting and structured abstracting
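The synonym-list idea can be sketched as a lookup from free-text names to one approved symbol. This is only an illustration: the `SYNONYMS` table and the `normalize` helper are hypothetical, and the two alias pairs are the ones mentioned in these slides (p53/TP53 and Dickie’s small eye/Pax6). A real table would be loaded from a nomenclature resource.

```python
# Alias -> approved symbol. Only these two pairs come from the slides;
# everything else about this table is an assumption for illustration.
SYNONYMS = {
    "p53": "TP53",
    "dickie's small eye": "Pax6",
}


def normalize(name, synonyms=SYNONYMS):
    """Map a free-text gene name to its approved symbol, if one is known."""
    # Collapse whitespace, ignore case, and treat curly and straight apostrophes alike.
    key = " ".join(name.split()).lower().replace("’", "'")
    for alias, symbol in synonyms.items():
        if key == alias.lower() or key == symbol.lower():
            return symbol
    return name.strip()  # unknown names pass through unchanged


print(normalize("P53"), normalize("Dickie’s small eye"), normalize("tp53"))
```

Searching on the normalized symbol plus all of its known aliases is what closes the gap between the "p53" and "TP53" record counts shown a few slides later.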
Rose is a rose is a rose is a rose….. Not Really!
What is a cell?
• any small compartment;
• (biology) the basic structural and functional unit of all
organisms; they may exist as independent units of life (as in
monads) or may form colonies or tissues as in higher plants
and animals
• a device that delivers an electric current as the result of a
chemical reaction
• a small unit serving as part of or as the nucleus of a larger
political movement
• cellular telephone: a hand-held mobile radiotelephone for use
in an area divided into small sections, each with its own
short-range transmitter/receiver
• small room in which a monk or nun lives
• a room where a prisoner is kept
Image Sources: Somewhere from the internet…
Semantic Groups, Types and Concepts:
• Semantic Group Biology – Semantic Type Cell
• Semantic Groups Object OR Devices – Semantic
Types Manufactured Device or Electrical Device
or Communication Device
• Semantic Group Organization – Semantic Type
Political Group
Foundation Model Explorer
No. of Records
Database name   Query = p53   Query = TP53 (HGNC)   Query = p53 OR TP53
PubMed          48,679        3360                  49,469
PMC             21,193        1529                  21,564
Book            782           504                   820
Nucleotide      9473          592                   9773
Protein         6219          509                   6377
Genome          22            1                     23
OMIM            403           141                   414
SNP             424           337                   453
Gene            1642          338                   1750
Homologene      63            9                     68
GEO Profiles    352,684       15,140                358,999
Cancer Chr      302           161                   463
The REAL Problems
Hepatocellular Carcinoma: CTNNB1, MET, TP53
CTNNB1
1. COLORECTAL CANCER [3-BP DEL, SER45DEL]
2. COLORECTAL CANCER [SER33TYR]
3. PILOMATRICOMA, SOMATIC [SER33TYR]
4. HEPATOBLASTOMA, SOMATIC [THR41ALA]
5. DESMOID TUMOR, SOMATIC [THR41ALA]
6. PILOMATRICOMA, SOMATIC [ASP32GLY]
7. OVARIAN CARCINOMA, ENDOMETRIOID TYPE, SOMATIC [SER37CYS]
8. HEPATOCELLULAR CARCINOMA SOMATIC [SER45PHE]
9. HEPATOCELLULAR CARCINOMA SOMATIC [SER45PRO]
10. MEDULLOBLASTOMA, SOMATIC [SER33PHE]
TP53*
1. HEPATOCELLULAR CARCINOMA SOMATIC [ARG249SER]
Many disease states are complex, because of many genes (alleles & ethnicity, gene
families, etc.), environmental effects (life style, exposure, etc.) and the interactions.
Environmental Effects: aflatoxin B1, a mycotoxin, induces a very specific G-to-T mutation
at codon 249 in the tumor suppressor gene p53.
Environmental Effects
The REAL Problems
HEPATOCELLULAR CARCINOMA
LIVER:
• Hepatocellular carcinoma;
• Micronodular cirrhosis;
• Subacute progressive viral hepatitis
NEOPLASIA:
• Primary liver cancer
CTNNB1
1. ALK in cardiac myocytes
2. Cell to Cell Adhesion Signaling
3. Inactivation of Gsk3 by AKT causes accumulation of β-catenin in Alveolar Macrophages
4. Multi-step Regulation of Transcription by Pitx2
5. Presenilin action in Notch and Wnt signaling
6. Trefoil Factors Initiate Mucosal Healing
7. WNT Signaling Pathway
MET
1. CBL mediated ligand-induced downregulation of EGF receptors
2. Signaling of Hepatocyte Growth Factor Receptor
TP53
1. Estrogen-responsive protein Efp controls cell cycle and breast tumors growth
2. ATM Signaling Pathway
3. BTG family proteins and cell cycle regulation
4. Cell Cycle
5. RB Tumor Suppressor/Checkpoint Signaling in response to DNA damage
6. Regulation of transcriptional activity by PML
7. Regulation of cell cycle progression by Plk3
8. Hypoxia and p53 in the Cardiovascular system
9. p53 Signaling Pathway
10. Apoptotic Signaling in Response to DNA Damage
11. Role of BRCA1, BRCA2 and ATR in Cancer Susceptibility….Many More…..
Integrative Genomics - what is it?
Another buzzword or a meaningful concept useful for
biomedical research?
Acquisition, Integration, Curation, and
Analysis of biological data
Hypothesis
Integrative Genomics: the study of complex interactions
between genes, organism and environment, the triple helix
of biology. Gene <–> Organism <-> Environment
It is definitely beyond the buzzword stage - Universities
now have programs named 'Integrated Genomics.'
Information is not knowledge - Albert Einstein
Methods for Integration
1. Link driven federations
• Explicit links between databanks.
2. Warehousing
• Data is downloaded, filtered, integrated
and stored in a warehouse. Answers to
queries are taken from the warehouse.
3. Others….. Semantic Web, etc………
Link-driven Federations
1. Creates explicit links between databanks
2. query: get interesting results and use web
links to reach related data in other
databanks
Examples: NCBI-Entrez, SRS
http://www.ncbi.nlm.nih.gov/Database/datamodel/
Link-driven Federations
1. Advantages
• complex queries
• Fast
2. Disadvantages
• require good knowledge
• syntax based
• terminology problem not solved
Data Warehousing
Data is downloaded, filtered, integrated and
stored in a warehouse. Answers to queries are
taken from the warehouse.
Advantages
1. Good for very specific, task-based queries and studies.
2. Since it is custom-built and usually expert-curated, relatively less error-prone.
Disadvantages
1. Can become quickly outdated – needs constant updates.
2. Limited functionality – for example, one disease-based or one system-based.
No Integrative Genomics is
Complete without Ontologies
Gene World
• Gene Ontology
(GO)
Biomedical World
• Unified Medical
Language System
(UMLS)
The 3 Gene Ontologies
• Molecular Function = elemental activity/task
– the tasks performed by individual gene products; examples are
carbohydrate binding and ATPase activity
– What a product ‘does’, precise activity
• Biological Process = biological goal or objective
– broad biological goals, such as DNA repair or purine metabolism,
that are accomplished by ordered assemblies of molecular
functions
– Biological objective, accomplished via one or more ordered assemblies of
functions
• Cellular Component = location or complex
– subcellular structures, locations, and macromolecular
complexes; examples include nucleus, telomere, and RNA
polymerase II holoenzyme
– ‘is located in’ (‘is a subcomponent of’ )
http://www.geneontology.org
Example: Gene Product = hammer
Function (what)                  Process (why)
Drive a nail - into wood         Carpentry
Drive stake - into soil          Gardening
Smash a bug                      Pest Control
A performer’s juggling object    Entertainment
GO term associations: Evidence Codes
• ISS: Inferred from sequence or structural
similarity
• IDA: Inferred from direct assay
• IPI: Inferred from physical interaction
• TAS: Traceable author statement
• IMP: Inferred from mutant phenotype
• IGI: Inferred from genetic interaction
• IEP: Inferred from expression pattern
• ND: no data available
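Evidence codes matter in practice because an analysis may want to keep only annotations backed by certain kinds of evidence. A minimal sketch, assuming annotations are available as (gene, GO term, evidence code) triples: the triples and the `filter_by_evidence` helper are hypothetical, while the code table is copied from the list above.

```python
# Evidence codes as listed on the slide.
EVIDENCE_CODES = {
    "ISS": "Inferred from sequence or structural similarity",
    "IDA": "Inferred from direct assay",
    "IPI": "Inferred from physical interaction",
    "TAS": "Traceable author statement",
    "IMP": "Inferred from mutant phenotype",
    "IGI": "Inferred from genetic interaction",
    "IEP": "Inferred from expression pattern",
    "ND": "No data available",
}

# Hypothetical (gene, GO term, evidence code) triples, for illustration only.
annotations = [
    ("GENE_A", "DNA repair", "IDA"),
    ("GENE_A", "nucleus", "ISS"),
    ("GENE_B", "ATPase activity", "IMP"),
    ("GENE_C", "purine metabolism", "ND"),
]


def filter_by_evidence(rows, allowed):
    """Keep only annotation rows whose evidence code is in `allowed`."""
    unknown = set(allowed) - set(EVIDENCE_CODES)
    if unknown:
        raise ValueError(f"unknown evidence codes: {sorted(unknown)}")
    return [row for row in rows if row[2] in allowed]


# Keep annotations supported by assay, interaction, mutant or expression evidence.
experimental = filter_by_evidence(annotations, {"IDA", "IPI", "IMP", "IGI", "IEP"})
print(experimental)
```

Which codes to trust is a judgment call for each study; the set passed in here is one possible choice, not a recommendation.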
What can researchers do with GO?
• Access gene product functional information
• Find how much of a proteome is involved in a process/function/component in the cell
• Map GO terms and incorporate manual annotations into own databases
• Provide a link between biological knowledge and
  • gene expression profiles
  • proteomics data
And how?
• Getting the GO and
GO_Association Files
• Data Mining
– My Favorite Gene
– By GO
– By Sequence
• Analysis of Data
– Clustering by
function/process
• Other Tools
http://www.geneontology.org/
Gene list enrichment analysis tools (DAVID, FatiGO, ToppGene)
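Gene list enrichment is commonly framed as an over-representation test: given a list of genes, is a term annotated to more of them than chance would predict? The sketch below uses a hypergeometric upper tail as a generic illustration. It is an assumption that this matches any particular tool; DAVID, FatiGO and ToppGene each use their own statistics and corrections, and `enrichment_p_value` is a hypothetical helper name.

```python
from math import comb


def enrichment_p_value(population, annotated, sample, hits):
    """Probability of seeing at least `hits` annotated genes in the sample.

    population: number of genes in the background
    annotated:  number of background genes carrying the term
    sample:     size of the input gene list
    hits:       number of input genes carrying the term
    """
    if not (0 <= annotated <= population and 0 <= sample <= population):
        raise ValueError("annotated and sample must lie between 0 and population")
    upper = min(annotated, sample)
    total = comb(population, sample)
    # Sum the hypergeometric probabilities for k = hits .. upper.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(max(hits, 0), upper + 1)
    )
    return tail / total


# Toy numbers: 10 background genes, 5 carry the term, both of 2 input genes carry it.
p = enrichment_p_value(population=10, annotated=5, sample=2, hits=2)
print(round(p, 4))
```

With many terms tested at once, a multiple-testing correction would normally follow; that step is left out of this sketch.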
Open biomedical ontologies
http://obo.sourceforge.net/
Unified Medical Language System Knowledge
Server– UMLSKS
http://umlsks.nlm.nih.gov/kss/
• The UMLS Metathesaurus contains information about biomedical
concepts and terms from many controlled vocabularies and
classifications used in patient records, administrative health data,
bibliographic and full-text databases, and expert systems.
• The Semantic Network, through its semantic types, provides a
consistent categorization of all concepts represented in the UMLS
Metathesaurus. The links between the semantic types provide the
structure for the Network and represent important relationships in
the biomedical domain.
• The SPECIALIST Lexicon is an English language lexicon with many
biomedical terms, containing syntactic, morphological, and
orthographic information for each term or word.
Unified Medical Language System Metathesaurus
• More than 1 million biomedical concepts
• About 5 million concept names from more than 100 controlled vocabularies
and classifications (some in multiple languages) used in patient records,
administrative health data, bibliographic and full-text databases and expert
systems.
• The Metathesaurus is organized by concept or meaning. Alternate names for
the same concept (synonyms, lexical variants, and translations) are linked
together.
• Each Metathesaurus concept has attributes that help to define its meaning,
e.g., the semantic type(s) or categories to which it belongs, its position in
the hierarchical contexts from various source vocabularies, and, for many
concepts, a definition.
• Customizable: Users can exclude vocabularies that are not relevant for
specific purposes or not licensed for use in their institutions.
MetamorphoSys, the multi-platform Java install and customization program
distributed with the UMLS resources, helps users to generate pre-defined
or custom subsets of the Metathesaurus.
• Uses:
– linking between different clinical or biomedical vocabularies
– information retrieval from databases with human assigned subject index terms
and from free-text information sources
– linking patient records to related information in bibliographic, full-text, or factual
databases
– natural language processing and automated indexing research
UMLSKS – Semantic Network
• Complexity reduced by grouping concepts according to the
semantic types that have been assigned to them.
• There are currently 15 semantic groups that provide a partition of
the UMLS Metathesaurus for 99.5% of the concepts.
Semantic Groups (15) → Semantic Types (135) → Concepts (millions)
ACTI|Activities & Behaviors|T053|Behavior
ANAT|Anatomy|T024|Tissue
CHEM|Chemicals & Drugs|T195|Antibiotic
CONC|Concepts & Ideas|T170|Intellectual Product
DEVI|Devices|T074|Medical Device
DISO|Disorders|T047|Disease or Syndrome
GENE|Genes & Molecular Sequences|T085|Molecular Sequence
GEOG|Geographic Areas|T083|Geographic Area
LIVB|Living Beings|T005|Virus
OBJC|Objects|T073|Manufactured Object
OCCU|Occupations|T091|Biomedical Occupation or Discipline
ORGA|Organizations|T093|Health Care Related Organization
PHEN|Phenomena|T038|Biologic Function
PHYS|Physiology|T040|Organism Function
PROC|Procedures|T061|Therapeutic or Preventive Procedure
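The pipe-delimited lines above follow a group code | group name | type identifier | type name layout, which is easy to load into a lookup. A minimal sketch, assuming that four-field layout holds for every line; `SemanticType` and `parse_semantic_line` are hypothetical names introduced for this example.

```python
from collections import namedtuple

SemanticType = namedtuple("SemanticType", "group_code group_name type_id type_name")


def parse_semantic_line(line):
    """Split one 'GROUP|Group name|Txxx|Type name' line into a record."""
    parts = [part.strip() for part in line.split("|")]
    if len(parts) != 4:
        raise ValueError(f"expected 4 fields, got {len(parts)}: {line!r}")
    return SemanticType(*parts)


# Three of the lines shown on the slide.
lines = [
    "DISO|Disorders|T047|Disease or Syndrome",
    "GENE|Genes & Molecular Sequences|T085|Molecular Sequence",
    "ANAT|Anatomy|T024|Tissue",
]

# Index semantic type identifiers by their semantic group code.
by_group = {}
for record in map(parse_semantic_line, lines):
    by_group.setdefault(record.group_code, []).append(record.type_id)

print(by_group)
```

Grouping by the first field is what lets a tool tell "cell" the biological unit apart from "cell" the device, as in the earlier example.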
UMLSKS – Semantic Navigator
Part 2
Integrative Functional
Genomic Approaches to
Identify and Prioritize
Disease Genes
Disease Gene Identification and
Prioritization
Hypothesis: Majority of genes that impact or
cause disease share membership in any of several
functional relationships OR Functionally similar or
related genes cause similar phenotype.
Functional Similarity – Common/shared
•Gene Ontology term
•Pathway
•Phenotype
•Chromosomal location
•Expression
•Cis regulatory elements (Transcription factor binding sites)
•miRNA regulators
•Interactions
•Other features…..
Background, Problems & Issues
1. Most of the common diseases are multifactorial and modified by genetically and
mechanistically complex polygenic
interactions and environmental factors.
2. High-throughput genome-wide studies like
linkage analysis and gene expression
profiling, tend to be most useful for
classification and characterization but do
not provide sufficient information to
identify or prioritize specific disease causal
genes.
Background, Problems & Issues
3. Since multiple genes are associated with
same or similar disease phenotypes, it is
reasonable to expect the underlying genes
to be functionally related.
4. Such functional relatedness (common
pathway, interaction, biological process,
etc.) can be exploited to aid in the finding
of novel disease genes. For example, genetically
heterogeneous hereditary diseases such as
Hermansky-Pudlak syndrome and Fanconi
anaemia have been shown to be caused by
mutations in different interacting proteins.
PPI - Predicting Disease Genes
1. Direct protein–protein interactions (PPI) are
one of the strongest manifestations of a
functional relation between genes.
2. Hypothesis: Interacting proteins lead to same
or similar disease phenotypes when mutated.
3. Several genetically heterogeneous hereditary
diseases are shown to be caused by mutations
in different interacting proteins, for example
Hermansky-Pudlak syndrome and Fanconi
anaemia. Hence, protein–protein interactions
might in principle be used to identify
potentially interesting disease gene candidates.
[Figure: Mining the human interactome (HPRD, BioGrid): Known Disease Genes → Direct
Interactants of Disease Genes → Indirect Interactants of Disease Genes. Counts shown in
the figure: 7, 66, 778.]
Prioritize candidate genes in the interacting partners of the disease-related genes
• Training sets: disease-related genes
• Test sets: interacting partners of the training genes
Which of these interactants are potential new candidates?
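Collecting direct and indirect interactants of known disease genes is a one-hop and two-hop neighborhood search in the interaction network. A minimal sketch on a made-up network: the protein names, `build_network` and `interactants` are hypothetical, and real input would come from resources such as HPRD or BioGrid.

```python
def build_network(pairs):
    """Build an undirected adjacency map from (protein, protein) pairs."""
    network = {}
    for a, b in pairs:
        network.setdefault(a, set()).add(b)
        network.setdefault(b, set()).add(a)
    return network


def interactants(network, seeds):
    """Return (direct, indirect) interactant sets for the seed genes.

    direct:   partners of any seed, excluding the seeds themselves
    indirect: partners of direct interactants that are neither seeds nor direct
    """
    seeds = set(seeds)
    direct = set()
    for seed in seeds:
        direct |= network.get(seed, set())
    direct -= seeds
    indirect = set()
    for partner in direct:
        indirect |= network.get(partner, set())
    indirect -= seeds | direct
    return direct, indirect


# Toy network: D1 and D2 stand in for known disease genes (the training set).
pairs = [("D1", "P1"), ("D1", "P2"), ("D2", "P2"),
         ("P1", "Q1"), ("P2", "Q2"), ("Q2", "R1")]
network = build_network(pairs)
direct, indirect = interactants(network, {"D1", "D2"})
print(sorted(direct), sorted(indirect))
```

The two returned sets are the test-set candidates; ranking them is a separate step.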
ToppGene Suite – General Schema
http://toppgene.cchmc.org
ToppGene Suite – Applications
http://toppgene.cchmc.org
Application: ToppFun
Description: Detects functional enrichment of input gene list based on Transcriptome (gene
expression), Proteome (protein domains and interactions), Regulome (TFBS and miRNA),
Ontologies (GO, Pathway), Phenotype (human disease and mouse phenotype), Pharmacome
(Drug-Gene associations), and Bibliome (literature co-citation).
Input: Single gene list. Supported identifiers include NCBI Entrez gene IDs, approved human
gene symbols, NCBI Reference Sequence accession numbers.
Output: Html output; Tab-delimited downloadable text file; Graphical charts
Application: ToppGene
Description: Prioritize or rank candidate genes based on functional similarity to training
gene list.
Input: Same as above but with two gene lists (training and test)
Output: Html output
Application: ToppNet
Description: Prioritize or rank candidate genes based on topological features in
protein-protein interaction network.
Input: Same as above
Output: Html output; Cytoscape compatible input file; Graphical networks
Application: ToppGeNet
Description: Identify and prioritize the neighboring genes of the “seeds” in protein-protein
interaction network based on functional similarity to the "seed" list (ToppGene) or
topological features in protein-protein interaction network (ToppNet).
Input: Single gene list
Output: Same as above
Results of the genetic disease prioritizations using ToppGene and ToppNet
The gene-disease associations were from recently reported GWAS
and include novel disease gene associations.
Training sets: Compiled
using “phenotype/disease”
annotations in NCBI’s
Entrez Gene records and
OMIM
Test set genes: Artificial linkage interval, made up of the
candidate gene + 99 nearest neighboring genes
based on their genomic distance on the same
chromosome.
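Building such an artificial linkage interval can be sketched as a nearest-neighbor selection on one chromosome. The gene names and coordinates below are made up, `artificial_interval` is a hypothetical helper, and using a single position per gene is a simplifying assumption about how genomic distance is measured.

```python
def artificial_interval(genes, candidate, n_neighbors=99):
    """Return the candidate gene followed by its nearest same-chromosome neighbors.

    genes maps a gene name to (chromosome, position); distance is the absolute
    difference between positions on the candidate's chromosome.
    """
    chrom, pos = genes[candidate]
    distances = [
        (abs(other_pos - pos), name)
        for name, (other_chrom, other_pos) in genes.items()
        if other_chrom == chrom and name != candidate
    ]
    distances.sort()  # nearest first; ties broken by gene name
    return [candidate] + [name for _, name in distances[:n_neighbors]]


# Toy coordinates, for illustration only.
genes = {
    "CAND": ("chr1", 1000),
    "G1": ("chr1", 900),
    "G2": ("chr1", 1500),
    "G3": ("chr1", 5000),
    "G4": ("chr2", 1001),
}
test_set = artificial_interval(genes, "CAND", n_neighbors=2)
print(test_set)
```

With the default of 99 neighbors this gives a 100-gene test set, and the question becomes how close to the top the known candidate is ranked.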
Disease           Reference                      Gene      ToppGene Rank   ToppNet Rank
Bipolar Disorder  Le-Niculescu et al.            KLF12     2               15
Bipolar Disorder  Le-Niculescu et al.            RORB      4               18
Bipolar Disorder  Le-Niculescu et al.            RORA      7               13
Bipolar Disorder  Le-Niculescu et al.            ALDH1A1   10              No interaction data
Bipolar Disorder  Le-Niculescu et al.            AK3L1     11              No interaction data
Cardiomyopathy    Dhandapany et al.              MYBPC3    1               2
Celiac Disease    Hunt et al.                    SH2B3     1               8
Celiac Disease    Hunt et al.                    CCR3      2               3
Celiac Disease    Hunt et al.                    IL18R1    3               29
Celiac Disease    Hunt et al.                    RGS1      9               26
Celiac Disease    Hunt et al.                    TAGAP     14              No interaction data
Celiac Disease    Hunt et al.                    IL12A     14              10
Crohn's Disease   Fisher et al.                  MST1      1               27
Crohn's Disease   Fisher et al.                  NKX2-3    1               27
Crohn's Disease   Fisher et al.                  IRGM      2               No interaction data
Crohn's Disease   Villani et al.                 NLRP3     5               1
Crohn's Disease   Fisher et al.; Barrett et al.  IL12B     7               1
Crohn's Disease   Franke et al.                  STAT3     11              1
Crohn's Disease   Franke et al.                  PTPN2     30              6
Obesity           Renstrom et al.                MC4R      1               1
Mean                                                       6.8             11.75
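The two mean ranks reported on this slide can be checked with a few lines of arithmetic. The rank values are copied from the slide's two rank columns; excluding the "No interaction data" entries from the ToppNet mean is an assumption, though it reproduces the reported 11.75. `mean_rank` is a hypothetical helper name.

```python
# Ranks as listed on the slide; None marks "No interaction data".
toppgene_ranks = [2, 4, 7, 10, 11, 1, 1, 2, 3, 9, 14, 14, 1, 1, 2, 5, 7, 11, 30, 1]
toppnet_ranks = [15, 18, 13, None, None, 2, 8, 3, 29, 26,
                 None, 10, 27, 27, None, 1, 1, 1, 6, 1]


def mean_rank(ranks):
    """Average the ranks, skipping genes that have no rank (None)."""
    scored = [rank for rank in ranks if rank is not None]
    return sum(scored) / len(scored)


print(mean_rank(toppgene_ranks))  # 136 / 20
print(mean_rank(toppnet_ranks))   # 188 / 16
```

Because the ToppNet mean covers only the 16 genes with interaction data, the two averages are not computed over the same gene set and should be compared with that in mind.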
ToppGene Suite (http://toppgene.cchmc.org)
Why is a test set gene ranked higher?
Part 3
Drug Repositioning
What is Drug Repositioning?
Discovery of novel disease indications for existing drugs
1. Drug development: It takes about 15 years and $800
million to bring a drug to market!
2. The number of new drugs approved by the FDA each
year remains at just 20–30 compounds. At this rate it
will take more than 300 years for the number of
approved drugs to double!
3. Instead start from existing (already in the market)
or failed drugs (late-stage failures – discontinued in
development), and test them to uncover new
applications.
4. By-pass early stages of drug development required to
assess toxicity - Enter clinical trials comparatively
quickly
“The most fruitful basis for the discovery of a new drug is to start with an old
drug” - Sir James Black, Nobel Laureate, Physiology and Medicine, 1988
Examples: Viagra, Rogaine
1. Because existing drugs have known pharmacokinetics and safety profiles, and are often
approved by regulatory agencies for human use, any newly identified use can be rapidly
evaluated in phase II clinical trials, which last ~two years and cost much less
(~$17 million).
2. In 2008, of the 31 new medicines that reached their first markets, drug repositioning
accounted for one-third.
3. Since this strategy is economically more attractive than the de novo drug discovery and
development, pharmaceutical and biotech companies have directed their efforts towards it.
PRADAR (Pharmacoinformatics Radar): Pattern Recognition
Algorithms for Drug Analysis and Repositioning
Topiramate: From
epilepsy to obesity
Integrative Functional
Genomics Approaches
Adverse Drug Reactions – Mouse Phenotype: New Indications?
From serendipity to “systematic serendipity”
The Ultimate Goal…….
Disease World: Medical Informatics
• Patient Records
• Clinical Trials
• Disease Database
• PubMed
• OMIM
→Name
→Synonyms
→Related/Similar Diseases
→Subtypes
→Etiology
→Predisposing Causes
→Pathogenesis
→Molecular Basis
→Population Genetics
→Clinical findings
→System(s) involved
→Lesions
→Diagnosis
→Prognosis
→Treatment
→Clinical Trials……
Bioinformatics
• Genome
• Regulome
• Transcriptome
• Proteome
• Interactome
• Metabolome
• Physiome
• Pathome
• Variome
• Pharmacogenome
Integrative Genomics, Biomedical Informatics
Personalized Medicine
►Decision Support System
►Outcome Predictor
►Course Predictor
►Diagnostic Test Selector
►Clinical Trials Design
►Better therapeutics
►Hypothesis Generator…..