Bayesian Decomposition
Michael Ochs
Fox Chase Cancer Center
Bioinformatics
Making Proteins
A Closer Look at Translation
Post-Translational Modification
RNA Splicing
miRNA
Identifying Pathways
[Figure: pathway diagram with elements labeled A, B, C, D and 1, 2, 3. Source: www.promega.com]
Goal of Analysis
Take measurements of thousands of genes, some of which are responding to stimuli of interest,
and find the correct set of basis vectors that link to pathways,
then identify the pathways.
[Figure: diagram with items labeled 1, 2, 3]
BD: Matrix Decomposition
Data (gene 1 ... gene N by condition 1 ... condition M)
= Distribution of Patterns (gene 1 ... gene N by pattern 1 ... pattern k)
X Patterns of Behavior (pattern 1 ... pattern k by condition 1 ... condition M)
The behavior of one gene can be explained as a mixture of patterns with different behaviors.
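As a rough illustration of the decomposition above, the following NumPy sketch builds a small data matrix as the product of a distribution matrix and a pattern matrix, then checks that one gene's row is a weighted mixture of the pattern rows. The sizes, the random values, and the names `distribution`, `patterns`, and `data` are assumptions made for this example, not values from the talk.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes only (assumed, not from the slides).
n_genes, n_patterns, n_conditions = 5, 3, 4

# Distribution of patterns: how strongly each gene uses each pattern (genes x patterns).
distribution = rng.random((n_genes, n_patterns))

# Patterns of behavior: one row per pattern across the conditions (patterns x conditions).
patterns = rng.random((n_patterns, n_conditions))

# Data = Distribution X Patterns (genes x conditions).
data = distribution @ patterns

# The behavior of one gene is a mixture of the pattern rows,
# weighted by that gene's row of the distribution matrix.
gene = 0
mixture = sum(distribution[gene, k] * patterns[k] for k in range(n_patterns))
print(np.allclose(data[gene], mixture))
```

In a real analysis only `data` is observed, and both factors have to be inferred.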
Patterns as Basis Vectors
BD with Knowledge of Classes
Data (gene 1 ... gene N by condition 1 ... condition M)
= Distribution of Patterns (gene 1 ... gene N by pattern 1 ... pattern k)
X Patterns of Behavior (pattern 1 ... pattern k by condition 1 ... condition M)
Patterns of Behavior, with known classes encoded as zeros:
pattern 1   * * * * 0 0 0 0 0 0
            0 0 0 0 * * * * 0 0
            0 0 0 0 0 0 0 0 * *
pattern k   * * * * * * * * * *
BD Structure
Atomic Domains Allow Encoding of Biological Information
Markov Chain Monte Carlo is used to explore possible sets of distributions and patterns
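To give a feel for the Markov chain Monte Carlo exploration mentioned above, here is a minimal random-walk Metropolis sketch over non-negative distribution and pattern matrices. It is not the Bayesian Decomposition sampler and does not implement atomic domains. The Gaussian noise model, the flat prior on non-negative values, the step size, and all function and variable names are assumptions made for this example.

```python
import numpy as np

def log_likelihood(data, distribution, patterns, sigma):
    """Gaussian log-likelihood (up to a constant) of data given the two factors."""
    residual = data - distribution @ patterns
    return -0.5 * np.sum((residual / sigma) ** 2)

def metropolis(data, n_patterns, sigma=0.1, n_steps=5000, step=0.05, seed=0):
    """Random-walk Metropolis over non-negative factor matrices (flat prior assumed)."""
    rng = np.random.default_rng(seed)
    n_genes, n_conditions = data.shape
    distribution = rng.random((n_genes, n_patterns))
    patterns = rng.random((n_patterns, n_conditions))
    current = log_likelihood(data, distribution, patterns, sigma)
    start = current
    for _ in range(n_steps):
        # Pick one matrix and one element, and propose a small change to it.
        target = distribution if rng.random() < 0.5 else patterns
        i = rng.integers(target.shape[0])
        j = rng.integers(target.shape[1])
        old = target[i, j]
        new = old + rng.normal(0.0, step)
        if new < 0.0:
            continue  # outside the non-negative support: reject
        target[i, j] = new
        proposed = log_likelihood(data, distribution, patterns, sigma)
        # Accept with probability min(1, exp(proposed - current)).
        if np.log(rng.random()) < proposed - current:
            current = proposed
        else:
            target[i, j] = old  # reject: restore the previous value
    return distribution, patterns, start, current

# Synthetic data built from hypothetical factors, purely for demonstration.
demo_rng = np.random.default_rng(1)
demo_data = demo_rng.random((6, 2)) @ demo_rng.random((2, 5))
a, p, start_ll, end_ll = metropolis(demo_data, n_patterns=2)
print(start_ll, end_ll)
```

A full analysis would keep many samples from the chain and summarize them, for example as a mean and standard deviation per matrix element, rather than returning only the final state.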
Project Normal Data
• Download Data from CAMDA Site
• Adjust for Background Measurement
• Take Ratios
• Calc Mean and SDOM for Each Ratio
• Eliminate M3T and M4T Data
• Eliminate 24 Points with Only 1 Data Pt
– 99% 4 Pts, 1% 3 Pts, 0.1% 2 Pts
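The preprocessing steps above can be sketched roughly as follows. The array layout (one row per gene, one column per replicate, NaN for a missing replicate), the constant background value, and the example numbers are assumptions for illustration. SDOM is taken to be the standard deviation of the mean, that is, the sample standard deviation divided by the square root of the number of replicates.

```python
import numpy as np

def mean_and_sdom(ratios):
    """Return (mean, SDOM, n) per row, ignoring NaN replicates.

    Rows with fewer than two replicates get NaN for the SDOM.
    """
    ratios = np.asarray(ratios, dtype=float)
    valid = ~np.isnan(ratios)
    n = valid.sum(axis=1)
    mean = np.where(valid, ratios, 0.0).sum(axis=1) / np.maximum(n, 1)
    sq_dev = np.where(valid, (ratios - mean[:, None]) ** 2, 0.0)
    variance = sq_dev.sum(axis=1) / np.maximum(n - 1, 1)  # sample variance
    sdom = np.where(n >= 2, np.sqrt(variance / np.maximum(n, 1)), np.nan)
    return mean, sdom, n

# Hypothetical raw intensities: 2 genes x 4 replicates, NaN where a spot is missing.
signal = np.array([[30.0, 50.0, 70.0, 90.0],
                   [50.0, np.nan, np.nan, np.nan]])
reference = np.array([[20.0, 20.0, 20.0, 20.0],
                      [20.0, np.nan, np.nan, np.nan]])
background = 10.0  # assumed constant background for the example

# Adjust for background, then take ratios.
ratios = (signal - background) / (reference - background)

mean, sdom, n = mean_and_sdom(ratios)

# Eliminate points with only one data point (no SDOM can be estimated).
keep = n >= 2
print(mean[keep], sdom[keep])
```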
Filtering of Genes
• Eliminated all ESTs
– Annotated Remaining Genes from Gene Ontology on Unigene Name
• Annotated all Genes on Clone ID
– 24% Changed Unigene Cluster
– 948 Clones Had GO Process Information
Updating Annotations: ASAP
http://bioinformatics.fccc.edu/
Bayesian Decomposition
• Encoded 3 Known Patterns
– Kidney, 6 Conditions
– Liver, 6 Conditions
– Testis, 4 Conditions
• Allowed 1 - 3 Additional Patterns
– Account for Behavior Unrelated to Tissue-Specific Expression
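One way to picture the encoding of the three known patterns is as a mask over the 16 conditions, with each tissue pattern allowed to be non-zero only in its own conditions and each additional pattern left unconstrained. The sketch below assumes the conditions are ordered kidney, then liver, then testis. That ordering, the function name `pattern_mask`, and the label text are assumptions made for this example.

```python
import numpy as np

# Known tissue classes and their number of conditions, as listed above.
tissues = {"Kidney": 6, "Liver": 6, "Testis": 4}
n_conditions = sum(tissues.values())  # 16 conditions in total

def pattern_mask(n_additional):
    """Boolean mask (patterns x conditions): True where a pattern may be non-zero."""
    rows, labels, start = [], [], 0
    for name, count in tissues.items():
        row = np.zeros(n_conditions, dtype=bool)
        row[start:start + count] = True  # tissue pattern confined to its own conditions
        rows.append(row)
        labels.append(name)
        start += count
    for i in range(n_additional):
        rows.append(np.ones(n_conditions, dtype=bool))  # unconstrained pattern
        labels.append(f"Additional {i + 1}")
    return labels, np.vstack(rows)

labels, mask = pattern_mask(n_additional=1)
for label, row in zip(labels, mask):
    print(f"{label:<13}", "".join("*" if allowed else "0" for allowed in row))
```

Calling `pattern_mask` with 1, 2, or 3 additional patterns covers the range of models described above.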
Fitting the Data
Four Patterns
[Figure: chart of four patterns (Kidney, Liver, Testis, Background); horizontal axis ticks 1 to 16, vertical axis 0 to 0.3]
Five Patterns
[Figure: chart of five patterns (Kidney, Liver, Testis, Background 1, Background 2); horizontal axis ticks 1 to 16, vertical axis 0 to 0.3]
Four vs Five Patterns
Gene Ontology
• Identify Genes “Only” in One Pattern
– See if Pattern Enhanced in GO
• Identify Genes in a Pattern
– 3σ above Zero in Distribution
– Look at GO Assignments
• Identify Genes Lacking in Pattern
– Eliminate Background (Genes > 70%)
– Look for Genes Not in Pattern (3σ)
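The gene selection described above might look roughly like the sketch below. It assumes that each gene has an estimated amplitude and uncertainty (σ) for each pattern, and it defines enhancement as the frequency of a GO term among the selected genes divided by its frequency among all genes. That definition, the example numbers, and all names are assumptions for illustration, since the slides do not spell out the calculation.

```python
import numpy as np

def genes_in_pattern(amplitude, sigma, n_sigma=3.0):
    """True where a gene's amplitude in a pattern is n_sigma above zero."""
    return amplitude > n_sigma * sigma

def enhancement(selected, annotated):
    """Frequency of a GO term among selected genes relative to all genes.

    Assumes at least one gene is selected and at least one gene is annotated.
    """
    return annotated[selected].mean() / annotated.mean()

# Hypothetical amplitudes for 6 genes in 3 patterns, with a common uncertainty.
amplitude = np.array([[0.9, 0.0, 0.0],
                      [0.8, 0.05, 0.0],
                      [0.1, 0.7, 0.0],
                      [0.0, 0.0, 0.9],
                      [0.5, 0.5, 0.0],
                      [0.0, 0.6, 0.1]])
sigma = np.full_like(amplitude, 0.1)

present = genes_in_pattern(amplitude, sigma)

# Genes "only" in the first pattern: present there and in no other pattern.
only_first = present[:, 0] & (present.sum(axis=1) == 1)

# Hypothetical GO annotation: which genes carry the term of interest.
has_term = np.array([True, True, False, False, False, False])

print(only_first, enhancement(only_first, has_term))
```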
Genes Only in Kidney by GO
neurotransmitter transport *
chloride transport *
receptor mediated endocytosis
enzyme linked receptor protein signaling pathway *
transmembrane receptor protein tyrosine kinase signaling pathway *
vitamin/cofactor transport *
vitamin B12 transport
inorganic anion transport *
anion transport *
metal ion transport
neuropeptide signaling pathway
endocytosis *
* > 10x Enhancement
From Old Annotations: sodium transport, vesicle-mediated transport, amino acid transport, folate transport, homophilic cell adhesion, cell-cell adhesion, monovalent inorganic cation transport
Genes Only in Liver by GO
antigen processing
antigen processing, endogenous antigen via MHC class I
cellular defense response
response to drug
drug susceptibility/resistance *
cell-cell adhesion *
homophilic cell adhesion *
response to abiotic stimulus
response to chemical substance
response to pest/pathogen/parasite
protein targeting
* > 10x Enhancement
From Old Annotations: small molecule transport, histogenesis and organogenesis, embryogenesis and morphogenesis, lipid metabolism
Genes Only in Testis by GO
DNA recombination
meiotic recombination
reproduction *
gametogenesis *
spermatogenesis *
regulation of transcription from Pol II promoter
microtubule-based movement
microtubule-based process
development *
* > 10x Enhancement
From Old Annotations: nuclear organization and biogenesis, chromosome organization and biogenesis, cell organization and biosynthesis
Kidney Genes, 3σ, > 2 fold
amino acid metabolism
inflammatory response
mitotic cell cycle
amine metabolism
anion transport
nitrogen metabolism
perception of abiotic stimulus
perception of light
cell-cell adhesion
homophilic cell adhesion
S phase of mitotic cell cycle
endocytosis
G-protein coupled receptor protein signaling pathway
Testis Genes, 3σ, > 4 fold
reproduction
gametogenesis
spermatogenesis
regulation of cell shape and cell size
mitotic cell cycle
microtubule-based movement
protein folding
S phase of mitotic cell cycle
Liver Genes, 3σ, > 3 fold
amino acid metabolism
response to drug
drug susceptibility/resistance
energy pathways
energy derivation by oxidation of organic compounds
main pathways of carbohydrate metabolism
catabolic carbohydrate metabolism
response to abiotic stimulus
response to chemical substance
sensory perception
morphogenesis
organogenesis
tricarboxylic acid cycle
Genes Absent in Patterns
Absent in Kidney
Absent in Liver
monosaccharide metabolism
reproduction
regulation of transcription from Pol II promoter
gametogenesis
regulation of cell shape and cell size
cell differentiation
biological_process unknown
obsolete
reproduction
gametogenesis
spermatogenesis
actin filament-based process
actin cytoskeleton organization and biogenesis
microtubule-based movement
spermatogenesis
microtubule-based process
Genes Absent in Background 1
biological_process unknown
obsolete
protein modification
protein targeting
actin filament-based process
actin cytoskeleton organization and biogenesis
endocytosis
regulation of transcription from Pol II promoter
reproduction
gametogenesis
spermatogenesis
mitotic cell cycle
Genes Present in Two Tissues
Kidney/Liver not Testis
cell-cell adhesion
Kidney/Testis not Liver
mitotic cell cycle
homophilic cell adhesion
defense response
immune response
amino acid metabolism
amine metabolism
perception of abiotic stimulus
perception of light
Acknowledgements
• This Work
– Tom Moloshok
– DJ Datta (Cambridge)
– Andrew Kossenkov
– Bill Speier (JHU)
• Colleagues
– J. Robert Beck
– Frank Manion
• Programming
– Jeffrey Grant
– Elizabeth Goralczyk
– Luke Somers
• Others
– G. Parmigiani (JHU)
– T. Brown (Columbia)
– E. Korotkov (RAS)