Case Study: Characterizing Diseased States from Expression/Regulation Data
Tuck et al., BMC Bioinformatics, 2006.
Background
● How do we classify processes/expression related to disease/phenotype (separating signal/data)?
● How do we use all of the data available to us: sequences, expression, regulation?
● Present case study of acute leukemia and breast cancer (normal vs. diseased cells).
Summary of Contributions
● Construct sample-specific regulatory networks.
● Identify links between transcription factors (TFs) and regulated genes that differentiate healthy states from diseased states.
● Generalize to simultaneous changes in functionality of multiple regulatory links, either pointing to a regulatory gene or emanating from one TF.
Summary of Contributions
● Examine distances in transcriptional networks for subsets of genes that characterize the diseased state.
● Observation: genes that optimally classify samples are concentrated in neighborhoods.
● Genes that are deregulated in diseased states exhibit high connectivity.
● TF-regulated gene links and centrality of genes can be used to characterize diseased cells.
Background
● Current work largely focuses on identification of individual differentially expressed genes, or co-regulated gene sets.
● There is significant work on module identification (graph models, SVD, connected components, etc.).
● There is work on expression patterns of genes that can classify tumor types.
● There is also some work on transcription networks prior to this work [TRANSFAC/CREME].
Constructing Disease Cell Networks
● Intersect the connectivity network, representing TF binding to gene promoter regions, with co-expression networks, representing TF-target gene co-expression.
● Use TRANSFAC to relate known TF binding sites to promoter regions of genes and known TF-target gene interactions.
● For data derived from each microarray (sample or patient), construct a co-expression network such that each TF-gene pair is assigned +1 or -1 based on up/down co-regulation.
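The per-sample sign assignment can be sketched as follows. The slides do not state the exact rule, so this is a minimal sketch under one assumed rule: each expression value is discretized to up or down relative to a reference, and a TF-gene pair gets +1 when both move in the same direction and -1 otherwise. The gene names, values, and the helpers `direction` and `coexpression_signs` are hypothetical.

```python
# Hypothetical log-ratio expression values for ONE sample (one microarray).
# Positive = up relative to a reference, negative = down.
# ASSUMPTION: this thresholding rule is for illustration only.
sample_expression = {
    "TF1": 1.2, "TF2": -0.8,
    "geneA": 0.9, "geneB": -1.1, "geneC": 0.4,
}

# Hypothetical TF -> target pairs to score.
tf_gene_pairs = [
    ("TF1", "geneA"), ("TF1", "geneB"),
    ("TF2", "geneB"), ("TF2", "geneC"),
]


def direction(value):
    """Discretize an expression value to +1 (up) or -1 (down)."""
    return 1 if value >= 0 else -1


def coexpression_signs(expression, pairs):
    """Assign +1 when TF and gene move together, -1 when they move apart."""
    return {
        (tf, gene): direction(expression[tf]) * direction(expression[gene])
        for tf, gene in pairs
    }


signs = coexpression_signs(sample_expression, tf_gene_pairs)
for link, sign in signs.items():
    print(link, sign)
```

Running the same function once per microarray gives one signed co-expression network per sample or patient.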
Constructing Disease Cell Networks
● Intersection of connectivity and individual co-expression networks gives condition-specific (CS) regulatory networks.
● CS networks derived from 6 gene expression studies using 3 types of datasets: normal cell lineages, tumor vs. normal tissues, and disease-specific tumors associated with variable clinical outcomes.
● 4821 genes and 196 TFs on early Affy arrays and 13363 genes and 233 TFs on newer arrays.
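The intersection step that yields a condition-specific network can be illustrated with plain sets and dictionaries. This is a sketch with hypothetical data, not the authors' implementation: `connectivity_links` stands in for TF-gene links with binding-site support, and each per-sample co-expression network maps a link to its +1/-1 sign.

```python
# Hypothetical connectivity network: TF-gene links with binding-site evidence.
connectivity_links = {("TF1", "geneA"), ("TF1", "geneB"), ("TF2", "geneC")}

# Hypothetical per-sample co-expression networks: link -> +1 / -1.
coexpression_by_sample = {
    "patient_1": {("TF1", "geneA"): 1, ("TF2", "geneB"): 1, ("TF2", "geneC"): -1},
    "patient_2": {("TF1", "geneA"): -1, ("TF1", "geneB"): 1, ("TF3", "geneA"): 1},
}


def condition_specific_network(connectivity, coexpression):
    """Keep only co-expression links that also appear in the connectivity network."""
    return {link: sign for link, sign in coexpression.items() if link in connectivity}


cs_networks = {
    sample: condition_specific_network(connectivity_links, network)
    for sample, network in coexpression_by_sample.items()
}

for sample, network in cs_networks.items():
    print(sample, sorted(network.items()))
```

Links seen only in co-expression (such as the hypothetical `("TF2", "geneB")`) are dropped because they lack binding evidence, and links with binding evidence but no co-expression signal in a sample are simply absent from that sample's network.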
Classifying based on
network features.
● Assume that each disease sample has a distinct regulatory network (pattern of activated links that gives rise to its expression profile).
● Examine how different aspects of network structure characterize different phenotypes.
Classifying based on
network features.
Link Based Approach
● Examine differences between patient samples by analyzing the activity status of regulatory links.
● Construct networks unique to patients.
● Yields complete discriminatory networks.
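One simple way to compare link activity between two groups of patient networks is to score each link by the difference in its mean sign across the groups. The scoring rule here is an assumption chosen for illustration (the slides do not specify one), a link missing from a sample is treated as 0, and all data and function names are hypothetical.

```python
def mean_link_activity(networks, link):
    """Mean sign of a link across sample networks; absent links count as 0."""
    return sum(network.get(link, 0) for network in networks) / len(networks)


def discriminating_links(group_a, group_b):
    """Rank links by absolute difference in mean activity between two groups."""
    all_links = set().union(*group_a, *group_b)
    scores = {
        link: abs(mean_link_activity(group_a, link) - mean_link_activity(group_b, link))
        for link in all_links
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical condition-specific networks (link -> +1 / -1), two samples per group.
healthy = [
    {("TF1", "geneA"): 1, ("TF2", "geneC"): 1},
    {("TF1", "geneA"): 1, ("TF2", "geneC"): -1},
]
diseased = [
    {("TF1", "geneA"): -1, ("TF2", "geneC"): 1},
    {("TF1", "geneA"): -1, ("TF2", "geneC"): -1},
]

ranked = discriminating_links(healthy, diseased)
for link, score in ranked:
    print(link, score)
```

In this toy data the `("TF1", "geneA")` link flips sign between the groups and ranks first, while `("TF2", "geneC")` varies equally within both groups and scores zero.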
Classifying based on
network features.
Degree Based Approach
● “Centrality” of individual genes in networks.
● Degree: the number of TFs activating or suppressing a particular gene (in-degree), or the number of genes regulated by a single TF (out-degree).
● Use the genome-wide degree profile: identifying nodes with the largest changes in centrality (rewiring) will assist in detecting hotspots.
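In-degree and out-degree as defined above, and the change in degree between two networks, can be computed with a pair of counters. This is a minimal sketch on hypothetical link lists. Ranking nodes by absolute degree change is one assumed way to surface rewiring hotspots.

```python
from collections import Counter


def degree_profile(links):
    """Return (in_degree, out_degree) for a list of (TF, gene) links."""
    in_degree = Counter(gene for _, gene in links)   # TFs regulating each gene
    out_degree = Counter(tf for tf, _ in links)      # genes regulated by each TF
    return in_degree, out_degree


def degree_changes(links_a, links_b):
    """Degree in network B minus degree in network A, keyed by (kind, node)."""
    in_a, out_a = degree_profile(links_a)
    in_b, out_b = degree_profile(links_b)
    changes = {}
    for node in set(in_a) | set(in_b):
        changes[("in", node)] = in_b[node] - in_a[node]
    for node in set(out_a) | set(out_b):
        changes[("out", node)] = out_b[node] - out_a[node]
    return changes


# Hypothetical active links in a normal and a tumor sample.
normal_links = [("TF1", "geneA"), ("TF1", "geneB"), ("TF2", "geneA")]
tumor_links = [("TF1", "geneA"), ("TF2", "geneA"), ("TF2", "geneB"), ("TF2", "geneC")]

changes = degree_changes(normal_links, tumor_links)
hotspots = sorted(changes.items(), key=lambda item: abs(item[1]), reverse=True)
for (kind, node), delta in hotspots:
    print(kind, node, delta)
```

Keying by `(kind, node)` keeps in-degree and out-degree separate for nodes that act both as a TF and as a regulated gene.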
Classifying based on
network features.
Sample Classification
● Create regulatory networks for every sample and apply a classifier.
    ● Rank features to identify the set of TF-gene links.
    ● Use training sets to identify features and rank the links, genes, and node degrees that undergo the most substantial changes.
● Acute lymphoblastic leukemia vs. acute myeloid leukemia
● Two different myeloid leukemia types
● Different matched cell types (renal-cell carcinoma vs. normal)
Classifying based on
network features.
Sample Classification
● Pass top links to train a basic classifier.
● Cross validate.
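The rank, train, and cross-validate pipeline can be sketched end to end in plain Python. The slides do not name the classifier or the ranking statistic, so this sketch substitutes a nearest-centroid classifier and a mean-difference ranking purely for illustration, with leave-one-out cross-validation. Features are re-ranked inside each fold so the held-out sample never influences selection. All data and function names are hypothetical.

```python
def rank_features(samples, labels, top_k):
    """Rank link features by absolute difference of class means (two classes)."""
    classes = sorted(set(labels))

    def class_mean(cls, j):
        values = [s[j] for s, y in zip(samples, labels) if y == cls]
        return sum(values) / len(values)

    n_features = len(samples[0])
    scores = [
        abs(class_mean(classes[0], j) - class_mean(classes[1], j))
        for j in range(n_features)
    ]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)[:top_k]


def train_centroids(samples, labels, features):
    """Nearest-centroid 'training': per-class mean over the selected features."""
    centroids = {}
    for cls in set(labels):
        rows = [s for s, y in zip(samples, labels) if y == cls]
        centroids[cls] = [sum(r[j] for r in rows) / len(rows) for j in features]
    return centroids


def predict(centroids, sample, features):
    """Assign the class whose centroid is closest in squared distance."""
    def sq_dist(centroid):
        return sum((sample[j] - c) ** 2 for j, c in zip(features, centroid))
    return min(centroids, key=lambda cls: sq_dist(centroids[cls]))


def leave_one_out_accuracy(samples, labels, top_k):
    """Hold out each sample once; rank features and train on the rest."""
    correct = 0
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        features = rank_features(train_x, train_y, top_k)
        centroids = train_centroids(train_x, train_y, features)
        if predict(centroids, samples[i], features) == labels[i]:
            correct += 1
    return correct / len(samples)


# Hypothetical data: each row is one sample, each column one TF-gene link sign.
samples = [
    [1, 1, -1], [1, -1, 1], [1, 1, 1],
    [-1, 1, -1], [-1, -1, 1], [-1, 1, 1],
]
labels = ["ALL", "ALL", "ALL", "AML", "AML", "AML"]

accuracy = leave_one_out_accuracy(samples, labels, top_k=1)
print("top link index:", rank_features(samples, labels, 1))
print("leave-one-out accuracy:", accuracy)
```

The toy data is constructed so that only the first link separates the two classes, which makes the selected feature and the resulting accuracy easy to check by hand.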
Classification Techniques
Classification
results