Information Visualization
using graph algorithms
Symeonidis Alkiviadis
Contents

Preliminaries

Gene clustering

Graph extraction from biological data

Graph visualization

Open issues

Discussion
Preliminaries

Visualize clusters of genes produced by
clustering over gene expressions

Gene expression:
set of values of genes over a set of
patients
Preliminaries

Graph G(V,E) : set of vertices, with
edges joining vertices

Each vertex represents a gene

Each edge represents strong correlation

Clustering => groups of vertices
Gene clustering

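Correlation
Compute Pearson's correlation coefficient r for every pair of genes x, y:

$$
r_{xy} = \frac{\sum xy - \frac{\sum x \sum y}{N}}{\sqrt{\left(\sum x^2 - \frac{(\sum x)^2}{N}\right)\left(\sum y^2 - \frac{(\sum y)^2}{N}\right)}}
$$

where the sums run over the N expression values of each gene.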
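A minimal Python sketch of the sum-based form of Pearson's r, for illustration only. The function name `pearson_r` and the choice to return 0.0 for a constant expression profile are assumptions, not part of the slides.

```python
import math

def pearson_r(x, y):
    """Pearson's r for two equally long value lists, using the sum-based formula."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("x and y must be non-empty and of equal length")
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = sum_xy - sum_x * sum_y / n
    # Clamp at zero so rounding noise cannot make the square root fail.
    spread_x = max(sum_x2 - sum_x ** 2 / n, 0.0)
    spread_y = max(sum_y2 - sum_y ** 2 / n, 0.0)
    denominator = math.sqrt(spread_x * spread_y)
    if denominator == 0:
        return 0.0  # assumption: a constant profile counts as uncorrelated
    return numerator / denominator
```

Calling it once per gene pair gives the quadratic number of coefficients that the clustering step consumes.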
Gene clustering

Greedy clustering
for every unclassified gene x
create a cluster which includes it
add all genes y
with correlation > threshold

Cost: O(|genes|^2)
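The greedy loop can be sketched in Python as follows. This is one reading of the slide, assuming that only still-unclassified genes are added to a new cluster; `greedy_clusters` and the `corr` callback are hypothetical names.

```python
def greedy_clusters(genes, corr, threshold):
    """Greedy clustering: each unclassified gene seeds a cluster and pulls in
    every other unclassified gene whose correlation with it exceeds threshold."""
    unclassified = list(genes)
    clusters = []
    while unclassified:
        seed = unclassified.pop(0)
        cluster = [seed]
        remaining = []
        for gene in unclassified:
            if corr(seed, gene) > threshold:
                cluster.append(gene)
            else:
                remaining.append(gene)
        unclassified = remaining
        clusters.append(cluster)
    return clusters
```

Each seed is compared with every remaining gene once, which matches the quadratic cost stated above.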
Graph extraction from biological
data

Genes → vertices ✓

Clusters → groups ✓

Edges ?
Graph extraction from biological
data


In-cluster relation

Mean value of correlation coefficients for all
genes in a cluster

All pairs of genes with correlation higher
than threshold × mean are considered highly
correlated
Edge meaning: (Very) strong correlation
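A short Python sketch of the in-cluster rule, assuming the mean is taken over all pairwise correlations inside the cluster. `in_cluster_edges` and `corr` are illustrative names, not from the slides.

```python
from itertools import combinations

def in_cluster_edges(cluster, corr, threshold):
    """Return gene pairs whose correlation exceeds threshold * cluster mean."""
    pairs = list(combinations(cluster, 2))
    if not pairs:
        return []  # a single-gene cluster has no in-cluster edges
    values = {pair: corr(*pair) for pair in pairs}
    mean = sum(values.values()) / len(values)
    return [pair for pair in pairs if values[pair] > threshold * mean]
```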
Graph extraction from biological
data


Inter-cluster relation

Mean value of correlation coefficients for
each cluster

All pairs of genes with correlation higher
than threshold × (mean1 + mean2)/2 are
considered highly correlated
Edge meaning: Possibly wrong
classification
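The inter-cluster rule can be sketched the same way. The helper names are hypothetical, and treating the mean of a single-gene cluster as 0.0 is an assumption the slides do not address.

```python
from itertools import combinations, product

def mean_correlation(cluster, corr):
    """Mean pairwise correlation inside one cluster (0.0 for a single gene)."""
    values = [corr(a, b) for a, b in combinations(cluster, 2)]
    return sum(values) / len(values) if values else 0.0

def inter_cluster_edges(cluster1, cluster2, corr, threshold):
    """Pairs across two clusters with correlation above threshold * (mean1 + mean2) / 2."""
    mean1 = mean_correlation(cluster1, corr)
    mean2 = mean_correlation(cluster2, corr)
    cutoff = threshold * (mean1 + mean2) / 2
    return [(a, b) for a, b in product(cluster1, cluster2) if corr(a, b) > cutoff]
```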
Graph extraction from biological
data

Genes → vertices ✓

Clusters → groups ✓

Edges ✓
all highly correlated pairs of genes

Graph visualization

Gene → Vertex → circle

High correlation → Edge → line

Cluster → Group → Circle with
respective genes - vertices on its
periphery
Graph visualization

Place groups

Determine ordering of vertices in group

Try to reduce crossings
Graph visualization
placing groups

Force-directed method over groups
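The slides do not say which force model is used. As a rough illustration only, the sketch below applies a generic spring embedder to group centres: every pair of groups repels, and groups joined by an edge attract. All names and parameters (`force_directed`, `ideal`, `step`, `max_move`) are assumptions.

```python
import math

def force_directed(positions, edges, iterations=200, ideal=1.0, step=0.05, max_move=0.5):
    """Spring-embedder sketch: all pairs repel, connected pairs attract."""
    pos = {g: list(p) for g, p in positions.items()}
    nodes = list(pos)

    def push(force, a, b, magnitude):
        # Add a force on a, directed away from b, with distance-dependent size.
        dx, dy = pos[a][0] - pos[b][0], pos[a][1] - pos[b][1]
        d = math.hypot(dx, dy)
        if d > 0:
            force[a][0] += magnitude(d) * dx / d
            force[a][1] += magnitude(d) * dy / d

    for _ in range(iterations):
        force = {g: [0.0, 0.0] for g in nodes}
        for a in nodes:
            for b in nodes:
                if a != b:
                    push(force, a, b, lambda d: ideal * ideal / d)  # repulsion
        for a, b in edges:
            push(force, a, b, lambda d: -d * d / ideal)  # attraction pulls a to b
            push(force, b, a, lambda d: -d * d / ideal)  # and b to a
        for g in nodes:
            fx, fy = force[g]
            mag = math.hypot(fx, fy)
            if mag > 0:
                # Cap the displacement so distant groups do not overshoot.
                scale = min(step * mag, max_move) / mag
                pos[g][0] += fx * scale
                pos[g][1] += fy * scale
    return {g: tuple(p) for g, p in pos.items()}
```

With these forces, two connected groups settle at a distance close to `ideal`.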
Graph visualization

Place groups

Determine ordering of vertices in group

Try to reduce crossings
Graph visualization
Determine ordering of vertices in group(tree)

Tree
depth first search discovery time
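For a tree, ordering vertices by DFS discovery time can be sketched as below. The adjacency-list input and the name `dfs_discovery_order` are assumptions made for the example.

```python
def dfs_discovery_order(adjacency, root):
    """Return vertices in the order a depth-first search first discovers them."""
    order, seen = [], set()
    stack = [root]
    while stack:
        v = stack.pop()
        if v in seen:
            continue
        seen.add(v)
        order.append(v)
        # Push in reverse so neighbours are explored in their listed order.
        for w in reversed(adjacency[v]):
            if w not in seen:
                stack.append(w)
    return order
```

The returned list can be used directly as the order of the vertices around the group's circle.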
Graph visualization
Determine ordering of vertices in group(bicon)

Biconnected graph:
Remains connected after removing any single vertex (or edge)
Graph visualization
Determine ordering of vertices in group(bicon)

For every node u
  identify triangles (u, v, w)
  or create them
  Store (v, w)
  Remove u
Graph visualization
Determine ordering of vertices in group(bicon)
Restore graph
Remove all stored edges
Perform dfs, compute longest path and place it

Graph visualization
Determine ordering of vertices in group(bicon)

Place any remaining vertices, in order of preference:
  Next to 2 neighbors
  Next to 1 neighbor
  Next to 0 neighbors

Graph visualization
Determine ordering of vertices in group(n-bic)

Non-biconnected graph … under development
There is a vertex whose removal disconnects the graph

Decompose into bicon. components

get articulation points
vertices responsible for non-biconnectivity
Graph visualization
Determine ordering of vertices in group(n-bic)

Decompose into biconnected components
  biconnected subgraphs

Get articulation points
  vertices responsible for non-biconnectivity
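Articulation points can be found with a single DFS using discovery times and low-points. The sketch below is the standard textbook technique, not necessarily the implementation behind these slides; it assumes a simple undirected graph given as adjacency lists and uses recursion, so very deep graphs would need an iterative variant.

```python
def articulation_points(adjacency):
    """Return the set of vertices whose removal disconnects the graph."""
    disc, low, points = {}, {}, set()
    counter = [0]

    def visit(u, parent):
        disc[u] = low[u] = counter[0]
        counter[0] += 1
        children = 0
        for v in adjacency[u]:
            if v not in disc:
                children += 1
                visit(v, u)
                low[u] = min(low[u], low[v])
                # A non-root u separates v's subtree if it cannot reach above u.
                if parent is not None and low[v] >= disc[u]:
                    points.add(u)
            elif v != parent:
                low[u] = min(low[u], disc[v])  # back edge
        # The DFS root is an articulation point iff it has several DFS children.
        if parent is None and children > 1:
            points.add(u)

    for start in adjacency:
        if start not in disc:
            visit(start, None)
    return points
```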
Graph visualization
Determine ordering of vertices in group(n-bic)
Articulation points
+ biconnected components
=> Block-cut-point tree
- DFS on the block-cut-point tree => relative ordering of components
- For each biconnected component, act as before
Graph visualization
Determine ordering of vertices in group

Cost

Tree: dfs, O(|E|+|V|) = O(|E|)

Biconnected graph: dominated by dfs, O(|E|)

Non-biconnected graph: dominated by extracting the block-cut tree, O(|E|)
Graph visualization
… until now
Determine groups' positions ✓
Determine vertex ordering ✓
Graph visualization

Place groups ✓

Determine ordering in group ✓

Try to reduce crossings
Graph visualization
reduce crossings

Spin groups trying to minimize energy
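The slides do not define the energy. One plausible reading, shown here only as a sketch, is the total length of the edges leaving the group: the group is rotated slot by slot around its circle and the rotation with the shortest inter-group edges is kept. The names `circle_positions` and `best_spin` and the edge-length energy are assumptions.

```python
import math

def circle_positions(center, radius, members, shift):
    """Place members evenly on a circle, rotated by `shift` slots."""
    n = len(members)
    pos = {}
    for i, v in enumerate(members):
        angle = 2 * math.pi * ((i + shift) % n) / n
        pos[v] = (center[0] + radius * math.cos(angle),
                  center[1] + radius * math.sin(angle))
    return pos

def best_spin(center, radius, members, fixed_positions, edges):
    """Return the slot rotation minimizing total length of edges (member, outside vertex).

    Assumes `members` is non-empty and every outside vertex has a fixed position.
    """
    def energy(shift):
        pos = circle_positions(center, radius, members, shift)
        return sum(math.dist(pos[a], fixed_positions[b]) for a, b in edges)
    return min(range(len(members)), key=energy)
```

Spinning keeps the circular order inside the group intact, so the ordering computed earlier is preserved.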
Graph visualization
edge coloring

Each edge is assigned a weight
weight(x_node, y_node) = r(x_gene, y_gene)
The color of each edge reflects its weight
brighter color → stronger correlation


In-group edges have a different color than
inter-group edges
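A possible mapping from edge weight to color, sketched in Python with the standard `colorsys` module. The specific hues (blue for in-group, red for inter-group) and the brightness range are assumptions chosen for the example.

```python
import colorsys

def edge_color(weight, same_group):
    """Return an (r, g, b) tuple in 0..255; brighter means stronger correlation."""
    strength = min(max(weight, 0.0), 1.0)   # clamp the correlation to [0, 1]
    hue = 0.6 if same_group else 0.0        # assumption: blue in-group, red inter-group
    value = 0.3 + 0.7 * strength            # assumption: weak edges stay faintly visible
    r, g, b = colorsys.hsv_to_rgb(hue, 1.0, value)
    return tuple(round(c * 255) for c in (r, g, b))
```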
Graph visualization
Overall

Initially…
Graph visualization
overall

Finally…
Open issues

Clustering

Edge translation

Visualize large data sets
  Zoom
  Layered drawing
  Scrollbars
