Information Visualization Using Graph Algorithms
Symeonidis Alkiviadis
[email protected] [email protected]

Contents
  Preliminaries
  Gene clustering
  Graph extraction from biological data
  Graph visualization
  Open issues
  Discussion

Preliminaries
  Visualize clusters of genes produced by clustering over gene expressions
  Gene expression: set of values of genes over a set of patients

Preliminaries
  Graph G(V,E): set of vertices, with edges joining vertices
  Each vertex represents a gene
  Each edge represents strong correlation
  Clustering => groups of vertices

Gene clustering: Correlation
  Compute Pearson's correlation coefficient r for every pair of genes
  $$ r_{xy} = \frac{\sum xy - \frac{\sum x \sum y}{N}}{\sqrt{\left(\sum x^2 - \frac{(\sum x)^2}{N}\right)\left(\sum y^2 - \frac{(\sum y)^2}{N}\right)}} $$

Gene clustering: Greedy clustering
  for every unclassified gene x
    create a cluster which includes it
    add all genes y with correlation > threshold
  Cost: O(|genes|^2)

Graph extraction from biological data
  Genes → vertices ✓
  Clusters → groups ✓
  Edges ?
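The correlation and greedy clustering steps from the Gene clustering slides can be sketched as follows. This is a minimal illustration, not the author's implementation: the names `pearson` and `greedy_clusters`, the dictionary input format, the choice to seed clusters in input order, and the treatment of a constant expression profile as uncorrelated are all assumptions made for the example.

```python
from math import sqrt

def pearson(x, y):
    """Pearson's r for two equal-length sequences (computational form)."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    syy = sum(b * b for b in y)
    # Clamp at zero so rounding error cannot produce a negative variance term.
    vx = max(sxx - sx * sx / n, 0.0)
    vy = max(syy - sy * sy / n, 0.0)
    denom = sqrt(vx * vy)
    # Assumption: a constant profile (zero variance) counts as uncorrelated.
    return (sxy - sx * sy / n) / denom if denom else 0.0

def greedy_clusters(expr, threshold):
    """expr maps gene name -> expression values over the same patients."""
    unclassified = list(expr)  # input order decides which gene seeds a cluster
    clusters = []
    while unclassified:
        seed = unclassified.pop(0)
        cluster = [seed]
        remaining = []
        for gene in unclassified:
            # Add every still-unclassified gene correlated with the seed.
            if pearson(expr[seed], expr[gene]) > threshold:
                cluster.append(gene)
            else:
                remaining.append(gene)
        unclassified = remaining
        clusters.append(cluster)
    return clusters

expr = {
    "g1": [1, 2, 3, 4],
    "g2": [2, 4, 6, 8],
    "g3": [4, 3, 2, 1],
    "g4": [1, 3, 2, 4],
}
clusters = greedy_clusters(expr, threshold=0.9)
```

Each seed is compared against every gene still unclassified, which gives the quadratic cost stated on the slide in the worst case (every gene ends up in its own cluster).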
Graph extraction from biological data: In-cluster relation
  Mean value of correlation coefficients for all genes in a cluster
  All pairs of genes with correlation higher than threshold * mean are considered highly correlated
  Edge meaning: (Very) strong correlation

Graph extraction from biological data: Inter-cluster relation
  Mean value of correlation coefficients for each cluster
  All pairs of genes with correlation higher than threshold * (mean1 + mean2)/2 are considered highly correlated
  Edge meaning: Possibly wrong classification

Graph extraction from biological data
  Genes → vertices ✓
  Clusters → groups ✓
  Edges ✓ all highly correlated pairs of genes

Graph visualization
  Gene → Vertex → circle
  High correlation → Edge → line
  Cluster → Group → Circle with its genes (vertices) on its periphery

Graph visualization
  Place groups
  Determine ordering of vertices in group
  Try to reduce crossings

Graph visualization: Placing groups
  Force-directed method over groups

Graph visualization: Determine ordering of vertices in group (tree)
  Tree
  Depth-first search discovery time

Graph visualization: Determine ordering of vertices in group (bicon)
  Biconnected graph: remains connected after removing any one vertex/edge

Graph visualization: Determine ordering of vertices in group (bicon)
  For every node u
    identify triangles (u, v, w) or create them
    Store (v,w)
    Remove u

Graph visualization: Determine ordering of vertices in group (bicon)
  Restore graph
  Remove all stored edges
  Perform DFS, compute longest path and place it

Graph visualization: Determine ordering of vertices in group (bicon)
  Place any remaining vertices
    Next to 2 neighbors
    Next to 1 neighbor
    Next to 0 neighbors

Graph visualization: Determine ordering of vertices in group (n-bic)
  Non-biconnected graph … under development
  There is a vertex whose removal disconnects the graph
  Decompose into biconnected components (biconnected subgraphs)
  Get articulation points (vertices responsible for non-biconnectivity)

Graph visualization: Determine ordering of vertices in group (n-bic)
  Articulation points + biconnected components => block-cut-point tree
  - DFS on block-cut-point tree => relative ordering of components
  - For each biconnected component act as before

Graph visualization: Determine ordering of vertices in group
  Cost
    Tree: DFS, O(|E|+|V|) = O(|E|)
    Biconnected graph: dominated by DFS, O(|E|)
    Non-biconnected graph: dominated by extracting block-cut tree, O(|E|)

Graph visualization … until now
  Determine groups' positions ✓
  Determine vertices ordering ✓

Graph visualization
  Place groups ✓
  Determine ordering in group ✓
  Try to reduce crossings

Graph visualization: Reduce crossings
  Spin groups trying to minimize energy

Graph visualization: Edge coloring
  Each edge is assigned a weight
    weight(x_node, y_node) = r(x_gene, y_gene)
  The color of each edge reflects its weight
    brighter color → stronger correlation
  In-group edges have different color than inter-group edges

Graph visualization: Overall
  Initially…

Graph visualization: Overall
  Finally…

Open issues
  Clustering
  Edge translation
  Visualize large data sets
    Zoom
    Layered drawing
    Scrollbars
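For the non-biconnected case, the slides rely on finding articulation points before building the block-cut-point tree. The slides do not say which method is used; the sketch below assumes the standard DFS low-link approach on a simple undirected graph, and the function name `articulation_points` and the adjacency-dictionary input are hypothetical choices for illustration.

```python
def articulation_points(adj):
    """adj maps each vertex to a list of its neighbours (simple undirected graph)."""
    disc = {}      # DFS discovery time of each visited vertex
    low = {}       # lowest discovery time reachable from the vertex's DFS subtree
    points = set()
    counter = [0]

    def dfs(u, parent):
        disc[u] = low[u] = counter[0]
        counter[0] += 1
        children = 0
        for v in adj[u]:
            if v not in disc:
                children += 1
                dfs(v, u)
                low[u] = min(low[u], low[v])
                # A non-root u separates v's subtree when that subtree has
                # no back edge reaching above u.
                if parent is not None and low[v] >= disc[u]:
                    points.add(u)
            elif v != parent:
                # Back edge to an already visited vertex.
                low[u] = min(low[u], disc[v])
        # The DFS root is an articulation point iff it has 2+ DFS children.
        if parent is None and children > 1:
            points.add(u)

    for start in adj:
        if start not in disc:
            dfs(start, None)
    return points

# Triangle a-b-c with a tail c-d-e: removing c or d disconnects the graph.
group = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c", "e"],
    "e": ["d"],
}
cut_vertices = articulation_points(group)
```

The traversal visits each vertex and edge a constant number of times, in line with the O(|E|) cost on the slide. The recursive form is chosen for readability; a very large group would need an iterative DFS to stay within Python's recursion limit.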