Visualizing Large, Complex Data
Outline
•  Visualizing Large Scientific Simulation Data
–  Importance-driven visualization
–  Multidimensional filtering
•  Visualizing Large Networks
–  A layout method
–  Filtering methods
Supernova
Shared resources
The Large Data Problem
Supercomputer
Storage
Visualization Machine
A Turbulent Lifted Autoignitive
Ethylene/air Jet Flame
HO2
and mixture fraction isosurface
HO2 and OH
Complex, Multi-Scale Nature of
Turbulent Flow
Small eddies are hidden in the multi-layer flow
Feature-directed Data
Reduction and Visualization
•  Achieved over 80% saving
•  In situ data reduction and triage can facilitate subsequent
data analysis and visualization
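As a rough illustration of feature-directed reduction, the sketch below keeps every sample in blocks that contain a feature and stores a single mean value for the remaining blocks. The importance measure (the value range inside a block), the block size, the threshold, and the name `reduce_blocks` are all assumptions made for illustration; the slides do not specify how features are detected.

```python
def reduce_blocks(field, block=4, threshold=0.5):
    """Split a 1D field into blocks; keep important blocks in full, average the rest.

    Importance is the value range inside the block (an illustrative assumption).
    Returns (reduced, saving), where reduced is a list of (start, values) pairs.
    """
    reduced = []
    stored = 0
    for start in range(0, len(field), block):
        chunk = field[start:start + block]
        if max(chunk) - min(chunk) >= threshold:
            values = list(chunk)                 # feature block: keep every sample
        else:
            values = [sum(chunk) / len(chunk)]   # quiet block: keep one mean value
        reduced.append((start, values))
        stored += len(values)
    saving = 1.0 - stored / len(field)           # fraction of samples not stored
    return reduced, saving


# Hypothetical field: mostly flat, with one sharp feature in the middle.
field = [0.0] * 8 + [0.0, 1.0, 0.2, 0.9] + [0.0] * 8
reduced, saving = reduce_blocks(field)
print(f"stored blocks: {len(reduced)}, saving: {saving:.0%}")
```

The saving depends entirely on how much of the domain is classified as a feature, so the figure printed here says nothing about the saving reported on the slide.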
In Situ Methods Enable
•  Seeing all the data and capturing transient
events at highest possible detail
•  More effective data reduction
•  More efficient postprocessing analysis and viz
•  Monitoring/debugging of the simulation
(ensuring the calculation is running well)
•  Steering the simulation and driving the
simulation with interactive analysis
•  Tuning and optimizing the performance of the
simulation/machine
Fusion
Multidimensional Particles Filtering
Trapped particles that change direction frequently
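One way to express a filter such as "particles that change direction frequently" is to count sign changes in a particle's velocity along one axis and keep the particles above a threshold. The sketch below assumes each particle is stored as a list of velocity samples over time; that layout, the threshold, and the names `direction_changes` and `filter_trapped` are hypothetical and not taken from the slides.

```python
def direction_changes(velocities):
    """Count how often consecutive nonzero velocity samples flip sign."""
    changes = 0
    previous = 0.0
    for v in velocities:
        if v == 0.0:
            continue                      # exact zeros carry no direction
        if previous != 0.0 and (v > 0) != (previous > 0):
            changes += 1
        previous = v
    return changes


def filter_trapped(particles, min_changes=3):
    """Return ids of particles whose direction flips at least min_changes times."""
    return [pid for pid, vel in particles.items()
            if direction_changes(vel) >= min_changes]


# Hypothetical velocity histories for three particles.
particles = {
    "passing": [1.0, 0.9, 1.1, 1.0, 0.8, 1.2],
    "trapped": [0.5, -0.4, 0.3, -0.5, 0.4, -0.3],
    "mixed": [0.5, 0.4, -0.1, -0.2, -0.3, -0.1],
}
print(filter_trapped(particles))
```

In a multidimensional setting this count would be one derived attribute among several, combined with range selections on the others.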
Visualizing
Large, Complex Networks
•  Data are created and collected for a variety of purposes
–  Internet is a source of massive data
–  Cyber security
–  Homeland security
–  Business transactions
–  Mobile device user data
–  Health care data
–  Email
–  …
•  Relations in these data are often represented as
graphs/networks for analysis
•  To visualize a network, we need to lay it out
The Graph Layout Problem
•  The cost of displaying
a graph
•  The hairball problem
of large graph layouts
–  Large, dense graphs
become a mess
–  Inefficient use of space
–  Details cluttered
•  Solutions
–  Filtering
–  Clustering
–  Abstraction
–  Focus+context
California data 6,107 nodes 15,160 edges
High dimensional embedding method
Space Partitioning Based Layout
California data 6,107 nodes
15,160 edges
Hilbert curve
Treemap
Linlog Method 10,737s
Radial Treemap
Gosper curve
Space Filling Curve Based Layout
•  Layout defined by clustering
•  Space filling
•  Interaction is very fast, O(|V|)
•  Scales to large graphs
•  Effective for Focus+Context
•  Guaranteed aspect ratios
•  Nodes don’t become collinear
•  Rendering is slower than layout
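To make the idea concrete, the sketch below places nodes along a Hilbert curve in the order produced by a clustering traversal, so nodes that are neighbors in the ordering land close together on the grid. The index-to-coordinate conversion is the standard iterative Hilbert mapping; the node ordering, the grid order, and the function names are assumptions, and this is not claimed to be the presenters' implementation.

```python
def hilbert_d2xy(order, d):
    """Map distance d along a Hilbert curve to (x, y) on a 2^order x 2^order grid."""
    n = 1 << order
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:                      # rotate or flip the current quadrant
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def layout_on_curve(ordered_nodes, order):
    """Assign grid positions to nodes, given in clustering order, in one pass."""
    if not ordered_nodes:
        return {}
    cells = 1 << (2 * order)
    if len(ordered_nodes) > cells:
        raise ValueError("grid too small for the number of nodes")
    step = cells // len(ordered_nodes)   # spread nodes evenly along the curve
    return {node: hilbert_d2xy(order, i * step)
            for i, node in enumerate(ordered_nodes)}


# Hypothetical ordering: leaves of a cluster hierarchy, visited depth-first.
ordered = ["a1", "a2", "a3", "b1", "b2", "b3", "b4", "c1"]
positions = layout_on_curve(ordered, order=2)
print(positions)
```

Because each node's position depends only on its rank in the ordering, the layout is a single linear pass over the nodes for a fixed grid order, which is consistent with the O(|V|) claim above.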
A protein homology graph. Color corresponds to depth
in the clustering hierarchy. |V| = 28,854, |E| = 1,180,816
Visualizing Internet Connectivity
Centrality Sensitivity
•  Centralities (degree, betweenness, closeness,
eigenvector, Markov, …) indicate how important a node
is in a network.
•  Studying the sensitivity and stability of a network in
terms of different metrics for centrality allows us to
–  Filter the network
–  Search and explore in the network
–  Obtain an overview of the network
•  Compute sensitivity as the derivative of the centrality
function, approximate derivatives of centrality using
finite differences, and validate by computing the mean
squared error of the linear fit between the approximated
and analytical values
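The finite-difference step can be sketched as follows for eigenvector centrality. The choice of perturbation (scaling the weights of all edges incident to one node by a small factor), the power-iteration solver, and the names `eigenvector_centrality` and `sensitivity` are assumptions for illustration; the slides do not state which variable the derivative is taken with respect to.

```python
def eigenvector_centrality(weights, iterations=200):
    """Power iteration on a symmetric weight matrix; result sums to 1.

    Iterating with (A + I) instead of A keeps the same eigenvectors but
    avoids oscillation on bipartite graphs.
    """
    n = len(weights)
    c = [1.0 / n] * n
    for _ in range(iterations):
        nxt = [sum(weights[i][j] * c[j] for j in range(n)) + c[i]
               for i in range(n)]
        total = sum(nxt)
        c = [v / total for v in nxt]
    return c


def sensitivity(weights, j, h=1e-4):
    """Forward-difference estimate of how every centrality responds when the
    edges at node j are strengthened by a relative amount h (an assumption)."""
    base = eigenvector_centrality(weights)
    perturbed = [row[:] for row in weights]
    for k in range(len(weights)):
        if k != j:
            perturbed[j][k] *= 1.0 + h
            perturbed[k][j] *= 1.0 + h
    moved = eigenvector_centrality(perturbed)
    return [(m - b) / h for m, b in zip(moved, base)]


# Hypothetical 4-node graph with edges 0-1, 1-2, 1-3, 2-3, all of weight 1.
A = [
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 1.0],
    [0.0, 1.0, 0.0, 1.0],
    [0.0, 1.0, 1.0, 0.0],
]
s = sensitivity(A, j=1)
print([round(v, 4) for v in s])
```

A positive entry means that node's centrality rises when node 1's edges are strengthened, and a negative entry means it falls; because the centralities are normalized to sum to 1, the entries sum to approximately zero.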
Centrality Sensitivity
Minimum spanning tree as the core network
with centrality derivatives as edge weights
Central nodes remain central
Network of protein-protein interaction (~1500 nodes)
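Extracting a spanning-tree core from weighted edges can be sketched with Kruskal's algorithm. In the example the weights stand in for values derived from centrality derivatives; how a derivative is turned into a weight (for example, whether its magnitude or sign is used) is not given on the slide, so the numbers and the function name are hypothetical.

```python
def minimum_spanning_tree(num_nodes, edges):
    """Kruskal's algorithm. edges is a list of (weight, u, v); returns tree edges."""
    parent = list(range(num_nodes))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for weight, u, v in sorted(edges):   # consider the lightest edges first
        ru, rv = find(u), find(v)
        if ru != rv:                     # edge joins two separate components
            parent[ru] = rv
            tree.append((weight, u, v))
    return tree


# Hypothetical weights standing in for sensitivity-derived values.
edges = [(0.9, 0, 1), (0.1, 1, 2), (0.4, 0, 2), (0.3, 2, 3), (0.8, 1, 3)]
core = minimum_spanning_tree(4, edges)
print(core)
```

For a connected graph the result has |V| - 1 edges, which is what makes it usable as an uncluttered core of a dense network.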
Overview of Sensitivity
Friendster social network
Links exhibit negative sensitivity (red)
between cluster centers
Astrophysics co-author network
One competitive network (red) and
one collaborative network (blue)
Summary
•  Visualization, as a tool complementing conventional data
analysis and mining methods, enhances our ability to
utilize and communicate with data and knowledge
•  In situ visualization and data reduction/triage are the most
plausible solution for extreme-scale scientific
supercomputing
•  Visual-based network analysis has become an essential
tool, and interactivity is the key to understanding
complex networks.
•  The 1st IEEE Symposium on Large Data Analysis and
Visualization (LDAV), Oct 23-24, Providence, RI
•  The 6th Ultrascale Visualization Workshop,
Supercomputing Conference (SC11), November 13,
Seattle, WA