Flexible Web Visualization for Alert-Based Network Security Analytics
Lihua Hao¹, Christopher G. Healey¹, Steve E. Hutchinson²
¹North Carolina State University, ²U.S. Army Research Laboratory
ARO MURI Meeting, ASU, October 29, 2013
1/22
Introduction
• Building a visualization tool for Army Research Laboratory (ARL) network security analysts
• Driven by analysts
- “Do not fit our problem to your tool, but build a tool to fit our problem.”
- Our approach does not focus explicitly on network security data, but rather on network security analysts
• Balance
- Meeting the needs of the analysts
- Applying knowledge and best practices from visualization
• A web-based visualization tool to support flexible network data analysis
• Looking for comments and advice about an idea
- Will the ongoing ensemble visualization research be useful in the network security domain?
- How should the techniques be adjusted to better fit the requirements of the network security domain?
2/22
Design Constraints
1. Mental models
- “Fit” the mental models the analysts use to investigate problems
2. Working environment
- Integrate into the analyst’s current working environment (e.g., web browser for ARL analysts)
3. Configurability
- Static, pre-defined presentations of the data are typically NOT useful
4. Accessibility
- The visualizations should be familiar to analysts (avoid a steep learning curve)
5. Scalability
- Support query and retrieval from multiple data sources
6. Integration
- Augment the analyst’s current problem-solving strategies with useful support
3/22
Existing Visualization Techniques
• Node-link graphs
- Portall, HoNe, LinkRank
• Treemaps
- NetVis, NFlowVis
• Timelines and Event Plots
- An aggregate value over all events
- The patterns of individual events
• Basic Charts
- Snorby, NVisionIP
• Zooming, Multivariate
- NVisionIP: galaxy, small multiple, and machine views
- VisFlowConnect: global, domain, internal, and host statistics views
4/22
Data Management
• MySQL & PHP running on a remote server
- Provide reasonable scalability
- Efficient data filtering and projection
• No pre-defined table format
- The analyst chooses columns to visualize
- Sets table correlations and data filtering
- Flexibility and configurability
• Only cache results of the current query in memory
- Generate queries to retrieve new data on demand
• Full SQL is available on demand
- Analysts provide visualization requirements
- System generates whole queries automatically
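To make the query-generation idea above concrete, here is a minimal sketch in Python (the tool itself uses PHP and MySQL). It assumes a hypothetical `ChartRequest` holding the analyst's chosen columns, tables, table correlations, and filters. The class, the function, and the table and column names in the test data are illustrative assumptions, not the tool's actual interface.

```python
from dataclasses import dataclass, field

# Hypothetical description of what the analyst asked to visualize.
@dataclass
class ChartRequest:
    columns: list[str]                       # columns to project
    tables: list[str]                        # tables to read from
    joins: list[tuple[str, str]] = field(default_factory=list)  # column pairs correlating tables
    filters: list[tuple[str, str, object]] = field(default_factory=list)  # (column, operator, value)

ALLOWED_OPS = {"=", "!=", "<", "<=", ">", ">="}

def build_query(req: ChartRequest) -> tuple[str, list]:
    """Return (sql, params); filter values are passed as %s placeholders."""
    conditions = [f"{left} = {right}" for left, right in req.joins]
    params = []
    for column, op, value in req.filters:
        if op not in ALLOWED_OPS:
            raise ValueError(f"unsupported operator: {op}")
        conditions.append(f"{column} {op} %s")
        params.append(value)
    sql = f"SELECT {', '.join(req.columns)} FROM {', '.join(req.tables)}"
    if conditions:
        sql += " WHERE " + " AND ".join(conditions)
    return sql, params
```

In this sketch only filter values are parameterized. A real system would also need to validate table and column names against the database schema before interpolating them.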
5/22
Web-Based Visualization
• ARL analysts work in a browser
- Mental models & working environment
• HTML5’s canvas element
- No external plug-ins required
- Run in any modern web browser
- Accessibility
• Use 2D charts
- Common in other security visualization systems
- Effective for presenting values, trends, patterns and relationships our analysts want to explore
- Accessibility
6/22
Analyst-Driven Charts
• RGraph for basic chart visualizations
- General information visualization with 2D charts
- Only choose types of charts commonly used in network data visualization
• Assisted chart selection based on data and task (capability)
- Proportion and frequency comparison (pie)
- Value comparison over a secondary attribute (bar)
- Trends of change of a value over time (line)
- Correlation between two attributes (scatterplots)
- Range related correlation (gantt)
• Initialize chart properties
- E.g., background grids, glyph size, color and type
• Free to change the initial choices
[Figure: example charts; axis labels include dest_ip, number of alerts, src_ip, port, and time]
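The task-to-chart pairing listed on this slide can be sketched as a simple lookup. This is only an illustration in Python. The task keys and the `suggest_chart` function are hypothetical names, and the tool's real selection logic also takes the data into account.

```python
# Hypothetical task names mapped to the chart types listed on the slide.
CHART_FOR_TASK = {
    "proportion": "pie",            # proportion and frequency comparison
    "value_comparison": "bar",      # value comparison over a secondary attribute
    "trend_over_time": "line",      # trends of change of a value over time
    "correlation": "scatter",       # correlation between two attributes
    "range_correlation": "gantt",   # range related correlation
}

def suggest_chart(task: str) -> str:
    """Return the initial chart type for a task; the analyst may override it."""
    try:
        return CHART_FOR_TASK[task]
    except KeyError:
        raise ValueError(f"no chart suggestion for task: {task}") from None
```

The suggestion is only a starting point, which matches the slide's note that analysts are free to change the initial choices.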
7/22
Interactive Visualization
• Intelligent zoom
- Redraw chart to include only the selected chart elements
- Rescale the visual attributes of chart elements
• Tooltips for value query
- Data-driven notes attached to chart elements
- Access to quantitative data on demand
• Toolbars
- Customize glyph size, color, and type
- Change chart title, size, label width, and so on
- Zooming, correlated views, spreadsheets
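One way to picture the zoom step is to recompute the axis ranges from the selected elements only. The Python sketch below does that for scatterplot-style points. The `zoom_bounds` function and its `margin` parameter are assumptions for illustration, not part of RGraph or of the tool.

```python
def zoom_bounds(points, margin=0.05):
    """Return ((x_min, x_max), (y_min, y_max)) enclosing the selected points.

    `points` is a non-empty list of (x, y) pairs; `margin` is the fraction
    of each axis span added as padding on both sides.
    """
    if not points:
        raise ValueError("no chart elements selected")

    def padded(values):
        lo, hi = min(values), max(values)
        pad = (hi - lo) * margin or 1.0  # fall back to 1.0 so a single value still gets a visible range
        return lo - pad, hi + pad

    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return padded(xs), padded(ys)
```

After the bounds change, the chart would be redrawn with only the selected elements and with glyph sizes rescaled to the new range.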
8/22
Correlated Views
• A sequence of visualizations to track an ongoing investigation
- Correlate multiple data sources
- Explore data at multiple levels of detail
• Correlated charts
- Select sub-regions of a chart
- Filter corresponding rows
- Add additional constraints, tables, attributes
- Generate a follow-on, correlated chart
• Raw data spreadsheets
- Text-based value examination
- A conventional approach
- Working environment and mental models
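The correlated-chart steps (select a sub-region, filter the corresponding rows, then chart a different attribute) can be sketched on in-memory rows. This Python example is a simplification. The tool described here issues new database queries instead, and the function names and sample rows are hypothetical.

```python
def select_region(rows, column, low, high):
    """Keep the rows whose value in `column` lies inside the selected sub-region."""
    return [row for row in rows if low <= row[column] <= high]

def count_by(rows, column):
    """Count rows per distinct value of `column`, as input for a follow-on chart."""
    counts = {}
    for row in rows:
        counts[row[column]] = counts.get(row[column], 0) + 1
    return counts

# Hypothetical alert rows; real data would come from the database.
alerts = [
    {"time": 5, "src_ip": "10.0.0.1", "dest_ip": "10.0.0.9"},
    {"time": 12, "src_ip": "10.0.0.2", "dest_ip": "10.0.0.9"},
    {"time": 15, "src_ip": "10.0.0.2", "dest_ip": "10.0.0.7"},
    {"time": 30, "src_ip": "10.0.0.3", "dest_ip": "10.0.0.9"},
]

selected = select_region(alerts, "time", 10, 20)   # analyst brushes a time range
follow_on = count_by(selected, "src_ip")           # data for the correlated chart
```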
9/22
Track Visualization Requests
• Record visualization requests in each step
• When a new request is issued, list all previous requests, actions and charts
• Improve an analyst’s “working memory” capacity
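A request history of this kind can be sketched as an append-only log. The `RequestLog` class and its fields below are hypothetical and written in Python for illustration. They are not the tool's actual data structure.

```python
from dataclasses import dataclass, field

@dataclass
class RequestLog:
    """Append-only record of the analyst's visualization requests."""
    entries: list[dict] = field(default_factory=list)

    def record(self, request: str, action: str, chart: str) -> None:
        """Store one step: what was asked, which action was taken, which chart was drawn."""
        self.entries.append({
            "step": len(self.entries) + 1,
            "request": request,
            "action": action,
            "chart": chart,
        })

    def history(self) -> list[str]:
        """List all previous steps in order, for display when a new request is issued."""
        return [
            f"{e['step']}. {e['action']}: {e['request']} ({e['chart']})"
            for e in self.entries
        ]
```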
10/22
Trap Data
• Need real-world data to test the system
• For security reasons, it is not possible to use data from ARL for testing
• The trap server
- Data from network security researchers at NCSU
- Real-world network traffic in the Computer Science building
- Transmitted to a Snort sensor to perform: (1) intrusion detection and (2) extraction of network packets
- Stores two types of data: (1) NetFlow data and (2) Snort alerts
• An example file for 24 hours of data
- 17.4GB of packet headers
- 938K unique source IPs, 168K unique destination IPs
- 1.6M flows with 615K alerts
11/22
Summarization of our Web-based Visualization
• MySQL & PHP based database management
- Scalability, data filtering and projection
- No predefined table format
• Web-based visualization & analyst driven 2D charts
- Mental model & working environment
- Avoid steep learning curve
- Select chart based on data and task
- RGraph
• Interactive Visualization
- Intelligent zoom, tooltips, toolbar
• Correlated Views
- A sequence of visualizations
- Track an ongoing investigation
- Raw data spreadsheets
12/22
Ensemble Visualization
• Scientific ensemble analysis & visualization
- A collection of related datasets (members), from runs of a simulation or an experiment, with slightly varying initial conditions or parameters
- Focus on scalability (data attribute, data element, member)
- Relationships between members (comparison, aggregation, pattern mining)
• Apply to network security data
- Scalability is also critical
- Relationships between network traffic
- Opportunity to apply ongoing research from ensembles to the network security domain
• How is a network security dataset an ensemble?
- E.g., NetFlow ensemble (member: a NetFlow)
- Distributions of alerts within and between NetFlows
• Are ensemble techniques useful in the network security domain?
- Determine the value added of this analysis
13/22
Two Stages of Ensemble Analysis
1. Structure the members into sets based on their similarities
- Level of detail clustering
- Visualize the cluster hierarchy as a tree
- Analysts choose members to visualize from the cluster tree (configurability)
2. Visualize the member sets
- Use chart visualizations
- Working environment, accessibility
14/22
NetFlow Similarity Measurement
1. Time duration
2. Density of alerts
3. Distributions of alerts
4. Types of alerts within NetFlow
• Analysts decide
- Which factors to measure
- Weights of each factor
- Configurability
[Figure: example NetFlows: 46 secs with 1 alert; 46 secs with 7 alerts; 46 secs with 7 alerts]
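A weighted combination of the four factors above could look like the following Python sketch. The slide does not give per-factor formulas, so every one here is an assumption: a min/max ratio for duration and density, a histogram overlap for alert distribution, and a Jaccard index for alert types. The flow dictionary layout and the function names are also hypothetical, and durations are assumed to be positive.

```python
def ratio(a, b):
    """1.0 when two non-negative magnitudes are equal, falling toward 0.0 as they diverge."""
    if a == b:
        return 1.0
    return min(a, b) / max(a, b)

def alert_histogram(flow, bins=4):
    """Normalized histogram of alert times over the flow's duration."""
    counts = [0] * bins
    for t in flow["alert_times"]:
        counts[min(int(t / flow["duration"] * bins), bins - 1)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * bins

def jaccard(a, b):
    """Overlap of two sets of alert types; two empty sets count as identical."""
    return len(a & b) / len(a | b) if (a or b) else 1.0

def similarity(a, b, weights):
    """Weighted average of the factor scores named in `weights` (chosen by the analyst)."""
    hist_a, hist_b = alert_histogram(a), alert_histogram(b)
    scores = {
        "duration": ratio(a["duration"], b["duration"]),
        "density": ratio(len(a["alert_times"]) / a["duration"],
                         len(b["alert_times"]) / b["duration"]),
        "distribution": 1.0 - 0.5 * sum(abs(p - q) for p, q in zip(hist_a, hist_b)),
        "types": jaccard(a["alert_types"], b["alert_types"]),
    }
    unknown = set(weights) - set(scores)
    if unknown:
        raise ValueError(f"unknown factors: {sorted(unknown)}")
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w * scores[name] for name, w in weights.items()) / total
```

Leaving a factor out of `weights` excludes it from the measure, which mirrors the analyst choosing both which factors to measure and how much each one counts.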
15/22
NetFlow Cluster Tree
• Clustering at varying thresholds of similarity
• Analysts choose tree nodes to visualize
- Trade-off: similarity vs. number of members
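The effect of varying the similarity threshold can be sketched with a simple grouping rule: members whose pairwise similarity reaches the threshold are merged into one cluster. The slide does not name the clustering method, so the single-linkage grouping below is an assumption, as are the function name and the matrix input. The example is in Python.

```python
def clusters_at(threshold, sim):
    """Group members whose pairwise similarity reaches `threshold`.

    `sim` is a symmetric matrix (list of lists) of similarities in [0, 1].
    Returns a sorted list of clusters, each a list of member indices.
    """
    n = len(sim)
    parent = list(range(n))

    def find(i):
        # Follow parent links to the cluster representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if sim[i][j] >= threshold:
                parent[find(i)] = find(j)  # merge the two clusters

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```

Running this at decreasing thresholds yields nested clusters, from many small, highly similar sets to a few large, looser ones. That nesting is what a cluster tree records, and it reflects the trade-off between similarity and number of members.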
16/22
NetFlows Ensemble – 123 Members
• Analysts define members to form an ensemble
17/22
A Cluster of NetFlows
• Currently, all NetFlows are visualized individually in a Gantt chart
• Developing methods to aggregate NetFlows into a composite visualization
[Figure: Gantt chart of a NetFlow cluster; axis labels: source IP, port; time]
18/22
Feedback for Further Adjustment
• Ensemble analysis and visualization is flexible
- Techniques vary based on requirements of applications
• Different perspectives to define a network ensemble (member)?
• Useful ways to measure correlations between ensemble members?
• Useful ways to structure ensemble members?
• Special requirements for the composite visualization?
• Other recommendations?
19/22
Future Work
• Analysis Sandbox
- Individual analyses can be performed, stored, reviewed and compared
- Improve an analyst’s “working memory” capacity
• Analysis Preferences
- Track an analyst’s actions to better anticipate their strategies for specific types of tasks
- Use preference elicitation algorithms to track an analyst’s interest within a visualization session
• Real-world Integration
- Not allowed to speak directly with the analysts
- Coordinate with IT staff who support the analysts
• Ensemble Visualization
- Further adjust existing techniques to meet the requirements of the network security domain
- Integrate into the web-based network security visualization tool
20/22
Progress Summary
• Papers
- Flexible Web Visualization for Alert-Based Network Security Analytics. Hao, Healey, and Hutchinson. In Proceedings VizSec 2013 (Atlanta, GA), 2013.
• Students supported
- Lihua Hao, PhD candidate, NC State University
• Projects supported
- Web-based visualization for network security analytics
- Ensemble visualization for network security analytics
21/22
FY 2014 Research Plan
• Validation of web-based tool with ARL collaborators
- Finalize web-based visualization tool
- Present tool to ARL IT staff
- Integrate feedback into tool’s design, iterate on requested changes and improvements
• Investigation of scalability support through ensemble visualization
- Confirm interest in pursuing scalability support
- Integrate ensemble visualization research into web-based visualization tool
- Update visualizations to support intelligent summarization and aggregation
22/22