Hybrid Intelligent Systems
for Network Security
Lane Thames
Georgia Institute of Technology
Savannah, GA
[email protected]
Presentation Overview
Discuss the goals of this project
Overview of Self Organizing Maps
Overview of Bayesian Learning Networks
Describe the details of the Hybrid System
Review the Experimental Results
Discuss Conclusions and Future Work
Q&A
Internet Growth
Internet Growth is Steadily Increasing
Many different types of applications are
now using the Internet as a
communication channel
Data Source: www.idc.com
The life of a network security
professional
Data Source: http://www.cert.org/stats/cert_stats.html
Current Issues with Security
Short time between disclosure of
vulnerability and attack
Huge Rule Base
Huge Signature Databases
Lag time between attack detection and
signature creation
Lag time between vulnerability discovery
and patch deployment
Project Goals
Develop an Intelligent System that works
reliably with data that can be collected
purely within a Computer Network
Why? If security mechanisms are difficult
to use, people will not use them.
Using data from the network takes some
of the burden off the end user
Hybrid Intelligent Systems
A system was developed that made use of
two types of intelligent algorithms:
Self-Organizing Maps
Bayesian Learning Networks
Training and Testing Data Set
KDD-CUP 99 Data Set
The Data set used for the Third
International Knowledge Discovery and
Data Mining Tools Competition
Training and Testing Data Set
41 total features, categorized as:
Basic TCP/IP features
Content features
Time-based traffic features
Host-based traffic features
Self Organizing Maps—SOM
Pioneered by Dr. Teuvo Kohonen
An algorithm that transforms high
dimensional input data domains to
elements of a low dimensional array of
nodes
Self-Organizing Maps
Input Data Vectors
$X = [x_1, \ldots, x_n]$
Parametric Vector associated with each element, i, of the grid
$M_i = [m_{i1}, \ldots, m_{in}]$
Self-Organizing Map
A decoder function is defined on the
basis of distance between the input
vector and the parametric vector.
$d(X, M_i)$
The decoder function is used to map the
image of the input vector onto the SOM
grid. The decoder function is usually
chosen to be either the Manhattan or
Euclidean distance metric.
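As a minimal sketch of the two distance metrics named above, assuming input and parametric vectors are plain Python lists of equal length (the function names and sample values are illustrative, not from the slides):

```python
import math

def manhattan(x, m):
    """Manhattan (L1) distance between input vector x and parametric vector m."""
    return sum(abs(a - b) for a, b in zip(x, m))

def euclidean(x, m):
    """Euclidean (L2) distance between input vector x and parametric vector m."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, m)))

# Illustrative values only.
x = [1.0, 2.0, 3.0]
m_i = [2.0, 0.0, 3.0]
d_manhattan = manhattan(x, m_i)   # |1-2| + |2-0| + |3-3|
d_euclidean = euclidean(x, m_i)   # sqrt(1 + 4 + 0)
print(d_manhattan, d_euclidean)
```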
Self-Organizing Maps
A Best Matching Unit, denoted as the
index c, is chosen as the node on the SOM
grid that is closest to the input vector
$c = \arg\min_i \{ d(X, M_i) \}$
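The Best Matching Unit search can be sketched as follows, assuming the SOM grid is stored as a list of parametric vectors (called `codebook` here, a hypothetical name) and that Euclidean distance is the chosen decoder function:

```python
import math

def euclidean(x, m):
    """Euclidean distance between input vector x and parametric vector m."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, m)))

def best_matching_unit(x, codebook, distance=euclidean):
    """Return the index c of the node whose parametric vector is closest to x."""
    return min(range(len(codebook)), key=lambda i: distance(x, codebook[i]))

# Illustrative three-node map with two-dimensional parametric vectors.
codebook = [[0.0, 0.0], [1.0, 1.0], [4.0, 4.0]]
c = best_matching_unit([0.9, 1.2], codebook)
print(c)  # index of the closest node
```

Passing `distance` as a parameter keeps the search independent of the metric, so a Manhattan distance function could be swapped in without changing the search itself.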
Self-Organizing Maps
The dynamics of the SOM algorithm
demand that the Mi be shifted towards the
order of X such that a set of values {Mi} are
obtained as the limit of convergence of the
following:
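$m_i(t+1) = m_i(t) + \alpha(t)\,[x(t) - m_i(t)]\,H_{ic}$
where $\alpha(t)$ is the learning-rate factor and $H_{ic}$ is the
neighborhood function centered on the best matching unit c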
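One step of this update rule might look like the sketch below. The slides do not specify the neighborhood function or the grid layout, so this assumes a one-dimensional grid and a Gaussian neighborhood with width `sigma`; `alpha`, `sigma`, and `som_update` are illustrative names.

```python
import math

def som_update(codebook, x, c, alpha, sigma):
    """Apply one SOM update: shift every node toward x, weighted by its
    grid distance to the best matching unit c (Gaussian neighborhood, assumed)."""
    updated = []
    for i, m_i in enumerate(codebook):
        # Neighborhood weight: 1 at the BMU, decaying with grid distance |i - c|.
        h_ic = math.exp(-((i - c) ** 2) / (2.0 * sigma ** 2))
        updated.append([m + alpha * h_ic * (xk - m) for m, xk in zip(m_i, x)])
    return updated

# Illustrative values only.
codebook = [[0.0, 0.0], [1.0, 1.0], [4.0, 4.0]]
new_codebook = som_update(codebook, x=[2.0, 2.0], c=1, alpha=0.5, sigma=1.0)
print(new_codebook)
```

In practice both the learning rate and the neighborhood width are usually decreased over time so that the map converges.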
Bayesian Learning Networks—BLN
A BLN is a probabilistic model, and the
network is built on the basis of a Directed
Acyclic Graph (DAG)
The directed edges of the graph represent
relationships among the variables
Bayesian Learning Networks
The Fundamental Equation: Bayes' Theorem
$P(h \mid D) = \dfrac{P(D \mid h)\,P(h)}{P(D)}$
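A small worked example of Bayes' theorem, using made-up numbers that are not from the slides: h is the hypothesis "this connection is an attack" and D is the observation "an alert fired".

```python
# Hypothetical probabilities, chosen only to illustrate the arithmetic.
p_h = 0.01              # prior P(h): fraction of connections that are attacks
p_d_given_h = 0.90      # likelihood P(D | h): alert fires on an attack
p_d_given_not_h = 0.05  # P(D | not h): alert fires on normal traffic

# Total probability of the observation, P(D).
p_d = p_d_given_h * p_h + p_d_given_not_h * (1.0 - p_h)

# Posterior P(h | D) by Bayes' theorem.
p_h_given_d = p_d_given_h * p_h / p_d
print(round(p_h_given_d, 4))
```

With these assumed numbers the posterior stays well below the likelihood because the prior is small, which is why the prior term matters in the equation.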
Bayesian Learning Networks
In Bayesian learning, we calculate the
probability of a hypothesis and make
predictions on that basis
Bayesian Learning Networks
With BLN, we have conditional probabilities
for each node given its parents
The graph shows causal connections between
the variables
Prediction and abduction
[Figure: example DAG over the nodes x1, x2, x3, x4, x5]
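Because each node carries a conditional probability given its parents, the joint distribution of the network factorizes over the graph. Writing $\mathrm{Pa}(x_i)$ for the parents of node $x_i$ (notation introduced here, not on the slides):

```latex
P(x_1, \ldots, x_n) = \prod_{i=1}^{n} P\bigl(x_i \mid \mathrm{Pa}(x_i)\bigr)
```

A node with no parents contributes its unconditional probability $P(x_i)$ to the product.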
Naïve Bayesian Learning Network
The Naïve BLN is a
special case of the
general BLN
It contains one root node
which is called the class
variable, C
The leaf nodes are the
attribute variables
(X1 … Xi)
It is Naïve because it
assumes the attributes
are conditionally
independent given the
class
[Figure: Naïve BLN with root node C and leaf nodes x1, x2, x3]
The Naïve BLN Classifier
Once the network is trained, it can be used
to classify new examples where the
attributes are given and the class variable
is unobserved—abduction
The Goal: Find the most probable class
value given a set of attribute instantiations
(X1 … Xi)
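The classification step can be sketched as below. This is a generic Naïve Bayes classifier for discrete attributes, not the presenter's implementation: it picks the class c that maximizes P(c) multiplied by the product of P(x_j | c), and it assumes add-one (Laplace) smoothing, which the slides do not mention. The toy records and labels are invented for illustration.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(rows, labels):
    """Count class frequencies and per-class attribute-value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (attribute index, class) -> value counts
    values = defaultdict(set)            # attribute index -> observed values
    for row, c in zip(rows, labels):
        for j, v in enumerate(row):
            value_counts[(j, c)][v] += 1
            values[j].add(v)
    return class_counts, value_counts, values

def classify(row, model):
    """Return the most probable class for the given attribute values."""
    class_counts, value_counts, values = model
    total = sum(class_counts.values())
    best_class, best_score = None, -math.inf
    for c, n_c in class_counts.items():
        score = math.log(n_c / total)  # log prior, log P(c)
        for j, v in enumerate(row):
            # log P(x_j | c) with add-one smoothing (an assumption of this sketch)
            score += math.log((value_counts[(j, c)][v] + 1) / (n_c + len(values[j])))
        if score > best_score:
            best_class, best_score = c, score
    return best_class

# Toy training data: (protocol, service) pairs with invented labels.
rows = [("tcp", "http"), ("tcp", "http"), ("icmp", "ecr_i"),
        ("icmp", "ecr_i"), ("tcp", "ftp")]
labels = ["normal", "normal", "attack", "attack", "normal"]
model = train_naive_bayes(rows, labels)
print(classify(("icmp", "ecr_i"), model))
```

Working in log space avoids numerical underflow when many attributes are multiplied together, as would happen with the 41 features of the data set.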
Hybrid System Details
[Diagram: Training Data Subset; SOM Training]
Hybrid System Details
[Diagram: Data; Trained SOM; Modified Data; BN Development Module]
Hybrid System Details
[Diagram: BN Development Module; Structure File; Training Data; Bayesian Training]
Hybrid System Details
[Diagram: Test Data; Bayesian/SOM Classifier; Classification File]
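The slides do not spell out how the trained SOM modifies the data before it reaches the Bayesian stage. One possible reading, assumed here purely for illustration, is that each record is tagged with the index of its best matching unit, and that index becomes an extra discrete attribute for the Bayesian network. The names `bmu_index` and `add_som_feature` and the sample values are hypothetical.

```python
def bmu_index(x, codebook):
    """Index of the node closest to x (squared Euclidean distance;
    it has the same minimizer as the Euclidean distance)."""
    return min(range(len(codebook)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(x, codebook[i])))

def add_som_feature(records, codebook):
    """Append each record's BMU index as an additional attribute (assumed design)."""
    return [tuple(r) + (bmu_index(r, codebook),) for r in records]

# Stand-in for a trained SOM with two nodes; values are invented.
codebook = [[0.0, 0.0], [5.0, 5.0]]
records = [(0.2, 0.1), (4.8, 5.3)]
modified = add_som_feature(records, codebook)
print(modified)
```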
Experimental Results
Four types of analyses were made with the dataset:
BLN analysis with network and host based data
BLN analysis with network data
Hybrid analysis with network and host based data
Hybrid analysis with network based data
Experimental Results
                                   BLN             BLN        Hybrid          Hybrid
                                   Host/Network    Network    Host/Network    Network
                                   Based           Based      Based           Based
Total Cases                        65,505          62,047     65,505          62,047
Correctly Classified               65,019          59,734     65,238          61,631
% Correctly Classified             99.26%          96.27%     99.59%          99.33%
Number of Incorrectly Classified   486             2,315      267             416
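The percentage row follows from the case counts as correctly classified divided by total cases. A short check of that arithmetic, using the totals and correct counts reported in the table (the dictionary layout is just for illustration):

```python
# (total cases, correctly classified) as reported in the results table.
results = {
    "BLN, host/network based": (65505, 65019),
    "BLN, network based": (62047, 59734),
    "Hybrid, host/network based": (65505, 65238),
    "Hybrid, network based": (62047, 61631),
}

# Percentage correctly classified, formatted to two decimals.
percent_correct = {
    name: f"{100.0 * correct / total:.2f}%"
    for name, (total, correct) in results.items()
}
for name, pct in percent_correct.items():
    print(f"{name}: {pct}")
```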
Future and Current Work
HoneyNet Project
Resource
Management
System with
Intelligent System
Processing at the
Core
Conclusions
Intelligent System algorithms are very
useful tools for applications in Network
Security
Conclusions
Questions remain to be answered:
How will the system behave as the data
becomes very noisy with respect to the
training data?
How will other intelligence algorithms
compare in performance (training time,
accuracy, robustness to noise)?