A Proposal of New Benchmark Data
to Evaluate Mining Algorithms
for Intrusion Detection
Jungsuk SONG†, Hiroki TAKAKURA‡, Yasuo OKABE‡
†Graduate School of Informatics, Kyoto Univ.
‡Academic Center for Computing and Media Studies, Kyoto Univ.
Overview

Introduction
  - Intrusion Detection System
  - Intrusion Detection Evaluation Data
KDD Cup 99 Data Set
  - Details
  - Problems
Our Experimental Result
Our Proposal
2007/1/25
23rd Asia Pacific Advanced Network Meeting
Introduction

Intrusion Detection System (IDS)
  - A combination of software and hardware that attempts to perform intrusion detection
  - Raises an alarm when a possible intrusion or suspicious pattern is observed

[Figure: network diagram showing an Attacker, the Internet, a Firewall, the Internal Network, and two IDS placements]
Introduction

Why do we need an IDS?
  - Unknown weaknesses or bugs
  - Complex, unforeseen attacks
  - Firewalls, security policies
  - Using the detected information
    - Recover compromised systems
    - Understand the attack mechanism
    - Detect novel attacks
    - Defend our systems
Introduction

We need evaluation data for IDS
  - Performance improvement
  - Technical progress
  - Research guide…

KDD Cup 99 Data Set
  - Most commonly used evaluation data, but..

Propose new benchmark data
KDD Cup 99 Data Set

Modification of the DARPA 1998 data set

DARPA 1998 data set
  - Managed by Lincoln Lab. (under DARPA sponsorship)
  - Simulated nine weeks of raw TCP dump data
  - Attacks
    - 38 different attacks against Unix/Linux machines
    - DoS, scan, buffer overflow and so on
  - Normal traffic
    - 1000's of virtual hosts and 100's of user automata
KDD Cup 99 Data Set

Each connection ⇒ 41-dimensional vector

Samples
5,tcp,smtp,SF,959,337,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,1,1,0.00,0.00,0.00,0.00,1.00,0.00,0.00,144,192,0.70,0.02,0.01,0.01,0.00,0.00,0.00,0.00,normal.
0,tcp,http,SF,54540,8314,0,0,0,2,0,1,1,0,0,0,0,0,0,0,0,0,2,2,0.00,0.00,0.00,0.00,1.00,0.00,0.00,118,118,1.00,0.00,0.01,0.00,0.00,0.00,0.02,0.02,back.
0,tcp,http_443,S0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,114,2,1.00,1.00,0.00,0.00,0.02,0.06,0.00,255,2,0.01,0.07,0.00,0.00,1.00,1.00,0.00,0.00,neptune.

Numerical: 34, Categorical: 7
  - Basic features: "duration", "protocol"…
  - Statistical features: "number of connections to the same host as the current connection in the past two seconds"…
  - Label ⇒ "normal" or the name of the attack
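
As a rough illustration of the record layout above, the following Python sketch splits one sample line into its 41 feature values and its label. The function name `parse_kdd_record` is hypothetical and not part of the data set. The sketch also assumes that any value that does not parse as a number is symbolic, so categorical features that are encoded as 0/1 stay numeric here.

```python
def parse_kdd_record(line):
    """Split one KDD Cup 99 record into 41 feature values and a label."""
    fields = line.strip().split(",")
    if len(fields) != 42:
        raise ValueError(
            f"expected 41 features plus a label, got {len(fields)} fields"
        )
    *raw_features, label = fields
    features = []
    for value in raw_features:
        try:
            features.append(float(value))
        except ValueError:
            # Assumption: non-numeric strings ("tcp", "smtp", "SF") are symbolic.
            features.append(value)
    # Labels in the samples end with a period, e.g. "normal."
    return features, label.rstrip(".")


# First sample record from the slide, joined back onto one line.
sample = (
    "5,tcp,smtp,SF,959,337,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,1,1,"
    "0.00,0.00,0.00,0.00,1.00,0.00,0.00,144,192,0.70,0.02,0.01,0.01,"
    "0.00,0.00,0.00,0.00,normal."
)
features, label = parse_kdd_record(sample)
print(len(features), features[:4], label)
```

A real loader would use the data set's own feature list to decide which columns are categorical rather than guessing from the value.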
KDD Cup 99 Data Set

Problems
  - Attacks
    - Cannot reflect current malicious activities
      - Stealthy scan ⇒ short time interval, no multiple IP address scan
      - No attacks against Windows machines
  - Protocol types
    - Only TCP, UDP, ICMP
    - Cannot detect attacks such as ARP spoofing
  - Simplicity
    - Only 3 real victim hosts
    - 1000's of virtual hosts and 100's of user automata (custom software)
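
One possible reading of the "short time interval" point is that a window-based statistical feature, such as the two-second connection count, barely changes when a scan is slow. The Python sketch below is only an illustration of that reading: the function `same_host_counts`, the host address, and the timings are made up, and counting the current connection itself is an assumption rather than the data set's documented definition.

```python
from collections import deque


def same_host_counts(connections, window=2.0):
    """For each (timestamp, dst_host) pair, count the connections to the same
    destination host in the preceding `window` seconds, itself included.

    Assumes `connections` is sorted by timestamp.
    """
    recent = deque()  # (timestamp, dst_host) pairs still inside the window
    counts = []
    for timestamp, dst_host in connections:
        # Drop connections that have fallen out of the time window.
        while recent and timestamp - recent[0][0] > window:
            recent.popleft()
        recent.append((timestamp, dst_host))
        counts.append(sum(1 for _, host in recent if host == dst_host))
    return counts


# Hypothetical traffic: five probes to one host, 0.1 s apart versus 5 s apart.
fast_scan = [(0.1 * i, "10.0.0.5") for i in range(5)]
slow_scan = [(5.0 * i, "10.0.0.5") for i in range(5)]
fast_counts = same_host_counts(fast_scan)
slow_counts = same_host_counts(slow_scan)
print(fast_counts, slow_counts)
```

In this sketch the fast scan drives the count up with every probe, while the slow scan keeps it at the same value a single ordinary connection would produce.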
Our Experimental Results

PCA (Principal Component Analysis)
  - Technique for reducing the dimensions of a data set
  - Transforms the data to a new coordinate system

What we know from PCA
  - The number of dimensions that are actually required to represent the original data
  - Accumulative contribution ratio
    - Indicates what percentage of the original data can be represented
  - For example
    - 2 dimensions ⇒ 90%: those 2 dimensions represent 90% of the original data
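
The accumulative contribution ratio can be sketched in a few lines of Python. The sketch takes the usual definition, in which the ratio for the first k principal components is the sum of the k largest eigenvalues of the covariance matrix divided by the sum of all eigenvalues. The eigenvalues below are invented for illustration and are not the results of the experiment, and both function names are hypothetical.

```python
from itertools import accumulate


def cumulative_contribution(eigenvalues):
    """Return the cumulative contribution ratio for 1, 2, ..., n components."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    return [partial / total for partial in accumulate(ordered)]


def components_needed(eigenvalues, threshold):
    """Smallest number of components whose cumulative ratio reaches threshold."""
    ratios = cumulative_contribution(eigenvalues)
    for k, ratio in enumerate(ratios, start=1):
        if ratio >= threshold:
            return k
    return len(ratios)


# Made-up eigenvalues of a covariance matrix, for illustration only.
example_eigenvalues = [6.0, 3.0, 0.5, 0.3, 0.2]
ratios = cumulative_contribution(example_eigenvalues)
needed = components_needed(example_eigenvalues, 0.90)
print(ratios, needed)
```

With these invented values, two components already reach the 90% level, which mirrors the "2 dimensions ⇒ 90%" example on the slide.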
Our Experimental Results
There is no guarantee that their performance will also be good in a real environment
Our Proposal

New benchmark data
  - IDS
  - Honeypots
  - KDD Cup 99 form

Privacy problems
  - Sanitize IP addresses
  - Remove payload data

Open
  - Update every month

Goal
  - Comparative analysis of IDS alerts and honeypot traffic data
  - Detect the attacks that are missed by the IDS
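
The slides do not say how the IP addresses would be sanitized, so the following Python sketch shows just one possible approach: replacing each IPv4 address with a pseudonym derived from a keyed hash, and dropping the payload before release. The record layout, the field names (`src_ip`, `dst_ip`, `payload`), the key, and both function names are assumptions made for this example.

```python
import hashlib
import hmac
import ipaddress


def sanitize_ip(address, key):
    """Map an IPv4 address to a pseudonymous IPv4 address with a keyed hash.

    The same input and key always give the same output, so flows between the
    same hosts stay linkable. Network prefixes are not preserved.
    """
    packed = ipaddress.IPv4Address(address).packed
    digest = hmac.new(key, packed, hashlib.sha256).digest()
    return str(ipaddress.IPv4Address(digest[:4]))


def sanitize_record(record, key):
    """Return a copy with addresses pseudonymized and the payload removed."""
    cleaned = {name: value for name, value in record.items() if name != "payload"}
    for field in ("src_ip", "dst_ip"):
        if field in cleaned:
            cleaned[field] = sanitize_ip(cleaned[field], key)
    return cleaned


# Hypothetical record using documentation-only address ranges.
secret_key = b"example-key-kept-private"
record = {
    "src_ip": "192.0.2.10",
    "dst_ip": "198.51.100.7",
    "proto": "tcp",
    "payload": b"GET / HTTP/1.0",
}
released = sanitize_record(record, secret_key)
print(released)
```

A keyed hash is used rather than a plain hash because the IPv4 space is small enough to enumerate. Whether the released data should also preserve subnet structure is a separate design choice that this sketch does not address.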
Thank you for your attention!