Real-time Intrusion Detection and Classification
Phurivit Sangkatsanee (1), Naruemon Wattanapongsakorn (1,*) and Chalermpol Charnsripinyo (2)
(1) Computer Engineering Department, King Mongkut’s University of Technology Thonburi,
126 Pracha-Utid, Tung-Kru, Bangkok 10140 Thailand
(2) Network Technology Laboratory, National Electronics and Computer Technology Center,
Klong Luang, Pathumthani, 10120 Thailand
*Corresponding author: [email protected]
Along with the growth of computer network
activities, network attacks by hackers, crackers,
and criminal enterprises have been increasing, which
threatens the availability, confidentiality, and
integrity of critical information. In this paper, we
propose a Real-Time Intrusion Detection System
(RT-IDS) that uses the Decision tree technique to
classify online network data that is preprocessed to
have only 13 features. The number of features affects
the detection speed and resource consumption of the
RT-IDS. In addition, our RT-IDS can classify normal
network activities and the main attack types, namely
Probe and Denial of Service (DoS). Hence, it helps to
decrease the time needed to diagnose and defend
against each network attack. The results show that
our RT-IDS technique offers a detection rate higher
than 98%, while consuming less than 25% of CPU and
94.5 MB of memory under a full traffic load of
100 Mbps.
Key Words: Intrusion Detection System, Decision
tree, Network security system, Denial of Service,
Probe, KDD99 dataset
1. Introduction
Nowadays, many organizations and companies use
Internet services as their communication channel and
marketplace for doing business, for example on eBay.
Together with the growth of computer network
activities, the rate of network attacks has been
increasing, threatening the availability,
confidentiality, and integrity of critical
information. Therefore a network system must use one
or more security tools, such as a firewall, antivirus
software, an IDS, or a honeypot, to protect important
data from criminal enterprises.
A network system that uses only a firewall is not
sufficiently protected against all attack types. A
firewall cannot defend the network against intrusion
attempts made through open ports. Hence a Real-Time
Intrusion Detection System (RT-IDS), shown in
Figure 1, is a prevention tool that inspects network
activity for hazardous behavior and gives an alarm
signal to the computer user or network administrator
when it finds hostile activity on an open session.
Figure 1. Intrusion detection system environment
In the past, research papers have proposed
classification algorithms such as Adaptive Resonance
Theory (ART), Self-Organizing Map (SOM),
Back-Propagation (Back-Prop) Neural Network,
statistical probability distribution, BLINd
classification and Bayesian [1-8]. Most of them used
the KDD99 dataset to evaluate their IDS performance.
The KDD99 dataset, which is about 10 years old, is
dated off-line data consisting of 41 features.
A few researchers have proposed real-time intrusion
detection systems using different techniques. The
first uses a Self-Organizing Map to classify normal
data and DoS attacks with 10 features of every 50
packets, evaluated by the different characteristic
visualizations of normal and DoS traffic [6]. The
second uses a Bayesian classification model to
classify normal and attack data, with the first 3
months of data used for training and the last month
for testing, evaluated by detection penalty [7]. The
third uses Adaptive Resonance Theory (ART) and
Self-Organizing Map (SOM), with about 5000 packets
for training and 3000 packets for testing, obtained
by sampling during a 4-day experiment, and with 27
features based on the frequency of occurrences in
each interval as inputs for classification. The
detection rates (Attack and Normal) of the ART and
the SOM are about 97% and 95%, respectively [8].
In this paper, we propose a Real-Time Intrusion
Detection System (RT-IDS) using the Decision tree
approach. It considers only 13 features of network
traffic data, which benefits detection speed and
computer resource consumption (CPU, memory).
Moreover, we classify normal activity and the main
attack types, namely Probe and DoS. This reduces the
time computer users need to analyze network data and
protect the network from criminal enterprises. The
Decision tree algorithm also performs well in
classifying unknown attacks (without training) [9].
The rest of this paper is organized as follows. In
Section 2, we present our research methodology with
the Decision tree. In Section 3, we describe the
real-time IDS process. Section 4 explains parameter
settings and evaluation. Section 5 shows the
experimental results, including detection rate and
consumption of CPU and memory. Finally, Section 6
concludes this research.
2. Research Methodology
The Decision tree model is a well-known
classification algorithm. It consists of non-terminal
nodes (a root and internal nodes) and terminal nodes
(leaves), which together classify data efficiently
[10]. The root node holds the first attribute, with
test conditions that direct each record toward an
internal node depending on the characteristics of the
record. The decision tree is first trained with known
data by a learning algorithm before it can classify
new or untrained data. After training, the tree
predicts the class of new data by starting from the
root node and moving through internal nodes, each
containing an attribute test condition, until it
arrives at a leaf node that holds the answer class,
as shown in Figure 2 [11].
[Figure 2 labels: Internal node, Answer Class]
Figure 2. Decision tree structure
C4.5 is an efficient and popular learning algorithm
for decision trees. It was proposed by J.R. Quinlan,
whose research on it spans more than 15 years [13].
3. Real-Time IDS Process
Our Real-Time IDS, shown in Figure 3, mainly consists
of the preprocess part and the classifying part.
[Figure 3 flow: Online data -> Packet Sniffer ->
Tcpdump packet -> Preprocess Part (IP features; TCP,
UDP and ICMP packets -> TCP, UDP and ICMP features
extracted; collecting and waiting every 2 sec; all
extracted data; separating the data into records by
connection between 2 IP addresses; records including
13 features) -> Classifying Part (Decision Tree using
C4.5 learning) -> Detection Result -> Log File]
Figure 3. Real-time IDS process
When the RT-IDS receives an online network packet,
the packet first enters the preprocess part, where
the packet header and other detailed data are
examined. The detailed packet data are then converted
to numeric features, and the essential features that
represent the network activity are extracted from
them. The preprocessed data, with its key signature
extracted, is then ready to enter the classifying
part, where the IDS classifies it as normal network
activity or one of the main attack types. The
Real-Time IDS is implemented on a 2.83 GHz Intel
Core 2 Quad Q9550 processor with 4 GB RAM and a
100 Mbps LAN.
3.1 Preprocess Part
In the preprocess part, we use a packet sniffer,
built with the Jpcap library, to store network packet
information, including the IP header, TCP header, UDP
header, and ICMP header, from each packet captured in
promiscuous mode. After that, the packet information
is divided by connection between any two IP addresses
(source IP and destination IP), and all records are
collected every 2 seconds, as shown in the preprocess
part in Figure 3. Each record consists of 13 data
features and an answer class, as shown in Table 1.
Examples of the data records obtained from the
preprocess part are shown below.
2138, 33, 33, 4, 4, 0, 644, 2136, 0, 0, 0, 0, 0, Normal
12, 2, 2, 0, 0, 0, 1, 12, 0, 0, 0, 0, 0, Normal
230, 2, 120, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, Probe
6, 3, 1, 2, 2, 0, 2, 2, 2, 153, 2, 78, 0, Probe
0, 0, 0, 0, 0, 0, 0, 0, 0, 48810, 1, 1, 48810, DoS
145, 145, 1, 0, 145, 0, 0, 0, 0, 0, 0, 0, 0, DoS
Table 1. Thirteen features in preprocess data
No.  Feature Description                 Data Type
1    Number of TCP packets
2    Number of TCP source ports
3    Number of TCP destination ports
4    Number of TCP fin flags
5    Number of TCP syn flags
6    Number of TCP reset flags
7    Number of TCP push flags
8    Number of TCP ack flags
9    Number of TCP urgent flags
10   Number of UDP packets
11   Number of UDP source ports
12   Number of UDP destination ports
13   Number of ICMP packets
-    Answer Class                        String (Normal, DoS, Probe)
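The preprocess step described above can be sketched roughly as follows. This is a minimal illustration in Python, not the authors' Jpcap-based implementation: the `Packet` class, its field names, and `extract_records` are hypothetical, and the sketch assumes that a "connection" is an ordered (source IP, destination IP) pair and that the port features count distinct ports within each 2-second window.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional

WINDOW_SECONDS = 2  # collection interval stated in the text


@dataclass
class Packet:
    """Hypothetical, simplified view of one captured packet header."""
    timestamp: float
    src_ip: str
    dst_ip: str
    protocol: str                    # "TCP", "UDP" or "ICMP"
    src_port: Optional[int] = None
    dst_port: Optional[int] = None
    flags: frozenset = frozenset()   # subset of FIN, SYN, RST, PSH, ACK, URG


def extract_records(packets):
    """Group packets by (2-second window, source IP, destination IP) and
    return one 13-feature list per group, in the order of Table 1."""
    groups = defaultdict(list)
    for p in packets:
        window = int(p.timestamp // WINDOW_SECONDS)
        groups[(window, p.src_ip, p.dst_ip)].append(p)

    records = {}
    for key, pkts in groups.items():
        tcp = [p for p in pkts if p.protocol == "TCP"]
        udp = [p for p in pkts if p.protocol == "UDP"]
        icmp = [p for p in pkts if p.protocol == "ICMP"]
        # Assumption: each flag feature counts TCP packets carrying that flag.
        flag_counts = [sum(1 for p in tcp if f in p.flags)
                       for f in ("FIN", "SYN", "RST", "PSH", "ACK", "URG")]
        records[key] = [
            len(tcp),
            len({p.src_port for p in tcp}),   # distinct TCP source ports
            len({p.dst_port for p in tcp}),   # distinct TCP destination ports
            *flag_counts,
            len(udp),
            len({p.src_port for p in udp}),   # distinct UDP source ports
            len({p.dst_port for p in udp}),   # distinct UDP destination ports
            len(icmp),
        ]
    return records


# Made-up packets, for illustration only.
demo = [
    Packet(0.1, "10.0.0.1", "10.0.0.2", "TCP", 4000, 80, frozenset({"SYN"})),
    Packet(0.5, "10.0.0.1", "10.0.0.2", "TCP", 4000, 81, frozenset({"SYN"})),
    Packet(1.9, "10.0.0.1", "10.0.0.2", "UDP", 5000, 53),
    Packet(2.3, "10.0.0.1", "10.0.0.2", "ICMP"),
]
for key, features in sorted(extract_records(demo).items()):
    print(key, features)
```

Keeping each record to 13 integer counts means the classifier never sees raw payloads, which is consistent with the paper's emphasis on detection speed and low resource use.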
3.2 Classifying Part
The classifying part consists of 2 main processes,
training and testing, which use the Java library of
the WEKA tool [13]. We train the C4.5 Decision tree
model with records from the preprocess part whose
answer class is known. After that, we test the
trained Decision tree model with a new or untrained
dataset in which each record is captured on the
real-time system, as shown in the classifying part
in Figure 3.
Our experimental network data consists of 4 DoS
attack types, 13 Probe attack types, and normal
activity as shown in Table 2.
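To give a feel for what training a C4.5-style tree involves, the sketch below computes the gain ratio, the criterion C4.5 uses to rank candidate splits, over the six example records listed in Section 3.1. It is a simplified illustration in Python and not the WEKA implementation used in this work: it only searches for a single best threshold split (midpoints between observed values), builds no tree, and does no pruning. The function names are hypothetical.

```python
import math
from collections import Counter

# The six example records from the preprocess part (13 features + class).
RECORDS = [
    ([2138, 33, 33, 4, 4, 0, 644, 2136, 0, 0, 0, 0, 0], "Normal"),
    ([12, 2, 2, 0, 0, 0, 1, 12, 0, 0, 0, 0, 0], "Normal"),
    ([230, 2, 120, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], "Probe"),
    ([6, 3, 1, 2, 2, 0, 2, 2, 2, 153, 2, 78, 0], "Probe"),
    ([0, 0, 0, 0, 0, 0, 0, 0, 0, 48810, 1, 1, 48810], "DoS"),
    ([145, 145, 1, 0, 145, 0, 0, 0, 0, 0, 0, 0, 0], "DoS"),
]


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gain_ratio(records, feature, threshold):
    """Information gain of the split `feature <= threshold`,
    divided by the split information of that partition."""
    left = [y for x, y in records if x[feature] <= threshold]
    right = [y for x, y in records if x[feature] > threshold]
    if not left or not right:
        return 0.0
    n = len(records)
    remainder = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    gain = entropy([y for _, y in records]) - remainder
    split_info = entropy(["L"] * len(left) + ["R"] * len(right))
    return gain / split_info


def best_split(records):
    """Return (gain ratio, feature index, threshold) of the best binary split.
    Simplification: candidate thresholds are midpoints of adjacent values."""
    best = (0.0, None, None)
    for f in range(len(records[0][0])):
        values = sorted({x[f] for x, _ in records})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            score = gain_ratio(records, f, t)
            if score > best[0]:
                best = (score, f, t)
    return best


score, feature, threshold = best_split(RECORDS)
print(f"best split: feature index {feature} <= {threshold} (gain ratio {score:.3f})")
```

On these six records the sketch picks feature index 7 (the TCP ack flag count) with a threshold of 7, which separates the two Normal records from the Probe and DoS records. A full learner would repeat this search recursively on each side of the split.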
Table 2. Attack type and normal activity
Attack names listed: UDP Flood, HTTP Flood, Port Scan,
Advance Port Scan, Host Scan, SYN Stealth, FIN Stealth,
UDP Scan, Null Scan, Xmas Tree, IP Scan, ACK Scan,
Window Scan, RCP Scan
Tools listed: Net Tools 5 (4 rows), Host Scan 1.6
(1 row), NMapWin 1.3.1 (10 rows)
Normal activity: Actual Environment
4. Parameter Settings & Evaluation
A 2.83 GHz Intel Core 2 Quad Q9550 processor with
4 GB RAM on a network of at most 100 Mbps hosts our
RT-IDS, which captures network traffic in the
Computer Engineering Department of King Mongkut’s
University of Technology Thonburi.
We simultaneously generate attacks from many
computers, as shown in Figure 4, consisting of 4 DoS
attack types and 14 Probe attack types, against a
victim computer that hosts our RT-IDS. Many Internet
services are run over Ethernet at full load in order
to generate normal network activity.
The detection performance of our RT-IDS is
quantified using the following values.
• The Total Detection Rate (TDR) is the
percentage that the RT-IDS can correctly detect
the DoS attacks, Probe attacks, and Normal
network data.
• The Normal Detection Rate (NDR) is the
percentage that the RT-IDS can correctly detect
the normal class.
• The DoS Detection Rate (DDR) is the
percentage that the RT-IDS can correctly detect
the DoS attacks.
• The Probe Detection Rate (PDR) is the
percentage that the RT-IDS can correctly detect
the Probe attacks.
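The four rates defined above can be computed from paired actual and predicted labels, as in the following sketch. It assumes that each per-class rate is the share of records of that class that are classified correctly, and that TDR is the share of all records classified correctly. The function name and the sample labels are made up for illustration.

```python
from collections import Counter


def detection_rates(actual, predicted):
    """Return TDR, NDR, DDR and PDR as percentages.

    Assumption: a record counts as correctly detected when its
    predicted class equals its actual class.
    """
    total = Counter(actual)
    correct = Counter(a for a, p in zip(actual, predicted) if a == p)

    def pct(hits, n):
        return 100.0 * hits / n if n else float("nan")

    return {
        "TDR": pct(sum(correct.values()), len(actual)),
        "NDR": pct(correct["Normal"], total["Normal"]),
        "DDR": pct(correct["DoS"], total["DoS"]),
        "PDR": pct(correct["Probe"], total["Probe"]),
    }


# Made-up labels, not results from the paper.
actual = ["Normal"] * 4 + ["DoS"] * 2 + ["Probe"] * 2
predicted = ["Normal", "Normal", "Normal", "DoS",
             "DoS", "DoS", "Probe", "Normal"]
print(detection_rates(actual, predicted))
```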
The CPU and memory consumption of the running
RT-IDS is captured with the Process Explorer tool.
[Figure 4 labels: DoS intruder, Probe intruder,
RT-IDS & Victim, Create C4.5 Decision Tree model]
Figure 4. Network environment
5. Experimental Result
In our experiment, the training data has 55,000
records, including 10,000 DoS records, 30,000 Probe
records and 15,000 normal records.
After being trained with the data for 0.03 seconds,
the C4.5 Decision tree model consists of 197
non-terminal nodes (a root and internal nodes) and 99
terminal nodes (leaves). The RT-IDS with the trained
decision tree is used to test online network data
over a 24-hour time interval, capturing about 109
Mega connections. After preprocessing the test data,
we obtain a total of 102,959 testing records,
consisting of 19,454 DoS records, 8,392 Probe
records, and 75,113 Normal records, to test the
classifying part of the real-time system, or RT-IDS.
The results are shown in Tables 3 and 4.
Table 3. Experimental results
Table 4. Summary of results
[Table 4 row label: Decision tree]
While running the RT-IDS, the Process Explorer tool
captures the consumption of CPU and memory
resources, as shown in Figures 5 and 6. When running
with full load (100 Mbps), our RT-IDS uses less than
25% of the CPU while consuming about 94.5 MB of
memory. In addition, the detection time, which is
measured using the OS clock, is about 2 seconds.
[Figure 5 label: Full load (100 Mbps)]
Figure 5. CPU consumption of the RT-IDS
[Figure 6 label: Using 94.5 MB of memory]
Figure 6. Memory consumption of the RT-IDS
6. Conclusion
This paper proposes a new real-time intrusion
detection system (RT-IDS) using a decision tree
approach with efficient data preprocessing that
yields only 13 features. We evaluate the RT-IDS
performance, including detection rate and CPU and
memory consumption, in a real-time environment.
From the experimental results, our RT-IDS offers
both a total detection rate (TDR) and a normal
detection rate (NDR) higher than 99%, and the false
alarm rate is very low. When capturing network
traffic at full load (100 Mbps), the RT-IDS uses
less than 25% of the CPU and only 94.5 MB of memory,
which is very low compared to the memory capacity of
a present-day PC. Essentially, our RT-IDS classifies
network data in about 2 seconds, which is fast
enough to warn the computer user or administrator in
time to protect the network system. Therefore the
decision tree classification algorithm is a suitable
approach for real-time intrusion detection.
7. References
[1] M. Sabhnani and G. Serpen, “Application of Machine
Learning Algorithms to KDD Intrusion Detection
Dataset within Misuse Detection Context”, Inter
Conference: Machine Learning, Models, Technologies
and Applications (MLMTA), 2003, pp. 209-215.
[2] M. Gil-Jong, K. Yong-Min, K. DongKook, and N.
BongNam, “Network Intrusion Detection Using
Statistical Probability Distribution”, Inter
Conference: ICCSA(2), 2006, pp. 340-348.
[3] V. Katos, “Network Intrusion Detection: Evaluating
Cluster, Discriminant, and Logit Analysis”, Inter J.:
Information Sciences, 177, 2007, pp. 3060-3073.
[4] N. Ngamwitthayanon, N. Wattanapongsakorn, C.
Charnsripinyo, and D.W. Coit, “Multi-Stage
Network-Based Intrusion Detection System Using Back
Propagation Neural Networks”, Inter Conference:
Asian International Workshop on Advanced
Reliability Modeling (AIWARM), 2008.
[5] S. Pukkawanna, V. Visoottiviseth, and P. Pongpaibool,
“Lightweight Detection of DoS Attacks”, Networks,
15th IEEE Inter Conference, 2007, pp 77-82.
[6] K. Labib and R. Vemuri, “NSOM: A Real-Time
Network-Based Intrusion Detection System Using
Self-Organizing Maps”, Networks and Security, 2002.
[7] Puttini, Ricardo S., Marrakchi, Zakia and Mé,
Ludovic, “A Bayesian Classification Model for
Real-Time Intrusion Detection”, API Conference, 2003, pp.
[8] Amini, M., Jalili, A. and Reza Shahriari, H.,
“RT-UNNID: A Practical Solution to Real-Time
Network-Based Intrusion Detection Using Unsupervised
Neural Networks”, Computers & Security 25, 2005,
pp. 459-468.
[9] P. Sangkatsanee, N. Wattanapongsakorn, and C.
Charnsripinyo, “Network Intrusion Detection and
Classification with Decision Tree and Rule Based
Approaches”, IEEE Inter Conference: ISCIT, 2009.
[10] Zhi-Song Pan, Song-Can Chen, Gen-Bao Hu, and
Dao-Qiang Zhang, "Hybrid Neural Network and C4.5
for Misuse Detection," Inter Conference: Machine
Learning and Cybernetics, vol.4, 2003, pp. 2463-2467.
[11] P.N. Tan, M. Steinbach, V. Kumar, “Introduction to
Data Mining”, Pearson Addison Wesley, 2005,
[12] W. Cohen, “Fast Effective Rule Induction”, Inter
Conference: Machine Learning, 1995.
[13] Weka
Available: [2009, July 2]