Performance Analysis of Back Propagation
Neural Network for Internet Traffic
Classification
Kuldeep Singh#1, Sunil Agrawal#2
# University Institute of Engineering & Technology, Panjab University, Chandigarh (India)-160014
[email protected]
[email protected]
Abstract- With the rapid increase in internet usage over the last few years, the area of internet traffic classification has advanced to a large extent, owing to a dramatic increase in the number and variety of applications running over the internet. These applications include WWW, e-mail, P2P, multimedia, FTP applications, games, etc. The diminished effectiveness of traditional port-number based and payload based direct packet inspection techniques for internet traffic classification motivates us to classify internet traffic into various application categories using Machine Learning (ML) techniques. Neural networks are one of the important ML techniques. In this paper, a Back Propagation Neural Network (BPNN), a type of multilayer feed-forward neural network, is employed for internet traffic classification. The performance of the BPNN is analysed in terms of accuracy, recall, number of hidden layer neurons and training time of the network, using a large feature data set and a reduced feature data set. This experimental analysis shows that BPNN is an efficient technique for internet traffic classification, even with reduced feature data sets.
Keywords- Internet traffic classification, Back propagation
neural network, Machine Learning, Accuracy, Recall, Training
Time, Features.
I. INTRODUCTION
The demand for internet traffic classification that optimizes network performance by solving difficult network management problems for Internet Service Providers (ISPs) and provides quality-of-service (QoS) guarantees has increased substantially in recent years, in part due to the phenomenal growth of bandwidth-hungry applications [1], [2]. The variety of applications running over the internet, such as WWW, e-mail, P2P, multimedia, FTP applications, interactive services and games, has led to a rapid increase in internet traffic. Classification of this internet traffic is necessary in order to solve ISPs' network management and monitoring problems, such as available bandwidth planning and provisioning, measurement of QoS, identification of a customer's use of a particular application for billing, and detection of indicators of denial-of-service attacks or of any other severe problem degrading network performance. Nowadays, it is also being utilized by various governmental intelligence agencies from a security point of view.
Internet traffic classification can be either offline or online. In online classification, analysis is performed while the data packets flowing through the network are captured; in offline classification, data traces are first captured and stored and then analysed later [2]. Traditionally, internet traffic classification techniques have been based upon direct inspection of the packets flowing through the network [1]. These techniques are payload based and port-number based packet inspection. In the payload based technique, the payload of a few TCP/IP packets is analysed in order to determine the type of application. This is no longer feasible today because of the cryptographic techniques used to encrypt the data in packet payloads, and because of government privacy policies which do not allow any unaffiliated third party to inspect each packet's payload. The port-number based technique relies on the well-known port numbers carried in the header of IP packets, which are reserved by IANA (Internet Assigned Numbers Authority) for particular applications; e.g. port number 80 is reserved for web based applications. Unfortunately, this method has also become ineffective due to the use of dynamic port numbers instead of well-known port numbers by various applications.
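As a concrete illustration, port-number based classification amounts to a simple table lookup. The sketch below is our own minimal example (the mapping shown is only a small illustrative subset of the IANA registry, and the function name is hypothetical); it also shows how a dynamically negotiated port defeats the method:

```python
# Minimal sketch of port-number based classification.
# WELL_KNOWN_PORTS is a small illustrative subset of IANA
# well-known port assignments, not the full registry.
WELL_KNOWN_PORTS = {
    80: "WWW",            # HTTP
    443: "WWW",           # HTTPS
    25: "MAIL",           # SMTP
    110: "MAIL",          # POP3
    21: "FTP-CONTROL",
    20: "FTP-DATA",
}

def classify_by_port(server_port):
    # Dynamic/ephemeral ports (e.g. those negotiated by P2P
    # applications) are absent from the table, so the flow
    # cannot be identified by this method.
    return WELL_KNOWN_PORTS.get(server_port, "UNKNOWN")

print(classify_by_port(80))     # WWW
print(classify_by_port(62054))  # UNKNOWN: dynamic port defeats the lookup
```

Any application that picks an unregistered port is simply invisible to such a classifier, which is what motivates the flow-feature based ML approach used in this paper.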
The diminished effectiveness of traditional port-number based and payload based direct packet inspection techniques motivates us to classify internet traffic into various application categories using Machine Learning (ML) techniques, which are based upon supervised and unsupervised learning [1]. Neural networks, which are massively parallel distributed networks consisting of a number of information processing units (neurons) inspired by the way the human brain works, also come under the category of ML techniques.
This paper is based upon the use of a Back Propagation Neural Network (BPNN) for internet traffic classification [10], [11]. A BPNN is a supervised multilayer feed-forward neural network in which the error signal between the actual output and the desired output flows backward through the network in order to update the weights during the training process. In this paper, the performance of the BPNN is analysed for two different training and testing data sets having different numbers of features in the input samples. The performance of the network is evaluated on the basis of accuracy, recall, number of hidden layer neurons and training time [7], [8], [9]. This paper shows that the back propagation neural network gives good classification accuracy even when the number of features of the input samples in the training and testing data sets is reduced considerably, which also leads to reduced complexity and a reduction in the training time of the BPNN.
The rest of the paper is organized as follows: Section II gives introductory information about BPNN for readers who are new to this field. Section III describes the internet traffic data sets. Implementation and result analysis are given in Section IV. Conclusions are given in Section V.
II. BACK PROPAGATION NEURAL NETWORK
Back Propagation Neural Network (BPNN), also known as Multilayer Perceptron (MLP), is a multilayer neural network trained with the back propagation algorithm. This neural network is based upon the extended gradient-descent based Delta learning rule, commonly known as the Back Propagation rule. In this network, the error signal between the desired output and the actual output is propagated backward, from the output layer to the hidden layer and then to the input layer, in order to train the network [10], [11].
Consider the network shown in fig. 1. It consists of an input layer with i neurons, a hidden layer with j neurons and an output layer with k neurons.
Fig. 1. Structure of Back Propagation Neural Network

In a Back Propagation Neural Network, the training process is carried out in two phases, which are explained as follows.

In the first phase, the training data is fed into the input layer and propagated forward through the hidden layer to the output layer. This process is called the forward pass. In this stage, each node in the input, hidden and output layers calculates a weighted sum over its incoming connections and generates an output value from the resulting sum.

In the second phase, the actual output values are compared with the target output values. The error between these outputs is calculated and propagated back to the hidden layer in order to update the weights of each node. This is called the backward pass, or learning. The network iterates over many cycles until the error becomes acceptable. After the training phase is complete, the trained network is ready for use on new input data. During the testing phase there is no learning and no modification of the weight matrices: the test input is fed into the input layer, and the feed-forward network generates results based on the knowledge acquired during training [7], [8], [9].

According to the gradient-descent based Delta learning rule, the weight change should be in the direction opposite to the error gradient in order to minimize the total sum-squared error E, which is given as

    E = (1/2) Σk (dk − yk)²                                  (1)

where dk is the desired output and yk the actual output of this BP neural network. The change in the weights of the output-to-hidden-layer interconnections can then be expressed as

    ΔWkj = −η (∂E / ∂Wkj)                                    (2)

where η is the learning rate of the neural network. Thus, according to the back propagation algorithm, the weight update for the output-to-hidden-layer interconnections can be expressed as

    Wkj(t+1) = Wkj(t) + ΔWkj(t)                              (3)

or

    Wkj(t+1) = Wkj(t) − η (∂E / ∂Wkj)                        (4)

Similarly, the weights of the hidden-to-input-layer interconnections are updated and the neural network is trained.

III. INTERNET TRAFFIC DATA SET

In this research work, we have used a data set of 10,193 samples for two different cases of input features [5], [6]. Each data sample belongs to a particular internet application. The applications in the present data set are of twelve types: WWW, Mail, FTP-CONTROL, FTP-PASV, Attack, P2P, Database, FTP-DATA, Multimedia, Services, Interactive and Games.

In the first case, we trained the BPNN using a data set of 10,193 samples, where each input sample consists of 248 features characterizing a particular application. These features mainly include inter-packet arrival times (max., min., median, mean, variance, first quartile, third quartile, etc.), total number of packets (server to client and client to server), total number of bytes on the wire and in the IP packet (max., min., median, mean, variance, first quartile, third quartile, etc.), control signals, bandwidth, duration, FFTs of various features, server port, client port and many other features. After the network has been trained, 2548 of the 10,193 data samples are used for testing and to obtain the classified outputs.

In the second case, we trained the BPNN using the same data set of 10,193 samples, where each input sample now consists of only 48 features characterizing each particular application. These features mainly include inter-packet arrival times (min., max., mean and variance), total number of packet bytes on the wire and in the IP packet (max., min., mean, variance), bandwidth features, duration, etc. After the network has been trained using this reduced feature data set, 2548 data samples of the same data set are used for testing and to obtain the classified outputs.

IV. IMPLEMENTATION AND ANALYSIS

A. Methodology

In this research work, we have used MATLAB R2009b to develop the BPNN program. First, this BPNN model was trained with a training pair consisting of 10,193 training inputs and training targets, with 248 features in each input, i.e. the full feature training data set [5], [6]. The full feature testing data set of 2548 input samples was then presented to this BPNN model, and an output file containing the various application classes was obtained. After that, a reduced feature training data set of 10,193 samples, each having only 48 features, was used to train the BPNN again, and the reduced feature testing data set of 2548 samples was used to obtain the classified outputs.

Classification accuracy and recall are employed in this research work to evaluate the performance of the BPNN against the number of hidden layer neurons and the training time, for the full feature data set and the reduced feature data set [1], [3], [4]. Accuracy can be deduced from the confusion matrix shown in fig. 2.

              Positive    Negative
    True        TP          TN
    False       FP          FN

Fig. 2. Confusion Matrix

In this confusion matrix, the following terms are used:
 True Positive (TP): percentage of samples of class Z correctly classified as belonging to class Z.
 True Negative (TN): percentage of samples of other classes correctly classified as not belonging to class Z.
 False Positive (FP): percentage of samples of other classes incorrectly classified as belonging to class Z (equivalent to 100% − TN).
 False Negative (FN): percentage of samples of class Z incorrectly classified as not belonging to class Z (equivalent to 100% − TP).

All these matrix terms are considered to range from 0 to 100%. In general, accuracy can be defined as the percentage of correctly classified samples over all classified samples. It is given as follows:

    Accuracy (ACC.) = (TP + TN) / (TP + TN + FP + FN)        (5)

Recall can be defined as the percentage of samples of class Z correctly classified as belonging to class Z. With the matrix terms expressed as percentages, it is given as follows:

    Recall (R) = TP                                          (6)
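These metrics can be computed directly from raw confusion-matrix counts. The short sketch below is our own illustration (the counts are made-up values for a single hypothetical class Z); the paper states the same quantities directly as percentages:

```python
# Sketch of Eqs. (5) and (6) computed from raw confusion-matrix
# counts for one class Z. The counts below are illustrative only.
tp, tn, fp, fn = 90, 85, 15, 10

accuracy = 100.0 * (tp + tn) / (tp + tn + fp + fn)  # Eq. (5), as a percentage
recall = 100.0 * tp / (tp + fn)                     # Eq. (6): share of class-Z
                                                    # samples correctly caught
print(f"Accuracy = {accuracy:.2f}%  Recall = {recall:.2f}%")
# Accuracy = 87.50%  Recall = 90.00%
```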
Training Time (Ttrn): the total time taken to train the BP Neural Network. Training time depends upon the number of training data samples and the number of hidden layer neurons. In this paper, it is measured in minutes.
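Before turning to the analysis, the training procedure of Section II (forward pass, backward pass, and the weight update of Eq. (4)) can be condensed into a short sketch. This is a minimal illustrative numpy implementation on toy XOR data, not the paper's MATLAB program or its traffic data; the layer sizes, learning rate and epoch count are our own choices:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)  # inputs (i = 2)
D = np.array([[0], [1], [1], [0]], dtype=float)              # desired outputs (k = 1)

W_ji = rng.normal(size=(2, 8))   # input -> hidden weights (j = 8)
W_kj = rng.normal(size=(8, 1))   # hidden -> output weights
eta = 0.5                        # learning rate, as in Eq. (2)

for epoch in range(20000):
    # Forward pass: propagate inputs through hidden to output layer.
    h = sigmoid(X @ W_ji)
    y = sigmoid(h @ W_kj)
    # Backward pass: propagate the error (d - y) back through the net.
    delta_k = (D - y) * y * (1 - y)              # output-layer error term
    delta_j = (delta_k @ W_kj.T) * h * (1 - h)   # hidden-layer error term
    # Weight update W(t+1) = W(t) - eta * dE/dW, Eq. (4).
    W_kj += eta * (h.T @ delta_k)
    W_ji += eta * (X.T @ delta_j)

E = 0.5 * np.sum((D - y) ** 2)   # total sum-squared error, Eq. (1)
print(np.round(y.ravel(), 2), "final error:", round(float(E), 4))
```

After training, the outputs approach the XOR targets and E falls close to zero, which is exactly the stopping behaviour described in Section II ("until the error is acceptable").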
B. Analysis
Table I shows Classification accuracy of BPNN with
increase in number of hidden layer neurons and Training
Time of BPNN model for full feature data sets and reduced
feature data sets. It is clear from this table and fig. 3and fig. 4
that as the number of hidden layer neurons increases from
given data set of training samples, accuracy of BPNN
classifier is going to be increased and training time of BPNN
is also going to be increased.
TABLE I
ACCURACY VS. NO. OF HIDDEN LAYER NEURONS AND TRAINING TIME

    No. of        BPNN with full            BPNN with reduced
    Hidden        feature data set          feature data set
    Layer       Accuracy   Training       Accuracy   Training
    Neurons       (%)     Time (Min.)       (%)     Time (Min.)
      50         78.54        5            62.45        3
     100         80.35        7            65.30        4
     200         80.16       11            68.74        6
     300         77.78       14            70.98        9
     400         82.71       19            68.64       11
     500         81.26       22            72.54       13
     600         75.12       26            64.55       16
     700         74.89       29            66.80       18
     800         64.56       33            69.87       24
     900         55.19       36            66.75       28
    1000         44.91       47            69.83       33
For the full feature (248 features) training and testing data set, the BPNN gives its optimum accuracy of 82.71% with 400 hidden layer neurons and a training time of 19 minutes. For the reduced feature (48 features) training and testing data set, the BPNN gives its optimum accuracy of 72.54% with 500 hidden layer neurons and a training time of only 13 minutes.
TABLE II
RECALL OF BPNN FOR FULL FEATURE AND REDUCED FEATURE DATA SETS

    Various Internet    BPNN with full       BPNN with reduced
    Applications        feature data set:    feature data set:
                        Recall (%)           Recall (%)
    WWW                    85.91                61.65
    MAIL                   60.98                68.64
    FTP-CONTROL            89.59                81.00
    FTP-PASV               83.41                78.26
    ATTACK                 79.00                75.50
    P2P                    46.28                56.65
    DATABASE               95.05                55.98
    FTP-DATA               96.61                77.76
    MULTIMEDIA             82.57                70.18
    SERVICES               91.48                91.48
    INTERACTIVE           100.00                80.85
Fig. 3. Accuracy vs. no. of hidden layer neurons for BPNN with full feature and reduced feature data sets

Fig. 4. Training time vs. no. of hidden layer neurons for BPNN with full feature and reduced feature data sets

Fig. 5. Recall of various applications for BPNN with full feature and reduced feature data sets
Table II shows the recall values of the various internet applications for the BPNN with full feature and reduced feature training and testing data sets. It is also clear from fig. 5 that the BPNN with the reduced feature data sets gives almost the same recall values for most applications as the BPNN with the full feature data sets.

From all this analysis, it is clear that by reducing the number of features in the training and testing data sets, the accuracy of the BPNN classifier decreases only slightly compared with the full feature data sets, while the training time is reduced considerably. Thus, reducing the number of features of the training and testing data sets greatly reduces the complexity and training time of the neural network. Therefore, the number of features used in the training and testing data sets to characterize the various internet applications need not be very high; these data sets should include only the important and meaningful features required to describe each particular internet application.
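The idea of keeping only informative features can be realized with simple filter-style selection. The sketch below is our own stand-in using a variance threshold on synthetic data; it is not the procedure used to build the paper's 48-feature subset, which was chosen from the flow discriminators of [5]:

```python
import numpy as np

# Illustrative filter-style feature reduction: keep the k features
# with the highest variance. Data here is synthetic, with each of the
# 248 "features" given its own random scale, purely for demonstration.
rng = np.random.default_rng(1)
scales = rng.uniform(0.1, 5.0, size=248)            # per-feature spread
X = rng.normal(scale=scales, size=(1000, 248))      # 1000 synthetic samples

k = 48
variances = X.var(axis=0)
keep = np.sort(np.argsort(variances)[-k:])          # indices of top-k features
X_reduced = X[:, keep]                              # reduced-feature data set

print(X.shape, "->", X_reduced.shape)               # (1000, 248) -> (1000, 48)
```

In practice, domain knowledge about which flow statistics discriminate between applications (as in [5], [6]) is a sounder basis for the reduced set than variance alone; the sketch only shows the mechanics of dropping features before training.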
V. CONCLUSION
In this paper, we have first designed a BP Neural Network. This neural network was then trained using the given full feature and reduced feature training data sets. For both cases, the performance of the BPNN was evaluated with respect to the number of hidden layer neurons and the training time. The results show that, for good overall performance of the BPNN, the number of features in the training and testing inputs need not be very high; the inputs should include the important and meaningful features that represent the various internet applications, because even with a reduced number of features the accuracy of the BPNN classifier remains high. This research work thus shows that BPNN is an effective machine learning technique for internet traffic classification, even with a reduced number of features in the training and testing data samples.
REFERENCES
[1]  Thuy T.T. Nguyen and Grenville Armitage, "A Survey of Techniques for Internet Traffic Classification using Machine Learning," IEEE Communications Surveys & Tutorials, Vol. 10, No. 4, pp. 56-76, Fourth Quarter 2008.
[2]  Arthur Callado, Carlos Kamienski, Géza Szabó, Balázs Péter Gerő, Judith Kelner, Stênio Fernandes, and Djamel Sadok, "A Survey on Internet Traffic Identification," IEEE Communications Surveys & Tutorials, Vol. 11, No. 3, pp. 37-52, Third Quarter 2009.
[3]  Runyuan Sun, Bo Yang, Lizhi Peng, Zhenxiang Chen, Lei Zhang, and Shan Jing, "Traffic Classification Using Probabilistic Neural Network," in Sixth International Conference on Natural Computation (ICNC 2010), 2010, pp. 1914-1919.
[4]  Luca Salgarelli, Francesco Gringoli, and Thomas Karagiannis, "Comparing Traffic Classifiers," ACM SIGCOMM Computer Communication Review, Vol. 37, No. 3, pp. 65-68, July 2007.
[5]  Andrew W. Moore, Denis Zuev, and Michael L. Crogan, "Discriminators for use in flow-based classification," Queen Mary University of London, Department of Computer Science, RR-05-13, August 2005.
[6]  Wei Li, Marco Canini, Andrew W. Moore, and Raffaele Bolla, "Efficient application identification and the temporal and spatial stability of classification schema," Computer Networks (Elsevier), Vol. 53, pp. 790-809, 23 April 2009.
[7]  Y.L. Chong and K. Sundaraj, "A Study of Back Propagation and Radial Basis Neural Networks on ECG signal classification," in 6th International Symposium on Mechatronics and its Applications (ISMA09), Sharjah, UAE, March 24-26, 2009.
[8]  Mutasem Khalil Alsmadi, Khairuddin Bin Omar, Shahrul Azman Noah, and Ibrahim Almarashdah, "Performance Comparison of Multi-layer Perceptron (Back Propagation, Delta Rule and Perceptron) algorithms in Neural Networks," in 2009 IEEE International Advance Computing Conference (IACC 2009), Patiala, India, 6-7 March 2009, pp. 296-299.
[9]  P. Jeatrakul and K.W. Wong, "Comparing the Performance of Different Neural Networks for Binary Classification Problems," in Eighth International Symposium on Natural Language Processing, 2009, pp. 111-115.
[10] Satish Kumar, Neural Networks: A Classroom Approach, Tata McGraw-Hill Publishing Company Limited, New Delhi, 2008.
[11] Simon Haykin, Neural Networks: A Comprehensive Foundation, 2nd edition, Pearson Prentice Hall, New Delhi, 2005.