International Journal of Computer Application
Available online on http://www.rspublication.com/ijca/ijca_index.htm
Issue 4, Volume 1 (February 2014), ISSN: 2250-1797
R S. Publication (rspublication.com)

Network Intrusion Detection Based On Fuzzy Logic

S. Revathi (1), Dr. A. Malathi (2)
(1) Ph.D. Research Scholar, PG and Research, Department of Computer Science, Government Arts College, Coimbatore-18
(2) Assistant Professor, PG and Research, Department of Computer Science, Government Arts College, Coimbatore-18

Abstract: With the increasing use of computer networks and the internet, Intrusion Detection Systems (IDS) have become more popular. The main purpose of an IDS is to alert the system administrator to malicious activities that violate security policies. Recently, fuzzy logic has played a vital role in detecting attacks using various rule generation techniques. This paper proposes a new approach that uses fuzzy rule generation to detect intrusions effectively. The experimental analysis is carried out on the NSL-KDD intrusion detection dataset. The results indicate that the proposed system achieves a higher detection rate and a lower false alarm rate than existing machine learning approaches, such as neural networks and data mining techniques, that are used to identify whether a record is normal or an attack.

Keywords: Intrusion Detection, Fuzzy Logic, NSL-KDD dataset, Data Mining and Neural Network.

I. Introduction

Network systems are largely built on established products that are open to public access. The wide use of the internet exposes them to vulnerabilities, and although there are various methods to secure a system against unauthorized access, it is not possible to protect a system completely. As a defense, an intrusion detection mechanism is used to search for malicious behavior and to watch for anomalies in the data [1]. Researchers have mainly focused on misuse and anomaly detection using various machine learning techniques; achieving a high Detection Rate (DR) with a low False Alarm Rate (FAR) is a challenging task [2].

Intrusion detection is of two main types: Misuse Detection, which can detect only known attacks, and Anomaly Detection, which can detect unknown attacks but may result in a high false alarm rate [3]. To detect intrusive behavior, much research has been based on data mining [5] and other machine learning techniques such as Neural Networks [6], Support Vector Machines [7] and Genetic Algorithms [8]. More recently, fuzzy logic has played a vital role in intrusion detection through rule generation, which has been reported to increase the detection rate substantially [4].

The proposed system is designed to detect anomaly-based intrusions using fuzzy logic. The input to the proposed system is the NSL-KDD dataset [9], which is divided into two subsets: a training dataset and a testing dataset. Initially, the training dataset is classified into five subsets, so that the four types of attack (Denial of Service, R2L, Probe and U2R) and normal data are separated. The dataset may contain attributes that are unwanted or irrelevant, which may lead to a dimensionality problem. To reduce the dataset and select the important attributes, attribute selection techniques are used. Then, we generate fuzzy if-then rules by fuzzifying the definite rules, with consequent parts that represent whether a record is normal data or attack data. These rules are given to the fuzzy system so that it can learn effectively. In the testing phase, the test data is matched against the fuzzy rules to determine whether it is abnormal or normal.

The rest of the paper is organized as follows. Section II gives a detailed analysis of the NSL-KDD dataset. The proposed intrusion detection system using fuzzy logic is given in Section III.
Experimentation and result analysis of the proposed system are discussed in Section IV. Finally, the conclusion is given in Section V.

II. NSL-KDD Dataset Description

In earlier days, researchers focused on the DARPA dataset for analyzing intrusion detection [13]. It consists of seven weeks of training data and two weeks of testing data in raw tcpdump form. Its main drawback is packet loss. The refined version of the DARPA dataset, which contains only network data (i.e. tcpdump data), is termed the KDD dataset [11]. It consists of about 5 million single-connection records for training and 2 million connection records for testing. Because of its huge size, researchers used only 10% of the dataset to analyze intrusion detection accuracy, which affects the performance of the system and results in a very poor estimation of anomaly detection approaches. To solve these issues, a new dataset, NSL-KDD [9], was proposed, which consists of selected records of the complete KDD dataset. The advantages of the NSL-KDD dataset are:

1. There are no redundant records in the train set, so the classifier will not produce biased results.
2. There are no duplicate records in the test set, so the evaluation is not biased toward methods that have better detection rates on frequent records.
3. The number of selected records from each difficulty level group is inversely proportional to the percentage of records in the original KDD dataset.

The training dataset is made up of 21 different attacks out of the 37 present in the test dataset. The known attack types are those present in the training dataset, while the novel attacks are the additional attacks in the test dataset, i.e. those not available in the training dataset.
Table 1: Attacks in the testing dataset (37 attack types)

Category   Attack types
DOS        Back, Land, Neptune, Pod, Smurf, Teardrop, Mailbomb, Processtable, Udpstorm, Apache2, Worm
Probe      Satan, IPsweep, Nmap, Portsweep, Mscan, Saint
R2L        Guess_password, Ftp_write, Imap, Phf, Multihop, Warezmaster, Xlock, Xsnoop, Snmpguess, Snmpgetattack, Httptunnel, Sendmail, Named
U2R        Buffer_overflow, Loadmodule, Rootkit, Perl, Sqlattack, Xterm, Ps

The attack types are grouped into four categories: DoS, Probe, U2R and R2L. Table 1 shows the major attacks in both the training and testing datasets [10]. The NSL-KDD dataset has a set of 41 attributes derived from each connection, and each connection record is labeled either as normal or as a specific attack type. The features are continuous, discrete or symbolic. The attacks fall under four categories:

A) Denial of Service (DOS): Excessive consumption of resources, so that legitimate requests from legal users of the system are denied [9].
B) Remote to Local (R2L): The attacker acts as a legal user and gains an account on the victim machine by sending packets over the network [9].
C) User to Root (U2R): The attacker starts with limited user privileges on the machine and tries to gain root access [9].
D) Probe: The attacker automatically scans a network to gather information or find known vulnerabilities [9].

III. Intrusion Detection based on Fuzzy Rule Generation

The idea of fuzzy logic was introduced by Professor L. A. Zadeh of the University of California at Berkeley in 1965 [12]. Fuzzy logic starts with the concept of a fuzzy set. A fuzzy set is a set without a crisp boundary: it can contain elements with only a partial degree of membership. A classical set, by contrast, is a container that wholly includes or wholly excludes any given element.
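The contrast between a classical set and a fuzzy set can be made concrete with a short Python sketch. The attribute (a count of failed logins), the crisp threshold of 3, and the ramp from 1 to 5 are illustrative assumptions and are not values taken from this paper.

```python
def crisp_high(failed_logins: float, threshold: float = 3.0) -> int:
    """Classical set: an element is wholly in (1) or wholly out (0)."""
    return 1 if failed_logins >= threshold else 0


def fuzzy_high(failed_logins: float, low: float = 1.0, high: float = 5.0) -> float:
    """Fuzzy set: membership rises linearly from 0 at `low` to 1 at `high`.

    The breakpoints `low` and `high` are hypothetical, chosen for illustration.
    """
    if failed_logins <= low:
        return 0.0
    if failed_logins >= high:
        return 1.0
    return (failed_logins - low) / (high - low)


# Compare both notions of "a high number of failed logins".
for x in (0, 2, 3, 4, 6):
    print(x, crisp_high(x), round(fuzzy_high(x), 2))
```

With a crisp set, two and three failed logins fall on opposite sides of the boundary; with the fuzzy set they differ only in degree of membership (0.25 versus 0.5 under the assumed breakpoints).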
Applying fuzzy logic to a real application requires the following three steps:

1. Fuzzification: convert classical (crisp) data into fuzzy data using Membership Functions (MFs).
2. Fuzzy Inference Process: combine the membership functions with the control rules to derive the fuzzy output.
3. Defuzzification: use different methods to calculate each associated output and put the results into a lookup table; during operation, the output is picked from the lookup table based on the current input.

[Figure 1: Proposed IDS using Fuzzy Rule Generation. Blocks shown: NSL-KDD dataset; Training Data (41 attributes); Feature Reduced Training Data (17); Testing Data (41 attributes); Feature Reduced Testing Data (17); Classified Training Data; Fuzzy Rule Generation; Rule Base; Membership Function; Fuzzifier; Fuzzy Inference Engine; Defuzzifier; Output.]

This paper develops a fuzzy rule base system for identifying attacks. We design an anomaly-based intrusion detection system that makes use of rules generated with a Sugeno fuzzy inference system [3]. Figure 1 shows the design of the proposed IDS using fuzzy rule generation. The steps involved in fuzzy rule generation for intrusion detection are as follows.

A. Classify Training Data

The dataset used for analyzing intrusion behavior is NSL-KDD, which contains four types of attack plus normal behavior, described by 41 attributes that have both continuous and discrete values. Using all 41 attributes leads to a dimensionality problem, so we reduced the dataset to 17 attributes using CFS (correlation-based feature selection) subset evaluation. These attributes are then used for generating rules with fuzzy logic, so that the fuzzy system can learn the rules effectively.

B. Fuzzy If-Then Rule Generation

A fuzzy if-then rule (also called a fuzzy rule, fuzzy implication or fuzzy conditional statement) takes the form

If x is A then y is B,

where A and B are linguistic values defined by fuzzy sets on the universes of discourse X and Y, respectively. Often “x is A” is called the antecedent or premise, while “y is B” is called the consequent or conclusion. Interpreting an if-then rule involves two distinct parts: first, evaluating the antecedent (which involves fuzzifying the input and applying any necessary fuzzy operators), and second, applying that result to the consequent (implication). In binary logic, if-then rules present little difficulty: if the antecedent is true, then the conclusion is true. In fuzzy logic, if the antecedent is true to some degree of membership, then the consequent is true to that same degree. Based on this logic, fuzzy rules are generated automatically from attributes such as source bytes, destination bytes, protocol, IP address, etc.

C. Classify Testing Dataset

In the testing phase, a test record from the NSL-KDD dataset is given to the designed fuzzy logic system to find its fuzzy score. Initially, the test record containing 17 attributes is applied to the fuzzifier, which converts each numerical variable into a linguistic variable using a triangular membership function. The output is fed to the inference engine, which compares that input with the rule base. The rule base contains a set of rules obtained from the definite rules. The output of the fuzzy inference engine takes the linguistic values {Low, High}, and it is then converted to a crisp value by the defuzzifier. The resulting crisp value lies between 0 and 1, where “0” denotes that the data is completely normal and “1” denotes that it is attack data.

IV. Experimentation and Result Analysis

The experimentation and result analysis are described in this section. The experiments were carried out in MATLAB, and the attribute selection techniques were analyzed using the popular data mining tool “WEKA” [14].
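The experiments themselves were run in MATLAB; purely as an illustration of the testing pipeline of Section III.C (fuzzifier with triangular membership functions, rule base, defuzzification to a score in [0, 1]), the following Python sketch implements a small zero-order Sugeno system. It is a sketch under stated assumptions, not the authors' implementation: the two input attributes, their normalization to [0, 1], the membership breakpoints, the three rules, and the 0.5 decision threshold are all hypothetical, since the paper does not list its rule base.

```python
def trimf(x, a, b, c):
    """Triangular membership function with feet at a and c and peak at b (a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def fuzzify(x):
    """Map a value normalized to [0, 1] onto the linguistic terms Low and High.

    The breakpoints are assumptions chosen so that Low peaks at 0 and High at 1.
    """
    return {"low": trimf(x, -1.0, 0.0, 1.0), "high": trimf(x, 0.0, 1.0, 2.0)}


def sugeno_score(failed_logins, wrong_fragments):
    """Zero-order Sugeno inference over three hypothetical rules.

    Each rule is a pair (firing strength, constant output), where the output
    0.0 stands for normal and 1.0 for attack. AND is modeled with min.
    """
    f = fuzzify(failed_logins)
    w = fuzzify(wrong_fragments)
    rules = [
        (f["high"], 1.0),                # IF failed_logins is High THEN attack
        (w["high"], 1.0),                # IF wrong_fragments is High THEN attack
        (min(f["low"], w["low"]), 0.0),  # IF both are Low THEN normal
    ]
    total = sum(strength for strength, _ in rules)
    if total == 0.0:
        return 0.0  # no rule fired (input outside [0, 1]); treated as normal here
    # Sugeno defuzzification: weighted average of the rule outputs.
    return sum(strength * out for strength, out in rules) / total


def classify(score, threshold=0.5):
    """Turn the crisp score into a label; the threshold is an assumption."""
    return "attack" if score >= threshold else "normal"


print(classify(sugeno_score(0.8, 0.1)))
print(classify(sugeno_score(0.0, 0.0)))
```

The weighted average is the standard defuzzification step for Sugeno systems and keeps the score in [0, 1] as long as the rule outputs are 0 or 1, which matches the convention that 0 means completely normal and 1 means attack.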
To evaluate performance, the NSL-KDD dataset is used for both training and testing. The numbers of records taken for the training and testing phases are given in Table 2 and Table 3.

Table 2: Number of instances in the training dataset

Attacks   Records
Normal    67343
DOS       45927
Probe     11656
U2R       52
R2L       995

Table 3: Number of instances in the testing dataset

Attacks   Records
Normal    9711
DOS       7456
Probe     2421
U2R       200
R2L       2756

Result Analysis

The training dataset contains normal data as well as attack data, which are given to the proposed system to identify suitable attributes. The attributes selected for the rule generation process are given in Table 5. Then, using the fuzzy rule learning strategy, the system generates definite and indefinite rules, and finally fuzzy rules are generated from the definite rules. In the testing phase, the testing dataset is given to the proposed system, which classifies each input as normal or attack. The obtained result is then used to compute the overall accuracy of the proposed system. The evaluation metrics are computed for both the training and testing datasets in the testing phase, and the results for all attacks and normal data are given in Table 6, which shows the overall classification performance of the proposed system on the NSL-KDD dataset.

Table 5: Selected attributes for rule generation

1   Protocol Type
2   Service Name
3   Flag Status
4   Size of Source in Bytes
5   Size of Destination in Bytes
6   Geographical Attribute Flag
7   Wrong Fragments Count
8   Emergency Flag
9   Connection Status
10  Failed Login Count
11  Node Logged-in Status
12  Shell Access Count
13  Number of Files Created
14  Number of Outbound Commands
15  Accessed Files Count
16  Host Login Status
17  Guest Login Status
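The paper does not define its evaluation metrics explicitly. A minimal sketch, assuming the usual one-vs-rest definitions DR = TP / (TP + FN) and FAR = FP / (FP + TN) for each class, is given below. The function name and the ten sample labels are hypothetical and serve only to illustrate the calculation; they are not results from this paper.

```python
def detection_and_false_alarm_rate(y_true, y_pred, positive):
    """Return (DR, FAR) for one class, treating every other class as negative."""
    tp = fn = fp = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if actual == positive:
            if predicted == positive:
                tp += 1  # record of this class, correctly detected
            else:
                fn += 1  # record of this class, missed
        else:
            if predicted == positive:
                fp += 1  # record of another class, wrongly flagged
            else:
                tn += 1  # record of another class, correctly not flagged
    dr = tp / (tp + fn) if (tp + fn) else 0.0
    far = fp / (fp + tn) if (fp + tn) else 0.0
    return dr, far


# Hypothetical labels for ten test records.
y_true = ["dos", "dos", "dos", "dos", "normal",
          "normal", "normal", "normal", "normal", "probe"]
y_pred = ["dos", "dos", "dos", "normal", "normal",
          "normal", "dos", "normal", "normal", "probe"]

dr, far = detection_and_false_alarm_rate(y_true, y_pred, positive="dos")
print(f"DR = {dr:.2%}, FAR = {far:.2%}")
```

Calling the function once per class (Normal, DOS, Probe, U2R, R2L) yields one detection rate and one false alarm rate per class.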
The results show that the overall performance of the proposed system improves significantly: the detection rate exceeds 92% for every class and reaches 98.1% for normal traffic (Table 6).

Table 6: Overall Detection Rate and FAR

Class Name   Detection Rate (%)   False Alarm Rate (%)
Normal       98.1                 0.54
DOS          97.4                 0.83
Probe        93.7                 1.11
U2R          92.4                 1.51
R2L          92.7                 1.20

V. Conclusion

The proposed system is based on fuzzy rule generation for detecting anomaly-based intrusions. A fuzzy decision module is more accurate in detecting attacks. An effective set of fuzzy rules was identified automatically by means of the fuzzy rule learning scheme. These fuzzy rules were obtained by fuzzifying the definite rules and were given to a Sugeno fuzzy system, which classifies the test data. The results above show that the proposed method is effective in detecting various intrusions in computer networks.

References

1. R. Heady, G. Luger, A. Maccabe, and M. Servilla, “The architecture of a network level intrusion detection system”, Technical report, Computer Science Department, University of New Mexico, August 1990.
2. Deris Stiawan, Abdul Hanan Abdullah, and Mohd. Yazid Idris, “Characterizing Network Intrusion Prevention System”, International Journal of Computer Applications (0975-8887), Vol. 14, No. 1, January 2011.
3. Peng Ning (North Carolina State University) and Sushil Jajodia (George Mason University), “Intrusion Detection Techniques”. Available on: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.89.2492&rep=rep1&type=pdf
4. J. Luo and S. M. Bridges, “Mining fuzzy association rules and fuzzy frequency episodes for intrusion detection”, International Journal of Intelligent Systems, Vol. 15, No. 8, pp. 687-704, 2000.
5. W. Lee, S. Stolfo, and K. Mok, “A Data Mining Framework for Building Intrusion Detection Model”, In Proceedings of the IEEE Symposium on Security and Privacy, Oakland, CA, pp. 120-132, 1999.
6. J. Cannady, “Artificial Neural Networks for Misuse Detection”, In Proceedings of the ’98 National Information System Security Conference (NISSC’98), pp. 443-456, 1998.
7. T. Shon, J. Seo, and J. Moon, “SVM Approach with A Genetic Algorithm for Network Intrusion Detection”, Lecture Notes in Computer Science, Springer Berlin / Heidelberg, Vol. 3733, pp. 224-233, 2005.
8. Y. Yu and Huang Hao, “An Ensemble Approach to Intrusion Detection Based on Improved Multi-Objective Genetic Algorithm”, Journal of Software, Vol. 18, No. 6, pp. 1369-1378, June 2007.
9. “Nsl-kdd data set for network-based intrusion detection systems”. Available on: http://nsl.cs.unb.ca/KDD/NSLKDD.html, March 2009.
10. Mahbod Tavallaee, Ebrahim Bagheri, Wei Lu, and Ali A. Ghorbani, “A Detailed Analysis of the KDD CUP 99 Data Set”, In Proceedings of the IEEE Symposium on Computational Intelligence in Security and Defense Applications (CISDA 2009), pp. 1-6, 2009.
11. KDD Cup 1999. Available on: http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html, October 2007.
12. L. A. Zadeh, “Fuzzy Sets”, Information and Control, Vol. 8, pp. 338-353, 1965.
13. J. McHugh, “Testing intrusion detection systems: a critique of the 1998 and 1999 DARPA intrusion detection system evaluations as performed by Lincoln Laboratory”, ACM Transactions on Information and System Security, Vol. 3, No. 4, pp. 262-294, 2000.
14. Weka: Data Mining Machine Learning Software. http://www.cs.waikato.ac.nz/ml/weka/