NOVEL APPROACH FOR NETWORK INTRUSION DETECTION
A.B. Babatope, N.A. Azeez

INTRODUCTION
Information that is not properly secured is vulnerable to intrusions and threats. Security measures help information maintain its integrity. An Intrusion Detection System (IDS) is one method of securing computer networks and systems: it detects attacks before they gain access to the system.

AIM AND OBJECTIVES
The aim of this project is to develop an IDS using a genetic algorithm approach that is better at detecting unusual events and threats within a computer network. The objectives are:
• To analyse the different artificial intelligence algorithms used in intrusion detection systems
• To develop an intrusion detection system that detects more threats to computer networks and is less likely to produce errors

CONTRIBUTIONS
The results showed that intrusive attacks were most frequent on the Class A set of IP addresses. They also showed that the probe category of attack was the most frequent across the system.

LITERATURE REVIEW
According to Scarfone and Mell (2007), "Intrusion detection is the process of monitoring the events occurring in a computer system or network and analysing them for signs of possible incidents, which are violations or imminent threats of violation of computer security policies, acceptable use policies or standard security practices."

The types of network attack are:
• Denial of Service
• Remote to User attacks (R2L)
• User to Root attacks (U2R)
• Probing

RELATED WORKS
Title: The Design and Implementation of Intrusion Detection System based on Data Mining Technology (Zhou & Zhao, 2013)
Approach: Data mining
Strength: Adaptive ability
Weakness: Not time efficient

Title: Implementation of an Intrusion Detection (Snortt) System (Ourida, 2012)
Approach: Intrusion sensor
Strength: It takes less time to implement
Weakness: Security issues

GENETIC ALGORITHM
A genetic algorithm is a problem-solving method modelled on biological genetic operators. The decision variables of a search problem are encoded as strings of symbols from an alphabet. The strings are the chromosomes, the symbols are the genes, and the value of a gene is called an allele.

During evolution, operators process the chromosomes in each generation:
• Selection (or reproduction): individuals with better fitness are selected into the next population; the rest are dropped.
• Crossover (or recombination): randomly selected pairs of individuals exchange parts of their chromosomes with each other until a whole new population has been generated.
• Mutation: restores diversity in the population, which is lost through repeated use of the crossover operator.

I.      Random generation of initial chromosomes
II.     Set w1 = 0.2, w2 = 0.8, T = 0.5, Max Generations = 100
III.    Set N = total number of records in training set
IV.     Set generation counter = 0
V.      For each chromosome in population
VI.       Set A = 0, AB = 0
VII.      For each record in dataset
VIII.       If record matches chromosome
IX.           AB = AB + 1
X.          End If
XI.         If record matches only condition part
XII.          A = A + 1
XIII.       End If
XIV.      End for each record
XV.     End for each chromosome
XVI.    Fitness = [formula missing in source]
XVII.   If Fitness > T
XVIII.    Select fitted chromosomes into new selection pool
XIX.    End If
XX.     For each chromosome in new pool/population
XXI.      Select chromosome for breeding
XXII.     Apply crossover and mutation to new offspring
XXIII.    Place newly created chromosome into population
XXIV.   End for each
XXV.    Kill old pool; new pool is now the current pool
XXVI.   Increment generation counter by 1
XXVII.  If generation counter < Max Generations then
XXVIII.   Go to step V

LINKING GA WITH INTRUSION DETECTION
The genetic algorithm is linked with intrusion detection by using it to classify the network connections that the system encounters. Each network connection is represented as a chromosome, and the attributes of the connection are represented as genes in that chromosome. Because the system treats each network connection as a chromosome, the GA enables the intrusion detection system to differentiate between types of network connection.

FITNESS FUNCTION
The fitness function scales the value of an individual relative to the rest of the population. It identifies the best solutions among the candidates in the population, that is, the chromosomes that are fittest relative to the others and that will be used for recombination in the next generation.

Algorithm I: Fitness Function Algorithm

SYSTEM DESIGN
Figure 1: Genetic Algorithm architecture

Table 1: Chromosome representation
Attribute name      Number of genes   Format
Duration            3                 H:M:S
Protocol            1                 Numeric
Source port         1                 Numeric
Destination port    1                 Numeric
Source IP           4                 a.b.c.d
Destination IP      4                 a.b.c.d
Attack name         1                 String

IMPLEMENTATION
The software was developed in Microsoft Visual Studio, an integrated development environment (IDE), using the C# programming language. The system was developed as a console program. On initialisation, random chromosomes are generated and the system locates and reads the dataset file. The system then iterates, producing results for each generation, until the maximum number of generations is reached.
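The loop in pseudocode steps I to XXVIII, applied to the 15-gene chromosome of Table 1, can be sketched roughly as below. This is an illustrative Python sketch, not the authors' C# program. Several things are assumptions: the fitness formula is missing from the slides, so the sketch uses the support-confidence form w1 * AB / N + w2 * AB / A from Li (2004), which the deck cites. A gene value of -1 is treated as a wildcard. A counts every record whose condition genes match. The single-point crossover and wildcard mutation are simple stand-ins, and all function names are hypothetical.

```python
import random

WILDCARD = -1              # assumption: a gene of -1 matches any value
CONDITION_GENES = 14       # Table 1: 3 + 1 + 1 + 1 + 4 + 4 genes; gene 15 is the attack name
W1, W2, T = 0.2, 0.8, 0.5  # weights and threshold from step II of the pseudocode


def condition_matches(chromosome, record):
    """True if every non-wildcard condition gene equals the record's value."""
    pairs = zip(chromosome[:CONDITION_GENES], record[:CONDITION_GENES])
    return all(gene == WILDCARD or gene == value for gene, value in pairs)


def fitness(chromosome, records):
    """Assumed fitness (Li, 2004): w1 * AB / N + w2 * AB / A."""
    n = len(records)
    a = sum(1 for r in records if condition_matches(chromosome, r))
    ab = sum(1 for r in records
             if condition_matches(chromosome, r) and chromosome[-1] == r[-1])
    if n == 0 or a == 0:
        return 0.0
    return W1 * ab / n + W2 * ab / a


def crossover(parent_a, parent_b, rng):
    """Single-point crossover on the condition genes; attack name comes from parent_a."""
    point = rng.randrange(1, CONDITION_GENES)
    return (parent_a[:point] + parent_b[point:CONDITION_GENES]
            + parent_a[CONDITION_GENES:])


def mutate(chromosome, rng, rate=0.05):
    """Replace each condition gene with a wildcard with probability `rate`."""
    genes = tuple(WILDCARD if rng.random() < rate else gene
                  for gene in chromosome[:CONDITION_GENES])
    return genes + chromosome[CONDITION_GENES:]


def evolve(population, records, max_generations=100, seed=0):
    """Select chromosomes with fitness above T, breed them, replace the old pool."""
    rng = random.Random(seed)
    for _ in range(max_generations):
        pool = [c for c in population if fitness(c, records) > T]
        if len(pool) < 2:
            break  # too few fit chromosomes to breed
        offspring = []
        for _ in range(len(population)):
            parent_a, parent_b = rng.sample(pool, 2)
            offspring.append(mutate(crossover(parent_a, parent_b, rng), rng))
        population = offspring
    return population
```

Each chromosome and record here is a 15-element tuple in the gene order of Table 1, for example `(0, 0, 53, "ftp-data", 38127, 1985, 65, 127, -1, 203, 243, 192, 194, 177, "rcp")`. Weighting confidence (w2 = 0.8) above support (w1 = 0.2) favours rules that are rarely wrong over rules that match many records.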
Figure 2: Initialisation of the program

Table 2: The result of the intrusion detection system
DURATION   PROTOCOL   SOURCE PORT   DESTINATION PORT   SOURCE IP         DESTINATION IP     ATTACK NAME
0:0:53     ftp-data   38127         1985               65.127.-1.203     243.192.194.177    rcp
-1:1:0     Auth       26586         55979              20.154.30.-1      51.70.111.254      rsh
0:0:-1     Rsh        62512         26370              160.72.57.42      135.-1.237.4       Phf
-1:1:0     Auth       26586         -1                 242.41.151.123    239.18.88.223      Guess
0:0:-1     Rsh        62728         55979              20.154.30.-1      51.70.111.254      rsh
0:0:-1     http       -1            12106              104.65.17.124     -1.191.-1.244      Port-scan
0:0:39     Rsh        62728         -1                 242.41.151.123    239.18.88.223      Guess
0:0:39     http       -1            12106              104.65.17.124     -1.191.-1.244      Port-scan

DARPA DATASET
The Defense Advanced Research Projects Agency (DARPA) dataset was created in 1998 by the Lincoln Laboratory of MIT out of the need to evaluate intrusion detection systems. It was first made available to the public in February 1998. Three classes of attack are present in the dataset:
• Probe (Portscan)
• R2L (phf, guess)
• U2R (rlogin, rsh, rcp)

The distribution of the types of network connection present in the dataset file is as follows.

Table 3: Distribution of the intrusive connections in the dataset
Class   Attacks (count)                   Total
Probe   Portscan (30)                     30
R2L     Phf (1), Guess (4)                5
U2R     rlogin (1), rsh (2), rcp (1)      4

Figure 3: Graphical distribution of the network connections (categories: U2R, R2L, Probe, Normal)

ANALYSIS OF RESULTS
Three sets of results were selected for analysis. Each set consists of 50 records. The first analysis was the classification of the source and destination IP addresses.
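The IP classification in the first analysis can be sketched as follows. The slides do not say how the classes were assigned, so this sketch assumes standard classful IPv4 addressing, where the first octet determines the class. The helper names `ip_class` and `tally_classes` are hypothetical, and treating an address whose first octet is -1 as unclassifiable is also an assumption. The example is in Python rather than the C# of the implementation.

```python
from collections import Counter


def ip_class(address):
    """Return the classful IPv4 class (A to E) from the first octet, or None for -1."""
    first_octet = int(address.split(".")[0])
    if first_octet == -1:
        return None  # assumption: a -1 first octet cannot be classified
    if not 0 <= first_octet <= 255:
        raise ValueError(f"first octet out of range: {first_octet}")
    if first_octet <= 127:
        return "A"
    if first_octet <= 191:
        return "B"
    if first_octet <= 223:
        return "C"
    if first_octet <= 239:
        return "D"
    return "E"


def tally_classes(addresses):
    """Count how many addresses fall into each class, skipping unclassifiable ones."""
    classes = (ip_class(a) for a in addresses)
    return Counter(c for c in classes if c is not None)
```

Applied to the source IP column of a result set, `tally_classes` yields per-class counts of the kind reported in the classification tables.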
The second analysis is the relationship between the protocol and the type of intrusion.

• First run

Table 4: Classification of IP addresses of the first set of results
Class     Source IP   Destination IP
Class A   20          50
Class B   30          0
Class C   0           0
Class D   0           0
Class E   0           0

Figure 4: Graphical classification of the IP addresses of the first set of results

Table 5: Distribution of intrusion attacks to the protocols for the first set of results
Columns: protocol; attack names Phf, Port Scan, Rcp, rlogin, rsh; row total. Column positions of the counts are not recoverable; each row lists its non-empty counts in source order, then the row total.
ftp-data   2, 3, 1     total 6
http       2, 4, 14    total 20
Rsh        3, 2, 3     total 8
Smtp       5, 2, 5     total 12
Unassigned values (rows ftp and telnet, and column totals): 1, 3, 4, 12, 26, 12

Figure 5: Graphical representation of the distribution of attacks to protocols for the first set of results

• Second run

Table 6: Classification of IP addresses of the second set of results
Class     Source IP   Destination IP
Class A   19          39
Class B   0           11
Class C   0           0
Class D   0           0
Class E   31          0

Figure 6: Graphical classification of the IP addresses of the second set of results

Table 7: Distribution of intrusion attacks to the protocols for the second set of results
Columns: protocol; attack names Phf, Port Scan, Rcp, rlogin, rsh; row total. Column positions of the counts are not recoverable; each row lists its non-empty counts in source order, then the row total.
ftp        2, 4        total 6
ftp-data   11, 21      total 32
Unassigned values (rows http, Rsh, Smtp and telnet, and column totals): 8, 4, 12, 21, 29

Figure 7: Graphical representation of the distribution of attacks to protocols for the second set of results

• Third run

Table 8: Classification of IP addresses of the third set of results
Class     Source IP   Destination IP
Class A   21          50
Class B   8           0
Class C   21          0
Class D   0           0
Class E   0           0

Figure 8: Graphical classification of the IP addresses of the third set of results

Table 9: Distribution of intrusion attacks to the protocols for the third set of results
Columns: protocol; attack names Phf, Port Scan, Rcp, rlogin, rsh; row total. Neither the row labels nor the column positions of the counts are recoverable.
Protocols listed: ftp, ftp-data, http, Rsh, Smtp, telnet
Values in source order: 1, 8, 16, 25, 1, 9, 15, 25, 2, 17, 31

Figure 9: Graphical representation of the distribution of attacks to protocols for the third set of results

• Fourth run

Table 10: Classification of IP addresses of the fourth set of results
Class     Source IP   Destination IP
Class A   46          44
Class B   4           0
Class C   0           0
Class D   0           0
Class E   0           6

Figure 10: Graphical classification of the IP addresses of the fourth set of results

Table 11: Distribution of intrusion attacks to the protocols for the fourth set of results
Columns: protocol; attack names Phf, Port Scan, Rcp, rlogin, rsh; row total. Column positions of the counts are not recoverable; each row lists its non-empty counts in source order, then the row total.
http       4, 5        total 9
Rsh        1, 1        total 2
Smtp       9, 30       total 39
Unassigned values (rows ftp, ftp-data and telnet, and column totals): 14, 36

Figure 11: Graphical representation of the distribution of attacks to protocols for the fourth set of results

FUTURE WORK
The genetic algorithm approach to intrusion detection needs continual improvement, and the classification rules need continual enhancement so that the system classifies connections correctly.
CONCLUSION
Securing information against attacks requires continuous research as new threats emerge, in order to stay up to date and to minimise risk and cost.

REFERENCES
Chittur, A., 2001. Model Generation for an Intrusion Detection System Using Genetic Algorithms.
Hoque, M.S., Mukit, A. & Bikas, A.N., 2012. An Implementation of Intrusion Detection System using Genetic Algorithm. International Journal of Network Security & Its Applications (IJNSA), Vol. 4, No. 2, March 2012.
Li, W., 2004. Using Genetic Algorithm for Network Intrusion Detection. Mississippi State University, Mississippi State, MS 39762.
Sastry, K., Goldberg, D. & Kendall, G., 2005. Genetic Algorithms.
Scarfone, K. & Mell, P., 2007. Guide to Intrusion Detection and Prevention Systems (IDPS). National Institute of Standards and Technology, NIST Special Publication 800-94.

THANK YOU