CHAPTER 2
THE DATA MINING TECHNOLOGY

Data Mining

Data mining is the process of sorting through a large database or data warehouse and extracting knowledge of interest to its users. The extracted knowledge may be represented as a concept, rule, law, or model. The purpose of data mining is to help the decision maker find potential associations in the data and uncover neglected elements that may be useful for identifying trends and supporting decisions. It has been described as “the nontrivial extraction of implicit, previously unknown, and potentially useful information from data” [5] and “the science of extracting useful information from large data sets or databases” [6].

Data mining identifies trends within data that go beyond simple analysis. Through the use of sophisticated algorithms, non-statistician users have the opportunity to identify key attributes of real-life problems such as intrusion detection, face recognition, image processing, business processes, and other target opportunities. However, handing control of these processes from the statistician to the machine may or may not produce positive or useful results [1] unless one can ensure that the data on which the operations are performed are complete in all respects. Figure 2.1 shows the basic approach of data mining.

Figure 2.1: The transition from raw data to valuable knowledge.
Here are a few specific things that data mining might contribute to an intrusion detection project:
• Remove normal activity from alarm data to allow analysts to focus on real attacks
• Identify false alarm generators and “bad” sensor signatures
• Find anomalous activity that uncovers a real attack
• Identify long, ongoing patterns (different IP address, same activity)

To accomplish these tasks, data miners employ one or more of the following techniques:
• Data summarization with statistics, including finding outliers
• Visualization: presenting a graphical summary of the data
• Clustering of the data into natural categories
• Association rule discovery: defining normal activity and enabling the discovery of anomalies
• Classification: predicting the category to which a particular record belongs
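
As an illustration of the first technique listed above (data summarization with statistics, including finding outliers), the following is a minimal Python sketch rather than a method taken from this chapter. The alarm records, IP addresses, signature names, the helper names `summarize` and `find_outliers`, and the z-score threshold of 2.0 are all assumptions invented for the example. It counts alarms per source address and flags any source whose count lies far above the mean.

```python
from collections import Counter
from statistics import mean, pstdev

# Hypothetical alarm volume per source IP (synthetic data for illustration only).
ALARMS_PER_SOURCE = {
    "10.0.0.1": 3, "10.0.0.2": 4, "10.0.0.3": 2,
    "10.0.0.4": 3, "10.0.0.5": 4, "10.0.0.6": 3,
    "10.0.0.7": 2, "10.0.0.8": 4, "10.0.0.10": 3,
    "10.0.0.9": 40,  # one unusually noisy source
}

# Expand the counts into individual (source IP, signature) alarm records,
# which is the shape a sensor log would more plausibly have.
alarms = [
    (ip, "example-signature")
    for ip, n in ALARMS_PER_SOURCE.items()
    for _ in range(n)
]


def summarize(records):
    """Summarize alarm records as a count of alarms per source IP."""
    return Counter(ip for ip, _signature in records)


def find_outliers(counts, threshold=2.0):
    """Return source IPs whose alarm count has a z-score above `threshold`.

    The threshold is an assumed value; a real deployment would tune it.
    """
    values = list(counts.values())
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All sources behave identically, so nothing stands out.
        return []
    return sorted(ip for ip, n in counts.items() if (n - mu) / sigma > threshold)


counts = summarize(alarms)
suspects = find_outliers(counts)
print("alarms per source:", dict(counts))
print("outlier sources:", suspects)
```

An analyst could use such a summary in either direction: a flagged source may be a real attack, or it may be a false alarm generator whose signature needs tuning. The statistics only point to where a human should look first.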