Data Mining
Enterprise systems DT211 4

Data Mining
• The process of extracting valid, previously unknown, comprehensible, and actionable information from large databases and using it to make crucial business decisions.
• Involves the analysis of data and the use of software techniques for finding hidden and unexpected patterns and relationships in sets of data.

Examples of Applications of Data Mining
• Retail / Marketing
  – Predicting response to mailing campaigns
  – Market basket analysis
• Banking
  – Detecting patterns of fraudulent credit card use
• Insurance
  – Claims analysis
• Medicine
  – Identifying successful medical therapies for different illnesses

Data Mining Operations
• Data mining operations include:
  – Predictive modelling
  – Database segmentation
  – Link analysis
  – Deviation detection

Predictive Modelling
• Similar to the human learning experience: uses observations to form a model of the important characteristics of some phenomenon.
• Applications of predictive modelling include direct marketing.
• There are two techniques associated with predictive modelling: classification and value prediction.

[Figure: Example of Classification using Tree Induction]

Predictive Modelling - Value Prediction
• Used to estimate a continuous numeric value that is associated with a database record.
• Uses traditional statistical techniques such as linear regression and nonlinear regression.
• Data mining requires statistical methods that can accommodate nonlinearity, outliers, and non-numeric data.
• Applications of value prediction include credit card fraud detection and target mailing list identification.

Database Segmentation
• Aim is to partition a database into an unknown number of segments, or clusters, of similar records.
• Applications of database segmentation include credit card fraud….
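
Database segmentation can be illustrated with a small clustering sketch. The slides do not name an algorithm, so the example below assumes k-means, one common clustering technique. Note that k-means needs the number of clusters `k` to be chosen in advance, which is a simplification of the "unknown number of segments" described above. The function name `k_means` and the customer records are hypothetical.

```python
from math import dist  # Euclidean distance, Python 3.8+


def k_means(points, k, iterations=10):
    """Partition points into k clusters (assumes len(points) >= k)."""
    # Assumption: seed the centroids with the first k points so the run is deterministic.
    centroids = [points[i] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical records: (monthly spend, transactions per month)
records = [(1.0, 1.0), (9.0, 9.0), (1.5, 2.0), (8.0, 8.0), (1.0, 0.5), (9.0, 10.0)]
centroids, clusters = k_means(records, k=2)
for centroid, cluster in zip(centroids, clusters):
    print(centroid, cluster)
```

With these records the low-spend and high-spend customers end up in separate segments. A record that sits far from every centroid would be a candidate for closer inspection, which is one way segmentation can support applications such as fraud analysis.
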

[Figure: Database Segmentation using a Scatterplot]

Link Analysis
• Aims to establish links between records, or sets of records, in a database; one such example would be association discovery….
• Applications include product affinity analysis.
• Finds items that imply the presence of other items in the same event.

Link Analysis - Associations Discovery
• Affinities between items are represented by association discovery.
  – e.g. ‘When a customer rents property for more than 2 years and is more than 25 years old, in 40% of cases, the customer will buy a property. This association happens in 35% of all customers who rent properties’.

Data Mining and Data Warehousing
• Data mining requires a single, separate, clean, integrated, and self-consistent source of data.
• Data quality and consistency are prerequisites for mining, to ensure the accuracy of the predictive models.
• Data warehouses are populated with clean, consistent data as well as other attributes that are advantageous to the data mining process: drill down….

Sample types of questions
• “Data Mining is one of the most essential information technologies to aid strategic formulation.” Discuss the validity of this statement.
• Discuss how different types of data mining operations can generate meaningful information for the enterprise.
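
As a worked illustration of the association discovery slide, the two percentages quoted in an association rule are usually called confidence (how often the consequent holds when the antecedent holds) and support (how often the whole rule appears among all records). The sketch below computes both using the standard definitions, which may differ slightly from the wording on the slide. The function name `rule_metrics` and the basket data are hypothetical.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence) for the rule antecedent -> consequent."""
    antecedent, consequent = set(antecedent), set(consequent)
    # Transactions that contain every item of the antecedent.
    n_antecedent = sum(1 for t in transactions if antecedent <= t)
    # Transactions that contain both the antecedent and the consequent.
    n_both = sum(1 for t in transactions if (antecedent | consequent) <= t)
    support = n_both / len(transactions)
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    return support, confidence


# Hypothetical market baskets, one set of items per purchase event.
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread", "milk"},
    {"milk"},
    {"bread", "butter", "jam"},
]

support, confidence = rule_metrics(baskets, {"bread"}, {"butter"})
print(f"support={support:.2f}, confidence={confidence:.2f}")
# prints "support=0.60, confidence=0.75"
```

Here bread appears in four of the five baskets and bread with butter in three, so the rule "bread implies butter" holds in 75% of the cases where bread is bought and covers 60% of all baskets.
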