MIS 451 Building Business Intelligence Systems

Introduction to Data Mining

Why data mining?

• OLAP can only provide shallow data analysis: it answers "what" questions.
  Ex: sales distribution by product
• Shallow data analysis is not sufficient to support business decisions, which depend on "how" questions.
  Ex: how to boost sales of other products
  Ex: when people buy product 6, what other products are they likely to buy? (cross-selling)
• OLAP can only do shallow data analysis because it is based on SQL:

    SELECT PRODUCTS.PNAME, SUM(SALESFACTS.SALES_AMT)
    FROM DBSR.PRODUCTS PRODUCTS, DBSR.SALESFACTS SALESFACTS
    WHERE PRODUCTS.PRODUCT_KEY = SALESFACTS.PRODUCT_KEY
    GROUP BY PRODUCTS.PNAME;

  SQL is a declarative query language, so complicated algorithms are hard to implement in SQL alone. Such algorithms need to be developed separately to support deep data analysis, which is the role of data mining.
• OLAP results generated from data sets with a large number of attributes are difficult to interpret.
  Ex: cluster the customers of my company (target marketing)
  Pick two attributes related to a customer: income level and sales amount
  Pick three attributes related to a customer: income level, education level and sales amount

What is data mining?

• Data mining is a process to extract hidden and interesting patterns from data.
• Data mining is a step in the process of Knowledge Discovery in Databases (KDD).

Steps of the KDD Process

Data -> [Step 1: Selection] -> Target Data -> [Step 2: Cleaning] -> Preprocessed Data -> [Step 3: Transformation] -> Transformed Data -> [Step 4: Data Mining] -> Patterns -> [Step 5: Interpretation & Evaluation] -> Knowledge

• Step 1: select the columns (attributes) and rows (records) of interest to be mined.
• Step 2: clean errors from the selected data.
• Step 3: transform the data into a form suitable for high-performance data mining.
• Step 4: run data mining on the transformed data.
• Step 5: filter out non-interesting patterns from the data mining results.

Data mining: on what kind of data?

• Transactional database
• Data warehouse
• Flat file
• Web data
  • Web content
  • Web structure
  • Web log

Major data mining tasks

• Association rule mining: cross-selling
• Clustering: target marketing
• Classification: potential customer identification, fraud detection

Reading: data mining book, chapter 1
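
Worked example (not part of the original slides): to make the cross-selling question concrete ("when people buy product 6, what other products are they likely to buy?"), the following is a minimal Python sketch of association rule mining over pairs of products. The transactions, the function name pair_rules, and the two thresholds are hypothetical values chosen for illustration. The counting is brute force, not any specific algorithm from the course or the reading.

```python
from collections import Counter
from itertools import combinations

# Hypothetical market baskets: each set holds the product IDs bought together.
transactions = [
    {6, 2, 3},
    {6, 2},
    {6, 3, 5},
    {2, 4},
    {6, 2, 5},
]


def pair_rules(baskets, min_support=0.4, min_confidence=0.6):
    """Return (lhs, rhs, support, confidence) for rules of the form lhs -> rhs.

    support    = share of baskets containing both products
    confidence = share of baskets with lhs that also contain rhs
    Both thresholds are illustrative assumptions.
    """
    n = len(baskets)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        item_counts.update(basket)
        # Sort so that each unordered pair is always counted under one key.
        pair_counts.update(combinations(sorted(basket), 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        # Each frequent pair yields two candidate rules, one per direction.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    # Strongest rules first; ties are broken by product ID for stable output.
    return sorted(rules, key=lambda r: (-r[3], -r[2], r[0], r[1]))


rules = pair_rules(transactions)
for lhs, rhs, support, confidence in rules:
    print(f"{lhs} -> {rhs}: support={support:.2f}, confidence={confidence:.2f}")
```

With this toy data, the only rule that starts from product 6 is 6 -> 2, with support 0.60 and confidence 0.75: three of the four baskets containing product 6 also contain product 2, so product 2 would be the cross-selling candidate. A grouped SQL query like the one in the slides reports totals per product; the rule search above has to compare every pair of products against two thresholds, which is the kind of analysis the slides assign to data mining.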
