Virtual University of Pakistan
Data Warehousing
Lecture-29
Brief Intro. to Data Mining

Ahsan Abdullah
Assoc. Prof. & Head
Center for Agro-Informatics Research
www.nu.edu.pk/cairindex.asp
National University of Computers & Emerging Sciences, Islamabad
Email: [email protected]

DWH-Ahsan Abdullah

What is Data Mining?: Non-technical view
“There are things that we know that we know… there are things that we know that we don’t know… there are things that we don’t know we don’t know.”
Donald Rumsfeld, US Secretary of Defense

What is Data Mining?: Slightly formal

What is Data Mining?: Formal view
Data mining digs out valuable, non-trivial information from large, multidimensional, apparently unrelated databases (data sets).

Why Data Mining?
Huge volume

Claude Shannon's info. theory
More volume means less information

Value vs. Volume
[Figure: Value of Data against Volume of Data. Levels labelled: Decision (Y/N), Decision Support, Knowledge, Information, Indexed Data, Raw Data]

Why Data Mining?: Supply & Demand

Data Mining is HOT!
10 Hottest Jobs of year 2025
Time Magazine, 22 May, 2000
10 emerging areas of technology
MIT’s Magazine of Technology Review, Jan/Feb, 2001

How Data Mining is different?
Traditionally
Knowledge Discovery (KDD)
Data Mining (Knowledge-driven exploration)
Data Warehouses (Data-driven exploration):
Traditional Database (Transactions):

Data Mining Vs. Statistics

Data Mining Vs. Statistics

Knowledge extraction using statistics
[Chart: Inflation vs. stock index increase. x-axis: Inflation (%), 1.6 to 6. y-axis: Stock increase (%), 0 to 40]
Q: What will be the stock increase when inflation is 6%?
A: Model non-linear relationship using a line y = mx + c.
Hence answer is 13%

Failure of regression models
[Chart: fitted curve y = -0.0127x^6 + 1.5029x^5 - 63.627x^4 + 1190.3x^3 - 9725.3x^2 + 31897x - 29263. x-axis: 0 to 35. y-axis: -10000 to 70000]

Data Mining is…
Decision Trees
Neural Networks
Rule Induction (If . . . Then . . .)
Clustering
Genetic Algorithms

Data Mining is NOT ...
Data warehousing
Ad Hoc Query / Reporting
Online Analytical Processing (OLAP)
Data Visualization
Software Agents

Data Mining: Business Perspective
“Knowledge” is worth knowing if it can be used to increase profit by lowering cost or it can be used to increase profit by raising revenue.
Business questions
Profiling/Segmentation
Cross-Service
Employee retention:
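
The degree-6 polynomial quoted on the "Failure of regression models" slide can be evaluated directly. The slide does not say which failure it has in mind, so the sketch below illustrates only one common possibility: a high-degree fit can return values far outside the plotted range once x leaves the interval it was fitted on. The coefficients are copied from the slide; the function name `evaluate`, the constant `CHART_X_MAX`, and the sample x values are assumptions made for illustration.

```python
# Coefficients of the degree-6 curve printed on the slide, highest power first.
COEFFS = [-0.0127, 1.5029, -63.627, 1190.3, -9725.3, 31897.0, -29263.0]

# Assumption: the chart's x-axis ends at 35, as its tick labels suggest.
CHART_X_MAX = 35


def evaluate(coeffs, x):
    """Evaluate a polynomial with Horner's rule (coefficients highest power first)."""
    result = 0.0
    for c in coeffs:
        result = result * x + c
    return result


if __name__ == "__main__":
    # Hypothetical sample points: some inside the plotted range, some beyond it.
    for x in (0, 10, 20, 30, 35, 40, 50):
        where = "inside" if x <= CHART_X_MAX else "outside"
        y = evaluate(COEFFS, x)
        print(f"x = {x:>2} ({where} plotted range): y = {y:,.0f}")
```

Because the printed coefficients are rounded, values computed from them at larger x need not match the curve drawn on the slide: a small rounding error in the x^6 coefficient is multiplied by a very large power of x.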