Christoph F. Eick: Introduction Knowledge Discovery and Data Mining (KDD)

Promising “Newer” Technologies to Cope with the Information Flood
• Knowledge Discovery and Data Mining (KDD)
• Agent-based Technologies
• Ontologies and Knowledge Brokering
• Non-traditional data analysis techniques
Model Generation as an Example to Explain / Discuss Technologies

Why Do We Need So Many Data Mining / Analysis Techniques?
• No generally good technique exists.
• Different methods make different assumptions about the data set to be analyzed.
• Cross-fertilization between different methods is desirable and frequently helps in obtaining a deeper understanding of the analyzed data set.

Data Mining and Business Intelligence
Increasing potential to support business decisions (layers from top to bottom):
• Making Decisions
• Data Presentation: Visualization Techniques
• Data Mining: Information Discovery
• Data Exploration: Statistical Analysis, Querying and Reporting
• Data Warehouses / Data Marts: OLAP, MDA
• Data Sources: Paper, Files, Information Providers, Database Systems, OLTP
Roles, from top to bottom: End User, Business Analyst, Data Analyst, DBA

Example: Decision Tree Approach [figure]

Decision Tree Approach 2 [figure]

Decision Trees
Example:
• Conducted a survey to see which customers were interested in a new model car
• Want to select customers for an advertising campaign

sale (training set)
custId  car     age  city  newCar
c1      taurus  27   sf    yes
c2      van     35   la    yes
c3      van     40   sf    yes
c4      taurus  22   sf    yes
c5      merc    50   la    no
c6      taurus  25   la    no

One Possibility (same sale training set)
age<30?
  Y: city=sf?
       Y: likely
       N: unlikely
  N: car=van?
       Y: likely
       N: unlikely

Another Possibility (same sale training set)
car=taurus?
  Y: city=sf?
       Y: likely
       N: unlikely
  N: age<45?
       Y: likely
       N: unlikely

Summary KDD
• KDD: discovering interesting patterns from large amounts of data
• A natural evolution of database technology, in great demand, with wide applications
• A KDD process includes data cleaning, data integration, data selection, transformation, data mining, pattern evaluation, and knowledge presentation
• Mining can be performed in a variety of information repositories
• Data mining functionalities: characterization, discrimination, association, classification, clustering, outlier and trend analysis, etc.
• Multi-disciplinary activity
• Important issues: KDD methodologies and user interactions, scalability, tool use and tool integration, preprocessing, interpretation of results, finding good parameter settings when running data mining tools, …

Where to Find References?
• Data mining and KDD (SIGKDD member CD-ROM):
  – Conference proceedings: KDD, and others, such as PKDD, PAKDD, etc.
  – Journal: Data Mining and Knowledge Discovery
• Database field (SIGMOD member CD-ROM):
  – Conference proceedings: ACM-SIGMOD, ACM-PODS, VLDB, ICDE, EDBT, DASFAA
  – Journals: ACM-TODS, J. ACM, IEEE-TKDE, JIIS, etc.
• AI and Machine Learning:
  – Conference proceedings: Machine Learning, AAAI, IJCAI, etc.
  – Journals: Machine Learning, Artificial Intelligence, etc.
• Statistics:
  – Conference proceedings: Joint Stat. Meeting, etc.
  – Journals: Annals of Statistics, etc.
• Visualization:
  – Conference proceedings: CHI, etc.
  – Journals: IEEE Trans. Visualization and Computer Graphics, etc.
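
Returning to the decision tree example: the two candidate trees ("One Possibility" and "Another Possibility") can be checked against the sale training set with a few lines of Python. This is only a sketch. The names tree_one, tree_two, and training_accuracy are hypothetical, and treating "likely" as a prediction of newCar = yes is an assumption made for the illustration, not something the slides state.

```python
# Training set from the "sale" table in the decision tree example.
# Columns: (custId, car, age, city, newCar)
TRAINING_SET = [
    ("c1", "taurus", 27, "sf", "yes"),
    ("c2", "van",    35, "la", "yes"),
    ("c3", "van",    40, "sf", "yes"),
    ("c4", "taurus", 22, "sf", "yes"),
    ("c5", "merc",   50, "la", "no"),
    ("c6", "taurus", 25, "la", "no"),
]


def tree_one(car, age, city):
    """'One Possibility': split on age<30, then on city=sf or car=van."""
    if age < 30:
        return "likely" if city == "sf" else "unlikely"
    return "likely" if car == "van" else "unlikely"


def tree_two(car, age, city):
    """'Another Possibility': split on car=taurus, then on city=sf or age<45."""
    if car == "taurus":
        return "likely" if city == "sf" else "unlikely"
    return "likely" if age < 45 else "unlikely"


def training_accuracy(tree):
    """Fraction of training rows where the tree agrees with newCar.

    Assumption: 'likely' is read as predicting newCar == 'yes'.
    """
    hits = 0
    for _cust_id, car, age, city, new_car in TRAINING_SET:
        predicted_yes = tree(car, age, city) == "likely"
        hits += predicted_yes == (new_car == "yes")
    return hits / len(TRAINING_SET)


if __name__ == "__main__":
    for name, tree in [("tree_one", tree_one), ("tree_two", tree_two)]:
        print(name, training_accuracy(tree))
```

Under that reading, both trees classify all six training rows correctly, yet they can disagree on customers outside the training set. For instance, a hypothetical 35-year-old taurus owner in sf is "unlikely" under the first tree and "likely" under the second. That is one way to see why no single tree is obviously the right one.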