Data Mining
Mandeep Jandir
CS157B

What is Data Mining?
Data mining, or knowledge discovery, is the process of discovering hidden patterns and relationships in data in order to make better and more informed decisions. Data mining tools predict behaviors and future trends, allowing businesses to make knowledge-driven decisions.

Why use Data Mining?
Data mining is a technique that helps individuals or companies find useful information in large amounts of data so they can make better decisions.
- Reduce risks
- Find problems and issues
- Save money
- Make high-confidence predictions
- Simplify information

Goals of Data Mining
Prediction: Data mining can show how certain attributes within the data will behave in the future. Ex.: certain seismic wave patterns may predict an earthquake with high probability.
Identification: Data patterns can be used to identify the existence of an item, an event, or an activity.

Goals of Data Mining (cont’d)
Classification: Data mining can partition the data so that different classes or categories can be identified based on combinations of parameters. Ex.: customers in a supermarket can be categorized into discount-seeking shoppers, shoppers in a rush, loyal regular shoppers, shoppers attached to name brands, and infrequent shoppers.

Goals of Data Mining (cont’d)
Optimization: Optimize the use of limited resources such as time, space, money, or materials, and maximize output variables such as sales or profits under a given set of constraints.

Types of Knowledge Discovered during Data Mining
Knowledge is often classified as inductive versus deductive. Deductive knowledge deduces new information by applying pre-specified logical rules of deduction to the given data. Data mining addresses inductive knowledge, which discovers new rules and patterns from the supplied data.

Types of Knowledge Discovered during Data Mining (cont’d)
It is common to describe knowledge discovered during data mining as:
- Association rules
- Classification hierarchies
- Sequential patterns
- Patterns within time series
- Clustering

Types of Association Rules
- Market-Basket Model, Support, and Confidence
- Apriori Algorithm
- Sampling Algorithm
- Frequent-Pattern Tree Algorithm
- Partition Algorithm

Apriori Algorithm
Principle: Any subset of a frequent itemset must be frequent.
Generate k-itemsets by joining large (that is, frequent) (k-1)-itemsets and deleting any that are not large.
Notation:

Apriori Algorithm (cont’d)
Input: Database of m transactions, D, and a minimum support, mins, represented as a fraction of m.
Output: Frequent itemsets, L1, L2, …, Lk

References
http://en.wikipedia.org/wiki/Data_mining
http://www.megaputer.com/dm/dm101.php 3#whyuse
www.icaen.uiowa.edu/~comp/Public/Aprior .pdf
Elmasri, R. and Navathe, S.: Fundamentals of Database Systems, 5th ed., Pearson Addison-Wesley
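
Example: Apriori sketch
The Apriori Algorithm slides give the input (a database D of m transactions and a minimum support as a fraction of m) and the output (frequent itemsets L1, L2, …, Lk). The following is a minimal Python sketch of that loop, not a definitive implementation. The function name `apriori`, the helper `support`, and the four sample transactions are assumptions made up for illustration; they do not come from the slides. Confidence of a rule X → Y is computed here as support(X ∪ Y) / support(X), the usual market-basket definition.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every frequent itemset.

    transactions: list of sets (the database D of m transactions)
    min_support:  minimum support as a fraction of m ("mins" on the slide)
    """
    m = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain every item in `itemset`.
        return sum(1 for t in transactions if itemset <= t) / m

    # L1: frequent 1-itemsets.
    items = {item for t in transactions for item in t}
    current = {}
    for item in items:
        s = support(frozenset([item]))
        if s >= min_support:
            current[frozenset([item])] = s

    frequent = {}
    k = 1
    while current:
        frequent.update(current)
        k += 1
        # Join step: union pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a, b in combinations(list(current), 2)
                      if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(sub) in current
                             for sub in combinations(c, k - 1))}
        # Keep only candidates that meet the minimum support: this is Lk.
        current = {}
        for c in candidates:
            s = support(c)
            if s >= min_support:
                current[c] = s
    return frequent


# Hypothetical market-basket data, for illustration only.
transactions = [
    {"milk", "bread", "cookies", "juice"},
    {"milk", "juice"},
    {"milk", "eggs"},
    {"bread", "cookies", "coffee"},
]

frequent = apriori(transactions, min_support=0.5)
for itemset, s in sorted(frequent.items(),
                         key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), s)

# Confidence of the rule milk -> juice.
confidence = (frequent[frozenset({"milk", "juice"})]
              / frequent[frozenset({"milk"})])
print("confidence(milk -> juice) =", confidence)
```

With a minimum support of 0.5, this sample data yields four frequent 1-itemsets and two frequent 2-itemsets ({milk, juice} and {bread, cookies}). The support helper rescans every transaction for each candidate, which keeps the sketch short but would be too slow for a large database.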