Survey
The interaction between KM and DM is also shown by current efforts to construct automated systems for filtering association rules learned from medical transaction databases. The availability of a formal ontology allows the ranking of association rules by clarifying which rules confirm available medical knowledge, which are surprising but plausible, and, finally, which should be filtered out (Raj et al., 2008). /xvi Other approaches, such as association and classification rules, joining the declarative nature of rules with the availability of learning mechanisms including inductive logic programming, hold great potential for effectively merging DM and KM (Amini et al., 2007). /xvi A first challenge in discovering knowledge from local patterns in SAGE data is to perform the local pattern extractions. Recall that a few years ago it was impossible to mine such patterns in large datasets, and only association rules with a rather high frequency threshold were used (Becquet et al., 2002). /262 This chapter gives a summary of our recent experience in mining transcriptomic data. The chapter accentuates the potential of genomic background knowledge stored in various formats such as free texts, ontologies, pathways, links among biological entities, etc. It shows the ways in which heterogeneous background knowledge can be preprocessed and subsequently applied to improve various learning and data mining techniques. In particular, the chapter demonstrates an application of background knowledge in the following tasks:
• Relational descriptive analysis
• Constraint-based knowledge discovery
• Feature selection and construction (and its impact on classification accuracy)
• Quantitative association rule mining
/269 Association rule (AR) mining can overcome these drawbacks; however, transcriptomic data represent a difficult mining context for association rules.
First, the data are high-dimensional (typically containing several thousand attributes), which calls for an algorithm scalable in the number of attributes. Second, expression values are typically quantitative variables. This variable type further increases computational demands and, moreover, may result in an output with a prohibitive number of redundant rules. Third, the data are often noisy, which may also cause a large number of rules of little significance. In this section we discuss the above-mentioned bottlenecks and present results of mining association rules using an alternative approach to quantitative association rule mining. We also demonstrate a way in which background genomic knowledge can be used to prune the search space and reduce the number of derived rules. /284 To avoid this discretization step, the authors of (Georgii, 2005) investigate the use of quantitative association rules, i.e., association rules that operate directly on numeric data and can represent the cumulative effects of variables. /284 Association Rule: A rule, such as an implication or correlation, which relates elements co-occurring within a dataset. /291 Types of learning desired include classification learning to help with classifying unseen examples, association learning to determine any association among features (largely statistical), and clustering to seek groups of examples that belong together. /301 Instance (case, record): A single object of the world from which a model will be learned, or on which a model will be used. /349 The ever-increasing number of electronic patient records, specialized medical databases, and various computer-stored clinical files provides an unprecedented opportunity for automated and semiautomated discovery of patterns, trends, and associations in medical data. /351 Several methods are available for the integration of domain knowledge into DM in various applications, using different representation methods.
For example, ontologies are used in the preprocessing phase to reduce dimensionality, in the mining phase to improve clustering and association rules, and in the post-processing phase to evaluate discovered patterns (Maedche et al., 2003). /355 This is an unusual finding, since several medical studies demonstrate an association between OSA and diabetes mellitus (Punjabi & Beamer, 2005). /370 According to these questions, various data mining tasks can be formulated. The descriptive tasks, associations or segmentation (subgroup discovery), are used if the main purpose of the data mining is to find some relation between attributes or examples. /382 The best-known GUHA procedure is ASSOC (Hájek & Havránek, 1978), which mines for association rules corresponding to various relations between two Boolean attributes. These rules have much stronger expressive power than "classical" association rules using confidence and support; see e.g. (Agrawal et al., 1996). /383 The STULONG data have been used during the Discovery Challenge workshops organized at the ECML conferences in 2002, 2003 and 2004. About 25 different analyses, reported at these workshops, covered a variety of data mining tasks: segmentation, subgroup discovery, mining for association rules of different types, mining for sequential and episode rules, classification, and regression. /394 Association Rules: A relation between two Boolean attributes. It is defined by a condition on the contingency table of these attributes. Usually, this condition is given by lower thresholds for confidence and support. /397 JI [08] Data Mining and Medical Knowledge Management, Cases and Applications, Petr Berka, Jan Rauch, Djamel Abdelkader Zighed, IGI Global, 2009. The important point here is that the association of a proximity relationship over the domain of a variable can be seen as a very creative activity.
More importantly, the choice of proximity relationship can play a significant role in the resolution of conflicting information. /9 Data mining analyzes data previously collected; it is non-experimental. There are several different data mining products. The most common are conditional rules or association rules. /31 At first glance, association rules seem to imply a causal or cause-effect relationship. That is: a customer’s purchase of both sausage and beer causes the customer to also buy hamburger. /31 The most popular market basket association rule development method identifies rules of particular interest by screening for joint probabilities (associations) above a specified threshold. /32 Association rules are used to aid in making retail decisions. However, simple association rules may lead to errors. Errors might occur either if causality is recognized where there is no causality, or if the direction of the causal relationship is wrong [20, 35]. /33 The problem of entity association is at the core of information mining techniques. In this work we propose an approach that links the similarity of two knowledge entities to the effort required to fuse them into one. This is implemented as an iterative updating process. /123 Associations between these patterns are then found by applying a data mining technique based on rough set analysis. Further work and applications to discover knowledge about patterns in sequences are currently in progress. /159 Abstract. We say that there is an association between two sets of items when the sets are likely to occur together in transactions. In information retrieval, an association between two keyword sets means that they co-occur in a record or document. In databases, an association is a rule latent in the databases whereby an attribute set can be inferred from another. Generally, the number of associations may be large, so we look for those that are particularly strong.
Maximal association rules were introduced by [3, 4], and there is only one maximal association. Rough set theory has been used successfully for data mining. By using this theory, rules that are similar to maximal associations can be found. However, we show that the rough set approach to discovering knowledge is much simpler than the maximal association method. /163 An association is said to exist between two sets of items when a transaction containing one set is likely to also contain the other. /163 In a database like this, the number of associations may be large. For example, from a record “Canada, Iran, USA, crude, ship” we may discover a number of associations such as /164 Standard association rules are based on the notion of frequent sets of attributes which appear in many documents. We are concerned here with maximal association rules, which are based on frequent maximal sets of attributes. /164 Association rules are based on the notion of frequent sets of attributes which appear in many documents. Maximal association rules are based on frequent maximal sets of attributes which appear maximally in many documents. The regular association rule X → Y means that if X then Y (with some confidence). /182 Association rules (ARs) emerged in the domain of market basket analysis and provide a convenient and effective way to identify and represent certain dependencies between attributes in a database. /203 The idea of association rule (AR) mining already dates back to Hájek et al. (see e.g. [17, 18, 19]). Its application to market basket analysis gained high popularity soon after its re-introduction by Agrawal et al. [1] at the beginning of the 1990s. The straightforwardness of the underlying ideas, as well as the increasing availability of transaction data from shops, certainly contributed to this end. /205 Association rules can be rated by a number of quality measures, among which support and confidence stand out as the two essential ones.
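Concretely, the two essential measures just mentioned can be computed directly from a transaction database. The following is a minimal sketch; the toy transactions are hypothetical, for illustration only, and not taken from any of the surveyed books:

```python
# A minimal sketch of the two essential quality measures for a rule
# X => Y: support and confidence. The toy transactions below are
# hypothetical, for illustration only.

transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "beer"},
    {"bread", "milk"},
]

def support(itemset, db):
    """Proportion of transactions that contain every item of `itemset`."""
    return sum(1 for t in db if itemset <= t) / len(db)

def confidence(antecedent, consequent, db):
    """Estimated conditional probability P(consequent | antecedent)."""
    return support(antecedent | consequent, db) / support(antecedent, db)

print(support({"bread", "butter"}, transactions))       # 0.5
print(confidence({"bread"}, {"butter"}, transactions))  # ~0.667
```

Note that confidence is exactly the conditional-probability estimate discussed later in these excerpts, while support is the joint probability over the whole database.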
/206 Moreover, as information structures representing associations such as synonymy, specification, and generalization between linguistic terms seem to pop up in many domains that require the semantic representation of language (such as information retrieval, and natural language processing techniques like machine translation) and under a variety of different names (apart from taxonomies one also encounters thesauri and ontologies), this application is steadily gaining in importance. /210 Data mining includes several kinds of technologies such as association rule analysis, classification, clustering, sequential patterns, etc. In this chapter, we focus on association rule mining since it has been applied in many fields and is considered an important method for discovering associations among data [23]. /225 An association rule is an expression X ⇒ Y meaning that if X occurs, then Y occurs at the same time, where X and Y are sets of items, X ⊂ I, Y ⊂ I, and X ∩ Y = Ø. /225 A lot of research has been carried out in the past on using association rules to build more accurate classifiers. The idea behind these integrated approaches is to focus on a limited subset of association rules. /253 As mentioned above, association and classification rules are the two main learning algorithms in associative classification. The study of association rules is focused on using exhaustive search to find all rules in the data that satisfy user-specified minimum support and minimum confidence criteria. On the other hand, classification rules aim to discover a small set of rules that form an accurate classifier. /254 Association rules will search globally for all rules that satisfy minimum support and minimum confidence thresholds. They will therefore contain the full set of rules, which may incorporate important information. The richness of the rules gives this technique the potential of reflecting the true classification structure in the data [17].
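The "exhaustive search" view of class association rules described above can be sketched in a few lines: enumerate every candidate antecedent and keep rules whose consequent is a class label and which clear the minimum support and confidence criteria. This is a hedged illustration of the idea, not the CBA algorithm itself; the records and thresholds are hypothetical:

```python
# Brute-force sketch of class association rule (CAR) mining:
# find all rules antecedent => class_label meeting user-specified
# minimum support and minimum confidence. Data is illustrative only.
from itertools import combinations

# Each record: (set of items, class label).
records = [
    ({"high_bp", "smoker"}, "risk"),
    ({"high_bp", "smoker"}, "risk"),
    ({"high_bp"}, "no_risk"),
    ({"smoker"}, "risk"),
    ({"exercise"}, "no_risk"),
]

def mine_cars(data, min_sup=0.2, min_conf=0.6):
    n = len(data)
    items = sorted(set().union(*(r[0] for r in data)))
    rules = []
    for k in range(1, len(items) + 1):
        for ante in map(set, combinations(items, k)):
            # Labels of all records covered by the antecedent.
            cover = [lab for its, lab in data if ante <= its]
            if len(cover) / n < min_sup:      # antecedent too rare
                continue
            for label in set(cover):
                sup = cover.count(label) / n       # rule support
                conf = cover.count(label) / len(cover)  # rule confidence
                if sup >= min_sup and conf >= min_conf:
                    rules.append((frozenset(ante), label, sup, conf))
    return rules

for ante, label, sup, conf in mine_cars(records):
    print(sorted(ante), "=>", label, f"sup={sup:.2f} conf={conf:.2f}")
```

The exhaustive enumeration makes the contrast with classification-rule learners explicit: the full rule set is found first, and a classifier-building step (as in CBA) would then select a small accurate subset.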
/255 The research presented in this chapter focused on the integration of supervised and unsupervised learning. In doing so, a modified version of the CBA algorithm, which can be used to build classifiers based on association rules, has been proposed. /264 JI [10] Intelligent Data Mining, Techniques and Applications, Da Ruan, Guoqing Chen, Etienne E. Kerre, Geert Wets, Springer, 2005. Two key statistics in association rule mining are support and confidence, which measure, respectively, the number of cases (i.e., the database transactions in association rule mining) that contain a rule’s antecedent and consequent parts, and the number of cases that contain the consequent part among those containing the antecedent part. /5 Different data mining tasks are discussed in the literature [4, 63, 50], such as regression, classification, association rule mining, and clustering. /54 Many studies have shown the limits of the support/confidence framework used in Apriori-like algorithms to mine association rules. /75 This approach is based on the search for classification rules, which are association rules whose consequent is a class label (that is to say, class association rules). /76 An association rule is a rule A → B, where A and B are two sets of items (also called itemsets) such that A ≠ ∅, B ≠ ∅, and A ∩ B = ∅, meaning that, given a database D of transactions (where each transaction is a set of items), whenever a transaction T contains A, then T probably contains B also [3]. /76 Two other differences should be noted between associative classification and decision trees. The first one is that association rules were built primarily for Boolean data [5], where only the presence of attributes is of interest to the user. The second, linked to the previous one, is that the usual measures of the interest of a rule, such as the confidence or the lift, take into account only the cases covered by the rule.
Thereafter, various algorithms have been developed to build association rules between categorical and numerical attributes (see, for example, [21, 25]). /77 An association rule is defined by a database DB, a nonempty itemset A called the antecedent, and a nonempty itemset B called the consequent, such that A ∩ B = ∅. We denote a rule by A →_DB B, or simply A → B when there is no possible confusion. /84 An association rule on a given database is described by two itemsets. One can also speak about the contingency table of an association rule, which leads us to the notion of a descriptor system. /85 JI [11] Data Mining, Special Issue in Annals of Information Systems, Robert Stahlbock, Sven F. Crone, Stefan Lessmann, Springer, Nov. 2009. David W. Cheung’s technique for updating association rules in large databases [5]. /102 JI [12] Data Mining In Time Series Databases, Mark Last, Abraham Kandel, Horst Bunke, World Scientific Publishing, 2004. Problem statement: The problem of mining association rules over market basket analysis was introduced in (Agrawal, Imielinski, & Swami, 1993; Agrawal & Srikant, 1994). The problem consists of finding associations between items or itemsets in transactional data. The data is typically retail sales in the form of customer transactions, but can be any data that can be modeled into transactions. /33 It is known that algorithms for discovering association rules generate an overwhelming number of those rules. /33 Since the introduction of association rules a decade ago and the launch of research into efficient frequent itemset mining, the development of effective approaches for mining large transactional databases has been the focus of many research studies. /53 This chapter examines the problem of mining association patterns (Agrawal, Imielinski & Swami, 1993) from data sets with skewed support distributions. Most of the algorithms developed so far rely on the support-based pruning strategy to prune the combinatorial search space.
/58 A better approach would be to have a measure that can efficiently identify useful patterns even at low levels of support and can be used to automatically remove spurious patterns during the association mining process. Omiecinski (2003) recently introduced a measure called all-confidence as an alternative to the support measure. /59 Besides all-confidence (Omiecinski, 2003), other measures of association have been proposed to extract interesting patterns from large data sets. For example, Brin, Motwani, and Silverstein (1997) introduced the interest measure and the χ2 test to discover patterns containing highly dependent items. However, these measures do not possess the desired anti-monotone property. /60 This chapter introduces a data mining method for the discovery of association rules from images of scanned paper documents. /176 JI [13] Data Mining Patterns, New Methods And Applications, Pascal Poncelet, Florent Masseglia, Maguelonne Teisseire, IGI Global, 2008. Data mining is a process concerned with uncovering patterns, associations, anomalies, and statistically significant structures in data (Fayyad et al., 1996). It typically refers to the case where the data is too large or too complex to allow either manual analysis or analysis by means of simple queries. /49 It is worth noting here that there is another class of algorithms that may also be used to deliver nuggets: association rule algorithms (Agrawal, Imielinski & Swami, 1993; Agrawal, Mannila, Srikant, Toivonen & Verkamo, 1996). They were developed for transaction data (also known as basket data). This type of data contains information on transactions, for example, showing items that have been purchased together. Association rule mining algorithms deliver a set of association rules, often containing all associations between items above certain support and confidence thresholds.
The association rules are generally of the form “customers that purchase bread and butter also get milk, with 98% confidence.” This type of rule is not constrained to have a particular value as output, or indeed to refer to any particular attribute. Delivering all association rules in transactional data is a suitable approach, since transactional data tends to contain few associations. Classification datasets, however, tend to contain many associations, so delivering all association rules for a classification dataset results in output of overwhelming size. Also, classification datasets often contain many numeric continuous attributes, and association rule induction algorithms are not designed to cope with this type of data. Therefore, although association rules can be used for classification (Bayardo, 1997; Liu, Hsu & Ma, 1998), and even for partial classification or nugget discovery (Ali et al., 1997), work is required to adapt the association algorithms to cope with classification data, and with the problem of partial classification or nugget discovery. /74 Also, the following association rule algorithms were chosen: GRI: Generalised Rule Induction (Mallen & Bramer, 1995) is described as an association rule algorithm, although it could also be considered a partial classification algorithm. It builds a table of the best N association rules, as ranked by the J measure, where N is a parameter set by the user. In GRI the output attribute can be chosen, and each rule produced can be used as a nugget describing that output. The rules contain binary partitions for numeric attributes and tests on a simple value for categorical attributes. Apriori: The Apriori algorithm (Agrawal et al., 1993, 1996) is the most prominent association rule algorithm. Pre-discretisation of numeric attributes is necessary, since the algorithm can only handle categorical attributes. A simple equal width discretisation scheme was used for this.
The output of this algorithm is not constrained to rules for a particular attribute; hence only the nuggets relating to the class under scrutiny need to be analysed for the task of nugget discovery. The Apriori rules contain simple value tests for categorical attributes. /89 The data clustering approach was implemented in association with hierarchical clustering and graph-theoretical techniques, and the network performance is illustrated using several benchmark problems. /231 Techniques for data mining include mining association rules, data classification, generalization, clustering, and searching for patterns (Chen, Han, & Yu, 1996). The focus of data mining is to reveal information that is hidden and unexpected, as there is little value in finding patterns and relationships that are already intuitive. By discovering hidden patterns and relationships in the data, data mining enables users to extract greater value from their data than simple query and analysis approaches. /262 There are many data mining techniques proposed in the literature. The most common ones, which will be included in this section, are association rules (Agrawal & Srikant, 1994), sequential patterns (Agrawal & Srikant, 1995; Srikant & Agrawal, 1996; Zaki, 1998), classification (Agrawal, Ghosh, Imielinski, Iyer, & Swami, 1992; Alsabti, Ranka, & Singh, 1998; Mehta, Agrawal & Rissanen, 1996; Shafer, Agrawal, & Mehta, 1996), and clustering (Aggarwal, Procopiuc, Wolf, Yu, & Park, 1999; Agrawal, Gehrke, Gunopulos, & Raghavan, 1998; Cheng, Fu & Zhang, 1999; Guha, Rastogi & Shim, 1998; Ng & Han, 1994; Zhang, Ramakrishnan & Livny, 1996). /269 An association rule is a rule that implies certain association relationships among a set of objects (such as “occur together” or “one implies the other”) in a database. Given a set of transactions, where each transaction is a set of items, an association rule is an expression of the form X ⇒ Y, where X and Y are sets of items.
The intuitive meaning of such a rule is that transactions of the database which contain X tend also to contain Y. An example of an association rule is “25% of transactions that contain instant noodles also contain Coca Cola; 3% of all transactions contain both of these items”. Here 25% is called the confidence of the rule and 3% the support of the rule. The problem is to find all association rules that satisfy user-specified minimum support and minimum confidence constraints. /269 JI [14] Data Mining, A Heuristic Approach, Hussein Aly Abbass, Ruhul Amin Sarker, Charles S. Newton, Idea Group Publishing, 2002. In the paper “Latent Semantic Space for Web Clustering” by I-Jen Chiang, T.Y. Lin, Hsiang-Chun Tsai, Jau-Min Wong, and Xiaohua Hu, latent semantic space, in the form of a geometric structure in combinatorial topology and a hypergraph view, has been proposed for unstructured document clustering. Their clustering work is based on a novel view that the term associations of a given collection of documents form a simplicial complex, which can be decomposed into connected components at various levels. An agglomerative method for finding geometric maximal connected components for document clustering is proposed. Experimental results show that the proposed method can effectively solve polysemy and term dependency problems in the field of information retrieval. /vi The chapter “Naïve Rules Do Not Consider Underlying Causality” by Lawrence J. Mazlack argues that it is important to understand when association rules have causal foundations, in order to avoid naïve decisions and to increase the perceived utility of rules with causal underpinnings.
/viii In his first paper, “Definability of Association Rules and Tables of Critical Frequencies,” Jan Rauch presents a new intuitive criterion of definability of association rules based on tables of critical frequencies, which are introduced as a tool for avoiding complex computation related to the association rules corresponding to statistical hypothesis tests. /viii In the paper “Using Association Rules for Classification from Databases Having Class Label Ambiguities: A Belief Theoretic Method” by S.P. Subasinghua, J. Zhang, K. Premaratae, M.L. Shyu, M. Kubat, and K.K.R.G.K. Hewawasam, a classification algorithm that combines a belief theoretic technique and a partitioned association mining strategy is proposed, to address both the presence of class label ambiguities and the unbalanced distribution of classes in the training data. /ix Classification Association Rule Mining (CARM) is a technique that utilizes association mining to derive classification rules. A typical problem with CARM is the overwhelming number of classification association rules that may be generated. The paper “Mining Efficiently Significant Classification Associate Rules” by Yanbo J. Wang, Qin Xin, and Frans Coenen addresses the issue of how to efficiently identify significant classification association rules for each predefined class. Both theoretical and experimental results show that the proposed rule mining approach, which is based on a novel rule scoring and ranking strategy, is able to identify significant classification association rules in a time-efficient manner. /x Association rules [3] describe the co-occurrence among data items in a large amount of collected data. They have been profitably exploited for classification purposes [8, 11, 19]. In this case, rules are called classification rules and their consequent contains the class label. Classification rule mining is the discovery of a rule set in the training dataset to form a model of the data, also called a classifier.
The classifier is then used to classify new data for which the class label is unknown. /1 Data items in an association rule are unordered. However, in many application domains (e.g., web log mining, DNA and proteome analysis) the order among items is an important feature. /1 To deal with the generation of a large solution set, in the context of association rule mining a significant effort has been devoted to defining concise representations for frequent itemsets and association rules. /27 For association rules, concise representations have been proposed based on closed and generator itemsets [22, 23, 33]. In the context of associative classification, compact representations for associative classification rules have been proposed based on generator itemsets [7] and free-sets [15]. /28 Let us recall some examples to illustrate the main intuition. The association that consists of “wall” and “street” denotes some financial notions that have meaning beyond the two nodes, “wall” and “street”. This is similar to the notion of an open segment (v0, v1) that represents a one-dimensional geometric object, a 1-simplex, which carries information beyond the two end points. /62 The notion of association rules was introduced by Agrawal et al. [1] and has been demonstrated to be useful in several domains [4, 5], such as retail sales transaction databases. In this theory, two standard measures, called support and confidence, are often used. For documents, the orders of keywords or directions of rules are not essential. Our focus will be on the support; a set of items that meets the support threshold is often referred to as a frequent itemset; we will call them associations (undirected association rules) to indicate the emphasis on their meaning rather than the phenomenon of frequency. /62 A lot of data mining research has focused on the development of algorithms for performing different tasks, i.e.
clustering, association, and classification [1,2,5,13,15,16,19,20,24,28,30], and on their applications to diverse domains. /166 The Common Warehouse Model for Data Mining (CWM DM) [22], proposed by the Object Management Group, introduces a CWM Data Mining metamodel integrated by the following conceptual areas: a core Mining metamodel and metamodels representing the data mining subdomains of Clustering, Association Rules, Supervised, Classification, Approximation, and Attribute Importance. /166 The goal of this step is to determine relationships among variables. In this phase, both statistical methods (e.g. discriminant analysis, clustering, and regression analysis) and data-oriented methods (e.g. neural networks, decision trees, association rules) can be used. /167 The Task Model defines all the data mining tasks to be done in the project. The approach here is that a task model is first defined in terms of types of problems (e.g. clustering instead of K-means, association instead of Apriori, ...) and then refined in some iterations by a data mining expert. /170 This chapter proposes a fuzzy data-mining algorithm for extracting both association rules and membership functions from quantitative transactions. /179 Data mining is most commonly used in attempts to induce association rules from transaction data. Transaction data in real-world applications, however, usually consist of quantitative values. Designing a sophisticated data-mining algorithm able to deal with various types of data presents a challenge to workers in this research field. /179 In [4], we proposed a mining approach that integrated fuzzy-set concepts with the Apriori mining algorithm [1] to find interesting itemsets and fuzzy association rules in transaction data with quantitative values. /180 Wang and Bridges used GAs to tune membership functions for intrusion detection systems based on the similarity of association rules [11].
Kaya and Alhajj [6] proposed a GA-based clustering method to derive a predefined number of membership functions for getting a maximum profit within an interval of user-specified minimum support values. /180 A new algorithm named Compressed Binary Mine (CBMine) for mining association rules and frequent patterns is presented in this chapter. /198 Mining association rules in transaction databases has been demonstrated to be useful and technically feasible in several application areas [14, 18, 21], particularly in retail sales, and it is becoming more important every day in applications that use document databases [11, 16, 17]. Although research in this area has been going on for more than a decade, mining such rules is still one of the most popular methods in knowledge discovery and data mining. /198 Various algorithms have been proposed to discover large itemsets [2, 3, 6, 9, 11, 19]. Of all of them, Apriori has had the biggest impact [3], since its general conception has been the basis for the development of new algorithms to discover association rules. /198 The discovery of large itemsets (the first step of the process) is computationally expensive. The generation of association rules (the second step) is the easier of the two. The overall performance of mining association rules depends on the first step; for this reason, the comparative results that we present for our algorithm cover only the first step. /198 An association rule is an implication of the form X ⇒ Y, where X ⊂ I, Y ⊂ I, and X ∩ Y = ∅. The association rule X ⇒ Y holds in the database D with a certain quality and a support s, where s is the proportion of transactions in D that contain X ∪ Y. Some quality measures have been proposed, although these are not considered in this work. /198 The first step in the discovery of association rules is to find each set of items (called an itemset) that has a co-occurrence rate above the minimum support.
An itemset with at least the minimum support is called a large itemset or a frequent itemset. In this chapter, as in others, the term frequent itemset will be used. The size of an itemset represents the number of items contained in the itemset, and an itemset containing k items is called a k-itemset. For example, beer, diaper can be a frequent 2-itemset. If an itemset is frequent and no proper superset is frequent, we say that it is a maximally frequent itemset. /199 Na¨ıve association rules may result if the underlying causality of the rules is not considered. The greatest impact on the decision value quality of association rules may come from treating association rules as causal statements without understanding whether there is, in fact, underlying causality. /213 One of the cornerstones of data mining is the development of association rules. Association rules greatest impact is in helping to make decisions. One measure of the quality of an association rule is its relative decision value. Association rules are often constructed using simplifying assumptions that lead to na¨ıve results and consequently na¨ıve and often wrong decisions. Perhaps the greatest area of concern about the decision value is treating association rules as causal statements without understanding whether there is, in fact, underlying causality. /213 Data mining analyzes data previously collected; it is non-experimental. There are several different data mining products. The most common are conditional rules or association rules. Conditional rules are most often drawn from induced trees while association rules are most often learned from tabular data. IF Age < 20 THEN vote frequency is: often with {belief = high} IFAge is old THEN Income < $10,000 with {belief = 0.8} Fig. 2. Conditional rules 220 L.J. 
Customers who buy beer and sausage also tend to buy hamburger with {confidence = 0.7} in {support = 0.15}; customers who buy strawberries also tend to buy whipped cream with {confidence = 0.8} in {support = 0.2} (Fig. 3. Association rules) /219 At first glance, conditional and association rules seem to imply a causal or cause-effect relationship. That is: a customer's purchase of both sausage and beer causes the customer to also buy hamburger. /219 The association rule's confidence measure is simply an estimate of conditional probability. The association rule's support indicates how often the joint occurrence happens (the joint probability over the entire data set). The strength of any causal dependency may be very different from that of a possibly related association value. /220 The most popular market basket association rule development method identifies rules of particular interest by screening for joint probabilities (associations) above a specified threshold. /221 Association rules are used to aid in making retail decisions. However, simple association rules may lead to errors. Errors might occur either if causality is recognized where there is no causality, or if the direction of the causal relationship is wrong [18, 33]. /221
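The two measures just described can be computed directly: confidence as an estimate of the conditional probability, support as the joint frequency over the data set. A minimal sketch (the basket data and numbers are hypothetical, and, as the text stresses, neither measure says anything about causality):

```python
def rule_stats(X, Y, transactions):
    """Support = joint frequency of X ∪ Y; confidence = estimate of P(Y | X)."""
    n = len(transactions)
    n_x  = sum(1 for t in transactions if X <= t)          # transactions with X
    n_xy = sum(1 for t in transactions if (X | Y) <= t)    # transactions with X and Y
    support = n_xy / n
    confidence = n_xy / n_x if n_x else 0.0
    return support, confidence

# Invented baskets: 3 with all three items, 1 with beer+sausage only, 2 unrelated
baskets = [{"beer", "sausage", "hamburger"}] * 3 + [{"beer", "sausage"}] + [{"bread"}] * 2
s, c = rule_stats({"beer", "sausage"}, {"hamburger"}, baskets)
print(s, c)  # support 0.5, confidence 0.75
```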
Many (if not all) DM techniques can be viewed in terms of the data compression approach. For example, association rules and pruned decision trees can be viewed as ways of providing compression of parts of the data. Clustering can also be considered a way of compressing the dataset. There is a connection with the Bayesian theory for modeling the joint distribution: any compression scheme can be viewed as providing a distribution on the set of possible instances of the data. /255 Piatetsky-Shapiro in Wu et al. [40] gives a good example that characterizes the whole area of current DM research: “we see many papers proposing incremental refinements in association rules algorithms, but very few papers describing how the discovered association rules are used”. DM is a fundamentally application-oriented area motivated by business and scientific needs to make sense of mountains of data [40]. A DMS is generally used to support or perform some task(s) for human beings in an organizational environment (see Fig. 8), both having their own desires regarding the DMS. Further, the organization operates in its own environment, which has its own interests regarding the DMS, for example that people's privacy is not violated. /267 Finding useful rules is an important task of knowledge discovery in data.
Most researchers on knowledge discovery focus on techniques for generating patterns, such as classification rules and association rules, from a data set. /289 This chapter concerns theoretical foundations of association rules. We deal with more general rules than the classical association rules related to market baskets. Various theoretical aspects of association rules are introduced and several classes of association rules are defined. Implicational and double implicational rules are examples of such classes. It is shown that there are practically important theoretical results related to particular classes. The results concern deduction rules in logical calculi of association rules, fast evaluation of rules corresponding to statistical hypothesis tests, missing information, and definability of association rules in classical predicate calculi. /314 The goal of this chapter is to contribute to the theoretical foundations of data mining. We deal with association rules. We are, however, interested in more general rules than the classical association rules [1] inspired by market baskets. We understand the association rule as an expression ϕ ≈ ψ where ϕ and ψ are Boolean attributes derived from columns of an analysed data matrix. The intuitive meaning of the association rule ϕ ≈ ψ is that the Boolean attributes ϕ and ψ are associated in a way corresponding to the symbol ≈. The symbol ≈ is called a 4ft-quantifier. It is associated with a condition related to the (fourfold) contingency table of ϕ and ψ in the analysed data matrix. /314 Various aspects of association rules of the form ϕ ≈ ψ were studied: • Logical calculi whose formulae correspond to association rules were defined. Some of these calculi are straightforward modifications of classical predicate calculi [2]; others are simpler [12]. • Logical aspects of calculi of association rules, e.g. decidability, deduction rules, and definability in classical predicate calculus, were investigated, see namely [2, 12, 13].
• Association rules that correspond to statistical hypothesis tests were defined and studied [2]. • Several approaches to the evaluation of association rules in data with missing information were investigated [2, 7]. • Software tools for mining all kinds of such association rules were implemented and applied [3, 5, 15]. /316 This chapter concerns theoretical aspects of association rules. It was shown that most of the theoretically interesting and practically important results concerning association rules are related to classes of association rules. The goal of this chapter is to give an overview of important classes of association rules and their properties. Both already published results are mentioned and new results are introduced. /316 Broadly, CARM algorithms can be categorised into two groups according to the way that the CRs are generated: • Two-stage algorithms, where a set of CARs is produced first (stage 1) and then pruned and placed into a classifier (stage 2). Examples of this approach include CBA [38] and CMAR [36]. CBA (Classification Based on Associations), developed by Liu et al. in 1998, is an Apriori [2] based CARM algorithm, which (1) applies its CBA-RG procedure for CAR generation; and (2) applies its CBA-CB procedure to build a classifier based on the generated CARs. CMAR (Classification based on Multiple Association Rules), introduced by Li et al. in 2001, is similar to CBA but generates CARs through an FP-tree [27] based approach. • Integrated algorithms, where the classifier is produced in a single processing step. Examples of this approach include TFPC [15,18] (TFPC may be obtained from http://www.csc.liv.ac.uk/~frans/KDD/Software), and induction systems such as FOIL [46], PRM, and CPAR [53]. TFPC (Total From Partial Classification), proposed by Coenen et al. in 2004, is an Apriori-TFP [16] based CARM algorithm, which generates CARs by efficiently constructing both P-tree and T-tree set enumeration tree structures. FOIL
(First Order Inductive Learner) is an inductive learning algorithm for generating CARs, developed by Quinlan and Cameron-Jones in 1993. This algorithm was later developed by Yin and Han to produce the PRM (Predictive Rule Mining) CAR generation algorithm. PRM was then further developed by Yin and Han in 2003 to produce CPAR (Classification based on Predictive Association Rules). /450 We present here an abstract model in which the data preprocessing and data mining proper stages of the Data Mining process are described as two different types of generalization. In the model, the data mining and data preprocessing algorithms are defined as certain generalization operators. We use our framework to show that only three Data Mining operators (classification, clustering, and association) are needed to express all Data Mining algorithms for classification, clustering, and association, respectively. We are also able to show formally that the generalization that occurs in the preprocessing stage is different from the generalization inherent to the data mining proper stage. /469 This gives us only three data mining generalization operators to consider: classification, clustering, and association. /476 Data mining includes a number of different tasks, such as association rule mining, classification, and clustering. This paper studies how to learn support vector machines. /518 We use a modified association rule mining (ARM) technique to extract the interesting rules from the training data set and use a belief-theoretic classifier based on the extracted rules to classify the incoming feature vectors. The ambiguity-modelling capability of belief theory enables our classifier to perform better in the presence of class label ambiguities. /539 Classification Association Rule Mining (CARM) is the technique that utilizes association mining to derive classification rules.
A typical problem with CARM is the overwhelming number of classification association rules that may be generated. The paper “Mining Efficiently Significant Classification Association Rules” by Yanbo J. Wang, Qin Xin, and Frans Coenen addresses the issue of how to efficiently identify significant classification association rules for each predefined class. Both theoretical and experimental results show that the proposed rule mining approach, which is based on a novel rule scoring and ranking strategy, is able to identify significant classification association rules in a time-efficient manner. /539 JI [15] Data Mining, Foundations and Practice, Tsau Young Lin, Ying Xie, Anita Wasilewska, Churn-Jung Liau, Springer, 2008. Dong-Peng et al. described one such application where the implementations of decision trees and association rules in WEKA [9] are applied in a risk analysis problem in banking, for which the data was suitably prepared [10]. Another example in this volume is the paper by Giuffrida et al., in which the Apriori algorithm for association rule mining is used on an online advertising personalization problem [11]. /3 The applications cover tasks such as clustering (e.g., [15]), classification (e.g., [13,14]), regression (e.g., [6]), information retrieval (e.g., [8]) and extraction (e.g., [7]), association mining (e.g., [10,11]) and sequence mining (e.g., [12,16]).
Many research fields are also covered, including neural networks (e.g., [5]), machine learning (e.g., SVM [13]), data mining (e.g., association rules [10,11]), statistics (e.g., logistic [13] and linear regression [6]) and evolutionary computation (e.g., [4,14]). The wider the range of tools that is mastered by a data analyst, the better the results he/she may obtain. /4 The purpose of association analysis is to uncover hidden associations and useful rules contained in the database; these rules can then be used to infer information about unknown data from what is already known [6]. /38 At present, there is a great deal of research that has combined the decision tree method with other methods such as association rules, Bayesian methods, neural networks, and support vector machines [10]. /42 Moreover, a high-support product may have some other temporal restrictions (e.g., it may go out of the market); thus, it may be necessary to dismiss association rules associated with it. They introduce the concept of temporal support. /53 They propose a method to discover cyclic association rules. The same problem was also considered by others: Verma and Vyas [5] propose an efficient algorithm to discover calendar-based association rules; the rule “egg ⇒ coffee” has a strong support in the early morning but much smaller support during the rest of the day. Zimbrao et al. [6] extend the seasonality concept just mentioned to also include the concept of product lifespan during rule generation and application. /53 Additional work includes enhancing the actionability of pattern mining in traditional data mining techniques such as association rules [6], multi-objective optimization in data mining [23], role model-based actionable pattern mining [27], cost-sensitive learning [28], and postprocessing [29]. /106 JI [17] Applications of Data Mining in E-Business and Finance, Carlos Soares, Yonghong Peng, Jun Meng, Takashi Washio, Zhi-Hua Zhou, The authors and IOS Press, 2008.
Association rules: Association rules with high confidence and support define a different kind of pattern. As before, records that do not follow these rules are considered outliers. The power of association rules is that they can deal with data of different types. However, Boolean association rules do not provide enough quantitative and qualitative information. Ordinal association rules, defined by (Maletic and Marcus, 2000, Marcus et al., 2001), are used to find rules that give more information (e.g., ordinal relationships between data elements). The ordinal association rules yield special types of patterns, so this method is, in general, similar to the pattern-based method. This method can be extended to find other kinds of associations between groups of data elements (e.g., statistical correlations). /24 The term association rule was first introduced by (Agrawal et al., 1993) in the context of market-basket analysis. Association rules of this type are also referred to in the literature as classical or Boolean association rules. The concept was extended in other studies and experiments. Of particular interest to this research are the quantitative association rules (Srikant et al., 1996) and ratio-rules (Korn et al., 1998) that can be used for the identification of possibly erroneous data items with certain modifications. In previous work we argued that another extension of the association rule, the ordinal association rule (Maletic and Marcus, 2000, Marcus et al., 2001), is more flexible, general, and very useful for identification of errors. Since this is a recently introduced concept, it is briefly defined. /26 Although various discretization methods are available, they are tuned to different types of learning, such as decision tree learning, decision rule learning, naive-Bayes learning, Bayes network learning, clustering, and association learning. Different types of learning have different characteristics and hence require different strategies of discretization.
It is important to be aware of the learning context when designing or employing discretization methods. It is unrealistic to pursue a universally optimal discretization approach that can be blind to its learning context. /111 The traditional application of association rules is market basket analysis, see for instance (Brijs et al., 1999). Since then, the technique has been applied to other kinds of data, such as: • Census data (Brin et al., 1997A, Brin et al., 1997B) • Linguistic data for writer evaluation (Aumann and Lindell, 2003) • Insurance data (Castelo and Giudici, 2003) • Medical diagnosis (Gamberger et al., 1999) One of the first generalizations, which still has applications in the field of market basket analysis, is the consideration of temporal or sequential information, such as the date of purchase. Applications include: • Market basket data (Agrawal and Srikant, 1995) • Causes of plan failures (Zaki, 2001) • Web personalization (Mobasher et al., 2002) • Text data (Brin et al., 1997A, Delgado et al., 2002) • Publication databases (Lee et al., 2001) /307 Neural networks have been used extensively in data mining for a wide variety of problems in business, engineering, industry, medicine, and science. In general, neural networks are good at solving common data mining problems such as classification, prediction, association, and clustering. This section provides a short overview of the application areas. /436 With association techniques, we are interested in the correlation or relationship among a number of variables or objects. Association is used in several ways. One use, as in market basket analysis, is to help identify the consequent items given a set of antecedent items. An association rule in this way is an implication of the form IF X THEN Y, where X is a set of antecedent items and Y is the set of consequent items.
This type of association rule has been used in a variety of data mining tasks including credit card purchase analysis, merchandise stocking, and insurance fraud investigation. /436 Association/Pattern Recognition applications: defect recognition (Kim and Kumara, 1997); facial image recognition (Dai and Nakano, 1998); frequency assignment (Salcedo-Sanz et al., 2004); graph or image matching (Suganthan et al., 1995; Pajares et al., 1998); image restoration (Paik and Katsaggelos, 1992; Sun and Yu, 1995); image segmentation (Rout et al., 1998; Wang et al., 1992); landscape pattern prediction (Tatem et al., 2002); market basket analysis (Evans, 1997); object recognition (Huang and Liu, 1997; Young et al., 1997; Li and Lee, 2002); on-line marketing (Changchien and Lu, 2001); pattern sequence recognition (Lee, 2002); semantic indexing and searching (Chen et al., 1998) /437 We present three important classes of neural network models: feedforward multilayer networks, Hopfield networks, and Kohonen's self-organizing maps, which are suitable for a variety of problems in pattern association, pattern classification, prediction, and clustering. /438 Data Mining, as presently understood, is a broad term, including the search for “association rules”, classification, regression, clustering, and the like. Here we shall restrict ourselves to the search for “rules” in a rather general sense, namely general dependencies valid in given data and expressed by formulas of a formal logical language. /541 The study of logical aspects of Data Mining is interesting and useful: it gives an exact abstract approach to “association rules” based on the notion of (generalized) quantifiers, important classes of quantifiers, deductive properties of associations expressed using such quantifiers, as well as other results not mentioned here (e.g. results on computational complexity). Hopefully the present chapter will help the reader to enjoy this.
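The quantifier-based view just described evaluates a rule ϕ ≈ ψ through the fourfold contingency table of ϕ and ψ in the data matrix. A minimal sketch, using a founded-implication-style quantifier as one concrete example; the threshold names (p, base) and the toy data are illustrative assumptions, not taken from the chapter:

```python
def fourfold(phi, psi, rows):
    """Fourfold table (a, b, c, d) of Boolean attributes phi, psi over data rows."""
    a = sum(1 for r in rows if phi(r) and psi(r))        # phi and psi
    b = sum(1 for r in rows if phi(r) and not psi(r))    # phi and not psi
    c = sum(1 for r in rows if not phi(r) and psi(r))    # not phi and psi
    d = sum(1 for r in rows if not phi(r) and not psi(r))
    return a, b, c, d

def founded_implication(table, p, base):
    """One example 4ft-quantifier: holds if a/(a+b) >= p and a >= base."""
    a, b, _, _ = table
    return (a + b) > 0 and a / (a + b) >= p and a >= base

# Invented data matrix: 8 rows with x=1,y=1; 2 with x=1,y=0; 5 with x=0,y=1
rows = [{"x": 1, "y": 1}] * 8 + [{"x": 1, "y": 0}] * 2 + [{"x": 0, "y": 1}] * 5
t = fourfold(lambda r: r["x"] == 1, lambda r: r["y"] == 1, rows)
print(t, founded_implication(t, 0.7, 5))  # (8, 2, 5, 0) True
```

Different 4ft-quantifiers correspond to different conditions on (a, b, c, d), which is what makes the classes of rules discussed above comparable within one framework.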
/549 Data Mining is mainly concerned with methodologies for extracting patterns from large data repositories. There are many Data Mining methods, each of which accomplishes a limited set of tasks and produces a particular enumeration of patterns over data sets. The main tasks of Data Mining, which have already been discussed in previous sections, are: i) Clustering, ii) Classification, iii) Association Rule Extraction, iv) Time Series, v) Regression, and vi) Summarization. /613 Other well-known approaches and measures for evaluating association rules include: • Rule templates are used to describe a pattern for those attributes that can appear in the left- or right-hand side of an association rule. A rule template may be either inclusive or restrictive. An inclusive rule template specifies desirable rules that are considered to be interesting. On the other hand, a restrictive rule template specifies undesirable rules that are considered to be uninteresting. Rule pruning can be done by setting support, confidence, and rule size thresholds. • Dong and Li's interestingness measure (Dong and Li, 1998) is used to evaluate the importance of an association rule by considering its unexpectedness in terms of other association rules in its neighborhood. The neighborhood of an association rule consists of the association rules within a given distance. • Gray and Orlowska's interestingness measure (Gray and Orlowska, 1998) is used to evaluate the confidence of associations between sets of items in the extracted association rules. Though support and confidence have been shown to be useful for characterizing association rules, interestingness contains a discriminator component that gives an indication of the independence of the antecedent and consequent. • Peculiarity (Zhong et al., 1999) is a distance-based measure of rule interestingness. It is used to determine the extent to which one data object differs from other similar data objects. • Closed Association Rules Mining.
It is widely recognized that the larger the set of frequent itemsets, the more association rules are presented to the user, many of which turn out to be redundant. However, it is not necessary to mine all frequent itemsets to guarantee that all non-redundant association rules will be found. It is sufficient to consider only the closed frequent itemsets (Zaki and Hsiao, 2002, Pasquier et al., 1999, Pei et al., 2000). The set of closed frequent itemsets can guarantee completeness even in dense domains, and all non-redundant association rules can be defined on it. CHARM is an efficient algorithm for closed association rule mining. /623 A preliminary exploratory analysis can give indications on how to code the explanatory variables in order to maximize their predictive power. To reach this objective we have employed statistical measures of association between pairs of variables, such as chi-squared based measures, and statistical measures of dependence, such as Goodman and Kruskal's (see (Giudici, 2003) for a systematic comparison of such measures). /647 Rare cases are often of special interest. This is especially true in the context of Data Mining, where one often wants to uncover subtle patterns that may be hidden in massive amounts of data. Examples of mining rare cases include learning word pronunciations (Van den Bosch et al., 1997), detecting oil spills from satellite images (Kubat et al., 1998), predicting telecommunication equipment failures (Weiss and Hirsh, 1998), and finding associations between infrequently purchased supermarket items (Liu et al., 1999). Rare cases warrant special attention because they pose significant problems for Data Mining algorithms. /747 The three rare cases will be more difficult to detect and generalize from because they contain fewer data points.
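The closed-itemset idea mentioned above (a frequent itemset is closed if no proper superset has the same support) can be illustrated with a brute-force sketch. This is only the definition applied to invented toy data, not the CHARM algorithm:

```python
from itertools import combinations

def itemset_supports(transactions, minsup):
    """Support counts of all frequent itemsets (brute force, small data only)."""
    items = sorted(set().union(*transactions))
    sup = {}
    for k in range(1, len(items) + 1):
        for c in combinations(items, k):
            s = sum(1 for t in transactions if set(c) <= t)
            if s >= minsup:
                sup[frozenset(c)] = s
    return sup

def closed_itemsets(sup):
    """Frequent itemsets with no proper superset of identical support."""
    return {i: s for i, s in sup.items()
            if not any(i < j and s == sj for j, sj in sup.items())}

baskets = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}]
sup = itemset_supports(baskets, 1)
# {b} and {c} are not closed: {a,b} and {a,c} have the same support
print(sorted((sorted(i), s) for i, s in closed_itemsets(sup).items()))
```

Every frequent itemset's support is recoverable from the closed ones (it equals the support of its smallest closed superset), which is why no non-redundant rule is lost.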
A second important unsupervised learning task is association rule mining, which looks for associations between items (Agrawal et al., 1993). Groupings of items that co-occur frequently, such as milk and cookies, will be considered common cases, while other associations may be extremely rare. For example, mop and broom will be a rare association (i.e., case) in the context of supermarket sales, not because the items are unlikely to be purchased together, but because neither item is frequently purchased in a supermarket (Liu et al., 1999). /748 Table 53.1. Data Mining tasks and used techniques. Classification: induction, neural networks, genetic algorithms; Association: Apriori, statistics, genetic algorithms; Clustering: neural networks, induction, statistics; Regression: induction, neural networks, statistics; Episode discovery: induction, neural networks, genetic algorithms; Summarization: induction, statistics. /1012 Association rule learning: The standard algorithm for association rule induction is Apriori, which is implemented in the workbench. Two other algorithms implemented in Weka are Tertius, which can extract first-order rules, and Predictive Apriori, which combines the standard confidence and support statistics into a single measure. /1273 JI [18] Data Mining and Knowledge Discovery Handbook, Oded Maimon, Lior Rokach, Springer, 2nd edition, 2010. The three main areas of data mining are (a) classification, (b) clustering, and (c) association rule mining (Dunham, 2003). A brief review is given of the methods discussed in this article. /7 Association rule mining (ARM) considers market-basket or shopping-cart data, that is, the items purchased on a particular visit to the supermarket. ARM first determines the frequent sets, which have to meet a certain support level. /7 merchandising, both to analyze patterns of preference across products, and to recommend products to consumers based on other products they have selected.
An association rule expresses the relationship that one product is often purchased along with other products. The number of possible association rules grows exponentially with the number of products in a rule, but constraints on confidence and support, combined with algorithms that build association rules with n-item itemsets from rules with (n-1)-item itemsets, reduce the effective search space. /45 Association Rules: Used to associate items in a database sharing some relationship (e.g., co-purchase information). Often takes the form “if this, then that,” such as, “If the customer buys a handheld videogame then the customer is likely to purchase batteries.” /48 Association Rule Mining (ARM) is concerned with how items in a transactional database are grouped together. It is commonly known as market basket analysis, because it can be likened to the analysis of items that are frequently put together in a basket by shoppers in a market. /59 Association rule mining (Agrawal, Imielinski, & Swami, 1993) has been proposed for understanding the relationships among items in transactions or market baskets. For instance, if a customer buys butter, what is the chance that he/she buys bread at the same time? Such information may be useful for decision makers to determine strategies in a store. /65 Association Rule: A kind of rule in the form X ⇒ Ij, where X is a set of some items and Ij is a single item not in X. /69 Association Rule: A rule of the form A ⇒ B meaning “if the set of items A is present in a transaction, then the set of items B is likely to be present too”. A typical example is the associations between items purchased at a supermarket. /73 Association rules, introduced by Agrawal, Imielinski and Swami (1993), provide useful means to discover associations in data.
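The level-wise construction mentioned above, building candidate n-item itemsets from frequent (n-1)-item itemsets, is the heart of Apriori's search-space reduction: a candidate can only be frequent if all of its subsets are. A sketch of the join-and-prune step (the item names are invented):

```python
from itertools import combinations

def apriori_gen(freq_k_minus_1):
    """Generate candidate k-itemsets from frequent (k-1)-itemsets (join + prune)."""
    prev = {frozenset(i) for i in freq_k_minus_1}
    candidates = set()
    for a in prev:
        for b in prev:
            u = a | b
            if len(u) == len(a) + 1:  # join step: a and b differ in exactly one item
                # prune step: every (k-1)-subset must itself be frequent
                if all(frozenset(s) in prev for s in combinations(u, len(a))):
                    candidates.add(u)
    return candidates

freq2 = [{"beer", "diaper"}, {"beer", "chips"}, {"diaper", "chips"}, {"beer", "bread"}]
print(apriori_gen(freq2))  # only {beer, diaper, chips} survives pruning
```

Candidates such as {beer, diaper, bread} are discarded without scanning the database, because their subset {diaper, bread} was not frequent.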
The problem of mining association rules in a database is defined as finding all the association rules that hold with more than a user-given minimum support threshold and a user-given minimum confidence threshold. According to Agrawal, Imielinski and Swami, this problem is solved in two steps: 1. Find all frequent itemsets in the database. 2. For each frequent itemset I, generate all the association rules I′ ⇒ I\I′, where ∅ ≠ I′ ⊂ I. /150 Association Rule: A pair of frequent itemsets (A, B), where the ratio between the supports of the A ∪ B and A itemsets is greater than a predefined threshold, denoted minconf. /153 Association Rule: A rule in the form of “if this, then that.” It states a statistical correlation between the occurrence of certain attributes in a database. /164 Association rule mining is a type of data mining that correlates one set of items or events with another set of items or events. It employs association or linkage analysis, searching transactions from operational systems for interesting patterns with a high probability of repetition. /272 In classical association analysis, records in a transactional database contain only items. Although transactions occur under certain contexts, such as time, place, customers, and so forth, such contextual information has been ignored in classical association rule mining, due to the fact that such rule mining was intratransactional in nature. However, when we talk about intertransactional associations across multiple transactions, the contexts of occurrence of transactions become important and must be taken into account. /653
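Step 2 of the two-step scheme above can be sketched directly: for each frequent itemset I and each nonempty proper subset I′, keep the rule I′ ⇒ I\I′ whose confidence sup(I)/sup(I′) meets minconf. The support counts below are invented for illustration:

```python
from itertools import combinations

def generate_rules(freq_supports, minconf):
    """For each frequent itemset I and each nonempty proper subset lhs of I,
    emit the rule lhs => I minus lhs when sup(I)/sup(lhs) >= minconf."""
    rules = []
    for I, sup_I in freq_supports.items():
        for r in range(1, len(I)):  # proper subsets only
            for lhs in combinations(sorted(I), r):
                lhs = frozenset(lhs)
                conf = sup_I / freq_supports[lhs]
                if conf >= minconf:
                    rules.append((set(lhs), set(I - lhs), round(conf, 2)))
    return rules

# Hypothetical support counts over a small transaction database
sup = {frozenset({"bread"}): 3, frozenset({"butter"}): 2,
       frozenset({"bread", "butter"}): 2}
print(generate_rules(sup, 0.8))  # [({'butter'}, {'bread'}, 1.0)]
```

Note that the rule survives in only one direction here: butter ⇒ bread has confidence 2/2 = 1.0, while bread ⇒ butter has confidence 2/3 and falls below minconf.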