Efficient Extended Quantitative Frequent Pattern Mining

Baoying Wang, Fei Pan, Yue Cui, William Perrizo
{baoying.wang, fei.pan, yue.cui, william.perrizo}@ndsu.nodak.edu
Computer Science Department, North Dakota State University, Fargo, ND 58105

Abstract

The information in many datasets is not limited to categorical attributes, but also contains much quantitative data. The typical approach to quantitative association rule mining uses intervals. In some cases, a user may not be interested in association rules over the whole transaction set. For example, a user may want to know the association rules for a particular region. In this paper, we present a Peano tree* based quantitative frequent pattern mining method (PFP) for flexible intervals and user-defined constraints. Peano trees (P-trees) are lossless, vertical, bitwise, quadrant-based data structures developed to facilitate data compression and data mining. Our method has three major advantages: 1) P-trees are pre-generated tree structures, which can accommodate any interval discretization and satisfy any user-defined constraint; 2) PFP is efficient because it uses fast P-tree logic operations for frequent pattern mining; and 3) PFP has better performance due to the vertically decomposed structure and compression of P-trees. Experiments show that our algorithm outperforms the Apriori algorithm by orders of magnitude, with better dimensionality and cardinality scalability.

Keywords: ARM, frequent pattern mining, quantitative frequent pattern, Peano trees

1. Introduction

The problem of mining association rules in a set of transactions, where each transaction is a set of items, was first introduced by Agrawal et al. [1]. Given a set of transactions, the problem of mining association rules is to generate all association rules whose support and confidence are greater than the user-specified minimum support (called minsup) and minimum confidence (called minconf), respectively.
In the most general form, an association rule is defined over the attributes of a relation and has the form C1 ⇒ C2, where C1 and C2 are disjoint sets of items. A typical mining problem is "What items are frequently bought together in transactions?" Extensive research on standard association rule mining has been conducted and a number of fast algorithms have been proposed [6][7][8]. In practice, the information in many datasets is not limited to categorical attributes, but also contains much quantitative data. Some research has been done on mining quantitative frequent patterns [9][10][11]. Typically, the categorical ARM approach is extended to quantitative data by considering intervals of the numeric values. An example rule would be: age ∈ [30, 45] ∧ income ∈ [$40K, $60K] ⇒ #car ∈ [1, 2]. This approach has achieved a certain degree of success. However, discretizing quantitative data into intervals has several problems. First, the number of intervals is critical to execution time. If a quantitative attribute has n values (or intervals), there are on average O(n²) ranges that include a specific value or interval. Hence the number of rules blows up, which in turn blows up the execution time. Second, quantitative frequent pattern mining involves many interval generations and merges, which require construction of hash trees, R-trees and FP-trees [6][9][8]. The on-the-fly construction of these tree structures for all intervals is time-consuming. Finally, the usefulness or interestingness of a rule is often application dependent. In the real world, a user may not be interested in association rules over the whole transaction set. For example, a user may want to know the association rules about car sales for a particular location, because the same income and the same age do not yield the same confidence for car purchase rules at different locations.
It is preferable to allow human guidance of the rule discovery process.

(* Patents are pending on the P-tree technology. This work is partially supported by GSA Grant ACT#: K96130308.)

Recently, a data structure, the Peano tree (P-tree), has been developed and used in many data mining algorithms [2][3][12]. In this paper, we present Peano tree based quantitative frequent pattern mining (PFP). Peano trees (P-trees) are lossless, vertical, bitwise, quadrant-based data structures developed to facilitate data compression and data mining. Our method has three major advantages: 1) P-trees are pre-generated tree structures, which can accommodate any interval discretization and satisfy any inequality constraint; 2) PFP is efficient because it uses fast P-tree logic operations for frequent pattern mining; and 3) PFP has better dimensionality and cardinality scalability due to the vertically decomposed structure and compression of P-trees. Experiments show that our algorithm outperforms the Apriori algorithm by orders of magnitude, with better dimensionality and cardinality scalability.

This paper is organized as follows. In Section 2, we review P-trees and their properties. In Section 3, we present the extended quantitative frequent pattern mining algorithm using P-trees (PFP). Finally, we compare our method with the traditional Apriori approach experimentally in Section 4 and conclude the paper in Section 5.

2. Review of Peano Trees

A P-tree is a lossless, bitwise, quadrant-based tree. In this section, we give a brief review of the structure and operations of P-trees, and introduce two variations of predicate P-trees which are used in our frequent pattern mining algorithm.

2.1 Structure of P-trees

Given a data set with d feature attributes, X = (A1, A2, …, Ad), and the binary representation of the j-th feature attribute Aj as bj,m bj,m-1 … bj,i … bj,1 bj,0, we decompose each feature attribute into bit files, one file for each bit position [1].
To build a P-tree, a bit file is recursively partitioned into quadrants and each quadrant into sub-quadrants until each sub-quadrant is pure (entirely 1-bits or entirely 0-bits). A P-tree can be 1-dimensional, 2-dimensional, 3-dimensional, etc. For a 2-dimensional P-tree, the root contains the 1-bit count of the entire bit file. The next level of the tree contains the 1-bit counts of the four quadrants. At the third level, each quadrant is partitioned into four sub-quadrants, and this level contains the 1-bit counts of the sub-quadrants. Construction continues recursively down each tree path until the sub-quadrant is pure, which may or may not be at the leaf level. The detailed construction of P-trees is illustrated by the example in Figure 1. The spatial data is the red reflectance value of an 8x8 2-dimensional image, shown in a). We represent the reflectance as binary values, e.g., (7)₁₀ = (111)₂, and then vertically decompose them into three separate bit files, one file for each bit, as shown in b), c), and d). The corresponding basic P-trees, P1, P2 and P3, are constructed by recursive partition, as shown in e), f) and g). As shown in e) of Figure 1, the root of the P1 tree is 36, which is the 1-bit count of the entire bit-file-1. The second level of P1 contains the 1-bit counts of the four quadrants: 16, 7, 13, and 0. Since quadrants 0 and 3 are pure, there is no need to partition them further. Quadrants 1 and 2 are partitioned recursively.

2.2 P-tree Operations

AND, OR and NOT logic operations are the most frequently used P-tree operations. For efficient implementation, we use a variation of P-trees, called Pure-1 trees (P1-trees). A subtree is pure-1 if all the values in it are 1's. A node in a P1-tree is a 1-bit if and only if the corresponding quadrant is pure-1. P1-trees are variations of the more general "predicate tree" construct.
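The recursive construction just described can be sketched in a few lines. The following is an illustrative 1-D Python version (halving a bit vector stands in for the 2-D quadrant split); the tuple node layout is our own assumption, not the paper's implementation.

```python
# Illustrative 1-D P-tree construction. Assumption: a node is a tuple
# (segment_length, one_bit_count, children); children is None when the
# segment is pure, mirroring the "stop at pure quadrants" rule.

def build(bits):
    n, c = len(bits), sum(bits)
    if c in (0, n):                    # pure-0 or pure-1 segment: leaf
        return (n, c, None)
    mid = n // 2                       # 1-D analogue of quadrant splitting
    return (n, c, (build(bits[:mid]), build(bits[mid:])))

tree = build([1, 1, 1, 1, 0, 1, 0, 0])
root_count = tree[1]                   # 1-bit count of the whole bit file
```

The structure is lossless (the original bit file can be re-expanded from the leaves), and a pure segment of any size costs a single node, which is where the compression comes from.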
Given any predicate <p>, the predicate tree for <p> has a 1-bit at a node if and only if the corresponding quadrant (or half in the 1-D case, octant in the 3-D case, etc.) satisfies <p>. Figure 2 shows the P1-trees corresponding to the P-trees in e), f), and g) of Figure 1.

Figure 1. Construction of 2-D basic P-trees for 8x8 image data: a) the 8x8 image data; b)-d) bit-file-1 through bit-file-3; e)-g) the basic P-trees P1, P2 and P3.

Figure 2. P1-trees P11, P12 and P13 for the 8x8 image data.

The P-tree logic operations are performed level by level, starting from the root. They are commutative and distributive, since they are simply pruned bit-by-bit operations.
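The pruning that makes these operations cheap can be sketched on the same toy 1-D nodes. This is an illustrative Python sketch under the assumed (length, count, children) tuple layout, not the paper's code.

```python
# Level-by-level AND with pruning: a pure-0 operand short-circuits the
# whole subtree, and a pure-1 operand simply yields the other operand.

def build(bits):
    n, c = len(bits), sum(bits)
    if c in (0, n):
        return (n, c, None)
    m = n // 2
    return (n, c, (build(bits[:m]), build(bits[m:])))

def ptree_and(a, b):
    n, ca, kids_a = a
    _, cb, kids_b = b
    if ca == 0 or cb == 0:             # pure-0 AND anything = pure-0
        return (n, 0, None)
    if ca == n:                        # pure-1 AND b = b
        return b
    if cb == n:                        # a AND pure-1 = a
        return a
    left = ptree_and(kids_a[0], kids_b[0])
    right = ptree_and(kids_a[1], kids_b[1])
    c = left[1] + right[1]
    return (n, c, None) if c in (0, n) else (n, c, (left, right))

t = ptree_and(build([1, 1, 0, 0, 1, 0, 1, 1]), build([1, 0, 1, 0, 1, 1, 0, 1]))
# t[1] is the root count of the AND, i.e. the 1-bit count of the bitwise AND
```

OR is symmetric with the roles of pure-0 and pure-1 swapped, and NOT flips each node's count c to n − c, which is why all three operations never have to expand a pure subtree.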
For instance, ANDing a pure-0 node with anything results in a pure-0 node, and ORing a pure-1 node with anything results in a pure-1 node. In Figure 3, a) is the AND of P11 and P12, b) is the OR of P11 and P12, and c) is NOT P13 (i.e., P13′), where P11, P12 and P13 are shown in Figure 2.

Figure 3. AND, OR and NOT operations: a) P11 AND P12, b) P11 OR P12, c) NOT P13 (P13′).

2.3 Predicate P-trees

There are many variations of predicate P-trees, such as Pure-1 trees, Non-Pure-0 trees, value P-trees, inequality P-trees, etc. We discussed Pure-1 trees in the previous section. In this section, we describe value P-trees and inequality P-trees, which are used in the frequent pattern mining algorithm.

2.3.1 Value P-trees

A value P-tree represents the points of a data set X equal to a specified value v, denoted by P_x=v, where x ∈ X. Let v = bm bm-1 … b0, where bi is the i-th binary bit of v. There are two steps to calculate P_x=v: 1) get the bit P-tree P_b,i for each bit position of v according to the bit value: if bi = 1, P_b,i = Pi; otherwise P_b,i = Pi′; 2) calculate P_x=v by ANDing all the bit P-trees of v, i.e. P_x=v = P_b,m ∧ P_b,m-1 ∧ … ∧ P_b,0, where ∧ denotes the AND operation. For example, to get the value P-tree satisfying x = 110 in Figure 1, we have P_x=110 = P1 ∧ P2 ∧ P3′, since the first and second bits of 110 are 1 (take P1 and P2) and the third bit is 0 (take the complement P3′). The result P_x=110 is shown in Figure 4; a 1 in the bit file of P_x=110 marks a data point with the value 110. The root count of P_x=110 shows that there are 13 data points whose value is 110.

Figure 4. Value P-tree P_x=110 = P1 ∧ P2 ∧ P3′: (a) the bit file of P_x=110, (b) the tree P_x=110.
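The two-step value-tree construction can be sketched on uncompressed bit columns (plain Python lists stand in for compressed P-trees, as the paper itself does in its worked examples; the toy data and helper name are ours).

```python
# Value tree P(x = v): AND the bit tree P_i where bit i of v is 1 and the
# complement tree P'_i where it is 0.

def value_tree(values, v, m):
    mask = [1] * len(values)
    for i in range(m):
        bit = [(x >> i) & 1 for x in values]
        if not (v >> i) & 1:
            bit = [1 - b for b in bit]          # use P'_i for a 0-bit of v
        mask = [a & b for a, b in zip(mask, bit)]
    return mask

values = [0b111, 0b110, 0b101, 0b110, 0b010, 0b110]
p = value_tree(values, 0b110, 3)    # marks exactly the entries equal to 110
count = sum(p)                      # the root count = number of matches
```

On compressed P-trees the ANDs are the pruned level-by-level operations of Section 2.2, so the root count is obtained without touching pure subtrees.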
2.3.2 Inequality P-trees

An inequality P-tree represents the data points within a data set X that satisfy an inequality predicate, such as x > v, x ≥ v, x < v, or x ≤ v. Without loss of generality, we discuss the two inequality P-trees for x ≥ v and x ≤ v, denoted P_x≥v and P_x≤v, where x ∈ X and v is a specified value. They are calculated as follows.

Calculation of P_x≥v: Let x be an m-bit value within the data set X, and let Pm-1, …, P0 be the P-trees for the vertical bit files of X. Let v = bm-1…bi…b0, where bi is the i-th binary bit of v. Then P_x≥v = Pm-1 op_m-1 … Pi op_i Pi-1 … op_k+1 Pk, k ≤ i ≤ m-1, where 1) op_i is ∧ if bi = 1 and ∨ otherwise, 2) k is the rightmost bit position with value 1, i.e., bk = 1 and bj = 0 for j < k (the bits below k contribute nothing to the comparison), and 3) the operators are right binding. Here ∧ denotes AND, ∨ denotes OR, and right binding means the operators associate from right to left, e.g., P2 op2 P1 op1 P0 is evaluated as (P2 op2 (P1 op1 P0)). For example, the inequality tree P_x≥101 = P2 ∧ (P1 ∨ P0), with k = 0.

Calculation of P_x≤v: The calculation of P_x≤v is similar. Let P′m-1, …, P′0 be the complement P-trees for the vertical bit files of X, and let v = bm-1…bi…b0. Then P_x≤v = P′m-1 op_m-1 … P′i op_i P′i-1 … op_k+1 P′k, k ≤ i ≤ m-1, where 1) op_i is ∧ if bi = 0 and ∨ otherwise, 2) k is the rightmost bit position with value 0, i.e., bk = 0 and bj = 1 for j < k, and 3) the operators are right binding. For example, the inequality tree P_x≤101 = P′2 ∨ P′1, with k = 1.

3. Extended Quantitative Frequent Pattern Mining Using P-trees

In this section, we first describe how to represent a relational table in P-tree structure.
Then we present P-tree based frequent pattern mining (PFP) for categorical and quantitative attributes, and the extended frequent pattern mining algorithm with user-defined constraints.

3.1 P-tree Representation of Databases

Suppose we have a car sale database with customer information and the number of cars purchased, shown in Figure 5 (a). First we convert all attributes into binary form, as in Figure 5 (b), according to the property and domain of each attribute. For example, the categorical attribute "Sex" is coded as 1 for F and 0 for M. Numeric attributes can be converted directly into binary numbers.

Figure 5. A relational table and its binary form.

(a) A relational table:

Loc  Age  Sex  Income(K)  #car
A    20   F    10         0
B    45   F    44         2
C    33   M    58         2
B    25   F    27         1
A    54   M    31         2
A    30   M    35         2
B    40   F    62         3
C    62   F    23         1

(b) The binary relational table:

Loc  Age     Sex  Income  #car
00   010100  1    001010  00
01   101101  1    101100  10
10   100001  0    111010  10
01   011001  1    011011  01
00   110110  0    011111  10
00   011110  0    100011  10
01   101000  1    111110  11
10   111101  1    010111  01

Next we vertically decompose the binary table into bit files, one file for each bit position. For the binary table in Figure 5 (b), there are 17 bit files in total. Then we build a P-tree for each bit file. Figure 6 shows the two 1-dimensional P-trees for the first attribute "Loc": P11 and P12. The decomposition process and the construction of P-trees are discussed in detail in Section 2.1.

Figure 6. P-trees P11 and P12 for the attribute "Loc".

3.2 Categorical Frequent Pattern Mining Using P-trees

There are two kinds of attributes in a database: categorical attributes and quantitative attributes. Categorical frequent pattern mining is what standard ARM is intended for and is the most straightforward case of frequent pattern mining. For completeness, we present categorical frequent pattern mining using P-trees in this section before introducing the other pattern mining algorithms.
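The encoding and vertical decomposition of Section 3.1 can be sketched as follows; the toy data is the "Age" column of Figure 5, and the compressed-tree step is omitted for brevity.

```python
# Vertical decomposition: write each value in binary, then slice one bit
# file per bit position. The decomposition is lossless by construction.

ages = [20, 45, 33, 25, 54, 30, 40, 62]     # "Age" column of Figure 5
m = 6                                        # 6-bit encoding

bit_files = {i: [(a >> i) & 1 for a in ages] for i in range(m)}

# Reassembling the bits recovers the original attribute exactly.
restored = [sum(bit_files[i][r] << i for i in range(m))
            for r in range(len(ages))]
```

Each of the resulting bit files would then be compressed into a basic P-tree, as in Section 2.1.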
Suppose we want to get the support of an attribute A with a Boolean value v. In other words, we want to find the number of transactions that involve A = v, denoted by N_A=v. N_A=v is simply the root count of the P-tree P_A=v for the predicate A = v. If v = 1, P_A=v = P_A; otherwise P_A=v = P′_A. For example, suppose we want to find the number of transactions in Figure 5 in which Sex = 1. P_Sex=1 = P31 = 11010011, where P31 is the P-tree for the attribute "Sex". N_Sex=1 = RootCount(P_Sex=1) = 5. The support of Sex = 1 is the ratio of N_Sex=1 to the total number of transactions, i.e. 5/8 = 0.625.

3.3 Quantitative Frequent Pattern Mining Using P-trees

The typical approach to quantitative association rule mining is to partition the quantitative data into intervals. There is intensive study, with many algorithms, on how to determine the range intervals [6][13]. Our approach can easily integrate any of them for interval optimization, which is beyond the scope of this paper. In this section, we focus on quantitative frequent pattern mining using P-trees (PFP) for an arbitrary interval. We will show that P-trees can be used for mining patterns over any interval because P-trees are a vertically decomposed bitwise data structure.

Suppose we want to get the support of an attribute A with a value in an interval [l, u], that is, to find the number of transactions that involve A ∈ [l, u], denoted N_l≤A≤u. N_l≤A≤u is the root count of the range tree P_l≤A≤u for the predicate l ≤ A ≤ u. The problem is then to calculate P_l≤A≤u. Since l ≤ A ≤ u is equivalent to (A ≥ l) ∧ (A ≤ u), by the properties of P-trees, P_l≤A≤u = P_A≥l ∧ P_A≤u, where ∧ denotes the AND operation. Note that P_A≥l and P_A≤u are just the inequality trees for the predicates A ≥ l and A ≤ u, whose calculation was discussed in Section 2.3. For example, suppose we want to find the number of transactions in Figure 5 in which age is in [30, 45]₁₀, i.e. [011110, 101101]₂. Table 1 shows the calculation of N_30≤age≤45. P26, P25, P24, P23, P22 and P21 are the P-trees for the attribute "Age", one for each bit (P26 for the most significant bit).
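The Table 1 computation can be replayed end to end. The sketch below applies the Section 2.3.2 formula for P(x ≥ v) on uncompressed bit columns, and obtains P(x ≤ v) via the complement identity x ≤ v ⇔ (2^m − 1 − x) ≥ (2^m − 1 − v), a shortcut equivalent to the complement-tree formula; the helper names are ours.

```python
# Support of age in [30, 45] over the Figure 5 data via inequality trees.

def col(vals, i):
    """Vertical bit file for bit position i."""
    return [(v >> i) & 1 for v in vals]

def p_ge(vals, v, m):
    """P(x >= v): right-binding op chain, AND where b_i = 1 and OR where
    b_i = 0, truncated at the rightmost 1-bit k of v."""
    if v == 0:
        return [1] * len(vals)
    k = (v & -v).bit_length() - 1              # rightmost 1-bit of v
    acc = col(vals, k)
    for i in range(k + 1, m):
        c = col(vals, i)
        acc = ([x & y for x, y in zip(c, acc)] if (v >> i) & 1
               else [x | y for x, y in zip(c, acc)])
    return acc

def p_le(vals, v, m):
    # complement identity: x <= v  <=>  (2^m - 1 - x) >= (2^m - 1 - v)
    top = (1 << m) - 1
    return p_ge([top - x for x in vals], top - v, m)

ages = [20, 45, 33, 25, 54, 30, 40, 62]
in_range = [a & b for a, b in zip(p_ge(ages, 30, 6), p_le(ages, 45, 6))]
n = sum(in_range)                   # root count, as in Table 1
support = n / len(ages)             # support of age in [30, 45]
```

For v = 101 and m = 3 this reproduces P2 ∧ (P1 ∨ P0) exactly, and on the Figure 5 ages it reproduces Table 1's root count of 4 (support 0.5).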
Here, for convenience, we use uncompressed P-trees to illustrate the calculation. First, P_age≥011110 and P_age≤101101 are calculated according to the formulas of Section 2.3, based on the boundary values 011110 and 101101. Then P_30≤age≤45 = P_age≥30 ∧ P_age≤45 = P_age≥011110 ∧ P_age≤101101. Finally we get N_30≤age≤45 = RootCount(P_30≤age≤45) = 4. The support of age ∈ [30, 45] is the ratio of N_30≤age≤45 to the total number of transactions, i.e. 4/8 = 0.5.

Table 1. Calculation of N_30≤age≤45 (∧ denotes AND, ∨ denotes OR)

P26, P′26:  01101011, 10010100
P25, P′25:  10011101, 01100010
P24, P′24:  01010111, 10101000
P23, P′23:  11001101, 00110010
P22, P′22:  00001100, 11110011
P21, P′21:  01110001, 10001110
P_age≥011110 = P26 ∨ (P25 ∧ (P24 ∧ (P23 ∧ P22))) = 01101111
P_age≤101101 = P′26 ∨ (P′25 ∧ (P′24 ∨ (P′23 ∨ P′22))) = 11110110
P_30≤age≤45 = 01101111 ∧ 11110110 = 01100110
N_30≤age≤45 = RootCount(P_30≤age≤45) = 4

If we want to find an item pattern with a different interval, we use the same P-trees for the attribute and only need to change the inequality P-tree formulas based on the new boundary values. Therefore PFP is very flexible for interval optimization. What we presented above is 1-item pattern mining. For multiple-item pattern mining, we simply AND the inequality P-trees of the individual item patterns to get the multiple-item pattern P-tree. For example, for the 2-item pattern age ∈ [30, 45] and income ∈ [20, 50], the 2-item P-tree is P_30≤age≤45, 20≤income≤50 = P_30≤age≤45 ∧ P_20≤income≤50.

3.4 Extended Frequent Pattern Mining with User-defined Constraints

In this section, we discuss the extended frequent pattern mining algorithm, which gives users more options to find valuable frequent patterns. In the real world, a user may not be interested in association rules over the whole transaction set.
For example, a user may want to know the association rules about car sales for a particular location, because the same income and the same age do not yield the same confidence for car purchase rules at different locations. Suppose we have a transaction set T, where X is a set of attribute values, e.g. {45, 30, 2} for the attributes age, income and number of cars. Let A = {A1, A2, …, Ak} be a set of constraint attributes, such as "Loc". We want to find frequent patterns within a constraint, e.g. Loc = 01. We define Q as the subset of the transaction set T that satisfies the constraint. For frequent pattern mining, we need the number of transactions within Q that involve X, n_x|q, and the total number of transactions within Q, n_q. The ratio n_x|q / n_q is the support of X within Q. X can be a 1-item or multi-item pattern. The algorithms to calculate n_q and n_x|q using P-trees are as follows.

Algorithm 4.1: Calculation of n_q. First, consider the subset Q that satisfies a 1-dimensional constraint A1 = a1, denoted Q = T | A1 = a1. Q is represented by the value tree PQ = P_A1=a1 for the predicate A1 = a1; the calculation of the value P-tree P_A1=a1 is given in Section 2.3.1. The 1-bits of PQ mark the entries of the relational table that satisfy the predicate A1 = a1, and n_q is the root count of PQ, RootCount(PQ). For example, suppose we want the subset of transactions for a particular location, Q = T | Loc = 01. PQ = P_Loc=01 = P′11 ∧ P12, and n_q = RootCount(PQ) = 3. Table 2 shows the calculation of n_q; for convenience the P-trees are uncompressed.

Table 2. Calculation of n_q (∧ denotes AND)

P11, P′11:  00100001, 11011110
P12, P′12:  01010010, 10101101
PQ = P′11 ∧ P12 = 01010010
n_q = RootCount(PQ) = 3

Similarly, when k > 1, we simply get the value tree P_Ai=ai for each predicate constraint Ai = ai (i = 1, 2, …, k), and obtain the final tree PQ by ANDing all the value trees: PQ = P_A1=a1 ∧ P_A2=a2 ∧ … ∧ P_Ak=ak, where ∧ denotes the AND operation.
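The value-tree computations of Tables 2 and 3 can be replayed with a small helper (uncompressed lists again stand in for P-trees; the helper name is ours).

```python
# Support of the pattern #car = 10 within the constraint Loc = 01.

def value_tree(vals, v, m):
    """P(x = v): AND of P_i where bit i of v is 1 and P'_i where it is 0."""
    mask = [1] * len(vals)
    for i in range(m):
        bit = [(x >> i) & 1 for x in vals]
        if not (v >> i) & 1:
            bit = [1 - b for b in bit]         # complement tree P'_i
        mask = [a & b for a, b in zip(mask, bit)]
    return mask

loc  = [0b00, 0b01, 0b10, 0b01, 0b00, 0b00, 0b01, 0b10]   # "Loc", Figure 5
cars = [0b00, 0b10, 0b10, 0b01, 0b10, 0b10, 0b11, 0b01]   # "#car", Figure 5

PQ  = value_tree(loc, 0b01, 2)          # constraint tree
PX  = value_tree(cars, 0b10, 2)         # pattern tree
Pxq = [a & b for a, b in zip(PQ, PX)]   # transactions in Q involving X

nq, nxq = sum(PQ), sum(Pxq)             # root counts
support_within_Q = nxq / nq             # support of X within Q
```

On the Figure 5 data this yields n_q = 3 and n_x|q = 1, matching Tables 2 and 3, for a within-constraint support of 1/3.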
For example, to get the particular subset that satisfies Loc = 01 and Sex = 1, PQ = P_Loc=01 ∧ P_Sex=1.

Algorithm 4.2: Calculation of n_x|q. First we calculate the tree P_x|q, which represents the transactions within Q that involve X. P_x|q is defined as P_x|q = PQ ∧ PX, where PQ represents the transactions in the particular subset Q, and PX is a predicate tree representing the transactions that involve the value pattern X. The calculation of PX is similar to that of PQ, except that PQ involves the non-item (constraint) dimensions and PX involves the item dimensions. The root count of P_x|q is n_x|q. For example, suppose we want to find the number of transactions within the subset Q = T | Loc = 01 that involve the pattern #car = 10. P_x|q = PQ ∧ PX = P_Loc=01 ∧ P_car=10. P_Loc=01 and P_car=10 are both value P-trees. P_Loc=01 was calculated in the previous example. P_car=10 = P51 ∧ P′52, where P51 and P52 are the P-trees for the attribute "#car". Finally, we get n_x|q = RootCount(P_x|q) = 1. Table 3 shows the calculation of n_x|q.

Table 3. Calculation of n_x|q (∧ denotes AND)

PQ = P_Loc=01 = 01010010
PX = P_car=10 = P51 ∧ P′52 = 01101100
P_x|q = PQ ∧ PX = 01000000
n_x|q = RootCount(P_x|q) = 1

4. Performance Analysis

5. Conclusion

In this paper, we presented an extended quantitative frequent pattern mining algorithm using P-trees (PFP). P-trees are lossless, vertical, bitwise, quadrant-based data structures developed to facilitate data compression and data mining. P-trees are pre-generated tree structures, which can accommodate any interval discretization and satisfy any inequality constraint; there is no need to build trees on the fly. Fast P-tree logic operations are used to achieve efficient frequent pattern mining. Our approach has better performance due to the vertically decomposed data structure and the compression of P-trees.
Experiments show that our algorithm outperforms the Apriori algorithm by orders of magnitude, with better dimensionality and cardinality scalability.

REFERENCES

[1] R. Agrawal, T. Imielinski, and A. Swami, "Mining Association Rules Between Sets of Items in Large Databases," Proc. ACM SIGMOD Conf. on Management of Data, pp. 207-216, May 1993.
[2] W. Perrizo, "Peano Count Tree Technology," Technical Report NDSU-CSOR-TR-01-1, 2001.
[3] M. Khan, Q. Ding, and W. Perrizo, "k-Nearest Neighbor Classification on Spatial Data Streams Using P-Trees," Proc. PAKDD 2002, Springer-Verlag, LNAI 2776, pp. 517-528, 2002.
[4] TIFF image data sets. Available at http://midas-10cs.ndsu.nodak.edu/data/images/.
[5] S. Nestorov and N. Jukic, "Ad-Hoc Association-Rule Mining within the Data Warehouse," Proc. HICSS'03, Big Island, Hawaii, January 2003.
[6] R. Agrawal and R. Srikant, "Fast Algorithms for Mining Association Rules," Proc. VLDB'94, pp. 487-499, 1994.
[7] S. Brin, R. Motwani, J. Ullman, and S. Tsur, "Dynamic Itemset Counting and Implication Rules for Market Basket Data," Proc. 1997 ACM SIGMOD Conf. on Management of Data, pp. 255-264, 1997.
[8] J. Han, J. Pei, and Y. Yin, "Mining Frequent Patterns without Candidate Generation," Proc. 2000 ACM SIGMOD Conf. on Management of Data, Dallas, TX, May 2000.
[9] R. Srikant and R. Agrawal, "Mining Quantitative Association Rules in Large Relational Tables," Proc. ACM SIGMOD Conf. on Management of Data, 1996.
[10] H. Mannila, H. Toivonen, and A.I. Verkamo, "Efficient Algorithms for Discovering Association Rules," Proc. KDD-94: AAAI Workshop on Knowledge Discovery in Databases, pp. 181-192, July 1994.
[11] J.S. Park, M.-S. Chen, and P.S. Yu, "An Effective Hash Based Algorithm for Mining Association Rules," Proc. ACM SIGMOD Conf. on Management of Data, May 1995.
[12] Q. Ding, Q. Ding, and W. Perrizo, "Association Rule Mining on Remotely Sensed Images Using P-trees," Proc. PAKDD 2002, Taipei, Taiwan, May 6-8, 2002.
[13] Y. Aumann and Y. Lindell, "A Statistical Theory for Quantitative Association Rules," Proc. KDD-99, San Diego, CA, pp. 261-270, 1999.