Efficient Extended Quantitative Frequent Pattern Mining
Baoying Wang, Fei Pan, Yue Cui, William Perrizo
{baoying.wang, fei.pan, yue.cui, william.perrizo}@ndsu.nodak.edu
Computer Science Department
North Dakota State University
Fargo, ND 58105
Abstract
The information in many datasets is not limited to categorical attributes; it also contains much quantitative data. The typical approach to quantitative association rule mining uses intervals. In some cases, a user may not be interested in association rules over the whole transaction set; for example, a user may want to know the association rules for a particular region. In this paper, we present a Peano tree* based quantitative frequent pattern mining method (PFP) for flexible intervals and user-defined constraints. Peano trees (P-trees) are lossless, vertical bitwise quadrant-based data structures, developed to facilitate data compression and data mining. Our method has three major advantages: 1) P-trees are pre-generated tree structures, which are used to facilitate any interval discretization and to satisfy any user-defined constraint; 2) PFP is efficient, using fast P-tree logic operations for frequent pattern mining; and 3) PFP has better performance due to the vertically decomposed structure and compression of P-trees. Experiments show that our algorithm outperforms the Apriori algorithm by orders of magnitude, with better dimensionality and cardinality scalability.
Keywords: ARM, Frequent pattern mining, Quantitative frequent pattern, Peano trees
1. Introduction
The problem of mining association rules in a set of transactions, where each transaction is a set of items, was first introduced by Agrawal et al. [1]. Given a set of transactions, the problem of mining association rules is to generate all association rules that have support and confidence greater than the user-specified minimum support (called minsup) and minimum confidence (called minconf), respectively. In the most general form, an association rule can be viewed as being defined over the attributes of a relation and has the form C1 ⇒ C2, where C1 and C2 are disjoint sets of items. A typical mining problem is "What items are frequently bought together in transactions?" Extensive research on standard association rule mining has been conducted and a number of fast algorithms have been proposed [6][7][8].
In practice, the information in many datasets is not limited to categorical attributes; it also contains much quantitative data. Some research has been done on mining quantitative frequent patterns [9][10][11]. Typically, the categorical ARM approach is extended to quantitative data by considering intervals of the numeric values. An example of such a rule would be: age ∈ [30, 45] and income ∈ [$40K, $60K] ⇒ #car ∈ [1, 2].
This approach has achieved a certain degree of success. However, discretizing quantitative data into intervals has several problems. First, the number of intervals is critical to the execution time. If a quantitative attribute has n values (or intervals), there are on average O(n²) ranges that include a specific value or interval. Hence the number of rules blows up, which blows up the execution time exponentially. Second, quantitative frequent pattern mining involves many interval generations and merges, which require the construction of hash trees, R-trees, and FP-trees [6][9][8]. The on-the-fly construction of these tree structures for all intervals is time-consuming. Finally, the usefulness or interestingness of a rule is often application dependent. In the real world, a user may not be interested in association rules over the whole transaction set. For example, a user may want to know the association rules about car sales for a particular location, because for different locations the same income and age do not yield rules about car purchases with the same confidence. It is preferable to allow human guidance of the rule discovery process.
* Patents are pending on the P-tree technology. This work was partially supported by GSA Grant ACT# K96130308.
Recently, a data structure, the Peano tree (P-tree), has been developed and used in many data mining algorithms [2][3][12]. In this paper, we present Peano tree based quantitative frequent pattern mining (PFP). Peano trees (P-trees) are lossless, vertical bitwise quadrant-based data structures, developed to facilitate data compression and data mining. Our method has three major advantages: 1) P-trees are pre-generated tree structures, which are used to facilitate any interval discretization and to satisfy any inequality constraint; 2) PFP is efficient, using fast P-tree logic operations for frequent pattern mining; and 3) PFP scales better with dimensionality and cardinality due to the vertically decomposed structure and compression of P-trees. Experiments show that our algorithm outperforms the Apriori algorithm by orders of magnitude.
This paper is organized as follows. In Section 2, we review P-trees and their properties. In Section 3, we present the extended quantitative frequent pattern mining algorithm using P-trees (PFP). Finally, we compare our method experimentally with the traditional Apriori approach in Section 4 and conclude the paper in Section 5.
2. Review of Peano Trees
A P-tree is a lossless, bitwise quadrant-based tree. In this section, we briefly review the structure and operations of P-trees, and introduce two variations of predicate P-trees that are used in our frequent pattern mining algorithm.
2.1 Structure of P-trees
Given a data set with d feature attributes, X = (A1, A2, …, Ad), and the binary representation of the j-th feature attribute Aj as bj,m bj,m-1 … bj,i … bj,1 bj,0, we decompose each feature attribute into bit files, one file for each bit position [1]. To build a P-tree, a bit file is recursively partitioned into quadrants and each quadrant into sub-quadrants until the sub-quadrant is pure (entirely 1-bits or entirely 0-bits). A P-tree can be 1-dimensional, 2-dimensional, 3-dimensional, etc. For a 2-dimensional P-tree, the root contains the 1-bit count of the entire bit file. The next level of the tree contains the 1-bit counts of the four quadrants. At the third level, each quadrant is partitioned into four sub-quadrants, and this level contains their 1-bit counts. The construction continues recursively down each tree path until the sub-quadrant is pure, which may or may not be at the leaf level.
The detailed construction of P-trees is illustrated by an example in Figure 1. The spatial data is the red reflectance value of an 8x8 2-dimensional image, shown in a). We represent each reflectance value in binary, e.g., (7)10 = (111)2, and then vertically decompose the values into three separate bit files, one file for each bit, as shown in b), c), and d). The corresponding basic P-trees, P1, P2 and P3, constructed by recursive partitioning, are shown in e), f), and g).
As shown in e) of Figure 1, the root of the P1 tree is 36, which is the 1-bit count of the entire bit-file-1. The second level of P1 contains the 1-bit counts of the four quadrants: 16, 7, 13, and 0. Since quadrants 0 and 3 are pure, there is no need to partition them further. Quadrants 1 and 2 are partitioned recursively.
2.2 P-tree Operations
AND, OR and NOT logic operations are the most frequently used P-tree operations. For efficient implementation, we use a variation of P-trees called Pure-1 trees (P1-trees). A tree is pure-1 if all the values in the subtree are 1's. A node in a P1-tree is a 1-bit if and only if the corresponding quadrant is pure-1. P1-trees are a variation of the more general "predicate tree" construct: given any predicate <p>, the predicate <p> tree has a 1-bit at a node if and only if the corresponding quadrant (or half in the 1-D case, octant in the 3-D case, etc.) satisfies <p>. Figure 2 shows the P1-trees corresponding to the P-trees in e), f), and g) of Figure 1.
[Figure 1. Construction of 2-D basic P-trees for 8x8 image data: a) 8x8 2-D image data; b) bit-file-1; c) bit-file-2; d) bit-file-3; e) P1; f) P2; g) P3.]

[Figure 2. P1-trees for the 8x8 image data: a) P11; b) P12; c) P13.]
The P-tree logic operations are performed level by level, starting from the root. They are commutative and distributive, since they are simply pruned bit-by-bit operations. For instance, ANDing a pure-0 node with anything results in a pure-0 node, and ORing a pure-1 node with anything results in a pure-1 node. In Figure 3, a) is the AND of P11 and P12, b) is the OR of P11 and P12, and c) is the result of NOT P13 (P13'), where P11, P12 and P13 are shown in Figure 2.
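These pruning rules can be sketched as follows; this is a minimal illustration assuming a P1-tree node is represented as the integer 0 (pure-0), the integer 1 (pure-1), or a list of four child nodes. A full implementation would also re-collapse a result node into a pure node when all of its children become pure.

    # Pruned, level-by-level logic operations on P1-trees (sketch).
    def ptree_and(a, b):
        if a == 0 or b == 0:
            return 0                  # ANDing with a pure-0 node gives pure-0
        if a == 1:
            return b                  # pure-1 is the identity for AND
        if b == 1:
            return a
        return [ptree_and(x, y) for x, y in zip(a, b)]

    def ptree_or(a, b):
        if a == 1 or b == 1:
            return 1                  # ORing with a pure-1 node gives pure-1
        if a == 0:
            return b
        if b == 0:
            return a
        return [ptree_or(x, y) for x, y in zip(a, b)]

    def ptree_not(a):
        if a in (0, 1):
            return 1 - a              # complement of a pure node
        return [ptree_not(x) for x in a]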
[Figure 3. AND, OR and NOT operations: a) P11 AND P12; b) P11 OR P12; c) NOT P13 (P13').]

2.3 Predicate P-trees
There are many variations of predicate P-trees, such as Pure-1 trees, Non-Pure-0 trees, value P-trees, inequality P-trees, etc. We discussed Pure-1 trees in the previous section. In this section, we describe value P-trees and inequality P-trees, which are used in our frequent pattern mining algorithm.
2.3.1 Value P-trees
A value P-tree represents a data set X related to a specified value v, denoted by Px=v, where x ∈ X. Let v = bm bm-1 … b1, where bi is the ith binary bit of v. There are two steps to calculate Px=v: 1) get the bit P-tree Pb,i for each bit position of v according to the bit value: if bi = 1, Pb,i = Pi; otherwise Pb,i = Pi'; 2) calculate Px=v by ANDing all the bit P-trees of v, i.e., Px=v = Pb,1 ∧ Pb,2 ∧ … ∧ Pb,m, where ∧ denotes the AND operation. For example, to get the value P-tree satisfying x = 110 in Figure 1, we have Px=110 = Pb,3 ∧ Pb,2 ∧ Pb,1 = P3 ∧ P2 ∧ P1'. The result Px=110 is shown in Figure 4. A 1 in the bit file of Px=110 marks a data point with the value 110, and the root count of Px=110 shows that there are in total 13 data points whose value is 110.
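The two-step computation can be sketched as follows, using uncompressed bit vectors (plain Python lists) in place of compressed P-trees for clarity; value_ptree is an illustrative name, and the bit P-trees are assumed to be listed from the most significant bit down.

    # Px=v as the AND of bit P-trees (sketch on uncompressed bit vectors).
    def pt_and(a, b): return [x & y for x, y in zip(a, b)]
    def pt_not(a):    return [1 - x for x in a]

    def value_ptree(bit_ptrees, v, m):
        """AND the bit P-tree where the corresponding bit of v is 1,
        and its complement where the bit is 0 (bit_ptrees MSB first)."""
        result = [1] * len(bit_ptrees[0])
        for i in range(m):
            bit = (v >> (m - 1 - i)) & 1
            result = pt_and(result, bit_ptrees[i] if bit else pt_not(bit_ptrees[i]))
        return result

The root count is then simply the sum of the resulting bit vector; for the example above, with the trees ordered as in the paper's formula, this reproduces Px=110 = P3 ∧ P2 ∧ P1' with root count 13.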
[Figure 4. Value P-tree Px=110 = P3 ∧ P2 ∧ P1': (a) the bit file of Px=110; (b) Px=110.]

2.3.2 Inequality P-trees
An inequality P-tree represents the data points within a data set X satisfying an inequality predicate, such as x > v, x ≥ v, x < v, or x ≤ v. Without loss of generality, we discuss the two inequality P-trees for x ≥ v and x ≤ v, denoted by Px≥v and Px≤v, where x ∈ X and v is a specified value. The calculations of Px≥v and Px≤v are as follows.

Calculation of Px≥v: Let x be a value within a data set X with bits indexed m down to 0, and let Pm, Pm-1, …, P0 be the P-trees for the vertical bit files of X. Let v = bm…bi…b0, where bi is the ith binary bit of v, and let Px≥v be the predicate tree for the predicate x ≥ v. Then Px≥v = Pm opm Pm-1 … opk+1 Pk, k ≤ i ≤ m, where 1) opi is ∧ if bi = 1 and ∨ otherwise, 2) k is the rightmost bit position with value 1, i.e., bk = 1 and bj = 0 for all j < k, and 3) the operators are right binding, i.e., associated from right to left (e.g., P2 op2 P1 op1 P0 is equivalent to (P2 op2 (P1 op1 P0))). Here ∧ denotes the AND operation and ∨ denotes the OR operation. For example, the inequality tree Px≥101 = (P2 ∧ (P1 ∨ P0)).

Calculation of Px≤v: The calculation of Px≤v is similar. Let x be as above, and let P'm, P'm-1, …, P'0 be the complement P-trees for the vertical bit files of X. Let v = bm…bi…b0, where bi is the ith binary bit of v, and let Px≤v be the predicate tree for the predicate x ≤ v. Then Px≤v = P'm opm P'm-1 … opk+1 P'k, k ≤ i ≤ m, where 1) opi is ∧ if bi = 0 and ∨ otherwise, 2) k is the rightmost bit position with value 0, i.e., bk = 0 and bj = 1 for all j < k, and 3) the operators are right binding. For example, the inequality tree Px≤101 = (P'2 ∨ P'1). If v has no 1-bit (v = 0), Px≥v is simply the pure-1 tree, since x ≥ 0 always holds; symmetrically, if v has no 0-bit, Px≤v is the pure-1 tree.
3. Extended Quantitative Frequent Pattern Mining Using P-trees
In this section, we first describe how to represent a relational table in the P-tree structure. Then we present P-tree based frequent pattern mining (PFP) for categorical and quantitative attributes, and the extended frequent pattern mining algorithm with user-defined constraints.
3.1 P-tree Representation of Databases
Suppose we have a car sales database with customer information and the number of cars purchased, shown in Figure 5 (a). First we convert all attributes into binary form, as in Figure 5 (b), according to the property and the domain of each attribute. For example, the categorical attribute "Sex" is coded as 1 for F and 0 for M, and numeric attributes are converted directly into binary numbers.
(a) A relational table

Loc  Age  Sex  Income(K)  #car
A    20   F    10         0
B    45   F    44         2
C    33   M    58         2
B    25   F    27         1
A    54   M    31         2
A    30   M    35         2
B    40   F    62         3
C    62   F    23         1

(b) A binary relational table

Loc  Age     Sex  Income(K)  #car
00   010100  1    001010     00
01   101101  1    101100     10
10   100001  0    111010     10
01   011001  1    011011     01
00   110110  0    011111     10
00   011110  0    100011     10
01   101000  1    111110     11
10   111101  1    010111     01

Figure 5. A relational table and its binary form
Next we vertically decompose the binary transaction table into bit files, one file for each bit position. For the binary table in Figure 5 (b), there are in total 17 bit files. Then we build a P-tree for each bit file. Figure 6 shows two 1-dimensional P-trees for the first attribute "Loc": P11 and P12. The decomposition process and the construction of P-trees are discussed in detail in Section 2.1.
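The decomposition can be sketched as follows; the encodings (Loc A=0, B=1, C=2; Sex F=1, M=0) are taken from Figure 5, and the helper names are otherwise illustrative.

    # Vertical decomposition of the binary table in Figure 5(b) into bit
    # files (sketch). Attribute widths: Loc=2, Age=6, Sex=1, Income=6,
    # #car=2, i.e., 17 bit files in total.
    def to_bits(value, width):
        return [(value >> (width - 1 - i)) & 1 for i in range(width)]

    rows = [  # (Loc, Age, Sex, Income, #car)
        (0, 20, 1, 10, 0), (1, 45, 1, 44, 2), (2, 33, 0, 58, 2), (1, 25, 1, 27, 1),
        (0, 54, 0, 31, 2), (0, 30, 0, 35, 2), (1, 40, 1, 62, 3), (2, 62, 1, 23, 1),
    ]
    widths = (2, 6, 1, 6, 2)

    # bit_files[j][i] is the bit file for bit position i of attribute j;
    # each bit file would then be turned into a P-tree as in Section 2.1.
    bit_files = [[[to_bits(r[j], widths[j])[i] for r in rows]
                  for i in range(widths[j])] for j in range(5)]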
[Figure 6. P-trees for the attribute "Loc": (a) P11; (b) P12.]

3.2 Categorical Frequent Pattern Mining Using P-trees
There are two kinds of attributes in a database: categorical attributes and quantitative attributes. Categorical frequent pattern mining is what standard ARM is intended for, and it is the most straightforward case of frequent pattern mining. For completeness, we present categorical frequent pattern mining using P-trees in this section before introducing the other pattern mining algorithms.
Suppose we want to get the support of an attribute A with a Boolean value v. In other words, we want to find the number of transactions with A = v, denoted by NA=v. NA=v is simply the root count of the P-tree PA=v for the predicate A = v. If v = 1, PA=v = PA; otherwise PA=v = P'A.
For example, suppose we want to find the number of transactions in Figure 5 in which Sex = 1. PSex=1 = P31 = 11010011, where P31 is the P-tree for the attribute "Sex". NSex=1 = RootCount(PSex=1) = 5. The support of Sex = 1 is the ratio of NSex=1 to the total number of transactions, i.e., 5/8 = 0.625.
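As a concrete check, the computation above amounts to a root count over the "Sex" bit vector; a minimal sketch using the Figure 5 data:

    # Support of the categorical pattern Sex = 1 (sketch, Figure 5 data).
    sex = [1, 1, 0, 1, 0, 0, 1, 1]   # P31, the bit vector for "Sex"
    n_sex1 = sum(sex)                # RootCount(P_Sex=1) = 5
    support = n_sex1 / len(sex)      # 5/8 = 0.625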
3.3 Quantitative Frequent Pattern Mining Using P-trees
The typical approach for quantitative association rule mining is to partition the quantitative data into intervals. How to determine the range intervals has been studied intensively, and many algorithms exist [6][13]. Our approach can easily integrate any of them for interval optimization, which is beyond the scope of this paper. In this section, we focus on quantitative frequent pattern mining using P-trees (PFP) for arbitrary intervals. We will show that P-trees can be used for pattern mining over any interval because P-trees are vertically decomposed bitwise data structures.
Suppose we want to get the support of an attribute A with a value in an interval [l, u], i.e., to find the number of transactions that involve A ∈ [l, u], denoted by NA∈[l,u]. NA∈[l,u] is the root count of the range tree Pl≤A≤u for the predicate l ≤ A ≤ u. The problem is then to calculate Pl≤A≤u. Since l ≤ A ≤ u is equivalent to A ≥ l ∧ A ≤ u, by the properties of P-trees, Pl≤A≤u = PA≥l ∧ PA≤u, where ∧ denotes the AND operation. Note that PA≥l and PA≤u are just the inequality trees for the predicates A ≥ l and A ≤ u; their calculation was discussed in Section 2.3.
For example, suppose we want to find the number of transactions in Figure 5 in which age is in [30, 45]10, i.e., [011110, 101101]2. Table 1 shows the calculation process of Nage∈[30,45]. P26, P25, P24, P23, P22 and P21 are the P-trees for the attribute age, one for each bit. Here, for convenience, we use uncompressed P-trees to illustrate the calculation process. First, Page≥011110 and Page≤101101 are calculated according to the formulas given in Section 2.3, based on the boundary values 011110 and 101101. Then P30≤age≤45 = Page≥30 ∧ Page≤45 = Page≥011110 ∧ Page≤101101. Finally we get the result Nage∈[30,45] = RootCount(P30≤age≤45) = 4. The support of the pattern age ∈ [30, 45] is the ratio of Nage∈[30,45] to the total number of transactions, i.e., 4/8 = 0.5.
Table 1. Calculation process of Nage∈[30,45]
(∧ means AND, ∨ means OR)

Name          Formula and result
P26, P'26     01101011, 10010100
P25, P'25     10011101, 01100010
P24, P'24     01010111, 10101000
P23, P'23     11001101, 00110010
P22, P'22     00001100, 11110011
P21, P'21     01110001, 10001110
Page≥011110   = P26 ∨ (P25 ∧ (P24 ∧ (P23 ∧ P22))) = 01101111
Page≤101101   = P'26 ∨ (P'25 ∧ (P'24 ∨ (P'23 ∨ P'22))) = 11110110
P30≤age≤45    = Page≥011110 ∧ Page≤101101 = 01101111 ∧ 11110110 = 01100110
Nage∈[30,45]  = RootCount(P30≤age≤45) = 4
If we want to find an item pattern with a different interval, we use the same P-trees for the attribute and only need to change the formula of the inequality P-trees based on the new boundary values. Therefore PFP is very flexible for interval optimization. What we presented above is 1-item pattern mining. In the case of multiple-item pattern mining, we simply AND the inequality P-trees of the individual item patterns to get the multiple-item pattern P-tree. For example, for the 2-item pattern age ∈ [30, 45] and income ∈ [20, 50], the 2-item P-tree is P30≤age≤45, 20≤income≤50 = P30≤age≤45 ∧ P20≤income≤50. A sketch of this computation follows.
3.4 Extended Frequent Pattern Mining with User-defined Constraints
In this section, we discuss the extended frequent pattern mining algorithm, which gives users more options to find valuable frequent patterns. In the real world, a user may not be interested in association rules over the whole transaction set. For example, a user may want to know the association rules about car sales for a particular location, because for different locations the same income and age do not yield rules about car purchases with the same confidence.
Suppose we have a transaction set T, where X is a set of attribute values, e.g., {45, 30, 2} for the attributes age, income, and number of cars. Let A = {A1, A2, …, Ak} be a set of constraint attributes, such as "Loc". We want to find frequent patterns within a constraint, e.g., Loc = 01. We define Q as the subset of the transaction set T that satisfies the constraint. For frequent pattern mining, we want the number of transactions within Q that involve X, nx|q, and the total number of transactions within Q, nq. The ratio nx|q/nq is the support of X within Q. X can be a 1-item or multi-item pattern. The algorithms to calculate nq and nx|q using P-trees are given as follows.
Algorithm 4.1: Calculation of nq. First, let us consider the subset Q that satisfies the 1-dimensional constraint A1 = a1, denoted by Q = T|A1=a1. Q is represented by a value tree PQ = PA1=a1 for the predicate A1 = a1; the calculation of the value P-tree PA1=a1 is given in detail in Section 2.3.1. The nodes of PQ represent the entries in the relational table that satisfy the predicate A1 = a1, and nq is the root count of PQ, denoted by RootCount(PQ).
For example, suppose we want the subset of transactions for a particular location, Q = T|Loc=01. PQ = PLoc=01 = P'11 ∧ P12, and nq = RootCount(PQ) = 3. Table 2 shows the calculation process of nq. For convenience, the P-trees are uncompressed.
Table 2. Calculation process of nq
(∧ means AND)

Name       Formula and result
P11, P'11  00100001, 11011110
P12, P'12  01010010, 10101101
PQ         PQ = P'11 ∧ P12 = 01010010
nq         nq = RootCount(PQ) = 3
Similarly, when k > 1, we simply get the value tree PAi=ai for each predicate constraint Ai = ai (i = 1, 2, …, k) and obtain the final tree PQ by ANDing all the value trees: PQ = PA1=a1 ∧ PA2=a2 ∧ … ∧ PAk=ak, where ∧ denotes the AND operation. For example, to get a particular subset which satisfies Loc = 01 and Sex = 1, PQ = PLoc=01 ∧ PSex=1.
Algorithm 4.2: Calculation of nx|q. First we calculate the tree Px|q, which represents the transactions within Q that involve X. Px|q is defined as Px|q = PQ ∧ PX, where PQ represents the transactions in the particular subset Q, and PX is a predicate tree that represents the transactions that involve the value pattern X. The calculation of PX is similar to that of PQ, except that PQ involves the constraint (non-item) attributes while PX involves the item attributes. The root count of Px|q is nx|q.
For example, suppose we want to find the number of transactions within the subset Q = T|Loc=01 that involve the pattern #car = 10. Px|q = PQ ∧ PX = PLoc=01 ∧ Pcar=10. PLoc=01 and Pcar=10 are both value P-trees; PLoc=01 was calculated in the previous example, and Pcar=10 = P51 ∧ P'52, where P51 and P52 are the P-trees for the attribute "#car". Finally, we get nx|q = RootCount(Px|q) = 1. Table 3 shows the calculation process of nx|q.
Table 3. Calculation process of nx|q
(∧ means AND)

P-tree  Formula and result
PQ      PQ = PLoc=01 = 01010010
PX      PX = Pcar=10 = P51 ∧ P'52 = 01101100
Px|q    Px|q = PQ ∧ PX = 01000000
nx|q    nx|q = RootCount(Px|q) = 1
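A minimal, self-contained sketch of Algorithms 4.1 and 4.2 on the Figure 5 data, with bit vectors standing in for uncompressed P-trees:

    # Constrained support: constraint Loc = 01, pattern #car = 10 (sketch).
    def pt_and(a, b): return [x & y for x, y in zip(a, b)]
    def pt_not(a):    return [1 - x for x in a]

    p11 = [0, 0, 1, 0, 0, 0, 0, 1]   # first bit of "Loc"
    p12 = [0, 1, 0, 1, 0, 0, 1, 0]   # second bit of "Loc"
    p51 = [0, 1, 1, 0, 1, 1, 1, 0]   # first bit of "#car"
    p52 = [0, 0, 0, 1, 0, 0, 1, 1]   # second bit of "#car"

    p_q  = pt_and(pt_not(p11), p12)  # P_Loc=01 = 01010010, n_q = 3
    p_x  = pt_and(p51, pt_not(p52))  # P_#car=10 = 01101100
    p_xq = pt_and(p_q, p_x)          # 01000000, n_x|q = 1

    support = sum(p_xq) / sum(p_q)   # n_x|q / n_q = 1/3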
4. Performance Analysis
5. Conclusion
In this paper, we presented an extended quantitative frequent pattern mining algorithm using P-trees (PFP). P-trees are lossless, vertical bitwise quadrant-based data structures, developed to facilitate data compression and data mining. P-trees are pre-generated tree structures, which are used to facilitate any interval discretization and to satisfy any inequality constraint; there is no need to build trees on the fly. Fast P-tree logic operations are used to achieve efficient frequent pattern mining. Our approach has better performance due to the vertically decomposed data structure and the compression of P-trees. Experiments show that our algorithm outperforms the Apriori algorithm by orders of magnitude, with better dimensionality and cardinality scalability.
REFERENCES
[1] R. Agrawal, T. Imielinski, and A. Swami, "Mining Association Rules Between Sets of Items in Large Databases," Proc. ACM SIGMOD Conf. on Management of Data, pp. 207-216, May 1993.
[2] W. Perrizo, "Peano Count Tree Technology," Technical Report NDSU-CSOR-TR-01-1, 2001.
[3] M. Khan, Q. Ding, and W. Perrizo, "k-Nearest Neighbor Classification on Spatial Data Streams Using P-Trees," PAKDD 2002, Springer-Verlag, LNAI 2776, 2002, pp. 517-528.
[4] TIFF image data sets. Available at http://midas-10cs.ndsu.nodak.edu/data/images/.
[5] S. Nestorov and N. Jukic, "Ad-Hoc Association-Rule Mining within the Data Warehouse," Proc. HICSS'03, Big Island, Hawaii, January 2003.
[6] R. Agrawal and R. Srikant, "Fast Algorithms for Mining Association Rules," Proc. VLDB'94, pp. 487-499.
[7] S. Brin, R. Motwani, J. Ullman, and S. Tsur, "Dynamic Itemset Counting and Implication Rules for Market Basket Data," Proc. 1997 ACM SIGMOD Conf. on Management of Data, pp. 255-264.
[8] J. Han, J. Pei, and Y. Yin, "Mining Frequent Patterns without Candidate Generation," Proc. 2000 ACM SIGMOD Conf. on Management of Data, Dallas, TX, May 2000.
[9] R. Srikant and R. Agrawal, "Mining Quantitative Association Rules in Large Relational Tables," Proc. ACM SIGMOD Conf. on Management of Data, 1996.
[10] H. Mannila, H. Toivonen, and A.I. Verkamo, "Efficient Algorithms for Discovering Association Rules," Proc. KDD-94: AAAI Workshop on Knowledge Discovery in Databases, pp. 181-192, July 1994.
[11] J.S. Park, M.-S. Chen, and P.S. Yu, "An Effective Hash Based Algorithm for Mining Association Rules," Proc. ACM SIGMOD Conf. on Management of Data, May 1995.
[12] Q. Ding, Q. Ding, and W. Perrizo, "Association Rule Mining on Remotely Sensed Images Using P-trees," Proc. PAKDD 2002, Taipei, Taiwan, May 6-8, 2002.
[13] Y. Aumann and Y. Lindell, "A Statistical Theory for Quantitative Association Rules," Proc. KDD-99, San Diego, CA, 1999, pp. 261-270.