Carson Kai-Sang Leung, Boyu Hao, Fan Jiang
ICDE 2010

Outline: Motivation; Method (UF-streaming+, UF-streaming*, CUF-streaming); Experimental results; Conclusion

Motivation
There are many situations in which one is uncertain about the contents of transactions. Moreover, there are also situations in which users are interested in only some portions of the mined frequent itemsets.

minsup = 1.2, preMinsup = 0.9

First batch (expected support of each item):
  a: 1.8, b: 1.6, c: 1.9, d: 0.9, e: 1.4

For example:
  expSup({a, e}) = (1 × 0.9 × 0.6) + (1 × 0.9 × 0.7) = 1.17 ≥ preMinsup
  expSup({c, e}) = (1 × 0.7 × 0.6) + (1 × 0.8 × 0.7) = 0.98 ≥ preMinsup
  expSup({d, e}) = 1 × 0.9 × 0.1 = 0.09 < preMinsup
  expSup({a, c, e}) = (1 × 0.9 × 0.7 × 0.6) + (1 × 0.9 × 0.8 × 0.7) ≈ 0.88 < preMinsup

First batch:
  {a}:1.8, {a, c}:1.35, {a, e}:1.17, {b}:1.6, {c}:1.5, {c, e}:0.98, {d}:0.9, {e}:1.4

Second batch:
  {a}:0.9, {a, c}:0.9, {b}:1.4, {b, d}:1.4, {c}:1.8, {d}:2.0

Third batch:
  {a}:1.7, {a, c}:1.53, {b}:1.0, {b, d}:1.0, {c}:1.9, {d}:1.2

Post-processing step: {a}:2.6, {a, c}:2.43, {b}:2.4 and {c}:3.7 (the sums of the second- and third-batch values) satisfy C1.

The algorithm first uses the same UF-growth mining technique to find all "frequent" itemsets, and it then checks the mined itemsets against user-specified constraints before storing the constrained itemsets in the UF-stream structure.
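
The expected-support arithmetic in the first-batch example, and the post-processing totals, can be reproduced with a short sketch. It is an illustration only, not the authors' tree-based implementation. The transactions in `first_batch` are hypothetical: their probabilities are reconstructed from the factors shown in the worked example, and item b is left out because the slides do not give its probabilities. The two-batch window in `window_sum` is likewise an assumption suggested by the reported totals. The names `exp_sup` and `window_sum` are illustrative.

```python
from math import prod

PRE_MINSUP = 0.9
MINSUP = 1.2

# Hypothetical transactions (item -> existential probability), reconstructed
# from the factors in the worked example. This is an assumption, and only a
# partial view of the first batch (item b is missing).
first_batch = [
    {"a": 0.9, "c": 0.7, "e": 0.6},
    {"a": 0.9, "c": 0.8, "e": 0.7},
    {"d": 0.9, "e": 0.1},
]

def exp_sup(itemset, batch):
    """Expected support: for each transaction that contains every item of the
    itemset, multiply the item probabilities, then sum over transactions.
    Each transaction counts once (the factor 1 in the example)."""
    return sum(
        prod(t[item] for item in itemset)
        for t in batch
        if all(item in t for item in itemset)
    )

def window_sum(batches):
    """Add up per-batch expected supports for every itemset in the window."""
    totals = {}
    for batch in batches:
        for itemset, sup in batch.items():
            totals[itemset] = totals.get(itemset, 0.0) + sup
    return totals

# Per-batch values copied from the second and third batches above.
second = {("a",): 0.9, ("a", "c"): 0.9, ("b",): 1.4,
          ("b", "d"): 1.4, ("c",): 1.8, ("d",): 2.0}
third = {("a",): 1.7, ("a", "c"): 1.53, ("b",): 1.0,
         ("b", "d"): 1.0, ("c",): 1.9, ("d",): 1.2}

# Assumed window: the two most recent batches.
totals = window_sum([second, third])
frequent = {x: s for x, s in totals.items() if s >= MINSUP}

for itemset in [("a", "e"), ("c", "e"), ("d", "e"), ("a", "c", "e")]:
    s = exp_sup(itemset, first_batch)
    print(itemset, round(s, 2), "kept" if s >= PRE_MINSUP else "pruned")
```

Itemsets that reach preMinsup in a batch are kept for later batches; the final minsup test is applied to the window totals, and the constraint check is a separate step.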
Type 1: ANTI-MONOTONE CONSTRAINT
  min(X.attr) ≥ const, order R+ (Xi.attr ≤ Xi+1.attr)
  max(X.attr) ≤ const, order R- (Xi.attr ≥ Xi+1.attr)
  Ex: C1 ≡ min(X.WBC) ≥ 10 × 10^3/μL
    Order (e, d, c, b, a): e: 9.0, d: 9.5, c: 10.5, b: 11.0, a: 11.5

Type 2: MONOTONE CONSTRAINT
  max(X.attr) ≥ const, order R-
  min(X.attr) ≤ const, order R+
  Ex: C2 ≡ max(X.RBC) ≥ 6.1 × 10^6/μL
    Values: a: 8.5, b: 3.3, c: 7.5, d: 6.6, e: 5.9
    Order (a, c, d, e, b): 8.5, 7.5, 6.6, 5.9, 3.3

Type 3: CONVERTIBLE ANTI-MONOTONE CONSTRAINT
  avg(X.attr) ≥ const or sum(X−.attr) ≥ const, order R-
  avg(X.attr) ≤ const or sum(X+.attr) ≤ const, order R+
  Ex: C3 ≡ sum(X.Rainfall) ≤ 200 mm
    Values: a: 50, b: 33, c: 200, d: 101, e: 120
    Order (b, a, d, e, c): 33, 50, 101, 120, 200

Type 4: CONVERTIBLE MONOTONE CONSTRAINT
  sum(X+.attr) ≥ const, order R-
  sum(X−.attr) ≤ const, order R+
  Ex: C4 ≡ sum(X.Rainfall) ≥ 200 mm
    Values: a: 201, b: 52, c: 70, d: 300, e: 180
    Order (d, a, e, c, b): 300, 201, 180, 70, 52

Example: C1 ≡ min(X.WBC) ≥ 10000/μL with order R+ (e, d, c, b, a). Only the items (c, b, a) satisfy C1. Checking the mined itemsets {a}, {a, c}, {a, e}, {b}, {c}, {c, e}, {d}, {e} against C1 leaves {a}, {a, c}, {b}, {c}.

Conclusion
We proposed three tree-based algorithms, namely UF-streaming+, UF-streaming* and CUF-streaming, which integrate:
  (i) mining of uncertain data
  (ii) constrained mining
  (iii) mining of data streams
These algorithms effectively mine constrained frequent itemsets from uncertain data streams.
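
The C1 check in the example above can be sketched in a few lines. This is a minimal illustration rather than the authors' code: the WBC values and the list of mined itemsets are copied from the slides, while the names `order_r_plus`, `valid_suffix` and `satisfies_c1` are hypothetical. It shows why the order R+ is useful for a constraint of the form min(X.attr) ≥ const: once items are sorted in ascending attribute order, the items that satisfy the constraint form a suffix of that order.

```python
# WBC values (in 10^3/uL) copied from the C1 example.
WBC = {"a": 11.5, "b": 11.0, "c": 10.5, "d": 9.5, "e": 9.0}
WBC_THRESHOLD = 10.0  # C1: min(X.WBC) >= 10 x 10^3/uL

def order_r_plus(attr):
    """Items sorted by ascending attribute value (the order R+)."""
    return sorted(attr, key=attr.get)

def valid_suffix(attr, threshold):
    """Under R+, items meeting 'value >= threshold' form a suffix of the order."""
    order = order_r_plus(attr)
    for k, item in enumerate(order):
        if attr[item] >= threshold:
            return order[k:]
    return []

def satisfies_c1(itemset):
    """An itemset satisfies C1 when its smallest WBC value reaches the threshold."""
    return min(WBC[item] for item in itemset) >= WBC_THRESHOLD

# Itemsets mined from the first batch, as listed in the example.
mined = [("a",), ("a", "c"), ("a", "e"), ("b",),
         ("c",), ("c", "e"), ("d",), ("e",)]

order = order_r_plus(WBC)
valid_items = valid_suffix(WBC, WBC_THRESHOLD)
constrained = [x for x in mined if satisfies_c1(x)]

print(order)
print(valid_items)
print(constrained)
```

Because C1 is anti-monotone, any itemset containing d or e can be discarded without looking at its supersets, which is what restricting attention to the suffix (c, b, a) achieves.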