Frequent Pattern Mining from Time-Fading Streams of Uncertain Data
Carson Kai-Sang Leung and Fan Jiang
DaWaK 2011

Outline
Motivation
Background
Method
    A Naive Algorithm: TUF-Streaming(Naive)
    A Space-Saving Algorithm: TUF-Streaming(Space)
    A Time-Saving Algorithm: TUF-Streaming(Time)
Experimental Result
Conclusion

Motivation
In the past few years, several mining algorithms have been proposed to discover frequent patterns from uncertain data. However, most of them mine frequent patterns from static databases, not from dynamic streams, of uncertain data.

Background: Mining from a Static Database of Uncertain Data
x: item
X: itemset
DB: transaction database
t_i: transaction
The expected support of X in the DB can be computed by summing (over all transactions t_1, ..., t_{|DB|}) the product of the existential probabilities of the items within X:
expSup(X, DB) = \sum_{i=1}^{|DB|} \prod_{x \in X} P(x, t_i)
where P(x, t_i) denotes the existential probability of item x in transaction t_i.

Background: Mining from Uncertain Data Streams with a Sliding Window
B_i: batch
X: itemset
T: time
DB: transaction database
The expected support of X in the current sliding window containing ω batches of uncertain data (Batches B_{T-ω+1}, ..., B_T inclusive) can be computed as follows:
expSup(X, \bigcup_{i=T-ω+1}^{T} B_i) = \sum_{i=T-ω+1}^{T} expSup(X, B_i)

A Naive Algorithm: TUF-Streaming(Naive)
minsup = 1.0, preMinsup = 0.8
expSup({a}) = 0.7 × 1 + 1.0 × 1 = 1.7
expSup({b}) = 0.9 × 1 + 0.9 × 1 = 1.8
expSup({c}) = 0.8 × 1 + 0.8 × 1 = 1.6
expSup({d}) = 0.1 × 1 + 0.6 × 1 + 0.6 × 1 = 1.3
expSup({b, c}) = 0.9 × 0.8 × 1 + 0.9 × 0.8 × 1 = 1.44
expSup({b, d}) = 0.9 × 0.6 × 1 + 0.9 × 0.6 × 1 = 1.08
expSup({b, c, d}) = 0.9 × 0.8 × 0.6 × 1 + 0.9 × 0.8 × 0.6 × 1 ≈ 0.86
expSup({c, d}) = 0.8 × 0.6 × 1 + 0.8 × 0.6 × 1 = 0.96

(Cont.)
minsup = 1.0, preMinsup = 0.8, α = 0.9
expSup({b, c}, B_1 ∪ B_2) = 1.44 × 0.9 + 0.8 ≈ 2.10
expSup({b, c}, B_1 ∪ B_2 ∪ B_3) = 1.44 × 0.9^2 + 0.8 × 0.9 + 0 ≈ 1.89

A Space-Saving Algorithm: TUF-Streaming(Space)
minsup = 1.0, preMinsup = 0.8, α = 0.9
expSup({b, c}, B_1 ∪ B_2) = 1.44 × 0.9 + 0.8 ≈ 2.10
expSup({b, c}, B_1 ∪ B_2 ∪ B_3) = (1.44 × 0.9 + 0.8) × 0.9 + 0 ≈ 1.89
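The arithmetic in the naive and space-saving examples above can be sketched in Python. This is only an illustration under stated assumptions, not the tree-based algorithms of the paper: the function names (`exp_sup`, `faded_naive`, `faded_running`) and the two sample transactions are hypothetical, chosen so that {b, c} reproduces the value 1.44 from the slides.

```python
from math import prod

def exp_sup(itemset, batch):
    """Expected support of `itemset` in `batch`.

    `batch` is a list of transactions; each transaction maps an item to its
    existential probability. An item missing from a transaction counts as 0.
    """
    return sum(prod(t.get(x, 0.0) for x in itemset) for t in batch)

def faded_naive(per_batch, alpha):
    """Recompute the faded support from every stored per-batch value.

    per_batch[i - 1] is the expected support in batch B_i; the value of
    batch B_i is weighted by alpha ** (T - i), where T is the newest batch.
    """
    T = len(per_batch)
    return sum(s * alpha ** (T - i) for i, s in enumerate(per_batch, start=1))

def faded_running(per_batch, alpha):
    """Keep a single running value: new = old * alpha + newest batch value."""
    total = 0.0
    for s in per_batch:
        total = total * alpha + s
    return total

# Hypothetical batch B_1 (assumed data, not from the paper).
b1 = [{"b": 0.9, "c": 0.8, "d": 0.6}, {"b": 0.9, "c": 0.8, "d": 0.6}]
print(round(exp_sup({"b", "c"}, b1), 2))  # 1.44

# Per-batch expected supports of {b, c} in B_1, B_2, B_3, as on the slides.
bc = [1.44, 0.8, 0.0]
print(round(faded_naive(bc, 0.9), 2))     # 1.89
print(round(faded_running(bc, 0.9), 2))   # 1.89
```

Both functions return the same value. The running form only needs the previous total, which is why the second example gets by without the per-batch history.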
A Time-Saving Algorithm: TUF-Streaming(Time)
Frequent patterns in B_1 (expected support ≥ preMinsup): {a} = 1.7, {b} = 1.8, {b, c} = 1.44, {b, c, d} = 0.86, {b, d} = 1.08, {c} = 1.6, {c, d} = 0.96, {d} = 1.3
Figure labels: last batch; last batch's expected support

(Cont.)
minsup = 1.0, preMinsup = 0.8, α = 0.9
Frequent patterns in B_2: {a}, {a, d}, {b}, {b, c}, {c}, {d}
expSup({b, c}, B_1 ∪ B_2) = 1.44 × 0.9^(2−1) + 0.8 ≈ 2.1

(Cont.)
minsup = 1.0, preMinsup = 0.8, α = 0.9
Frequent patterns in B_3: {a}, {b}, {b, d}, {c}, {d}
expSup({b, d}, B_1 ∪ B_2 ∪ B_3) = 1.08 × 0.9^(3−1) + 1.44 ≈ 2.31

Experimental Result

Conclusion
In this paper, we proposed tree-based mining algorithms that can be used for mining frequent patterns from dynamic streams of uncertain data with both time-fading and landmark models.
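The time-saving update used in the worked examples above (fade the stored value by α raised to the number of batches elapsed since the pattern was last updated, then add the new batch's expected support) might be sketched as follows. This is a minimal illustration inferred from the slide arithmetic. The class name `LazyFadedSupport` and its methods are hypothetical, and a plain dictionary stands in for the tree structure that the paper uses.

```python
class LazyFadedSupport:
    """Stores, per itemset, the last batch in which it was updated and its
    faded expected support as of that batch (hypothetical helper)."""

    def __init__(self, alpha):
        self.alpha = alpha
        self.table = {}  # frozenset(itemset) -> (last_batch, faded_support)

    def update(self, itemset, batch_no, support):
        """Fade the stored value by alpha ** (batches elapsed), then add the
        itemset's expected support in batch `batch_no`."""
        key = frozenset(itemset)
        last, value = self.table.get(key, (batch_no, 0.0))
        value = value * self.alpha ** (batch_no - last) + support
        self.table[key] = (batch_no, value)
        return value

    def current(self, itemset, batch_no):
        """Faded support as of `batch_no`, without modifying the table."""
        key = frozenset(itemset)
        if key not in self.table:
            return 0.0
        last, value = self.table[key]
        return value * self.alpha ** (batch_no - last)

tracker = LazyFadedSupport(alpha=0.9)
tracker.update({"b", "c"}, 1, 1.44)
print(round(tracker.update({"b", "c"}, 2, 0.8), 2))   # 2.1

tracker.update({"b", "d"}, 1, 1.08)
# {b, d} is not touched in batch 2; the skipped batch is folded into 0.9 ** 2.
print(round(tracker.update({"b", "d"}, 3, 1.44), 2))  # 2.31
```

Patterns that do not occur in a batch need no work in that batch. Their decay is applied in one step the next time they are updated or queried.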