Summarization of Frequent Pattern Mining
What is FPM?


Why is being frequent so important?
Application of FPM




Decision making/Business
Software Debugging
Bioinformatics
Other data mining tasks


Indexing
Clustering/Classification/Association Rule
What has been done
Frequent Itemset Mining
Frequent Sequential Pattern Mining
Frequent Subgraph Mining
Frequent Tree Mining
Mining a Single Large Graph
Frequent motifs
FPM is a way to think
[Figure: example structures with nodes labeled A through F]
Algorithm Foundations


Apriori Property
Enumeration Algorithm



Level-wise search
Depth-first search
Data structure


For Patterns
For Data
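The depth-first strategy and the Apriori property listed above can be sketched together. This is a minimal illustration, not a specific published miner: the function names (`vertical`, `dfs_mine`), the toy transactions, and the choice of a vertical tid-set layout as the "data structure for data" are all assumptions made for the example.

```python
def vertical(transactions):
    """Map each item to the set of transaction ids that contain it."""
    tidsets = {}
    for tid, items in enumerate(transactions):
        for item in items:
            tidsets.setdefault(item, set()).add(tid)
    return tidsets


def dfs_mine(transactions, min_support):
    """Depth-first enumeration of frequent itemsets (hypothetical helper).

    Returns a dict mapping each frequent itemset (a sorted tuple) to its
    support count.
    """
    tidsets = vertical(transactions)
    items = sorted(i for i, t in tidsets.items() if len(t) >= min_support)
    frequent = {}

    def extend(prefix, prefix_tids, candidates):
        for pos, item in enumerate(candidates):
            tids = prefix_tids & tidsets[item]
            if len(tids) < min_support:
                # Apriori property: an infrequent pattern cannot have a
                # frequent superset, so this branch is never expanded.
                continue
            pattern = prefix + (item,)
            frequent[pattern] = len(tids)
            # Only later items are tried, so each itemset is visited once.
            extend(pattern, tids, candidates[pos + 1:])

    extend((), set(range(len(transactions))), items)
    return frequent


# Toy data, invented for illustration only.
toy = [{"A", "B", "C"}, {"A", "B"}, {"A", "C"}, {"B", "C"}, {"A", "B", "C"}]
dfs_result = dfs_mine(toy, min_support=3)
```

With a minimum support of 3, every single item and every pair is frequent in the toy data, while the triple A, B, C appears in only two transactions and is rejected.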
Lattice
null
A B C D E
AB AC AD AE BC BD BE CD CE DE
ABC ABD ABE ACD ACE ADE BCD BCE BDE CDE
ABCD ABCE ABDE ACDE BCDE
ABCDE
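The lattice above can be generated level by level. The sketch below assumes the five items A to E shown on the slide; `lattice_levels` is a hypothetical helper name, and the empty string stands in for the null node.

```python
from itertools import combinations


def lattice_levels(items):
    """Return the itemset lattice as a list of levels, smallest sets first."""
    items = sorted(items)
    return [
        ["".join(combo) for combo in combinations(items, size)]
        for size in range(len(items) + 1)
    ]


levels = lattice_levels("ABCDE")
for level in levels:
    # The empty itemset is printed as "null", as on the slide.
    print(" ".join(name or "null" for name in level))
```

Five items give 2^5 = 32 nodes in total, which is why exhaustive enumeration stops being practical as the number of items grows.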
Apriori
R. Agrawal and R. Srikant. Fast algorithms for mining association rules. VLDB, 487-499, 1994
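A simplified sketch of the level-wise idea behind Apriori follows. It is not the exact procedure from the cited paper: candidates are formed by unioning any two frequent itemsets from the previous level rather than by the paper's prefix-based join, and the function name and toy data are assumptions for illustration.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Level-wise frequent itemset mining (simplified sketch).

    Returns a dict mapping each frequent itemset (a frozenset) to its
    support count.
    """
    transactions = [frozenset(t) for t in transactions]

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t)

    # Level 1: frequent single items.
    singles = {frozenset([i]) for t in transactions for i in t}
    level = {c: support(c) for c in singles}
    level = {c: s for c, s in level.items() if s >= min_support}
    frequent = dict(level)

    k = 2
    while level:
        prev = set(level)
        # Join step: combine frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step (Apriori property): every (k-1)-subset must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in prev for sub in combinations(c, k - 1))
        }
        # Count step: one pass over the data per level.
        level = {}
        for c in candidates:
            s = support(c)
            if s >= min_support:
                level[c] = s
        frequent.update(level)
        k += 1
    return frequent


# Toy data, invented for illustration only.
baskets = [{"A", "B", "C"}, {"A", "B"}, {"A", "C"}, {"B", "C"}, {"A", "B", "C"}]
apriori_result = apriori(baskets, min_support=3)
```

The prune step is where the Apriori property pays off: a candidate is counted against the data only if all of its immediate subsets survived the previous level.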
Resources and Tools
Important FPM websites
FIMI workshop website: http://fimi.cs.helsinki.fi/
Mining Structured Data website: http://hms.liacs.nl/graphs.html
Commercial Databases
Oracle, IBM DB2, SQL Server
General Data Mining Information
KDnuggets (general/job/software, etc.)
Weka (www.cs.waikato.ac.nz/ml/weka/)
Why does FPM not work?


Too many patterns?
What can we do?

Pattern Pruning




Additional constraints?
Pattern summarization
Representative Patterns?
Pattern Ranking
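One common way to make "representative patterns" concrete is to keep only closed or maximal itemsets. Whether the slide intends these particular notions is an assumption; the helper name `closed_and_maximal` and the toy supports below are invented for illustration.

```python
def closed_and_maximal(frequent):
    """Split out closed and maximal itemsets.

    frequent: dict mapping frozenset -> support count, assumed to contain
    every frequent itemset.
    """
    closed, maximal = [], []
    for pattern, count in frequent.items():
        supersets = [q for q in frequent if pattern < q]
        if not supersets:
            # Maximal: no frequent proper superset exists.
            maximal.append(pattern)
        if all(frequent[q] < count for q in supersets):
            # Closed: no proper superset has the same support.
            closed.append(pattern)
    return closed, maximal


# Toy supports, invented for illustration only.
toy_frequent = {
    frozenset("A"): 3,
    frozenset("B"): 4,
    frozenset("AB"): 3,
}
closed, maximal = closed_and_maximal(toy_frequent)
```

Here A is dropped from the closed set because AB has the same support, so reporting AB already conveys everything A would. Maximal itemsets compress further but lose the support counts of their subsets.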
What is missing
The common foundation for FPM, clustering, classification, etc.
FPM formalization
language/compiler/automatic discovery
FPM understanding
How and why are patterns being generated?
The relationship between dataset and pattern
How does FIM relate to the underlying structure of the dataset?