Mining Relationships Among
Interval-based Events for
Classification
Dhaval Patel, Wynne Hsu, Mong Li Lee
SIGMOD 08
1
Outline.







Introduction
Preliminaries
Augmented hierarchical representation
Interval-based event mining
Interval-based event classifier
Experiment
Conclusion
2
Introduction.



Predicts categorical class labels
Classifies data (constructs a model) based on
the training set and the values (class labels) in a
classifying attribute and uses it in classifying
new data
A Two-Step Process
Model construction
Model usage
3
Introduction.(cont)
Training data
Classification algorithm
Classification
model
Input the
questions
The answer
4
Introduction.(cont)
age     income  student  credit_rating  buys_computer
<=30    high    no       fair           no
<=30    high    no       excellent      no
31…40   high    no       fair           yes
>40     medium  no       fair           yes
>40     low     yes      fair           yes
>40     low     yes      excellent      no
31…40   low     yes      excellent      yes
<=30    medium  no       fair           no
<=30    low     yes      fair           yes
>40     medium  yes      fair           yes
<=30    medium  yes      excellent      yes
31…40   medium  no       excellent      yes
31…40   high    yes      fair           yes
>40     medium  no       excellent      no
5
Introduction.(cont)
age?
  <=30: student?
    no: no
    yes: yes
  31..40: yes
  >40: credit rating?
    excellent: no
    fair: yes
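The two-step process (model construction, then model usage) can be illustrated with a minimal sketch that hard-codes the decision tree from this slide and applies it to the 14 training rows from the previous slide. The function names `predict` and `training_accuracy` are hypothetical, and the leaf for age >40 with excellent credit is taken to be "no", which is what the training rows imply.

```python
# Training rows from the slide: (age, income, student, credit_rating, buys_computer)
ROWS = [
    ("<=30", "high", "no", "fair", "no"),
    ("<=30", "high", "no", "excellent", "no"),
    ("31..40", "high", "no", "fair", "yes"),
    (">40", "medium", "no", "fair", "yes"),
    (">40", "low", "yes", "fair", "yes"),
    (">40", "low", "yes", "excellent", "no"),
    ("31..40", "low", "yes", "excellent", "yes"),
    ("<=30", "medium", "no", "fair", "no"),
    ("<=30", "low", "yes", "fair", "yes"),
    (">40", "medium", "yes", "fair", "yes"),
    ("<=30", "medium", "yes", "excellent", "yes"),
    ("31..40", "medium", "no", "excellent", "yes"),
    ("31..40", "high", "yes", "fair", "yes"),
    (">40", "medium", "no", "excellent", "no"),
]


def predict(age, student, credit_rating):
    """Model usage: walk the decision tree and return the predicted label."""
    if age == "<=30":
        return "yes" if student == "yes" else "no"
    if age == "31..40":
        return "yes"
    # Remaining branch: age > 40, decided by credit rating.
    return "yes" if credit_rating == "fair" else "no"


def training_accuracy(rows):
    """Fraction of training rows whose label the tree reproduces."""
    correct = sum(
        predict(age, student, credit) == label
        for age, _income, student, credit, label in rows
    )
    return correct / len(rows)
```

Note that income is never consulted: the tree separates the training rows using age, student, and credit rating alone.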
6
Preliminaries.





E = (type, start, end)
EL = {E1, E2, ..., En}
The length of EL, written |EL|, is the number
of events in the list.
Composite event E = (Ei R Ej), where R is the relation between Ei and Ej
The start time of E is
min{Ei.start, Ej.start}
and its end time is
max{Ei.end, Ej.end}
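The definitions above can be sketched as a small data structure. This is only an illustration: the class name `Event` and the helper `compose` are hypothetical, and the composite event's type is rendered as a string purely for readability.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """An interval-based event E = (type, start, end)."""
    type: str
    start: float
    end: float


def compose(ei, relation, ej):
    """Build the composite event (Ei R Ej).

    Its start is min{Ei.start, Ej.start} and its end is
    max{Ei.end, Ej.end}, as defined on the slide.
    """
    return Event(
        type=f"({ei.type} {relation} {ej.type})",
        start=min(ei.start, ej.start),
        end=max(ei.end, ej.end),
    )
```

For example, composing A = (A, 1, 5) with B = (B, 3, 8) under "overlap" gives a composite event spanning 1 to 8.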
7
Augmented hierarchical representation.

Before
Meet
Finish
Contain
Equal
Overlap
Start
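The slide's figures for these relations are not in the transcript. As a hedged sketch, the function below assigns one of the seven relation names to a pair of intervals using the standard endpoint comparisons from Allen's interval algebra; the function name `relation` and the tuple representation are assumptions, not the authors' code.

```python
def relation(a, b):
    """Name the relation between two intervals given as (start, end) tuples.

    Assumes start < end for each interval. The pair is reordered so that
    the first interval starts no later than the second (ties broken by
    the earlier end), and the relation is read from first to second.
    """
    (s1, e1), (s2, e2) = sorted([a, b])
    if s1 == s2:
        # Same start: either identical intervals or the first ends earlier.
        return "equal" if e1 == e2 else "start"
    # From here on s1 < s2.
    if e1 < s2:
        return "before"
    if e1 == s2:
        return "meet"
    if e1 < e2:
        return "overlap"
    if e1 == e2:
        return "finish"
    return "contain"
```

Because the pair is normalized first, every pair of intervals maps to exactly one of the seven names.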
8
Augmented hierarchical representation(cont.)




((A overlap B) overlap C)
[Figure: cases 1 and 2, not recoverable from the transcript]
(A Overlap[0,0,0,1,0] B) Overlap[0,0,0,1,0] C
C = contain count, F = finish by count
M = meet count, O = overlap count
S = start count
9
Augmented hierarchical representation(cont.)
10
Augmented hierarchical representation(cont.)

The linear ordering of the events in the figure (not recoverable from the transcript)
is {{A+}{B+}{C+}{A−}{B−}{D+}{D−}{C−}}
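A linear ordering of this kind can be sketched by sorting the start (+) and end (-) points of the events by time. The function name `linear_ordering` and the example timestamps are hypothetical, chosen only so that the result matches the ordering on the slide. Grouping endpoints that share a timestamp into one set is an assumption about how ties are written.

```python
from itertools import groupby


def linear_ordering(events):
    """Return the endpoints of the events as an ordered list of groups.

    events maps an event name to a (start, end) tuple. "X+" marks the
    start of X and "X-" its end. Endpoints with the same timestamp are
    placed in the same group.
    """
    endpoints = []
    for name, (start, end) in events.items():
        endpoints.append((start, name + "+"))
        endpoints.append((end, name + "-"))
    endpoints.sort()
    return [
        [label for _time, label in group]
        for _time, group in groupby(endpoints, key=lambda point: point[0])
    ]


# Hypothetical timestamps that reproduce the ordering on the slide.
EXAMPLE = {"A": (1, 4), "B": (2, 5), "C": (3, 8), "D": (6, 7)}
```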
11
Interval-based event mining.


Candidate generation
Theorem.
A (k+1)-pattern is a candidate pattern if it is generated from a frequent k-pattern and a 2-pattern, where the 2-pattern occurs in at least k − 1 frequent k-patterns.
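The counting condition in the theorem can be sketched as a filter over 2-patterns. This is a simplified illustration, not the IEMiner implementation: it assumes each frequent k-pattern is available as the collection of 2-patterns it contains, and the function name `eligible_two_patterns` is hypothetical.

```python
from collections import Counter


def eligible_two_patterns(frequent_k_patterns, k):
    """Return the 2-patterns that occur in at least k - 1 frequent k-patterns.

    frequent_k_patterns is an iterable of k-patterns, each given as a
    collection of hashable 2-patterns (an assumed representation).
    """
    counts = Counter()
    for pattern in frequent_k_patterns:
        # Count each 2-pattern at most once per k-pattern.
        for two_pattern in set(pattern):
            counts[two_pattern] += 1
    return {two for two, n in counts.items() if n >= k - 1}
```

Only 2-patterns that pass this filter would be combined with a frequent k-pattern to form a candidate (k+1)-pattern.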

Dominant event
An event is the dominant event in pattern P if it occurs in P and has the latest end time
among all the events in P.
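As a minimal sketch of this definition, the dominant event is the one with the maximum end time. The function name `dominant_event` and the tuple layout (type, start, end) are assumptions; how ties between equal end times are resolved is not stated on the slide, so this sketch simply returns the first such event.

```python
def dominant_event(pattern_events):
    """Return the event with the latest end time.

    pattern_events is a non-empty list of (type, start, end) tuples.
    """
    return max(pattern_events, key=lambda event: event[2])
```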
12
Interval-based event mining(cont.)
13
Interval-based event mining(cont.)

Support count
14
IEClassifier.




Class labels Ci, 1 ≤ i ≤ c, where c is the number of
class labels
The information gain:
p(TP) is the probability that pattern TP occurs in
the dataset.
Patterns whose information gain is below a
predefined info_gain threshold are removed.
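The slide's own formula is not in the transcript. As a hedged illustration, the sketch below computes the textbook information gain of a binary feature ("the instance contains pattern TP" or not) with respect to the class labels, and applies a threshold filter. The function names and the example threshold are hypothetical, and the exact form used by the authors may differ.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum(
        (n / total) * math.log2(n / total) for n in Counter(labels).values()
    )


def information_gain(contains_pattern, labels):
    """Gain from splitting the instances on whether they contain the pattern.

    contains_pattern[i] is True if instance i contains pattern TP;
    labels[i] is the class label of instance i.
    """
    with_tp = [c for has, c in zip(contains_pattern, labels) if has]
    without_tp = [c for has, c in zip(contains_pattern, labels) if not has]
    p_tp = len(with_tp) / len(labels)  # plays the role of p(TP)
    conditional = p_tp * entropy(with_tp) + (1 - p_tp) * entropy(without_tp)
    return entropy(labels) - conditional


def keep_discriminating(gains, info_gain_threshold):
    """Keep the patterns whose gain is not below the threshold."""
    return {p for p, g in gains.items() if g >= info_gain_threshold}
```

A pattern that occurs in exactly the instances of one class has the maximum gain, while a pattern spread evenly across classes has a gain of zero and would be removed by any positive threshold.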
15
IEClassifier.(cont)

Let PatternMatch_I be the set of discriminating
patterns that are contained in I.
16
Experiment.
17
Experiment.(cont)


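Given a set of data, we sometimes want to split it into two groups
according to certain characteristics of the data. Several methods are
known to work well for this task, for example Nearest Neighbor, Neural
Networks, and Decision Trees. When used correctly, these methods reach
similar accuracy; the advantage of SVM is that it is easier to use.
We want to find a hyperplane in the space that separates the data into two groups.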
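To make the idea of a separating hyperplane concrete, here is a minimal sketch that reports which side of a hyperplane w · x + b = 0 a point falls on. It does not train an SVM; the function name `side_of_hyperplane` and the values of w and b used in any example are hypothetical.

```python
def side_of_hyperplane(w, b, x):
    """Return +1 or -1 depending on the side of w . x + b = 0 that x lies on.

    w and x are equal-length sequences of numbers; b is the offset.
    Points exactly on the hyperplane are assigned to the +1 side.
    """
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1
```

With w = (1, -1) and b = 0, the hyperplane is the line x1 = x2, and points are split according to whether their first coordinate is at least as large as their second.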
18
Conclusion.

IEMiner algorithm

IEClassifier

The performance improved

It achieved the best accuracy
19