Fast Mining of Distance-Based Outliers in High-Dimensional Datasets
Amol Ghoting, Srinivasan Parthasarathy, Matthew Eric Otey
Data Mining and Knowledge Discovery, Vol. 16, No. 3, 2008
Reporter : CHENG-WEI, CHOU
Jan. 13 2010
Group members:
89721002 周政緯
陳永洲





2017/5/5
Introduction
Distance-based outlier detection
Outlier detection algorithm
Experiment results
Conclusion

A common problem: automatically finding outliers in a dataset

Outliers: points that are highly unlikely to occur

A measure of unusualness: a point's distance to its nearest neighbors

On high-dimensional data, existing algorithms do not
perform well

This paper further improves the scaling behavior of
distance-based outlier detection on large,
high-dimensional datasets
Three popular definitions of distance-based outliers:
Outliers are the data points for which there are
fewer than p other data points within distance d
Outliers are the top n data points whose distance to
their kth nearest neighbor is greatest
Outliers are the top n data points whose average
distance to their k nearest neighbors is greatest
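The three definitions above can be compared side by side with a brute-force sketch. The function names, the use of Euclidean distance, and the tie-breaking by index are assumptions made for illustration; the slides do not prescribe an implementation.

```python
import math

def knn_distances(points, i, k):
    """Sorted distances from points[i] to its k nearest other points."""
    dists = sorted(math.dist(points[i], q)
                   for j, q in enumerate(points) if j != i)
    return dists[:k]

def top_n(scores, n):
    """Indices of the n largest scores, largest first (ties keep index order)."""
    return sorted(range(len(scores)), key=lambda i: -scores[i])[:n]

def outliers_by_count(points, p, d):
    """Definition 1: fewer than p other points lie within distance d."""
    result = []
    for i, x in enumerate(points):
        close = sum(1 for j, q in enumerate(points)
                    if j != i and math.dist(x, q) <= d)
        if close < p:
            result.append(i)
    return result

def outliers_by_kth_distance(points, k, n):
    """Definition 2: top n points by distance to the k-th nearest neighbor."""
    scores = [knn_distances(points, i, k)[-1] for i in range(len(points))]
    return top_n(scores, n)

def outliers_by_mean_distance(points, k, n):
    """Definition 3: top n points by average distance to the k nearest neighbors."""
    scores = [sum(knn_distances(points, i, k)) / k for i in range(len(points))]
    return top_n(scores, n)
```

Each function costs a quadratic number of distance computations, which is the cost the pruning and binning techniques in the rest of the talk try to reduce.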
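NL (nested-loop) algorithm: the best-performing existing approach in
high-dimensional spaces
For each data point in the dataset D, scan the dataset and keep
track of its k closest neighbors
Maintain a cutoff threshold c: the distance from the weakest
outlier found so far to its kth nearest neighbor
If the distance from a data point to its kth closest neighbor found so far
drops below c, the data point can no longer be an outlier and is pruned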
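The nested-loop scan with cutoff pruning can be sketched as follows. This is a minimal in-memory illustration, assuming Euclidean distance and the "top n by kth-nearest-neighbor distance" definition; the function name and the heap-based bookkeeping are choices made here, not details taken from the slides.

```python
import heapq
import math

def nested_loop_outliers(points, k, n):
    """Top-n outliers by k-th nearest-neighbor distance, with cutoff pruning."""
    top = []      # min-heap of (score, index): the best n outliers so far
    cutoff = 0.0  # c: score of the weakest outlier once `top` holds n points
    for i, x in enumerate(points):
        neighbors = []  # negated distances: a max-heap of the k smallest so far
        pruned = False
        for j, y in enumerate(points):
            if i == j:
                continue
            dist = math.dist(x, y)
            if len(neighbors) < k:
                heapq.heappush(neighbors, -dist)
            elif dist < -neighbors[0]:
                heapq.heapreplace(neighbors, -dist)
            # With k neighbors known, -neighbors[0] is an upper bound on the
            # true k-th NN distance; below c, x cannot reach the top n.
            if len(neighbors) == k and -neighbors[0] < cutoff:
                pruned = True
                break
        if pruned or len(neighbors) < k:
            continue
        score = -neighbors[0]
        if len(top) < n:
            heapq.heappush(top, (score, i))
        elif score > top[0][0]:
            heapq.heapreplace(top, (score, i))
        if len(top) == n:
            cutoff = top[0][0]
    return [i for _, i in sorted(top, reverse=True)]
```

Pruning only helps once close neighbors are encountered early in the inner scan, which is why the order in which candidate neighbors are visited matters so much for this family of algorithms.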
RBRP (Recursive Binning and Re-Projection)
A two-phase algorithm for fast mining of distance-based outliers in high-dimensional datasets
Finds the top n outliers in the dataset, i.e., the points whose
distance to their kth nearest neighbor is the
greatest

First phase of RBRP

Goal : to partition the dataset into bins

Points that are close to each other in space are
likely to be assigned to the same bin

A recursive procedure similar to divisive
hierarchical clustering

Second phase of RBRP : Use an extension of the NL
algorithm to find outliers in the dataset
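The first phase can be illustrated with a small recursive partitioning sketch. The split rule used here (pick two points at random as centers and assign every point to the nearer one) is a hypothetical stand-in for the clustering step and is not taken from the slides; `recursive_bins` and `max_bin_size` are likewise illustrative names.

```python
import math
import random

def recursive_bins(points, indices=None, max_bin_size=4, rng=None):
    """Recursively split point indices into bins of at most max_bin_size."""
    rng = rng or random.Random(0)  # fixed seed only to keep the sketch repeatable
    if indices is None:
        indices = list(range(len(points)))
    if len(indices) <= max_bin_size:
        return [indices]
    # Hypothetical split rule: two random centers, nearest-center assignment.
    a, b = rng.sample(indices, 2)
    left, right = [], []
    for i in indices:
        to_a = math.dist(points[i], points[a])
        to_b = math.dist(points[i], points[b])
        (left if to_a <= to_b else right).append(i)
    if not right:
        # Degenerate split (the two centers coincide): stop and keep one bin.
        return [indices]
    return (recursive_bins(points, left, max_bin_size, rng)
            + recursive_bins(points, right, max_bin_size, rng))
```

Because nearby points tend to share a bin, a second phase could scan a point's own bin first so that close neighbors are found early and the cutoff prunes sooner. That scan order is an assumption here, since this transcript does not spell out how the NL algorithm is extended.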

Time Complexity of Phase 1 :
$T(N) = T(N - m) + T(m) + \Theta(N)$

Worst case :

Best case :
Average case:
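Reading the Phase 1 recurrence the way the quicksort recurrence is usually read, the three cases can be worked out as below. This is a sketch under stated assumptions: m is taken to be the size of one part of a two-way split, the $\Theta(N)$ term is the cost of one partitioning pass, and the average case assumes m is uniformly distributed. The bounds shown on the original slides are not preserved in this transcript.

```latex
\begin{align*}
\text{Most unbalanced split } (m = 1):\quad
  T(N) &= T(N-1) + T(1) + \Theta(N) = \Theta(N^2) \\
\text{Balanced split } (m = N/2):\quad
  T(N) &= 2\,T(N/2) + \Theta(N) = \Theta(N \log N) \\
\text{Uniformly random } m:\quad
  \mathbb{E}[T(N)] &= \Theta(N \log N)
\end{align*}
```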
Presented RBRP, a two-phase algorithm for distance-based outlier detection
RBRP improves upon the scaling behavior of the
state-of-the-art algorithms
Theoretical arguments for its scaling behavior are provided
Empirical results on real data back these arguments,
showing a significant speedup over ORCA
Thank you!!