Download Parallel Monte-Carlo Tree Search

Amol Ghoting, Srinivasan Parthasarathy, Matthew Eric Otey
Data Mining and Knowledge Discovery, Vol. 16, No. 3, 2008
Reporter: CHENG-WEI, CHOU
Jan. 13, 2010
Group members: 89721002 周政緯, 陳永洲
2017/5/5

Outline
- Introduction
- Distance-based outlier detection
- Outlier detection algorithm
- Experiment results
- Conclusion

Introduction
- A common problem: automatically finding outliers
- Outliers: points that are highly unlikely to occur
- A measure of unusualness: a point's distance to its nearest neighbors
- On high-dimensional data, existing algorithms perform poorly
- This paper further improves the scaling behavior of distance-based outlier detection on large, high-dimensional datasets

Distance-based outlier detection
Three popular definitions of distance-based outliers:
- Outliers are the data points for which there are fewer than p other data points within distance d
- Outliers are the top n data points whose distance to their kth nearest neighbor is greatest
- Outliers are the top n data points whose average distance to their k nearest neighbors is greatest

NL (nested loop) algorithm: the best-performing approach in high-dimensional spaces
- For each data point in D, scan the dataset and keep track of its k closest neighbors
- Maintain a cutoff threshold, c
- If the distance to a data point's kth closest neighbor falls below c, the data point can no longer be an outlier

Outlier detection algorithm
RBRP (Recursive Binning and Re-Projection)
- A two-phase algorithm for fast mining of distance-based outliers in high-dimensional datasets
- Finds the top n outliers in the dataset whose distance to their kth nearest neighbor is the greatest

First phase of RBRP
- Goal: to partition the dataset into bins
- Points that are close to each other in space are likely to be assigned to the same bin
- A recursive procedure similar to divisive hierarchical clustering

Second phase of RBRP
- Use an extension of the NL algorithm to find outliers in the dataset

Time complexity of Phase 1
- $T(N) = T(N - m) + T(m) + \Theta(N)$
- Worst case: [formula not recovered]
- Best case: [formula not recovered]
- Average case: [formula not recovered]

Experiment results
[Slides 18-27: result figures, not recovered]

Conclusion
- Presented RBRP
- RBRP improves upon the scaling behavior of the state of the art
- Provided theoretical arguments for this improvement
- Validated its scaling behavior
- Empirical results on real data back the above claim
- RBRP realizes a significant speedup over ORCA

Thank you!!
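
The NL (nested loop) algorithm with a cutoff threshold, as described in the distance-based outlier detection slides, can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function name `top_n_outliers`, the use of Euclidean distance, and the heap-based bookkeeping are assumptions made for the example. It uses the second definition from the slides (top n points by distance to the kth nearest neighbor).

```python
import heapq
import math


def top_n_outliers(points, k, n):
    """Return [(score, index)] for the n points with the largest k-th NN distance.

    Hypothetical helper for illustration; Euclidean distance is an assumption.
    """
    top = []       # min-heap of (score, index); top[0] is the weakest outlier kept
    cutoff = 0.0   # c: score a point must exceed to enter the current top n
    for i, p in enumerate(points):
        neighbors = []  # negated distances: a max-heap of the k smallest seen so far
        pruned = False
        for j, q in enumerate(points):
            if i == j:
                continue
            d = math.dist(p, q)
            if len(neighbors) < k:
                heapq.heappush(neighbors, -d)
            elif d < -neighbors[0]:
                heapq.heapreplace(neighbors, -d)
            # The k-th NN distance can only shrink as the scan continues, so once
            # it falls below the cutoff this point cannot be a top-n outlier.
            if len(neighbors) == k and -neighbors[0] < cutoff:
                pruned = True
                break
        if pruned or len(neighbors) < k:
            continue
        score = -neighbors[0]  # distance to the k-th nearest neighbor
        if len(top) < n:
            heapq.heappush(top, (score, i))
        elif score > top[0][0]:
            heapq.heapreplace(top, (score, i))
        if len(top) == n:
            cutoff = top[0][0]
    return sorted(top, reverse=True)
```

For example, `top_n_outliers([(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)], k=2, n=1)` reports the point at index 4 as the single strongest outlier. The pruning step is what the binning in RBRP's first phase is meant to help: if a point's near neighbors are examined early in the scan, its kth-neighbor distance drops below c sooner and the inner loop ends earlier.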
