Multiobjective Clustering with Automatic k-determination for Large-scale Data

Presenter: Shao-Wei Cheng
Authors: Nobukazu Matake, Tomoyuki Hiroyasu, Mitsunori Miki, Tomoharu Senda
GECCO 2007
National Yunlin University of Science and Technology (N.Y.U.S.T. I. M.), Intelligent Database Systems Lab

Outline
  Motivation
  Objective
  Methodology
    Original MOCK
    New scalable k-determination scheme
  Experiments and Results
  Conclusion
  Personal Comments

Motivation
  Web behavior mining has attracted a great deal of attention.
  MOCK is powerful and strict, but its computational cost is too high when it is applied to clustering huge data sets.
  Too much data!

Objectives
  Apply MOCK to web data clustering with a scalable automatic k-determination scheme.
  Determine the appropriate k at low cost.
  It contains two complementary objectives:
    Determine the appropriate k.
    Find the partition into k clusters.

Methodology: Original MOCK
  [Figure: the original MOCK procedure, labeled First Step, Second Step, Third Step, and Fourth Step; the diagram includes the Gap statistic.]

Methodology: New scalable k-determination scheme
  [Figure: the new procedure, labeled First Step and Second Step.]
  First scheme: calculate adjacent angles. [Figure with axes x and y.]
  Second scheme. [Figure with axis label x.]

Experiments

Conclusion
  The new scheme is able to determine the appropriate k at low cost, although its performance is poorer than that of the original algorithm.
  It reduces the Pareto size by about 50-70%.
  It does not need random data clustering.

Personal Comments
  Advantage: MOCK can be applied to large-scale data.
  Drawback:
  Application: Web data.
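
Note on the first scheme: the Methodology slide says only "calculate adjacent angles", so the exact rule used by the authors is not recoverable from these slides. The sketch below shows one plausible reading, under stated assumptions: each Pareto solution is a hypothetical tuple `(k, f1, f2)` holding its cluster count and two minimized objective values, both objectives are rescaled to [0, 1], and the chosen k is the one at the interior point where the angle between the segments to its two neighbors is smallest (the sharpest bend in the front). The function name `knee_by_adjacent_angles` and the selection rule are illustrative, not taken from the paper.

```python
import math


def knee_by_adjacent_angles(front):
    """Pick k at the sharpest bend of a two-objective Pareto front.

    front: list of (k, f1, f2) tuples (hypothetical layout).
    Assumption: the "adjacent angle" at a point is the angle between
    the segments joining it to its left and right neighbors.
    """
    if len(front) < 3:
        raise ValueError("need at least three Pareto solutions")

    # Order the solutions along the first objective.
    pts = sorted(front, key=lambda s: s[1])

    def scale(values):
        # Rescale to [0, 1] so both objectives weigh equally.
        lo, hi = min(values), max(values)
        span = hi - lo
        return [(v - lo) / span if span else 0.0 for v in values]

    x = scale([p[1] for p in pts])
    y = scale([p[2] for p in pts])

    best_k, best_angle = None, math.inf
    for i in range(1, len(pts) - 1):
        # Vectors from the current point to each neighbor.
        ax, ay = x[i - 1] - x[i], y[i - 1] - y[i]
        bx, by = x[i + 1] - x[i], y[i + 1] - y[i]
        na, nb = math.hypot(ax, ay), math.hypot(bx, by)
        if na == 0 or nb == 0:
            continue  # duplicate point, no angle defined
        cos_t = (ax * bx + ay * by) / (na * nb)
        cos_t = max(-1.0, min(1.0, cos_t))  # guard against rounding
        angle = math.acos(cos_t)
        if angle < best_angle:
            best_k, best_angle = pts[i][0], angle
    return best_k
```

A rule of this kind needs only the Pareto front itself, which is consistent with the Conclusion's remark that the new scheme does not need random data clustering, whereas the Gap statistic in the original MOCK compares against reference data.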