Summary

Team members: Weiqian Yan, Kanchan Khurad, and Yi Yue

Paper: A Monte Carlo Algorithm for Fast Projective Clustering

Related papers:
[1] Cecilia M. Procopiuc, Michael Jones, Pankaj K. Agarwal, and T. M. Murali. [A Monte Carlo Algorithm for Fast Projective Clustering]
[2] Clark F. Olson and Henry J. Lyons. [Simple and efficient projective clustering]
[3] Vladimír Ljubopytnov and Jaroslav Pokorný. [Monte Carlo projective clustering of texts]
[4] Shmuel Friedland, Amir Niknejad, Mostafa Kaveh, and Hossein Zare. [Fast Monte Carlo low rank approximations for matrices]
[5] Bernd A. Berg. [Introduction to Markov Chain Monte Carlo Simulations and their Statistical Analysis]
[6] D. J. C. MacKay. [Introduction to Monte Carlo Methods]
[7] Malvin H. Kalos and Paula A. Whitlock. [Monte Carlo Methods]

The Monte Carlo method is both interesting and useful. Its core idea is to learn about a system by simulating it with random sampling, which is often the simplest way to solve a problem. A Monte Carlo method is any process that consumes random numbers: each calculation is a numerical experiment, and its sources of error must be controllable.

Clustering is the task of grouping a set of objects so that objects in the same group are more similar to each other than to those in other groups. It is a main task of exploratory data mining and a common technique for statistical data analysis. Two questions arise, however: are the computed clusters good with high probability, and is the method more accurate than previous approaches?

The paper, A Monte Carlo Algorithm for Fast Projective Clustering, proposes two novel approaches to approximating optimal clusters in high-dimensional data spaces. As earlier research has shown, clustering methods that work well in low-dimensional spaces do not work well in high-dimensional spaces, because full-dimensional distance is almost irrelevant in moderate- to high-dimensional spaces.
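A small random-sampling experiment can illustrate why full-dimensional distance loses meaning as the dimension grows. The sketch below is not taken from the paper: the function name relative_contrast, the uniform unit-cube data, and the sample sizes are all assumptions chosen for illustration. It measures how much farther the farthest point is than the nearest one from a random query point, relative to the nearest distance.

```python
import math
import random


def relative_contrast(n_points, dim, rng):
    """Return (d_max - d_min) / d_min for the distances from one random
    query point to n_points uniform random points in the cube [0, 1]^dim."""
    query = [rng.random() for _ in range(dim)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        dists.append(math.dist(query, point))  # Euclidean distance
    return (max(dists) - min(dists)) / min(dists)


rng = random.Random(0)  # fixed seed so the experiment is repeatable
contrast_by_dim = {d: relative_contrast(500, d, rng) for d in (2, 20, 200)}
for d, c in contrast_by_dim.items():
    print(f"dim={d:4d}  relative contrast={c:.2f}")
```

Under these assumptions the contrast should shrink as the dimension increases, meaning the nearest and farthest points become hard to tell apart by full-dimensional distance alone. This is the motivation for searching for clusters in a subset of the dimensions.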
The paper first proposes a Monte Carlo algorithm for projective clustering. The algorithm randomly selects parameters in each iteration and outputs the best cluster found. Although the theoretical running time of this algorithm is satisfactory (it is linear), the large amount of data in high-dimensional problems still makes the algorithm very time consuming. The paper therefore proposes to speed up the algorithm with a heuristic, so that the whole data set does not need to be scanned in every iteration. This approach speeds up the algorithm but loses some quality guarantees; the generated clusters are still relevant in most practical cases. Applications of the algorithms include image recognition, in particular recognizing rotated faces.

Related work on the Monte Carlo algorithm for fast projective clustering has tried to improve the existing algorithm. The paper "Simple and efficient projective clustering" proposes modifications to the method of determining the dimensions and the method of determining the points in the cluster. Another paper, "Monte Carlo projective clustering of texts", adapts the existing algorithm to work on texts. The paper "Fast Monte Carlo low rank approximations for matrices" presents fast rank-k approximation of matrices as an important tool in projective clustering when the data is represented in matrix form, where the columns of the matrix represent points and the rows represent coordinates.
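The first algorithm summarized above, which draws random parameters in each iteration and keeps the best cluster, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name monte_carlo_projective_cluster, the parameters w, beta, and sample_size, the scoring rule, and the synthetic data are all hypothetical choices made here. Each iteration picks a random pivot point and a small random sample, keeps the dimensions in which the whole sample stays within w of the pivot, and then scans the data for every point that is within w of the pivot in those dimensions.

```python
import random


def monte_carlo_projective_cluster(points, w, beta=0.5, sample_size=3,
                                   iterations=200, seed=0):
    """Return (cluster_indices, dims) for the best-scoring random trial."""
    rng = random.Random(seed)
    n_dims = len(points[0])
    best_score, best_cluster, best_dims = 0.0, [], []
    for _ in range(iterations):
        pivot = rng.choice(points)
        sample = rng.sample(points, sample_size)
        # Keep the dimensions in which every sampled point is close to the pivot.
        dims = [d for d in range(n_dims)
                if all(abs(q[d] - pivot[d]) <= w for q in sample)]
        if not dims:
            continue
        # Full scan: collect all points close to the pivot in those dimensions.
        cluster = [i for i, q in enumerate(points)
                   if all(abs(q[d] - pivot[d]) <= w for d in dims)]
        # Assumed score: favor large clusters, and multiply by 1/beta
        # for each dimension in which the cluster is bounded.
        score = len(cluster) * (1.0 / beta) ** len(dims)
        if score > best_score:
            best_score, best_cluster, best_dims = score, cluster, dims
    return best_cluster, best_dims


# Synthetic data (an assumption): 40 points that agree only in dimensions
# 0 and 1, followed by 20 uniform background points, all in 5 dimensions.
data_rng = random.Random(1)
clustered = [[50 + data_rng.uniform(-0.2, 0.2), 50 + data_rng.uniform(-0.2, 0.2)]
             + [data_rng.uniform(0, 100) for _ in range(3)] for _ in range(40)]
noise = [[data_rng.uniform(0, 100) for _ in range(5)] for _ in range(20)]
points = clustered + noise

cluster, dims = monte_carlo_projective_cluster(points, w=0.5)
print("dimensions:", dims, "cluster size:", len(cluster))
```

The full scan inside the loop is the expensive step on large data sets. A heuristic variant of the kind described above would avoid repeating that scan in every iteration, trading some of the quality guarantee for speed.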