Summary
Team members: Weiqian Yan, Kanchan Khurad, and Yi Yue
Paper: A Monte Carlo Algorithm for Fast Projective Clustering
Related papers:
[1] Cecilia M. Procopiuc, Michael Jones, Pankaj K. Agarwal, and T. M. Murali. [A Monte Carlo Algorithm for Fast Projective Clustering]
[2] Clark F. Olson and Henry J. Lyons. [Simple and efficient projective clustering]
[3] Vladimír Ljubopytnov and Jaroslav Pokorný. [Monte Carlo projective clustering of texts]
[4] Shmuel Friedland, Amir Niknejad, Mostafa Kaveh, and Hossein Zare. [Fast Monte Carlo low rank approximations for matrices]
[5] Bernd A. Berg. [Introduction to Markov Chain Monte Carlo Simulations and their Statistical Analysis]
[6] D. J. C. MacKay. [Introduction to Monte Carlo Methods]
[7] Malvin H. Kalos and Paula A. Whitlock. [Monte Carlo Methods]
The Monte Carlo method is both interesting and useful. Its core idea is to learn about a system
by simulating it with random sampling, and it is often the simplest way to solve a problem. A
Monte Carlo method is any process that consumes random numbers: each calculation is a
numerical experiment, and the sources of error must be controllable. Clustering is the task of
grouping a set of objects in such a way that objects in the same group are more similar to each
other than to those in other groups. It is a main task of exploratory data mining and a common
technique for statistical data analysis. Two questions follow: are the computed clusters good
with high probability, and is this method more accurate than previous approaches?
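The random-sampling idea described above can be illustrated with a standard textbook example that is not taken from the papers listed here: estimating pi by drawing random points in the unit square and counting how many fall inside the quarter circle. The function name `estimate_pi` and its parameters are chosen only for this sketch. Each call is one numerical experiment, and the error is controllable in the sense that the standard error of the estimate shrinks roughly as 1/sqrt(N) with the number of samples N.

```python
import random


def estimate_pi(num_samples, seed=0):
    """Estimate pi from uniform random points in the unit square.

    Hypothetical helper for illustration only; the name and the
    parameters are assumptions, not part of any cited paper.
    """
    rng = random.Random(seed)  # fixed seed makes the experiment repeatable
    inside = 0
    for _ in range(num_samples):
        x, y = rng.random(), rng.random()
        # The point lies inside the quarter circle of radius 1.
        if x * x + y * y <= 1.0:
            inside += 1
    # Quarter-circle area is pi/4, so scale the hit fraction by 4.
    return 4.0 * inside / num_samples


if __name__ == "__main__":
    # More samples generally give a tighter estimate.
    for n in (100, 10_000, 100_000):
        print(n, estimate_pi(n))
```

Running the loop with increasing sample counts shows the typical trade-off: more random numbers consumed, smaller expected error.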
The paper, A Monte Carlo Algorithm for Fast Projective Clustering, proposes two novel approaches
to approximating optimal clusters in high-dimensional data spaces. As earlier research has shown,
clustering methods that work well in low-dimensional spaces do not work well in high-dimensional
spaces, because full-dimensional distance is almost irrelevant in moderate- to high-dimensional
spaces. The paper first proposes a Monte Carlo algorithm for projective clustering.
The algorithm randomly selects parameters in each iteration and outputs the best cluster found.
Although the theoretical running time of this algorithm is satisfactory (it is linear), the large
amount of data in high-dimensional problems still makes the algorithm very time-consuming. The
paper therefore proposes speeding up the algorithm with a heuristic, so that the whole data set
does not need to be scanned in every iteration. This approach speeds up the algorithm but
loses some quality guarantees; the generated clusters are still relevant in most practical cases.
Applications of the algorithms include image recognition, in particular recognizing
rotated faces.
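The "randomly select, then keep the best cluster" loop described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's exact procedure: the names `width`, `alpha`, `beta`, `outer_trials`, `inner_trials`, and `sample_size`, the trial counts, and the scoring rule (cluster size multiplied by `(1 / beta)` per relevant dimension) are all choices made for this sketch. A random pivot point and a small random sample decide which dimensions look relevant, and the cluster is every point that stays within `width` of the pivot on those dimensions.

```python
import random


def monte_carlo_projective_cluster(points, width, alpha, beta,
                                   outer_trials=20, inner_trials=20,
                                   sample_size=3, seed=0):
    """Return (pivot, dims, members) for the best cluster found.

    Illustrative sketch only; parameter names and the scoring rule
    are assumptions made for this example.
    """
    rng = random.Random(seed)
    n, d = len(points), len(points[0])
    best_score, best = 0.0, (None, [], [])
    for _ in range(outer_trials):
        pivot = rng.choice(points)  # random candidate cluster center
        for _ in range(inner_trials):
            sample = rng.sample(points, sample_size)
            # A dimension is kept if every sampled point is close to the pivot on it.
            dims = [j for j in range(d)
                    if all(abs(q[j] - pivot[j]) <= width for q in sample)]
            # Full scan of the data set: the step a speed-up heuristic would avoid.
            members = [i for i, q in enumerate(points)
                       if all(abs(q[j] - pivot[j]) <= width for j in dims)]
            if len(members) < alpha * n:
                continue  # too small to count as a cluster
            # Assumed quality measure: reward both size and number of dimensions.
            score = len(members) * (1.0 / beta) ** len(dims)
            if score > best_score:
                best_score, best = score, (pivot, dims, members)
    return best


def make_demo_points(seed=1):
    """Hypothetical test data: 60 points clustered on dims 0 and 1, plus 40 noise points."""
    rng = random.Random(seed)
    cluster = [(5 + rng.uniform(-0.2, 0.2), 5 + rng.uniform(-0.2, 0.2),
                rng.uniform(0, 100), rng.uniform(0, 100)) for _ in range(60)]
    noise = [tuple(rng.uniform(20, 100) for _ in range(4)) for _ in range(40)]
    return cluster + noise


if __name__ == "__main__":
    pivot, dims, members = monte_carlo_projective_cluster(
        make_demo_points(), width=0.5, alpha=0.3, beta=0.25)
    print("relevant dimensions:", dims, "cluster size:", len(members))
```

In this sketch every inner trial scans all points to build `members`, which is the cost the summary attributes to the first algorithm; a heuristic variant would defer or skip that scan at the price of weaker guarantees.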
Work related to the Monte Carlo algorithm for fast projective clustering has tried to improve on
the existing algorithm. The paper "Simple and efficient projective clustering"
proposes modifications to the method of determining the dimensions and the method of
determining the points in the cluster. Another paper, "Monte Carlo projective clustering of texts",
adapts the existing algorithm to work on texts. The paper "Fast Monte Carlo low
rank approximations for matrices" treats fast rank-k approximation of matrices as an
important tool in projective clustering when the data is represented in matrix form, where the
columns of the matrix represent points and the rows represent the coordinates.
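To make the matrix layout concrete, the sketch below stores three 2-dimensional points as the columns of a 2 x 3 matrix and computes a rank-1 approximation of it. Note the swap: this uses plain deterministic power iteration, not the Monte Carlo method of the cited paper, and it only handles rank 1 rather than general rank k. The function names are assumptions made for this example. It also assumes the all-ones start vector is not orthogonal to the leading singular direction and that the matrix is nonzero.

```python
import math


def matvec(A, v):
    """Multiply matrix A (list of rows) by vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def transpose(A):
    """Return the transpose of A as a list of rows."""
    return [list(col) for col in zip(*A)]


def rank_one_approximation(A, iterations=100):
    """Approximate A by sigma * u * v^T using power iteration on A^T A.

    Deterministic stand-in for illustration; not a randomized method.
    """
    At = transpose(A)
    v = [1.0] * len(A[0])  # assumed start vector
    for _ in range(iterations):
        w = matvec(At, matvec(A, v))
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]  # estimate of the leading right singular vector
    Av = matvec(A, v)
    sigma = math.sqrt(sum(x * x for x in Av))  # leading singular value
    u = [x / sigma for x in Av]  # leading left singular vector
    return [[sigma * ui * vj for vj in v] for ui in u]


if __name__ == "__main__":
    # Columns are the points (1, 2), (2, 4), (3, 6); rows are the coordinates.
    A = [[1.0, 2.0, 3.0],
         [2.0, 4.0, 6.0]]
    for row in rank_one_approximation(A):
        print([round(x, 6) for x in row])
```

Because the three example points lie on one line through the origin, the matrix already has rank 1, so the approximation reproduces it up to rounding.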