Master’s Thesis Project for 1 or 2 students: Movie recommendation system using Clustered Low Rank Approximations and Temporal Analysis

Berkant Savas
Department of Mathematics
Linköping University
May 4, 2012

In the last five years the Netflix Prize challenge [1, 10] has attracted attention from many researchers and hobby programmers. The online movie rental company Netflix provided over 100 million ratings from 480,189 users on 17,770 movies. The challenge was to improve the recommender system of Netflix by 10%, and the winner would be awarded $1,000,000.

The recommender system may be seen as a missing data estimation problem, where the known movie ratings are stored in a user × movie matrix A. For example, if user i has rated movie j with the rating 4, then A(i, j) = a_ij = 4. The use of a low rank approximation A ≈ UΣV^T has turned out to yield good performance [8, 9]. This low rank approximation is computed only over the known entries of A.

Recently, a new approach called clustered low rank matrix approximation has been proposed [11]. Compared with traditional low rank approximations, e.g. the approximation obtained by the singular value decomposition (SVD), the clustered approach offers faster low rank computations and more accurate, more memory-efficient approximations.

In the Master’s thesis project, temporal analysis and the clustered low rank approximation approach will be applied to the missing value estimation of the Netflix problem. Since the given rating matrix A is rectangular, it will be considered as a bipartite graph, and co-clustering [3, 5, 2] will be applied to obtain a clustering of the users and a clustering of the movies. After the co-clustering step (and reordering of the users/movies according to cluster membership) the rating matrix may be partitioned as

\[
A = \begin{bmatrix}
A_{11} & \cdots & A_{1m} \\
\vdots & \ddots & \vdots \\
A_{u1} & \cdots & A_{um}
\end{bmatrix},
\]

where u denotes the number of user clusters, m denotes the number of movie clusters, and A_ij contains the ratings from users in cluster i on movies in cluster j. The rating matrix A will be approximated by computing low rank approximations of the individual blocks A_ij that are sufficiently dense.

The project consists of the following parts.

1. Analyze the Netflix data set in various respects and in particular with respect to temporal dynamics.
2. Learn about and use state of the art methods for large scale graph clustering [4, 7] and co-clustering.
3. Learn about low rank approximations and the clustered low rank approximation of matrices [6, 11].
4. Learn about low rank approximation of matrices with missing entries.
5. Implement a missing data estimation method using temporal analysis and clustered low rank approximation with missing values.
6. Investigate the effect of various parameters, e.g., ranks of the approximation, number of movie clusters, number of user clusters, on the performance of the method.
7. Compare the performance of the developed method with the performance of standard/simple missing value estimation methods.

References

[1] J. Bennett and S. Lanning. The Netflix prize. In KDD Cup and Workshop in conjunction with KDD, 2007.
[2] M. Deodhar, G. Gupta, J. Ghosh, H. Cho, and I. S. Dhillon. A scalable framework for discovering coherent co-clusters in noisy data. In Proceedings of the 26th Annual International Conference on Machine Learning, ICML ’09, pages 241–248, New York, NY, USA, 2009. ACM.
[3] I. S. Dhillon. Co-clustering documents and words using bipartite spectral graph partitioning. In Proceedings of the 7th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 269–274, New York, NY, USA, 2001. ACM.
[4] I. S. Dhillon, Y. Guan, and B. Kulis. Weighted graph cuts without eigenvectors: A multilevel approach. IEEE Trans. Pattern Anal. Mach. Intell., 29(11):1944–1957, 2007.
[5] I. S. Dhillon, S. Mallela, and D. S. Modha. Information-theoretic co-clustering. In Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’03, pages 89–98, New York, NY, USA, 2003. ACM.
[6] G. H. Golub and C. F. Van Loan. Matrix Computations. Johns Hopkins Studies in the Mathematical Sciences. Johns Hopkins University Press, Baltimore, MD, third edition, 1996.
[7] G. Karypis and V. Kumar. A fast and high quality multilevel scheme for partitioning irregular graphs. SIAM Journal on Scientific Computing, 20(1):359–392, 1998.
[8] Y. Koren. Collaborative filtering with temporal dynamics. In Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’09, pages 447–456, New York, NY, USA, 2009. ACM.
[9] Y. Koren, R. Bell, and C. Volinsky. Matrix factorization techniques for recommender systems. Computer, 42:30–37, August 2009.
[10] S. Lohr. A $1 million research bargain for Netflix, and maybe a model for others. The New York Times, September 22:B1, 2009. September, 21.
[11] B. Savas and I. S. Dhillon. Clustered low rank approximation of graphs in information science applications. In Proceedings of the SIAM International Conference on Data Mining (SDM), pages 164–175, 2011.
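Illustrative sketch. The block-wise idea in the project description (partition the rating matrix by user and movie clusters, then fit a low rank model to each sufficiently dense block using only its known entries) can be tried on a toy example. The Python code below is a minimal sketch under several assumptions and is not the method of [11]: the cluster labels are given by hand instead of being computed by co-clustering, a regularized factorization fitted by stochastic gradient descent on mean-centered ratings stands in for an SVD-type approximation, temporal effects are ignored, and the names factorize_known, fit_clustered and min_density, as well as the data, are hypothetical.

```python
import random
from collections import defaultdict


def factorize_known(ratings, rank=2, epochs=200, lr=0.02, reg=0.02, seed=0):
    """Fit r_ij ~ mean + p_i . q_j by SGD over the known entries only.

    ratings maps (user, movie) -> rating; returns a predict(user, movie) function.
    """
    rng = random.Random(seed)
    mean = sum(ratings.values()) / len(ratings)
    users = sorted({i for i, _ in ratings})
    movies = sorted({j for _, j in ratings})
    p = {i: [rng.uniform(-0.1, 0.1) for _ in range(rank)] for i in users}
    q = {j: [rng.uniform(-0.1, 0.1) for _ in range(rank)] for j in movies}
    entries = sorted(ratings.items())
    for _ in range(epochs):
        rng.shuffle(entries)
        for (i, j), r in entries:
            pi, qj = p[i], q[j]
            err = r - mean - sum(a * b for a, b in zip(pi, qj))
            for k in range(rank):
                pik, qjk = pi[k], qj[k]
                pi[k] += lr * (err * qjk - reg * pik)
                qj[k] += lr * (err * pik - reg * qjk)

    def predict(i, j):
        # Users or movies without any known rating fall back to the mean.
        if i in p and j in q:
            return mean + sum(a * b for a, b in zip(p[i], q[j]))
        return mean

    return predict


def fit_clustered(ratings, user_cluster, movie_cluster, min_density=0.5, **kwargs):
    """Fit one low rank model per block A_ij whose density is >= min_density."""
    global_mean = sum(ratings.values()) / len(ratings)
    # Split the known ratings into blocks indexed by (user cluster, movie cluster).
    blocks = defaultdict(dict)
    for (i, j), r in ratings.items():
        blocks[user_cluster[i], movie_cluster[j]][i, j] = r
    # Cluster sizes are needed to compute the density of each block.
    n_users, n_movies = defaultdict(int), defaultdict(int)
    for c in user_cluster.values():
        n_users[c] += 1
    for c in movie_cluster.values():
        n_movies[c] += 1
    models = {}
    for (cu, cm), block in blocks.items():
        density = len(block) / (n_users[cu] * n_movies[cm])
        if density >= min_density:
            models[cu, cm] = factorize_known(block, **kwargs)

    def predict(i, j):
        # Blocks that were too sparse to model fall back to the global mean.
        model = models.get((user_cluster[i], movie_cluster[j]))
        return model(i, j) if model is not None else global_mean

    return predict, models


# Hypothetical toy data: 4 users, 4 movies, one rating hidden per user.
ratings = {
    (0, 0): 5, (0, 2): 1, (0, 3): 1,
    (1, 0): 5, (1, 1): 5, (1, 2): 1,
    (2, 0): 1, (2, 1): 1, (2, 2): 5,
    (3, 0): 1, (3, 2): 5, (3, 3): 5,
}
user_cluster = {0: 0, 1: 0, 2: 1, 3: 1}   # assumed output of co-clustering
movie_cluster = {0: 0, 1: 0, 2: 1, 3: 1}  # assumed output of co-clustering

predict, models = fit_clustered(ratings, user_cluster, movie_cluster, rank=1)
for i, j in [(0, 1), (1, 3), (2, 3), (3, 1)]:
    print("user", i, "movie", j, "estimate", round(predict(i, j), 2))
```

In this toy setting each block holds three of its four possible ratings, so all four blocks pass the density threshold and each hidden rating is estimated from its own block. Raising min_density above 0.75 makes every block fall back to the global mean, which is one simple baseline for the comparison in part 7.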