Cache misses analysis by means of data mining methods
Pavel Kordík, Ivan Šimeček
kordikp, [email protected]
Department of Computer Science, Faculty of Electrical Engineering, Czech Technical University, Technická 2, 166 27 Prague 6, Czech Republic

Predicting cache behavior is difficult even for a simple program (for details see [1, 2]), because every modern CPU uses a complex memory hierarchy consisting of several levels of cache memories. One particularly challenging task is to predict the exact number of cache misses during sparse matrix-vector multiplication (SpMV for short). Due to matrix sparsity, the memory access patterns are irregular, cache utilization suffers from low spatial and temporal locality, and the cache behavior is hard to predict across all combinations of input parameters.

The cache miss data were also analyzed by means of data mining methods. This is the main topic of this paper, and we discuss the data mining analysis below in more detail. First, the data had to be preprocessed: they were transformed into the native format of the data mining application WEKA [3], in which almost all experiments were performed. We tried to predict the number of cache misses from input variables (number of read operations, matrix size, bandwidth, etc.). Data mining methods from the categories of decision trees, Bayes classifiers, and neural networks were used; a detailed description of these methods can be found in [4]. Because WEKA is designed mainly for classification problems, we divided the output attribute “number of cache misses” into 10 intervals (classes). We achieved only 62% classification accuracy with Bayes-based methods (Bayes Net, Naive Bayes Simple, etc.); the other methods were unable to produce any results because of their memory demands. When we investigated why the performance was so low, we found that the data needed further preprocessing.
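To illustrate why SpMV cache behavior is hard to predict, the following sketch counts misses on accesses to the source vector x during a CSR-format SpMV, using a toy direct-mapped cache model. This is only an illustrative analog of the kind of cache analyzer discussed here, not the one used in the paper; the cache parameters, function names, and the tiny matrix are all made up.

```python
# Illustrative sketch: count cache misses on x during CSR SpMV
# with a toy direct-mapped cache (parameters are assumptions; real
# caches have far more lines and longer cache lines).

LINE_SIZE = 1   # elements per cache line (tiny, to make misses visible)
NUM_LINES = 2   # lines in the modelled cache

def spmv_misses(values, col_idx, row_ptr, x):
    """Compute y = A*x for A in CSR format; count misses on accesses to x."""
    cache = [None] * NUM_LINES           # stored tag per direct-mapped slot
    misses = 0
    y = [0.0] * (len(row_ptr) - 1)
    for row in range(len(row_ptr) - 1):
        for k in range(row_ptr[row], row_ptr[row + 1]):
            col = col_idx[k]
            tag = col // LINE_SIZE       # cache line holding x[col]
            slot = tag % NUM_LINES
            if cache[slot] != tag:       # irregular columns cause evictions
                cache[slot] = tag
                misses += 1
            y[row] += values[k] * x[col]
    return y, misses

# 3x3 matrix with scattered nonzeros: [[1,0,2],[0,3,0],[4,0,5]]
vals = [1.0, 2.0, 3.0, 4.0, 5.0]
cols = [0, 2, 1, 0, 2]
ptr = [0, 2, 3, 5]
y, m = spmv_misses(vals, cols, ptr, [1.0, 1.0, 1.0])
print(y, m)  # columns 0 and 2 map to the same slot, so they keep evicting each other
```

Note how conflict misses depend on the exact nonzero pattern, which is why a closed-form prediction is so hard and a learned model is attractive.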
A new data set was created from the original one by leaving out redundant measurements with low additional information. It consisted of 1500 records that representatively described the number of cache misses (almost uniformly distributed). With this new data set we achieved the following classification accuracies: Multi-Layer Perceptron MLP (92%), Radial Basis Function network RBF (93%), decision tree C4.5 (95%), etc. These accuracies are high, so we can conclude that data mining methods can estimate the number of cache misses with relatively low error.

Data mining also allows us to find out which input variables (features) are most important for estimating the number of cache misses (feature ranking). Again, we performed several experiments with the feature-ranking methods available in WEKA. The results show that the most important features are readCount (the number of read operations), nonZero (the number of nonzero elements in the matrix), and width (the bandwidth). Surprisingly, the size (matrix size) feature ranked only fourth on average, gaining much lower significance than we expected.

We have implemented a simple cache analyzer, collected data about cache misses, and, along with an analytical estimation of cache misses, analyzed the data by means of data mining methods. The results of the data mining analysis are very promising for our further research, which aims at models that reduce the number of cache misses.

We would like to thank students Miloš Klíma and Michal Chalupník for performing some of the above-described experiments as their semester projects in the course Neural Networks and Neurocomputers.

References:
[1] K. BEYLS, E. D'HOLLANDER: Exact compile-time calculation of data cache behavior, Proceedings of PDCS'01, 2001, pp. 617-662.
[2] X. VERA, J. XUE: Efficient Compile-Time Analysis of Cache Behavior for Programs with IF Statements, International Conference on Algorithms and Architectures for Parallel Processing, Beijing, 2002.
[3] THE UNIVERSITY OF WAIKATO: WEKA data mining software, http://www.cs.waikato.ac.nz/ml/weka/.
[4] I. H. WITTEN, E. FRANK: Data mining: practical machine learning tools and techniques, 2nd ed., Morgan Kaufmann series in data management systems, 2005, ISBN: 0-12-088407-0.

This research has been supported by GA AV ČR grant No. IBS 3086102, by IGA CTU FEE grant No. CTU0409313, and by MŠMT under research program MSM6840770014.
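The feature-ranking step described above can be sketched with a small information-gain scorer, in the spirit of WEKA's attribute evaluators: each input variable is scored by how much the entropy of the discretized "number of cache misses" class drops after splitting on it. All records, bin labels, and feature values below are made up for illustration; only the feature names come from the paper.

```python
# Illustrative sketch of information-gain feature ranking against a
# discretized cache-miss class. The toy records are fabricated; real
# experiments used WEKA's attribute-selection methods.
from math import log2
from collections import Counter, defaultdict

def entropy(labels):
    n = len(labels)
    return -sum(c / n * log2(c / n) for c in Counter(labels).values())

def info_gain(feature_vals, labels):
    """Class entropy minus the remaining entropy after splitting on the feature."""
    groups = defaultdict(list)
    for v, y in zip(feature_vals, labels):
        groups[v].append(y)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

# Toy rows: (readCount bin, nonZero bin, width bin, size bin) -> miss class
records = [
    ("hi", "hi", "wide",   "big",   "many"),
    ("hi", "hi", "narrow", "small", "many"),
    ("lo", "lo", "wide",   "big",   "few"),
    ("lo", "lo", "narrow", "small", "few"),
    ("hi", "lo", "wide",   "big",   "many"),
    ("lo", "hi", "narrow", "big",   "few"),
]
names = ["readCount", "nonZero", "width", "size"]
labels = [r[-1] for r in records]
ranking = sorted(
    ((info_gain([r[i] for r in records], labels), name)
     for i, name in enumerate(names)),
    reverse=True,
)
for gain, name in ranking:
    print(f"{name}: {gain:.3f}")
```

In this fabricated data set readCount carries the most information and size the least, mirroring the qualitative ranking reported above.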