Leo Lee, Ph.D. student, Department of Computer Science, HKBU

Data mining on graphics processors -- weekly report

For the previous week, I mainly learned the Apriori algorithm and read a paper that implements the Apriori algorithm on the GPU [7]. In addition, I read the k-means program written by Kaiyong and tried to debug it.

1 Reading: Frequent itemset mining on graphics processors

The paper proposes two efficient methods for implementing Apriori on the GPU. The methods employ a bitmap data structure to exploit the GPU's SIMD parallelism and to speed up the frequency-counting operation. The Apriori algorithm is described in Fig. 1. Its two main steps are Candidate_Generation and Freq_Itemset_Generation, and the paper uses the GPU to speed up these two parts.

Figure 1: The Apriori algorithm
Figure 2: Itemsets represented by the bitmap

The first method is the so-called Pure Bitmap-based Implementation (PBI), which uses a bitmap to represent itemsets, as shown in Fig. 2. The second method is the Trie-Based Implementation (TBI), which uses a trie to represent itemsets, as shown in Fig. 3. In TBI, the candidate generation based on trie traversal is implemented on the CPU, since the trie is an irregular structure and is difficult to share among SIMD threads. Small sketches of the frequency-counting and candidate-generation steps are given at the end of this section.

Figure 3: Itemsets represented by the trie

The experiments were performed on a PC with an NVIDIA GTX 280 GPU and an Intel Core2 quad-core CPU, running Microsoft Windows XP SP3. The GPU consists of 30 SIMD multiprocessors, each with eight processors running at 1.29 GHz. The GPU memory is 1 GB with a peak bandwidth of 141.7 GB/s. The CPU has four cores running at 2.4 GHz. The main memory is 2 GB with a peak bandwidth of 5.6 GB/s. The GPU uses a PCI-E bus to transfer data between the GPU memory and the main memory, with a theoretical bandwidth of 4 GB/s. The PC has a 160 GB SATA magnetic hard disk. The compared implementations are listed in Fig. 4 and the data sets are described in Fig. 5.

Figure 4: Description of each Apriori implementation
Figure 5: Description of each data set

The results are shown in Fig. 6, Fig. 7 and Fig. 8. From the experiments, we can see that TBI-GPU performs better than TBI-CPU on the Retail and Chess data sets, and PBI-GPU performs better than TBI-GPU in most cases. PBI-GPU, TBI-GPU and TBI-CPU, as implemented in this paper, all perform better than the original BORGELT and GOETHALS implementations. The paper also compares FP-GROWTH with the GPU-based implementations, as shown in Fig. 9. We can see that the GPU-based implementations are still slower than FP-GROWTH, which indicates that there is still plenty of room to explore more efficient GPU-based frequent itemset mining (FIM) algorithms.

This paper systematically introduces how to implement Apriori on the GPU and achieves a speedup of up to two orders of magnitude over optimized CPU Apriori implementations on a PC with an NVIDIA GTX 280 GPU and a quad-core CPU. However, the algorithms proposed in this paper still compare poorly with the CPU-based FP-GROWTH algorithm, which means that how to make the best use of the GPU to speed up Apriori still needs further research.
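To make the bitmap-based frequency counting concrete, the following is a minimal CUDA sketch of the idea, not the paper's actual kernel: if every item is stored as a bitmap over transactions, the support of a candidate 2-itemset is simply the population count of the bitwise AND of the two item bitmaps. The kernel name, data layout and parameters (countSupport2, one thread per candidate, 32 transactions per word) are my own illustrative assumptions.

// Minimal sketch of bitmap-based support counting (the core idea behind PBI),
// not the paper's actual kernel. Each item is a bitmap over transactions; a
// candidate 2-itemset's support is popcount(bitmap[a] & bitmap[b]).
// All names here are illustrative assumptions.
#include <cstdio>
#include <cuda_runtime.h>

__global__ void countSupport2(const unsigned int *itemBitmaps, // numItems * numWords words
                              const int2 *candidates,          // candidate 2-itemsets (item a, item b)
                              int numCandidates, int numWords,  // numWords = ceil(numTransactions / 32)
                              int *support)                     // output: support per candidate
{
    int c = blockIdx.x * blockDim.x + threadIdx.x;
    if (c >= numCandidates) return;
    const unsigned int *a = itemBitmaps + (size_t)candidates[c].x * numWords;
    const unsigned int *b = itemBitmaps + (size_t)candidates[c].y * numWords;
    int count = 0;
    for (int w = 0; w < numWords; ++w)
        count += __popc(a[w] & b[w]);   // transactions containing both items
    support[c] = count;
}

int main()
{
    // Toy data: 3 items, 32 transactions (one 32-bit word per bitmap).
    const int numItems = 3, numWords = 1, numCandidates = 3;
    unsigned int h_bitmaps[numItems] = {0xF0F0F0F0u, 0xFF00FF00u, 0x0F0F0F0Fu};
    int2 h_cand[numCandidates] = {{0, 1}, {0, 2}, {1, 2}};
    int h_support[numCandidates];

    unsigned int *d_bitmaps; int2 *d_cand; int *d_support;
    cudaMalloc(&d_bitmaps, sizeof(h_bitmaps));
    cudaMalloc(&d_cand, sizeof(h_cand));
    cudaMalloc(&d_support, sizeof(h_support));
    cudaMemcpy(d_bitmaps, h_bitmaps, sizeof(h_bitmaps), cudaMemcpyHostToDevice);
    cudaMemcpy(d_cand, h_cand, sizeof(h_cand), cudaMemcpyHostToDevice);

    countSupport2<<<1, 32>>>(d_bitmaps, d_cand, numCandidates, numWords, d_support);
    cudaMemcpy(h_support, d_support, sizeof(h_support), cudaMemcpyDeviceToHost);

    for (int c = 0; c < numCandidates; ++c)
        printf("candidate %d: support = %d\n", c, h_support[c]);

    cudaFree(d_bitmaps); cudaFree(d_cand); cudaFree(d_support);
    return 0;
}

A real implementation would tile the bitmaps and reduce partial counts within thread blocks; the point of the sketch is only that AND plus population count maps the counting step onto the GPU's SIMD lanes.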
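For the other main step, Candidate_Generation, below is a small host-side sketch of the classic join step, which TBI keeps on the CPU. For brevity it uses sorted vectors of item ids instead of the paper's trie and omits the subset-pruning step; the function name generateCandidates is my own.

// A minimal host-side sketch of Apriori's Candidate_Generation (join step).
// Uses sorted vectors of item ids rather than a trie; pruning is omitted.
#include <cstdio>
#include <vector>

using Itemset = std::vector<int>;  // item ids in ascending order

// Join frequent (k-1)-itemsets that share their first k-2 items.
std::vector<Itemset> generateCandidates(const std::vector<Itemset> &frequent)
{
    std::vector<Itemset> candidates;
    for (size_t i = 0; i < frequent.size(); ++i) {
        for (size_t j = i + 1; j < frequent.size(); ++j) {
            const Itemset &a = frequent[i], &b = frequent[j];
            bool samePrefix = true;
            for (size_t p = 0; p + 1 < a.size(); ++p)
                if (a[p] != b[p]) { samePrefix = false; break; }
            if (samePrefix && a.back() < b.back()) {
                Itemset c = a;
                c.push_back(b.back());   // extend the shared prefix with b's last item
                candidates.push_back(c);
            }
        }
    }
    return candidates;
}

int main()
{
    // Frequent 2-itemsets {1,2}, {1,3}, {2,3} yield the candidate {1,2,3}.
    std::vector<Itemset> f2 = {{1, 2}, {1, 3}, {2, 3}};
    for (const Itemset &c : generateCandidates(f2)) {
        for (int item : c) printf("%d ", item);
        printf("\n");
    }
    return 0;
}

In a trie, the same join corresponds to combining sibling nodes under a common parent, which is why candidate generation is an irregular, pointer-chasing operation that the paper leaves on the CPU.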
2 Research plan

1) Read related papers
   a. GPU, data mining, bioinformatics.
2) Read our k-means code and try to speed it up
   a. Learn from Kaiyong and discuss with him more.
3) Try protein identification on GPU.

3 References

[1] http://cuda.csdn.net/zxkt/index.html
[2] A Performance Study of General-Purpose Applications on Graphics Processors Using CUDA. S. Che et al. J. Parallel Distrib. Comput., 2008.
[3] Parallel Data Mining on Graphics Processors. W. Fang et al. Technical Report HKUST-CS0807, Oct 2008.
[4] GPU-Accelerated Large Scale Analytics. R. Wu et al. HP Laboratories, 2009.
[5] K-Means on Commodity GPUs with CUDA. H. Bai et al. World Congress on Computer Science and Information Engineering, 2009.
[6] Fast Support Vector Machine Training and Classification on Graphics Processors. B. Catanzaro et al. Technical Report No. UCB/EECS-2008-11.
[7] Frequent Itemset Mining on Graphics Processors. W. Fang et al. Proceedings of the Fifth International Workshop on Data Management on New Hardware, 2009.