Leo Lee, Ph.D. student in the Department of Computer Science, HKBU
Data mining on graphics processors
--weekly report
For the previous week, I mainly studied the Apriori algorithm and read a paper that
implements the Apriori algorithm on the GPU. In addition, I read the k-means program
written by Kaiyong and tried to debug it.
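Kaiyong's k-means program itself is not reproduced in this report. As a reference point for the debugging, below is only a minimal sketch of the assignment step that a CUDA k-means implementation typically contains, with one thread labelling one data point; all names (assignClusters, points, centroids, labels) and parameters are placeholders of my own, not the actual code.

    #include <cfloat>   // FLT_MAX

    // Hypothetical sketch, not the actual program: one thread labels one point
    // with the index of its nearest centroid (the "assignment" step of k-means).
    __global__ void assignClusters(const float *points,     // numPoints x dim, row-major
                                   const float *centroids,  // numClusters x dim, row-major
                                   int *labels, int numPoints, int numClusters, int dim)
    {
        int p = blockIdx.x * blockDim.x + threadIdx.x;
        if (p >= numPoints) return;

        int best = 0;
        float bestDist = FLT_MAX;
        for (int c = 0; c < numClusters; ++c) {
            float dist = 0.0f;
            for (int d = 0; d < dim; ++d) {
                float diff = points[p * dim + d] - centroids[c * dim + d];
                dist += diff * diff;                 // squared Euclidean distance
            }
            if (dist < bestDist) { bestDist = dist; best = c; }
        }
        labels[p] = best;
    }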
1 Reading: Frequent itemset mining on graphics processors
The paper proposes two efficient methods for implementing Apriori on the GPU. Both
methods employ a bitmap data structure to exploit the GPU's SIMD parallelism and to
speed up the frequency counting operation.
The Apriori algorithm is described in Fig. 1. Its two main steps are Candidate_Generation
and Freq_Itemset_Generation, and the paper uses the GPU to speed up these two parts.
Figure 1 The Apriori algorithm
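Since Fig. 1 is not reproduced in the text of this report, the following is a minimal single-threaded C++ sketch of the level-wise Apriori loop, written from the standard description of the algorithm rather than taken from the paper; the helper names and the toy database are my own, and the paper's GPU versions parallelize the support counting inside this loop.

    #include <algorithm>
    #include <cstdio>
    #include <map>
    #include <vector>

    using Itemset  = std::vector<int>;        // sorted item ids
    using Database = std::vector<Itemset>;    // one sorted itemset per transaction

    // Freq_Itemset_Generation: count each candidate, keep those meeting minSupport.
    static std::vector<Itemset> frequent(const std::vector<Itemset> &candidates,
                                         const Database &db, int minSupport)
    {
        std::vector<Itemset> result;
        for (const Itemset &c : candidates) {
            int count = 0;
            for (const Itemset &t : db)
                if (std::includes(t.begin(), t.end(), c.begin(), c.end()))
                    ++count;
            if (count >= minSupport) result.push_back(c);
        }
        return result;
    }

    // Candidate_Generation: join two frequent itemsets that differ only in their last item.
    static std::vector<Itemset> generateCandidates(const std::vector<Itemset> &freqPrev)
    {
        std::vector<Itemset> cands;
        for (size_t i = 0; i < freqPrev.size(); ++i)
            for (size_t j = i + 1; j < freqPrev.size(); ++j) {
                const Itemset &a = freqPrev[i], &b = freqPrev[j];
                if (std::equal(a.begin(), a.end() - 1, b.begin())) {
                    Itemset c(a);
                    c.push_back(b.back());
                    std::sort(c.end() - 2, c.end());   // keep the itemset sorted
                    cands.push_back(c);
                }
            }
        return cands;
    }

    int main()
    {
        Database db = {{1, 2, 3}, {1, 2}, {2, 3}, {1, 3}, {1, 2, 3}};  // toy data
        int minSupport = 2;

        // Level 1: frequent single items.
        std::map<int, int> itemCount;
        for (const Itemset &t : db) for (int i : t) ++itemCount[i];
        std::vector<Itemset> level;
        for (auto &kv : itemCount)
            if (kv.second >= minSupport) level.push_back({kv.first});

        // Levels k = 2, 3, ... until no frequent itemsets remain.
        while (!level.empty()) {
            for (const Itemset &s : level) {
                for (int i : s) std::printf("%d ", i);
                std::printf("\n");
            }
            level = frequent(generateCandidates(level), db, minSupport);
        }
        return 0;
    }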
Figure 2 Itemsets represented by the bitmap
The first method is the so-called Pure Bitmap-based Implementation (PBI), which uses a
bitmap to represent itemsets, as shown in Fig. 2. The second method is the Trie-Based
Implementation (TBI), which uses a trie to represent itemsets, as shown in Fig. 3. In TBI,
the candidate generation based on trie traversal is implemented on the CPU, since the trie
is an irregular structure that is difficult to share among SIMD threads.
Figure 3 Itemsets represented by the trie
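The paper's kernels are not reproduced here; the following is only a rough CUDA sketch of the idea behind PBI's bitmap-based frequency counting, assuming one bit per (item, transaction) pair packed into 32-bit words, zero padding in unused bits, and one thread per candidate itemset. All names and parameters are placeholders of my own.

    // Sketch: each thread ANDs the bit-vectors of its candidate's items and
    // popcounts the result to obtain that candidate's support.
    __global__ void countSupport(const unsigned int *itemBitmaps, // numItems x numWords
                                 const int *candidates,           // numCands x k item ids
                                 int k, int numWords, int numCands,
                                 int *support)
    {
        int c = blockIdx.x * blockDim.x + threadIdx.x;
        if (c >= numCands) return;

        int count = 0;
        for (int w = 0; w < numWords; ++w) {
            unsigned int bits = 0xFFFFFFFFu;               // start with all transactions
            for (int j = 0; j < k; ++j) {
                int item = candidates[c * k + j];
                bits &= itemBitmaps[item * numWords + w];  // keep transactions containing this item
            }
            count += __popc(bits);                         // transactions containing all k items
        }
        support[c] = count;
    }

Because every thread runs the same AND/popcount sequence over regular arrays, this step maps naturally onto the GPU's SIMD lanes, which is also why the irregular trie traversal in TBI is kept on the CPU.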
The experiments were performed on a PC with an NVIDIA GTX 280 GPU and an Intel
Core2 quad-core CPU, running Microsoft Windows XP SP3. The GPU consists of 30
SIMD multiprocessors, each of which has eight processors running at 1.29 GHz. The
GPU memory is 1 GB with a peak bandwidth of 141.7 GB/sec. The CPU has four cores
running at 2.4 GHz. The main memory is 2 GB with a peak bandwidth of 5.6 GB/sec.
The GPU uses a PCI-E bus to transfer data between the GPU memory and the main
memory, with a theoretical bandwidth of 4 GB/sec. The PC has a 160 GB SATA
magnetic hard disk. The algorithms are described in Fig. 4 and the data sets in Fig. 5.
Figure 4 Description of each Apriori implementation
Figure 5 Description of each data set
The results are shown in Fig. 6, Fig. 7 and Fig. 8. From the experiments, we can see that
TBI-GPU performs better than TBI-CPU on the Retail and Chess data sets, and PBI-GPU
performs better than TBI-GPU in most cases. PBI-GPU, TBI-GPU and TBI-CPU, all
implemented in this paper, perform better than the original BORGELT and GOETHALS
implementations. The paper also compares FP-GROWTH with the GPU-based
implementations, as shown in Fig. 9. We can see that the GPU-based implementations are
still slower than FP-GROWTH, which indicates that there is still plenty of room to explore
more efficient GPU-based FIM algorithms.
This paper systematically introduces how to implement Apriori on the GPU and achieves
a speedup of up to two orders of magnitude over optimized CPU Apriori implementations
on a PC with an NVIDIA GTX 280 GPU and a quad-core CPU. However, the algorithms
proposed in this paper still compare poorly with the CPU-based FP-GROWTH algorithm,
which means that how to make the best use of the GPU to speed up Apriori still needs
further research.
2 Research plan
1) Read related papers
a. GPU, data mining, bioinformatics;
2) Read our k-means code and try to speed it up
a. Learn from Kaiyong and discuss with him more
3) Try protein identification on GPU
3 References
[1] http://cuda.csdn.net/zxkt/index.html
[2] A Performance Study of General-Purpose Applications on Graphics Processors Using CUDA.
S. Che et al. J. Parallel Distrib. Comput., 2008.
[3] Parallel Data Mining on Graphics Processors. W. Fang et al. Technical Report HKUST-CS08-07, Oct 2008.
[4] GPU-Accelerated Large Scale Analytics. R. Wu et al. HP Laboratories, 2009.
[5] K-Means on Commodity GPUs with CUDA. H. Bai et al. World Congress on Computer
Science and Information Engineering, 2009.
[6] Fast Support Vector Machine Training and Classification on Graphics Processors. B.
Catanzaro et al. Technical Report No. UCB/EECS-2008-11.
[7] Frequent Itemset Mining on Graphics Processors. W. Fang et al. Proceedings of the Fifth
International Workshop on Data Management on New Hardware, 2009.