Speeding up k-Means by GPUs
YOU LI
SUPERVISOR: DR. CHU XIAOWEN
CO-SUPERVISOR: PROF. LIU JIMING
THURSDAY, MARCH 11, 2010
Outline
- Introduction
  - Efficiency of data mining -> GPGPU -> k-Means on GPU
- Related work
- Method
- Research Plan
Efficiency of Data Mining
- Data mining faces an efficiency challenge due to the ever-increasing volume of data.
[Fig. 1 and Fig. 2: the growth of data motivates parallel data mining.]
GPGPU
- General-purpose, high-performance parallel hardware;
- Supplies another platform for parallelizing data mining algorithms.
[Fig. 3: CPU vs. GPU block diagrams (Control, Cache, ALU, DRAM) and peak GFLOPS from Jan 2003 to Jul 2007 for NVIDIA GPUs (NV30, NV35, NV40, G70, G70-512, G71, GeForce 8800 GTX, Quadro FX 5600, Tesla C870) versus Intel CPUs (3.0 GHz Pentium 4, Core 2 Duo, Core 2 Quad).]
k-Means on GPU
- Programming on GPU
  - CUDA: integrated CPU+GPU programming in C.
- k-Means
  - Widely used in statistical data analysis, pattern recognition, etc.;
  - Easy to implement on CPU, and well suited to implementation on GPU.
Outline
- Introduction
- Related work
  - UV_k-Means, GPUMiner and HP_k-Means
- Method
- Research Plan
Related work
Speed of k-Means on low-dimension data, in seconds (NVIDIA GTX 280 GPU; Intel(R) Core(TM) i5 CPU):

                 |       n = 2 million          |       n = 4 million
k                |  100    400    100    400    |   100    400    100    400
d                |    2      2      8      8    |     2      2      8      8
-----------------+------------------------------+------------------------------
MineBench on CPU | 19.36  70.93  39.81  152.25  |  38.74 141.84  79.60  304.46
HP k-Means       |  1.45   2.16   2.48    4.53  |   2.88   4.38   4.95    9.03
UV k-Means       |  2.84   5.96   6.07   16.32  |   5.64  11.94  12.85   34.54
GPUMiner         | 61.39  63.46  192.05 226.79  | 130.36 126.38 383.41  474.83

[Bar chart of the running times above for MineBench on CPU, HP k-Means, UV k-Means, and GPUMiner.]
Outline
- Introduction
- Related work
- Method and Results
  - k-Means (three steps) -> step 1 -> step 2 -> step 3;
  - Experiments
- Research Plan
k-Means algorithm
- Input: n data points; k centroids.
- Step 1: compute distance(n_i, k_j) for every point and centroid: O(nkd);
- Step 2: find the closest centroid for each point: O(nk);
- Step 3: compute the new centroids: O(nd);
- If the centroids changed, repeat from Step 1; otherwise, end.
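The three steps above can be sketched as a plain C reference implementation (a minimal CPU sketch for illustration, not the talk's GPU code; function and variable names are my own):

```c
#include <float.h>
#include <stddef.h>
#include <stdlib.h>

/* One k-means iteration over n points of dimension d with k centroids.
 * Steps 1 and 2 (distances, O(nkd), and nearest-centroid search, O(nk))
 * are fused in the first loop; step 3 (new centroids, O(nd)) follows.
 * Returns 1 if any assignment changed, 0 otherwise (the loop condition). */
int kmeans_iteration(const float *data, float *centroids, int *assign,
                     size_t n, size_t k, size_t d)
{
    int changed = 0;
    /* Steps 1 + 2: assign each point to its nearest centroid. */
    for (size_t i = 0; i < n; i++) {
        float best = FLT_MAX;
        int best_c = 0;
        for (size_t c = 0; c < k; c++) {
            float dist = 0.0f;  /* squared Euclidean distance */
            for (size_t j = 0; j < d; j++) {
                float diff = data[i * d + j] - centroids[c * d + j];
                dist += diff * diff;
            }
            if (dist < best) { best = dist; best_c = (int)c; }
        }
        if (assign[i] != best_c) { assign[i] = best_c; changed = 1; }
    }
    /* Step 3: recompute each centroid as the mean of its points.
     * A centroid with no assigned points is left at zero in this sketch. */
    size_t *count = calloc(k, sizeof *count);
    if (!count) return changed;  /* allocation failure: skip the update */
    for (size_t c = 0; c < k * d; c++) centroids[c] = 0.0f;
    for (size_t i = 0; i < n; i++) {
        count[assign[i]]++;
        for (size_t j = 0; j < d; j++)
            centroids[assign[i] * d + j] += data[i * d + j];
    }
    for (size_t c = 0; c < k; c++)
        if (count[c] > 0)
            for (size_t j = 0; j < d; j++)
                centroids[c * d + j] /= (float)count[c];
    free(count);
    return changed;
}
```

The caller repeats `kmeans_iteration` until it returns 0, which is exactly the "If centroid change?" loop in the flowchart.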
Memory Mechanism of GPU
- Global memory
  - Large size
  - Long latency
- Registers
  - Small size
  - Short latency
  - Not under user control
- Shared memory
  - Medium size
  - Short latency
  - Under user control
k-Means on GPU
- Key idea
  - Increase the number of compute operations per global memory access;
  - Adopt the methods from matrix multiplication and reduction.
- Dimension is a key parameter
  - For low dimension: use registers;
  - For high dimension: use shared memory.
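The "more compute per global memory access" idea can be illustrated with a CPU analogue of the tiling used in matrix multiplication (a hedged sketch, not the talk's CUDA kernel): centroids are staged into a small local buffer, standing in for shared memory or registers, and each staged value is then reused across all n points.

```c
#include <stddef.h>
#include <string.h>

#define TILE 4  /* centroids staged per pass; stands in for on-chip memory */

/* Fill dist[i*k + c] with the squared Euclidean distance between point i
 * and centroid c. Centroids are copied TILE at a time into a local tile,
 * so each value fetched from "global" memory is amortized over n distance
 * computations instead of one, mirroring matrix-multiplication tiling. */
void tiled_distances(const float *data, const float *centroids, float *dist,
                     size_t n, size_t k, size_t d)
{
    float tile[TILE * 64];  /* this sketch assumes d <= 64 */
    for (size_t c0 = 0; c0 < k; c0 += TILE) {
        size_t kt = (c0 + TILE <= k) ? TILE : k - c0;
        /* one pass over "global" memory loads this centroid tile... */
        memcpy(tile, centroids + c0 * d, kt * d * sizeof(float));
        /* ...and the tile is then reused for every one of the n points */
        for (size_t i = 0; i < n; i++)
            for (size_t c = 0; c < kt; c++) {
                float s = 0.0f;
                for (size_t j = 0; j < d; j++) {
                    float diff = data[i * d + j] - tile[c * d + j];
                    s += diff * diff;
                }
                dist[i * k + (c0 + c)] = s;
            }
    }
}
```

On the GPU the same structure applies with the tile in shared memory (or registers, for low dimension) and points distributed across threads.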
k-Means on GPU
- For low dimension: each data point is read from global memory only once.
k-Means on GPU
- For high dimension: each data point is read from global memory only once.
Experiments
- The experiments were conducted on a PC with an NVIDIA GTX 280 GPU and an Intel(R) Core(TM) i5 CPU.
- The GTX 280 has 30 SIMD multiprocessors, each containing eight processors running at 1.29 GHz. The GPU has 1 GB of memory with a peak bandwidth of 141.7 GB/s.
- The CPU has four cores running at 2.67 GHz. The main memory is 8 GB with a peak bandwidth of 5.6 GB/s. We used Visual Studio 2008 to write and compile all the source code; the CUDA version is 2.3.
- We measure the application time after the file I/O, in order to show the speedup effect more clearly.
Experiments
- On low-dimension data
  - Compared with HP, UV and GPUMiner; the data is generated randomly.
[Bar chart: running times of our k-Means, HP k-Means and UV k-Means.]
Our implementation is four to ten times faster than HP.
Experiments
- On high-dimension data
  - Compared with UV and GPUMiner; the data is from KDD 1999.
[Bar chart: running times of our k-Means, UV k-Means and GPUMiner.]
Our implementation is four to eight times faster than UV.
Experiments
- Compared with the CPU
  - The results show that our algorithm compares very favorably with the existing algorithms.
Forty to two hundred times faster than the CPU version.
Outline
- Introduction
- Related work
- Method
- Research Plan
Research Plan
- Detailed analysis of k-Means on GPU
  - GFLOPS
- Deal with even larger data sets
- Other data mining algorithms on GPU
  - k-NN
  - SDP (widely used in protein identification)
Q&A
Thank you very much!