Download 04_VDB_encyc_cpt - NDSU Computer Science

[Figure: the table R(A1, A2, A3), whose attribute values are 3 bits wide, and the nine bit vectors A11, A12, A13, A21, A22, A23, A31, A32, A33 obtained from it]
Figure 2. Vertical decomposition of the table R
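The decomposition shown in Figure 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name vertical_decompose, the sample rows, and the convention that A11 holds the most significant bit of A1 are hypothetical choices, not taken from the original system.

```python
from typing import Dict, List, Sequence


def vertical_decompose(rows: Sequence[Sequence[int]], bit_width: int) -> Dict[str, List[int]]:
    """Split every attribute column into one bit vector per bit position.

    The key "A{i}{j}" holds bit j of attribute i for every row.
    Assumption: j = 1 denotes the most significant bit.
    """
    vectors: Dict[str, List[int]] = {}
    n_attrs = len(rows[0])
    for i in range(n_attrs):
        for j in range(bit_width):
            shift = bit_width - 1 - j  # distance of bit j from the least significant bit
            vectors[f"A{i + 1}{j + 1}"] = [(row[i] >> shift) & 1 for row in rows]
    return vectors


# Illustrative rows only (not the contents of Figure 2); each value is 3 bits wide.
rows = [(0b101, 0b010, 0b011), (0b010, 0b111, 0b100)]
bit_vectors = vertical_decompose(rows, bit_width=3)
print(bit_vectors["A11"])  # [1, 0]
```

A table with three 3-bit attributes yields nine bit vectors, each as long as the number of rows.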
After the decomposition, each bit vector is converted into a P-tree.
P-trees can be 1-dimensional, 2-dimensional, or multi-dimensional. If the data
has a natural dimension, as spatial data does, the P-tree dimension is matched to
the data dimension. Otherwise, the dimension can be chosen to optimize the
compression ratio. Figure 3 shows the construction of three 1-dimensional P-trees
from the bit vectors of the second attribute, A2. They are built by recording the truth
of the predicate “purely 1-bits” recursively on halves of the bit vectors until purity
is reached, that is, until each half consists entirely of 1-bits or entirely of 0-bits.
[Figure: three 1-dimensional P-trees, panels (a) P21, (b) P22, (c) P23]
Figure 3. P-trees of attributes A21, A22 and A23
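The recursive construction of a 1-dimensional P-tree can be sketched as follows. This is a minimal sketch, not the authors' implementation: the names PTreeNode and build_ptree and the sample bit vector are hypothetical, and the input is assumed to be non-empty.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class PTreeNode:
    pure1: int                                    # 1 if the segment is all 1-bits, else 0
    children: Optional[List["PTreeNode"]] = None  # None when the segment is pure


def build_ptree(bits: Sequence[int]) -> PTreeNode:
    """Record the "purely 1-bits" predicate recursively on halves until purity."""
    if all(b == 1 for b in bits):
        return PTreeNode(1)   # pure-1 segment: predicate true, stop
    if all(b == 0 for b in bits):
        return PTreeNode(0)   # pure-0 segment: predicate false, stop
    mid = len(bits) // 2      # mixed segment: predicate false, split into halves
    return PTreeNode(0, [build_ptree(bits[:mid]), build_ptree(bits[mid:])])


# Illustrative bit vector (not one of the vectors in Figure 3).
tree = build_ptree([0, 0, 0, 0, 1, 1, 0, 1])
print(tree.pure1)              # 0: the whole vector is mixed
print(tree.children[0])        # pure-0 left half, stored as a single leaf
print(tree.children[1].pure1)  # 0: the right half 1101 is mixed and split again
```

In this sketch a pure-0 leaf and a mixed node both carry the value 0 and differ only in whether they have children. Long pure runs collapse into single leaves, which is where the compression comes from.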
With its built-in engines, such as a query engine, an OLAP engine, and a data
mining engine, a vertical database can support SPJ (select-project-join) queries
(Ding et al., 2002), OLAP operations (Wang et al., 2003), and various data mining
applications. The system structure is described in detail in the next section.