Realization of Database Systems
SS 2009 – Exercise 2
Prof. Dr.-Ing. Dr. h. c. T. Härder
Computer Science Department
Databases and Information Systems
University of Kaiserslautern
Exercise 2 – Solution proposal
Documentation of the lecture:
http://wwwlgis.informatik.uni-kl.de/cms/courses/realisierung/
(May 20, 2009, 3.30 pm, 36-336)
Exercise 2.1: Block Allocation on External Memory
To speed up sequential processing, blocks are not arranged sequentially on a track.
Instead, they are staggered in such a way that several blocks can be read in sequence
within a single disk cycle. A track on a disk consists of m=25 slots of 4 KByte each,
and every slot can hold a single block.
a) How are blocks arranged on a track, if we assume a displacement factor I of 1,
2, 3, 4, 5, 6, 7, 8, or 9?
Two examples (blocks listed slot by slot in rotation order around the track):
i=2: 1, 14, 2, 15, 3, 16, 4, 17, 5, 18, 6, 19, 7, 20, 8, 21, 9, 22, 10, 23, 11, 24, 12, 25, 13
i=7: 1, 19, 12, 5, 23, 16, 9, 2, 20, 13, 6, 24, 17, 10, 3, 21, 14, 7, 25, 18, 11, 4, 22, 15, 8
c) Which pros and cons arise for random-access reads of single blocks?
None (see Datenbanksysteme - Konzepte und Techniken der Realisierung, p. 79 f.)
Exercise 2.2: Costs for Accessing Files
Let there be the following SQL query:
SELECT *
FROM EMP
WHERE ENO = '123456'
Let us assume the following:
The table EMP is stored in file F1, which consists of 105 pages.
Let there be a B*-tree index on attribute ENO in EMP: IEMP(ENO). This B*-tree
has a height h* of 3 and is stored in file F2, which consists of 106 pages. The leaf
nodes of the B*-tree contain the page numbers Pi of file F1 in which the corresponding EMP records are stored.
How many (external) page accesses on F1 and F2 are required for answering the
aforementioned SQL query, if we assume:
Solution:
a) Dynamic extent allocation
F2 : 3 accesses (traversal of the B*-tree)
F1 : 1 access
Internal calculation of the page addresses (small table!).
b) Dynamic block allocation (vector)
F2 : 6 accesses
F1 : 2 accesses
Note: Since the vector is very large, every access to the vector requires a corresponding page
access.
c) Dynamic block allocation (UNIX file system)
F2 : 12 accesses (3 * 4)
F1 : 4 accesses (in the worst case).
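The access counts for the three allocation schemes follow one pattern: the number of page accesses needed to reach a single page, multiplied by the number of pages touched in each file. The following is a minimal sketch of that arithmetic, not part of the original solution. The function names are hypothetical, and `indirection_levels=3` assumes the worst case of a page that is reached through three levels of index blocks in the UNIX scheme.

```python
def accesses_extent_allocation(tree_height: int) -> dict[str, int]:
    """Extent allocation: page addresses are computed from a small in-memory table."""
    return {"F2": tree_height, "F1": 1}


def accesses_vector_allocation(tree_height: int) -> dict[str, int]:
    """Block vector: every page access is preceded by one access to the vector."""
    per_page = 2  # one vector page + the page itself
    return {"F2": tree_height * per_page, "F1": per_page}


def accesses_unix_allocation(tree_height: int, indirection_levels: int = 3) -> dict[str, int]:
    """UNIX-style allocation, worst case: all index blocks plus the data block."""
    per_page = indirection_levels + 1  # assumed worst case, see lead-in
    return {"F2": tree_height * per_page, "F1": per_page}


if __name__ == "__main__":
    height = 3  # height h* of the B*-tree from the exercise
    for name, count in [
        ("extent", accesses_extent_allocation),
        ("vector", accesses_vector_allocation),
        ("unix", accesses_unix_allocation),
    ]:
        print(name, count(height))
```

In each case F2 is charged once per B*-tree level and F1 once for the EMP page found in the leaf node.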
The UNIX File System in a Nutshell
• Base structure (for the organization of disk storage)
[Figure: the disk blocks 0, 1, 2, 3, ..., m-1, m, m+1, ..., n-1, n are divided into boot block, super block, I(ndex) list, and data area]
Regarding Exercise 2.1 a): For I=5 and m = 25, we get m mod I = 0. Consequently, for I=5, the displacement
strategy described in the lecture notes does not work. (For further information,
please refer to Härder, Rahm: Datenbanksysteme - Konzepte und Techniken der Realisierung, pp. 80 ff.)
b) How many disk cycles (rotations) are required to read m blocks sequentially?
Within a single disk cycle (rotation), j=m/I blocks can be read. Consequently, m/j=I
disk cycles are required.
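The staggered arrangement and the number of rotations can be checked with a short sketch. It assumes that block k is placed in slot (k-1)·I mod m, counting slots from 0 in rotation order. This placement rule reproduces the track layouts given for I=2 and I=7, but the rule and the function names are illustrative assumptions, not taken from the lecture notes.

```python
import math


def staggered_layout(m: int, displacement: int) -> list[int]:
    """Return the block number stored in each slot, in rotation order."""
    slots = [0] * m  # 0 marks a slot that is still empty
    for block in range(1, m + 1):
        slot = (block - 1) * displacement % m
        if slots[slot] != 0:
            # Happens, for example, when m mod displacement == 0 (I=5, m=25).
            raise ValueError(f"slot {slot} is already taken; I={displacement} does not work")
        slots[slot] = block
    return slots


def rotations_to_read_all(m: int, displacement: int) -> int:
    """Rotations needed to read blocks 1..m in order, starting at block 1."""
    slots_passed = (m - 1) * displacement + 1  # distance up to the end of block m
    return math.ceil(slots_passed / m)


if __name__ == "__main__":
    print(staggered_layout(25, 2))
    print(staggered_layout(25, 7))
    print(rotations_to_read_all(25, 7))
```

Calling `staggered_layout(25, 5)` raises a `ValueError`, because block 6 would be mapped to the slot that already holds block 1.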
• Important description information
Directory entries associate each file name with an index I (1 ≤ I ≤ m-1).
Each node of the I-list contains the following information:
a. Identification of the file owner
b. Protection bits
c. Addresses of 13 physical blocks
d. File size in bytes
e. Time of creation
f. Number of references to the file
g. Kind of file
Storage Allocation in the UNIX File System
[Figure: an inode holds 13 block addresses. 10 of them point directly to data blocks (1 access). The remaining addresses point to index blocks with 1024 addresses each, through which data blocks are reached with 2, 3, or 4 accesses.]
Exercise 2.3: Indirect Page Allocation, Shadow Page Mechanism, and Differential File Method
Assume that a database consists of two segments, each having 8 pages, and that 32
blocks are available for storing it. On this database, two interfering transactions
manipulate these pages according to the following sequence:
[Figure: access sequences of T1 and T2. Pages accessed: P12, P15, P16, P17, P21, P24, P25, P27; the sequences end with P17 EOT and P27 EOT.]
Pij: Page j in segment i
EOT: End of Transaction
For Indirect Page Allocation, we assume the following mapping of pages (P) to
blocks (B):
P11 P12 P13 P14 P15 P16 P17 P18 P21 P22 P23 P24 P25 P26 P27 P28
B1 B3 B6 B5 B7 B13 B9 B10 B15 B18 B19 B17 B21 B28 B24 B26
To get used to the concepts of Indirect Page Allocation, Shadow Page Mechanism,
and Differential File Method, please illustrate the principles of these concepts
using this example. Before transaction T1 starts, all segments are closed.
Solution:
a) Please describe the construction principle of the page tables regarding Indirect
Page Allocation.
Page tables:
Seg. 1: V1 = 1 3 6 5 7 13 9 10
Seg. 2: V2 = 15 18 19 17 21 28 24 26
File (blocks 1 to 32): every occupied block stores the page that V1 or V2 maps to it.
Bit table for free placement administration (blocks 1 to 32):
1 0 1 0 1 1 1 0 1 1 0 0 1 0 1 0 1 1 1 0 1 0 0 1 0 1 0 1 0 0 0 0
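The construction principle can be sketched in a few lines: each segment has a page table whose entry j holds the block number of page j, and the bit table marks every block that some page table refers to. This is a minimal illustration using the mapping of the exercise. The names `PAGE_TABLES`, `block_of`, and `free_bit_table` are hypothetical.

```python
NUM_BLOCKS = 32

# Page tables from the exercise: entry j-1 holds the block of page j.
PAGE_TABLES = {
    1: [1, 3, 6, 5, 7, 13, 9, 10],         # V1 for segment 1
    2: [15, 18, 19, 17, 21, 28, 24, 26],   # V2 for segment 2
}


def block_of(page_tables: dict[int, list[int]], segment: int, page: int) -> int:
    """Indirect page allocation: look up the block of page Pij in table Vi."""
    return page_tables[segment][page - 1]


def free_bit_table(page_tables: dict[int, list[int]], num_blocks: int) -> list[int]:
    """Bit k-1 is 1 if block k is occupied by some page, otherwise 0."""
    bits = [0] * num_blocks
    for table in page_tables.values():
        for block in table:
            bits[block - 1] = 1
    return bits


if __name__ == "__main__":
    print("P16 is stored in block", block_of(PAGE_TABLES, 1, 6))
    print(" ".join(str(bit) for bit in free_bit_table(PAGE_TABLES, NUM_BLOCKS)))
```

The printed bit string can be compared position by position with the bit table for free placement administration.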
b) Which additional data structures are needed for the Shadow Page Mechanism?
Please illustrate their modifications during transaction processing.
[Figure: state after all modifications, before a checkpoint is created]
File (pages stored in blocks 1 to 21): 11 12 12 21 14 13 15 15 17 18 25 24 16 16 21 17 24 22 23 27 25
Page tables, current version: 1 2 6 5 8 14 16 10 (segment 1), 4 18 19 12 11 28 20 26 (segment 2)
Page tables, shadow version: 1 3 6 5 7 13 9 10 (segment 1), 15 18 19 17 21 28 24 26 (segment 2)
Bit tables (blocks 1 to 32):
CMAP: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 1 0 1 0 1 0 0 0 0
MAP0: 1 0 1 0 1 1 1 0 1 1 0 0 1 0 1 0 1 1 1 0 1 0 0 1 0 1 0 1 0 0 0 0
STATUS: 1 1
MAPSWITCH: 0
• All modifications were successful.
• No checkpoint was created.
• MAPSWITCH shows valid version of MAP
• STATUS indicates modifications in both segments
c) After EOT of T1, please apply the Shadow Page Mechanism and create a checkpoint for both segments. When EOT of T2 is reached, use the Shadow Page Mechanism and create a checkpoint for segment 2.
After EOT1:
File (pages stored in blocks 1 to 19): 11 12 12 21 14 13 15 15 17 18 25 24 16 16 21 17 24 22 23
Page tables: V10 = 1 2 6 5 8 14 16 10, V20 = 4 18 19 12 11 28 24 26
Bit tables (blocks 1 to 32):
CMAP: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 0 0 1 0 1 0 1 0 0 0 0
MAP0: 1 0 1 0 1 1 1 0 1 1 0 0 1 0 1 0 1 1 1 0 1 0 0 1 0 1 0 1 0 0 0 0
MAP1: 1 1 0 1 1 1 0 1 0 1 1 1 0 1 0 1 0 1 1 0 0 0 0 1 0 1 0 1 0 0 0 0
STATUS: 0 0
MAPSWITCH: 1
Actions:
• MAP1 re-calculated, stored, and switched to MAP1
• Current version stored
• Modified pages stored
• Resetting the STATUS
State after establishing the first checkpoint:
• CMAP and shadow pages are invalid (hatched tables). They will be initialized when the next
modification occurs.
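The interplay of current page tables, shadow page tables, CMAP, and the stored MAP can be traced with a small simulation. This is a sketch under two assumptions that the sheet does not state: a modified page is written to the first free block according to CMAP (first fit), and the pages are modified in the order P12, P21, P15, P25, P24, P16, P17, which is chosen here so that first fit yields the block numbers of the tables after EOT1. The class and method names are hypothetical.

```python
class ShadowPagedStore:
    """Toy model of page tables and bit tables under the Shadow Page Mechanism."""

    def __init__(self, page_tables: dict[int, list[int]], num_blocks: int = 32):
        self.num_blocks = num_blocks
        self.shadow = {seg: list(table) for seg, table in page_tables.items()}
        self.current = {seg: list(table) for seg, table in page_tables.items()}
        self.saved_map = self._bits(self.shadow)  # valid stored MAP
        self.cmap = list(self.saved_map)          # working copy
        self.status = {seg: 0 for seg in page_tables}

    def _bits(self, *table_sets: dict[int, list[int]]) -> list[int]:
        bits = [0] * self.num_blocks
        for tables in table_sets:
            for table in tables.values():
                for block in table:
                    bits[block - 1] = 1
        return bits

    def modify(self, segment: int, page: int) -> int:
        """Write page Pij to a new block on its first modification; return its block."""
        idx = page - 1
        if self.current[segment][idx] == self.shadow[segment][idx]:
            new_block = self.cmap.index(0) + 1  # first fit (assumption)
            self.cmap[new_block - 1] = 1
            self.current[segment][idx] = new_block
            self.status[segment] = 1
        return self.current[segment][idx]

    def checkpoint(self, segments: list[int]) -> list[int]:
        """Make the current version of the given segments the new shadow version."""
        for seg in segments:
            self.shadow[seg] = list(self.current[seg])
            self.status[seg] = 0
        self.saved_map = self._bits(self.shadow)
        self.cmap = self._bits(self.shadow, self.current)
        return self.saved_map


def as_text(bits: list[int]) -> str:
    return " ".join(str(bit) for bit in bits)


store = ShadowPagedStore({1: [1, 3, 6, 5, 7, 13, 9, 10],
                          2: [15, 18, 19, 17, 21, 28, 24, 26]})
# Assumed modification order, see lead-in.
for segment, page in [(1, 2), (2, 1), (1, 5), (2, 5), (2, 4), (1, 6), (1, 7)]:
    store.modify(segment, page)
cmap_before_checkpoint = as_text(store.cmap)
map_after_checkpoint = as_text(store.checkpoint([1, 2]))

if __name__ == "__main__":
    print("V1  :", store.shadow[1])
    print("V2  :", store.shadow[2])
    print("CMAP:", cmap_before_checkpoint)
    print("MAP :", map_after_checkpoint)
```

The old blocks of the modified pages stay occupied in CMAP until the checkpoint, which is why the recomputed MAP contains fewer set bits than CMAP.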
After EOT2:
File (pages stored in blocks 1 to 19): 11 12 27 21 14 13 15 15 17 18 25 24 16 16 21 17 24 22 23
Page tables: V10 = 1 2 6 5 8 14 16 10, V20 = 4 18 19 12 11 28 3 26, V21 = 4 18 19 12 11 28 24 26
Bit tables (blocks 1 to 32):
CMAP: 1 1 1 1 1 1 0 1 0 1 1 1 0 1 0 1 0 1 1 0 0 0 0 1 0 1 0 1 0 0 0 0
MAP0: 1 1 1 1 1 1 0 1 0 1 1 1 0 1 0 1 0 1 1 0 0 0 0 0 0 1 0 1 0 0 0 0
MAP1: 1 1 0 1 1 1 0 1 0 1 1 1 0 1 0 1 0 1 1 0 0 0 0 1 0 1 0 1 0 0 0 0
STATUS: 0 0
MAPSWITCH: 0
Actions:
• Before modifying P27, V21 is initialized and CMAP is adjusted according to MAP1
• After establishing the checkpoint, MAP0 is created, stored, and switched to MAP0
• V1i are not affected
d) How must the Shadow Page Mechanism be modified to allow for an implementation of transaction-oriented checkpoints? Please trace such a checkpoint at EOT of T1 using our example.
In essence, transaction-specific tables
(Please refer to: Datenbanksysteme - Konzepte und Techniken der Realisierung, p. 101;
in more detail in Härder, T.: Implementierung von Datenbanksystemen, Carl Hanser, München, 1978,
or Härder, T., Reuter, A.: Optimization of Logging and Recovery in a Database System, in:
Data Base Architecture, Proc. IFIP TC-2 Working Conference, June 1979, Venice, Italy, Bracchi, G. and Nijssen, G.M. (eds.), North Holland Publ. Comp., 1979, pp. 151-168.)
Exercise 2.4: Bloom Filter
For the Differential File Method, it is very effective to use a Bloom filter for deciding
whether a page modification occurred and, as a consequence, whether the differential file probably contains this modification or not.
Let there be a bit vector of size 8 and a hash function h(x), which is defined as:
h(x) = (binary representation of x) XOR (01010101).
The pages with key values 31, 53, and 62 are modified. Now, the pages with key values 53, 93, and 124 shall be read.
Solution:
a) Which results are provided by the Bloom filter for the read operations?
Vector after initialization: (0000 0000).
h(31) = (0100 1010) --> vector after page modification: (0100 1010)
h(53) = (0110 0000) --> vector after page modification: (0110 1010)
h(62) = (0110 1011) --> vector after page modification: (0110 1011)
Reading record 53 using the Bloom filter delivers "PROBABLY".
Reading record 93 (h(93) = (0000 1000)) using the Bloom filter also delivers "PROBABLY".
Reading record 124 (h(124) = (0010 1001)) using the Bloom filter results in "PROBABLY".
b) Why is the given hash function inappropriate for the Bloom filter and which properties must be fulfilled by a well-suited hash function?
A suitable Bloom filter should set a more or less equal number of vector positions for every record.
The given hash function does not have this property.
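The filter of this exercise can be reproduced with a few bit operations. The sketch below follows the scheme used in the solution: the 8-bit pattern h(x) is OR-ed into the vector on a modification, and a read reports "PROBABLY" if every bit of h(x) is set in the vector. The class and function names are hypothetical.

```python
MASK = 0b01010101


def h(key: int) -> int:
    """Hash function of the exercise: 8-bit binary representation XOR 01010101."""
    return (key ^ MASK) & 0xFF


class BitVectorFilter:
    """Bloom filter over an 8-bit vector, using the single hash pattern h(x)."""

    def __init__(self) -> None:
        self.vector = 0

    def add(self, key: int) -> None:
        self.vector |= h(key)

    def probably_contains(self, key: int) -> bool:
        pattern = h(key)
        return (self.vector & pattern) == pattern


def as_bits(value: int) -> str:
    text = f"{value:08b}"
    return f"{text[:4]} {text[4:]}"


bloom = BitVectorFilter()
for modified_key in (31, 53, 62):
    bloom.add(modified_key)

if __name__ == "__main__":
    print("vector:", as_bits(bloom.vector))
    for read_key in (53, 93, 124):
        answer = "PROBABLY" if bloom.probably_contains(read_key) else "NO"
        bits_set = bin(h(read_key)).count("1")
        print(read_key, as_bits(h(read_key)), answer, f"({bits_set} bits set)")
```

The number of bits set by h(x) varies with the key (for example 1 bit for key 93 and 5 bits for key 62), which illustrates the uneven use of vector positions criticized in b).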