Realization of Database Systems SS 2009 – Exercise 2
Prof. Dr.-Ing. Dr. h. c. T. Härder
Computer Science Department
Databases and Information Systems
University of Kaiserslautern

Exercise 2 – Solution proposal

Documentation of the lecture: http://wwwlgis.informatik.uni-kl.de/cms/courses/realisierung/
(May 20, 2009, 3.30 pm, 36-336)

Exercise 2.1: Block Allocation on External Memory

To speed up sequential processing, blocks are not arranged sequentially on a track. Instead, they are staggered in such a way that several blocks can be read in sequence within a single disk cycle. A track on a disk consists of m = 25 slots of 4 KByte each, and every slot holds a single block.

a) How are blocks arranged on a track, if we assume a displacement factor I of 1, 2, 3, 4, 5, 6, 7, 8, or 9?

Two examples (block numbers in slot order around the track):

I = 2: 1 14 2 15 3 16 4 17 5 18 6 19 7 20 8 21 9 22 10 23 11 24 12 25 13
I = 7: 1 19 12 5 23 16 9 2 20 13 6 24 17 10 3 21 14 7 25 18 11 4 22 15 8

In both examples, consecutive blocks lie I slots apart.

Note: For I = 5 and m = 25, we get m mod I = 0. Consequently, for I = 5, the displacement strategy described in the lecture notes does not work. (For further information, please refer to Härder, Rahm: Datenbanksysteme - Konzepte und Techniken der Realisierung, pp. 80 ff.)

b) How many disk cycles (rotations) are required to read m blocks sequentially?

Within a single disk cycle (rotation), j = m/I blocks can be read. Consequently, m/j = I disk cycles are required.

c) Which pros and cons are raised by random-access reads on single blocks?

None (see Datenbanksysteme - Konzepte und Techniken der Realisierung, p. 79 f.)

Exercise 2.2: Costs for Accessing Files

Let there be the following SQL query:

    SELECT * FROM EMP WHERE ENO = '123456'

Let us assume the following: The table EMP is stored in file F1, which consists of 10^5 pages. Let there be a B*-tree index on attribute ENO in EMP: IEMP(ENO). This B*-tree has a height h* of 3 and is stored in file F2, which consists of 10^6 pages. The leaf nodes of the B*-tree contain the page numbers Pi of file F1 in which the corresponding EMP records are stored.

How many (external) page accesses on F1 and F2 are required for answering the aforementioned SQL query, if we assume:

a) Dynamic extent allocation

Solution:
F2: 3 accesses (traversal of the B*-tree)
F1: 1 access
Internal calculation of the page addresses (small table!).

b) Dynamic block allocation (vector)

Solution:
F2: 6 accesses
F1: 2 accesses
Note: Since the vector is very large, every access to the vector requires a corresponding page access.

c) Dynamic block allocation (UNIX file system)

Solution:
F2: 12 accesses (3 * 4)
F1: 4 accesses (in the worst case)

The UNIX File System in a Nutshell

• Base structure (for the organization of disk storage)
[Figure: disk layout with blocks 0, 1, 2, 3, ..., m-1, m, m+1, ..., n-1, n, divided into boot block, super block, I(ndex) list, and data area]

• Important description information
Directory entries map the file name to an index I (1 ≤ I ≤ m-1). Each node of the I-list contains the following information:
a. Identification of the file owner
b. Protection bits
c. Addresses of 13 physical blocks
d. File size in bytes
e. Time of creation
f. Number of references to the file
g. Kind of file

• Storage Allocation in the UNIX File System
[Figure: an inode holds 13 block addresses. 10 of them point directly to data blocks (1 access); the remaining three lead through one, two, or three levels of address blocks with 1024 entries each (2, 3, and 4 accesses).]

Exercise 2.3: Indirect Page Allocation, Shadow Page Mechanism, and Differential File Method

Assume that a database consists of two segments, each having 8 pages, and that 32 blocks are available for storing it. On this database, two interfering transactions T1 and T2 manipulate these pages according to the following sequence:

[Figure: timeline of the page accesses of T1 and T2. The pages modified are P12, P15, P16, P17, P21, P24, P25, and P27; each transaction ends with EOT.]

Pij: Page j in segment i
EOT: End of Transaction

For Indirect Page Allocation, we assume the following mapping of pages (P) to blocks (B):

Page:  P11 P12 P13 P14 P15 P16 P17 P18
Block: B1  B3  B6  B5  B7  B13 B9  B10

Page:  P21 P22 P23 P24 P25 P26 P27 P28
Block: B15 B18 B19 B17 B21 B28 B24 B26

To get used to the concepts of Indirect Page Allocation, Shadow Page Mechanism, and Differential File Method, please illustrate the principles of these concepts using this example. Before transaction T1 starts, all segments are closed.

a) Please describe the construction principle of the page tables regarding Indirect Page Allocation.

b) Which additional data structures are needed for the Shadow Page Mechanism? Please illustrate their modifications during transaction processing.

Solution to a):

Each segment has its own page table; entry j of the page table of segment i holds the number of the block that stores page Pij.

Page tables:
Seg. 1, V1:  1  3  6  5  7 13  9 10
Seg. 2, V2: 15 18 19 17 21 28 24 26

Bit table for free placement administration (one bit per block of the file, 1 = occupied):
Blocks  1 to  8: 1 0 1 0 1 1 1 0
Blocks  9 to 16: 1 1 0 0 1 0 1 0
Blocks 17 to 24: 1 1 1 0 1 0 0 1
Blocks 25 to 32: 0 1 0 1 0 0 0 0
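The bit table for free placement administration follows mechanically from the page-to-block mapping given for this exercise. The following is a minimal sketch of that relationship, not part of the original solution; the names `V`, `block_of`, `bit_table`, and `first_free_block` are chosen here purely for illustration.

```python
NUM_BLOCKS = 32  # blocks available in the file (from the exercise)

# Page tables: entry j-1 of V[i] is the block number that stores page Pij.
V = {
    1: [1, 3, 6, 5, 7, 13, 9, 10],
    2: [15, 18, 19, 17, 21, 28, 24, 26],
}


def block_of(segment, page):
    """Indirect page allocation: look up the block of page P<segment><page>."""
    return V[segment][page - 1]


def bit_table(page_tables, num_blocks=NUM_BLOCKS):
    """Return one bit per block; 1 means a page table references the block."""
    bits = [0] * num_blocks
    for table in page_tables.values():
        for block in table:
            bits[block - 1] = 1
    return bits


def first_free_block(bits):
    """Return the number of the lowest free block (blocks are numbered from 1)."""
    return bits.index(0) + 1


if __name__ == "__main__":
    print("P25 is stored in block", block_of(2, 5))
    print("bit table:", bit_table(V))
    print("first free block:", first_free_block(bit_table(V)))
```

Because every page access goes through the page table, a page can be moved to another block by changing a single table entry, which is the property the Shadow Page Mechanism builds on.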
Solution to b):

The Shadow Page Mechanism keeps two versions of the page table of each segment i, Vi0 and Vi1 (in this example, Vi0 is the current version and Vi1 the shadow version), the bit tables CMAP, MAP0, and MAP1, and the indicators STATUS and MAPSWITCH.

State after all modifications, without a checkpoint:

Page tables, current version:
V10:  1  2  6  5  8 14 16 10
V20:  4 18 19 12 11 28 20 26

Page tables, shadow version:
V11:  1  3  6  5  7 13  9 10
V21: 15 18 19 17 21 28 24 26

Bit tables (blocks 1 to 32):
CMAP: 1 1 1 1 1 1 1 1  1 1 1 1 1 1 1 1  1 1 1 1 1 0 0 1  0 1 0 1 0 0 0 0
MAP0: 1 0 1 0 1 1 1 0  1 1 0 0 1 0 1 0  1 1 1 0 1 0 0 1  0 1 0 1 0 0 0 0

• All modifications were successful.
• No checkpoint was created.
• MAPSWITCH shows the valid version of MAP.
• STATUS indicates modifications in both segments.

c) After EOT of T1, please apply the Shadow Page Mechanism and create a checkpoint for both segments. When EOT of T2 is reached, use the Shadow Page Mechanism and create a checkpoint for segment 2.

d) How must the Shadow Page Mechanism be modified to allow for an implementation of transaction-oriented checkpoints? Please retrace such a checkpoint at EOT of T1 using our example.

Solution to c), after EOT1:

State after establishing the first checkpoint:

Page tables:
V10:  1  2  6  5  8 14 16 10
V20:  4 18 19 12 11 28 24 26
V11:  1  3  6  5  7 13  9 10   (shadow version, invalid)
V21: 15 18 19 17 21 28 24 26   (shadow version, invalid)

Bit tables (blocks 1 to 32):
CMAP: 1 1 1 1 1 1 1 1  1 1 1 1 1 1 1 1  1 1 1 0 1 0 0 1  0 1 0 1 0 0 0 0   (invalid)
MAP0: 1 0 1 0 1 1 1 0  1 1 0 0 1 0 1 0  1 1 1 0 1 0 0 1  0 1 0 1 0 0 0 0
MAP1: 1 1 0 1 1 1 0 1  0 1 1 1 0 1 0 1  0 1 1 0 0 0 0 1  0 1 0 1 0 0 0 0

STATUS: 1 1 --> 0 0
MAPSWITCH: 0 --> 1

Actions:
• MAP1 re-calculated, stored, and switched to MAP1
• Current version stored
• Modified pages stored
• Resetting the STATUS
• CMAP and shadow pages are invalid (hatched tables in the figure). They will be initialized when the next modification occurs.
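The first checkpoint can be retraced with a few lines of code. This is only a sketch under stated assumptions: modified pages are written to the lowest free block according to CMAP, and the modification order used below is hypothetical, chosen so that this first-fit rule reproduces the block numbers of the solution. The helper names `bitmap` and `modify` are illustrative and do not come from the lecture.

```python
NUM_BLOCKS = 32


def bitmap(page_tables):
    """Return a bit table with a 1 for every block the page tables reference."""
    bits = [0] * NUM_BLOCKS
    for table in page_tables.values():
        for block in table:
            bits[block - 1] = 1
    return bits


# Shadow version: the page tables that are valid before T1 starts.
shadow = {1: [1, 3, 6, 5, 7, 13, 9, 10],
          2: [15, 18, 19, 17, 21, 28, 24, 26]}
# The current version starts as a copy of the shadow version.
current = {seg: list(table) for seg, table in shadow.items()}

map0 = bitmap(shadow)  # stored bit table, valid while MAPSWITCH = 0
cmap = list(map0)      # working copy used for block allocation


def modify(segment, page):
    """Write a modified page to the first free block; its shadow block stays occupied."""
    new_block = cmap.index(0) + 1
    cmap[new_block - 1] = 1
    current[segment][page - 1] = new_block


# Assumed (hypothetical) order of the modifications before EOT of T1.
for segment, page in [(1, 2), (2, 1), (1, 5), (2, 5), (2, 4), (1, 6), (1, 7)]:
    modify(segment, page)

# Checkpoint: only blocks referenced by the current version remain occupied.
map1 = bitmap(current)
released = [b + 1 for b in range(NUM_BLOCKS) if cmap[b] and not map1[b]]

if __name__ == "__main__":
    print("V10:", current[1])
    print("V20:", current[2])
    print("CMAP:", cmap)
    print("MAP1:", map1)
    print("blocks released by the checkpoint:", released)
```

Until the checkpoint, CMAP marks both the shadow blocks and the newly written blocks as occupied, so no old page version is overwritten; the blocks of the old versions become free only once MAP1 has been recalculated.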
Solution to c), after EOT2:

Page tables:
V10:  1  2  6  5  8 14 16 10
V20:  4 18 19 12 11 28  3 26
V21:  4 18 19 12 11 28 24 26   (shadow version)
V11: not affected

Bit tables (blocks 1 to 32):
CMAP: 1 1 1 1 1 1 0 1  0 1 1 1 0 1 0 1  0 1 1 0 0 0 0 1  0 1 0 1 0 0 0 0
MAP0: 1 1 1 1 1 1 0 1  0 1 1 1 0 1 0 1  0 1 1 0 0 0 0 0  0 1 0 1 0 0 0 0
MAP1: 1 1 0 1 1 1 0 1  0 1 1 1 0 1 0 1  0 1 1 0 0 0 0 1  0 1 0 1 0 0 0 0

STATUS: 0 0
MAPSWITCH: 0

Actions:
• Before modifying P27, V21 is initialized and CMAP is adjusted according to MAP1
• After establishing the checkpoint, MAP0 is created, stored, and switched to MAP0
• V1i are not affected

Solution to d):

In essence, transaction-specific tables are required. (Please refer to: Datenbanksysteme - Konzepte und Techniken der Realisierung, p. 101; in more detail in Härder, T.: Implementierung von Datenbanksystemen, Carl Hanser, München, 1978, or Härder, T., Reuter, A.: Optimization of Logging and Recovery in a Database System, in: Data Base Architecture, Proc. IFIP TC-2 Working Conference, June 1979, Venice, Italy, Bracchi, G. and Nijssen, G.M. (eds.), North Holland Publ. Comp., 1979, pp. 151-168.)

Exercise 2.4: Bloom Filter

For the Differential File Method, it is very effective to use a Bloom filter for deciding whether a page modification occurred and, as a consequence, whether the differential file probably contains this modification or not.

Let there be a bit vector of size 8 and a hash function h(x), which is defined as:

h(x) = (binary representation of x) XOR (01010101).

The pages with key values 31, 53, and 62 are modified. Now, the pages with key values 53, 93, and 124 shall be read.

a) Which results are provided by the Bloom filter for the read operations?

Solution:

Vector after initialization: (0000 0000).
h(31) = (0100 1010) --> vector after page modification: (0100 1010)
h(53) = (0110 0000) --> vector after page modification: (0110 1010)
h(62) = (0110 1011) --> vector after page modification: (0110 1011)

A read operation delivers "PROBABLY" if every position set in h(x) is also set in the vector.

Reading record 53 (h(53) = (0110 0000)) using the Bloom filter delivers "PROBABLY".
Reading record 93 (h(93) = (0000 1000)) using the Bloom filter also delivers "PROBABLY".
Reading record 124 (h(124) = (0010 1001)) using the Bloom filter results in "PROBABLY".

Pages 93 and 124 were never modified, so these two answers are false positives.

b) Why is the given hash function inappropriate for the Bloom filter, and which properties must be fulfilled by a well-suited hash function?

A suitable hash function should set a more or less equal number of vector positions for every record. The given hash function does not have this property: for example, h(53) sets two positions, whereas h(62) sets five.
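The filter from this exercise is small enough to check by machine. The sketch below assumes, as in the solution, that the whole 8-bit pattern h(x) is OR-ed into the vector and that a lookup answers "PROBABLY" when all bits of h(x) are set; the function names are illustrative only.

```python
MASK = 0b01010101  # constant from the exercise


def h(x):
    """Hash function of the exercise: 8-bit binary representation of x XOR 01010101."""
    return (x ^ MASK) & 0xFF


def build_vector(modified_keys):
    """OR the bit pattern of every modified key into an initially empty vector."""
    vector = 0
    for key in modified_keys:
        vector |= h(key)
    return vector


def lookup(vector, key):
    """Return 'PROBABLY' if all bits of h(key) are set in the vector, else 'NO'."""
    pattern = h(key)
    return "PROBABLY" if (vector & pattern) == pattern else "NO"


def bits_set(key):
    """Number of vector positions that h(key) sets."""
    return bin(h(key)).count("1")


if __name__ == "__main__":
    vector = build_vector([31, 53, 62])
    print("vector:", format(vector, "08b"))
    for key in (53, 93, 124):
        print(key, format(h(key), "08b"), lookup(vector, key))
    for key in (31, 53, 62):
        print("h(%d) sets %d positions" % (key, bits_set(key)))
```

Counting the positions set per key makes the weakness discussed in part b) visible: the number of bits varies from key to key, and a few keys with many bits quickly saturate the 8-bit vector.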