Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
15.6 Index Based Algorithms Priyank Shah CS 257 Contents • Clustering and non-clustering indexes • Index based Selection • Joining using an index • Joining using a sorted index • A database index is data structure that improves the speed of data retrieval operations on a database table at the cost of slower writes and increased storage space. • Non clustered index - contains the index keys in sorted order, with the leaf level of the index containing the pointer to the record • A bitmap index is a special kind of index that stores the bulk of its data as bit arrays • Algorithms are useful for the selection operator. • In a clustered relation tuples are packed roughly as few blocks, as they can possibly hold those tuples. Clustering And Nonclustering Indexes • Clustering indexes are on an attribute or attributes such that all the tuples with a fixed value for the search key of this index appear on roughly as few blocks as can hold them. • A relation that isn’t clustered cannot have a clustering index Index-based Selection • For a selection σC(R), suppose C is of the form a=v, where a is an attribute • For clustering index R.a: the number of disk I/O’s will be B(R)/V(R,a) Index-based Selection • The actual number may be higher: 1. index is not kept entirely in main memory 2. they spread over more blocks 3. may not be packed as tightly as possible into blocks Example • B(R)=1000, T(R)=20,000 number of I/O’s required: • 1. clustered, not index 1000 • 2. not clustered, not index 20,000 • 3. If V(R,a)=100, index is clustering 10 • 4. If V(R,a)=10, index is nonclustering 2,000 Joining by using an index • Natural join R(X, Y) S S(Y, Z) Number of I/O’s to get R Clustered: B(R) Not clustered: T(R) Number of I/O’s to get tuple t of S Clustered: T(R)B(S)/V(S,Y) Not clustered: T(R)T(S)/V(S,Y) Example • R(X,Y): 1000 blocks S(Y,Z)=500 blocks Assume 10 tuples in each block, so T(R)=10,000 and T(S)=5000 V(S,Y)=100 If R is clustered, and there is a clustering index on Y for S the number of I/O’s for R is: 1000 the number of I/O’s for S is10,000*500/100=50,000 Joining Using a Sorted index • Natural join R(X, Y) S (Y, Z) with index on Y for either R or S • Example: relation R(X,Y) and R(Y,Z) with index on Y for both relations search keys (Y-value) for R: 1,3,4,4,5,6 search keys (Y-value) for S: 2,2,4,6,7,8 Joining using a sorted index • Used when the index is a B-tree, or structure from which we easily can extract the tuples of a relation in sorted order.