15.6 Index Based Algorithms
Priyank Shah
CS 257
Contents
• Clustering and non-clustering indexes
• Index based Selection
• Joining using an index
• Joining using a sorted index
• A database index is a data structure that
improves the speed of data retrieval
operations on a database table at the cost of
slower writes and increased storage space.
• A nonclustering index contains the index keys
in sorted order, with the leaf level of the index
containing pointers to the records
• A bitmap index is a special kind of index that
stores the bulk of its data as bit arrays
• Index-based algorithms are especially useful
for the selection operator.
• In a clustered relation, the tuples are packed
into roughly as few blocks as can possibly
hold them.
Clustering And Nonclustering Indexes
• Clustering indexes are on an attribute or
attributes such that all the tuples with a fixed
value for the search key of this index appear
on roughly as few blocks as can hold them.
• A relation that isn’t clustered cannot have a
clustering index
Index-based Selection
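• For a selection σ_C(R), suppose C is of the form
a=v, where a is an attribute with an index and v is a value
• With a clustering index on R.a:
the number of disk I/O’s will be about B(R)/V(R,a)
• With a nonclustering index on R.a:
the number of disk I/O’s will be about T(R)/V(R,a),
since each matching tuple may be on a different block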
Index-based Selection
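• The actual number may be higher because:
1. the index is not kept entirely in main
memory, so the lookup itself costs disk I/O’s
2. the tuples with a=v may be spread over more
blocks than the estimate, since they need not
start at the beginning of a block
3. the tuples may not be packed as tightly as
possible into blocks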
Example
• B(R)=1000, T(R)=20,000. Number of I/O’s
required for the selection a=v:
• 1. R is clustered, no index is used: B(R) = 1000
• 2. R is not clustered, no index is used: T(R) = 20,000
• 3. If V(R,a)=100 and the index is clustering:
1000/100 = 10
• 4. If V(R,a)=10 and the index is nonclustering:
20,000/10 = 2,000
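The four cases in this example can be checked with a small calculation. The following is only a sketch of the cost estimates on these slides; the function name `selection_io` and its parameters are hypothetical names chosen for illustration, and the extra costs listed earlier (index lookups, loosely packed blocks) are ignored.

```python
import math


def selection_io(B, T, V=None, relation_clustered=True, index=None):
    """Estimate disk I/O's for a selection a = v on relation R.

    B, T  -- number of blocks and tuples of R, i.e. B(R) and T(R)
    V     -- number of distinct values of a, i.e. V(R,a)
    index -- None (scan R), "clustering", or "nonclustering"
    """
    if index is None:
        # Without an index the whole relation is scanned.
        return B if relation_clustered else T
    if index == "clustering":
        # Matching tuples sit on about B(R)/V(R,a) blocks.
        return math.ceil(B / V)
    if index == "nonclustering":
        # Each matching tuple may be on a different block.
        return math.ceil(T / V)
    raise ValueError("index must be None, 'clustering' or 'nonclustering'")


print(selection_io(1000, 20000))                            # case 1
print(selection_io(1000, 20000, relation_clustered=False))  # case 2
print(selection_io(1000, 20000, V=100, index="clustering"))    # case 3
print(selection_io(1000, 20000, V=10, index="nonclustering"))  # case 4
```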
Joining by using an index
• Natural join R(X, Y) ⋈ S(Y, Z), with an index on Y for S
Number of I/O’s to read R
Clustered: B(R)
Not clustered: T(R)
Number of I/O’s to fetch the matching tuples of S,
summed over all tuples t of R
Clustering index: T(R)B(S)/V(S,Y)
Nonclustering index: T(R)T(S)/V(S,Y)
Example
• R(X,Y): 1000 blocks; S(Y,Z): 500 blocks
Assume 10 tuples in each block,
so T(R)=10,000 and T(S)=5000
V(S,Y)=100
If R is clustered, and there is a clustering index on
Y for S
the number of I/O’s for R is:
1000
the number of I/O’s for S is:
10,000*500/100 = 50,000
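The arithmetic of this example can be reproduced with a few lines of code. This is a sketch of the estimate only, assuming the formulas on the previous slide; `index_join_io` and its parameter names are hypothetical and not part of the original slides.

```python
def index_join_io(B_R, T_R, B_S, T_S, V_SY,
                  r_clustered=True, s_index_clustering=True):
    """Estimate disk I/O's for joining R and S using an index on S.Y.

    Returns a pair: (I/O's to read R, I/O's to fetch matching S tuples).
    """
    # Reading R: one I/O per block if clustered, else one per tuple.
    read_r = B_R if r_clustered else T_R

    # For each tuple of R, fetch the S tuples with the same Y-value:
    # B(S)/V(S,Y) blocks with a clustering index,
    # T(S)/V(S,Y) tuples (one I/O each) with a nonclustering index.
    per_r_tuple = (B_S if s_index_clustering else T_S) / V_SY
    fetch_s = T_R * per_r_tuple
    return read_r, fetch_s


read_r, fetch_s = index_join_io(B_R=1000, T_R=10_000,
                                B_S=500, T_S=5000, V_SY=100)
print(read_r, fetch_s)
```

With the numbers of the example, the cost of fetching from S dominates the cost of reading R, which is why this method pays off mainly when R is small compared with S or V(S,Y) is large.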
Joining Using a Sorted index
• Natural join R(X, Y) ⋈ S(Y, Z) with a sorted index on Y
for either R or S
• Example:
relations R(X,Y) and S(Y,Z) with an index on Y for
both relations
search keys (Y-value) for R: 1,3,4,4,5,6
search keys (Y-value) for S: 2,2,4,6,7,8
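When both relations have a sorted index on Y, the two key lists can be merged so that tuples are fetched only for Y-values present in both indexes (this is often called a zig-zag join). The sketch below works on plain Python lists standing in for the index keys; the function name `matching_keys` is a hypothetical helper, and a real system would walk the index leaves rather than in-memory lists.

```python
def matching_keys(r_keys, s_keys):
    """Merge two sorted key lists.

    Returns a list of (y, count_in_R, count_in_S) for every Y-value
    that occurs in both lists. Only these values require fetching tuples.
    """
    i, j = 0, 0
    matches = []
    while i < len(r_keys) and j < len(s_keys):
        if r_keys[i] < s_keys[j]:
            i += 1                      # key only in R so far: skip it
        elif r_keys[i] > s_keys[j]:
            j += 1                      # key only in S so far: skip it
        else:
            y = r_keys[i]
            i_end = i
            while i_end < len(r_keys) and r_keys[i_end] == y:
                i_end += 1              # run of equal keys in R
            j_end = j
            while j_end < len(s_keys) and s_keys[j_end] == y:
                j_end += 1              # run of equal keys in S
            matches.append((y, i_end - i, j_end - j))
            i, j = i_end, j_end
    return matches


r_keys = [1, 3, 4, 4, 5, 6]   # search keys for R from the example
s_keys = [2, 2, 4, 6, 7, 8]   # search keys for S from the example
print(matching_keys(r_keys, s_keys))  # [(4, 2, 1), (6, 1, 1)]
```

For the example keys, only the tuples with Y = 4 and Y = 6 need to be retrieved; the tuples with keys 1, 2, 3, 5, 7 and 8 are never read.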
Joining using a sorted index
• Used when the index is a B-tree, or any other
structure from which we can easily extract the
tuples of a relation in sorted order.