Download Overview of Storage and Indexing

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Binary search tree wikipedia , lookup

B-tree wikipedia , lookup

Transcript
Overview of Storage and Indexing
Yanlei Diao
UMass Amherst
Feb 13, 2007
Slides Courtesy of R. Ramakrishnan and J. Gehrke
1
DBMS Architecture
Query Parser
Query Rewriter
Query Optimizer
Query Executor
Lock Manager
Access Methods
Log Manager
Buffer Manager
Disk Space Manager
DB
2
Data on External Storage

Disks: Can retrieve random pages at (almost) fixed cost
 But reading several consecutive pages is much faster than reading
them in random order.

Tapes: Can only read pages in sequence
 Slower but cheaper than disks; used for archival storage

Page: Unit of information read from or written to disk
 Size of page: DBMS parameter, 4KB, 8KB

Disk space manager:
 Abstraction: a collection of pages.
 Allocate/de-allocate a page.
 Read/write a page.

Page I/O:
 Pages read from disk and pages written to disk
 Dominant cost of database operations
3
Access Methods

Access methods: routines to manage various disk-based
data structures.
 Files of records
 Various kinds of indexes

File of records:
 Important abstraction of external storage in a DBMS!
 Record id (rid) is sufficient to physically locate a record

Indexes:
 Auxiliary data structures
 Given a value in the index search key field(s), find the record
ids of records with this value
4
Buffer Management

Architecture:
 Data is read into memory for processing
 Data is written to disk for persistent storage
Buffer manager stages pages between external
storage and main memory buffer pool.
 Access method layer makes calls to the buffer
manager.

5
Access Methods
Many alternatives exist, each ideal for some
situations, and not so good for others:



Heap (unorder) files: Suitable when typical access
is a file scan retrieving all records.
Sorted Files: Best if records are retrieved in order
of the sorting attribute(s), or only a `range’ of
records is needed.
Indexes: Data structures to organize records via
trees or hashing.
•
•
Like sorted files, they speed up searches for a subset of
records, based on values in certain (“search key”) fields
Updates are much faster than in sorted files.
6
Indexes

An index on a file speeds up selections on the search key
field(s) for the index.



Any subset of the fields of a relation can be the search key for an
index on the relation.
Search key is not the same as key (minimal set of fields that
uniquely identify a record in a relation).
An index contains a collection of data entries, and
supports efficient retrieval of all data entries k* with a
given key value k.
 Data entry (in an index) versus data record (in a file).
 Given a data entry k*, we can find record(s) with the key k in at
most one disk I/O. (Details soon …)
7
B+ Tree Indexes
Non-leaf
Pages
Leaf
Pages
(Sorted by search key)

Leaf pages contain data entries:
• Leaf pages are chained using prev & next pointers
• Data entries are sorted by the search key value
8
B+ Tree Indexes
Non-leaf
Pages
Leaf
Pages
(Sorted by search key)

Non-leaf pages have index entries; only used to direct searches
index entry
P0
K 1
P1
K 2
P 2
K m Pm
9
Example B+ Tree
Data entries in leaf
nodes are sorted.
Root
17
Entries <= 17
5
2*



3*
Entries > 17
27
13
5*
7* 8*
14* 16*
22* 24*
30
27* 29*
33* 34* 38* 39*
Equality selection: find 28*? 29*?
Range selection: find all > 15* and < 30*
Insert/delete: Find the data entry in a leaf, then change it.
 Need to adjust the parent sometimes.
 And the change sometimes bubbles up the tree.
10
Hash-Based Indexes
Good for equality selections.
 A hash index is a collection of
buckets.

 Bucket = primary page plus zero
or more overflow pages.
Search key
value k
h
1 2 3 … … … … … … N-1
 Buckets contain data entries.

Hashing function h:
h(k) = bucket of data entries that
may contain the search key value k.
 No need for “index entries” in this scheme.
11
Alternatives for Data Entry k* in Index

In a data entry k* we can store:
 Alt1: Data record with key value k
 Alt2: <k, rid of data record with search key value k>
 Alt3: <k, list of rids of data records with search key k>

Choice of an alternative for data entries is
orthogonal to an indexing technique used.
 Indexing techniques: B+ tree, hashing
 Every indexing technique can be combined with any
alternative.
12
Alternative 1 for Data Entries

Alternative 1:

Index structure itself is a file organization for data
records (instead of a Heap or sorted file).

At most one index on a given collection of data records
can use Alternative 1.
•

Otherwise, data records are duplicated, leading to redundant
storage and potential inconsistency.
If data records are very large, # of pages containing
data entries is high.
•
Size of the index structure to direct searches (e.g., non-leaf
nodes in B+ tree) can also be large.
13
Alternatives 2, 3 for Data Entries

Alternatives 2 and 3:

Data entries, <k, rid(s)>, typically much smaller than
data records.
•
•

Alternative 3 more compact than Alternative 2
•

Index structure used to direct search is much smaller than
with Alternative 1.
So, better than Alternative 1 with large data records,
especially if search keys are small.
In the presence of multiple K* entries!
Alternative 3 leads to variable sized data entries,
even if search keys are of fixed length.
•
Variable sizes records/data entries are hard to manage.
14
Index Classification

Recall: Key? Primary key? Candidate key?

Primary index vs. secondary index:
 If search key contains the primary key, then called
primary index.
 Other indexes are called secondary indexes.

Unique index: Search key contains a candidate
key.

No data entries can have the same search key value.
15
Index Classification (Contd.)

Clustered index: order of data records is the same
as, or `close to’, order of (sorted) data entries.





Alternative 1 implies clustered
Alternatives 2 and 3 are clustered only if data records
are sorted on the search key field.
A data file can be clustered on at most one search key.
Retrieving data records: fast, with one or a few I/Os.
Why?
Unclustered index:


Multiple unclustered indexes on a data file.
Retrieving data records: can be slow, toughing many
pages.
16
Clustered vs. Unclustered Indexes
CLUSTERED
Index entries
direct search for
data entries
Data entries
UNCLUSTERED
Data entries
(Index File)
(Data file)
Data Records

Data Records
Suppose that Alternative (2) is used for data entries, and data
records are stored in a Heap file.


To build a clustered index, first sort the Heap file (with some free
space on each page for future inserts).
Overflow pages may be needed for inserts. (Thus, order of data
records is `close to’, but not identical to, the sort order.)
17
Index Classification (Contd.)

Dense index: contains (at least) one data entry for
every search key value that appears in a record in the
indexed file
 Alternatives 1, 2, 3

Sparse index: contains one entry for each page of
records in the indexed file.




Sparse index has to be clustered!
Alternative 1?
Alternative 2?
Alternative 3?
18
Cost Model for Our Analysis
We ignore CPU costs, for simplicity:

B: The number of data pages

R: Number of records per page

D: (Average) time to read or write disk page

I/O cost: Measuring # of page I/O’s ignores pre-fetching
of a sequence of pages; thus, approximate I/O cost.

Average-case analysis, based on simplistic assumptions.
 Good enough to show the overall trends!
19
Comparing File Organizations





Heap files (random order; insert at eof)
Sorted files, sorted on <age, sal>
Clustered B+ tree file, Alternative (1),
search key <age, sal>
Unclustered B + tree index on a heap file,
search key <age, sal>
Unclustered hash index on a heap file,
search key <age, sal>
20
Operations to Compare


Scan: Fetch all records from disk
Equality search:
e.g. (age, sal) = (24, 50K)

Range selection:
e.g. (age, sal)  [(22, 20K), (30, 200K)]


Insert a record
Delete a record
21
Assumptions in Our Analysis

Heap Files:


Sorted Files:


Equality selection on key; exactly one match.
Files compacted after deletions.
Indexes:
 Alt (2), (3): size of data entry = 10% size of data record
 Hash: No overflow chains.
•

80% page occupancy => File size = 1.25 data size
B+Tree:
•
•
67% occupancy (typical): implies file size = 1.5 data size
Balanced with fanout F (133 typical) at each non-leaf level
22
Assumptions (contd.)

Scans:
 Leaf nodes of a tree-index are chained.
 Index data entries plus actual file scanned for
unclustered indexes.

Range searches:
 We use tree indexes to restrict the set of data
records fetched, but ignore hash indexes.
23
Cost of Operations
(a) Scan
(b) Equality
(c ) Range
(d) Insert (e) Delete
(1) Heap
BD
0.5BD
BD
2D
(2) Sorted
BD
Dlog 2B
D(log 2 B +
# pgs with
match recs)
(3)
1.5BD
Dlog F 1.5B D(log F 1.5B
Clustered
+ # pages w.
match recs)
(4) Unclust. BD(R+0.15)
D(1 +
D(log F 0.15B
Tree index
log F 0.15B) + # match
recs)
(5) Unclust. BD(R+0.125) 2D
BD
Hash index
Search
+ BD
Search
+D
Search
+BD
Search
+D
Search
+D
Search
+ 2D
Search
+ 2D
Search
+ 2D
Search
+ 2D
 Several assumptions underlie these (rough) estimates!
24
Cost of Operations
(a) Scan
(b) Equality
(c ) Range
(d) Insert (e) Delete
(1) Heap
BD
0.5BD
BD
2D
(2) Sorted
BD
Dlog 2B
Search
+D
Search
+BD
D(log 2 B +
Search
# pgs with
+ BD
match recs)
(3)
1.5BD
Dlog F 1.5B D(log F 1.5B Search Search
Clustered
+ # pages w. + D
+D
match recs)
(4) Unclust. BD(R+0.15)
D(1 +
D(log F 0.15B Search Search
Tree index
log F 0.15B) + # match
+ 2D
+ 2D
recs)
(5)
BD(R+0.125)
2D
BD
Search
# ofUnclust.
leaf pages:
B*0.1/67%=0.15B;
Cost of index I/O:
0.15BD Search
Hash
index
# of data
entries on a leaf page: R*10*0.67=6.7R; Cost of data I/O:+0.15B*6.7R*D=BDR
2D
+ 2D
Key:
unclustered
B+tree,
one I/O per data entry!
#
of data
entry pages:
B*0.1/80%=0.125B;
Cost of index I/O: 0.125BD
Several
assumptions
underlie
these
(rough)
estimates!
# of data
entries
on a hash
page: R*10*0.80=8R;
Cost
of data
I/O: 0.125B*8R*D=BDR
Key: unclustered hash index, one I/O per data entry!
25
Summary
Many alternative access methods exist, each
appropriate in some situation.
 Index is a collection of data entries plus a way
to quickly find entries with given key values.
 If selection queries are frequent, building an
index is important.



Hash-based indexes only good for equality search.
Tree-based indexes best for range search; also good
for equality search.
26
Summary (Contd.)

Data entries can be actual data records, <key,
rid> pairs, or <key, rid-list> pairs.

Choice orthogonal to indexing technique used to
locate data entries with a given key value.
Can have several indexes on a given file of
data records, each with a different search key.
 Indexes can be classified as clustered vs.
unclustered, primary vs. secondary, etc.
Differences have important consequences for
utility/performance.

27
Questions
28