Download Overview of Storage and Indexing

Overview of Storage and Indexing Yanlei Diao UMass Amherst Feb 13, 2007 Slides Courtesy of R. Ramakrishnan and J. Gehrke 1 DBMS Architecture Query Parser Query Rewriter Query Optimizer Query Executor Lock Manager Access Methods Log Manager Buffer Manager Disk Space Manager DB 2 Data on External Storage  Disks: Can retrieve random pages at (almost) fixed cost  But reading several consecutive pages is much faster than reading them in random order.  Tapes: Can only read pages in sequence  Slower but cheaper than disks; used for archival storage  Page: Unit of information read from or written to disk  Size of page: DBMS parameter, 4KB, 8KB  Disk space manager:  Abstraction: a collection of pages.  Allocate/de-allocate a page.  Read/write a page.  Page I/O:  Pages read from disk and pages written to disk  Dominant cost of database operations 3 Access Methods  Access methods: routines to manage various disk-based data structures.  Files of records  Various kinds of indexes  File of records:  Important abstraction of external storage in a DBMS!  Record id (rid) is sufficient to physically locate a record  Indexes:  Auxiliary data structures  Given a value in the index search key field(s), find the record ids of records with this value 4 Buffer Management  Architecture:  Data is read into memory for processing  Data is written to disk for persistent storage Buffer manager stages pages between external storage and main memory buffer pool.  Access method layer makes calls to the buffer manager.  5 Access Methods Many alternatives exist, each ideal for some situations, and not so good for others:    Heap (unorder) files: Suitable when typical access is a file scan retrieving all records. Sorted Files: Best if records are retrieved in order of the sorting attribute(s), or only a `range’ of records is needed. Indexes: Data structures to organize records via trees or hashing. • • Like sorted files, they speed up searches for a subset of records, based on values in certain (“search key”) fields Updates are much faster than in sorted files. 6 Indexes  An index on a file speeds up selections on the search key field(s) for the index.    Any subset of the fields of a relation can be the search key for an index on the relation. Search key is not the same as key (minimal set of fields that uniquely identify a record in a relation). An index contains a collection of data entries, and supports efficient retrieval of all data entries k* with a given key value k.  Data entry (in an index) versus data record (in a file).  Given a data entry k*, we can find record(s) with the key k in at most one disk I/O. (Details soon …) 7 B+ Tree Indexes Non-leaf Pages Leaf Pages (Sorted by search key)  Leaf pages contain data entries: • Leaf pages are chained using prev & next pointers • Data entries are sorted by the search key value 8 B+ Tree Indexes Non-leaf Pages Leaf Pages (Sorted by search key)  Non-leaf pages have index entries; only used to direct searches index entry P0 K 1 P1 K 2 P 2 K m Pm 9 Example B+ Tree Data entries in leaf nodes are sorted. Root 17 Entries <= 17 5 2*    3* Entries > 17 27 13 5* 7* 8* 14* 16* 22* 24* 30 27* 29* 33* 34* 38* 39* Equality selection: find 28*? 29*? Range selection: find all > 15* and < 30* Insert/delete: Find the data entry in a leaf, then change it.  Need to adjust the parent sometimes.  And the change sometimes bubbles up the tree. 10 Hash-Based Indexes Good for equality selections.  A hash index is a collection of buckets.   Bucket = primary page plus zero or more overflow pages. Search key value k h 1 2 3 … … … … … … N-1  Buckets contain data entries.  Hashing function h: h(k) = bucket of data entries that may contain the search key value k.  No need for “index entries” in this scheme. 11 Alternatives for Data Entry k* in Index  In a data entry k* we can store:  Alt1: Data record with key value k  Alt2: <k, rid of data record with search key value k>  Alt3: <k, list of rids of data records with search key k>  Choice of an alternative for data entries is orthogonal to an indexing technique used.  Indexing techniques: B+ tree, hashing  Every indexing technique can be combined with any alternative. 12 Alternative 1 for Data Entries  Alternative 1:  Index structure itself is a file organization for data records (instead of a Heap or sorted file).  At most one index on a given collection of data records can use Alternative 1. •  Otherwise, data records are duplicated, leading to redundant storage and potential inconsistency. If data records are very large, # of pages containing data entries is high. • Size of the index structure to direct searches (e.g., non-leaf nodes in B+ tree) can also be large. 13 Alternatives 2, 3 for Data Entries  Alternatives 2 and 3:  Data entries, <k, rid(s)>, typically much smaller than data records. • •  Alternative 3 more compact than Alternative 2 •  Index structure used to direct search is much smaller than with Alternative 1. So, better than Alternative 1 with large data records, especially if search keys are small. In the presence of multiple K* entries! Alternative 3 leads to variable sized data entries, even if search keys are of fixed length. • Variable sizes records/data entries are hard to manage. 14 Index Classification  Recall: Key? Primary key? Candidate key?  Primary index vs. secondary index:  If search key contains the primary key, then called primary index.  Other indexes are called secondary indexes.  Unique index: Search key contains a candidate key.  No data entries can have the same search key value. 15 Index Classification (Contd.)  Clustered index: order of data records is the same as, or `close to’, order of (sorted) data entries.      Alternative 1 implies clustered Alternatives 2 and 3 are clustered only if data records are sorted on the search key field. A data file can be clustered on at most one search key. Retrieving data records: fast, with one or a few I/Os. Why? Unclustered index:   Multiple unclustered indexes on a data file. Retrieving data records: can be slow, toughing many pages. 16 Clustered vs. Unclustered Indexes CLUSTERED Index entries direct search for data entries Data entries UNCLUSTERED Data entries (Index File) (Data file) Data Records  Data Records Suppose that Alternative (2) is used for data entries, and data records are stored in a Heap file.   To build a clustered index, first sort the Heap file (with some free space on each page for future inserts). Overflow pages may be needed for inserts. (Thus, order of data records is `close to’, but not identical to, the sort order.) 17 Index Classification (Contd.)  Dense index: contains (at least) one data entry for every search key value that appears in a record in the indexed file  Alternatives 1, 2, 3  Sparse index: contains one entry for each page of records in the indexed file.     Sparse index has to be clustered! Alternative 1? Alternative 2? Alternative 3? 18 Cost Model for Our Analysis We ignore CPU costs, for simplicity:  B: The number of data pages  R: Number of records per page  D: (Average) time to read or write disk page  I/O cost: Measuring # of page I/O’s ignores pre-fetching of a sequence of pages; thus, approximate I/O cost.  Average-case analysis, based on simplistic assumptions.  Good enough to show the overall trends! 19 Comparing File Organizations      Heap files (random order; insert at eof) Sorted files, sorted on <age, sal> Clustered B+ tree file, Alternative (1), search key <age, sal> Unclustered B + tree index on a heap file, search key <age, sal> Unclustered hash index on a heap file, search key <age, sal> 20 Operations to Compare   Scan: Fetch all records from disk Equality search: e.g. (age, sal) = (24, 50K)  Range selection: e.g. (age, sal)  [(22, 20K), (30, 200K)]   Insert a record Delete a record 21 Assumptions in Our Analysis  Heap Files:   Sorted Files:   Equality selection on key; exactly one match. Files compacted after deletions. Indexes:  Alt (2), (3): size of data entry = 10% size of data record  Hash: No overflow chains. •  80% page occupancy => File size = 1.25 data size B+Tree: • • 67% occupancy (typical): implies file size = 1.5 data size Balanced with fanout F (133 typical) at each non-leaf level 22 Assumptions (contd.)  Scans:  Leaf nodes of a tree-index are chained.  Index data entries plus actual file scanned for unclustered indexes.  Range searches:  We use tree indexes to restrict the set of data records fetched, but ignore hash indexes. 23 Cost of Operations (a) Scan (b) Equality (c ) Range (d) Insert (e) Delete (1) Heap BD 0.5BD BD 2D (2) Sorted BD Dlog 2B D(log 2 B + # pgs with match recs) (3) 1.5BD Dlog F 1.5B D(log F 1.5B Clustered + # pages w. match recs) (4) Unclust. BD(R+0.15) D(1 + D(log F 0.15B Tree index log F 0.15B) + # match recs) (5) Unclust. BD(R+0.125) 2D BD Hash index Search + BD Search +D Search +BD Search +D Search +D Search + 2D Search + 2D Search + 2D Search + 2D  Several assumptions underlie these (rough) estimates! 24 Cost of Operations (a) Scan (b) Equality (c ) Range (d) Insert (e) Delete (1) Heap BD 0.5BD BD 2D (2) Sorted BD Dlog 2B Search +D Search +BD D(log 2 B + Search # pgs with + BD match recs) (3) 1.5BD Dlog F 1.5B D(log F 1.5B Search Search Clustered + # pages w. + D +D match recs) (4) Unclust. BD(R+0.15) D(1 + D(log F 0.15B Search Search Tree index log F 0.15B) + # match + 2D + 2D recs) (5) BD(R+0.125) 2D BD Search # ofUnclust. leaf pages: B*0.1/67%=0.15B; Cost of index I/O: 0.15BD Search Hash index # of data entries on a leaf page: R*10*0.67=6.7R; Cost of data I/O:+0.15B*6.7R*D=BDR 2D + 2D Key: unclustered B+tree, one I/O per data entry! # of data entry pages: B*0.1/80%=0.125B; Cost of index I/O: 0.125BD Several assumptions underlie these (rough) estimates! # of data entries on a hash page: R*10*0.80=8R; Cost of data I/O: 0.125B*8R*D=BDR Key: unclustered hash index, one I/O per data entry! 25 Summary Many alternative access methods exist, each appropriate in some situation.  Index is a collection of data entries plus a way to quickly find entries with given key values.  If selection queries are frequent, building an index is important.    Hash-based indexes only good for equality search. Tree-based indexes best for range search; also good for equality search. 26 Summary (Contd.)  Data entries can be actual data records, <key, rid> pairs, or <key, rid-list> pairs.  Choice orthogonal to indexing technique used to locate data entries with a given key value. Can have several indexes on a given file of data records, each with a different search key.  Indexes can be classified as clustered vs. unclustered, primary vs. secondary, etc. Differences have important consequences for utility/performance.  27 Questions 28

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Download Overview of Storage and Indexing