* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download Document
Survey
Document related concepts
Transcript
B-Trees Chapter 18 in CLRS (2nd and 3rd) Chapter 19 in CLR Introduction B-trees are balanced search trees designed to work well on secondary storage devices. They are similar to red-black trees, but are better at minimizing disk I/O operations. One major difference between red-black trees and B-trees is that the nodes in Btrees may have many children – even thousands. The actual number is determined by the features of the memory being used. Being balanced and with “many children” assures that the height of an n-node B-tree is O(logn). Therefore, the dynamic set operations will all be implemented in O(logn) time. 2 Introduction B-trees generalize binary search trees in a natural manner. If a B-tree node contains n[x] keys, then x has n[x] + 1 children. The keys in node x are used as dividing points separating the range of keys handled by x into n[x] + 1 subranges, each handled by one child of x. When searching for a key in a B-tree, we make an (n[x] + 1) – way decision based on comparisons with the n[x] keys stored at node x. 3 Introduction 4 Introduction – data structures on secondary storage There are many different technologies available for providing memory capacity in a computer system. The primary memory (or main memory) of a computer normally consists of silicon memory chips. This technology is much more expensive per bit stored than magnetic storage technology, such as tapes or disks. Most computers have secondary storage based on magnetic disks. The amount of such secondary storage often exceeds the amount of primary storage by at least two orders of magnitude. 5 Introduction – data structures on secondary storage The main disadvantage of a secondary storage is the access time: the disk drive’s operations include mechanical movements of the head that reads the data. This results in access time that is much longer than the time needed to read data from the primary storage. The factors may be huge, milliseconds compared to hundreds of nanoseconds (a factor of 100,000). To make better use of the secondary storage, data is accessed in bulks – the information on the disk is divided to pages of bits, appearing consecutively within the disk, and each disk operation is of an entire page (whose size is about 212 bytes in length). Once the right page is found, the transfer rate is high. 6 Introduction – data structures on secondary storage Often, it takes more time to access a page of information and read it from a disk than it takes for the computer to examine all the information read. Therefore, we shall look separately at two different components of the running time: •The number of disk accesses, and •The CPU time. The first component is measured in terms of the number of pages of information that are to be read from or written to the disk (though this time depends on the exact location of the pages on the disk, we will take it as a good approximation). 7 Introduction – data structures on secondary storage In a typical B-tree application, the amount of data handled is so large that we cannot fit all the data into the main memory at once. The B-tree algorithms copy selected pages from the disk into the main memory as needed, and write back onto the disk when the pages have been changed. Since the algorithms only need a constant number of pages in the main memory at any given time, we are not limited by the size of this memory regarding the size of the given B-tree. 8 Introduction – data structures on secondary storage How do we model disk operations? Let x be a pointer to an object. If the object is in the main memory, then we refer to it’s fields as usual. However, if the object is in the disk, we must first perform the operation DISK-READ(x). This operation reads the object x into the main memory before “regular” operations on the fields may take place. Similarly, we use DISK-WRITE(x) to update the main memory with any changes made to the different fields. A typical pattern for working with an object is as follows: 9 Introduction – data structures on secondary storage x←a pointer to some object DISK-READ(x) operations that access and/or modify the fields of x DISK-WRITE(x) Omitted if no fields of x were changed. other operations that access but do not modify fields of x The system can keep only a limited number of pages in the main memory at any given time. We assume that pages no longer in use are flushed from the main memory by the system, without any notice by the B-tree algorithms. 10 Introduction – data structures on secondary storage Since in most systems the running time of a B-tree algorithm is determined mainly by the reading/writing to the disk, it seems wise to use these operations efficiently. I.e., whenever we approach such an operation, we should read as much data as possible. This leads to the fact that a node in a B-tree is usually as large as a whole disk page. The number of children of a node is therefore limited by the size of the disk’s page. For large B-trees stored on a disk, the numbers are between 50 and 2000 children per node, depending on the size of the key compared to the size of the page. The bigger the number – the smaller the height of the tree and the number of disk accesses required to find any key. 11 Introduction – data structures on secondary storage The following shows a B-tree with 1001 children per node, and height 2. This tree stores over 1,000,000,000 keys, yet two disk accesses at most suffice to find any key in the tree! 12 19.1 Definition of B-trees A B-tree T is a rooted tree (whose root is root[T]) having the following properties: 1. Every node x has the following fields: a. n[x], the number of keys currently stored in node x. b. the n[x] keys themselves, stored in nondecreasing order, so that key1[x] ≤ key2[x] ≤ key3[x] ≤ ≤ keyn[x][x]. c. leaf[x], a boolean value that is TRUE if x is a leaf and FALSE if x is an internal node. 13 19.1 Definition of B-trees 2. Each internal node x also contains n[x]+1 pointers c1[x], c2[x], ..., cn[x]+1[x] to its children. Leaf nodes have no children, so their ci fields are undefined. 3. The keys keyi[x] separate the range of keys stored in each subtree: if ki is any key stored in the subtree with root ci[x], then k1 ≤ key1[x] ≤ k2 ≤ key2[x] ≤ k3 ≤ key3[x] ≤ ≤ keyn[x][x] ≤ kn[x]+1. 4. All leaves have the same depth, which is the tree’s height h. 14 19.1 Definition of B-trees 5. There are lower and upper bounds on the number of keys that a node can contain. These bounds can be expressed in terms of a fixed integer t≥2 called the minimum degree of the B-tree: a. Every node other than the root must have at least t-1 keys. Every internal node other then the root thus has at least t children. If the tree is nonempty, the root must have at least one key. b. Every node can contain at most 2t-1 keys. Therefore, an internal node can have at most 2t children. We say that a node is full if it has exactly 2t children. 15 19.1 Definition of B-trees – the height of a B-tree As in red-black trees, the height of the tree proves to be the key to understanding the running time of the tree’s operations. Theorem: If n ≥ 1, then for any n-key B-tree T of height h and minimum degree t ≥ 2, h ≤ logt [½(n+1)]. 16 19.1 Definition of B-trees – the height of a B-tree Proof: If the tree has height h, the root contains at least one key and all other nodes contain at least t-1 keys. Thus, there are at least 2 nodes at depth 1, at least 2t nodes at depth 2, at least 2t2 nodes at depth 3, and so on, until at depth h there are at least 2th-1 nodes. Thus, the number n of keys satisfies the inequality n ≥ 1 + (t-1) ∑i=1,J,h 2ti-1 = = 2th-1 or h ≤ logt [½(n+1)]. █ Notice the advantage of B-trees over red-black trees: a factor of log2t. 17 19.1 Definition of B-trees – the height of a B-tree Figure 18.4 18 19.2 Basic operations on B-trees We now present the basic operations B-TREE-SEARCH, B-TREE-CREATE, and BTREE-INSERT. We adopt two conventions: • The root of the B-tree is always in the main memory, so there is no need to read it from the disk. However, if it is changed, it may be written to the disk. • Any nodes that are passed as parameters must already have been read from the disk. All the procedures we will meet here are “one pass” algorithms – going from the root downwards, never having to back up. 19 19.2 Basic operations on B-trees – Searching a B-tree Searching a B-tree is like searching a binary search tree, except that instead of making a binary branching decision at each node, we make a multiway decision according to the number of the node’s children. I.e., in the node x, we have to make an (n[x] +1)-way branching decision. The code is a straightforward generalization of the searching in the binary tree. The input is a pointer x to the root of the tree, and a key k to be searched for. The output is either NIL (if k is not in the tree), or an ordered pair (y,i), y being a node, and i being an index so that keyi [y] = k. 20 19.2 Basic operations on B-trees – Searching a B-tree B-TREE-SEARCH(x,k) i←1 while i ≤ n[x] and k > keyi [x] do i ← i + 1 if i ≤ n[x] and k = keyi [x] then return (x,i) if leaf[x] then return NIL else DISK-READ(ci [x]) return B-TREE-SEARCH(ci [x],k) 21 19.2 Basic operations on B-trees – Searching a B-tree The running time of B-TREE-SEARCH: The nodes encountered throughout the recursion form a path whose length is at most h, the height of the tree. In each node, the search is linear (at most). Summing it up – the total CPU time is O(t h) = O( t logt n ). 22 19.2 Basic operations on B-trees – Creating an empty B-tree To build a B-tree T, we first use B-TREE-CREATE to create an empty root node, and then call B-TREE-INSERT to add new keys. Both of these procedures use an auxiliary procedure ALLOCATE-NODE, which allocates one disk page to be used as a new node in O(1) time. We may assume that a node created this way requires no DISK-READ, since there is no useful information for that node at this time. B-TREE-CREATE(T) x ← ALLOCATE-NODE() leaf [x] ← TRUE n[x] ← 0 DISK-WRITE(x) root [T] ← x This procedure requires O(1) disk operations and O(1) CPU time. 23 19.2 Basic operations on B-trees – Splitting a node in a B-tree Inserting a key into a B-tree T is much more complicated then into a binary search tree. A fundamental operation used during insertion is the splitting of a full node y (having 2t-1 keys) around its median key keyt[y] into two nodes having t-1 keys each. The median key moves up into y’s parent – which must be non-full prior to the splitting of y – to identify the dividing point between the two new trees. If y has no parent, then the tree grows in height by one. Splitting is the only way by which the tree grows. 24 19.2 Basic operations on B-trees – Splitting a node in a B-tree The procedure B-TREE-SPLIT-CHILD takes as input a non-full internal node x (assumed to be in the main memory), an index i, and a node y such that y=ci[x] is a full child of x. The procedure then splits this child in two and adjusts x so it now has an additional child. 25 19.2 Basic operations on B-trees – Splitting a node in a B-tree B-TREE-SPLIT-CHILD(x,i,y) z ← ALLOCATE-NODE() leaf [z] ← leaf [y] n[z] ← t-1 for j ← 1 to t-1 z becomes the new node that adopts the t-1 largest keys of the original node y that splits. do keyj [z] ← keyj+t [y] if not leaf [y] then for j ← 1 to t do cj [z] ← cj+t [y] n[y] ← t-1 26 19.2 Basic operations on B-trees – Splitting a node in a B-tree for j ← n[x] +1 downto i+1 do cj+1 [x] ← cj [x] ci+1 [x] ←z z becomes a new son of x. for j ← n[x] downto i do keyj+1 [x] ← keyj [x] keyi [x] ← keyt [y] n[x] ← n[x] +1 DISK-WRITE(y) DISK-WRITE(z) DISK-WRITE(x) 27 19.2 Basic operations on B-trees – Splitting a node in a B-tree The procedure was a simple “cut and paste” one – nothing special about it. The CPU time is θ(t). 28 19.2 Basic operations on B-trees – Inserting a key into a B-tree Inserting a key k into a B-tree T of height h is done in a single pass down the tree, requiring O(h) disk accesses. The CPU time requires is O(t h) = O(t logt n). The B-TREE-INSERT procedure uses B-TREE-SPLIT-CHILD to guarantee that the recursion never descends to a full node. 29 19.2 Basic operations on B-trees – Inserting a key into a B-tree B-TREE-INSERT(T,k) r ← root [T] if n[r] = 2t-1 then s ← ALLOCATE-NODE() root [T] ← s These lines handle the case when leaf [s] ← FALSE the root is a full node: it splits and a n[s] ← 0 new node (s) becomes the root. c1[s] ← r The tree’s height increases by 1. B-TREE-SPLIT-CHILD(s,1,r) B-TREE-INSERT-NONFULL(s,k) else B-TREE-INSERT-NONFULL(r,k) 30 19.2 Basic operations on B-trees – Inserting a key into a B-tree 31 19.2 Basic operations on B-trees – Inserting a key into a B-tree The B-tree increases in height from its root! The procedure finishes by calling B-TREE-INSERT-NONFULL to perform the insertion of the key k in the tree rooted at the non-full root node. B-TREE-INSERTNONFULL recurses down the tree, at all times checking that it does not enter a full node by calling B-TREE-SPLIT-CHILD, whenever necessary. We now present B-TREE-INSERT-NONFULL. 32 19.2 Basic operations on B-trees – Inserting a key into a B-tree B-TREE-INSERT-NONFULL(x,k) i ← n[x] if leaf[x] then while i ≥ 1 and k < keyi [x] do keyi+1[x] ← keyi [x] i← i-1 keyi+1[x] ← k n[x] ← n[x] + 1 Here we handle the case that x is a leaf node. In that case, k is simply inserted into this node. DISK-WRITE(x) 33 19.2 Basic operations on B-trees – Inserting a key into a B-tree DISK-WRITE(x) else while i ≥ 1 and k < keyi [x] do i ← i – 1 i← i+1 Finding the right child into which we continue. DISK-READ(ci [x]) if n[ci [x]] = 2t – 1 Full node? Split! then B-TREE-SPLIT-CHILD(x,i,ci [x]) if k > keyi [x] then i ← i + 1 B-TREE-INSERT-NONFULL(ci [x],k) Find the right (non full) child. Recurse. 34 19.2 Basic operations on B-trees – Inserting a key into a B-tree 35 19.2 Basic operations on B-trees – Inserting a key into a B-tree 36 19.2 Basic operations on B-trees – Inserting a key into a B-tree 37 19.2 Basic operations on B-trees – Inserting a key into a B-tree 38 19.2 Basic operations on B-trees – Inserting a key into a B-tree 39 19.2 Basic operations on B-trees – Deleting a key from a B-tree Deleting a node from a B-tree is different from insertion in the sense that it may be done not just from a leaf. In case we delete a key from an internal node, the children of this node should be rearranged. We should also make sure that we never delete a key from a node (other than the root) containing only t -1 keys, otherwise we violate the rules. The procedure B-TREE-DELETE guarantees that if it is called on a node x, then the number of keys in this node is at least t. This allows us to claim that the procedure (almost) never backs-up. 40 19.2 Basic operations on B-trees – Deleting a key from a B-tree In the following, it should be understood that if it happens that the root node x becomes an internal node with no keys, then x is deleted and x’s only child becomes the new root of the tree, decreasing the height of the tree by 1 (that is unless the tree is empty). The following is the description of the deletion procedure: 41 19.2 Basic operations on B-trees – Deleting a key from a B-tree 1. If the key k is in node x and x is a leaf, delete k from x. 2. If the key k is in node x and x is an internal node, do the following: a. If the child y that precedes k in x has at least t keys, then find the predecessor k’ of k in the subtree rooted at y. Recursively delete k’, and replace k by k’ in x. (Finding k’ and deleting it can be performed in a single downward pass). b. Symmetrically, if the child z that follows k in x has at least t keys, then continue with k’s successor. 42 19.2 Basic operations on B-trees – Deleting a key from a B-tree c. Otherwise, if both y and z have (t–1) keys, merge k and z into y, so that x loses both k and the pointer to z, and y now contains (2t–1) keys. Then free z and recursively delete k from y. 3. If the key k is not present in internal node x, determine the root ci[x] of the appropriate subtree that must contain k, if k is in the tree at all. If ci[x] has only (t-1) keys, execute step 3a or 3b as necessary to guarantee that we descend into a node containing at least t keys. Then finish by recursing on the appropriate child of x. 43 19.2 Basic operations on B-trees – Deleting a key from a B-tree a. If ci [x] has only (t-1) keys but has an immediate sibling with at least t keys, give ci [x] an extra key by moving a key downwards from x, moving a key from ci[x]’s immediate sibling up into x, and updating the pointer. b. If all three nodes – ci[x] and its two immediate siblings have (t-1) keys, merge ci[x] with one of them (and don’t forget to bring the median from x). 44 19.2 Basic operations on B-trees – Deleting a key from a B-tree Note that in cases 2a and 2b, there is a small difference from the other cases: we do go back up the tree to place the successor / predecessor in place of the deleted key. However, we are still within the desired O(h) disk operations. Once again, the CPU time is O(th) = O(t logtn). The following figures are examples of deletions. 45 19.2 Basic operations on B-trees – Deleting a key from a B-tree 46 19.2 Basic operations on B-trees – Deleting a key from a B-tree 47 19.2 Basic operations on B-trees – Deleting a key from a B-tree 48 19.2 Basic operations on B-trees – Deleting a key from a B-tree 49 19.2 Basic operations on B-trees – Deleting a key from a B-tree 50 19.2 Basic operations on B-trees – Deleting a key from a B-tree 51 Animations and Presentations http://www.bluerwhite.org/btree/ Connection with Red-Black trees: http://www.eli.sdsu.edu/courses/fall96/cs660/notes/redBlack/redBlack.html http://cis.stvincent.edu/carlsond/swdesign/btree/btree.html 52