DATA STRUCTURES II — UNIT 4: Heaps
Suyash Bhardwaj, Faculty of Engineering and Technology, Gurukul Kangri Vishwavidyalaya, Haridwar

Content
• Mergeable heaps: mergeable heap operations; binomial trees; implementing binomial heaps and their operations; 2-3-4 trees; structure and potential function of the Fibonacci heap; implementing the Fibonacci heap.

Definition
• In computer science, a mergeable heap is an abstract data type: a heap that also supports a merge operation.

Summary of Heap ADT Analysis
• Consider a heap of N nodes.
• Space needed: O(N)
  – For an array implementation, actually O(MaxSize), where MaxSize is the size of the array.
  – Pointer-based implementation: pointers for children and parent; total space = 3N + 1 (3 pointers per node + 1 for the size).
• FindMin: O(1) time; DeleteMin and Insert: O(log N) time.
• BuildHeap from N inputs: what is the running time?
  – N Insert operations: O(N log N).
  – O(N): treat the input array as a heap and fix it using percolate down.
• DecreaseKey/IncreaseKey: percolate; running time O(log N).
  – E.g., schedulers in an OS often decrease the priority of CPU-hogging jobs.

Operations
A mergeable heap supports the following operations:
• Make-Heap(): create an empty heap.
• Insert(H, x): insert an element x into the heap H.
• Min(H): return the minimum element, or Nil if no such element exists.
• Extract-Min(H): extract and return the minimum element, or Nil if no such element exists.
• Merge(H1, H2): combine the elements of H1 and H2.

General Implementation
• It is straightforward to implement a mergeable heap given a simple heap:

  Merge(H1, H2):
    1. x ← Extract-Min(H2)
    2. while x ≠ Nil
       a. Insert(H1, x)
       b. x ← Extract-Min(H2)

• This can, however, be wasteful, as each Extract-Min(H) and Insert(H, x) typically has to maintain the heap property.

More Efficient Implementations
• Binary heaps
• Binomial heaps
• Fibonacci heaps
• Pairing heaps

Binary Heap Properties
1. Structure property
2.
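The generic Merge loop above can be sketched directly on top of Python's `heapq`, which plays the role of the "simple heap". This is a minimal illustration of why the naive approach is wasteful: every push and pop re-establishes the heap property, so merging costs O(n log n) overall. The helper names are mine, not from the slides.

```python
import heapq

def extract_min(h):
    """Pop and return the minimum element, or None if the heap is empty."""
    return heapq.heappop(h) if h else None

def merge(h1, h2):
    """Naive Merge(H1, H2): repeatedly Extract-Min from H2, Insert into H1."""
    x = extract_min(h2)
    while x is not None:
        heapq.heappush(h1, x)   # each push is O(log n) on its own
        x = extract_min(h2)
    return h1

h1, h2 = [1, 3, 5], [2, 4]
heapq.heapify(h1)
heapq.heapify(h2)
merged = merge(h1, h2)
print(sorted(merged))   # [1, 2, 3, 4, 5]
```

The specialized structures that follow (binomial, Fibonacci, pairing heaps) exist precisely to beat this bound on Merge.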
Ordering property

Complete Binary Trees
• A perfect binary tree is a binary tree with all leaf nodes at the same depth; all internal nodes have 2 children.
• A perfect tree of height h has 2^(h+1) − 1 nodes, 2^h − 1 non-leaves, and 2^h leaves.

Heap Structure Property
• A binary heap is a complete binary tree: a binary tree that is completely filled, with the possible exception of the bottom level, which is filled left to right.

Representing Complete Binary Trees in an Array
• Number the nodes level by level, left to right, with the root at index 1. From node i:
  – left child: 2i
  – right child: 2i + 1
  – parent: ⌊i/2⌋
• Implicit (array) implementation: for nodes A..L, the array holds A B C D E F G H I J K L at indices 1..12 (index 0 unused).

Heap Order Property
• For every non-root node X, the value in the parent of X is less than (or equal to) the value in X.
(Figure: two trees over the same keys — one satisfying heap order, one that is not a heap.)

Heap Operations
• findMin: return the root.
• insert(val): percolate up.
• deleteMin: percolate down.

Heap – Insert(val)
Basic idea:
1. Put val at the “next” leaf position.
2. Percolate up by repeatedly exchanging the node with its parent until no longer needed.
(Figure: inserting 15 and percolating it up past 60 and 20.)

Heap – DeleteMin
Basic idea:
1. Remove the root (that is always the min!).
2. Put the “last” leaf node at the root.
3. Find the smallest child of the node.
4. Swap the node with its smallest child if needed.
5. Repeat steps 3 & 4 until no swaps are needed.
(Figure: deleting 10 and percolating the last leaf, 65, down.)

Building a Heap
• Adding the items one at a time is O(n log n) in the worst case.

Working on Heaps
• What are the two properties of a heap?
  – Structure property
  – Order property
• How do we work on heaps?
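The array-index arithmetic above is the whole trick behind the implicit representation; a tiny sketch (my own, using the slides' A..L example with a 1-indexed array) makes it concrete:

```python
def left(i):   return 2 * i          # 1-indexed array
def right(i):  return 2 * i + 1
def parent(i): return i // 2         # floor division = ⌊i/2⌋

# nodes A..L stored level by level; index 0 is unused padding
tree = [None, 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L']
print(tree[left(1)], tree[right(1)])   # B C  (children of the root A)
print(tree[parent(12)])                # F    (parent of L)
```

No pointers are stored at all: navigation is pure arithmetic, which is why the "Facts about Heaps" slide later notes that finding a child or parent is just a multiply or divide by two.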
  – Fix the structure
  – Fix the order

BuildHeap: Floyd’s Method
• Add the elements arbitrarily to form a complete tree; pretend it’s a heap and fix the heap-order property!
• Example input: 12 5 11 3 10 6 9 4 8 1 7 2.

Buildheap pseudocode

  private void buildHeap() {
      for ( int i = currentSize / 2; i > 0; i-- )
          percolateDown( i );
  }

(Figures: the slides trace percolateDown on the example tree, working upward from the last internal node; after every subtree is fixed, the minimum, 1, sits at the root of the finished heap.)

Facts about Heaps
Observations:
• Finding a child/parent index is a multiply/divide by two.
• Operations jump widely through the heap.
• Each percolate step looks at only two new nodes.
• Inserts are at least as common as deleteMins.
Realities:
• Division/multiplication by powers of two are equally fast.
• Looking at only two new pieces of data: bad for cache!
• With huge data sets, disk accesses dominate.

Operation: Merge
Given two heaps, merge them into one heap.
• First attempt: insert each element of the smaller heap into the larger.
• Second attempt: concatenate the binary heaps’ arrays and run buildHeap.

Merge two heaps (basic idea)
• Put the smaller root as the new root.
• Hang its left subtree on the left.
• Recursively merge its right subtree and the other tree.

Leftist Heaps
Idea: focus all heap maintenance work in one small part of the heap.
Leftist heaps:
1. Most nodes are on the left.
2.
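Floyd's method can be sketched in Python on the slides' example array; this is my own rendering of the Java pseudocode above (1-indexed, min-heap), not the course's reference implementation:

```python
def percolate_down(a, i, size):
    """Sift a[i] down until min-heap order holds below it (1-indexed array)."""
    val = a[i]
    while 2 * i <= size:
        child = 2 * i
        if child < size and a[child + 1] < a[child]:
            child += 1                 # pick the smaller of the two children
        if a[child] < val:
            a[i] = a[child]            # pull the smaller child up
            i = child
        else:
            break
    a[i] = val

def build_heap(a, size):
    """Floyd's method: fix heap order bottom-up from the last internal node."""
    for i in range(size // 2, 0, -1):
        percolate_down(a, i, size)

a = [None, 12, 5, 11, 3, 10, 6, 9, 4, 8, 1, 7, 2]   # the slide's example
build_heap(a, 12)
print(a[1])   # 1 — the minimum has reached the root
```

Because most nodes sit near the bottom and percolate only a short distance, the total work is O(n) rather than the O(n log n) of n repeated inserts.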
All the merging work is done on the right.

Leftist Heap Properties
• Heap-order property
  – The parent’s priority value is ≤ its children’s priority values.
  – Result: the minimum element is at the root.
• Leftist property
  – For every node x, npl(left(x)) ≥ npl(right(x)), where npl is the null-path length.
  – Result: the tree is at least as “heavy” on the left as on the right.
(Figure: “Are these leftist?” — candidate trees annotated with npl values. Every subtree of a leftist tree is leftist!)

Merging Two Leftist Heaps
• merge(T1, T2) returns one leftist heap containing all elements of the two (distinct) leftist heaps T1 and T2.
• If a < b (a = root of T1, b = root of T2), keep a as the root and recursively merge T1’s right subtree R1 with T2.

Merge Continued
• Let R′ = Merge(R1, T2). If npl(R′) > npl(L1), swap the left and right children of a.
• Running time: O(log n).

Operations on Leftist Heaps
• merge, with two trees of total size n: O(log n).
• insert, with heap size n: O(log n)
  – pretend the node is a size-1 leftist heap;
  – insert by merging the original heap with the one-node heap.
• deleteMin, with heap size n: O(log n)
  – remove and return the root;
  – merge the left and right subtrees.

Leftist Merge Example
(Figures: merging the leftist heaps rooted at 5 and 3 step by step — descending the right spines, then sewing the results back up and swapping children wherever the leftist property is violated.) Done?
(Figure: the finished merged leftist heap, rooted at 3.)

Skew Heaps
Problems with leftist heaps:
• extra storage for npl;
• extra complexity/logic to maintain and check npl;
• the right side is “often” heavy and requires a switch.
Solution: skew heaps
• a “blindly” adjusting version of leftist heaps;
• merge always switches children when fixing the right path;
• amortized time for merge, insert, deleteMin: O(log n);
• however, worst-case time for all three: O(n).

Merging Two Skew Heaps
• As for leftist heaps, merge down the right paths, keeping the smaller root on top — only one step per iteration, with children always switched.
(Figure: merging the skew heaps rooted at 3 and 5.)

Runtime Analysis: Worst-case and Amortized
• No worst-case guarantee on right-path length! All operations rely on merge, so the worst-case complexity of all ops is O(n).
• Probably won’t get to amortized analysis in this course, but see Chapter 11 if curious.
• Result: M merges take time O(M log n), so the amortized complexity of all ops is O(log n).

Comparing Heaps
• Binary heaps • d-heaps • Leftist heaps • Skew heaps
Still room for improvement! (Where?)

Binomial Queues
Yet another data structure: binomial queues.
• Structural property: a forest of binomial trees, with at most one tree of any height. (What’s a forest? What’s a binomial tree?)
• Order property: each binomial tree has the heap-order property.

The Binomial Tree, Bh
• Bh has height h and exactly 2^h nodes.
• Bh is formed by making Bh−1 a child of another Bh−1.
• The root has exactly h children.
• The number of nodes at depth d is the binomial coefficient C(h, d) — hence the name; we will not use this last property.

Binomial Queue with n elements
• A binomial queue with n elements has a unique structural representation in terms of binomial trees!
• Write n in binary: n = 13 (base 10) = 1101 (base 2), so the forest is one B3, one B2, no B1, and one B0.

Properties of Binomial Queue
• At most one binomial tree of any height.
• n nodes ⇒ the binary representation of n has ⌊lg n⌋ + 1 bits, so the
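The "blind" swap is the entire difference from leftist heaps: no npl field, no check, just always exchange children on the merge path. A minimal sketch (names mine):

```python
class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def skew_merge(a, b):
    """Skew-heap merge: always swap children on the merge path (no npl)."""
    if a is None: return b
    if b is None: return a
    if b.key < a.key:
        a, b = b, a                     # keep the smaller root on top
    # new left = merged right subtree, new right = old left: the blind swap
    a.left, a.right = skew_merge(a.right, b), a.left
    return a

def delete_min(h):
    return h.key, skew_merge(h.left, h.right)

h = None
for k in [3, 5, 7, 10, 12, 14, 8]:
    h = skew_merge(h, Node(k))
out = []
while h is not None:
    k, h = delete_min(h)
    out.append(k)
print(out)   # keys come out in sorted order
```

Compared with the leftist version, the only change is that the swap is unconditional — which is exactly why individual operations can degrade to O(n) while remaining O(log n) amortized.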
deepest tree has height ⌊lg n⌋ and the number of trees is at most ⌊lg n⌋ + 1.
• Define height(forest F) = max over trees T in F of height(T); a binomial queue with n nodes has height Θ(log n).

Operations on Binomial Queue
• Will again define merge as the base operation: insert, deleteMin, and buildBinomialQ will use merge.
• Can we do increaseKey efficiently? decreaseKey? What about findMin?

Merging Two Binomial Queues
Essentially like adding two binary numbers!
1. Combine the two forests.
2. For k from 0 to maxheight:
   a. m ← total number of Bk’s in the two binomial queues (including any carry);
   b. if m = 0: continue;                                     (0 + 0 = 0)
   c. if m = 1: continue;                                     (1 + 0 = 1)
   d. if m = 2: combine the two Bk’s to form a Bk+1;          (1 + 1 = 0, carry)
   e. if m = 3: retain one Bk and combine the other two
      to form a Bk+1.                                         (1 + 1 + carry = 1, carry)
Claim: when this process ends, the forest has at most one tree of any height.
(Figures: a worked example merging two binomial queues tree by tree.)

Complexity of Merge
• Constant time for each height; the max number of heights is log n ⇒ worst-case running time Θ(log n).

Insert in a Binomial Queue
• Insert(x): similar to leftist or skew heaps — merge with a one-node queue.
• Worst-case complexity: same as merge, O(log n).
• Average-case complexity: O(1). Why?? Hint: think of adding 1 to 1101.

deleteMin in Binomial Queue
• Similar to leftist and skew heaps: find and delete the smallest root, then merge the remaining queue (without the deleted tree) with the queue BQ′ formed from the children of the deleted root.
• Runtime: O(log n).

Binomial Heaps
Data structures: mergeable heaps (CS 473, Lecture X)
• MAKE-HEAP(): creates and returns a new heap with no elements.
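The correspondence between a binomial queue's shape and the binary representation of n can be checked in two lines; this small demonstration (function name mine) mirrors the n = 13 example above:

```python
def forest_shape(n):
    """Which binomial trees B_k make up a binomial queue of n elements?
    Exactly the 1-bits of n written in binary (B_k holds 2**k nodes)."""
    return [k for k in range(n.bit_length()) if (n >> k) & 1]

print(forest_shape(13))   # [0, 2, 3] -> B0, B2, B3, since 13 = 1101 in binary
```

Merging two queues is then literally binary addition on these bit sets, with a tree "carry" whenever two trees of the same height meet — which is why insert (adding 1) is O(1) on average but O(log n) when a long carry chain ripples through.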
• INSERT(H, x): inserts a node x, whose key field has already been filled in, into heap H.
• MINIMUM(H): returns a pointer to the node in heap H whose key is minimum.
• EXTRACT-MIN(H): deletes the node from heap H whose key is minimum; returns a pointer to the node.
• DECREASE-KEY(H, x, k): assigns to node x within heap H the new key value k, where k is smaller than its current key value.
• DELETE(H, x): deletes node x from heap H.
• UNION(H1, H2): creates and returns a new heap that contains all nodes of heaps H1 and H2; heaps H1 and H2 are destroyed by this operation.

Binomial Trees
• A binomial heap is a collection of binomial trees.
• The binomial tree Bk is an ordered tree defined recursively:
  – B0 consists of a single node;
  – Bk consists of two binomial trees Bk−1 linked together: the root of one is the leftmost child of the root of the other.
(Figures: B0 through B4, and the two recursive decompositions of Bk — as two copies of Bk−1, and as a root with children Bk−1, Bk−2, …, B1, B0.)

Properties of Binomial Trees
LEMMA: For the binomial tree Bk:
1. there are 2^k nodes;
2. the height of the tree is k;
3. there are exactly C(k, i) nodes at depth i, for i = 0, 1, …, k; and
4. the root has degree k, which is greater than the degree of any other node; if the children of the root are numbered from left to right as k−1, k−2, …, 0, then child i is the root of a subtree Bi.

PROOF: By induction on k. Each property holds for the basis B0.
INDUCTIVE STEP: assume that the lemma holds for Bk−1.
1. Bk consists of two copies of Bk−1, so |Bk| = |Bk−1| + |Bk−1| = 2^(k−1) + 2^(k−1) = 2^k.
2. hk−1 = height(Bk−1) = k − 1 by induction, so hk = hk−1 + 1 = (k − 1) + 1 = k.
3. Let D(k, i) denote the number of nodes at depth i of Bk. Since Bk is two copies of Bk−1, one hung from the other’s root,
     D(k, i) = D(k−1, i) + D(k−1, i−1) = C(k−1, i) + C(k−1, i−1) = C(k, i)
   by induction and Pascal’s rule.
4. The only node with greater degree in Bk than in Bk−1 is the root: the root of Bk has one more child than the root of Bk−1, so
     degree of root of Bk = degree of root of Bk−1 + 1 = (k − 1) + 1 = k.
(Figure: Bk drawn as a root with children Bk−1, Bk−2, …, B1, B0.)

Properties of Binomial Trees (Cont.)
• COROLLARY: the maximum degree of any node in an n-node binomial tree is lg n.
• The term BINOMIAL TREE comes from the 3rd property: there are C(k, i) nodes at depth i of Bk, and the terms C(k, i) are the binomial coefficients.

Binomial Heaps
A BINOMIAL HEAP H is a set of BINOMIAL TREES that satisfies the following binomial-heap properties:
1. Each binomial tree in H is HEAP-ORDERED:
   • the key of a node is ≥ the key of its parent;
   • the root of each binomial tree in H contains the smallest key in that tree.
2. There is at most one binomial tree in H whose root has a given degree, so an n-node binomial heap H consists of at most ⌊lg n⌋ + 1 binomial trees.
• The binary representation of n has ⌊lg n⌋ + 1 bits:
    n = ⟨b⌊lg n⌋, b⌊lg n⌋−1, …, b1, b0⟩,  i.e.  n = Σ (i = 0 to ⌊lg n⌋) bi 2^i.
  By property 1 of the lemma (Bi contains 2^i nodes), Bi appears in H iff bit bi = 1.

Example: a binomial heap with n = 13 nodes. 13 = ⟨1, 1, 0, 1⟩ in binary, so H consists of B0, B2, and B3.
(Figure: head[H] points to a root list containing a B0, a B2, and a B3, with roots 10, 1, and 6.)

Representation of Binomial Heaps
• Each binomial tree within a binomial heap is stored in the left-child, right-sibling representation.
• Each node x contains POINTERS:
  – p[x], to its parent;
  – child[x], to its leftmost child;
  – sibling[x], to the sibling immediately to its right.
• Each node x also contains the field degree[x], which denotes the number of children of x.
(Figure: the root list is a linked list of tree roots; each node carries parent, key, degree, child, and sibling fields.)

• Let x be a node with sibling[x] ≠ NIL:
  – degree[sibling[x]] = degree[x] − 1 if x is NOT a root;
  – degree[sibling[x]] > degree[x] if x is a root.

Operations on Binomial Heaps
Creating a new binomial heap:

  MAKE-BINOMIAL-HEAP()
    allocate H
    head[H] ← NIL
    return H
  end

RUNNING TIME = Θ(1).

  BINOMIAL-HEAP-MINIMUM(H)
    y ← NIL
    x ← head[H]
    min ← ∞
    while x ≠ NIL do
        if key[x] < min then
            min ← key[x]
            y ← x
        endif
        x ← sibling[x]
    endwhile
    return y
  end

• Since a binomial heap is HEAP-ORDERED, the minimum key must reside in a ROOT NODE; the procedure above checks all roots.
• NUMBER OF ROOTS ≤ ⌊lg n⌋ + 1, so RUNNING TIME = O(lg n).

Uniting Two Binomial Heaps
• The BINOMIAL-HEAP-UNION procedure repeatedly links binomial trees whose roots have the same degree.
• The BINOMIAL-LINK procedure links the Bk−1 tree rooted at node y to the Bk−1 tree rooted at node z: it makes z the parent of y, i.e.
node z becomes the root of a Bk tree.

  BINOMIAL-LINK(y, z)
    p[y] ← z
    sibling[y] ← child[z]
    child[z] ← y
    degree[z] ← degree[z] + 1
  end

(Figure: y and its subtree are hung as the leftmost child of z, and degree[z] is incremented.)

Uniting Two Binomial Heaps: Cases
We maintain 3 pointers into the root list:
• x: points to the root currently being examined;
• prev-x: points to the root PRECEDING x on the root list (sibling[prev-x] = x);
• next-x: points to the root FOLLOWING x on the root list (sibling[x] = next-x).

• Initially, there are at most two roots of the same degree.
• BINOMIAL-HEAP-MERGE guarantees that if two roots in H have the same degree, they are adjacent in the root list.
• During the execution of UNION, there may be three roots of the same degree appearing on the root list at some time.

CASE 1: occurs when degree[x] ≠ degree[next-x]. Simply march the pointers down the list.

CASE 2: occurs when x is the first of 3 roots of equal degree:
degree[x] = degree[next-x] = degree[sibling[next-x]]. March the pointers down the list; the remaining pair is handled by case 3 or 4 on the next iteration.

CASES 3 & 4: occur when x is the first of 2 roots of equal degree:
degree[x] = degree[next-x] ≠ degree[sibling[next-x]].
• They occur on the iteration after any case, and always occur immediately following CASE 2.
• The two cases are distinguished by whether x or next-x has the smaller key: the root with the smaller key becomes the root of the linked tree (case 3: key[x] ≤ key[next-x]; case 4: otherwise). The running
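BINOMIAL-LINK is small enough to render directly in the left-child, right-sibling representation described above; this sketch (class name mine) shows two B0 roots becoming a B1:

```python
class BNode:
    def __init__(self, key):
        self.key = key
        self.p = self.child = self.sibling = None
        self.degree = 0

def binomial_link(y, z):
    """Make the B_{k-1} rooted at y the leftmost child of z, yielding a B_k."""
    y.p = z
    y.sibling = z.child   # old leftmost child becomes y's right sibling
    z.child = y
    z.degree += 1

a, b = BNode(5), BNode(2)
binomial_link(a, b)       # 2 <= 5, so b becomes the parent, keeping heap order
print(b.degree)           # 1
```

Note that the caller is responsible for choosing the smaller key as z — that choice is exactly what distinguishes cases 3 and 4 in UNION.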
time of the BINOMIAL-HEAP-UNION operation is O(lg n):
• let H1 and H2 contain n1 and n2 nodes respectively, where n = n1 + n2;
• then H1 contains at most ⌊lg n1⌋ + 1 roots and H2 contains at most ⌊lg n2⌋ + 1 roots;
• so H contains at most ⌊lg n1⌋ + ⌊lg n2⌋ + 2 ≤ 2⌊lg n⌋ + 2 = O(lg n) roots immediately after BINOMIAL-HEAP-MERGE;
• therefore BINOMIAL-HEAP-MERGE runs in O(lg n) time, and BINOMIAL-HEAP-UNION runs in O(lg n) time.

Binomial-Heap-Union Procedure
The BINOMIAL-HEAP-MERGE procedure merges the root lists of H1 and H2 into a single linked list, sorted by degree into monotonically increasing order.

  BINOMIAL-HEAP-UNION(H1, H2)
    H ← MAKE-BINOMIAL-HEAP()
    head[H] ← BINOMIAL-HEAP-MERGE(H1, H2)
    free the objects H1 and H2, but not the lists they point to
    prev-x ← NIL
    x ← head[H]
    next-x ← sibling[x]
    while next-x ≠ NIL do
        if degree[x] ≠ degree[next-x]
           or (sibling[next-x] ≠ NIL and degree[sibling[next-x]] = degree[x]) then
            prev-x ← x                            (cases 1 and 2)
            x ← next-x                            (cases 1 and 2)
        elseif key[x] ≤ key[next-x] then
            sibling[x] ← sibling[next-x]          (case 3)
            BINOMIAL-LINK(next-x, x)              (case 3)
        else
            if prev-x = NIL then
                head[H] ← next-x                  (case 4)
            else
                sibling[prev-x] ← next-x          (case 4)
            endif
            BINOMIAL-LINK(x, next-x)              (case 4)
            x ← next-x                            (case 4)
        endif
        next-x ← sibling[x]
    endwhile
    return H
  end

Uniting Two Binomial Heaps vs Adding Two Binary Numbers
• Example: n1 = 39, so H1 = ⟨1 0 0 1 1 1⟩ = {B0, B1, B2, B5}; n2 = 54, so H2 = ⟨1 1 0 1 1 0⟩ = {B1, B2, B4, B5}.
• After MERGE, UNION walks the combined root list exactly like a ripple-carry addition: equal-degree pairs are LINKed (a carry), unequal degrees just march (copy the bit).
(Figures: the slides trace the cases and carries degree by degree; the result contains the trees given by the binary sum 39 + 54 = 93 = ⟨1 0 1 1 1 0 1⟩.)

Inserting a Node

  BINOMIAL-HEAP-INSERT(H, x)
    H′ ← MAKE-BINOMIAL-HEAP()
    p[x] ← NIL
    child[x] ← NIL
    sibling[x] ← NIL
    degree[x] ← 0
    head[H′] ← x
    H ← BINOMIAL-HEAP-UNION(H, H′)
  end

RUNNING TIME = O(lg n).

Relationship Between Insertion and Incrementing a Binary Number
• Example: n = 51, so H = ⟨1 1 0 0 1 1⟩ = {B0, B1, B4, B5}; inserting a node behaves like adding 1 to this binary number, with LINK operations playing the role of carries.

A Direct Implementation that does not Call Binomial-Heap-Union
• More efficient: case 2 never occurs, and the while loop can terminate as soon as case 1 is encountered.

Extracting the Node with the Minimum Key

  BINOMIAL-HEAP-EXTRACT-MIN(H)
    (1) find the root x with the minimum key in the root list of H,
        and remove x from the root list of H
    (2) H′ ← MAKE-BINOMIAL-HEAP()
    (3) reverse the order of the linked list of x’s children,
        and set head[H′] ← head of the resulting list
    (4) H ← BINOMIAL-HEAP-UNION(H, H′)
    return x
  end
Extracting the Node with the Minimum Key: Example
• Consider H with n = 27 = ⟨1 1 0 1 1⟩, so H = {B0, B1, B3, B4}; assume x = the root of B3 is the root with minimum key.
• Removing x leaves H = {B0, B1, B4}; reversing x’s child list gives H′ = {B0, B1, B2}.
• Unite the binomial heaps H = {B0, B1, B4} and H′ = {B0, B1, B2}.
• Running time, if H has n nodes: each of lines 1–4 takes O(lg n) time, so it is O(lg n).

Decreasing a Key

  BINOMIAL-HEAP-DECREASE-KEY(H, x, k)
    key[x] ← k
    y ← x
    z ← p[y]
    while z ≠ NIL and key[y] < key[z] do
        exchange key[y] ↔ key[z]
        exchange the satellite fields of y and z
        y ← z
        z ← p[y]
    endwhile
  end

• Similar to DECREASE-KEY in a BINARY HEAP: BUBBLE UP the key in the binomial tree it resides in.
• RUNNING TIME: O(lg n).

Deleting a Key

  BINOMIAL-HEAP-DELETE(H, x)
    y ← x
    z ← p[y]
    while z ≠ NIL do
        key[y] ← key[z]
        satellite field of y ← satellite field of z
        y ← z;  z ← p[y]
    endwhile
    H′ ← MAKE-BINOMIAL-HEAP()
    remove root y from the root list of H
    reverse the order of the linked list of y’s children,
        and set head[H′] ← head of the resulting list
    H ← BINOMIAL-HEAP-UNION(H, H′)
  end

RUNNING TIME = O(lg n).

Fibonacci Heaps
• Binomial heaps support the mergeable-heap operations (INSERT, MINIMUM, EXTRACT-MIN, UNION, plus DECREASE-KEY and DELETE) in O(lg n) worst-case time.
• Fibonacci heaps support the mergeable-heap operations that do not involve deleting an element in O(1) amortized time.
• Fibonacci heaps are especially desirable when the number of EXTRACT-MIN and DELETE operations is small relative to the number of other operations.
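The bubble-up in DECREASE-KEY swaps keys (and satellite data) with the parent rather than relinking nodes; a minimal sketch on a parent-pointer chain (class and names mine) illustrates it:

```python
class Node:
    def __init__(self, key, parent=None):
        self.key, self.p = key, parent

def decrease_key(x, k):
    """Sketch of BINOMIAL-HEAP-DECREASE-KEY: overwrite the key, then bubble
    it up by exchanging keys with the parent while heap order is violated."""
    assert k <= x.key, "new key must not be larger than the current key"
    x.key = k
    y, z = x, x.p
    while z is not None and y.key < z.key:
        y.key, z.key = z.key, y.key   # exchange keys (and satellite fields)
        y, z = z, z.p

# a root-to-leaf path 1 -> 10 -> 25; decreasing 25 to 2 bubbles it past 10
root = Node(1)
mid = Node(10, root)
leaf = Node(25, mid)
decrease_key(leaf, 2)
print(root.key, mid.key, leaf.key)   # 1 2 10
```

Since the path to the root has length at most lg n in a binomial tree, the loop runs O(lg n) times, matching the stated bound.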
• Fibonacci heaps are loosely based on binomial heaps: if neither DECREASE-KEY nor DELETE is ever invoked, each tree in the collection is like a binomial tree.
• Fibonacci heaps differ from binomial heaps, however, in that they have a more relaxed structure, allowing for improved asymptotic time bounds: work that maintains the structure is delayed until it is convenient to perform.
• Like a binomial heap, a Fibonacci heap is a collection of heap-ordered trees; however, the trees are not constrained to be binomial trees.
• Trees within Fibonacci heaps are rooted but unordered.

Structure of Fibonacci Heaps
Each node x contains:
• a pointer p[x] to its parent;
• a pointer child[x] to one of its children.
  – The children of x are linked together in a circular, doubly-linked list, called the child list.
• Each child y in a child list has pointers left[y] and right[y] that point to y’s left and right siblings respectively; if y is an only child, then left[y] = right[y] = y.
• The roots of all the trees are also linked together, using their left and right pointers, into a circular, doubly-linked list called the root list.
(Figure: node layout with fields p, key, degree, mark, left, right, child.)

• Circular, doubly-linked lists have two advantages for use in Fibonacci heaps:
  – we can remove a node in O(1) time;
  – given two such lists, we can concatenate them in O(1) time.
Structure of Fibonacci Heaps (Cont.)
• Two other fields in each node x:
  – degree[x]: the number of children in the child list of x;
  – mark[x]: a boolean-valued field
    • indicates whether node x has lost a child since the last time x was made the child of another node;
    • newly created nodes are unmarked;
    • a node x becomes unmarked whenever it is made the child of another node.
(Figure: a 14-node Fibonacci heap with min[H] pointing to the minimum root, 3; a few internal nodes are marked.)

Concatenation of Two Circular, Doubly-Linked Lists

  CONCATENATE(H1, H2)
    x ← left[min[H1]]
    y ← left[min[H2]]
    right[x] ← min[H2]
    left[min[H2]] ← x
    right[y] ← min[H1]
    left[min[H1]] ← y
  end

Running time is O(1).

Potential Function
• For a given Fibonacci heap H:
  – t(H): the number of trees in the root list of H;
  – m(H): the number of marked nodes in H.
• The potential of Fibonacci heap H is Φ(H) = t(H) + 2 m(H).
• A Fibonacci-heap application begins with an empty heap: the initial potential = 0, and the potential is non-negative at all subsequent times.

Maximum Degree
• We will assume that there is a known upper bound D(n) on the maximum degree of any node in an n-node heap.
• If only the mergeable-heap operations are supported, D(n) ≤ ⌊lg n⌋.
• If the DECREASE-KEY and DELETE operations are supported as well, D(n) = O(lg n).

Mergeable Heap Operations
• MAKE-HEAP, INSERT, MINIMUM, EXTRACT-MIN, UNION.
• If only these operations are to be supported, each Fibonacci heap is a collection of unordered binomial trees.
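The potential function drives every amortized bound in this section, so it is worth computing once by hand. A tiny illustration (function name mine) of Φ(H) = t(H) + 2 m(H), showing why INSERT has amortized cost O(1) + 1:

```python
def potential(t, m):
    """Phi(H) = t(H) + 2*m(H): root-list trees plus twice the marked nodes."""
    return t + 2 * m

# inserting a node adds one tree to the root list and marks nothing,
# so the potential rises by exactly 1
before = potential(t=5, m=3)   # e.g. 5 trees, 3 marked nodes
after = potential(t=6, m=3)
print(after - before)   # 1
```

The coefficient 2 on m(H) is what later pays for cascading cuts: each unmark releases two units of potential, one for the cut itself and one for the new root-list tree.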
Unordered Binomial Trees
• An unordered binomial tree Uk is like a binomial tree, and is defined recursively:
  – U0 consists of a single node;
  – Uk consists of two Uk−1’s, for which the root of one is made into any child of the root of the other.
• The lemma giving the properties of binomial trees holds for unordered binomial trees as well, but with the following variation on property 4.
• Property 4′: for the unordered binomial tree Uk, the root has degree k, which is greater than the degree of any other node, and the children of the root are the roots of subtrees U0, U1, …, Uk−1 in some order.

Delaying the Work
• The key idea in the mergeable-heap operations on Fibonacci heaps is to delay work as long as possible.
• There is a performance trade-off among implementations of the various operations: if the number of trees is small, we can quickly determine the new minimum node during EXTRACT-MIN; however, we pay a price for ensuring that the number of trees is small — it can take up to Ω(lg n) time to insert a node into a binomial heap or to unite two binomial heaps.
• We do not consolidate trees in a Fibonacci heap when we insert a new node or unite two heaps; we delay the consolidation until the EXTRACT-MIN operation, when we really need to find the new minimum node.
Creating a New Fibonacci Heap
• The MAKE-FIB-HEAP procedure allocates and returns the Fibonacci heap object H, with n[H] = 0 and min[H] = NIL; there are no trees in the heap.
• Because t(H) = 0 and m(H) = 0, Φ(H) = 0, so the amortized cost O(1) equals the actual cost.

Inserting a Node

  FIB-HEAP-INSERT(H, x)
    degree[x] ← 0
    p[x] ← NIL
    child[x] ← NIL
    left[x] ← x
    right[x] ← x
    mark[x] ← FALSE
    concatenate the root list containing x with the root list of H
    if key[x] < key[min[H]] then
        min[H] ← x
    endif
    n[H] ← n[H] + 1
  end

(Figure: inserting the key 21 simply splices a one-node tree into the root list.)
• t(H′) = t(H) + 1, so the increase in potential is
    Φ(H′) − Φ(H) = [t(H) + 1 + 2m(H)] − [t(H) + 2m(H)] = 1.
• The actual cost is O(1), so the amortized cost is O(1) + 1 = O(1).

Finding the Minimum Node
• Given by the pointer min[H]: actual cost O(1).
• The potential of H does not change, so the amortized cost = actual cost = O(1).

Uniting Two Fibonacci Heaps

  FIB-HEAP-UNION(H1, H2)
    H ← MAKE-FIB-HEAP()
    if key[min[H1]] ≤ key[min[H2]] then
        min[H] ← min[H1]
    else
        min[H] ← min[H2]
    endif
    concatenate the root lists of H1 and H2
    n[H] ← n[H1] + n[H2]
    free the objects H1 and H2
    return H
  end

• No consolidation of trees; actual cost O(1).
• Change in potential:
    Φ(H) − (Φ(H1) + Φ(H2))
      = (t(H) + 2m(H)) − ((t(H1) + 2m(H1)) + (t(H2) + 2m(H2))) = 0,
  since t(H) = t(H1) + t(H2) and m(H) = m(H1) + m(H2).
• Therefore the amortized cost = actual cost = O(1).

Extracting the Minimum Node
• The most complicated operation: this is where the delayed work of consolidating the trees in the root list occurs.

  FIB-HEAP-EXTRACT-MIN(H)                         (sketch)
    z ← min[H]
    for each child x of z do
        add x to the root list of H
        p[x] ← NIL
    endfor
    remove z from the root list of H
    min[H] ← right[z]
    CONSOLIDATE(H)
  end

Consolidating the Root List
• Repeatedly execute the following steps until every root in the root
list has a distinct degree value:
  (1) find two roots x and y in the root list with the same degree, where key[x] ≤ key[y];
  (2) link y to x: remove y from the root list and make y a child of x.
      This operation is performed by the procedure FIB-HEAP-LINK.
• The procedure CONSOLIDATE uses an auxiliary pointer array A[0 … D(n)], where
  A[i] = y means that y is currently a root with degree[y] = i.

  CONSOLIDATE(H)
    for i ← 0 to D(n[H]) do
        A[i] ← NIL
    endfor
    for each node w in the root list of H do
        x ← w
        d ← degree[x]
        while A[d] ≠ NIL do
            y ← A[d]
            if key[x] > key[y] then
                exchange x ↔ y
            endif
            FIB-HEAP-LINK(H, y, x)
            A[d] ← NIL
            d ← d + 1
        endwhile
        A[d] ← x
    endfor
    min[H] ← NIL
    for i ← 0 to D(n[H]) do
        if A[i] ≠ NIL then
            add A[i] to the root list of H
            if min[H] = NIL or key[A[i]] < key[min[H]] then
                min[H] ← A[i]
            endif
        endif
    endfor
  end

  FIB-HEAP-LINK(H, y, x)
    remove y from the root list of H
    make y a child of x, incrementing degree[x]
    mark[y] ← FALSE
  end

(Figures: the slides trace EXTRACT-MIN on the example heap — the minimum root 3 is removed, its children join the root list, and CONSOLIDATE links equal-degree roots as the pointer w sweeps the root list and the array A[0 … 4] fills in, ending with min[H] pointing
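The heart of CONSOLIDATE — the inner while loop that repeatedly links equal-degree roots, incrementing the degree each time — can be sketched on bare (key, degree) pairs, without the pointer plumbing. This is my own simplification for intuition, not the full procedure:

```python
def consolidate(roots):
    """Sketch of CONSOLIDATE on (key, degree) pairs: repeatedly link two
    roots of equal degree until all root degrees are distinct."""
    A = {}                        # A[d] = the root currently holding degree d
    for key, degree in roots:
        x = (key, degree)
        while x[1] in A:          # collision: another root has this degree
            y = A.pop(x[1])
            smaller = min(x, y)   # the smaller key becomes the linked root
            x = (smaller[0], x[1] + 1)
        A[x[1]] = x
    return sorted(A.values(), key=lambda r: r[1])

# four roots: three of degree 0 and one of degree 1
print(consolidate([(23, 0), (7, 0), (21, 0), (18, 1)]))   # [(21, 0), (7, 2)]
```

Two degree-0 roots (23 and 7) link into a degree-1 tree rooted at 7, which then collides with the degree-1 root 18 and links again — exactly the cascade of carries the pseudocode's while loop performs.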
at the new minimum root, 7.)

Analysis of the FIB-HEAP-EXTRACT-MIN Procedure
• If all trees in the Fibonacci heap are unordered binomial trees before the execution of the EXTRACT-MIN operation, then they are all unordered binomial trees afterward. There are two ways in which trees are changed:
  (1) each child of the extracted root becomes a root, and each such new tree is itself an unordered binomial tree;
  (2) trees are linked by the FIB-HEAP-LINK procedure only if they have the same degree, hence a Uk is linked to a Uk to form a Uk+1.

Actual Cost
• The 1st for loop contributes O(D(n)); the 3rd for loop contributes O(D(n)).
• 2nd for loop: the size of the root list upon calling CONSOLIDATE is at most D(n) + t(H) − 1:
  – D(n): an upper bound on the number of children of the extracted node;
  – t(H) − 1: the original t(H) root-list nodes minus the extracted node.
• Each iteration of the inner while loop links one root to another, reducing the size of the root list by 1; therefore the total work performed in the 2nd for loop is at most proportional to D(n) + t(H).
• Thus the total actual cost is O(D(n) + t(H)).

Amortized Cost
• Potential before: t(H) + 2m(H).
• Potential after: at most (D(n) + 1) + 2m(H), since at most D(n) + 1 roots remain and no nodes become marked.
• Amortized cost = O(D(n) + t(H)) + [(D(n) + 1) + 2m(H)] − [t(H) + 2m(H)]
                 = O(D(n)) + O(t(H)) − t(H) = O(D(n)).
• The cost of performing each link is paid for by the reduction in potential due to the link reducing the number of roots by one.

EXTRACT-MIN Procedure for Fibonacci Heaps (full version)

  FIB-HEAP-EXTRACT-MIN(H)
    z ← min[H]
    if z ≠ NIL then
        for each child x of z do
            add x to the root list of H
            p[x] ← NIL
        endfor
        remove z from the root list of H
        if right[z] = z then
            min[H] ← NIL
        else
            min[H] ← right[z]
            CONSOLIDATE(H)
        endif
        n[H] ← n[H] − 1
    endif
    return z
  end
z end EXTRACT-MIN Procedure for Fibonacci Heaps FIB-HEAP-LINK ( H, y, x ) remove y from the root list of H make y a child of x, incrementing degree [x] mark [ y ] FALSE end EXTRACT-MIN Procedure for Fibonacci Heaps CONSOLIDATE ( H ) for i 0 to D ( n ( H ) ) A[ i ] NIL endfor for each node w in the root list of H do xw d degree [ x ] while A [ d ] ≠ NIL do yA[d] if key [ x ] > key [ y ] then exchange x ↔ y endif FIB-HEAP-LINK ( H , y, x ) A [ d ] NIL dd+1 endwhile A[d]x endfor min [ H ] NIL for i 0 to D ( n [ H ] ) do if A [ i ] ≠ NIL then Add A [ i ] to the root list of H if min [ H ] = NIL or key [ A [ i ] ] < key [ min [ H ] ] then min [ H ] A [ i ] endif endif endfor end Bounding the Maximum Degree For each node x within a fibonacci heap, define size(x): the number of nodes, including itself, in the subtree rooted at x NOTE: x need not to be in the root list, it can be any node at all. We shall show that size(x) is exponential in degree[x] Bounding the Maximum Degree Lemma 1: Let x be a node with degree[x]=k Let y1,y2,....,yk denote the children of x in the order in which they are linked to x, from earliest to the latest, then degree[y1] ≥ 0 and degree[yi] ≥ i-2 for i = 2,3,...,k Bounding the Maximum Degree Proof: degree[y1] ≥ 0 => obvious For i ≥ 2: LIN y1 y2 z1 y3 z2 K yi-1 yi Bounding the Maximum Degree • When yi is linked to x: at least y1,y2,....,yi-1 were all children of x so we must have had degree[x] ≥ i – 1 • NOTE: z node(s) denotes the node(s) that were children of x just before the link of yi that are lost after the link of yi • When yi is linked to x: degree[yi] = degree[x] ≥ i – 1 since then, node yi has lost at most one child, we conclude that degree[yi] ≥ i-2 Bounding the Maximum Degree Fibonacci Numbers Fk 0 if k 0 1 if k 1 F k 1 Fk 2 if k 2 Bounding the Maximum Degree Lemma 2: For all integers k≥0 Fk 2 1 k i 0 Fi Proof: By induction on k When k = 0: o 1 Fi 1 F0 1 0 1 F2 i 0 Bounding the Maximum Degree Inductive Hypothesis: k 1 Fk 1 1 F i i 0 Fk 
2 Fk Fk 1 k 1 Fk (1 Fi ) Fk 2 k Recall that i 0 k 1 Fi where i 0 1 5 1.61803 2 is the golden ratio Bounding the Maximum Degree Lemma 3: Let x be any node in a fibonacci heap with degree[x] = k then size(x) ≥ Fk+2 ≥ Φk, where Φ = (1+√5)/2 Proof: Let Sk denote the lower bound on size(z) over all nodes z such that degree[z] = k Trivially, S0 = 1, S1 = 2, S2 = 3 Note that,Sk ≤ size(z) for any node z with degree[z] = k Bounding the Maximum Degree As in Lemma-1, let y1,y2,....,yk denote the children of node x in the order in which they were linked to x k k i 2 i 2 size ( x) S k 1 1 Si 2 2 Si 2 for x itself S1 for y1 for y2,....,yk due to Lemma-1 Bounding the Maximum Degree Inductive Hypothesis: Si ≥ Fi+2 for i=0,1,...,k-1 k k k i 2 i 2 i 0 S k 2 S i 2 2 Fi 1 Fi Fi 2 Due to Lemma-2, thus we have shown that: size(x) ≥ Sk ≥ Fk+2 ≥ Φk Bounding the Maximum Degree Corollary: Max. degree D(n) in an n node fibheap is O(lg n) Proof: Let x be any node with degree[x] = k in an n-node fib-heap by Lemma-3 we have n ≥ size(x) ≥ Φk taking base-Φ log => k ≤ logΦn therefore D(n) = O(lg n) Decreasing a Key FIB-HEAP-DECREASE-KEY(H, x, k) key[x] ← k y ← p[x] if y ≠ NIL and key[x] < key[y] then CUT(H, x, y) CASCADING-CUT(H,y) endif /* else no structural change */ if key[x] < key[min[H]] then min[H] ← x endif end Decreasing a Key CUT(H, x, y) remove x from the child list of y, decrementing degree y add x to the root list of H p[x] ← NIL mark[x] ← FALSE end Decreasing a Key CASCADING-CUT(H, y) z ← p[y] if z ≠ NIL then if mark[y] = FALSE then mark[y] = TRUE else CUT(H, y, z) CASCADING-CUT(H, z) endif endif end Decreasing a Key min[H] z T 7 18 38 15 CUT(H, 46, 24) y 24 17 23 21 39 41 T T 26 46 30 52 x key[x] is decreased to 15 (no cascading cuts) 35 Decreasing a Key min[H] 15 38 18 7 5 CUT(H, 35, 26*) T 24 T z 17 23 21 y 30 26 x 35 key[x] => 5 will invoke 2 cascading cuts 52 39 41 Decreasing a Key 15 5 7 18 38 CASCADING-CUT T T 24 z 17 23 21 y 26 30 52 39 41 Decreasing a Key z 15 5 7 26 18 38 
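The Fibonacci identities behind the degree bound above can be sanity-checked numerically. The sketch below (hypothetical class name, not part of the lecture code) verifies Lemma 2's identity F(k+2) = 1 + Σ(i=0..k) Fi and the golden-ratio bound F(k+2) ≥ Φ^k for k up to 40:

```java
// Numeric check of Lemma 2 and the golden-ratio bound used in Lemma 3.
public class FibBoundCheck {
    public static void main(String[] args) {
        final double PHI = (1 + Math.sqrt(5)) / 2;          // golden ratio ~1.61803
        long[] f = new long[45];
        f[0] = 0; f[1] = 1;
        for (int k = 2; k < f.length; k++)
            f[k] = f[k - 1] + f[k - 2];                     // Fibonacci recurrence

        for (int k = 0; k <= 40; k++) {
            long sum = 0;                                   // Lemma 2: F(k+2) = 1 + sum F(0..k)
            for (int i = 0; i <= k; i++) sum += f[i];
            if (f[k + 2] != 1 + sum)
                throw new AssertionError("Lemma 2 fails at k=" + k);
            if (f[k + 2] < Math.pow(PHI, k))                // Lemma 3's bound: F(k+2) >= PHI^k
                throw new AssertionError("Bound fails at k=" + k);
        }
        System.out.println("checks passed");
    }
}
```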
(Figure: after the two cascading cuts, nodes 26 and 24 have joined the root list, which is now 15, 5, 26, 24, 7, 18, 38 with min[H] = 5. A second example shows a key decreased to 10, triggering a CUT followed by a chain of CASCADING-CUTs up through the marked ancestors, each cut node joining the root list with its mark bit cleared.)

Amortized Cost of the FIB-HEAP-DECREASE-KEY Procedure
Actual Cost
O(1) time plus the time required to perform the cascading cuts. Suppose that CASCADING-CUT is recursively called c times; each call takes O(1) time exclusive of its recursive calls. Therefore, the actual cost = O(1) + O(c) = O(c).

Amortized Cost
Let H denote the Fibonacci heap just prior to the DECREASE-KEY operation. Each recursive call of CASCADING-CUT, except for the last one, cuts a marked node and clears its mark bit; the last call may mark a node. Hence, after the DECREASE-KEY operation:
  number of trees: t(H') = t(H) + 1 + (c − 1) = t(H) + c
    (one new tree rooted at x, plus c − 1 trees produced by the cascading cuts)
  number of marked nodes: m(H') ≤ m(H) − (c − 1) + 1 = m(H) − c + 2
    (c − 1 nodes unmarked during the first c − 1 CASCADING-CUTs; at most one node marked during the last CASCADING-CUT)
Potential difference ≤ [(t(H) + c) + 2(m(H) − c + 2)] − [t(H) + 2m(H)] = 4 − c
Amortized cost = O(c) + 4 − c = O(1)
since we can scale up the units of potential to dominate the constant hidden in O(c).

Deleting a Node
FIB-HEAP-DELETE(H, x)
  FIB-HEAP-DECREASE-KEY(H, x, −∞)    => O(1)
  FIB-HEAP-EXTRACT-MIN(H)            => O(D(n))
end
Amortized cost = O(1) + O(D(n)) = O(D(n))

Analysis of the Potential Function
• Why does the potential function include the term t(H)? Each INSERT operation increases the potential by one unit, so its amortized cost = O(1) + 1 (increase in potential) = O(1).
• Consider an EXTRACT-MIN operation: let T(H) denote the set of trees just prior to the execution of EXTRACT-MIN. The root list just before the CONSOLIDATE operation is T(H) − {x} ∪ {children of x}, where x is the extracted node with the minimum key.
• The root node of each tree in T(H) − {x} carries a unit potential to pay for a link operation during the CONSOLIDATE.
• Let T1, T2 ∈ T(H) − {x} be trees of the same degree k. When they are linked, the root with the smaller key pays for the link with its unit of potential, and the root of the resulting tree still carries a unit potential to pay for a further link during the consolidation.
• Why does the potential function include the term 2m(H)?
  – When a marked node y is cut by a cascading cut, its mark bit is cleared, so the potential is reduced by 2.
  – One unit pays for the cut and the clearing of the mark field.
  – The other unit compensates for the unit increase in potential due to node y becoming a root.
• That is, when a marked node is cleared by a CASCADING-CUT:
  t' = t + 1 and m' = m − 1, so ΔΦ = (t + 1 + 2(m − 1)) − (t + 2m) = −1.
This unit decrease in potential pays for the cascading cut. Note that the original cut (of node x) is paid for by the actual cost.

Bounding the Maximum Degree
• Why do we apply CASCADING-CUT during the DECREASE-KEY operation? To keep the size of any tree or subtree exponential in the degree of its root node — e.g., to prevent cases where size[x] = degree[x] + 1.
(Figures: a node x of degree 6 whose subtree has shrunk to size[x] = 7 = degree[x] + 1; a complete subtree with size[x] = 2^4 = 16 reduced by a worst-case sequence of DECREASE-KEY cuts that do not decrease degree[x]; and, with cascading cuts in force, a degree-4 node x whose subtree keeps size(x) = 8 ≥ Φ^4 ≈ 6.9.)

2-3-4 Trees
• Multiway trees are trees whose nodes can have up to four children and three data items per node.
• 2-3-4 Trees: features
  – Are always balanced.
  – Reasonably easy to program.
  – Serve as an introduction to the understanding of B-Trees!
• B-Trees: another kind of multiway tree, particularly useful in organizing external storage, like files.
  – B-Tree nodes can have dozens or hundreds of children, and such trees can hold hundreds of thousands of records!
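Before the details, the node arrangement just described — up to three data items plus one more child link than items — can be previewed with a minimal sketch. The names below are hypothetical; this is not the tree234.java code discussed later:

```java
// Minimal preview of a 2-3-4 node: up to 3 keys, up to 4 child links,
// and the invariant that a non-leaf with D items has exactly D + 1 links.
public class Node234Preview {
    static class Node {
        int[] items = new int[3];       // up to three keys, kept in ascending order
        Node[] children = new Node[4];  // up to four child links
        int numItems;
    }

    // A node is a leaf (no links) or has exactly numItems + 1 links.
    static boolean invariantHolds(Node n) {
        int links = 0;
        for (Node c : n.children) if (c != null) links++;
        return links == 0 || links == n.numItems + 1;   // leaf, or L = D + 1
    }

    public static void main(String[] args) {
        Node leaf1 = new Node(); leaf1.items[0] = 10; leaf1.items[1] = 20; leaf1.numItems = 2;
        Node leaf2 = new Node(); leaf2.items[0] = 40; leaf2.numItems = 1;
        Node root  = new Node(); root.items[0] = 30;  root.numItems = 1;
        root.children[0] = leaf1;   // keys less than 30
        root.children[1] = leaf2;   // keys greater than 30
        System.out.println(invariantHolds(root) && invariantHolds(leaf1));
    }
}
```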
Introduction to 2-3-4 Trees
• In a 2-3-4 tree, all leaf nodes are at the same level (but data can appear in all nodes).
(Figure: an example 2-3-4 tree — root 50; second level 30 and 60 70 80; leaves 10 20, 40, 55, 62 64 66, 75, 83 86.)

2-3-4 Trees
• The 2, 3, and 4 in the name refer to how many links to child nodes can potentially be contained in a given node.
• For non-leaf nodes, three arrangements are possible:
  – A node with one data item always has two children.
  – A node with two data items always has three children.
  – A node with three data items always has four children.
• For non-leaf nodes with at least one data item (a node will not exist with zero data items), the number of links may be 2, 3, or 4.
• A non-leaf node always has one more child (link) than it has data items; equivalently, if the number of child links is L and the number of data items is D, then L = D + 1.

More Introductory Stuff
• Critical relationships determine the structure of 2-3-4 trees.
• A leaf node has no children but can still contain one, two, or three data items; it cannot be empty.
• Because a 2-3-4 tree can have nodes with up to four children, it is called a multiway tree of order 4.

Still More Introductory Stuff
• Binary (and red-black) trees may be referred to as multiway trees of order 2 — each node can have up to two children.
• But note: in a binary tree, a node may have up to two child links, one or more of which may be null.
• In a 2-3-4 tree, nodes with a single link are NOT permitted:
  – a node with one data item must have two links (unless it is a leaf);
  – a node with two data items must have three children;
  – a node with three data items must have four children.
• (You will see this more clearly once we talk about how they are actually built.)

Even More Introductory Stuff
• These numbers are important.
• For a node with one data item: one link points to lower-level nodes with values less than the item, and the other link points to nodes with values greater than or equal to it.
• Naming by links: a node with two links is called a 2-node, a node with three links a 3-node, and a node with four links a 4-node. (There is no such thing as a 1-node.)
(Figure: in the example tree, 50 and 30 are 2-nodes and 60 70 80 is a 4-node. Do you see a node with one data item that has two links? A node with two data items having three children? A node with three data items having four children?)

2-3-4 Tree Organization
• Very different organization than for a binary tree.
• First, we number the data items in a node 0, 1, 2 and the child links 0, 1, 2, 3. Very important.
• Data items are always in ascending order, left to right, within a node.
• The relationship between data items and child links is easy to understand but critical for processing.

More on 2-3-4 Tree Organization
• For a node with data items A, B, C: child 0 points to nodes with keys < A; child 1 to nodes with keys between A and B; child 2 to nodes with keys between B and C; child 3 to nodes with keys > C.
• Equal keys are not permitted; leaves are all on the same level; upper-level nodes are often not full; the tree is balanced! Its construction always maintains its balance, even as you add additional data items. (ahead)

Searching a 2-3-4 Tree
• A very nice feature of these trees.
• You have a search key: go to the root, retrieve the node, and search its data items.
• If hit: done.
• Else select the link that leads to the subtree with the appropriate range of values; if you don't find your target there, go to the next child (notice the data items are sequential — VIP later), etc.
• The data will ultimately be 'found' or 'not found'.

Try it: search for 64, 40, 65 in the example tree above.
Note: nodes serve as holders of data and as holders of 'indexes'.
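The search procedure just described can be sketched compactly on the slides' example tree. This is a stripped-down illustration with hypothetical names, not the book's tree234.java listing (which appears later):

```java
// Searching the example 2-3-4 tree: root 50; second level 30 and 60 70 80;
// leaves 10 20 / 40 / 55 / 62 64 66 / 75 / 83 86.
public class Search234Sketch {
    static class Node {
        int[] items; Node[] kids;
        Node(int[] items, Node... kids) { this.items = items; this.kids = kids; }
    }

    // Scan the items in each node sequentially; on a miss, descend into the
    // child whose key range brackets the target, until a leaf ends the search.
    static boolean find(Node n, int key) {
        while (n != null) {
            int j = 0;
            while (j < n.items.length && key >= n.items[j]) {
                if (n.items[j] == key) return true;          // hit inside this node
                j++;
            }
            n = (n.kids.length == 0) ? null : n.kids[j];     // next child, or done at a leaf
        }
        return false;                                        // 'no hit' condition
    }

    public static void main(String[] args) {
        Node left  = new Node(new int[]{30},
                new Node(new int[]{10, 20}), new Node(new int[]{40}));
        Node right = new Node(new int[]{60, 70, 80},
                new Node(new int[]{55}), new Node(new int[]{62, 64, 66}),
                new Node(new int[]{75}), new Node(new int[]{83, 86}));
        Node root  = new Node(new int[]{50}, left, right);
        // the slides' exercise: search for 64, 40, 65
        System.out.println(find(root, 64) + " " + find(root, 40) + " " + find(root, 65));
    }
}
```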
Note: can easily have a 'no hit' condition.
Note the sequential nature after indexing — sequential searching within a node.

So, How Do We Insert into this Structure?
• Can be quite easy; sometimes very complex.
  – Can do a top-down or a bottom-up approach.
• Easy approach:
  – Start with a search to find a spot for the data item.
  – We like to insert at the leaf level, but we will take the top-down approach to get there. So,
  – inserting may very likely involve moving a data item around to maintain the sequential nature of the data in a leaf.

Node Split – a Bit More Difficult (1 of 2)
• Using a top-down 2-3-4 tree.
• If we encounter a full node while looking for the insertion point, we must split that full node.
• You will see that this approach keeps the tree balanced.

Node Split – Insertion: More Difficult (2 of 2)
Upon encountering a full node (while searching for a place to insert):
1. Split that node at that time.
2. Move the highest data item from the current (full) node into a new node to the right.
3. Move the middle value of the node undergoing the split up to the parent node. (We know we can do this because the parent node was not full.)
4. Retain the lowest item in the node.
5. The new node (to the right) has only one data item: the highest value.
6. The original (formerly full) node contains only the lowest of the three values.
7. The rightmost children of the original full node are disconnected and reconnected to the new node as appropriate. (They must be disconnected, since their parent data has changed.) The new connections conform to the linkage conventions, as expected.
8. Insert the new data item into the appropriate leaf node.
Note: there can be multiple splits encountered en route to finding the insertion point.

Case 1 Insert: the Split Is NOT at the Root Node
Say we want to add a 99 (example from the book). The search from the root (62) reaches the full node 83 92 104, which must be split:
1. 104 starts a new node.
2. 92 moves up to the parent node. (We know it was not full.)
3. 83 stays put.
4. The two rightmost children of the split node (the leaves 97 and 112) are reconnected to the new node.
5. The new data item 99 is then moved into the leaf containing 97.

If the Root Itself Is Full: Split the Root
• Here the procedure is the same.
• The root is full, so create a sibling:
  – the highest data value is moved into the new sibling;
  – the first (smallest) value remains in the node;
  – the middle value moves up and becomes the data value in a new root.
• Here, two nodes are created: a new sibling and a new root.

Splitting on the Way Down
• Note: once we hit a node that must be split (on the way down), we know that when we move a data value 'up', that parent node was not full. It may be full 'now', but it wasn't on the way down.
• The algorithm is reasonably straightforward.
• Do practice the splits in Figure 10.7; you will see this later, on the next exam. I strongly recommend working the node splits on page 381 — ensure you understand how they work.
• Just remember:
  1. You are splitting a 4-node. The node being split has three data values: the data on the right goes to a new node, the data on the left remains, the data in the middle is promoted upward, and the new data item is inserted appropriately.
  2. We do a node split any time we encounter a full node while trying to insert a new data value.

Objects of this Class Represent the Data Items Actually Stored
(This is merely a data item stored at the nodes. In practice it might be an entire record or object; here we show only the key, where the key may represent the entire object.)
// tree234.java
import java.io.*;
////////////////////////////////////////////////////////////////
class DataItem
  {
  public int dData;                        // one data item
  //--------------------------------------------------------------
  public DataItem(int dd)                  // constructor
    { dData = dd; }
  //--------------------------------------------------------------
  public void displayItem()                // display item, format "/27"
    { System.out.print("/"+dData); }
  //--------------------------------------------------------------
  } // end class DataItem

The following code is for the processing that goes on inside the nodes themselves — not the entire tree. This is not trivial code; try to understand it totally. We will dissect it carefully and deliberately!

This is what a Node looks like. Note the two arrays: childArray, of size ORDER = 4 (the links — the maximum number of children), and itemArray, of size ORDER − 1 = 3 (the maximum number of data items in a node). numItems is the number of items in itemArray; parent will be used as a reference when inserting.

class Node
  {
  private static final int ORDER = 4;
  private int numItems;
  private Node parent;
  private Node childArray[] = new Node[ORDER];
  private DataItem itemArray[] = new DataItem[ORDER-1];
  // -------------------------------------------------------------
  public void connectChild(int childNum, Node child)
    {                                      // connect child to this node
    childArray[childNum] = child;
    if(child != null)
      child.parent = this;
    }
  // -------------------------------------------------------------
  public Node disconnectChild(int childNum)
    {                                      // disconnect child from this node, return it
    Node tempNode = childArray[childNum];
    childArray[childNum] = null;
    return tempNode;
    }
  // -------------------------------------------------------------
  public Node getChild(int childNum)
    { return childArray[childNum]; }
  // -------------------------------------------------------------
  public Node getParent()
    { return parent; }
  // -------------------------------------------------------------
  public boolean isLeaf()
    { return (childArray[0]==null) ? true : false; }
  // -------------------------------------------------------------
  public int getNumItems()
    { return numItems; }

The major work is done by findItem(), insertItem(), and removeItem() (next slides) for a given node. These are complex routines and NOT to be confused with find() and insert() for the Tree234 class itself — these are find and insert within THIS node. Recall that references are automatically initialized to null and numbers to 0 when their object is created, so Node doesn't need a constructor.

Class Node (continued)
  public DataItem getItem(int index)       // get DataItem at index
    { return itemArray[index]; }
  // -------------------------------------------------------------
  public boolean isFull()
    { return (numItems==ORDER-1) ? true : false; }
  // -------------------------------------------------------------
  public int findItem(int key)             // return index of item within node
    {                                      // if found, otherwise return -1
    for(int j=0; j<ORDER-1; j++)           // looking for the data within
      {                                    // the node where we are located
      if(itemArray[j] == null)
        break;
      else if(itemArray[j].dData == key)
        return j;
      } // end for
    return -1;
    } // end findItem()
  // -------------------------------------------------------------
  public DataItem removeItem()             // removes largest item;
    {                                      // assumes node not empty
    DataItem temp = itemArray[numItems-1]; // save item
    itemArray[numItems-1] = null;          // disconnect it
    numItems--;                            // one less item at the node
    return temp;                           // return the removed item
    }
  // -------------------------------------------------------------
  public void displayNode()                // format "/24/56/74/"
    {
    for(int j=0; j<numItems; j++)
      itemArray[j].displayItem();          // "/56"
    System.out.println("/");               // final "/"
    }

Class Node (continued) — the insert routine. It increments the number of items in the node, gets the key of the new item, and then loops, starting on the right, looking for the place to insert the data item.
  // -------------------------------------------------------------
  public int insertItem(DataItem newItem)
    {                                      // assumes node is not full
    numItems++;                            // will add new item
    int newKey = newItem.dData;            // key (int value) of new item
    for(int j=ORDER-2; j>=0; j--)          // start on right to examine data,
      {                                    // looking for spot to insert
      if(itemArray[j] == null)             // if cell empty,
        continue;                          // go left one cell
      else
        {                                  // if not empty, get its key
        int itsKey = itemArray[j].dData;
        if(newKey < itsKey)                // if existing key is bigger,
          itemArray[j+1] = itemArray[j];   // shift it right
        else
          {
          itemArray[j+1] = newItem;        // otherwise, insert new item
          return j+1;                      // and return index
          }
        } // end else (not null)
      } // end for                         // shifted all items, so
    itemArray[0] = newItem;                // insert new item at far left
    return 0;
    } // end insertItem()
  } // end class Node

Code for the 2-3-4 Tree Itself
(This code is merely the interface for a client. Pretty easy to follow: all the complex processing is undertaken at the tree level and at the node level. Be certain to recognize this.)
class Tree234App
  {
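The shift-right behavior of insertItem() can be exercised in isolation. The sketch below (hypothetical class name, bare int keys instead of DataItem objects) mirrors the same right-to-left scan:

```java
// Exercising the shift-right logic of Node.insertItem() in isolation.
public class InsertItemSketch {
    static final int ORDER = 4;
    static Integer[] itemArray = new Integer[ORDER - 1];  // null marks an empty cell
    static int numItems = 0;

    // Mirrors insertItem(): scan right-to-left past empty cells, shift larger
    // keys one cell right, place the new key, and return the index used.
    static int insertItem(int newKey) {
        numItems++;
        for (int j = ORDER - 2; j >= 0; j--) {
            if (itemArray[j] == null)
                continue;                        // empty cell: keep moving left
            else if (newKey < itemArray[j])
                itemArray[j + 1] = itemArray[j]; // shift existing key right
            else {
                itemArray[j + 1] = newKey;       // found the spot
                return j + 1;
            }
        }
        itemArray[0] = newKey;                   // smallest so far: leftmost cell
        return 0;
    }

    public static void main(String[] args) {
        int a = insertItem(50);   // into an empty node
        int b = insertItem(30);   // smaller: 50 shifts right
        int c = insertItem(40);   // lands between 30 and 50
        System.out.println(a + " " + b + " " + c + " : "
                + itemArray[0] + " " + itemArray[1] + " " + itemArray[2]);
    }
}
```

The items end up in ascending order left to right, as the node invariant requires.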
Note the 'break' in the case statements.
  public static void main(String[] args) throws IOException
    {
    int value;
    Tree234 theTree = new Tree234();
    theTree.insert(50);
    theTree.insert(40);
    theTree.insert(60);
    theTree.insert(30);
    theTree.insert(70);
    while(true)
      {
      System.out.print("Enter first letter of ");
      System.out.print("show, insert, or find: ");
      char choice = getChar();
      switch(choice)
        {
        case 's':
          theTree.displayTree();
          break;
        case 'i':
          System.out.print("Enter value to insert: ");
          value = getInt();
          theTree.insert(value);
          break;
        case 'f':
          System.out.print("Enter value to find: ");
          value = getInt();
          int found = theTree.find(value);
          if(found != -1)
            System.out.println("Found "+value);
          else
            System.out.println("Could not find "+value);
          break;
        default:
          System.out.print("Invalid entry\n");
        } // end switch
      } // end while
    } // end main()

Class Tree234App (continued)
  //--------------------------------------------------------------
  public static String getString() throws IOException
    {
    InputStreamReader isr = new InputStreamReader(System.in);
    BufferedReader br = new BufferedReader(isr);
    String s = br.readLine();
    return s;
    }
  //--------------------------------------------------------------
  public static char getChar() throws IOException
    {
    String s = getString();
    return s.charAt(0);
    }
  //--------------------------------------------------------------
  public static int getInt() throws IOException
    {
    String s = getString();
    return Integer.parseInt(s);
    }
  //--------------------------------------------------------------
  } // end class Tree234App

An object of type Tree234 IS the entire tree. Note the tree has only one attribute, its root — this is all it needs. The finding, searching, and splitting algorithms are all shown here.
class Tree234
  {
  private Node root = new Node();          // make root node
  // -------------------------------------------------------------
  public int find(int key)
    {
    Node curNode = root;
    int childNumber;
    while(true)
      {
      if(( childNumber = curNode.findItem(key) ) != -1)
        return childNumber;                // found; findItem returns the index
      else if( curNode.isLeaf() )
        return -1;                         // can't find it
      else                                 // search deeper
        curNode = getNextChild(curNode, key);
      } // end while
    } // end find()
  // -------------------------------------------------------------
  public void insert(int dValue)           // insert a DataItem
    {
    Node curNode = root;
    DataItem tempItem = new DataItem(dValue);
    while(true)
      {
      if( curNode.isFull() )               // if node full,
        {
        split(curNode);                    // split it (see creation of the two
        curNode = curNode.getParent();     //   additional nodes); back up,
        curNode = getNextChild(curNode, dValue); // search once more
        } // end if(node is full)
      else if( curNode.isLeaf() )          // if node is a leaf,
        break;                             // go insert the data
      else                                 // node is not full, not a leaf,
        curNode = getNextChild(curNode, dValue); // so go to lower level
      } // end while
    curNode.insertItem(tempItem);          // insert the new DataItem in the leaf
    } // end insert()

Class Tree234 (continued) — be careful in here. Remember, the middle value is moved to the parent, and the rightmost value is moved into the new node as its leftmost data item; both must be removed from the split node (their cells made null). The two rightmost children must likewise be disconnected and reconnected, since their parent's data has changed. Review the process, then note the code that implements it.

  // -------------------------------------------------------------
  public void split(Node thisNode)         // split the node;
    {                                      // assumes node is full
    DataItem itemB, itemC;                 // (when you get here, you know you need to split)
    Node parent, child2, child3;
    int itemIndex;

    itemC = thisNode.removeItem();         // remove items from this node
    itemB = thisNode.removeItem();         //   (these are the third and second items)
    child2 = thisNode.disconnectChild(2);  // remove children from this node
    child3 = thisNode.disconnectChild(3);  //   (these are the two rightmost children)

    Node newRight = new Node();            // make new node

    if(thisNode==root)                     // if the node we're splitting is the root,
      {
      root = new Node();                   // make new root
      parent = root;                       // root is our parent
      root.connectChild(0, thisNode);      // connect to parent
      }
    else                                   // the node to be split is not the root
      parent = thisNode.getParent();       // get parent

                                           // deal with parent
    itemIndex = parent.insertItem(itemB);  // item B goes to parent
    int n = parent.getNumItems();          // total items?
    for(int j=n-1; j>itemIndex; j--)       // move parent's connections
      {
      Node temp = parent.disconnectChild(j); // one child
      parent.connectChild(j+1, temp);      //   to the right
      }
    parent.connectChild(itemIndex+1, newRight); // connect newRight to parent

                                           // deal with newRight
    newRight.insertItem(itemC);            // item C goes to newRight
    newRight.connectChild(0, child2);      // connect child2 and child3
    newRight.connectChild(1, child3);      //   to slots 0 and 1 of newRight
    } // end split()
  // -------------------------------------------------------------
  // gets appropriate child of node during search for value
  public Node getNextChild(Node theNode, int theValue)
    {
    int j;                                 // assumes node is not empty, not full, not a leaf
    int numItems = theNode.getNumItems();
    for(j=0; j<numItems; j++)              // for each item in the node:
      {                                    // are we less?
      if( theValue < theNode.getItem(j).dData )
        return theNode.getChild(j);        // return left child
      } // end for                         // we're greater, so
    return theNode.getChild(j);            // return right child
    } // end getNextChild()
  // -------------------------------------------------------------
  public void displayTree()
    { recDisplayTree(root, 0, 0); }
  // -------------------------------------------------------------
  private void recDisplayTree(Node thisNode, int level, int childNumber)
    {
    System.out.print("level="+level+" child="+childNumber+" ");
    thisNode.displayNode();                // display this node
                                           // call ourselves for each child of this node
    int numItems = thisNode.getNumItems();
    for(int j=0; j<numItems+1; j++)
      {
      Node nextNode = thisNode.getChild(j);
      if(nextNode != null)
        recDisplayTree(nextNode, level+1, j);
      else
        return;
      }
    } // end recDisplayTree()
  // -------------------------------------------------------------
  } // end class Tree234

Efficiency Considerations for 2-3-4 Trees
• Searching:
  – Only one node per level must be visited, but there is more data per node and per level, so searches are fast.
  – Recognize that all data items at a node must be checked, but this is very fast and done sequentially.
  – All nodes in a 2-3-4 tree are not always full.
  – Overall, the increased number of items per node (which increases per-node processing time) tends to cancel out the gains from the decreased height of the tree; the increased number of data items per node implies fewer node retrievals.
  – So the search times for a 2-3-4 tree and for a balanced binary tree are approximately equal, and both are O(log2 n).
• Storage:
  – A 2-3-4 tree node can hold three data items and up to four references — either an array of references or four specific variables.
  – If not all of a node is used, there can be considerable waste; in 2-3-4 trees it is quite common to see many nodes that are not full.
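The claim that 2-3-4 and balanced binary search times are both O(log2 n) can be made concrete: with up to four children per node, a 2-3-4 tree of n keys has a height between roughly log4 n (every node a 4-node) and log2 n (every node a 2-node). A quick sketch (hypothetical names):

```java
// Height bounds for n keys: shortest when every node is a 4-node
// (branching factor 4), tallest when every node is a 2-node (branching 2).
public class HeightSketch {
    static int levels(long n, int branching) {
        int h = 0;
        long capacity = 0, levelNodes = 1;
        while (capacity < n) {                          // fill whole levels until n keys fit
            capacity += levelNodes * (branching - 1);   // keys per node = branching - 1
            levelNodes *= branching;                    // next level has 'branching' x nodes
            h++;
        }
        return h;
    }

    public static void main(String[] args) {
        long n = 1_000_000;
        System.out.println(levels(n, 4) + " " + levels(n, 2));
    }
}
```

For a million keys this gives about 10 levels of 4-nodes versus 20 levels of 2-nodes, which is why the larger nodes roughly offset the extra per-node comparisons.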