* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download heap property
Survey
Document related concepts
Transcript
GRIFFITH COLLEGE DUBLIN Data Structures, Algorithms & Complexity The Heap Data Structure Lecture 7 1 Introduction The binary heap data structure is a data structure that can be viewed as complete binary tree A binary tree is a tree in which each node can have a left-child and a right-child only A complete binary tree is a tree which is completely filled at all levels except possibly the lowest level, which is filled from left to right Because of its nature a heap can very easily be implemented using an array An array A that represents a heap is an array with two attributes length[A], the number of elements in the array heap-size[A], the number of heap elements stored in the array Lecture 7 2 Heap as Self-Referencing Structure Node has attributes, value, leftchild, rightchild and parent Val parent 24 rightchild leftchild 16 12 15 8 6 Lecture 7 7 3 Heap as an Array Viewed as a binary tree and as an array 1 2 4 8 14 5 4 2 14 3 10 9 4 8 10 5 7 10 6 7 8 2 1 16 16 3 9 3 7 1 6 9 7 3 8 2 9 4 10 1 The root of the tree is stored at A[0], its left-child at A[1], its right child at A[2] etc. Lecture 7 4 Accessing the Heap Values Given the index i of a node, the indices of its parent Parent(i), left-child Left(i) and right child Right(i) can be computed simply Parent(i) return i/2 Left(i) return 2i Right(i) return 2i+1 Heaps must also satisfy the heap property for every node, i, other than the root. A[Parent(i)] A[i] Therefore, the largest element in a heap is stored at the root, and the subtrees rooted at a node contain smaller values than does the node itself. Lecture 7 5 Heap Operations The height of a node in a tree is the number of edges on the longest simple downward path from the node to a leaf. The height of an n-element heap based on a binary tree is (lg n) The basic operations on heaps run in time at most proportional to the height of the tree and thus take O(lg n) time. Lecture 7 6 Maintaining the Heap Property Heapify is an important subroutine for manipulating heaps. Its inputs are an array A and an index i into the array. When Heapify is called, it is assumed that the binary trees rooted at Left(i) and Right(i) are heaps, but that A[i] may be smaller than its children, thus violating the heap property. The function of Heapify is to let the value at A[i] ‘float down’ in the heap so that the subtree rooted at index i becomes a heap. The action required from Heapify is as follows: Lecture 7 7 Example of Heapify 1 i 4 8 2 4 7 8 9 10 3 10 5 14 2 16 6 7 9 3 1 1 2 14 5 i44 8 1 2 14 4 8 2 5 8 4 9 10 16 7 2 16 7 8 9 10 3 10 6 9 7 3 1 3 6 10 9 3 7 1 Lecture 7 8 The Heapify Algorithm Heapify(A, i) left = Left(i) right = Right(i) if left heap_size[A] and A[left] > A[i] then largest = left else largest = i endif if right heap-size[A] and A[right] > A[largest] then largest = right if largest <> i then exchange A[i] and A[largest] Heapify(A, largest) endif endalg Lecture 7 9 The Heapify Algorithm At each step, the index of the largest of the elements A[i], A[Left(i)], and A[Right(i)] is stored in the variable largest. If A[i] is largest, then the subtree rooted at node i is a heap and the procedure ends. Otherwise, one of the two children has the largest element, and A[i] is swapped with A[largest], which causes node i and its children to satisfy the heap property. The node largest, however, now has the original value A[i], and thus the subtree rooted at largest may violate the heap property. Therefore, Heapify must be called recursively on that subtree. Lecture 7 10 Analysis of Heapify The running time of Heapify on a subtree of size n rooted at a given node i is the (1) time to fix up the relationships among the elements A[i], A[Left(i)] and A[Right(i)], plus the time to run Heapify on a subtree rooted at one of the children of node i. The subtrees can have size of at most 2n/3, and the running time of Heapify can therefore be described by the recurrence, T(n) T(2n/3) + (1) and the solution to this works out as, T(n) = O(log n) Lecture 7 11 Building a Heap We can use the Heapify procedure in a bottom up manner to convert an array A[1..n], where n = length[A], into a heap. Now, since the subarray A[(n/2 + 1)..n] are all leaves of the tree, each is a 1-element heap. The procedure Build-Heap goes through the remaining nodes and runs Heapify on each. The order of processing guarantees that the subtrees rooted at children of node i are heaps before Heapify is run at that node. Lecture 7 12 Build Heap Algorithm Build-Heap(A) heap-size[A] = lenght[A] for i = length[A]/2 downto 1 Heapify(A, i) endfor endalg Each call to Heapify costs O(lg n) time, and there are O(n) such calls. Thus, the running time is at most O(n lg n) A more complex analysis, however, gives us a tighter upper bound of O(n). Hence, we can build a heap from an unordered array in linear time. Lecture 7 13 Example of Build Heap A 4 1 1 2 4 8 2 14 8 1 2 16 9 10 14 8 7 1 4 5 i 16 9 10 3 6 3 3 9 2 10 7 4 i 2 8 7 14 8 4 1 5 16 9 10 6 7 2 1 4 8 14 2 1 4 5 16 216 2 7 10 4 8 i 4 8 i 9 10 7 8 (c) 4 3 6 i 9 3 14 8 9 5 10 1 7 2 1 14 2 8 6 3 10 9 (e) 2 7 3 4 8 2 Lecture 7 8 4 4 6 9 10 7 10 9 10 5 7 1 3 9 3 7 (d) 1 16 14 7 10 (b) 5 16 1 3 9 (a) 1 3 6 10 3 9 7 3 (f) 14 Priority Queue Many applications require that we process records with keys in order, but not necessarily in full sorted order Items in a priority queue are not processed strictly in order of entry into the queue Items can be placed on the queue at any time, but processing always takes the item with the largest key The priority queue is a generalised queue Abstract Data Structure (ADT) In fact, the priority queue is a proper generalisation of the stack and the queue ADTs, as we can implement either with priority queues, using appropriate priorities Lecture 7 15 Definition A priority queue is a data structure of items with keys that support two basic operations : insert a new item, and delete the item with the largest key. Applications of priority queues include: Simulation Systems, where events need to be processed chronologically. Job scheduling in computer systems. The Heap Data Structure is an efficient structure for implementing a simple priority queue In practice, priority queues are more complex than the simple definition above. Lecture 7 16 Priority Queue One of the reasons that priority queue implementations are so useful is their flexibility For that reason any implementation will need to support some of the following operations Construct a priority queue from N given items Insert a new item Delete the maximum item Change the priority of an arbitrary specified item Delete an arbitrary specified item We take “maximum” to mean “any record with the largest key value”. Lecture 7 17 Heap Extract As with many data structures, we also need to add standard initialise, test if empty and perhaps destroy and copy operations Lets look at insert and delete maximum. Heap-Extract-Max(A) if heap-size[A] < 1 then error “heap underflow” endif max = A[1] A[1] = A[heap-size[A]] heap-size[A] = heap-size[A] – 1 Heapify(A, 1) return max endalg The running time of Heap-Extract-Max is O(lg n), since it performs only a constant amount of work on top of the O(lg n) time for Heapify Lecture 7 18 Heap Insert Heap-Insert inserts a node into heap A. Expand by adding a new leaf to the tree. Traverse a path from this leaf toward the root to find a place for the new element. Heap-Insert(A, key) heap-size[A] = heap-size[A] + 1 i = heap-size[A] while i > 1 and A[Parent(i)] < key do A[i] = A[Parent(i)] i = Parent(i) endwhile A[i] = key endalg Running time of on an n-element heap is O(lg n) since the path traced from the new leaf to the root has length O(lg n) This is an example of a non-recursive sift up procedure. Lecture 7 19 Example Heap Insert (15) 16 16 14 10 7 8 2 14 9 3 1 4 7 8 2 (a) 10 (b) 16 10 2 14 4 1 3 1 4 16 8 9 9 7 15 3 10 14 8 2 4 1 9 3 7 (d) (c) (a) is the heap before insert, (b) add a leaf (c) Copy from leaf to root until place for new item found(d) Add new item Lecture 7 20 Heap Delete To construct a priority queue it is simply necessary to call BuildHeap for the array To delete an arbitrarily specified item from the queue follow the design for Heap-Extract-Max Heap-Delete(A, i) if heap-size[A] < 1 then error “heap underflow” endif item = A[i] A[i] = A[heap-size[A]] heap-size[A] = heap-size[A] – 1 Heapify(A, i) return item endalg Which will run in at worst O(lg n) time Lecture 7 21 Changing Priorities What if we want to change a node priority? Implement based on the existing functions? ‘Delete old value’ and ‘Insert new value’ Each of these will run in O(lg n) time, which means that the procedure will also run in O(lg n) time. The heap provides an efficient implementation for a priority queue using the operations outlined, but is not so efficient in other cases For example, an operation which may sometimes be required is the join operation to merge two priority queues There are other implementations (e.g. doubly linked list) which support this operation more efficiently Lecture 7 22 Summary The binary heap data structure is a data structure that can be viewed as complete binary tree Because of its nature a heap can very easily be implemented using an array Heaps must also satisfy the heap property for every node, i, other than the root. A[Parent(i)] A[i] The heapify algorithm can be used to create a heap, and maintain a heap Lecture 7 23