Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Analysis of Algorithms CS 477/677 Lecture 8 Instructor: Monica Nicolescu Quick Announcement • Hand imaging collection (hand shape) • Reza Amayeh: [email protected] • Lab address: LME building room 314 • Time: – Mon, Wed, Fri: 10:00 am until 5:00pm – Thu, Thr: 3:00 pm until 5:00pm CS 477/677 - Lecture 8 2 How Fast Can We Sort? • Insertion sort, Bubble Sort, Selection Sort • Merge sort (nlgn) • Quicksort (nlgn) (n2) • What is common to all these algorithms? – These algorithms sort by making comparisons between the input elements • To sort n elements, comparison sorts must make (nlgn) comparisons in the worst case CS 477/677 - Lecture 8 3 Decision Tree Model • Represents the comparisons made by a sorting algorithm on an input of a given size: models all possible execution traces • Control, data movement, other operations are ignored • Count only the comparisons • Decision tree for insertion sort on three elements: one execution trace node leaf: CS 477/677 - Lecture 8 4 Counting Sort • Assumption: – The elements to be sorted are integers in the range 0 to k • Idea: – Determine for each input element x, the number of elements smaller than x – Place element x into its correct position in the output array 1 A 2 3 4 5 6 7 8 0 B 2 3 4 2 3 4 5 C 2 2 4 7 7 8 2 5 3 0 2 3 0 3 1 1 5 6 7 8 0 0 2 2 3 3 3 5 CS 477/677 - Lecture 8 5 Analysis of Counting Sort Alg.: COUNTING-SORT(A, B, n, k) 1. for i ← 0 to k (k) 2. do C[ i ] ← 0 3. for j ← 1 to n (n) 4. do C[A[ j ]] ← C[A[ j ]] + 1 5. C[i] contains the number of elements equal to i 6. for i ← 1 to k (k) 7. do C[ i ] ← C[ i ] + C[i -1] 8. C[i] contains the number of elements ≤ i 9. for j ← n downto 1 (n) 10. do B[C[A[ j ]]] ← A[ j ] 11. C[A[ j ]] ← C[A[ j ]] - 1 CS 477/677 - Lecture 8 Overall time: (n + k) 6 Analysis of Counting Sort • Overall time: (n + k) • In practice we use COUNTING sort when k = O(n) running time is (n) • Counting sort is stable – Numbers with the same value appear in the same order in the output array – Important when satellite data is carried around with the sorted keys CS 477/677 - Lecture 8 7 Radix Sort • Considers keys as numbers in a base-R number – A d-digit number will occupy a field of d columns • Sorting looks at one column at a time – For a d digit number, sort the least significant digit first – Continue sorting on the next least significant digit, until all digits have been sorted – Requires only d passes through the list CS 477/677 - Lecture 8 8 RADIX-SORT Alg.: RADIX-SORT(A, d) for i ← 1 to d do use a stable sort to sort array A on digit i • 1 is the lowest order digit, d is the highest-order digit CS 477/677 - Lecture 8 9 Analysis of Radix Sort • Given n numbers of d digits each, where each digit may take up to k possible values, RADIXSORT correctly sorts the numbers in (d(n+k)) – One pass of sorting per digit takes (n+k) assuming that we use counting sort – There are d passes (for each digit) CS 477/677 - Lecture 8 10 Correctness of Radix sort • We use induction on number of passes through each digit • Basis: If d = 1, there’s only one digit, trivial • Inductive step: assume digits 1, 2, . . . , d-1 are sorted – Now sort on the d-th digit – If ad < bd, sort will put a before b: correct a < b regardless of the low-order digits – If ad > bd, sort will put a after b: correct a > b regardless of the low-order digits – If ad = bd, sort will leave a and b in the same order and a and b are already sorted on the low-order d-1 digits CS 477/677 - Lecture 8 11 Bucket Sort • Assumption: – the input is generated by a random process that distributes elements uniformly over [0, 1) • Idea: – – – – Divide [0, 1) into n equal-sized buckets Distribute the n input values into the buckets Sort each bucket Go through the buckets in order, listing elements in each one • Input: A[1 . . n], where 0 ≤ A[i] < 1 for all i • Output: elements ai sorted • Auxiliary array: B[0 . . n - 1] of linked lists, each list initially empty CS 477/677 - Lecture 8 12 BUCKET-SORT Alg.: BUCKET-SORT(A, n) for i ← 1 to n do insert A[i] into list B[nA[i]] for i ← 0 to n - 1 do sort list B[i] with insertion sort concatenate lists B[0], B[1], . . . , B[n -1] together in order return the concatenated lists CS 477/677 - Lecture 8 13 Example - Bucket Sort 1 .78 0 2 .17 1 .17 .12 3 .39 2 .26 .21 4 .26 3 .39 / 5 .72 4 / 6 .94 5 / 7 .21 6 .68 / 8 .12 7 .78 9 .23 8 10 .68 9 / .72 / .23 / / / .94 / CS 477/677 - Lecture 8 14 Example - Bucket Sort .12 0 .17 .23 .21 .12 .17 2 .21 .23 3 .39 / 6 .68 / 7 .72 4 / 5 / 9 .39 .68 .72 .78 .94 / 1 8 .26 .78 / .94 / / .26 / / Concatenate the lists from 0 to n – 1 together, in order CS 477/677 - Lecture 8 15 / Correctness of Bucket Sort • Consider two elements A[i], A[ j] • Assume without loss of generality that A[i] ≤ A[j] • Then nA[i] ≤ nA[j] – A[i] belongs to the same group as A[j] or to a group with a lower index than that of A[j] • If A[i], A[j] belong to the same bucket: – insertion sort puts them in the proper order • If A[i], A[j] are put in different buckets: – concatenation of the lists puts them in the proper order CS 477/677 - Lecture 8 16 Analysis of Bucket Sort Alg.: BUCKET-SORT(A, n) for i ← 1 to n O(n) do insert A[i] into list B[nA[i]] for i ← 0 to n - 1 do sort list B[i] with insertion sort (n) concatenate lists B[0], B[1], . . . , B[n -1] O(n) together in order return the concatenated lists CS 477/677 - Lecture 8 (n) 17 Conclusion • Any comparison sort will take at least nlgn to sort an array of n numbers • We can achieve a better running time for sorting if we can make certain assumptions on the input data: – Counting sort: each of the n input elements is an integer in the range 0 to k – Radix sort: the elements in the input are integers represented with d digits – Bucket sort: the numbers in the input are uniformly distributed over the interval [0, 1) CS 477/677 - Lecture 8 18 A Job Scheduling Application • Job scheduling – The key is the priority of the jobs in the queue – The job with the highest priority needs to be executed next • Operations – Insert, remove maximum • Data structures – Priority queues – Ordered array/list, unordered array/list CS 477/677 - Lecture 8 19 Example CS 477/677 - Lecture 8 20 PQ Implementations & Cost Worst-case asymptotic costs for a PQ with N items Insert Remove max ordered array N 1 ordered list N 1 unordered array 1 N unordered list 1 N Can we implement both operations efficiently? CS 477/677 - Lecture 8 21 Background on Trees • Def: Binary tree = structure composed of a finite set of nodes that either: – Contains no nodes, or – Is composed of three disjoint sets of nodes: a root node, a left subtree and a right subtree root 4 Left subtree 1 2 14 3 16 9 Right subtree 10 8 CS 477/677 - Lecture 8 22 Special Types of Trees • Def: Full binary tree = a binary tree in which each node is either a leaf or has degree exactly 2. • Def: Complete binary tree = a binary tree in which all leaves have the same depth and all internal nodes have degree 2. 4 1 3 2 14 16 9 8 10 7 12 Full binary tree 4 1 2 3 16 9 10 Complete binary tree CS 477/677 - Lecture 8 23 The Heap Data Structure • Def: A heap is a nearly complete binary tree with the following two properties: – Structural property: all levels are full, except possibly the last one, which is filled from left to right – Order (heap) property: for any node x Parent(x) ≥ x 8 7 5 4 2 It doesn’t matter that 4 in level 1 is smaller than 5 in level 2 Heap CS 477/677 - Lecture 8 24 Definitions • Height of a node = the number of edges on a longest simple path from the node down to a leaf • Depth of a node = the length of a path from the root to the node • Height of tree = height of root node = lgn, for a heap of n elements Height of root = 3 4 1 Height of (2)= 1 2 14 3 16 9 10 Depth of (10)= 2 8 CS 477/677 - Lecture 8 25 Array Representation of Heaps • A heap can be stored as an array A. – Root of tree is A[1] – Left child of A[i] = A[2i] – Right child of A[i] = A[2i + 1] – Parent of A[i] = A[ i/2 ] – Heapsize[A] ≤ length[A] • The elements in the subarray A[(n/2+1) .. n] are leaves • The root is the maximum element of the heap A heap is a binary tree that is filled in order CS 477/677 - Lecture 8 26 Heap Types • Max-heaps (largest element at root), have the max-heap property: – for all nodes i, excluding the root: A[PARENT(i)] ≥ A[i] • Min-heaps (smallest element at root), have the min-heap property: – for all nodes i, excluding the root: A[PARENT(i)] ≤ A[i] CS 477/677 - Lecture 8 27 Operations on Heaps • Maintain the max-heap property – MAX-HEAPIFY • Create a max-heap from an unordered array – BUILD-MAX-HEAP • Sort an array in place – HEAPSORT • Priority queue operations CS 477/677 - Lecture 8 28 Operations on Priority Queues • Max-priority queues support the following operations: – INSERT(S, x): inserts element x into set S – EXTRACT-MAX(S): removes and returns element of S with largest key – MAXIMUM(S): returns element of S with largest key – INCREASE-KEY(S, x, k): increases value of element x’s key to k (Assume k ≥ x’s current key value) CS 477/677 - Lecture 8 29 Maintaining the Heap Property • Suppose a node is smaller than a child – Left and Right subtrees of i are max-heaps • Invariant: – the heap condition is violated only at that node • To eliminate the violation: – Exchange with larger child – Move down the tree – Continue until node is not smaller than children CS 477/677 - Lecture 8 30 Maintaining the Heap Property Alg: MAX-HEAPIFY(A, i, n) 1. l ← LEFT(i) – Left and Right subtrees of i are 2. r ← RIGHT(i) max-heaps 3. if l ≤ n and A[l] > A[i] – A[i] may be 4. then largest ←l smaller than its 5. else largest ←i children 6. if r ≤ n and A[r] > A[largest] 7. then largest ←r 8. if largest i 9. then exchange A[i] ↔ A[largest] 10. MAX-HEAPIFY(A, largest, n) • Assumptions: CS 477/677 - Lecture 8 31 Example MAX-HEAPIFY(A, 2, 10) A[2] A[4] A[2] violates the heap property A[4] violates the heap property A[4] A[9] Heap property restored CS 477/677 - Lecture 8 32 MAX-HEAPIFY Running Time • Intuitively: – A heap is an almost complete binary tree must process O(lgn) levels, with constant work at each level • Running time of MAX-HEAPIFY is O(lgn) • Can be written in terms of the height of the heap, as being O(h) – Since the height of the heap is lgn CS 477/677 - Lecture 8 33 Building a Heap • Convert an array A[1 … n] into a max-heap (n = length[A]) • The elements in the subarray A[(n/2+1) .. n] are leaves • Apply MAX-HEAPIFY on elements between 1 and n/2 Alg: BUILD-MAX-HEAP(A) 1. n = length[A] 2. for i ← n/2 downto 1 3. 1 4 2 3 1 do MAX-HEAPIFY(A, i, n) 8 2 14 A: 4 3 4 1 CS 477/677 - Lecture 8 3 2 5 6 7 16 9 9 10 8 7 16 9 10 14 10 8 34 7 Readings • Chapter 8, 6 CS 477/677 - Lecture 8 35