Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Chapter 8 Sorting in linear time Ack: This presentation is based on the lecture slides from Hsu, Lih-Hsing, as well as various materials from the web. 8.1 Lower bound for sorting The decision tree model a1:a2 a2:a3 a1:a3 <1,2,3> a1:a3 <2,1,3> <1,3,2> 20071012 <3,1,2> Hsiu-Hui Lee a2:a3 <2,3,1> <3.2,1> 2 Theorem 8.1 Any comparison sort algorithm requires (n lg n) comparisons in the worst case. Proof: Consider a decision tree of height h with l reachable leaves corresponding to a comparison sort on n elements. n! l , Each of the n! permutations of the input appears as some leaf A binary tree of height h has no more than 2h leaves. l2 , h lg( n!) (n lg n). h Since the lg function is monotonically increasing 20071012 ??? Equation 3.18 ??? Hsiu-Hui Lee 3 Corollary 8.2 Heapsort and merge sort are asymptotically optimal comparisons. Proof: The O(n lg n) upper bounds on the running times for heapsort and merge sort match the (n lg n) worst–case lower bound from Theorem 8.1. 20071012 Hsiu-Hui Lee 4 8.2 Counting sort Assume that each of the n input elements is an integer in the range 1 to k for some integer k. COUNTING_SORT(A,B,k) (k ) (n) 6 for 1 for i 1 to k 7 do c [ i ] 0 2 3 for j 1 to length[A] i 1 to k do c [ i ] c [ i ] c [ i 1 ] 8 ► c[i] now contains the number of elements less 4 do c [ A[ j ]] c [ A[ j ]] 1 than or equal to i 5 ► c[i] now contains the 9 for j length [ A ] downto 1 number of equal to i 20071012 elements (k ) 10 11 Hsiu-Hui Lee do B[ c [ A[ j ]]] A[ j ] c [ A[ j ]] c [ A[ j ]] 1 (n) 5 The operation of Counting-sort on an input array A[1..8] 20071012 Hsiu-Hui Lee 6 Analysis: ( k n) Special case: (n) when Property: k O( n ) . stable (number with the same value appear in the output array in the same order as they do in the input array.) 20071012 Hsiu-Hui Lee 7 • • • ( k n) Analysis: Special case: (n) when Property: stable k O (n) . (number with the same value appear in the output array in the same order as they do in the input array.) 1 2 3 4 5 6 7 8 A 2 1 51 31 01 22 32 02 33 1 2 3 4 5 6 7 8 B 0 1 02 21 22 31 32 33 51 20071012 Hsiu-Hui Lee 8 8.3 Radix sort Used by the card-sorting machines you can now find only in computer museum. digit 1: the lowest –order digit digit d: the highest –order digit RADIX_SORT(A,d) 1 for i 1 to d 2 do use a stable sort to sort array A on digit i 20071012 Hsiu-Hui Lee 9 Analysis: Assume that we use counting sort as the intermediate sort. (n k ) per pass (digits in range 0, 1, 2, …, k) d passes (d (n k )) total If k O(n) , time = (dn) NOTE: not in-place (in-place: only constant number of elements of the input array are ever stored outside the array.) Lemma 8.3 Given n d-digit numbers in which each digit can take on up to k possible values, RADIX-SORT correctly sorts these number in (d(n + k)) time. if the stable sort it uses takes 20071012 (n + k) time Hsiu-Hui Lee 10 Lemma 8.4 Given n b-bit numbers and any positive integer r ≤ b, RADIX- SORT correctly sorts these numbers in ((b/r)(n+2r)) time. Proof : Choose d = b/r. digits of r bits each. Each digit is an integer in the range 0 to 2r -1 Each pass of counting sort takes tim (n+k)= (n+2r) and there are d passes, for a total running time of (d(n+2r))= ((b/r)(n+2r)). 20071012 Hsiu-Hui Lee 11 How to choose r? For given values of n and b, we wish to choose the value of r, with r≦b that minimizes ((b/r)(n+2r)) Choosing r ≒ lg n. ((b/r)(n+2r)) = ((b/lg n)(n+n)) = (bn/lg n) Example: To sort 216 32-bit words, n=216, b = 32, r = lg (216)=16 bits, d = 32/16. = 2 passes 20071012 Hsiu-Hui Lee 12 8.4 Bucket sort 20071012 Hsiu-Hui Lee 13 BUCKET_SORT(A) 1 n length [ A ] 2 for i 1 to n 3 do insert A[ i ] into list B nA[ i ] 4 for i 1 to n-1 5 do sort list B[ i ] with insertion sort 6 concatenate B[ 0 ], B[1 ],..., B[ n 1 ] together in order 20071012 Hsiu-Hui Lee 14 Analysis The running time of bucket sort is n1 T (n) (n) O(ni2 ). i 0 taking expectations of both sides and using linearity of expectation, we have n 1 E[T (n)] E (n) O(ni2 ) i 0 n 1 (n) E[O(ni2 )] i 0 n 1 (n) O[ E (ni2 )] 20071012 i 0 Hsiu-Hui Lee 15 We claim that E[ni2 ] 2 1 / n We define indicator random variables Xij = I {A[j] falls in bucket i} for i = 0, 1, …, n-1 and j = 1, 2,…,n. thus, ni 20071012 n X ij . j 1 Hsiu-Hui Lee 16 2 n E[ni2 ] E X ij j 1 n n E X ij X ik j 1k 1 n E X ij2 X ij X ik 1 j n 1 k n j 1 k j n E[ X ij2 ] j 1 20071012 E[ X ij X ik ], 1 j n 1 k n k j Hsiu-Hui Lee 17 Indicator random variable Xij is 1 with probability 1/n and 0 otherwise, and therefore 1 1 1 E[ X ij2 ] 1 0 1 n n n When k≠ j, the variables Xij and Xik are independent, and hence E[ X ij X ik ] E[ X ij ]E[ X ik ] 1 1 n n 20071012 1 . 2 n Hsiu-Hui Lee 18 E[ni2 ] n 1 1 n 2 j 1 1 j n 1 k n n k j 1 1 n n(n 1) 2 n n n 1 1 n 1 2 n We can conclude that the expected time for bucket sort is (n) n * O(2 1 / n) (n) 20071012 Hsiu-Hui Lee 19