3 Sorting

3.1 The Concept of Sorting

• The fundamental objectives of this chapter are [Wi86]:
• (1) To provide an extensive set of examples illustrating the use of the data structures introduced in the preceding chapter.
• (2) To show how the choice of structure for the underlying data profoundly influences:
• The algorithms that perform a given task.
• The programming techniques used in algorithm implementation.
• Sorting is the ideal domain to study:
• (1) The development of the algorithms.
• (2) The performance of the algorithms.
• (3) The advantages and disadvantages that have to be weighed against each other in the light of the particular application.
• (4) The programming techniques specific to different algorithms.
• Sorting is generally understood to be the process of rearranging a given set of objects in a specific order.
• The purpose of sorting is to facilitate the later search for members of the sorted set.
• Sorting is an almost universally performed, fundamental activity.
• Objects are sorted in telephone books, in income tax files, in tables of contents, in libraries, in dictionaries, in warehouses, and almost everywhere that stored objects have to be searched and retrieved.
• Even small children are taught to put their things "in order", and they are confronted with some sort of sorting long before they learn anything about arithmetic.
• Throughout this chapter, we presume that sorting refers to records with a specified structure, as in [3.1.a]:
------------------------------------------------------------
TYPE TypeElement = RECORD
       key: integer;                               {3.1.a}
       {other components}
     END;
------------------------------------------------------------
typedef struct {
    int key;                                       /*3.1.a*/
    /* ... other components */
} type_element;
------------------------------------------------------------
• The field key may not be relevant from the information point of view, the essential information being contained in the other fields of the record.
• But from the sorting point of view, the field key is the most important, because we consider the following definition of sorting.
• If we are given n items: a1, a2, ..., an
• Sorting consists of permuting these items into a certain order: ak1, ak2, ..., akn
• So that the sequence of keys is monotonically increasing, in other words: ak1.key ≤ ak2.key ≤ ... ≤ akn.key
• The type of the field key is presumed to be integer for simplicity, but in reality it can be any scalar type.
• A sorting method is called stable if the relative order of items with equal keys remains unchanged by the sorting process.
• Stability of sorting is often desirable if items are already ordered (sorted) according to some secondary keys, i.e., properties not reflected by the (primary) key itself.
• The dependence of the choice of an algorithm on the structure of the data to be processed -- a ubiquitous phenomenon -- is profound in the case of sorting.
• For this reason, sorting methods are generally classified into two categories, namely:
• (1) Sorting of arrays, or internal sorting. The items to be sorted are stored in the random-access "internal" store of the computing system as arrays.
• (2) Sorting of (sequential) files, or external sorting. The items to be sorted are stored as files, which are appropriate on the slower but more spacious "external" stores based on mechanically moving devices (disks and tapes).

3.2 Sorting Arrays

• Arrays are stored in the main memory of the computing system; that is why the sorting of arrays is named internal sorting.
• The predominant requirement that has to be made of sorting methods on arrays is an economical use of the available store.
• This implies that the permutation which brings the items into order has to be performed "in situ", that is, using only the array's designated area.
• Methods which transport items from an array a to a result array b are intrinsically of minor interest.
• Having thus restricted our choice of methods among the many possible solutions by the criterion of economy of storage, we proceed to a first classification of sorting algorithms according to their efficiency, i.e., their execution time.
• The quantitative assessment of the efficiency of a sorting algorithm can be expressed by specific indicators:
• (1) A first indicator is the number C of key comparisons needed in order to sort an array.
• (2) Another indicator is the number M of moves (transpositions) of items.
• Both indicators are related to the number n of items to be sorted.
• We first discuss several simple and obvious sorting techniques, called straight sorting methods, for which the values of C and M are of the order n2, that is, they are O(n2).
• There are also advanced sorting algorithms, of higher complexity, for which the values of C and M are of the order n∗log2 n (O(n∗log2 n)).
• The ratio n2/(n∗log2 n), which illustrates the efficiency gained by these algorithms, is approximately 10 for n = 64 and 100 for n = 1000.
• Despite this situation, there are good reasons for presenting straight sorting methods before proceeding to faster algorithms:
• (1) Straight methods are particularly well suited for elucidating the characteristics of the major sorting principles.
• (2) Their implementations are short and easy to understand.
• (3) Although sophisticated methods require fewer operations, these operations are usually more complex in their details.
• Consequently, straight methods are faster for sufficiently small n, although they should not be used for large n.
• (4) They represent the starting point for advanced sorting methods.
• Sorting methods that sort items in situ can be classified into three principal categories according to their underlying method:
• (1) Sorting by insertion.
• (2) Sorting by selection.
• (3) Sorting by exchange.
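The two indicators introduced above can be made concrete by instrumenting one of the straight methods with counters. The following sketch is illustrative only (it is not from the text, and it uses plain int keys instead of TypeElement):

```c
#include <assert.h>

/* Illustrative sketch: straight insertion sort instrumented with the
   two indicators discussed above -- C (key comparisons) and M (item
   moves). Counters are globals for simplicity. */
static long C, M;

static void straight_insertion_counted(int a[], int n) {
    C = M = 0;
    for (int i = 1; i < n; i++) {
        int temp = a[i]; M++;             /* temp := a[i] */
        int j = i - 1;
        while (j >= 0) {
            C++;                          /* one key comparison */
            if (a[j] <= temp) break;
            a[j + 1] = a[j]; M++;         /* shift one position right */
            j--;
        }
        a[j + 1] = temp; M++;             /* insert at the final place */
    }
}
```

On a reversed array of n items this harness measures C = n(n−1)/2 comparisons, i.e. O(n2); on an already sorted array it measures only C = n−1.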
• In presenting these methods we will use the TypeElement described in [3.1.a], as well as the following structures [3.2.a].
------------------------------------------------------------
TYPE TypeIndex = 0..n;
     TypeArray = ARRAY [TypeIndex] OF TypeElement;
VAR  a: TypeArray;                                 {3.2.a}
     temp: TypeElement;
------------------------------------------------------------
#define N ...
typedef struct {
    int key;                                       /*3.2.a*/
    /* ... other components */
} type_element;
type_element a[N];
------------------------------------------------------------

3.2.1 Sorting by Straight Insertion

• This method is widely used by card players.
• The items (cards) are conceptually divided into a destination sequence a1...ai-1 and a source sequence ai...an.
• In each step, starting with i = 2, the i-th element of the array (which is the first element of the source sequence) is picked and transferred into the destination sequence by inserting it at the appropriate place.
• Then i is incremented and the cycle is repeated.
• That is, at the beginning the first two items are sorted, then the first three, and so on.
• We have to observe that in step i, the first i-1 items are already sorted, so the sorting consists only in inserting the item a[i] at its suitable place in an already ordered sequence.
• The formal description of this algorithm appears in [3.2.1.a].
------------------------------------------------------------
{Sorting by Straight Insertion Algorithm}
FOR i:= 2 TO n DO
  BEGIN                                            {3.2.1.a}
    temp:= a[i];
    {insert temp at the appropriate place in a[1]...a[i]}
  END;{FOR}
------------------------------------------------------------
• To find the place at which item a[i] will be inserted, the already sorted destination sequence a[1],...,a[i-1] is scanned from right to left, comparing a[i] with each scanned item.
• At the same time, during the scanning, each tested item is shifted one position to the right, until the stop condition is fulfilled.
• By this action, a place for the item to be inserted in the array is created.
• The scanning process is stopped when the first item a[j] having a key smaller than or equal to that of a[i] is found.
• If such an item a[j] does not exist, the scanning process stops at a[1], that is, at the first position.
• This typical case of a loop with two conditions reminds us of the sentinel method (&1.4.2.1).
• For this, the auxiliary element a[0] is added to the array and is initialized with the value of a[i].
• As a result, the condition that the key of a[j] be less than or equal to the key of a[i] is fulfilled at the latest for j = 0, and it is not necessary to verify the value of the index j (j>=0).
• The effective insertion is realized at location a[j+1].
• The corresponding algorithm is presented in [3.2.1.b] and its temporal scheme in figure 3.2.1.a.
------------------------------------------------------------
{Sorting by straight insertion – Pascal variant}
PROCEDURE SortingByInsertion;
VAR i,j: TypeIndex;
    temp: TypeElement;
BEGIN
  FOR i:= 2 TO n DO
    BEGIN
      temp:= a[i]; a[0]:= temp; j:= i-1;
      WHILE a[j].key>temp.key DO
        BEGIN
          a[j+1]:= a[j];
          j:= j-1                                  {3.2.1.b}
        END; {WHILE}
      a[j+1]:= temp
    END {FOR}
END; {SortingByInsertion}
------------------------------------------------------------
Fig.3.2.1.a. Temporal scheme of the sorting by insertion algorithm: the outer FOR loop performs n-1 iterations of O(1) assignments, each containing an inner WHILE loop of at most n-1 iterations (1 comparison and 1 assignment each), giving O((n-1)∗n) = O(n2) overall.
• The sorting algorithm contains an external loop driven by i, which executes n-1 iterations (the FOR loop).
• Inside each external iteration, an internal loop driven by j is executed as long as the WHILE condition holds (the WHILE loop).
• In step i of the external FOR cycle:
• (1) The minimum number of iterations of the inner cycle is 0 (the array is already ordered).
• (2) The maximum number is i-1 (the array is ordered in reverse order).
3.2.1.1 Performance Analysis of Straight Insertion

• In the i-th step of the FOR loop, the number Ci of key comparisons executed in the WHILE loop depends on the initial order of the keys, being:
• At least 1 (ordered sequence).
• At most i-1 (sequence ordered in reverse order).
• On average i/2, presuming that all permutations of the n given keys are equally probable.
• Because we have n-1 iterations of the FOR loop, for i = 2,3,...,n, the indicator C can take the values presented in [3.2.1.c].
------------------------------------------------------------
C_min = Σ_{i=2}^{n} 1 = n − 1

C_max = Σ_{i=2}^{n} (i − 1) = Σ_{i=1}^{n−1} i = (n − 1)·n / 2     [3.2.1.c]

C_avg = (C_min + C_max) / 2 = (n² + n − 2) / 4
------------------------------------------------------------
• The number of item moves Mi inside one FOR cycle is Ci + 3.
• Explanation: to the Ci moves of type a[j+1]:= a[j] executed in the inner WHILE cycle, 3 more moves are added (temp:= a[i], a[0]:= temp and a[j+1]:= temp).
• Even when the inner WHILE cycle executes no move at all, the 3 mentioned assignments remain.
• As a result, the indicator M can take the following values [3.2.1.d].
------------------------------------------------------------
M_min = 3·(n − 1)

M_max = Σ_{i=2}^{n} (C_i + 3) = Σ_{i=2}^{n} (i + 2) = Σ_{i=1}^{n+2} i − (1 + 2 + 3)
      = (n + 2)·(n + 3) / 2 − 6 = (n² + 5·n − 6) / 2              [3.2.1.d]

M_avg = (M_min + M_max) / 2 = (n² + 11·n − 12) / 4
------------------------------------------------------------
• We can notice that the values C and M are of the order n2 (O(n2)).
• The minimal numbers occur if the items are initially in order; the worst case occurs if the items are initially in reverse order.
• Sorting by straight insertion is a stable sorting method.
• In [3.2.1.e] a slightly modified C variant of this sorting method is presented.
------------------------------------------------------------
/* Sorting by straight insertion – C variant
   (sentinel on position a[n]; a must have room for n+1 elements) */
void straight_insertion(int a[], int n) {
    for (int i = n-2; i >= 0; i--) {
        a[n] = a[i];                   /* sentinel */
        int j = i+1;
        while (a[j] < a[n]) {
            a[j-1] = a[j];             /* shift left */      /*3.2.1.e*/
            j++;
        }
        a[j-1] = a[n];
    }
}
------------------------------------------------------------
• Referring to [3.2.1.e], the following observations can be made:
• The implementation is a "mirror" variant in comparison with the Pascal variant.
• The array containing n items is a[0]...a[n-1].
• The source sequence is a[0]...a[i].
• The destination sequence (the ordered one) is a[i+1]...a[n-1].
• The sentinel is at position n of the array a.
• In the process of finding the insertion place, in the current step, the destination sequence is scanned using the index j from left to right, starting with position i+1, until the insertion place is found or position n is reached.
• The encountered items having keys smaller than the inserted key are shifted one position to the left, until the condition fails.

3.2.1.2 Sorting by Binary Insertion

• The algorithm of straight insertion is easily improved by noting that the destination sequence a[1]...a[i-1], in which the new item has to be inserted, is already ordered.
• In this case a faster method of determining the insertion point is to use binary searching.
• That presumes successively dividing the search interval into two equal parts, until the insertion place is found.
• The modified algorithm is named binary insertion [3.2.1.f].
------------------------------------------------------------
{Sorting by binary insertion – Pascal variant}
PROCEDURE SortingByBinaryInsertion;
VAR i,j,left,right,m: TypeIndex;
    temp: TypeElement;
BEGIN
  FOR i:= 2 TO n DO
    BEGIN
      temp:= a[i]; left:= 1; right:= i-1;
      WHILE left<=right DO
        BEGIN                                      {3.2.1.f}
          m:= (left+right) DIV 2;
          IF a[m].key>temp.key THEN right:= m-1
                               ELSE left:= m+1
        END;{WHILE}
      FOR j:= i-1 DOWNTO left DO a[j+1]:= a[j];
      a[left]:= temp
    END {FOR}
END; {SortingByBinaryInsertion}
------------------------------------------------------------
/* Sorting by binary insertion – C variant */
void sorting_by_binary_insertion() {
    type_index i, j, left, right, m;
    type_element temp;
    for (i = 2; i <= n; i++) {
        temp = a[i]; left = 1; right = i-1;
        while (left <= right) {                    /*3.2.1.f*/
            m = (left+right) / 2;
            if (a[m].key > temp.key) right = m-1;
            else left = m+1;
        } /*while*/
        for (j = i-1; j >= left; j--) a[j+1] = a[j];
        a[left] = temp;
    } /*for*/
} /* sorting_by_binary_insertion */
------------------------------------------------------------

3.2.1.3 Performance Analysis of Binary Insertion

• In the case of sorting by binary insertion, the insertion position is found when a[j].key ≤ temp.key ≤ a[j+1].key, meaning that the search interval has reached dimension 1.
• If the search interval has length i, determining the insertion place requires log2(i) steps.
• Because the length of the search interval in step i is i, and we have n steps, the total number of comparisons C executed in the outer FOR loop is presented in [3.2.1.g].
------------------------------------------------------------
C = Σ_{i=1}^{n} log2 i                             [3.2.1.g]
------------------------------------------------------------
This sum can be approximated by the integral [3.2.1.h].
------------------------------------------------------------
C ≈ ∫_1^n log2 x · dx = x·(log2 x − c) |_1^n = n·(log2 n − c) + c   [3.2.1.h]

where c = log2 e = 1 / ln 2 ≈ 1.44269
------------------------------------------------------------
• The number of comparisons is essentially independent of the initial order of the items.
• That is not usual for a sorting algorithm.
• Unfortunately, the improvement obtained by using a binary search method applies only to the number of comparisons, not to the number of necessary moves.
• In fact, since moving items, i.e., keys and associated information, is in general considerably more time-consuming than comparing two keys, the improvement is by no means drastic:
• The important term M is still of the order n2.
• And, in fact, sorting an already sorted array takes more time than does straight insertion with sequential search.
• In conclusion, sorting by insertion is not a suitable sorting method for a computing system, because the insertion of an item presumes shifting a number of items by one position, which is neither economical nor efficient.
• One should expect better results from a method in which moves of items are performed only upon single items and over longer distances.
• This idea leads to sorting by selection.

3.2.2 Sorting by Straight Selection

• Sorting by straight selection is based on the idea of selecting the item with the minimum key and interchanging the position of this item with the item in the first position.
• The procedure is repeated for the remaining n-1 items, then with n-2 items, etc., finishing with the last two items.
• We recall that the sorting by insertion method considers at each step a single item of the source sequence, and all the items of the destination sequence, in which it searches for the insertion place.
• By contrast, the sorting by straight selection method considers all the items of the source sequence, from which it selects the item with the smallest key and places it as the next item of the destination sequence.
------------------------------------------------------------
{Sorting by straight selection}
FOR i:= 1 TO n-1 DO
  BEGIN                                            {3.2.2.a}
    {find the smallest item of a[i]...a[n] and
     assign variable min with its index}
    {interchange a[i] with a[min]}
  END;
------------------------------------------------------------
• By refinement results the algorithm presented in [3.2.2.b], whose temporal scheme appears in figure 3.2.2.a.
------------------------------------------------------------
{Sorting by straight selection – Pascal variant}
PROCEDURE SortingBySelection;
VAR i,j,min: TypeIndex;
    temp: TypeElement;
BEGIN
  FOR i:= 1 TO n-1 DO
    BEGIN
      min:= i; temp:= a[i];
      FOR j:= i+1 TO n DO
        IF a[j].key<temp.key THEN                  {3.2.2.b}
          BEGIN
            min:= j; temp:= a[j]
          END;{FOR}
      a[min]:= a[i]; a[i]:= temp
    END {FOR}
END; {SortingBySelection}
------------------------------------------------------------
/* Sorting by straight selection – C variant */
void sorting_by_selection() {
    type_index i, j, min;
    type_element temp;
    for (i = 1; i <= n-1; i++) {
        min = i; temp = a[i];
        for (j = i+1; j <= n; j++)
            if (a[j].key < temp.key) {             /*3.2.2.b*/
                min = j; temp = a[j];
            } /*for*/
        a[min] = a[i]; a[i] = temp;
    } /*for*/
} /*sorting_by_selection*/
------------------------------------------------------------
Fig.3.2.2.a. Temporal scheme of the sorting by selection algorithm: an outer FOR loop of n-1 iterations (1 assignment each), an inner FOR loop performing 1 comparison per scanned item and, on average, Hm-1 assignments, followed by 2 assignments for the interchange.

3.2.2.1 Performance Analysis of Sorting by Straight Selection

• Evidently, the number C of key comparisons is independent of the initial order of the keys.
• It is fixed, being determined by the complete execution of the two nested FOR loops [3.2.2.c].
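This independence of the comparison count can be verified empirically with an instrumented sketch (illustrative, not from the text; plain int keys are used instead of TypeElement):

```c
#include <assert.h>

/* Illustrative check: the number of key comparisons of straight
   selection is fixed by the two nested loops, so it is the same for
   any initial order of the keys. */
static long C;  /* comparison counter */

static void selection_sort_counted(int a[], int n) {
    C = 0;
    for (int i = 0; i < n - 1; i++) {
        int min = i;
        for (int j = i + 1; j < n; j++) {
            C++;                          /* every pair is always compared */
            if (a[j] < a[min]) min = j;
        }
        int temp = a[min]; a[min] = a[i]; a[i] = temp;  /* interchange */
    }
}
```

Running this on a sorted and on a reversed array of the same length yields exactly the same value of C.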
• In this sense, this method may be said to behave less naturally than straight insertion.
------------------------------------------------------------
C = Σ_{i=1}^{n−1} (i − 1) = Σ_{i=1}^{n−2} i = (n² − 3·n + 2) / 2   [3.2.2.c]
------------------------------------------------------------
• The number M of moves is at least 3 for each value of i (temp:= a[i], a[min]:= a[i], a[i]:= temp); as a result:
------------------------------------------------------------
M_min = 3·(n − 1)                                  [3.2.2.d]
------------------------------------------------------------
• This minimum becomes effective in the case of initially ordered keys.
• If the keys are initially in reverse order, Mmax can be determined using the empirical formula [3.2.2.e] [Wi76].
------------------------------------------------------------
M_max = n²/4 + 3·(n − 1)                           [3.2.2.e]
------------------------------------------------------------
• The value of the indicator Mavg is not the average of Mmin and Mmax.
• In order to determine Mavg we make the following deliberations:
• The algorithm scans a sequence containing m items, comparing each element with the minimal value so far detected and, if it is smaller than that minimum, performing an assignment.
• The probability that the second element is less than the first is 1/2; this is also the probability of a new assignment to the minimum.
• The chance for the third element to be less than the first two is 1/3.
• The chance for the fourth to be the smallest of the first four is 1/4, and so on.
• Therefore the total expected number of assignments to the minimum, for a sequence containing m items, is Hm − 1, where Hm is the m-th harmonic number [3.2.2.f] [Wi85]:
------------------------------------------------------------
Hm = 1 + 1/2 + 1/3 + ... + 1/m                     [3.2.2.f]
------------------------------------------------------------
• This value represents the expected number of assignments to the variable temp, because in the process of scanning a sequence of m items, in the inner FOR loop, temp is assigned whenever an item is found to be smaller than all its preceding items.
• We have to add to this value the constant 3 representing the assignments temp:= a[i], a[min]:= a[i] and a[i]:= temp.
• As a result, the average value of the total expected number of moves for one scan of a sequence containing m items is Hm + 2.
• It is demonstrated that the harmonic series is divergent, but we can calculate a partial sum using Euler's formula [3.2.2.g]:
------------------------------------------------------------
Hm ≈ ln m + γ + 1/(2·m) − 1/(12·m²) + 1/(120·m⁴)   [3.2.2.g]

where γ = 0.5772156649... is Euler's constant [Kn76].
------------------------------------------------------------
• For m sufficiently big, the value of Hm can be approximated by expression [3.2.2.h]:
------------------------------------------------------------
Hm ≈ ln m + γ                                      [3.2.2.h]
------------------------------------------------------------
• All we have discussed until now is valid for a single scan of a sequence of m keys (one cycle of the inner FOR loop).
• Because, in the sorting process, n sequences are scanned successively, having lengths m = n, n-1, n-2, ..., 1, each of them requiring on average Hm + 2 moves, the average number of moves Mavg is [3.2.2.i]:
------------------------------------------------------------
M_avg ≈ Σ_{m=1}^{n} (Hm + 2) ≈ Σ_{m=1}^{n} (ln m + γ + 2)
      = n·(γ + 2) + Σ_{m=1}^{n} ln m               [3.2.2.i]
------------------------------------------------------------
• The sum can be approximated using the integral calculus [3.2.2.j]:
------------------------------------------------------------
∫_1^n ln x · dx = x·(ln x − 1) |_1^n = n·ln(n) − n + 1   [3.2.2.j]
------------------------------------------------------------
• That leads to the final result [3.2.2.k]:
------------------------------------------------------------
M_avg ≈ n·(ln n + γ + 1) + 1 = O(n·ln n)           [3.2.2.k]
------------------------------------------------------------
• We may conclude that in general the algorithm of straight selection is to be preferred over straight insertion.
• However, in the cases in which the keys are initially sorted or almost sorted, straight insertion is still somewhat faster.
• The optimization of the sorting performance can be achieved by reducing the number of moves.
• Sedgewick [Se88] proposes such a variant in which, instead of storing each time the current minimum item in the variable temp, only its index is memorized, the effective move being performed only for the last minimum determined, after the inner FOR loop is consumed [3.2.2.l].
------------------------------------------------------------
{Optimized sorting by selection – Pascal variant}
PROCEDURE OptimizedSelection;
VAR i,j,min: TypeIndex;
    temp: TypeElement;
BEGIN
  FOR i:= 1 TO n-1 DO
    BEGIN                                          {3.2.2.l}
      min:= i;
      FOR j:= i+1 TO n DO
        IF a[j].key<a[min].key THEN min:= j;
      temp:= a[min]; a[min]:= a[i]; a[i]:= temp
    END {FOR}
END; {OptimizedSelection}
------------------------------------------------------------
/* Optimized sorting by selection – C variant */
void optimized_selection() {
    type_index i, j, min;
    type_element temp;
    for (i = 1; i <= n-1; i++) {
        min = i;
        for (j = i+1; j <= n; j++)                 /*3.2.2.l*/
            if (a[j].key < a[min].key) min = j;
        temp = a[min]; a[min] = a[i]; a[i] = temp;
    } /*for*/
} /*optimized_selection*/
------------------------------------------------------------
• Unfortunately, the experimental measurements on this algorithm do not reveal any improvement of the performance, even for large dimensions of the arrays to be sorted.
• The explanation: there is no difference in cost between a normal assignment and an assignment involving access to an indexed variable.

3.2.3 Sorting by Straight Exchange. Bubblesort and Shakersort

• The classification of a sorting method as by insertion, selection or exchange is seldom entirely clear-cut.
• Both previously discussed methods can also be viewed as exchange sorts.
• In this section, however, we present a method in which the exchange of two items is the dominant characteristic of the process.
• The subsequent algorithm of straight exchanging is based on the principle of comparing and exchanging pairs of adjacent items until all items are sorted.
• As in the previous method of straight selection, we make repeated passes over the array, each time sifting the least item of the remaining set to the left end of the array.
• If, for a change, we view the array in a vertical instead of a horizontal position, and -- with the help of some imagination -- the items as bubbles in a water tank with weights according to their keys, then each pass over the array results in the ascension of a bubble to its appropriate level of weight.
• That is the reason why this method is widely known as Bubblesort.
• The subsequent algorithm is shown in [3.2.3.a]:
------------------------------------------------------------
{Sorting by exchange: Bubblesort – Variant 1}
PROCEDURE Bubblesort;
VAR i,j: TypeIndex;
    temp: TypeElement;
BEGIN
  FOR i:= 2 TO n DO
    BEGIN                                          {3.2.3.a}
      FOR j:= n DOWNTO i DO
        IF a[j-1].key>a[j].key THEN
          BEGIN
            temp:= a[j-1]; a[j-1]:= a[j]; a[j]:= temp
          END {IF}
    END {FOR}
END; {Bubblesort}
------------------------------------------------------------
/* Sorting by exchange: Bubblesort – Variant 1 */
void bubblesort() {
    type_index i, j;
    type_element temp;
    for (i = 2; i <= n; i++) {                     /*3.2.3.a*/
        for (j = n; j >= i; j--)
            if (a[j-1].key > a[j].key) {
                temp = a[j-1]; a[j-1] = a[j]; a[j] = temp;
            }
    }
} /*bubblesort*/
------------------------------------------------------------
• The temporal scheme of the sorting by exchange algorithm is presented in figure 3.2.a.
• Based on this scheme, the algorithm performance estimation leads to O(n2).
Fig.3.2.a. Temporal scheme of the sorting by exchange algorithm: an outer FOR loop of n-1 iterations, each containing an inner FOR loop of up to n-1 iterations (1 comparison and 3 assignments each), giving O(n∗n) = O(n2).
• Three important elements can be noticed:
• (1) In many cases, the sorting process is finished before all the repetitions of the external FOR loop are consumed.
• In this case the remaining iterations have no effect, because the array is already sorted.
• An obvious technique for improving this algorithm is to remember whether or not any exchange has taken place during a pass.
• A last pass without further exchange operations is therefore necessary to determine that the algorithm may be terminated.
• In [3.2.3.b] appears a variant of sorting by exchange based on this observation. This variant is well known to programmers due to its simplicity.
------------------------------------------------------------
{Sorting by exchange: Bubblesort – Variant 2}
PROCEDURE Bubblesort1;
VAR i: TypeIndex;
    modified: boolean;
    temp: TypeElement;
BEGIN
  REPEAT
    modified:= false;
    FOR i:= 1 TO n-1 DO
      IF a[i].key>a[i+1].key THEN                  {3.2.3.b}
        BEGIN
          temp:= a[i]; a[i]:= a[i+1]; a[i+1]:= temp;
          modified:= true
        END
  UNTIL NOT modified
END; {Bubblesort1}
------------------------------------------------------------
/* Sorting by exchange: Bubblesort – Variant 2 */
typedef int boolean;
#define true  (1)
#define false (0)
void bubblesort1() {
    type_index i;
    boolean modified;
    type_element temp;
    do {
        modified = false;
        for (i = 1; i <= n-1; i++)
            if (a[i].key > a[i+1].key) {           /*3.2.3.b*/
                temp = a[i]; a[i] = a[i+1]; a[i+1] = temp;
                modified = true;
            }
    } while (modified);
} /*bubblesort1*/
------------------------------------------------------------
• (2) However, this improvement may itself be improved by remembering not merely the fact that an exchange took place, but the position k (index) of the last exchange.
• It is obvious that all pairs of adjacent items beyond this index k are already in the desired order.
• As a result, subsequent scans may be terminated at this index instead of having to proceed to the predetermined limit i.
• (3) The careful programmer notices, however, a peculiar asymmetry:
• A single misplaced bubble in the heavy end of an otherwise sorted array will sift into order in a single pass.
• For example, the array 12 18 22 34 65 67 83 04 is sorted by a single right-to-left pass of the Bubblesort in [3.2.3.a].
• Instead, a misplaced item at the light end will sink towards its correct position only one step in each pass.
• Thus the array 83 04 12 18 22 34 65 67 requires 7 passes for sorting.
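Improvement (2), for which the text gives no listing, can be sketched as follows. This is an illustrative sketch with plain int keys; note that, like variant 2, it scans left to right, so the heavy items bubble toward the right end (the mirror of the Pascal variant's direction):

```c
#include <assert.h>

/* Sketch of improvement (2): a bubblesort that remembers the index of
   the last exchange; all items beyond that index are already in order,
   so the next pass stops there. */
static void bubblesort_last_exchange(int a[], int n) {
    int limit = n - 1;                 /* scan pairs in a[0..limit] */
    while (limit > 0) {
        int last = 0;                  /* index of last exchange */
        for (int j = 0; j < limit; j++) {
            if (a[j] > a[j + 1]) {
                int temp = a[j]; a[j] = a[j + 1]; a[j + 1] = temp;
                last = j;
            }
        }
        limit = last;                  /* everything past 'last' is sorted */
    }
}
```

When a pass performs no exchange at all, last stays 0 and the loop terminates, so this variant also subsumes improvement (1).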
• This unnatural asymmetry suggests a third improvement: alternating the direction of consecutive passes.
• We appropriately call the resulting algorithm Shakersort [3.2.3.c].
------------------------------------------------------------
{Sorting by exchange – Variant 3: Shakersort}
PROCEDURE Shakersort;
VAR j,last,up,down: TypeIndex;
    temp: TypeElement;
BEGIN
  up:= 2; down:= n; last:= n;
  REPEAT
    FOR j:= down DOWNTO up DO                      {3.2.3.c}
      IF a[j-1].key>a[j].key THEN
        BEGIN
          temp:= a[j-1]; a[j-1]:= a[j]; a[j]:= temp;
          last:= j
        END;{FOR}
    up:= last+1;
    FOR j:= up TO down DO
      IF a[j-1].key>a[j].key THEN
        BEGIN
          temp:= a[j-1]; a[j-1]:= a[j]; a[j]:= temp;
          last:= j
        END;{FOR}
    down:= last-1
  UNTIL up>down {REPEAT}
END; {Shakersort}
------------------------------------------------------------
/* Sorting by exchange – Variant 3: Shakersort */
void shakersort() {
    type_index j, last, up, down;
    type_element temp;
    up = 2; down = n; last = n;
    do {
        for (j = down; j >= up; j--)               /*3.2.3.c*/
            if (a[j-1].key > a[j].key) {
                temp = a[j-1]; a[j-1] = a[j]; a[j] = temp;
                last = j;
            } /*for*/
        up = last+1;
        for (j = up; j <= down; j++)
            if (a[j-1].key > a[j].key) {
                temp = a[j-1]; a[j-1] = a[j]; a[j] = temp;
                last = j;
            } /*for*/
        down = last-1;
    } while (up <= down);
} /*shakersort*/
------------------------------------------------------------

3.2.3.1 Performance Analysis of Bubblesort and Shakersort

• The number of comparisons for Bubblesort is constant and has the value:
------------------------------------------------------------
C = Σ_{i=1}^{n−1} (i − 1) = (n² − 3·n + 2) / 2     [3.2.3.d]
------------------------------------------------------------
• The minimum, maximum and average values of the number of moves are:
------------------------------------------------------------
M_min = 0

M_max = 3·C = 3·(n² − 3·n + 2) / 2                 [3.2.3.e]

M_avg = 3·(n² − 3·n + 2) / 4
------------------------------------------------------------
• The performance analysis of Shakersort leads to C_min = n − 1.
• For the other indicators, Knuth arrives at an average number of passes proportional to n − k1·√n and an average number of comparisons C_avg = (1/2)·(n² − n·(k2 + ln n)) [Kn76].
• But we have to note that all the improvements mentioned above in no way affect the number of exchanges.
• They only reduce the number of redundant double checks.
• Unfortunately, an exchange of two items is generally a more costly operation than a comparison of keys; our clever improvements therefore have a much less profound effect than one would intuitively expect.
• The comparative analysis of the performance of sorting algorithms reveals the following conclusions:
• (1) Sorting by exchange is inferior in performance to sorting by insertion or selection, so it is not recommended.
• (2) The Shakersort algorithm is used with advantage in those cases in which it is known that the items are already almost in order -- a rare case in practice.
• It can be shown that the average distance that each of the n items has to travel during a sort is n/3 places.
• This figure provides a clue in the search for improved, i.e., more effective sorting methods.
• All straight sorting methods essentially move each item by one position in each elementary step.
• Therefore, they are bound to require on the order of n2 such steps.
• An effective improvement of performance must be based on the principle of moving items over greater distances in single leaps.
• Subsequently, three improved methods will be discussed, namely, one for each basic sorting method: insertion, selection, and exchange.

3.2.4 Insertion Sort by Diminishing Increment. Shellsort

• A refinement of the straight insertion sort was proposed by D. L. Shell in 1959.
• The idea of this method is explained and demonstrated on a standard example of eight items in figure 3.2.4.

34 65 12 22 83 18 04 67   (initial)
34 18 04 22 83 65 12 67   4-sort
04 18 12 22 34 65 83 67   2-sort
04 12 18 22 34 65 67 83   1-sort

Fig. 3.2.4.
Insertion sort by diminishing increment

• First, all items that are four positions apart are grouped and sorted separately.
• This process is called a 4-sort.
• In the example in figure 3.2.4, where there are eight items, 4 such groups are formed, each sorted group containing exactly two items separated by 4 positions.
• After this first pass, the items are regrouped into groups with items two positions apart and then sorted anew.
• This process is called a 2-sort.
• Finally, in a third pass, all items are sorted in an ordinary sort or 1-sort.
• We must underline that each k-sort is in fact an insertion sort in which the step is k and not 1 as in the ordinary sort by insertion.
• At first glance, this method, which requires several sorting passes, each of which involves all items, seems to introduce more work than it saves.
• However, a deeper analysis shows that each sorting pass over a chain either involves relatively few items or the items are already quite well ordered, so that comparatively few rearrangements are required.
• It is obvious that the method results in an ordered array, and it is fairly obvious that each pass profits from previous passes, since each i-sort combines groups sorted in the preceding j-sort.
• It is also obvious that any sequence of increments is acceptable, as long as the last one is unity, because in the worst case the last pass does all the work.
• It is, however, much less obvious, but practice demonstrates, that the method of diminishing increments yields even better results with increments other than powers of 2.
• The program presented in [3.2.4.b] is conceived for any sequence of t increments hi which fulfil the conditions [3.2.4.a].
------------------------------------------------------------
h1, h2, ..., ht, where ht = 1, hi > hi+1 and 1 ≤ i < t         [3.2.4.a]
------------------------------------------------------------
• The increments are stored in the array h.
• Each h-sort is implemented as an insertion sort using a sentinel in order to simplify the termination condition of the sorting process.
• Because each h-sort requires its own sentinel, to simplify the searching process the array a is extended to the left not by one position a[0] but by h[1] positions, that is, by a number of positions equal to the value of the largest increment.
------------------------------------------------------------
{Insertion sort by diminishing increment - Shellsort}
PROCEDURE Shellsort;
CONST t=4;
VAR i,j,step,s: TypeIndex;
    temp: TypeElement;
    m: 1..t;
    h: ARRAY[1..t] OF integer;
BEGIN
  {increments' assignment}
  h[1]:= 9; h[2]:= 5; h[3]:= 3; h[4]:= 1;
  FOR m:= 1 TO t DO
    BEGIN {s is the index of the current sentinel}
      step:= h[m]; s:= -step;
      FOR i:= step+1 TO n DO
        BEGIN
          temp:= a[i]; j:= i-step;
          IF s=0 THEN s:= -step;
          s:= s+1; a[s]:= temp;                    [3.2.4.b]
          WHILE temp.key<a[j].key DO
            BEGIN
              a[j+step]:= a[j]; j:= j-step {shift}
            END;{WHILE}
          a[j+step]:= temp {insertion of the item}
        END{FOR}
    END{FOR}
END; {Shellsort}
------------------------------------------------------------
/* Insertion sort by diminishing increment - variant C */
/* note: the sentinel slots use indices -h[0]+1..0, so in C the array
   a must provide h[0] extra elements in front of a[1], e.g. as a
   pointer offset into a larger buffer */
void shellsort()
{ enum { t = 4 };
  type_index i,j,step,s;
  type_element temp;
  unsigned char m;
  int h[t];
  /* increments' assignment */
  h[0]= 9; h[1]= 5; h[2]= 3; h[3]= 1;
  for(m=1; m<=t; m++)
  { /* s is the index of the current sentinel */
    step= h[m-1]; s= -step;
    for(i=step+1; i<=n; i++)
    { temp= a[i]; j= i-step;
      if (s==0) s= -step;
      s= s+1; a[s]= temp;                          /*[3.2.4.b]*/
      while (temp.key<a[j].key)
      { a[j+step]= a[j]; j= j-step;                /*shift*/
      } /*while*/
      a[j+step]= temp;                             /* insertion of the item */
    } /*for*/
  } /*for*/
} /*shellsort*/
/*--------------------------------------------------------*/

3.2.4.1 Performance Analysis of Shellsort

• The analysis of the shellsort algorithm poses some very difficult mathematical problems, many of which have not yet been solved.
• In particular, it is not known which choice of increments yields the best results.
• One surprising fact, however, is that they should not be multiples of each other.
• This will avoid the phenomenon evident in the example given above, in which each sorting pass combines two chains that before had no interaction whatsoever.
• For a higher efficiency of the sorting process, it is indeed desirable that interaction between the various chains takes place as often as possible.
• In fact, the following theorem holds: if a k-sorted sequence is i-sorted, then it remains k-sorted.
• That means the process of sorting by diminishing increments is cumulative.
• Knuth indicates that a reasonable choice of increments is one of the sequences [3.2.4.c] or [3.2.4.d] (to be applied in reverse order).
------------------------------------------------------------
1, 4, 13, 40, 121, ...
ht, ht-1, ..., hk, hk-1, ..., h1                               [3.2.4.c]
where hk-1 = 3·hk + 1, ht = 1 and t = ⌊log3 n⌋ - 1
------------------------------------------------------------
1, 3, 7, 15, 31, ...                                           [3.2.4.d]
where hk-1 = 2·hk + 1, ht = 1 and t = ⌊log2 n⌋ - 1
------------------------------------------------------------
• For the latter choice, mathematical analysis yields an effort proportional to n^1.2 for sorting n items with the Shellsort algorithm.
• Although this is a significant improvement over n², we will not expound further on this method, since even better algorithms are known.
• In [3.2.4.e] another implementation variant of shellsort is presented.
• This algorithm uses the increments generated by formula [3.2.4.c], where t is calculated as a function of the length of the array to be sorted in the first REPEAT loop [Se88].
• The array h is no longer necessary because, on the one hand, the current increment is automatically calculated at each iteration of the outer REPEAT loop and, on the other hand, the sentinel technique has been abandoned.
• In fact we have a sorting by straight insertion with variable step h.
------------------------------------------------------------
{Shellsort (Sedgewick variant)}
PROCEDURE Shellsort1;
VAR i,j,h: TypeIndex;
    temp: TypeElement;
BEGIN
  h:= 1;
  REPEAT h:= 3*h+1 UNTIL h>n;
  REPEAT
    h:= h DIV 3;
    FOR i:= h+1 TO n DO                            [3.2.4.e]
      BEGIN
        temp:= a[i]; j:= i;
        WHILE (j>h) AND (a[j-h].key>temp.key) DO
          BEGIN
            a[j]:= a[j-h]; j:= j-h
          END; {WHILE}
        a[j]:= temp
      END;{FOR}
  UNTIL h=1
END;{Shellsort1}
------------------------------------------------------------
/* Shellsort (Sedgewick variant) - C implementation */
void shellsort1()
{ type_index i,j,h;
  type_element temp;
  h= 1;
  do { h= 3*h+1; } while (!(h>n));
  do
  { h= h/3;
    for(i=h+1; i<=n; i++)                          /*[3.2.4.e]*/
    { temp= a[i]; j= i;
      while ((j>h) && (a[j-h].key>temp.key))
      { a[j]= a[j-h]; j= j-h;
      } /*while*/
      a[j]= temp;
    } /*for*/
  } while (!(h==1));
} /*shellsort1*/
/*--------------------------------------------------------*/

3.2.5. Heapsort

• The method of sorting by selection is based on the repeated selection of the least key among n items, then among the remaining n-1 items, and so on.
• Clearly, finding the least key among n items requires n-1 comparisons, finding it among n-1 items needs n-2 comparisons, etc.
• It is obvious that this selection sort can be improved if in each pass we retain more information than just the identification of the single least item.
• Thus, for instance, with n/2 comparisons it is possible to determine the smaller key of each pair of items.
• With another n/4 comparisons, the smallest of each pair of such smallest keys can be selected, and so on.
• Finally, using only n/2 + n/4 + ... + 4 + 2 + 1 = n-1 comparisons, we can construct a selection tree, as shown in Fig. 3.2.5.a, and identify the root as the desired least key.
• The selection tree is in fact a partially ordered binary tree.

(Fig. 3.2.5.a. Selection tree: a binary tournament over the keys 34, 65, 12, 22, 83, 18, 04, 67; each internal node holds the smaller of its two sons, so the root holds the least key 04.)

• How can we use this tree in the sorting process?
• (1) Extract the least key from the root of the tree.
• (2) Descend along the path marked by the least key and eliminate it by successively replacing it either by an empty hole at the bottom, or by the item on the alternative branch at intermediate nodes (see Figs. 3.2.5.b and 3.2.5.c).
• Again, the item which reaches the root of the tree is the least of the remaining items.
• (3) Repeat steps (1) and (2).

(Fig. 3.2.5.b. Selecting the path of the least key. Fig. 3.2.5.c. Refilling the holes: after removing 04, the item 12 rises to the root as the least of the remaining keys.)

• After n such iterations, the n keys are extracted in increasing order, the tree is empty and the sorting process is finished.
• We have to notice that each of the n selection steps requires only log2 n comparisons, that is, a number equal to the height of the tree.
• As a consequence, the whole sorting process requires:
• (1) n steps for the construction of the tree.
• (2) A number of elementary operations on the order of n·log2 n for the sorting itself.
• This is a considerable improvement over the straight sorting methods, which require an effort on the order of O(n²), and even over shellsort, which requires about n^1.2 steps.
• Naturally, the task of bookkeeping has become more elaborate, and therefore the complexity of the individual steps is greater in the tree sort method.
• After all, in order to retain the increased amount of information gained from the initial pass, some sort of tree data structure has to be created.
• Our next task is to find methods of organizing this information efficiently.
• The new data structure is desired to respect the following specification:
• (1) To eliminate the need for the holes that in the end populate the entire tree and are the source of many unnecessary comparisons.
• (2) To represent the tree of n items in n units of storage, instead of 2n-1 units as shown above (figures 3.2.5.a, b, c).
These goals are achieved by a method called heapsort by its inventor, J. Williams.
• It is plain that this method represents an exceptional achievement, providing a drastic improvement over more conventional tree sorting approaches.
• It is based on a special representation of a partially ordered binary tree named a heap.
• A heap is defined as a sequence of keys hl, hl+1, ..., hr which has the following properties [3.2.5.a]:
------------------------------------------------------------
h_i ≤ h_2i  and  h_i ≤ h_2i+1   for all i = l, ..., r/2        [3.2.5.a]
------------------------------------------------------------
• A heap can be assimilated with a partially ordered binary tree and represented as an array.
• For example, the heap h1, h2, ..., h15 can be assimilated with the binary tree in figure 3.2.5.d and represented as the array h based on the following technique:
• (1) The elements of the heap (tree) are successively numbered level by level, from top to bottom, and from left to right within each level.
• (2) The heap's elements are associated with the locations of an array h, so that the heap element hi is associated with location h[i] of the array.

(Fig. 3.2.5.d. Representation of a heap using a linear array: the root h1 occupies h[1]; the sons h2i and h2i+1 of each hi occupy h[2i] and h[2i+1], down to the leaves h8, ..., h15.)

• A heap has the property that its first element is the smallest of all the heap's elements, i.e. h1 = min(h1, ..., hn).
• Let us now assume that a heap with elements h_left+1, h_left+2, ..., h_right is given for some values left and right.
• A new element x is added at the left to form the extended heap h_left, ..., h_right.
• Figure 3.2.5.e (a) shows, for example, the initial heap h2...h7, and the same figure (b) shows the heap extended to the left by the element h1 = 34.
• The new heap is obtained by first putting x on top of the tree structure and then letting it sift down along the path of the smaller sons, which at the same time move up.
• In the given example, the value 34 is first exchanged with 04, then with 12, thus forming the tree shown in figure 3.2.5.e (b).
• It is easy to observe that the proposed method of sifting actually preserves the invariants [3.2.5.a] that define a heap.

(Fig. 3.2.5.e. Sifting a key in a heap: (a) the initial heap h2...h7 with the new element h1 = 34 placed on top; (b) the heap after 34 has sifted down along the path of the smaller sons, exchanging first with 04 and then with 12.)

• We now formulate this sifting algorithm as follows:
• i, j are the pair of indices denoting the items to be exchanged during each sift step.
• x is introduced at position h_left.
• The pseudocode form of the sifting algorithm appears in [3.2.5.b].
------------------------------------------------------------
*Sifting a key down in a heap - pseudocode variant
procedure Shift(left,right: TypeIndex) {heap limits}
  i:= left    {the current element}
  j:= 2*i     {the left son of the current element}
  temp:= h[i] {the sifting element}
  while there are levels in the heap (j<=right)
        and the final place was not found do
    *select the smaller son of the element h[i] (h[j] or h[j+1])
    if temp>selected_son then
      *move the selected son into the place of its father (h[i]:=h[j])
      *advance to the next level of the heap (i:=j; j:=2*i)    [3.2.5.b]
    else
      *the place was found; leave the loop
  *place temp at its final position in the heap (h[i]:=temp)
------------------------------------------------------------
• The Pascal procedure, respectively the C function, implementing the shift algorithm appears in [3.2.5.c].
------------------------------------------------------------
{Sifting a key down in a heap - Pascal variant}
PROCEDURE Shift(left,right: TypeIndex);
VAR i,j: TypeIndex;
    temp: TypeElement;
    ret: boolean;
BEGIN
  i:= left; j:= 2*i; temp:= h[i]; ret:= false;
  WHILE (j<=right) AND (NOT ret) DO
    BEGIN
      IF j<right THEN
        IF h[j].key>h[j+1].key THEN j:= j+1;
      IF temp.key>h[j].key THEN
        BEGIN
          h[i]:= h[j]; i:= j; j:= 2*i              [3.2.5.c]
        END
      ELSE ret:= true
    END;{WHILE}
  h[i]:= temp
END; {Shift}
------------------------------------------------------------
/* Sifting a key down in a heap - C variant */
void shift(type_index left, type_index right)
{ type_index i,j;
  type_element temp;
  boolean ret;
  i= left; j= 2*i; temp= h[i]; ret= false;
  while ((j<=right) && (!ret))
  { if (j<right)
      if (h[j].key>h[j+1].key) j= j+1;
    if (temp.key>h[j].key)
    { h[i]= h[j]; i= j; j= 2*i;                    /*[3.2.5.c]*/
    }
    else ret= true;
  } /*while*/
  h[i]= temp;
} /*shift*/
/*--------------------------------------------------------*/
• It is to be noticed that in fact a new abstract data type named heap has been defined.
• The ADT heap consists of the mathematical model described by a partially ordered binary tree, for which the specific operator shift(left,right) has been developed.
• A neat way to construct a heap in situ was suggested by R. W. Floyd, using the ADT heap and the shift operator defined above.
• Let us consider an array h1, ..., hn containing the n elements from which the heap will be built.
• Clearly, the elements h_n/2+1, ..., hn already form a heap, since no two indices i, j in this range satisfy j=2*i or j=2*i+1.
• These elements form what may be considered the bottom row of the associated binary tree, among which no ordering relationship is required.
• The heap is now extended to the left, whereby in each step a new element is included and properly positioned by a shift.
• Accordingly, considering that the initial array is stored in h, the fragment which generates a heap in situ is presented in [3.2.5.d].
------------------------------------------------------------
{Generating the heap phase}
left:= (n DIV 2)+1;
WHILE left>1 DO
  BEGIN                                            [3.2.5.d]
    left:= left-1; shift(left,n)
  END;{WHILE}
------------------------------------------------------------
• Once the heap has been generated, the next step is to sort the heap elements in situ, based on Williams' idea.
• For this purpose we use the same ADT heap and the shift operator.
• In order to obtain a full ordering among the elements, n shift steps have to follow, whereby after each step the least item may be picked off the top of the heap.
• Once more, the question arises where to store the emerging top elements, and whether or not an in situ sort is possible.
• Of course there is such a solution. In each step:
• (1) Interchange the last current component of the heap (say x) with the top element of the heap h[1].
• (2) Reduce the heap at its right end by one position.
• (3) Sift the new top of the heap h[1] down to its proper place in the heap.
• (4) Return to step (1) until the left limit of the heap is reached.
• The result is the array h ordered in reverse order.
• In terms of the shift operator this technique can be described as in [3.2.5.e].
------------------------------------------------------------
{Heap sorting phase}
right:= n;
WHILE right>1 DO
  BEGIN
    temp:= h[1]; h[1]:= h[right]; h[right]:= temp; [3.2.5.e]
    right:= right-1; shift(1,right)
  END;{WHILE}
------------------------------------------------------------
• The sorted keys are in reverse order; this can be remedied by reversing the sense of the comparison relations in the shift operator.
• The next algorithm illustrates the heapsort technique [3.2.5.f].
------------------------------------------------------------
{Heapsort - Pascal Variant}
PROCEDURE Heapsort;
VAR left,right: TypeIndex;
    temp: TypeElement;

  PROCEDURE Shift;
  VAR i,j: TypeIndex;
      ret: boolean;
  BEGIN
    i:= left; j:= 2*i; temp:= h[i]; ret:= false;
    WHILE (j<=right) AND (NOT ret) DO
      BEGIN
        IF j<right THEN
          IF h[j].key<h[j+1].key THEN j:= j+1;
        IF temp.key<h[j].key THEN
          BEGIN
            h[i]:= h[j]; i:= j; j:= 2*i
          END
        ELSE ret:= true
      END;{WHILE}
    h[i]:= temp
  END; {Shift}

BEGIN
  {Generating the heap phase}
  left:= (n DIV 2)+1; right:= n;
  WHILE left>1 DO
    BEGIN
      left:= left-1; Shift
    END;{WHILE}                                    [3.2.5.f]
  WHILE right>1 DO {Heap sorting phase}
    BEGIN
      temp:= h[1]; h[1]:= h[right]; h[right]:= temp;
      right:= right-1; Shift
    END
END; {Heapsort}
------------------------------------------------------------
/* Heapsort - C Variant */
static void shift1(type_index* left, type_element* temp, type_index* right)
{ type_index i,j;
  boolean ret;
  i= *left; j= 2*i; *temp= h[i]; ret= false;
  while ((j<=*right) && (!ret))
  { if (j<*right)
      if (h[j].key<h[j+1].key) j= j+1;
    if (temp->key<h[j].key)
    { h[i]= h[j]; i= j; j= 2*i;
    }
    else ret= true;
  } /*while*/
  h[i]= *temp;
} /*shift1*/

void heapsort()
{ type_index left,right;
  type_element temp;
  /* generating the heap */
  left= (n/2)+1; right= n;                         /*[3.2.5.f]*/
  while (left>1)
  { left= left-1;
    shift1(&left, &temp, &right);
  } /*while*/
  while (right>1)                                  /* sorting the heap */
  { temp= h[1]; h[1]= h[right]; h[right]= temp;
    right= right-1;
    shift1(&left, &temp, &right);
  } /*while*/
} /*heapsort*/
/*--------------------------------------------------------*/

3.2.5.1 Performance Analysis of Heapsort

• At first sight it is not evident that this method of sorting provides good results.
• After all, the large items are first sifted to the left before finally being deposited at the far right.
• The detailed analysis of this method reveals a different picture:
• (1) To generate the heap, n/2 shift steps are necessary.
• In each step, in the worst case, the items are sifted through log(n/2), log(n/2+1), ..., log(n-1) positions respectively, where the logarithm (to the base 2) is truncated to the next lower integer.
• (2) Subsequently, the sorting phase takes n-1 shifts, with at most log(n-1), log(n-2), ..., 1 moves each.
• (3) In addition, there are 3·(n-1) moves for stashing the sifted item away at the right.
• This argument shows that heapsort takes on the order of O(n·log n) steps even in the worst possible case [3.2.5.g].
• This excellent worst-case performance is one of the strongest qualities of heapsort.
------------------------------------------------------------
O(n/2·log2(n-1) + (n-1)·log2(n-1) + 3·(n-1)) = O(n·log2 n)     [3.2.5.g]
------------------------------------------------------------
• It is not at all clear in which cases the worst (or the best) performance can be expected.
• But generally heapsort seems to like initial sequences in which the items are more or less sorted in inverse order, and therefore it displays an unnatural behaviour.
• The heap creation phase requires zero moves if the inverse order is present.
• The average number of moves is approximately (1/2)·n·log n, that is, one move every two sorting steps, and the deviations from this value are relatively small.
• In a manner specific to the advanced sorting methods, small values of n (the number of elements) are not representative, the efficiency of the algorithm increasing with increasing n.
• In [3.2.5.h] a compact C variant of heapsort is presented.
------------------------------------------------------------
// Heapsort - variant 1 C
void shift(int left, int right)
{ // globals: int a[], int n
  int i=left, j=2*left, x=a[i-1], ret=0;
  while (j<=right && !ret)
  { if (j<right && a[j-1]<a[j]) j++;
    (x<a[j-1]) ? (a[i-1]=a[j-1], i=j, j=2*i) : (ret=1);
  }
  a[i-1]=x;
}                                                  [3.2.5.h]

void heapsort()
{ // globals: int a[], int n
  // generating the heap
  int left=n/2+1, right=n;
  while (left-1) shift(--left, n);
  // sorting the heap
  while (right-1)
  { int x=a[0]; a[0]=a[right-1]; a[right-1]=x;
    shift(1, --right);
  }
}
------------------------------------------------------------

3.2.6. Partition Sort. Quicksort

• After having discussed two advanced sorting methods based on the principles of insertion and selection, we introduce a third improved method, based on the principle of exchange.
• In view of the fact that bubblesort was on average the least effective of the three straight sorting algorithms, a relatively significant improvement factor should not be expected.
• Still, it comes as a surprise that the improvement based on exchanges to be discussed subsequently yields the best sorting method on arrays known so far.
• Its performance is so spectacular that its inventor, C.A.R. Hoare, called it Quicksort.
• The method is based on the recognition that exchanges should preferably be performed over large distances in order to be most effective.
• Quicksort is based on the following partitioning algorithm:
• Pick any item of the array a1, ..., an at random and call it x.
• Scan the array from the left until an item ai > x is found.
• Scan the array from the right until an item aj < x is found.
• Exchange the two items ai and aj.
• Continue this scan-and-swap process until the two scans meet somewhere in the middle of the array.
• The final result is that the array is now partitioned into a left part with keys less than (or equal to) x, and a right part with keys greater than (or equal to) x.
• This partitioning process is now formulated as a pseudocode algorithm [3.2.6.a].
------------------------------------------------------------
*Partitioning an array - pseudocode variant
procedure Partition {partition of the array a[left..right]}
  *select the element x (usually from the middle of the
   interval to be partitioned)
  repeat
    *find the first item a[i]>x, scanning the interval from left to right
    *find the first item a[j]<x, scanning the interval from right to left
    if i<=j then                                   [3.2.6.a]
      *exchange a[i] with a[j]
  until the two scans meet (i>j)
------------------------------------------------------------
• This partitioning process is implemented as a procedure in [3.2.6.b].
• Note that the relations > and < have been replaced by >= and <=, whose negations in the while clauses are < and >.
• With this change, x acts as a sentinel for both scans.
------------------------------------------------------------
{Procedure Partition - Pascal variant}
PROCEDURE Partition;
VAR x,temp: TypeElement;
BEGIN
  i:= 1; j:= n;
  x:= a[n DIV 2];                                  [3.2.6.b]
  REPEAT
    WHILE a[i].key<x.key DO i:= i+1;
    WHILE a[j].key>x.key DO j:= j-1;
    IF i<=j THEN
      BEGIN
        temp:= a[i]; a[i]:= a[j]; a[j]:= temp;
        i:= i+1; j:= j-1
      END
  UNTIL i>j
END; {Partition}
------------------------------------------------------------
• Subsequently, using the partition, the sorting process is simple:
• After a first partitioning of the array, apply the same process to both resulting partitions.
• Then to the partitions of the partitions, and so on.
• The process is finished when every partition consists of a single item only.
• This recipe is described schematically as follows [3.2.6.c]:
------------------------------------------------------------
*Quicksort - pseudocode variant
procedure QuickSort(left,right)
  *partition the interval left..right relative to its middle
  if there is a left partition then
    QuickSort(left,middle-1)                       [3.2.6.c]
  if there is a right partition then
    QuickSort(middle+1,right)
------------------------------------------------------------
• In [3.2.6.d] a Pascal variant and in [3.2.6.e] a C variant of Quicksort are presented.
------------------------------------------------------------
{Quicksort - Pascal variant}
PROCEDURE Quicksort;

  PROCEDURE Sort(left,right: TypeIndex);
  VAR i,j: TypeIndex;
      x,temp: TypeElement;
  BEGIN
    i:= left; j:= right;
    x:= a[(left+right) DIV 2];
    REPEAT
      WHILE a[i].key<x.key DO i:= i+1;
      WHILE x.key<a[j].key DO j:= j-1;             [3.2.6.d]
      IF i<=j THEN
        BEGIN
          temp:= a[i]; a[i]:= a[j]; a[j]:= temp;
          i:= i+1; j:= j-1
        END
    UNTIL i>j;
    IF left<j THEN Sort(left,j);
    IF i<right THEN Sort(i,right)
  END; {Sort}

BEGIN
  Sort(1,n)
END; {Quicksort}
------------------------------------------------------------
// Quicksort - C variant
void quicksort(int left, int right)
{ // globals: int a[], int n
  int i=left, j=right, x=a[(left+right)/2];
  do
  { while (a[i]<x) i++;
    while (a[j]>x) j--;
    if (i<=j)
    { int temp=a[i];                               /*[3.2.6.e]*/
      a[i]=a[j]; a[j]=temp;
      i++; j--;
    }
  } while (i<=j);
  if (left<j) quicksort(left,j);
  if (i<right) quicksort(i,right);
}
------------------------------------------------------------
• The procedure Quicksort activates itself recursively. Such use of recursion in algorithms is a very powerful tool and will be discussed further in Chap. 5.
• We will now show how this same algorithm can be expressed as a non-recursive procedure.
• Obviously, the solution is to express recursion as an iteration, whereby a certain amount of additional bookkeeping becomes necessary.
• The key to an iterative solution lies in maintaining a list of partitioning requests that have yet to be performed.
• After each step, two partitioning tasks arise.
• Only one of them can be attacked directly by the subsequent iteration; the other one is stacked away on that list as a partitioning request.
• It is, of course, essential that the list of requests is obeyed in a specific sequence, namely in reverse sequence.
• This implies that the first request listed is the last one to be obeyed, and vice versa.
• The list is in fact a pulsating stack.
• In [3.2.6.f] a scheme of this implementation is presented as pseudocode.
------------------------------------------------------------
*Quicksort, nonrecursive implementation - pseudocode variant
procedure NonRecursiveQuickSort
  *the bounds of the interval to be partitioned are stored in
   the stack as a partitioning request (starting the process)
  repeat
    *extract the request from the top of the stack,
     which becomes the CurrentInterval
    *reduce the stack by one position                          [3.2.6.f]
    repeat
      *partition the CurrentInterval
      if there is a right interval then
        *its bounds are stored on the top of the stack
         as a partitioning request
      *the left interval becomes the CurrentInterval
    until the CurrentInterval has length 1 or 0
  until the stack is empty
------------------------------------------------------------
• In the procedure which implements the algorithm presented above [3.2.6.g]:
• (1) A partitioning request is represented simply by a left and a right index specifying the bounds of the partition to be further partitioned.
• (2) The stack is modelled by an array of variable dimension called stack, and an index is which denotes the top of the stack.
• (3) The appropriate choice of the stack size m will be discussed during the analysis of Quicksort.
------------------------------------------------------------
{QuickSort.
Nonrecursive implementation - Pascal variant}
PROCEDURE NonRecursiveQuickSort;
CONST m = ...;
VAR i,j,left,right: TypeIndex;
    x,temp: TypeElement;
    is: 0..m;
    stack: ARRAY[1..m] OF
             RECORD left,right: TypeIndex END;
BEGIN
  is:= 1; stack[1].left:= 1; stack[1].right:= n; {starting}
  REPEAT {pick the request from the top of the stack}
    left:= stack[is].left; right:= stack[is].right; is:= is-1;
    REPEAT {partitioning CurrentInterval a[left..right]}
      i:= left; j:= right;
      x:= a[(left+right) DIV 2];
      REPEAT
        WHILE a[i].key<x.key DO i:= i+1;
        WHILE x.key<a[j].key DO j:= j-1;
        IF i<=j THEN                               [3.2.6.g]
          BEGIN
            temp:= a[i]; a[i]:= a[j]; a[j]:= temp;
            i:= i+1; j:= j-1
          END
      UNTIL i>j;
      IF i<right THEN
        BEGIN {the right partition is stored on the top of the stack}
          is:= is+1;
          stack[is].left:= i; stack[is].right:= right
        END;
      right:= j {the left partition becomes the CurrentInterval}
    UNTIL left>=right
  UNTIL is=0
END; {NonRecursiveQuickSort}
------------------------------------------------------------
/* QuickSort, nonrecursive implementation - C variant */
void qsort_nonrec()
{ enum { m = 15 };
  type_index i,j,left,right;
  type_element x,temp;
  unsigned char is;
  struct { type_index left,right; } stack[m];
  is= 1; stack[0].left= 1; stack[0].right= n;      /*starting*/
  do
  { /* pick the request from the top of the stack */
    left= stack[is-1].left; right= stack[is-1].right; is= is-1;
    do
    { /* partitioning current_interval a[left..right] */
      i= left; j= right;
      x= a[(left+right)/2];
      do
      { while (a[i].key<x.key) i= i+1;
        while (x.key<a[j].key) j= j-1;
        if (i<=j)                                  /*[3.2.6.g]*/
        { temp= a[i]; a[i]= a[j]; a[j]= temp;
          i= i+1; j= j-1;
        }
      } while (!(i>j));
      if (i<right)
      { /* the right partition is stored on the top of the stack */
        is= is+1;
        stack[is-1].left= i; stack[is-1].right= right;
      }
      right= j;  /* the left partition becomes current_interval */
    } while (!(left>=right));
  } while (!(is==0));
} /*NonRecursiveQuickSort*/
/*--------------------------------------------------------*/

3.2.6.1 Performance Analysis of Quicksort

• In order to analyse the
performance of Quicksort, we need to investigate the behaviour of the partitioning process first.
• To simplify matters we assume the following preconditions:
• (1) The set of keys to be partitioned consists of n distinct keys with the values {1,2,3,...,n}.
• (2) From these n keys, the key with value x has been selected for partitioning.
• As a result, the selected key occupies the x-th position in the ordered set of keys. This key is named the pivot.
• There are some questions to be answered:
• (1) After selecting the value x as pivot, what is the probability that a given key of the left part is exchanged?
• A key in the left part is exchanged if its value belongs to the right part, i.e. if it is greater than or equal to x.
• There are n-x+1 keys greater than or equal to x.
• As a consequence, the probability that a given key is exchanged is (n-x+1)/n (the ratio between favourable and possible cases).
• (2) After selecting the value (position) x as pivot, what is the number of exchanges necessary for partitioning the sequence of n keys?
• On the left side of the pivot there are x-1 positions that are scanned during partitioning.
• The expected number of exchanges when x is selected as pivot is therefore the product of the number of scanned positions (x-1) and the probability, determined above, that a key is exchanged, (n-x+1)/n [3.2.6.h].
------------------------------------------------------------
NrExc_x = (x-1)·(n-x+1)/n                                      [3.2.6.h]
------------------------------------------------------------
• (3) What is the average number of exchanges necessary for partitioning a sequence of n keys?
• When partitioning, any of the n keys can be selected as pivot; as a result, x can take any value between 1 and n.
• The average number of exchanges M necessary for partitioning a sequence of n keys can be calculated as follows:
• (1) For each value of x selected as pivot in the range 1 to n, NrExc_x is calculated using formula [3.2.6.h].
• (2) The obtained values NrExc_x are summed over all the values of x determined above.
• (3) The obtained sum is divided by the total number of keys n [3.2.6.i].
------------------------------------------------------------
M = (1/n)·Σ_{x=1}^{n} NrExc_x
  = (1/n)·Σ_{x=1}^{n} (x-1)·(n-x+1)/n                          [3.2.6.i]
  = n/6 - 1/(6·n) ≈ n/6
------------------------------------------------------------
• Presuming, in an idealised manner, that the median value is always selected as pivot, each partitioning process divides the array into two equal parts.
• We underline that the median is the key whose value lies in the middle of the partition when it is ordered.
• As a result, for sorting, a number of log n passes through all the elements of the array is necessary (fig. 3.2.6.a).
• For example, figure 3.2.6.a shows that for sorting an array with 15 elements, log2 15 ≈ 4 passes through all the array elements are necessary, that is, 4 steps of integral partitioning of the array.

Number of calls                   Partitioned elements per call
1 sorting call   (15 elements)    15
2 sorting calls  (7 elements)     7, 7
4 sorting calls  (3 elements)     3, 3, 3, 3
8 sorting calls  (1 element)      1, 1, 1, 1, 1, 1, 1, 1

Fig. 3.2.6.a. Principal description of the partitioning sort

• Unfortunately, the number of recursive calls is 15, equal to the number of elements.
• The total number of comparisons C is n·log n, because in each pass all the keys are compared [3.2.6.j].
• The number of moves M is (n/6)·log n because, in accordance with formula [3.2.6.i], partitioning n keys requires n/6 moves on average [3.2.6.j].
------------------------------------------------------------
C = n·log2 n
M = (1/6)·n·log2 n                                             [3.2.6.j]
------------------------------------------------------------
• These results are exceptional, but they refer to the optimum case, in which it was supposed that in each pass the median is selected as pivot.
• In fact, this event has a probability of only 1/n of happening.
• The big success of Quicksort is due to the surprising fact that its average performance, when the pivot is selected at random, is inferior to the optimal case by a factor of only 2⋅ln(2) ≈ 1.4, that is, by approximately 40% [Wi76].
• But Quicksort does have its pitfalls.
• (1) First of all, it performs only moderately well for small values of n, as do all advanced methods.
• Its advantage over the other advanced methods lies in the ease with which a straight sorting method can be incorporated to handle small partitions. This is particularly advantageous in the recursive version of the program.
• (2) Another problem is the worst case, in which performance decreases in a catastrophic manner.
• This situation appears when, in every partitioning step, the biggest (or the smallest) key is selected as pivot.
• In this case, each step splits the sequence of n elements into a left partition with n−1 elements and a right partition with only one element.
• As a result, n partitioning steps will be necessary instead of log2 n, which means a performance on the order of O(n²).
• Apparently, the crucial step is the selection of the pivot x [GG78].
• In our example program it is chosen as the middle element.
• Note that one might almost as well select either the first or the last element. In these cases, the worst case is the initially sorted array.
• Quicksort then shows a definite dislike for the trivial job and a preference for disordered arrays.
• In choosing the middle element, this strange characteristic of Quicksort is less obvious, because the initially reverse-sorted array becomes the optimal case.
• In fact, the average performance is also slightly better if the middle element is selected.
• Hoare suggests that the choice of x be made:
• (1) At random,
• (2) Or by selecting it as the median of a small sample of, say, three keys.
• Such a judicious choice hardly influences the average performance of Quicksort, but it improves the worst-case performance considerably.
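Hoare's second suggestion, the median of a small sample of three keys, can be sketched as follows. This is a hedged illustration: the helper name median_of_three and the plain integer keys are our own assumptions, not part of the chapter's program:

```c
#include <assert.h>

/* Median-of-three pivot selection: returns the index of the median of
   a[left], a[mid] and a[right]. Illustrative sketch; the key type and
   the helper name are not taken from the chapter's listings. */
static int median_of_three(const int a[], int left, int right)
{
    int mid = left + (right - left) / 2;
    int i = left, j = mid, k = right;
    int t;
    /* order the three indices i, j, k by the key values they point to */
    if (a[j] < a[i]) { t = i; i = j; j = t; }
    if (a[k] < a[j]) { t = j; j = k; k = t; }
    if (a[j] < a[i]) { t = i; i = j; j = t; }
    return j; /* index of the middle key of the three */
}
```

The returned index would be used in place of the middle element when choosing the splitting value x.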
• What are the consequences of the worst-case behaviour mentioned above for the performance of Quicksort?
• We have realized that each split results in a right partition of only a single element; the request to sort this partition is stacked for later execution [3.2.6.g].
• Consequently, the maximum number of requests, and therefore the total required stack size, is n.
• This is, of course, totally unacceptable.
• Note that we fare no better -- in fact even worse -- with the recursive version, because a system allowing recursive activation of procedures will have to store the values of local variables and parameters of all procedure activations automatically, and it will use an implicit stack for this purpose.
• The remedy lies in stacking the sort request for the longer partition and in continuing directly with the further partitioning of the smaller section.
• In this case, the size of the stack m can be limited to m = log2 n.
• The change necessary is localized in the section setting up new requests in [3.2.6.g]. It now reads [3.2.6.k].
-----------------------------------------------------------
{Reducing stack dimension in iterative Quicksort}
IF j-left < right-i THEN
  BEGIN
  IF i<right THEN
    BEGIN {the request for right partition sorting is stored in stack}
    is:= is+1; stack[is].left:= i; stack[is].right:= right
    END;
  right:= j {the left partition is sorted}
  END
ELSE                                                [3.2.6.k]
  BEGIN
  IF left<j THEN
    BEGIN {the request for left partition sorting is stored in stack}
    is:= is+1; stack[is].left:= left; stack[is].right:= j
    END;
  left:= i {the right partition is sorted}
  END;
------------------------------------------------------------
3.2.7 Finding the Median
• The median of n items is defined as that item which is less than (or equal to) half of the n items and which is larger than (or equal to) the other half of the n items.
• For example, the median of the sequence 14, 11, 75, 91, 17, 64, 12 is 17.
• The problem of finding the median is customarily connected with that of sorting, because the obvious method of determining the median is:
• (1) Sort the n items.
• (2) Pick the item in the middle.
• The partitioning technique yields a potentially much faster way of finding the median. The method to be displayed easily generalizes to the problem of finding the k-th smallest of n items.
• Finding the median represents the special case k = n/2.
• In the same context, k = 1 means finding the minimum, and k = n finding the maximum.
• The algorithm for determining the k-th smallest element, invented by C.A.R. Hoare, functions as follows.
• We presume that the considered elements are stored in the array a of dimension n.
• First, the partitioning operation of Quicksort is applied with limits left=1 and right=n (respectively 0 and n−1 in the C variant) and with a[k] selected as splitting value x.
• The resulting index values i and j satisfy the following relations [3.2.7.a].
-----------------------------------------------------------
1) x = a[k]
2) a[h] ≤ x for all h < i
3) a[h] ≥ x for all h > j                          [3.2.7.a]
4) i > j
------------------------------------------------------------
• There are three possible cases that may arise:
• (1) The splitting value x was too small; as a result, the limit between the two partitions is below the desired value k.
• The partitioning process has to be repeated upon the elements a[i],...,a[right] of the right partition (fig. 3.2.7.a (a)).
• (2) The chosen bound x was too large.
• The splitting operation has to be repeated on the left partition a[left],...,a[j] (fig. 3.2.7.a (b)).
• (3) j < k < i.
• In this case the element a[k] splits the array into two partitions in the specified proportions and therefore is the desired element (fig. 3.2.7.a (c)).
• The splitting process has to be repeated until case (3) arises.
Fig. 3.2.7.a. Finding the median: (a) repeating on the right partition; (b) repeating on the left partition; (c) a[k] is the desired element.
• The corresponding algorithm is presented as pseudocode in [3.2.7.b], respectively its first refinement in [3.2.7.c].
-----------------------------------------------------------
{Finding the median - pseudocode variant - refinement step 0}
procedure Median (left,right,k);
  while there exists a partition do                [3.2.7.b]
    *select pivot (the element in position k)
    *partition the current interval with respect to the pivot value
    if pivot position < k then
      *select the right partition
    if pivot position > k then
      *select the left partition
-----------------------------------------------------------
{Finding the median - refinement step 1}
left:= 1; right:= n;
WHILE left<right DO
  BEGIN
  x:= a[k];                                        [3.2.7.c]
  *a[left]...a[right] is partitioned
  IF j<k THEN left:= i;
  IF k<i THEN right:= j
  END;
-----------------------------------------------------------
• The associated program appears in [3.2.7.d] in Pascal, respectively in C.
-----------------------------------------------------------
{Finding the median – Pascal variant}
PROCEDURE Median (k: integer);
VAR left,right,i,j: TypeIndex;
    x,temp: TypeElement;
BEGIN
left:= 1; right:= n;
WHILE left<right DO
  BEGIN
  x:= a[k]; i:= left; j:= right;
  REPEAT {partitioning}                            [3.2.7.d]
    WHILE a[i]<x DO i:= i+1;
    WHILE x<a[j] DO j:= j-1;
    IF i<=j THEN
      BEGIN
      temp:= a[i]; a[i]:= a[j]; a[j]:= temp;
      i:= i+1; j:= j-1
      END
  UNTIL i>j;
  IF j<k THEN left:= i;
  IF k<i THEN right:= j
  END {WHILE}
END; {Median}
-----------------------------------------------------------
/* Finding the median – C variant */
void median (int k)
{ typeindex left,right,i,j;
  typeelement x,temp;
  left= 0; right= n-1;
  while (left<right)
  { x= a[k]; i= left; j= right;
    do { /*partitioning*/                          /*[3.2.7.d]*/
      while (a[i]<x) i++;
      while (x<a[j]) j--;
      if (i<=j)
      { temp= a[i]; a[i]= a[j]; a[j]= temp;
        i++; j--;
      }
    } while (!(i>j));
    if (j<k) left= i;
    if (k<i) right= j;
  }
}
------------------------------------------------------------
3.2.7.1 Performance Analysis of Finding the Median
• If we assume that on the average each
split halves the size of the partition in which the desired median lies, then the number of necessary comparisons C is of the order O(n) [3.2.7.e].
-----------------------------------------------------------
C = n + n/2 + n/4 + ... + 1 = 2⋅n − 1              [3.2.7.e]
------------------------------------------------------------
• The number of moves M can't be bigger than C, and is usually smaller; it is therefore also O(n).
• The values of C and M estimated for finding the median explain its superiority over the straightforward method of sorting the entire set of candidates before selecting the k-th, where the best performance is of order O(n ⋅ log n).
• In the worst case, however, each partitioning step reduces the size of the set of candidates only by 1, resulting in a required number of comparisons of order O(n²).
• Again, there is hardly any advantage in using this algorithm if the number of elements is small, say, fewer than 10 [Wi85].
3.2.8 Binsort. Distribution Counting
• In general the algorithms based on advanced sorting methods need O(n ⋅ log n) steps to sort n elements.
• We must underline that this is true if:
• There is no supplementary information about the keys to be sorted, except the fact that over the set of keys an ordering relation is defined, based on which we can establish whether a key is smaller than, equal to or greater than another.
• It will be shown that sorting can be faster than O(n ⋅ log n) if:
• (1) There is other information about the keys to be sorted.
• (2) We renounce the "in situ" constraint.
• For example, consider the following problem:
• Sort a file of n records whose keys are distinct integers between 1 and n.
• If a and b are arrays with n elements, array a storing the keys to be sorted, then the sorting can be done in one pass into array b [3.2.8.a].
-----------------------------------------------------------
FOR i:= 1 TO n DO b[a[i].key]:= a[i]; {O(n)}       [3.2.8.a]
------------------------------------------------------------
• The principle of the method:
• The final place of a[i] is determined and a[i] is placed exactly in its place in b.
• The sorting process requires O(n) steps.
• But the result is correct only if there is a single element with key x for each value x in the interval [1,n].
• A second element with the same key x would be stored in the same location b[x], destroying the previous element.
• In [3.2.8.b] the "in situ" variant of the algorithm is presented.
• Given an array a of dimension n, whose elements have the keys 1,...,n in some order:
• Its elements are scanned (external FOR loop).
• If the element a[i] has the key j, then a[i] is exchanged with a[j].
• Each exchange places the element from location i exactly in its place in the ordered array.
• This technique is illustrated in [3.2.8.b].
-----------------------------------------------------------
FOR i:= 1 TO n DO
  WHILE a[i].key<>i DO                             [3.2.8.b]
    BEGIN
    temp:= a[i]; a[i]:= a[a[i].key]; a[temp.key]:= temp
    END;
------------------------------------------------------------
• The sequences [3.2.8.a, b] illustrate the sorting method named binsort, a sorting process in which a "bin" is created to hold all the records with a certain key value [AHU85].
• The binsort technique is not complicated:
• (1) Each element to be sorted is examined.
• (2) It is introduced in the bin corresponding to its key value.
• In [3.2.8.a] the bins are the array elements b[1],...,b[n], and b[i] is the bin for key value i.
• This simple and efficient technique is based on some prior requirements:
• (1) The limited domain of the keys, [1,n].
• (2) The uniqueness of each key.
• If the second requirement is not respected, in the general case we must be prepared to store more than one record with the same key in a bin.
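The in-place exchange loop [3.2.8.b] can be rendered in C as follows. This is a hedged sketch: the record type and the function name are our own simplification, not the chapter's declarations; the array is used 1-based, with a[0] unused, to mirror the Pascal fragment:

```c
#include <assert.h>

/* Simplified record type: only the key matters here (assumption). */
typedef struct { int key; /* other components */ } type_element;

/* In-place binsort for n records whose keys are exactly 1..n, in the
   spirit of fragment [3.2.8.b]: each exchange puts one element into
   its final position a[a[i].key]. */
static void binsort_in_situ(type_element a[], int n)
{
    for (int i = 1; i <= n; i++)
        while (a[i].key != i) {        /* a[i] is not yet in place */
            type_element temp = a[i];  /* swap a[i] with a[temp.key] */
            a[i] = a[temp.key];
            a[temp.key] = temp;
        }
}
```

Each element is moved into place at most once per exchange, so the loop performs O(n) exchanges in total.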
• This problem can be solved if we string the bins together, concatenating the bins in proper order, using list structures for this purpose.
• In this case, the performance of the method is not altered in a significant manner, the sorting effort becoming O(n+m), where n is the total number of elements and m the number of key values.
• That is the reason why this method is the starting point of many list sorting techniques [AHU85].
• For example, a method for solving such situations is "distribution counting" [Se88].
• The problem is formulated as follows:
• Sort a file of n records whose keys are integers between 0 and m−1.
• If m is not too large, an algorithm called distribution counting can be used to solve this problem.
• The algorithm's idea is:
• (1) In a first pass, the number of keys with each value in array a is counted.
• (2) The values of the counts are adjusted (each count is replaced by the cumulated sum of the counts up to and including it).
• (3) Then, in a second pass, using the counts, the records are moved into their ordered positions in array b.
• The distribution counting algorithm is shown in [3.2.8.c].
• To simplify, we suppose that array a contains only keys.
-----------------------------------------------------------
{Sorting by distribution counting – Pascal variant}
TYPE TypeKey = 0..m-1;
     TypeArray = ARRAY [1..n] OF TypeKey;
VAR count: ARRAY[0..m-1] OF integer;
    a,b: TypeArray;
    i: TypeIndex; j: integer;
BEGIN
FOR j:= 0 TO m-1 DO count[j]:= 0;
FOR i:= 1 TO n DO count[a[i]]:= count[a[i]]+1;
FOR j:= 1 TO m-1 DO count[j]:= count[j-1]+count[j];
FOR i:= n DOWNTO 1 DO
  BEGIN
  b[count[a[i]]]:= a[i];                           [3.2.8.c]
  count[a[i]]:= count[a[i]]-1
  END;
FOR i:= 1 TO n DO a[i]:= b[i]
END; {Sorting by distribution counting}
-----------------------------------------------------------
/* Sorting by distribution counting – C variant */
enum {n = 10, m = 10};
typedef unsigned typekey;
typedef unsigned typeindex;
typedef typekey typearray[n];
typekey count[m];
typearray a,b;
typeindex i,j;

int main(int argc, const char* argv[])
{
  for (j= 0; j<=m-1; j++) count[j]= 0;
  for (i= 1; i<=n; i++) count[a[i-1]]= count[a[i-1]]+1;
  for (j= 1; j<=m-1; j++) count[j]= count[j-1]+count[j];
  for (i= n; i>=1; i--)
  { b[count[a[i-1]]-1]= a[i-1];                    /*[3.2.8.c]*/
    count[a[i-1]]= count[a[i-1]]-1;
  }
  for (i= 1; i<=n; i++) a[i-1]= b[i-1];
  return 0;
} /* Sorting by distribution counting */
/*--------------------------------------------------------*/
• How this code works:
• The counts associated with the keys are stored in the array count of dimension m.
• The first FOR loop initializes the counts to 0.
• The second FOR loop counts the occurrences of each key.
• The third FOR loop adjusts the values in the array count.
• In the fourth FOR loop, array a is scanned from the end to the beginning, and each scanned key is introduced in its proper ordered place in array b, using the count values stored in array count.
• For each key i introduced in array b, the corresponding counter count[i] is decremented. As a result, the keys with identical values finally appear in their corresponding bin in their initial order.
• The last FOR loop moves all the sorted keys from array b back into the initial array a (if necessary).
• The next example illustrates the working manner of the distribution counting algorithm.
------------------------------------------------------------
• Example 3.2.8. In order to sort a sequence of keys using distribution counting, the following steps are executed:
• (1) Suppose that initially array a has the following content (letter with its key value, positions 1..14):
  a: a(0) b(1) b(1) a(0) c(2) a(0) d(3) a(0) b(1) b(1) a(0) d(3) d(3) a(0)
• (2) The count array is initialized:
  count: 0→0  1→0  2→0  3→0
• (3) The keys of array a are counted:
  count: 0→6  1→4  2→1  3→3
• (4) The values in the array count are adjusted:
  count: 0→6  1→10  2→11  3→14
• (5) The array a is scanned from the end to the beginning, and each element is placed in array b in the position indicated by its own value in the array count.
• After an element is stored in array b, its specific count in array count is decremented.
  b: a a a a a a b b b b c d d d   (positions 1..14)
-----------------------------------------------------------------
3.2.8.1 Performance Analysis of Binsort
• Although some passes are necessary in order to sort the array, the overall performance of the distribution counting algorithm is O(n).
• Besides being fast, this method has the advantage of being a stable sorting method.
• That is the reason why this method represents the basis of many radix-type sorting methods.
3.2.9 Radix Sorting
• Usually the sorting methods conceive the sorting keys as entities and are defined in terms of the basic operations of "comparing" two keys and "exchanging" two records [Se92].
• Most of the methods we have studied can be described in terms of these two fundamental operations.
• For many applications, however, it is possible to take advantage of the fact that the keys can be thought of as numbers from some restricted range.
• Sorting methods which take advantage of the digital properties of these numbers are called radix sorts.
• Radix sorting algorithms:
• (1) Treat the keys as numbers represented in a base-M number system, for different values of M (the radix).
• (2) Work with individual digits of the numbers.
• For example, consider an imaginary problem where a clerk must sort a pile of cards with three-digit numbers printed on them. One reasonable way to proceed is:
• To make ten distinct piles, one for the numbers less than 100, one for the numbers between 100 and 199, etc., and place the cards in the piles.
• Then deal with the piles individually, by using the same method on the next digit.
• Then apply the same method to the last digit, or use some simpler method if there are only a few cards.
• This is a simple example of a radix sort with M=10.
• We'll examine this and some other methods in detail in this chapter.
• Of course, with most computers it's more convenient to work with M=2 (or some power of 2) rather than M=10.
• Usually radix sort methods use binary numbers.
• In general, given a key represented as a binary number, the fundamental operation needed for radix sorts is extracting a contiguous set of bits from the number.
• For example:
• We assume that the keys are represented by ten-bit binary numbers.
• To extract the leading 2 bits from a ten-bit binary number:
• (1) Shift the bit representation of the number eight bit positions to the right.
• (2) Do a bitwise "and" with the mask 0000000011.
• These operations:
• (1) Can be implemented using the bit processing facilities offered by the programming languages.
• (2) Can be simulated using the integer DIV and MOD operators.
• In the example presented above, if x is the binary number, the leading 2 bits are extracted by the expression (x DIV 2^8) MOD 2^2.
• For our implementations of sorting algorithms we will define a specific operator bits(x,k,j: integer): integer, which combines the two mentioned operators, returning the j bits that appear k bit positions from the right in x.
• A possible implementation in C of this operator is shown in [3.2.9.a].
-----------------------------------------------------------
unsigned bits(unsigned x, int k, int j)
{ return (x>>k) & ~(~0<<j); }                      [3.2.9.a]
------------------------------------------------------------
• There are two basic approaches to implementing radix sort.
• (1) The first approach examines the bits in the keys from left to right and is named radix exchange sort.
• It is based on the fact that the outcome of a "comparison" between two keys depends only on the value of the bits at the first position at which they differ, reading from left to right.
• Thus, all keys with leading bit 0 are moved before all keys with leading bit 1 in the sorted file, generating two partitions of keys, one with leading bit 0, the other with leading bit 1.
• In each partition, all keys with second bit 0 are moved before all keys with second bit 1, and so forth.
• The left-to-right radix sort, which is called radix exchange sort, sorts by systematically dividing up the keys in this way.
• (2) The second basic method is called straight radix sort.
• It examines the bits in the keys from right to left.
• It is based on an interesting principle that reduces a sort on b-bit keys to b sorts on 1-bit keys.
• We'll see how this can be combined with distribution counting to produce a sort that runs in linear time under quite generous assumptions.
3.2.9.1. Radix Exchange Sort
• Radix exchange sort of an array a is based on the following idea:
• The elements of the array a are sorted so that all those whose keys begin with a 0 bit come before all those whose keys begin with a 1 bit.
• This process generates two partitions of array a.
• Then, the two partitions are sorted independently, using the same method on the second bit of the keys, resulting in 4 partitions.
• The four partitions are sorted on the 3rd bit, resulting in 8 partitions, and so on [3.2.9.1.a].
• This working manner suggests a recursive implementation.
• As a result, radix exchange sort can be implemented as a partitioning process:
• The array is scanned from left to right until a key which starts with 1 is found.
• The array is scanned from right to left until a key which starts with 0 is found.
• The two elements are exchanged.
• The scan continues from the left, respectively from the right, until the scanning indices meet, generating two partitions.
• The whole procedure is reiterated for the second bit in each of the two resulting partitions, then for the 3rd bit in each of the 4 resulting partitions, and so on.
• This leads to a recursive sorting procedure that is very similar to Quicksort [3.2.9.1.a].
-----------------------------------------------------------
{Radix exchange sort – Pascal variant}
PROCEDURE RadixExchange (left,right: TypeIndex; b: INTEGER);
{left,right – the current limits of the sorted array}
{b – the length in bits of the sorting key}
VAR i,j: TypeIndex;
    t: TypeElement;
BEGIN                                              [3.2.9.1.a]
IF (right>left) AND (b>=0) THEN
  BEGIN
  i:= left; j:= right; b:= b-1;
  REPEAT
    WHILE (bits(a[i].key,b,1)=0) AND (i<j) DO i:= i+1;
    WHILE (bits(a[j].key,b,1)=1) AND (i<j) DO j:= j-1;
    t:= a[i]; a[i]:= a[j]; a[j]:= t
  UNTIL j=i;
  IF bits(a[right].key,b,1)= 0 THEN j:= j+1;
  {if the last tested bit is 0, the limit of the partition is adjusted}
  RadixExchange(left,j-1,b-1);
  RadixExchange(j,right,b-1)
  END {IF}
END; {RadixExchange}
-----------------------------------------------------------
/* Radix exchange sort – C variant */
void radix_exchange (typeindex left, typeindex right, int b)
/* left,right – the current limits of the sorted array */
/* b – the length in bits of the sorting key */
{ typeindex i,j;
  typeelement t_;                                  /*[3.2.9.1.a]*/
  if ((right>left) && (b>=0))
  { i= left; j= right; b= b-1;
    do {
      while ((bits(a[i].key,b,1)==0) && (i<j)) i= i+1;
      while ((bits(a[j].key,b,1)==1) && (i<j)) j= j-1;
      t_= a[i]; a[i]= a[j]; a[j]= t_;
    } while (!(j==i));
    if (bits(a[right].key,b,1)== 0) j= j+1;
    /* if the last tested bit is 0, the limit of the partition is adjusted */
    radix_exchange(left,j-1,b-1);
    radix_exchange(j,right,b-1);
  }
}
------------------------------------------------------------
• For example, assume that an array a[1..15] contains integer keys less than 2^5 (represented with 5 bits).
• The binary representation of the keys used for this example is a simple five-bit code, with the i-th letter of the alphabet represented by the binary value of the number i.
• The call RadixExchange(1,15,5) will sort the array as presented in figure 3.2.9.1.
Fig. 3.2.9.1. Radix exchange sort example: the keys A S O R T I N G E X A M P L E (00001, 10011, 01111, ...) are successively partitioned on bit 5 down to bit 1, reaching the sorted sequence A A E E G I L M N O P R S T X.
• One serious potential problem for radix exchange sort, not brought out in this example, is that degenerate partitions, with all keys having the same value for the bit being used, can happen frequently.
• For example, this arises commonly in real arrays when small numbers with many leading zeros are being sorted.
• This situation is also very frequent in the case of characters represented with 8 bits.
• Concerning performance, radix exchange sort sorts n keys of b bits using about n⋅b bit comparisons.
• In other words, radix exchange sort is linear in the number of bits of a key.
• For a normal distribution of the bits of the keys, radix exchange sort is slightly faster than Quicksort [Se88].
3.2.9.2 Straight Radix Sort
• An alternative radix sorting method is to examine the bits from right to left.
• This is the method used by old computer-card sorting machines:
• A deck of cards was run through the machine 80 times, once for each column, proceeding from right to left.
• This method is named straight radix sort.
• It is based on the following idea:
• The keys are sorted considering one bit at a time, the bits being examined from right to left.
• The sort on bit i consists in extracting all the elements having 0 in bit position i of the key and placing them before the keys having 1 in the same bit position.
• When the sorting process reaches bit i coming from the right, the keys are already sorted on their last i−1 bits.
• It is not easy to be convinced that the method works; in fact it doesn't work at all unless the one-bit partitioning process is stable.
• Due to this requirement, the normal exchange cannot be used, because it is not a stable sorting method.
• In fact we have to stably sort an array with only two key values: 0 and 1.
• Distribution counting can be successfully used for this purpose.
• Let's consider the distribution counting sorting algorithm where:
• (1) We assume M=2.
• (2) a[i] is replaced by bits(a[i],k,1); that means that from the current key a[i] the single bit placed at distance k from the end of the key is extracted.
• In this way we obtain a stable sort of array a on bit k from the right, the result being stored in the temporary array t.
• (3) The distribution counting procedure is iterated for each bit of the keys, from right to left, that is for k = 0,1,2,...,b−1.
• For an integral sorting, the algorithm requires b passes, where b is the bit length of the key.
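One such stable one-bit distribution-counting pass can be sketched in C. This is a hedged illustration: the function name and the use of plain unsigned keys are our assumptions, not the chapter's declarations:

```c
#include <assert.h>

/* One stable pass of straight radix sort: distribute the n keys of a[]
   into t[] according to bit k (0 = rightmost bit), zeros first, while
   preserving the relative order of keys with equal bits. */
static void one_bit_pass(const unsigned a[], unsigned t[], int n, int k)
{
    int count[2] = {0, 0};
    /* count the keys with bit k equal to 0 and to 1 */
    for (int i = 0; i < n; i++)
        count[(a[i] >> k) & 1u]++;
    count[1] += count[0];            /* adjust: cumulated counts */
    /* scan from the end so that the pass is stable */
    for (int i = n - 1; i >= 0; i--)
        t[--count[(a[i] >> k) & 1u]] = a[i];
}
```

Iterating this pass for k = 0, 1, ..., b−1, copying t back into a between passes, yields the straight radix sort described above.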
• In order to increase the performance of the sorting process, it is not indicated to take M=2; it is more convenient for M to have a value as big as possible.
• The result is a reduced number of passes.
• If m bits are processed at once:
• The sorting time decreases, by reducing the number of passes.
• The dimension of the count array increases to the value m1 = 2^m, because it has to store counts for the 2^m bit configurations which can be obtained with m bits.
• Thus, straight radix sort becomes little more than a generalization of distribution counting sort.
• The corresponding algorithm for sorting array a[1..n] is presented in [3.2.9.2.a].
• The keys have a length of b bits.
• The keys are scanned from right to left, processing m bits at once. In consequence, b/m passes will be necessary.
• For sorting, the supplementary array t[1..n] is necessary.
• The algorithm works only if b is a multiple of m, because the sorting algorithm divides the bits of the keys into an integer number of parts of equal dimensions, which are processed one at a time.
• If we take m=b we obtain the distribution counting sort algorithm.
• If we take m=1 we obtain the straight radix sort algorithm.
• The proposed implementation of this algorithm sorts a into array t and, after each sorting step, re-copies array t into a (last FOR loop).
• This situation can be avoided by concatenating, in the same algorithm, two instances of the sorting procedure: one sorting from a into t, the other from t into a.
-----------------------------------------------------------
{Straight radix sort – Pascal variant}
PROCEDURE StraightRadix;
VAR i,j,pass: integer;
    count: ARRAY[0..m1-1] OF integer; {m1 = 2^m}
BEGIN
FOR pass:= 0 TO (b DIV m)-1 DO                     [3.2.9.2.a]
  BEGIN
  FOR j:= 0 TO m1-1 DO count[j]:= 0;
  FOR i:= 1 TO n DO
    count[bits(a[i].key,pass*m,m)]:= count[bits(a[i].key,pass*m,m)]+1;
  FOR j:= 1 TO m1-1 DO count[j]:= count[j-1]+count[j];
  FOR i:= n DOWNTO 1 DO
    BEGIN
    t[count[bits(a[i].key,pass*m,m)]]:= a[i];
    count[bits(a[i].key,pass*m,m)]:= count[bits(a[i].key,pass*m,m)]-1
    END;
  FOR i:= 1 TO n DO a[i]:= t[i]
  END {FOR}
END; {StraightRadix}
-----------------------------------------------------------
/* Straight radix sort – C variant */
void straight_radix()
{ int i,j,pass;
  int count[m1]; /* m1 = 2^m */
  for (pass= 0; pass<=(b/m)-1; pass++)             /*[3.2.9.2.a]*/
  { for (j= 0; j<=m1-1; j++) count[j]= 0;
    for (i= 1; i<=n; i++)
      count[bits(a[i-1].key,pass*m,m)]= count[bits(a[i-1].key,pass*m,m)]+1;
    for (j= 1; j<=m1-1; j++) count[j]= count[j-1]+count[j];
    for (i= n; i>=1; i--)
    { t[count[bits(a[i-1].key,pass*m,m)]-1]= a[i-1];
      count[bits(a[i-1].key,pass*m,m)]= count[bits(a[i-1].key,pass*m,m)]-1;
    }
    for (i= 1; i<=n; i++) a[i-1]= t[i-1];
  }
}
------------------------------------------------------------
• Straight radix sort performance:
• It sorts n elements having keys of b bits in b/m passes.
• Disadvantages:
• (1) It uses supplementary memory space for 2^m counts.
• (2) It uses a buffer for sorting having the same dimension as the initial array.
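As noted above, the copy-back after each pass can be avoided by alternating the roles of the two arrays. A self-contained C sketch of the scheme of [3.2.9.2.a], specialized to m=8 (one byte per pass) on 32-bit keys, follows; all names are our own and the code is only an illustration of the technique, not the chapter's listing:

```c
#include <assert.h>

/* Straight radix sort on 32-bit unsigned keys with m = 8 bits per pass,
   hence b/m = 4 passes, alternating between a[] and the buffer t[]
   instead of copying back after every pass. Illustrative sketch. */
static void straight_radix32(unsigned a[], unsigned t[], int n)
{
    unsigned *src = a, *dst = t;
    for (int pass = 0; pass < 4; pass++) {
        int count[256] = {0};
        for (int i = 0; i < n; i++)                /* count byte values */
            count[(src[i] >> (8 * pass)) & 0xFF]++;
        for (int j = 1; j < 256; j++)              /* cumulate counts   */
            count[j] += count[j - 1];
        for (int i = n - 1; i >= 0; i--)           /* stable distribute */
            dst[--count[(src[i] >> (8 * pass)) & 0xFF]] = src[i];
        unsigned *tmp = src; src = dst; dst = tmp; /* swap the roles    */
    }
    /* after an even number of passes the result is back in a[] */
}
```

Because the number of passes is even, the sorted sequence ends up in the original array without any final copy.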
• Radix sorts have the following properties [Se88]:
• Property 1. Radix exchange sort examines about n⋅log n bits.
• Property 2. Both radix sorts examine fewer than n⋅b bits when sorting n b-bit keys.
• Property 3. Straight radix sort can sort n elements with b-bit keys in b/m passes, using extra space for 2^m counters and a buffer for rearranging the keys.
3.2.9.4 Linear Sort
• The straight radix sort implementation given in the previous section makes b/m passes through the array.
• By making m large, we get a very efficient sorting method, as long as we have M = 2^m words of memory available.
• A reasonable choice is to make m about one-fourth of the word size, b/4.
• If the architectural organization of the computer presumes 32-bit words and we choose m=8, the radix sort requires 4 distribution counting passes.
• In this case the sort is practically linear.
• The keys are treated as base-M numbers, and each base-M digit of each key is examined, but there are only four digits per key.
• Usually, values such as m=4 or m=8 are suitable for the actual architectural organization of the computing systems.
• This presumes reasonable dimensions for the count array: 16 locations for m=4, respectively 256 locations for m=8.
• As a result, each pass is linear, and because for 32-bit keys only 4 passes (for m=8), respectively 8 passes (for m=4), are necessary, the sorting process is practically linear.
• This method is one of the fastest sorting methods.
• The choice between Quicksort and radix sort is a difficult decision.
• It is likely to depend not only on features of the application, such as key, record, and file size, but also on features of the programming and machine environment that relate to the efficiency of access and use of individual bits.
• The major disadvantages of the method are:
• (1) The necessity of a uniform distribution of the keys.
• (2) The necessity of some supplementary memory space for the counters and for the sorting area.
3.2.10 Sorting Arrays with Big Dimension Elements.
Indirect Sorting
• When the arrays to be sorted have big-dimension elements, the moving overhead is high.
• In such situations it is much more convenient that:
• (1) The sorting algorithm operates indirectly over the original array, through an index array.
• (2) The original array is sorted later, in only one pass.
• The idea of the method for sorting arrays with big-dimension elements is:
• We consider an array a[1..n] with big-dimension elements.
• We associate with a an index array p[1..n].
• Initially the index array is assigned p[i]:= i for i = 1..n.
• The algorithm used for sorting is modified to access the elements of array a using the syntactic construction a[p[i]] instead of a[i].
• The access to a[i] through p[i] will be realized only for comparisons, the necessary moves being done only in array p.
• In other words, the index array will be sorted, so that p[1] will store the index of the smallest element of array a, p[2] the index of the next element, etc.
• In this way, the overhead of moving big-dimension elements is avoided.
• In fact we achieve an indirect sort of array a.
• In principle such a sort is described in figure 3.2.10.
Fig. 3.2.10. Indirect sort example
  Array a before sorting (positions 1..10):        32 22 0 1 5 16 99 4 3 50
  Index array p before sorting:                     1  2 3 4 5  6  7 8 9 10
  Array a after sorting (unchanged):               32 22 0 1 5 16 99 4 3 50
  Index array p after sorting:                      3  4 9 8 5  6  2 1 10 7
• This idea can be practically applied to any sort algorithm.
• As an example we present an indirect sort based on the sorting by straight insertion of array a [3.2.10.a].
-----------------------------------------------------------
{Indirect sorting based on sorting by straight insertion method – Pascal variant}
VAR a: ARRAY[0..n] OF TypeElement;
    p: ARRAY[0..n] OF TypeIndex;

PROCEDURE IndirectInsertion;
VAR i,j,v: TypeIndex;
BEGIN                                            [3.2.10.a]
  FOR i:= 0 TO n DO p[i]:= i;
  FOR i:= 2 TO n DO
    BEGIN
      v:= p[i]; a[0]:= a[i]; j:= i-1;
      WHILE a[p[j]].key > a[v].key DO
        BEGIN
          p[j+1]:= p[j]; j:= j-1
        END; {WHILE}
      p[j+1]:= v
    END {FOR}
END; {IndirectInsertion}
-----------------------------------------------------------
/* Indirect sorting based on sorting by straight insertion
   method – C variant (n is assumed to be a constant) */
type_element a1[n+1];
type_index p[n+1];

void indirect_insertion()
{
  type_index i,j,v;                              /*[3.2.10.a]*/
  for (i= 0; i <= n; i++) p[i]= i;
  for (i= 2; i <= n; i++) {
    v= p[i]; a1[0]= a1[i]; j= i-1;
    while (a1[p[j]].key > a1[v].key) {
      p[j+1]= p[j]; j= j-1;
    }
    p[j+1]= v;
  }
}
------------------------------------------------------------
• As we can notice, excepting the sentinel assignment, the accesses to the elements of array a are performed only for comparisons.
• In many applications it is enough to obtain only the array p, without permuting the elements of a.
• For example, for printing, the elements can be listed in order using the index array.
• If the moving is absolutely necessary, the simplest manner to do it is to use a second array b.
• If this is not possible, a specific "in situ" moving procedure can be used for re-placing the elements of array a in order [3.2.10.b].
-----------------------------------------------------------
{"In situ" re-placing algorithm – Pascal variant}
PROCEDURE MovingInSitu;
VAR i,j,k: TypeIndex; t: TypeElement;
BEGIN
  FOR i:= 1 TO n DO
    IF p[i]<>i THEN                              [3.2.10.b]
      BEGIN
        t:= a[i]; k:= i;
        REPEAT
          j:= k; a[j]:= a[p[j]];
          k:= p[j]; p[j]:= j
        UNTIL k=i;
        a[j]:= t
      END {IF}
END; {MovingInSitu}
-----------------------------------------------------------
/* "In situ" re-placing algorithm – C variant */
void moving_insitu()
{
  type_index i,j,k;
  type_element t_;
  for (i=1; i<=n; i++)
    if (p[i]!=i) {
      t_= a1[i]; k= i;
      do {
        j= k; a1[j]= a1[p[j]];
        k= p[j]; p[j]= j;
      } while (!(k==i));
      a1[j]= t_;                                 /*[3.2.10.b]*/
    }
}
------------------------------------------------------------
• In particular applications, the viability of this technique depends on the relative length of keys and records.
• The method is not recommended for small dimension records, because it requires a supplementary memory area for p, and supplementary time for the indirect comparisons.
• For big dimension records, the indirect sort is justified, without effective permuting of the elements.
• For very big dimension records, the method is recommended, including the later permutation of the elements [Se88].
3.2.11 Conclusions Concerning Array Sorting
• To conclude this presentation of sorting methods, we shall try to compare their effectiveness [Wi76].
• If n denotes the number of items to be sorted, C and M shall again stand for the number of required key comparisons and item moves, respectively.
• In figure 3.2.11 are shown the analytical formulas of the minimum, average and maximum values of the indicators C and M for the three straight sorting methods, averaged over all n!
permutations of n items.

Method                  Indicator  Min            Avg              Max
Insertion               C          n-1            (n²+n-2)/4       (n-1)·n/2
                        M          3·(n-1)        (n²+11·n-12)/4   (n²+5·n-6)/2
Selection               C          (n²-3·n+2)/2   (n²-3·n+2)/2     (n²-3·n+2)/2
                        M          3·(n-1)        n·(ln n + 0.57)  n²/4 + 3·(n-1)
Exchange (bubblesort)   C          (n²-3·n+2)/2   (n²-3·n+2)/2     (n²-3·n+2)/2
                        M          0              3·(n²-3·n+2)/4   3·(n²-3·n+2)/2
Fig. 3.2.11. Formulas for straight sorting methods performance

• For the advanced sorting methods, the formulas are much more complicated.
• The essential facts are that the computational effort needed is c·n^1.2 in the case of Shellsort and c·n·log(n) in the cases of Heapsort and Quicksort, where c denotes an appropriate coefficient.
• These formulas, which coarsely approximate the performance of the sorting methods as a function of the number of elements n, permit classifying the sorting methods as:
• (1) Primitive or straight sorting methods, for which the sorting effort is of order n².
• (2) Advanced or "logarithmic" methods, for which the sorting effort is of order n·log(n).
• The analysis and experimental measurements realized over the presented sorting methods revealed the following conclusions [Wi76]:
• (1) The improvement of binary insertion over straight insertion is marginal indeed, and even negative in the case of an already existing order.
• (2) Bubblesort is definitely the worst sorting method among all compared. Its improved version Shakersort is still worse than straight insertion and straight selection, except in the evident case of sorting a sorted array.
• (3) Quicksort beats Heapsort by a factor of 2 to 3. It sorts the inversely ordered array with speed practically identical to the one that is already sorted.
• It was ascertained that the ratio between the dimension of an element's information content and the dimension of its key has no significant influence over the sorting performance of the described methods.
• However, if this ratio is high, the following supplementary conclusions can be drawn:
• The performance of straight selection improves, placing this technique in first place among the straight sorting methods.
• Bubblesort is in this case, too, the lowest-performance method, even losing ground. Shakersort obtains better performance in the case of an array ordered in reverse order.
• Quicksort consolidates its first position, being by far the fastest array sorting method.
• We have to underline that only general techniques have been considered, without any supplementary information concerning the keys to be sorted.
• If such information is available, respectively if the keys fulfil certain a priori conditions and/or a supplementary memory area can be used by giving up the in-situ constraint, the sorting performance can be improved.
• In this category can be included the binsort technique, which limits the range of the keys, its variant distribution counting tending towards the linear performance O(n).
• In the same category can be included straight and exchange radix sorts, both competing strongly with the quicksort technique.
• Using supplementary, properly dimensioned arrays for storing the distribution counters of the keys, these methods approach the performance of linear sorting, O(n).
• For arrays with big dimension records, indirect sorting techniques can be used in order to improve the sorting performance in a substantial manner.
• As a final remark, we underline the fact that we have discussed only the sorting of array data structures, although some of the presented techniques can successfully be used for sorting other data structure types.
3.3 Sorting Sequences. External Sorting
• The sorting algorithms presented in the preceding section are inapplicable if the amount of data to be sorted does not fit into a computer's main store, but is, for instance, represented on a peripheral, sequential storage device such as a tape or a magnetic disk.
• In this case we describe the data as a sequence structure, whose characteristic is that at each moment one and only one component is directly accessible.
• This is a very severe restriction compared to the possibilities offered by the array structure, and therefore different sorting techniques have to be used.
• The most important one of these techniques is sorting by merging.
3.3.1 Sorting by Merging
• Merging or collating means combining two or more ordered sequences into a single, ordered sequence by repeated selection among the currently accessible components.
• Merging is a much simpler operation than sorting, and it is used as an auxiliary operation in the more complex process of sequential sorting.
• One way of sorting a sequence a on the basis of merging, called straight merging, is the following:
• (1) Split the sequence a into two halves, called b and c.
• (2) Merge b and c by combining single items into ordered pairs, obtaining a new sequence a.
• (3) Call the merged sequence a, and repeat steps (1) and (2), this time merging ordered pairs into ordered quadruples.
• (4) Repeat the previous steps, merging quadruples into 8-tuples, and continue doing this, each time doubling the lengths of the merged sub-sequences, until the entire sequence is ordered.
• For example, being given the sequence:
  34 65 12 22 83 18 04 67
• The execution of step (1) leads to the two halves of a:
  34 65 12 22
  83 18 04 67
• Merging single items into ordered pairs, we obtain a new sequence a:
  34 83 | 18 65 | 04 12 | 22 67
• Splitting the sequence again:
  34 83 | 18 65
  04 12 | 22 67
• And merging pairs into quadruples we obtain:
  04 12 34 83 | 18 22 65 67
• Splitting again:
  04 12 34 83
  18 22 65 67
• Merging the two quadruples into an 8-tuple, the sequence is sorted:
  04 12 18 22 34 65 67 83
• We have to notice that the access to the components of the sequences is strictly sequential.
• Each operation that treats the entire set of data once is called a phase.
• The sub-process that by repetition constitutes the sort process is called a pass or a stage.
• In the above example the sort took three passes, each pass consisting of a splitting phase and a merging phase.
• In order to perform the sort, three sequences are needed; the process is therefore called a three-sequence merge.
• More correctly, this is in fact a non-balanced three-sequence merge.
3.3.1.1 Non-balanced Three Sequences Merging
• Non-balanced three-sequence merging illustrates the sorting process presented earlier.
• The principle schema of this process is shown in figure 3.3.1.1.
Fig. 3.3.1.1. Non-balanced three-sequence merging (a is split onto b and c; b and c are merged back onto a)
• The continuous lines represent the merging phase and the dashed lines the splitting phase.
• To begin, the data structures and the principle schema of the algorithm are presented in [3.3.1.1.a, b].
-----------------------------------------------------------
{Non-balanced 3 sequences merging – Data structures}
TYPE TypeElement = RECORD
       key: TypeKey;
       {other fields}                            [3.3.1.1.a]
     END;
     TypeSequence = FILE OF TypeElement;
VAR a,b,c: TypeSequence;
-----------------------------------------------------------
{Non-balanced 3 sequences merging – Refinement step 0}
PROCEDURE Merging3Sequences;
  p:= 1; {n-tuple length}                        [3.3.1.1.b]
  REPEAT
    *split; {distributes a onto b and c}
    *merge; {merges from b and c onto a}
    p:= 2*p
  UNTIL k=1; {k is the n-tuple counter}
------------------------------------------------------------
• We can notice that each pass, which consists of an iteration of the REPEAT loop, contains two phases:
• (1) A splitting phase, which distributes the n-tuples of sequence a onto the two sequences b and c.
• (2) A merging phase, which merges the n-tuples of sequences b and c into n-tuples of doubled length on sequence a.
• Variable p, initialized with value 1, specifies the dimension of the current n-tuples, a dimension which is doubled in each pass.
• As a result, the total number of passes will be ⌈log2(n)⌉.
• Variable k counts the number of n-tuples generated by the merging process.
• The sorting process is finalized when only one n-tuple results (k=1).
• Further, the sorting algorithm will be developed using the "stepwise refinement" methodology [Wi76].
• The developing process consists of successive refinements, in an iterative and incremental manner, of the two mentioned phases.
• In [3.3.1.1.c] appears the first refinement step of the split phase, and in [3.3.1.1.d] the refinement of the statement "write an n-tuple of length p on sequence d".
-----------------------------------------------------------
{Split procedure - Refinement step 1}
PROCEDURE Split(p: integer);
{distributes the n-tuples of a onto b and c
 p - n-tuple length}
  RESET(a); REWRITE(b); REWRITE(c);
  WHILE NOT Eof(a) DO
    BEGIN                                        [3.3.1.1.c]
      *write an n-tuple on b
      *write an n-tuple on c
    END;
-----------------------------------------------------------
PROCEDURE WriteNtuple(d: TypeSequence);
{writes an n-tuple of elements of length p on sequence d
 The elements are read from sequence a}
  i:= 0; {element counter}
  WHILE (i<p) AND NOT Eof(a) DO
    BEGIN
      *read(a,x);                                [3.3.1.1.d]
      *write(d,x);
      i:= i+1
    END;
------------------------------------------------------------
• Variable i represents the element counter, taking values between 0 and p.
• The writing process is finished when the number of read elements is equal to p, or when the end of the sequence is reached.
• The merge phase refinement is presented in [3.3.1.1.e].
-----------------------------------------------------------
{Merge procedure – Refinement step 1}
PROCEDURE Merge(p: integer; VAR k: integer);
{p - n-tuple length, k - n-tuple counter}
  Rewrite(a); Reset(b); Reset(c); k:= 0;         [3.3.1.1.e]
  *initialize merge
  *read into x respectively into y the first element from b
   respectively from c {look-ahead}
  REPEAT
    *merge one n-tuple from b with one n-tuple from c onto a
    *increment k by 1
  UNTIL EndProc_b AND EndProc_c;
  Close(a); Close(b); Close(c);
------------------------------------------------------------
• Input variable p represents the length of the merged n-tuples, and k is the n-tuple counter.
• The merge process (the REPEAT loop) is finished when sequences b and c are ended.
• Due to implementation particularities of the files, some considerations are necessary:
• (1) Eof(f) is set to true when the last element of file f is read.
• (2) Reading from a file with Eof set to true produces an error.
• (3) From the merge algorithm's point of view, the end of a file's processing does not coincide with the setting of Eof to true, because at that moment the last read element hasn't been processed yet.
• To overcome such constraints, the look-ahead technique is used.
• The look-ahead technique consists of introducing a delay between the moment of reading and the moment of processing an element.
• So, at each moment the element read in the previous step is processed, and a new element is read.
• To implement this technique, for each file involved in the sorting process two supplementary variables are defined:
• (1) A TypeElement variable which stores the current element.
• (2) A Boolean variable EndProc whose true value signifies the end of the processing of the file's last element.
• The refinement of the phrase "merge one n-tuple from b with one n-tuple from c onto a, and increment k by 1", which presumes the look-ahead technique, is presented in [3.3.1.1.f].
• The specific variables attached to sequences b and c are x and y, respectively EndProc_b and EndProc_c.
-----------------------------------------------------------
{merge one n-tuple from b with one n-tuple from c onto a,
 and increment k by 1}
i:= 0; {element counter for b}
j:= 0; {element counter for c}
WHILE (i<p) AND (j<p) AND NOT EndProc_b AND NOT EndProc_c DO
  BEGIN
    IF x.key<y.key THEN
      BEGIN
        *write(a,x); i:= i+1; *read(b,x)         [3.3.1.1.f]
      END
    ELSE
      BEGIN
        *write(a,y); j:= j+1; *read(c,y)
      END
  END; {WHILE}
*copy the rest of the n-tuple from b to a (if it exists)
*copy the rest of the n-tuple from c to a (if it exists)
k:= k+1;
------------------------------------------------------------
• In [3.3.1.1.g] an implementation variant of the non-balanced 3-sequence merging sort algorithm is presented.
-----------------------------------------------------------
{Non-balanced 3 sequences merging sort algorithm implementation}
PROCEDURE Merge3Sequences;
VAR a,b,c: TypeSequence;
    p,k: integer;

  PROCEDURE Split(p: integer);
  VAR x: TypeElement;

    PROCEDURE WriteNtuple(VAR d: TypeSequence);
    VAR i: integer;
    BEGIN {WriteNtuple}
      i:= 0;
      WHILE (i<p) AND (NOT Eof(a)) DO
        BEGIN
          Read(a,x); Write(d,x); i:= i+1
        END {WHILE}
    END; {WriteNtuple}                           [3.3.1.1.g]

  BEGIN {Split}
    Reset(a); Rewrite(b); Rewrite(c);
    WHILE NOT Eof(a) DO
      BEGIN
        WriteNtuple(b); WriteNtuple(c)
      END; {WHILE}
    Close(a); Close(b); Close(c)
  END; {Split}

  PROCEDURE Merge(p: integer; VAR k: integer);
  VAR i,j: integer;
      x,y: TypeElement;
      EndProc_b,EndProc_c: Boolean;
  BEGIN {Merge}
    Reset(b); Reset(c); Rewrite(a); k:= 0;
    EndProc_b:= Eof(b); EndProc_c:= Eof(c);
    IF NOT EndProc_b THEN Read(b,x); {look-ahead}
    IF NOT EndProc_c THEN Read(c,y); {look-ahead}
    REPEAT
      i:= 0; j:= 0; {merge of an n-tuple}
      WHILE (i<p) AND (j<p) AND NOT EndProc_b AND NOT EndProc_c DO
        BEGIN
          IF x.key < y.key THEN
            BEGIN
              Write(a,x); i:= i+1;
              IF Eof(b) THEN EndProc_b:= true ELSE Read(b,x)
            END
          ELSE
            BEGIN
              Write(a,y); j:= j+1;
              IF Eof(c) THEN EndProc_c:= true ELSE Read(c,y)
            END
        END; {WHILE}
      {copy the rest of the n-tuple from b to a}
      WHILE (i<p) AND NOT EndProc_b DO
        BEGIN
          Write(a,x); i:= i+1;
          IF Eof(b) THEN EndProc_b:= true ELSE Read(b,x)
        END; {WHILE}
      {copy the rest of the n-tuple from c to a}
      WHILE (j<p) AND NOT EndProc_c DO
        BEGIN
          Write(a,y); j:= j+1;
          IF Eof(c) THEN EndProc_c:= true ELSE Read(c,y)
        END; {WHILE}
      k:= k+1
    UNTIL EndProc_b AND EndProc_c;
    Close(a); Close(b); Close(c)
  END; {Merge}

BEGIN {Merge3Sequences}
  p:= 1;
  REPEAT
    Split(p);    {phase (1)}
    Merge(p,k);  {phase (2)}
    p:= p*2
  UNTIL k=1
END; {Merge3Sequences}
---------------------------------------------------------
3.3.1.2 Balanced Merging
• Actually, the splitting phases do not contribute to the sort, since they in no way permute the items; in a sense they are unproductive, although they constitute half of all copying operations.
• They can be eliminated altogether by combining the split and the merge phases.
• Instead of merging into a single sequence, the output of the merge process is immediately redistributed onto two sequences, which constitute the sources of the subsequent pass (figure 3.3.1.2.a).
Fig. 3.3.1.2.a. Balanced merge with 4 sequences
• In contrast to the previous two-phase merge sort, this method is called a single-phase merge or a balanced merge.
• It is evidently superior because only half as many copying operations are necessary; the price for this advantage is a fourth sequence.
• We shall develop a merge program in detail and initially let the data be represented as an array which, however, is scanned in strictly sequential fashion.
• A later version of merge sort will then be based on the sequence structure, allowing a comparison of the two programs and demonstrating the strong dependence of the form of a program on the underlying representation of its data structures.
• A single array may easily be used to model two sequences if it is regarded as double-ended.
• Instead of merging from two source files, we may pick items off the two ends of the array.
• To model balanced merge sorting we will use two such arrays, named SOURCE and DESTINATION respectively.
• Thus, the general form of the combined merge-split phase can be illustrated as shown in figure 3.3.1.2.b.
Fig. 3.3.1.2.b. Model for balanced merging (items are merged from the two ends i and j of SOURCE and distributed onto the two ends k and L of DESTINATION)
• The destination of the merged items is switched after each ordered pair in the first pass, after each ordered quadruple in the second pass, etc., thus evenly filling the two destination sequences, represented by the two ends of a single array.
• After each pass, the two arrays interchange their roles: the source becomes the new destination, and vice versa.
• A further simplification of the program can be achieved by joining the two conceptually distinct arrays into a single array of doubled size. Thus, the data will be represented by:
  a: ARRAY[1..2*n] OF TypeElement;
• In a, indices i and j denote the two source items, whereas k and L designate the two destinations.
• The initial sequence consists of the items a[1],...,a[n].
• Clearly, a Boolean variable up is needed to denote the direction of the data flow.
• If up is true, this shall mean that in the current pass the source components a[1],...,a[n] will be moved up to the destination a[n+1],...,a[2*n].
• Whereas ~up will indicate that the source a[n+1],...,a[2*n] will be transferred down into the destination a[1],...,a[n].
• The value of up strictly alternates between consecutive passes.
• And, finally, a variable p is introduced to denote the length of the sub-sequences to be merged.
• Its value is initially 1, and it is doubled before each successive pass.
• To simplify matters somewhat, we shall assume that n is always a power of 2.
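As a concrete illustration of this model, the following minimal C sketch (plain int keys, 0-based indices and a fixed N — all assumptions of this sketch, not of the text) performs the combined merge-split phase for p=1: ordered pairs are formed from the two ends of the source half and distributed alternately onto the two ends of the destination half.

```c
#include <assert.h>

#define N 8  /* number of items; assumed a power of 2, as in the text */

/* One combined merge-split pass with p = 1 on the doubled array:
   the source is a[0..N-1], the destination a[N..2N-1].  i and j read
   the two ends of the source; k (advanced by h = +1 or -1) writes
   alternately to the two ends of the destination. */
void merge_split_pass_p1(int a[2 * N])
{
    int i = 0, j = N - 1;        /* source indices (both ends)      */
    int k = N, L = 2 * N - 1;    /* destination indices (both ends) */
    int h = 1;                   /* current destination increment   */
    int m = N;                   /* number of items left to merge   */
    int t;

    while (m > 0) {
        /* merge the two 1-tuples a[i] and a[j] into an ordered pair */
        if (a[i] <= a[j]) { a[k] = a[i]; k += h; a[k] = a[j]; }
        else              { a[k] = a[j]; k += h; a[k] = a[i]; }
        k += h; i++; j--; m -= 2;
        h = -h;                  /* switch destination ends ... */
        t = k; k = L; L = t;     /* ... by exchanging k and L   */
    }
}
```

Applied to the keys 34 65 12 22 83 18 04 67 of the earlier example, the destination half receives the ordered pairs 34 67 and 12 18 at its lower end, and 04 65 and 22 83 (stored downward) at its upper end. Note that the pairing differs from the three-sequence example, because here the two ends of the array are paired with each other.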
• Thus, the first version of the straight merge program appears as pseudocode in [3.3.1.2.a] and its first refinement step in [3.3.1.2.b]:
-----------------------------------------------------------
{Balanced merge with 4 sequences – pseudocode variant}
PROCEDURE BalancedMerge
  up: boolean; p: integer;                       [3.3.1.2.a]
  up:= true; p:= 1;
  repeat
    if up then
      left <- source; right <- destination;
    else
      right <- source; left <- destination;
    *p-length sequences from the two ends of the source are
     merged and distributed alternately onto the two ends of
     the destination;
    up:= not up; p:= 2*p
  until p=n;
------------------------------------------------------------
{Balanced merge with 4 sequences – Refinement step 1}
PROCEDURE BalancedMerge;
VAR i,j,k,L: index;
    up: boolean; p: integer;
BEGIN
  up:= true; p:= 1;
  REPEAT {index initialization}
    IF up THEN
      BEGIN
        i:= 1; j:= n;        {source}
        k:= n+1; L:= 2*n     {destination}
      END
    ELSE                                         [3.3.1.2.b]
      BEGIN
        k:= 1; L:= n;        {source}
        i:= n+1; j:= 2*n     {destination}
      END;
    *merge p-tuples from sequences i and j, alternately,
     into sequences k and L respectively
    up:= NOT up; p:= 2*p
  UNTIL p=n
END; {BalancedMerge}
------------------------------------------------------------
• The merge process is in fact the REPEAT loop, cycling for p assuming values from 1 to n.
• At each new pass p is doubled and up is switched.
• Inside a pass:
• Depending on the value of up, the source/destination indices are assigned.
• The p-length tuples of the source sequences are merged into double-length tuples and stored in the destination sequence.
• The next refinement step refines the statement "merge p-tuples from sequences i and j, alternately, into sequences k and L respectively".
• Evidently, the merge pass involving n items is itself a sequence of merges of sequences, i.e. of p-tuples which are merged into 2·p-tuples.
• Between every such partial merge the destination is switched from the lower to the upper end of the destination array, or vice versa, to guarantee an equal distribution onto both destinations.
• If the destination of the merged items is the lower end of the destination array, then the destination index is k, and k is incremented after each move of an item.
• If they are to be moved to the upper end of the destination array, the destination index is L, and it is decremented after each move.
• In order to simplify the actual merge statement, we choose the destination to be designated by k at all times, switching the values of the variables k and L after each p-tuple merge, and denote the increment to be used at all times by h, where h is either 1 or -1.
• These design discussions lead to the following refinement [3.3.1.2.c]:
-----------------------------------------------------------
{merge p-tuples from sequences i and j, alternately,
 into sequences k and L respectively}
h:= 1; m:= n; {m = number of elements to be merged}
REPEAT
  q:= p; r:= p; m:= m-2*p;                       [3.3.1.2.c]
  *merge q elements from the i-source with r elements from
   the j-source; the destination index is k, with increment h
  h:= -h;
  *interchange the destination indices (k and L)
UNTIL m=0;
------------------------------------------------------------
• Regarding [3.3.1.2.c] we can specify:
• q and r, respectively, are the lengths of the sequences to be merged.
• As a general rule, at the beginning of the merging r=q=p. In the final stage of the merge, they can be modified if n is not a power of 2.
• m is the number of items to be merged in the current pass.
• Initially m=n, and it is decreased by 2*p after each merge of two p-tuples (m:= m-2*p).
• The merge process is finished when m=0.
• In the next refinement step the actual merge statement is to be formulated.
• Here we have to keep in mind that the tail of the one sub-sequence which is left non-empty after the merge has to be appended to the output sequence by simple copying operations [3.3.1.2.d].
-----------------------------------------------------------
{merge q elements from the i-source with r elements from
 the j-source}
WHILE (q<>0) AND (r<>0) DO
  BEGIN {select the smallest element between i and j}
    IF a[i].key<a[j].key THEN
      BEGIN
        *move an element from i to k; advance i and k;
        q:= q-1
      END
    ELSE                                         [3.3.1.2.d]
      BEGIN
        *move an element from j to k; advance j and k;
        r:= r-1
      END
  END; {WHILE}
*copy the rest of sequence i
*copy the rest of sequence j
------------------------------------------------------------
• Before going further, we wish to eliminate the restriction that n be a power of 2.
• This means that we continue merging p-tuples until the remainders of the source sequences are of length less than p.
• The one and only part that is influenced are the statements that determine the values of q and r, the lengths of the sequences to be merged [3.3.1.2.c].
• In consequence the statements:
  q:= p; r:= p; m:= m-2*p;
• will be replaced by:
  IF m>=p THEN q:= p ELSE q:= m;
  m:= m-q;
  IF m>=p THEN r:= p ELSE r:= m;
  m:= m-r;
where m is the number of elements that remain to be merged.
• In addition, in order to guarantee the termination of the program, the condition p=n, which controls the outer repetition, must be changed to p≥n.
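The complete algorithm resulting from these refinements can also be sketched in C. This is a hedged transcription with ad-hoc names: plain int keys, the same 1..2n index layout with cell 0 unused, and no restriction of n to powers of 2.

```c
#include <assert.h>
#include <string.h>

/* Balanced merge sort on the doubled array a[1..2n] (index 0 unused),
   transcribing the refinements above: i and j read the two ends of the
   source half, k (advanced by h = +1/-1) writes the two ends of the
   destination half, and up selects which half plays which role. */
void balanced_merge(int a[], int n)
{
    int up = 1, p = 1;
    do {
        int h = 1, m = n, i, j, k, L, t;
        if (up) { i = 1; j = n; k = n + 1; L = 2 * n; }
        else    { k = 1; L = n; i = n + 1; j = 2 * n; }
        do {  /* merge a q-tuple from i with an r-tuple from j onto k */
            int q = (m >= p) ? p : m; m -= q;
            int r = (m >= p) ? p : m; m -= r;
            while (q != 0 && r != 0) {
                if (a[i] < a[j]) { a[k] = a[i]; k += h; i++; q--; }
                else             { a[k] = a[j]; k += h; j--; r--; }
            }
            while (r != 0) { a[k] = a[j]; k += h; j--; r--; } /* tail of j */
            while (q != 0) { a[k] = a[i]; k += h; i++; q--; } /* tail of i */
            h = -h;
            t = k; k = L; L = t;   /* interchange the destination ends */
        } while (m != 0);
        up = !up; p = 2 * p;
    } while (p < n);
    if (!up)  /* the result ended in the upper half: copy it back down */
        memmove(&a[1], &a[n + 1], (size_t)n * sizeof a[0]);
}
```

A caller allocates 2n+1 cells and places the n keys in a[1..n]; after the call, a[1..n] is sorted.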
• The final refinement of BalancedMerge, operating on the global array a with 2n elements, appears in [3.3.1.2.e].
---------------------------------------------------------
{Balanced merge with 4 sequences – final variant}
PROCEDURE BalancedMerge;
VAR i,j,k,L,t: index;
    h,m,p,q,r: integer;
    up: boolean;
BEGIN
  up:= true; p:= 1;
  REPEAT {R1}
    h:= 1; m:= n;                                [3.3.1.2.e]
    IF up THEN
      BEGIN
        i:= 1; j:= n;        {source}
        k:= n+1; L:= 2*n     {destination}
      END
    ELSE
      BEGIN
        k:= 1; L:= n;        {source}
        i:= n+1; j:= 2*n     {destination}
      END; {IF}
    REPEAT {merge from i and j to k
            q = length for i; r = length for j}
      IF m>=p THEN q:= p ELSE q:= m;
      m:= m-q;
      IF m>=p THEN r:= p ELSE r:= m;
      m:= m-r;
      WHILE (q<>0) AND (r<>0) DO
        BEGIN {merge}
          IF a[i].key<a[j].key THEN
            BEGIN
              a[k]:= a[i]; k:= k+h; i:= i+1; q:= q-1
            END
          ELSE
            BEGIN
              a[k]:= a[j]; k:= k+h; j:= j-1; r:= r-1
            END {IF}
        END; {WHILE}
      {copy the rest of sequence j}
      WHILE r<>0 DO
        BEGIN
          a[k]:= a[j]; k:= k+h; j:= j-1; r:= r-1
        END;
      {copy the rest of sequence i}
      WHILE q<>0 DO
        BEGIN
          a[k]:= a[i]; k:= k+h; i:= i+1; q:= q-1
        END;
      h:= -h;
      t:= k; k:= L; L:= t {interchange destination indices (k and L)}
    UNTIL m=0;
    up:= NOT up; p:= 2*p
  UNTIL p>=n;
  IF NOT up THEN
    FOR i:= 1 TO n DO a[i]:= a[i+n]
END; {BalancedMerge}
-----------------------------------------------------------
3.3.1.3 Performance Analysis of Mergesort
• Since each pass doubles p, and since the sort is terminated as soon as p≥n, it involves ⌈log2(n)⌉ passes.
• Each pass, by definition, copies the entire set of n items exactly once.
• As a consequence, the total number of moves is exactly M = n·log2(n).
• The number C of key comparisons is even less than M, since no comparisons are involved in the tail copying operations.
• However, since the mergesort technique is usually applied in connection with the use of peripheral storage devices, the computational effort involved in the move operations often dominates the effort of comparisons by several orders of magnitude.
• The detailed analysis of the number of comparisons is therefore of little practical interest.
• The merge sort algorithm apparently compares well with even the advanced array sorting techniques discussed in the previous chapter.
• However:
• The administrative overhead for the manipulation of indices is relatively high.
• The decisive disadvantage is the need for storage of 2n items.
• This is the reason why sorting by merging is rarely used on arrays, i.e., on data located in main store.
• Real measurements compare mergesort favourably with heapsort but unfavourably with quicksort.
3.3.2 Natural Merging
• In straight merging, no advantage is gained when the data are initially already partially sorted.
• The length of all merged sub-sequences in the k-th pass is equal to 2^k, independent of whether longer sub-sequences are already ordered and could as well be merged.
• That means that even if the initial sequence is already sorted, all the passes of the sorting algorithm are completed.
• In fact, any two ordered sub-sequences of lengths m and n might be merged directly into a single sequence of m+n items.
• The mergesort technique that at any time merges the two longest possible ordered sub-sequences is called a natural merge sort.
• The central concept of this technique is the monotony, which is illustrated by the next example:
• Let us consider the following sequence a of keys:
  1 13 | 2 4 7 | 6 18 | 9 10 14 | 11 | 3 75
• Vertical lines are placed at the extremities of the sequence, as well as between the elements aj and aj+1, whenever aj > aj+1.
• The sequence was thus broken down into monotone partial sequences.
• The derived sequences are of maximum length, which means they cannot be extended without losing their monotony.
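The monotony boundaries in such a sequence can be detected mechanically. The following is a minimal C sketch (the helper name is ad-hoc) that counts the monotonies of a key sequence:

```c
#include <assert.h>

/* Counts the monotonies (maximal ascending runs) of a key sequence:
   a new monotony starts at every position j where a[j-1] > a[j]. */
int count_monotonies(const int a[], int n)
{
    int j, runs;
    if (n == 0) return 0;
    runs = 1;
    for (j = 1; j < n; j++)
        if (a[j - 1] > a[j]) runs++;
    return runs;
}
```

Applied to the example keys 1 13 2 4 7 6 18 9 10 14 11 3 75, it reports 6 monotonies, one for each section between vertical bars.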
• In general, being given any sequence a1,a2,...,an:
• Formally, a monotony is any partial sub-sequence ai,...,aj which satisfies the following conditions [3.3.2.a]:
-----------------------------------------------------------
1) 1 ≤ i ≤ j ≤ n;
2) ak ≤ ak+1 for any i ≤ k < j;
3) ai-1 > ai or i = 1;                           [3.3.2.a]
4) aj > aj+1 or j = n;
------------------------------------------------------------
• The same thing can be expressed in predicate logic as [3.3.2.a']:
------------------------------------------------------------
(ai-1 > ai) & (∀k: i ≤ k < j: ak ≤ ak+1) & (aj > aj+1)    [3.3.2.a']
------------------------------------------------------------
• The definition also includes monotonies with one element. In this case i=j, and condition 2) is trivially fulfilled, as k would have to lie between i and j-1.
• A natural merge sort, therefore, merges monotonies instead of sequences of fixed, predetermined length.
• The natural merge is based on the following property:
• If two sequences of n monotonies each are merged, a single sequence of exactly n monotonies emerges.
• Therefore, the total number of monotonies is halved in each pass; as a result, the maximum number of passes is ⌈log2(n)⌉.
• As a result, for the sorting process:
• The number of required moves of items is in the worst case n*log(n), but in the average case it is even less.
• The expected number of comparisons, however, is much larger, because in addition to the comparisons necessary for the selection of items, further comparisons are needed between consecutive items of each file in order to determine the end of each monotony.
• The next programming exercise develops a natural merge algorithm in the same stepwise refinement fashion that was used to explain the straight merging algorithm.
• It employs the sequence structure represented by files, instead of the array, and it represents an unbalanced, two-phase, three-sequence merge sort.
• We assume that the file variable c represents the initial sequence of items.
• Naturally, in an actual data processing application, the initial data are first copied from the original source to c for reasons of safety.
• a and b are two auxiliary sequence variables.
• The following data structures are used [3.3.2.b]:
-----------------------------------------------------------
{Natural merging – data structures}
TYPE TypeSequence = FILE OF TypeElement;         [3.3.2.b]
VAR a,b,c: TypeSequence;
------------------------------------------------------------
• Each pass consists of two alternating phases:
• (1) A distribution phase that distributes the monotonies equally from c onto a and b.
• (2) A merge phase that merges the monotonies from a and b onto c.
• This process is illustrated in figure 3.3.2.
Fig. 3.3.2. The model of natural merge sort: each pass, from pass 1 to pass n, consists of a distribution phase (c onto a and b) followed by a merge phase (a and b onto c)
• The sorting process is finished when the number of monotonies on c becomes equal to 1.
• Variable L is used to count the generated monotonies.
• The initial refinement step of the natural merging sort appears in [3.3.2.c].
• The two phases appear as two rough statements, which will be refined hereinafter.
-----------------------------------------------------------
{Sorting by natural merge – Refinement step 0}
PROCEDURE NaturalMerge;
VAR L: integer; {the number of monotonies to be merged}
    a,b,c: TypeSequence;
    em: boolean;
BEGIN                                            [3.3.2.c]
  REPEAT
    Rewrite(a); Rewrite(b); Reset(c);
    Distribution;
    Reset(a); Reset(b); Rewrite(c);
    L:= 0;
    Merge
  UNTIL L=1
END; {NaturalMerge}
------------------------------------------------------------
• The refinement process can be achieved in two manners:
• (1) By straight substitution of the statements with the corresponding code – a process that will be named refinement by insertion.
• (2) By developing the statements as procedures or functions – a process that will be named refinement by selection.
• In the non-balanced merge algorithm, we have used the refinement by insertion technique.
• For the natural merge sort algorithm we will use the refinement by selection technique.
• The first refinement steps for Distribution and Merge appear in [3.3.2.d] and [3.3.2.e] respectively.
------------------------------------------------------------
{Sorting by natural merge – refinement of Distribution procedure}
PROCEDURE Distribution; {from c to a and b}
BEGIN
  REPEAT                                                 [3.3.2.d]
    CopyMonotony(c,a);
    IF NOT Eof(c) THEN CopyMonotony(c,b)
  UNTIL Eof(c)
END; {Distribution}
------------------------------------------------------------
• This method of distribution supposedly results:
• (1) In equal numbers of monotonies in both a and b, if the number of monotonies on c is even.
• (2) In sequence a containing one monotony more than b, if the number of monotonies on c is odd.
• Since corresponding pairs of monotonies are merged, a leftover monotony may still be on sequence a, which simply has to be copied to c.
------------------------------------------------------------
{Sorting by natural merge – refinement of procedure Merge}
PROCEDURE Merge;
BEGIN {from a and b to c}
  REPEAT
    MergeMonotony;
    L:= L+1;
  UNTIL Eof(b);
  IF NOT Eof(a) THEN                                     [3.3.2.e]
  BEGIN {odd monotony}
    CopyMonotony(a,c);
    L:= L+1
  END
END; {Merge}
------------------------------------------------------------
• The statements Merge and Distribution are formulated in terms of a refined statement MergeMonotony and a subordinate procedure CopyMonotony, which refer to only one monotony and will be refined further in [3.3.2.f] and [3.3.2.g] respectively.
• The Boolean variable em (end of monotony) specifies whether the end of the current monotony has been reached or not.
• When one of the monotonies being merged is finished, the rest of the other is copied to the destination sequence.
------------------------------------------------------------
{Sorting by natural merge – refinement of procedure CopyMonotony}
PROCEDURE CopyMonotony(source,destination: TypeSequence);
{source – the sequence in which the monotony is identified
 destination – the sequence in which the monotony is copied}
BEGIN
  REPEAT                                                 [3.3.2.f]
    CopyElement(source,destination)
  UNTIL em
END; {CopyMonotony}
------------------------------------------------------------
{Sorting by natural merge – refinement of procedure MergeMonotony}
PROCEDURE MergeMonotony;
BEGIN
  REPEAT
    IF a.elemCurrent.key < b.elemCurrent.key THEN
    BEGIN
      CopyElement(a,c);
      IF em THEN CopyMonotony(b,c)
    END                                                  [3.3.2.g]
    ELSE
    BEGIN
      CopyElement(b,c);
      IF em THEN CopyMonotony(a,c)
    END {ELSE}
  UNTIL em
END; {MergeMonotony}
------------------------------------------------------------
• In order to refine the above-mentioned procedures, a subordinate procedure CopyElement(source,destination: TypeSequence) is used, which transfers the current element of the source sequence to the destination sequence, setting the variable em to true if the end of the sequence is reached.
• For this purpose the look-ahead technique is used: in the current step, the element read in the previous step is processed, and the element needed in the next step is read.
• In consequence, the first element to be processed is brought into the sequence's buffer before starting the distribution and merging processes.
• The TypeSequence data structure is modified as in [3.3.2.h].
------------------------------------------------------------
{Sorting by natural merge – refinement of data structure}
TYPE TypeSequence = RECORD
       sequence: FILE OF TypeElement;                    [3.3.2.h]
       elemCurrent: TypeElement; {sequence buffer}
       endProc: boolean {end of sequence processing}
     END;
------------------------------------------------------------
The procedure CopyElement appears in [3.3.2.i].
------------------------------------------------------------
{Sorting by natural merge – refinement of procedure CopyElement}
{copy an element from x to y}
PROCEDURE CopyElement(VAR x,y: TypeSequence);
VAR aux: TypeElement;
BEGIN
  Write(y.sequence,x.elemCurrent); {write current element from x to y}
  IF Eof(x.sequence) THEN {x.elemCurrent was the last element}
  BEGIN
    em:= true;
    x.endProc:= true
  END                                                    [3.3.2.i]
  ELSE
  BEGIN
    aux:= x.elemCurrent;            {save the current element}
    Read(x.sequence,x.elemCurrent); {read next element}
    em:= aux.key > x.elemCurrent.key
  END;
END; {CopyElement}
------------------------------------------------------------
• We can notice the look-ahead technique:
• At the current moment, the element x.elemCurrent, which was read in the previous step, is written to the destination sequence y.
• If x.elemCurrent was the last element of its sequence, then the processing of the sequence is finished and x.endProc is set to true.
• If x.elemCurrent wasn't the last element of its sequence, it is saved in the variable aux: TypeElement and the next element is read in order to determine the end of the current monotony (em).
• Regrettably, the program is incorrect, in the sense that it does not sort properly in some cases.
• Consider, for example, the following sequence of input data with 10 monotonies (the apostrophe marks the monotony boundaries):
13 57 ' 17 19 ' 11 59 ' 23 29 ' 7 61 ' 31 37 ' 5 67 ' 41 43 ' 2 3 ' 47 71
• When the c sequence is distributed, due to this particular arrangement of the keys, 5 monotonies are written on sequence a, but sequence b receives only one, instead of the 5 we expect:
a: 13 57 ' 11 59 ' 7 61 ' 5 67 ' 2 3
b: 17 19 23 29 31 37 41 43 47 71
• The monotonies copied to b happen to combine into a single ascending run.
• The merge of a and b writes on c two monotonies instead of 5, because when b is finished, only a single monotony is copied from a to c [3.3.2.e].
c: 13 17 19 23 29 31 37 41 43 47 57 71 ' 11 59
• In the next pass the merge is finished, but the result is incorrect, because the monotonies 7 61, 5 67 and 2 3 left on a were lost:
c: 11 13 17 19 23 29 31 37 41 43 47 57 59 71
• Although procedure Distribution supposedly outputs monotonies in equal numbers to the two sequences, the important observation is that the actual number of monotonies resulting on a and b may differ significantly, due to the distribution of the keys.
• Our merge procedure, however, only merges pairs of monotonies and terminates as soon as b is read, thereby losing the tail of one of the sequences.
• To fix this problem, the merge procedure has to be changed so that, after reaching the end of one sequence, the entire tail of the remaining sequence is copied, instead of at most one monotony.
• The revised version of the natural merge sort algorithm is presented in [3.3.2.j].
------------------------------------------------------------
{Natural merge sort algorithm – final variant}
PROCEDURE NaturalMerge;
VAR L: integer;
    em: boolean;
    a,b,c: TypeSequence;

  PROCEDURE CopyElement(VAR x,y: TypeSequence);
  VAR aux: TypeElement;
  BEGIN
    Write(y.sequence,x.elemCurrent);
    IF Eof(x.sequence) THEN
    BEGIN
      em:= true;
      x.endProc:= true
    END                                                  [3.3.2.j]
    ELSE
    BEGIN
      aux:= x.elemCurrent;
      Read(x.sequence,x.elemCurrent);
      em:= aux.key > x.elemCurrent.key
    END;
  END; {CopyElement}

  PROCEDURE CopyMonotony(VAR x,y: TypeSequence);
  BEGIN
    REPEAT
      CopyElement(x,y)
    UNTIL em
  END; {CopyMonotony}

  PROCEDURE Distribution;
  BEGIN
    Rewrite(a.sequence); Rewrite(b.sequence); Reset(c.sequence);
    c.endProc:= Eof(c.sequence);
    Read(c.sequence,c.elemCurrent);
    REPEAT
      CopyMonotony(c,a);
      IF NOT c.endProc THEN CopyMonotony(c,b)
    UNTIL c.endProc;
    Close(a.sequence); Close(b.sequence); Close(c.sequence)
  END; {Distribution}

  PROCEDURE MergeMonotony;
  BEGIN
    REPEAT
      IF a.elemCurrent.key < b.elemCurrent.key THEN
      BEGIN
        CopyElement(a,c);
        IF em THEN CopyMonotony(b,c)
      END
      ELSE
      BEGIN
        CopyElement(b,c);
        IF em THEN CopyMonotony(a,c)
      END
    UNTIL em                                             [3.3.2.j]
  END; {MergeMonotony}

  PROCEDURE Merge;
  BEGIN
    Reset(a.sequence); Reset(b.sequence); Rewrite(c.sequence);
    a.endProc:= Eof(a.sequence);
    b.endProc:= Eof(b.sequence);
    IF NOT a.endProc THEN Read(a.sequence,a.elemCurrent); {first element}
    IF NOT b.endProc THEN Read(b.sequence,b.elemCurrent); {first element}
    WHILE NOT (a.endProc OR b.endProc) DO
    BEGIN
      MergeMonotony;
      L:= L+1
    END; {WHILE}
    WHILE NOT b.endProc DO {copy the tail of b}
    BEGIN
      CopyMonotony(b,c);
      L:= L+1
    END;
    WHILE NOT a.endProc DO {copy the tail of a}
    BEGIN
      CopyMonotony(a,c);
      L:= L+1
    END;
    Close(a.sequence); Close(b.sequence); Close(c.sequence)
  END; {Merge}

BEGIN {NaturalMerge}
  REPEAT
    Distribution;
    L:= 0;
    Merge;
  UNTIL L=1;
END; {NaturalMerge}
------------------------------------------------------------

3.3.2.1 Performance Analysis of Natural Merge
• As was noticed, in the case of external sorting the number of key comparisons is not relevant, because the processing time in the central unit of the computing system is negligible compared to the duration of external memory accesses.
• For this reason, the number of moves M will be considered the only performance indicator.
• In the case of sorting by natural merge:
• In one pass, in each of the two phases (distribution and merge), all the elements are moved, so the number of moves per pass is M = 2·n.
• After each pass, the number of monotonies is at least halved, and sometimes reduced even more substantially; this uneven reduction is precisely why the Merge procedure had to be modified.
• Knowing that the initial number of monotonies is at most n, the maximum number of passes is ⌈log2 n⌉; as a result, in the worst case the number of moves is M = 2·n·⌈log2 n⌉, and on average substantially fewer.

3.3.3. Balanced Multi-Way Merging
• The effort involved in a sequential sort is proportional to the number of required passes since, by definition, every pass involves the copying of the entire set of data.
• One way to reduce this number is to distribute monotonies onto more than two sequences.
• In consequence:
• In the first merging step, r monotonies that are equally distributed on N sequences result in a sequence of r/N monotonies.
• A second pass reduces their number to r/N², a third pass to r/N³, and after k passes there are r/N^k monotonies left.
• This is the N-way merging sort.
• The total number of passes k required to sort n items by N-way merging is therefore k = ⌈logN n⌉.
• Since each pass requires n copy operations, the total number of copy operations is in the worst case M = n·⌈logN n⌉.
• An implementation of this technique is the balanced multi-way merge, which is achieved in only one phase.
• Balanced multi-way merge presumes that in each pass there is an equal number of input and output sequences.
• The monotonies are merged from the input sequences and immediately distributed to the output sequences.
• If N sequences (N even) are used, we have in fact a balanced merge sort with N/2 ways.
• The principle schema of this method appears in figure 3.3.3.a.
[Fig. 3.3.3.a. The model of balanced multi-way merge with N/2 ways: N/2 source sequences are merged onto N/2 destination sequences.]
• For the algorithm to be developed further, a specific data structure named array of sequences was conceived.
• As a matter of fact, it is surprising how strongly the following program differs from the previous one because of the change from two-way to multi-way merging.
• (1) The change is primarily a result of the circumstance that the merge process can no longer simply be terminated after one of the input monotonies is exhausted. Instead, a list of the input sequences that are still active, i.e., not yet exhausted, must be kept.
• (2) Another complication stems from the need to switch the groups of input and output sequences after each pass.
• As a result, the following data structures are defined [3.3.3.a].
------------------------------------------------------------
{Data structures for balanced multiway merge sort}
TYPE TypeSequence = RECORD
       sequence: FILE OF TypeElement;
       current: TypeElement;
       endProc: boolean                                  [3.3.3.a]
     END;
     NrSequence = 1..N;
VAR f0: TypeSequence;
    F: ARRAY[NrSequence] OF TypeSequence; {array of sequences}
    t,td: ARRAY[NrSequence] OF NrSequence;
------------------------------------------------------------
• A new data structure was introduced, namely the array of sequences F, whose elements belong to TypeSequence.
• The scalar type NrSequence is used as index for the array of sequences F.
• We also presume that the initial sorting sequence is f0: TypeSequence and that the sorting process uses N sequences (N even).
• The switching of input and output sequences is solved by means of the array t, having as components indices referring to the sequences.
• Thus, instead of addressing a sequence in array F directly, through index i, it will be addressed indirectly via array t, that is F[t[i]] instead of F[i].
• Initially t[i] = i for all values of i.
• The switching of input and output sequences consists in interchanging the pairs of components of array t, where NH = N/2, as follows:
t[1] <-> t[NH+1], t[2] <-> t[NH+2], ..., t[NH] <-> t[N]
• The sequences F[t[1]],...,F[t[NH]] will always be considered input sequences, and F[t[NH+1]],...,F[t[N]] output sequences (fig. 3.3.3.b).
[Fig. 3.3.3.b. Switching of input and output sequences: before commutation t = (1,2,3,4,5,6), after commutation t = (4,5,6,1,2,3); positions 1..NH index the input sequences, positions NH+1..N the output sequences.]
• The first shape of the balanced multi-way merge sorting appears in [3.3.3.b].
------------------------------------------------------------
{Sorting by balanced multi-way merge – initial variant}
PROCEDURE BalancedMultiwayMergeSort;
VAR i,j: NrSequence;
    L: integer; {number of distributed monotonies}
    t,td: ARRAY[NrSequence] OF NrSequence;
    F: ARRAY[NrSequence] OF TypeSequence;
BEGIN {NH=N/2}
  j:= NH; L:= 0; {number of monotonies}                  [3.3.3.b]
  REPEAT {the initial monotonies are distributed from f0
          to input sequences t[1],...,t[NH]}
    IF j<NH THEN j:= j+1 ELSE j:= 1;
    *copy a monotony from f0 to sequence F[j]
    L:= L+1
  UNTIL endProc(f0);
  FOR i:= 1 TO N DO t[i]:= i; {initialize array t}
  REPEAT {merge from input sequences t[1],...,t[NH]
          to output sequences t[NH+1],...,t[N]}
    *initialize input sequences
    L:= 0; {number of monotonies}
    j:= NH+1; {j is the index of the current output sequence}
    REPEAT
      L:= L+1;
      *merge a monotony from each active input sequence to t[j]
      IF j<N THEN j:= j+1 ELSE j:= NH+1
    UNTIL *all the active inputs have been exhausted
    *switch sequences
  UNTIL L=1 {the sorted sequence is t[1]}
END; {BalancedMultiwayMergeSort}
------------------------------------------------------------
• As we can notice, the balanced sorting process consists of three steps.
• (1) The first step realizes the distribution of the initial monotonies (first REPEAT loop). In this step:
• The initial monotonies are successively distributed from the initial sequence f0 to the input sequences indicated by j.
• After each copied monotony, index j, which cyclically scans the domain [1..NH], is incremented.
• Distribution is finished when f0 is exhausted.
• The distributed monotonies are counted in L.
• (2) The second step initializes array t (FOR loop).
• (3) The third step merges the input sequences F[t[1]],...,F[t[NH]] into the output sequences F[t[NH+1]],...,F[t[N]].
• The merging principle is the following:
• From all active input sequences, one monotony from each is merged into a single monotony on the sequence F[t[j]].
• Then j is advanced to the next output sequence; j cyclically scans the domain [NH+1..N].
• The process is repeated until all inputs are exhausted (inner REPEAT loop).
• At this moment the sequences are commuted, the inputs become outputs and vice versa, and the merge is resumed.
• This process continues until the number of merged monotonies is equal to 1 (external REPEAT loop).
• Further, the refinements of the algorithm's statements are presented.
• In [3.3.3.c] the refinement of the statement *copy a monotony from f0 to sequence F[j], used in the distribution of the initial monotonies, is presented.
------------------------------------------------------------
{copy a monotony from f0 to sequence F[j]}
{f0 is not empty. The first element is already read}
VAR buf: TypeElement;
REPEAT                                                   [3.3.3.c]
  buf:= f0.current;
  IF Eof(f0.sequence) THEN f0.endProc:= true {look-ahead technique}
  ELSE Read(f0.sequence,f0.current); {read the next item}
  Write(F[j].sequence,buf) {write the current item on sequence F[j]}
UNTIL (buf.key > f0.current.key) OR f0.endProc;
------------------------------------------------------------
• The next statement is *initialize input sequences.
• To begin with, the current input sequences must be identified, because the number of active sequences can be smaller than NH.
• By an active sequence we mean a sequence which has one or more monotonies left to be merged.
• In fact, the number of input sequences is reduced as the number of monotonies is reduced.
• Practically, there cannot be more active sequences than monotonies, so the sorting process is finished when a single monotony remains.
• The variable k1 is introduced to store the current number of active sequences, that is, those that still contain monotonies.
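The look-ahead copying of one monotony, as in [3.3.3.c], can be mirrored in memory. In this C sketch (the names copy_monotony, pos and out are ours, and an array index plays the role of the sequence buffer), one call copies exactly one monotony and stops either at the end of the source or when the next key is smaller:

```c
#include <stdbool.h>

/* Copy one monotony from src (starting at *pos) to dst (starting at *out),
   using the look-ahead technique of [3.3.3.c]: the element written now was
   "read" in a previous step; the next element is inspected to detect the
   end of the run.  In-memory sketch, not the text's file-based code. */
void copy_monotony(const int *src, int n, int *pos, int *dst, int *out) {
    bool em = false;                            /* end of monotony */
    while (!em) {
        int buf = src[*pos];                    /* current element (buffer) */
        if (*pos + 1 >= n) em = true;           /* end of source sequence   */
        else if (buf > src[*pos + 1]) em = true;/* next key smaller: closed */
        (*pos)++;
        dst[(*out)++] = buf;                    /* write the current element */
    }
}
```

Called on 13 57 11 59, the first call copies 13 57 and leaves the source position on 11, just as CopyMonotony leaves the file buffer holding the first element of the next run.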
• Now, we can refine the statement *initialize input sequences [3.3.3.d]:
------------------------------------------------------------
{initialize input sequences}                             [3.3.3.d]
FOR i:= 1 TO k1 DO Reset(F[t[i]].sequence);
------------------------------------------------------------
• Due to the repeated merging process, k1 tends to decrease, so the statement *all the active inputs have been exhausted can be expressed as k1 = 0.
• The statement *merge a monotony from each active input sequence to t[j] is more complicated.
• It consists of the repeated selection of the least key among the available sources, and its subsequent transport to the destination, i.e., the current output sequence.
• From each input sequence one monotony is parsed.
• The process is further complicated by the necessity of determining the end of each monotony. The end of a monotony may be reached because:
• (1) The end of the source sequence is reached.
• (2) The subsequent key is less than the current key (the monotony is closed).
• In case (1) the source sequence is eliminated by decrementing k1.
• In case (2) the monotony is closed by temporarily excluding the sequence from further selection of items, but only until the creation of the current output monotony is completed. This operation is named monotony closing.
• This makes it obvious that a second variable, say k2, is needed to denote the number of sources actually available for the selection of the next item.
• The value of k2 is initially set equal to k1 and is decremented whenever a monotony terminates because of condition (2).
• When k2 becomes 0, the merging of one monotony from each active input sequence to the current output sequence is finished.
• The first refinement step is presented in [3.3.3.d].
------------------------------------------------------------
{merge a monotony from each active input sequence to F[t[j]]}
FOR i:= 1 TO k1 DO Reset(F[t[i]].sequence);
k2:= k1;
REPEAT
  *select the least key.
   Let td[mx] be the index of the corresponding input sequence
  buf:= F[td[mx]].current;
  IF Eof(F[td[mx]].sequence) THEN                        [3.3.3.d]
    F[td[mx]].endProc:= true
  ELSE Read(F[td[mx]].sequence,F[td[mx]].current);
  Write(F[t[j]].sequence,buf);
  IF F[td[mx]].endProc THEN *eliminate sequence
  ELSE IF buf.key > F[td[mx]].current.key THEN *close monotony
UNTIL k2=0;
------------------------------------------------------------
• Unfortunately, the introduction of k2 is not sufficient. We need to know not only the number of sequences, but also which sequences are still in actual use.
• An obvious solution is to use an array with Boolean components indicating the availability of the sequences.
• We choose, however, a different method that leads to a more efficient selection procedure which, after all, is the most frequently repeated part of the entire algorithm.
• Instead of using a Boolean array, a sequence index map, say td, is introduced.
• The array td is represented in figure 3.3.3.c.
[Fig. 3.3.3.c. Index array td used as auxiliary in the merging process: positions 1..k2 index the sequences still available for selection, positions k2+1..k1 the input sequences whose current monotony is closed.]
• Array td is used instead of array t to access the input sequences; thus td[1],...,td[k2] are the indices of the available input sequences.
• Array td is initialized at the beginning of each merging by copying the indices of the input sequences from array t into td, namely from t[1],...,t[k1] into td[1],...,td[k1].
• Index k1 is initialized with:
• The value N/2, if the number of monotonies L is greater than N/2.
• The value L, if the number of monotonies is less than N/2.
• It is to be mentioned that L represents the number of monotonies merged in the previous phase.
• Index k1 denotes the number of active sequences.
• The rest of the sequences (up to N/2) don't have monotonies, because they are physically terminated, so they are not considered.
• Index k2, which is initialized with the value k1, denotes the number of active sequences which still have monotonies in the current pass.
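The effect of the *close monotony and *eliminate sequence operations on the index map can be isolated into two small helpers. This C sketch uses 0-based indexing (so position k2 of the text corresponds to td[k2-1]) and illustrative names; it manipulates only the index map, not the sequences themselves:

```c
/* Close the monotony of the sequence at position mx of the index map td:
   exchange td[mx] with the last available position and shrink the selection
   window k2 (0-based counterpart of the text's two-step rule). */
void close_monotony(int td[], int *k2, int mx) {
    int tx = td[mx];
    td[mx] = td[*k2 - 1];
    td[*k2 - 1] = tx;
    (*k2)--;
}

/* Eliminate an exhausted sequence at position mx: its monotony is closed,
   the last active sequence takes the freed slot, and both counters shrink. */
void eliminate_sequence(int td[], int *k1, int *k2, int mx) {
    td[mx] = td[*k2 - 1];       /* close the monotony of sequence mx     */
    td[*k2 - 1] = td[*k1 - 1];  /* last active sequence fills the window */
    (*k1)--;
    (*k2)--;
}
```

Because only positions 0..k2-1 are ever scanned for the minimum key, a swap plus a decrement suffices to exclude a sequence; no Boolean availability array is needed, which is exactly the efficiency argument made above.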
• The sequences having indices between k2 and k1 have finished their monotonies in the current pass, but are not physically terminated.
• Thus the statement *close monotony can be formulated as follows:
• Let mx be the index (in td) of the sequence whose current monotony is finished:
• (1) Exchange in array td position mx with position k2.
• (2) Decrement k2.
• The statement *eliminate sequence, corresponding to condition (1), also considering that mx is the index of the exhausted sequence, presumes the following actions.
• We also presume that the physical termination of a sequence includes the closing of its last monotony.
• (1) In array td, the sequence denoted by k2 is moved into the place of the sequence denoted by mx (close monotony).
• (2) The sequence denoted by k1 is moved into the place of the sequence denoted by k2 (eliminate sequence).
• (3) Decrement k1.
• (4) Decrement k2.
• The final shape of the balanced multiway merge sort appears in [3.3.3.e].
------------------------------------------------------------
{Sorting by balanced multiway merge – final variant}
PROCEDURE BalancedMultiwayMergeSort;
VAR i,j,mx,tx: NrSequence;
    k1,k2,L: integer;
    x,min: integer;
    buf: TypeElement;
    t,td: ARRAY[NrSequence] OF NrSequence;
    F: ARRAY[NrSequence] OF TypeSequence;
    f0: TypeSequence;
BEGIN
  FOR i:= 1 TO NH DO Rewrite(F[i].sequence); {initialize sequences}
  j:= NH; L:= 0;
  Read(f0.sequence,f0.current); {first element of f0}
  REPEAT {distribution of the initial monotonies on t[1],...,t[NH]}
    IF j<NH THEN j:= j+1 ELSE j:= 1;
    L:= L+1;
    REPEAT {copy a monotony from sequence f0 on sequence F[j]}
      buf:= f0.current;                                  [3.3.3.e]
      IF Eof(f0.sequence) THEN f0.endProc:= true {look-ahead technique}
      ELSE Read(f0.sequence,f0.current);
      Write(F[j].sequence,buf)
    UNTIL (buf.key > f0.current.key) OR f0.endProc;
  UNTIL f0.endProc;
  FOR i:= 1 TO N DO t[i]:= i; {initialize index array}
  REPEAT {merge from t[1],...,t[NH] on t[NH+1],...,t[N]}
    IF L<NH THEN k1:= L ELSE k1:= NH;
    FOR i:= 1 TO k1 DO {initialize input sequences}
    BEGIN {k1 is the number of input sequences}
      Reset(F[t[i]].sequence);
      td[i]:= t[i];
      Read(F[td[i]].sequence,F[td[i]].current)
    END;
    L:= 0; {number of merged monotonies}
    j:= NH+1; {j = index of the output sequence}
    REPEAT {merge a monotony from each td[1],...,td[k2] to t[j]}
      k2:= k1; L:= L+1; {k2 = number of available sequences}
      REPEAT {select the least element}
        i:= 1; mx:= 1;
        min:= F[td[1]].current.key;
        WHILE i<k2 DO
        BEGIN
          i:= i+1;
          x:= F[td[i]].current.key;
          IF x<min THEN
          BEGIN
            min:= x; mx:= i
          END
        END; {WHILE}
        {td[mx] holds the minimum element. It is written to t[j]}
        buf:= F[td[mx]].current;
        IF Eof(F[td[mx]].sequence) THEN F[td[mx]].endProc:= true
        ELSE Read(F[td[mx]].sequence,F[td[mx]].current);
        Write(F[t[j]].sequence,buf);
        IF F[td[mx]].endProc THEN
        BEGIN {eliminate sequence}
          Rewrite(F[td[mx]].sequence);
          td[mx]:= td[k2]; td[k2]:= td[k1];
          k1:= k1-1; k2:= k2-1
        END
        ELSE IF buf.key > F[td[mx]].current.key THEN
        BEGIN {close monotony}
          tx:= td[mx]; td[mx]:= td[k2]; td[k2]:= tx;
          k2:= k2-1
        END
      UNTIL k2=0; {end of merging from the inputs to t[j]}
      IF j<N THEN j:= j+1 ELSE j:= NH+1 {select next destination sequence}
    UNTIL k1=0; {all inputs have been exhausted}
    FOR i:= 1 TO NH DO {commute sequences}
    BEGIN
      tx:= t[i]; t[i]:= t[i+NH]; t[i+NH]:= tx
    END
  UNTIL L=1; {the sorted sequence is on t[1]}
END; {BalancedMultiwayMergeSort}
------------------------------------------------------------

3.3.4. Polyphase Sort
• The balanced merging sort method eliminates the pure copying operations by uniting the distribution and the merging operations into a single phase.
• This is achieved by using more input and output sequences, which are, however, not used to their full capacity.
• The question arises whether or not the given sequences could be processed even more efficiently.
• R.L. Gilstad invented a new method, called polyphase sort, which avoids this inconvenience.
• The key to the improvement introduced by Gilstad lies in abandoning the rigid notion of strict passes.
• Gilstad suggests using the sequences in a more sophisticated way than by always having N/2 sources and as many destinations, exchanging sources and destinations at the end of each distinct pass.
• In his approach, even the notion of a pass becomes diffuse.
• Let us consider, to begin with, an example with three sequences.
• At each moment, the monotonies from two sequences are merged onto the third.
• Any time one of the input sequences is exhausted, it immediately becomes the destination of the merging of the other two (the unfinished sequence and the old destination).
• The process finishes when only one monotony remains.
• As we know, merging n monotonies from each of the input sequences results in n monotonies on the destination sequence.
• To illustrate the method, only the numbers of monotonies will be shown, instead of the keys themselves.
• Thus, in fig. 3.3.4.a(a), presume that the input sequences f1 and f2 contain 13 and 8 monotonies respectively.
• In the first "pass", 8 monotonies are merged from f1 and f2 onto f3.
• In the second "pass", the 5 remaining monotonies are merged from f1 and f3 onto f2, etc.
• Finally f1 is the sorted sequence.

(a) Three sequences:          (b) Six sequences:
f1  f2  f3                    f1  f2  f3  f4  f5  f6
13   8   0                    16  15  14  12   8   0
 5   0   8                     8   7   6   4   0   8
 0   5   3                     4   3   2   0   4   4
 3   2   0                     2   1   0   2   2   2
 1   0   2                     1   0   1   1   1   1
 0   1   1                     0   1   0   0   0   0
 1   0   0

Fig. 3.3.4.a. Examples of polyphase sort with 3 sequences (a) and 6 sequences (b)

• In the same figure, (b) presents an example of polyphase sort with 65 monotonies and 6 sequences.
• Polyphase is more efficient than balanced merge because, given N sequences, it always operates with an (N-1)-way merge instead of an N/2-way merge.
• As the number of required passes is approximately logN n, n being the number of items to be sorted and N being the degree of the merge operations (the number of sequences), polyphase promises a significant improvement over balanced merging.
• Of course, the distribution of the initial monotonies was carefully chosen in the above examples.
• In order to find out which initial distributions of monotonies lead to a proper functioning, the tables from figures 3.3.4.b and 3.3.4.c are built.
• The construction of the two tables starts from the examples in figure 3.3.4.a:
• Each pass has its corresponding row in the table.
• The levels are parsed bottom-up in figure 3.3.4.a, and the table is filled top-down.
• Each table row, except the last, contains on its first position (a1) the number of monotonies of the destination sequence in the current pass.
• Further, the numbers of monotonies for each pass are introduced in the table. Each level in figure 3.3.4.a is parsed from left to right, starting with the destination sequence and proceeding in a circular manner.
• The last row of the resulting table contains the initial situation, that is, the initial distribution of monotonies.
• Figure 3.3.4.b shows the table corresponding to the 3-sequence polyphase sort, and figure 3.3.4.c the table corresponding to the 6-sequence polyphase sort.

k    a1(k)   a2(k)   Σ ai(k)
0      1       0        1
1      1       1        2
2      2       1        3
3      3       2        5
4      5       3        8
5      8       5       13
6     13       8       21

Fig. 3.3.4.b. Perfect distribution of monotonies on three sequences

k    a1(k)   a2(k)   a3(k)   a4(k)   a5(k)   Σ ai(k)
0      1       0       0       0       0        1
1      1       1       1       1       1        5
2      2       2       2       2       1        9
3      4       4       4       3       2       17
4      8       8       7       6       4       33
5     16      15      14      12       8       65

Fig. 3.3.4.c.
Perfect distribution of monotonies on six sequences
• From the table in figure 3.3.4.b the following relations can be derived [3.3.4.a]:
------------------------------------------------------------
a2(k+1) = a1(k)
a1(k+1) = a1(k) + a2(k) = a1(k) + a1(k-1)   for k > 0    [3.3.4.a]
where a1(0) = 1 and a2(0) = 0
------------------------------------------------------------
• If we substitute f(1)i = a1(i), by replacement the relations [3.3.4.b] result:
------------------------------------------------------------
f(1)i+1 = f(1)i + f(1)i-1   for i ≥ 1
f(1)1 = 1                                                [3.3.4.b]
f(1)0 = 0
------------------------------------------------------------
• But this is the recursive rule defining the first-order Fibonacci numbers: 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, ...
• That means each Fibonacci number is the sum of its two predecessors.
• As a consequence, in the case of polyphase sorting with three sequences:
• (1) The initial numbers of monotonies on the two source sequences must be two consecutive first-order Fibonacci numbers.
• (2) The total number of initial monotonies is the sum of two consecutive first-order Fibonacci numbers, which is a first-order Fibonacci number too.
• For the sorting example with 6 sequences, from the table in figure 3.3.4.c the following formulas can be derived [3.3.4.c]:
------------------------------------------------------------
a5(k+1) = a1(k)
a4(k+1) = a1(k) + a5(k) = a1(k) + a1(k-1)
a3(k+1) = a1(k) + a4(k) = a1(k) + a1(k-1) + a1(k-2)
a2(k+1) = a1(k) + a3(k) = a1(k) + a1(k-1) + a1(k-2) + a1(k-3)          [3.3.4.c]
a1(k+1) = a1(k) + a2(k) = a1(k) + a1(k-1) + a1(k-2) + a1(k-3) + a1(k-4)
------------------------------------------------------------
• If we substitute f(4)i = a1(i), the relations [3.3.4.d] result, that is, the 4th-order Fibonacci numbers.
------------------------------------------------------------
f(4)i+1 = f(4)i + f(4)i-1 + f(4)i-2 + f(4)i-3 + f(4)i-4   for i ≥ 4
where f(4)4 = 1 and f(4)i = 0 for i < 4                  [3.3.4.d]
------------------------------------------------------------
Generally, the Fibonacci numbers of order p are defined as follows [3.3.4.e]:
------------------------------------------------------------
f(p)i+1 = f(p)i + f(p)i-1 + ... + f(p)i-p   for i ≥ p
where f(p)p = 1 and f(p)i = 0 for 0 ≤ i < p              [3.3.4.e]
------------------------------------------------------------
• The first-order Fibonacci numbers are the usual ones.
• In the case of polyphase sort with n sequences, the initial numbers of monotonies on the n-1 source sequences are:
• A sum of n-1 consecutive Fibonacci numbers of order n-2 on the first sequence.
• A sum of n-2 consecutive Fibonacci numbers of order n-2 on the second sequence.
• A sum of n-3 consecutive Fibonacci numbers of order n-2 on the third sequence.
• And so on.
• On the last sequence, the (n-1)-th, there must be a single Fibonacci number of order n-2.
• These considerations imply that the polyphase sort method is applicable only if the initial sorting sequence contains a number of monotonies equal to the sum of n-1 such Fibonacci sums.
• Such a distribution of the initial monotonies is named a perfect distribution.
• In the case of the polyphase sort with 6 sequences (n = 6), Fibonacci numbers of order n-2 = 4 are necessary: 0, 0, 0, 0, 1, 1, 2, 4, 8, 16, 31, 61, 120, ...
• The initial distribution of monotonies presented in fig. 3.3.4.a is established starting with the Fibonacci number 8 (the 9th in the string), as follows:
• Sequence f1 will contain a number of monotonies equal to the sum of n-1 = 5 consecutive 4th-order Fibonacci numbers: 1+1+2+4+8 = 16.
• Sequence f2 will contain a number of monotonies equal to the sum of n-2 = 4 consecutive 4th-order Fibonacci numbers: 1+2+4+8 = 15.
• Sequence f3, a sum of n-3 = 3 numbers: 2+4+8 = 14.
• Sequence f4, a sum of n-4 = 2 numbers: 4+8 = 12.
• Sequence f5, a sum of n-5 = 1 Fibonacci numbers of order 4, that is 8 monotonies.
• As a result, the initial sequence to be sorted has to contain 16+15+14+12+8 = 65 monotonies.
• These monotonies are initially distributed on the 5 source sequences in accordance with the numbers determined before, obtaining a perfect distribution.
• It is not difficult to observe in figure 3.3.4.c that the distribution of the monotonies for each level k is obtained by applying the same algorithm, that is, selecting as base the consecutive numbers of the Fibonacci string 1, 1, 2, 4, 8, ... for k = 1, 2, 3, 4, 5, ... respectively.
• If the initial number of monotonies doesn't satisfy the condition of being a sum of n-1 such Fibonacci partial sums, a corresponding number of hypothetical empty monotonies is simulated, so that the distribution becomes perfect.
• These monotonies are named "dummy monotonies".
• The problem is how to recognize and process these monotonies.
• To begin with, we will investigate the problem of the initial distribution of monotonies, and then decide upon a rule for the distribution of actual and dummy monotonies onto the N-1 sequences.
• It is obvious that the selection of a dummy monotony from sequence i means that the monotony is ignored, resulting in a merge from fewer than n-1 source sequences.
• Merging a dummy monotony from each of the n-1 source sequences leads to no effective merge, but only to the recording of a dummy monotony on the destination sequence.
• For this reason, the dummy monotonies must be distributed as uniformly as possible over the n-1 input sequences.
• First, we will analyze the problem of distributing a given number of monotonies on n-1 sequences in order to obtain a perfect distribution.
• The desired number of initial monotonies can be determined progressively, using the Fibonacci numbers of order n-2.
• Thus, in case n = 6, having as landmark the table from fig.
3.3.4.c:
• We start with the distribution corresponding to k = 1, that is (1,1,1,1,1).
• If there are more monotonies, we pass to the next row, (2,2,2,2,1).
• Then to (4,4,4,3,2), and so on.
• The index k of the row is named the level.
• As the number of monotonies grows, the level k of the Fibonacci numbers grows too.
• The final level k denotes the number of passes necessary to sort the initially given monotonies.

 k | a1(k) | a2(k) | a3(k) | a4(k) | a5(k) | Σ ai(k)
---+-------+-------+-------+-------+-------+--------
 0 |   1   |   0   |   0   |   0   |   0   |    1
 1 |   1   |   1   |   1   |   1   |   1   |    5
 2 |   2   |   2   |   2   |   2   |   1   |    9
 3 |   4   |   4   |   4   |   3   |   2   |   17
 4 |   8   |   8   |   7   |   6   |   4   |   33
 5 |  16   |  15   |  14   |  12   |   8   |   65

Fig. 3.3.4.c. The perfect distribution of monotonies on 6 sequences (replay)

• The distribution algorithm can be described as follows:
• (1) Let the distribution goal be the Fibonacci numbers of order n-2, level 1.
• (2) Distribute according to the set goal.
• (3) If the goal is reached, compute the next level of Fibonacci numbers. The differences between them and those on the former level constitute the new distribution goal.
• (4) Return to step 2.
• (5) If the goal cannot be reached because the source is exhausted, terminate the distribution process.
• The rules for calculating the next level of Fibonacci numbers are contained in their definition [3.3.4.e].
• We can thus concentrate our attention on step 2, where, with a given goal, the subsequent monotonies are to be distributed one after the other onto the n-1 output sequences.
• It is here that the dummy monotonies have to reappear in our considerations.
• Let us assume that when raising the level, we record the next goal by the differences di for i = 1, 2, ..., n-1, where di denotes the number of monotonies to be put onto sequence i in this step.
• We can now assume that we immediately put di dummy monotonies onto sequence i.
• Then we can regard the subsequent distribution as the replacement of dummy monotonies by actual monotonies, each time recording a replacement by subtracting 1 from the count di.
• Thus, di indicates the number of dummy monotonies on sequence i when the source becomes empty.
• It is not known which algorithm yields the optimal distribution, but the following has proved to be a very good method.
• It is called horizontal distribution and it was proposed by Knuth [Kn76].
• The term horizontal distribution can be understood by imagining the monotonies as being piled up in the form of silos.
• Figure 3.3.4.d represents these silos for n = 6, level 5, in conformity with fig. 3.3.4.c.
• In order to reach an equal distribution of the remaining dummy monotonies as quickly as possible, their replacement by actual monotonies reduces the size of the piles by picking off dummy monotonies on horizontal levels, proceeding from left to right.
• In this way, the monotonies are distributed onto the sequences as indicated by their numbers in fig. 3.3.4.d.
• We have to mention that this figure represents the distribution of the monotonies when we pass from level 4 (k = 4), containing 33 monotonies, to level 5 (k = 5), containing 65 monotonies.
• The hatched surfaces represent the first 33 monotonies, which had already been distributed when level 4 was processed.
• For example, if initially there are only 53 monotonies, all the monotonies numbered between 54 and 65 will be treated as dummy monotonies.
• The monotonies are actually written at the end of the sequences, but it is more advantageous to imagine that they are written at the beginning, because in the sorting process the monotonies are presumed to be at the beginning of the sequences.

begin of the sequences →                        goal

f1: 34 35 38 42 46 51 56 61                   d[1] = 8
f2: 36 39 43 47 52 57 62                      d[2] = 7
f3: 37 40 44 48 53 58 63                      d[3] = 7
f4: 41 45 49 54 59 64                         d[4] = 6
f5: 50 55 60 65                               d[5] = 4

Fig. 3.3.4.d.
Horizontal distribution of monotonies

3.3.4.1 Implementing Polyphase Sort Algorithm
• We are now in a position to describe the algorithm in the form of a procedure called SelectSequence.
• The procedure is activated each time a monotony has been copied and a new sequence is to be selected for the next monotony.
• The procedure selects the new sequence onto which the next monotony will be copied, taking into account the perfect distribution of the monotonies on the source sequences.
• We assume the existence of a variable j denoting the index of the current destination sequence.
• The necessary data structures are described in [3.3.4.g]:
-----------------------------------------------------------
{Polyphase sort – data structures}
TYPE TypeSequence = RECORD
                      sequence: FILE OF TypeElement;
                      current: TypeElement;                [3.3.4.g]
                      endProc: boolean
                    END;
     NrSequence = 1..n;
VAR j: NrSequence;
    a,d: ARRAY[NrSequence] OF integer;
    level: integer;
-----------------------------------------------------------
• Arrays a and d store the ideal distribution numbers ai and the dummy distribution numbers di, respectively, corresponding to each sequence i.
• These arrays are initialized with the values ai = 1, di = 1 for i = 1,...,n-1 and an = 0, dn = 0 respectively.
• Variables j and level are initialized with the value 1.
• Each time the level increases, procedure SelectSequence calculates the values of the next row of the table from figure 3.3.4.c, namely a1(k),...,an-1(k).
• At the same time, the differences di = ai(k) - ai(k-1), which represent the next goal, are calculated.
• The algorithm is based on the fact that the di values decrease as the indices increase (fig. 3.3.4.d).
• The algorithm starts with level 1, not 0.
• The procedure ends by decrementing dj by 1. This operation stands for the replacement of a dummy monotony on sequence j by an actual monotony [3.3.4.h].
-----------------------------------------------------------
{Polyphase sort – procedure SelectSequence}
PROCEDURE SelectSequence;
VAR i: NrSequence;
    z: integer;
BEGIN
  IF d[j] < d[j+1] THEN
    j := j+1
  ELSE
    BEGIN
      IF d[j] = 0 THEN                                     [3.3.4.h]
        BEGIN
          level := level+1;
          z := a[1];
          FOR i := 1 TO n-1 DO
            BEGIN
              d[i] := z+a[i+1]-a[i];
              a[i] := z+a[i+1]
            END
        END;
      j := 1
    END; {ELSE}
  d[j] := d[j]-1
END; {SelectSequence}
-----------------------------------------------------------
• Assuming the availability of a routine to copy a monotony from the source sequence f0 onto F[j], we can formulate the initial distribution phase as follows, provided the source contains at least one monotony [3.3.4.i]:
-----------------------------------------------------------
{Polyphase sort – initial distribution of monotonies – refinement step 0}
REPEAT
  SelectSequence;                                          [3.3.4.i]
  CopyMonotony
UNTIL f0.endProc;
-----------------------------------------------------------
• We remember the effect encountered when distributing monotonies in the previously discussed natural merge algorithm: two monotonies arriving consecutively at the same destination may merge into a single one, which makes the assumed numbers of monotonies incorrect.
• There, the problem was solved by devising the natural merge sort algorithm such that its correctness does not depend on the number of monotonies.
• In the polyphase sort, however, we are particularly concerned with keeping track of the exact number of monotonies on each sequence.
• Consequently, we cannot afford to overlook the effect of such a coincidental merge.
• As a result, it becomes necessary to retain the key of the last item of the last monotony on each sequence.
• For this purpose, we introduce the array variable last: ARRAY[NrSequence] OF TypeKey.
• The next refinement step of the distribution algorithm appears in [3.3.4.j]:
-----------------------------------------------------------
{Polyphase sort – initial distribution of monotonies – refinement step 1}
REPEAT
  SelectSequence;                                          [3.3.4.j]
  IF last[j] <= f0.current.key THEN
    *continue the old monotony;
  CopyMonotony;
  last[j] := f0.current.key
UNTIL f0.endProc;
-----------------------------------------------------------
• A problem arises: last[j] is defined only after the first monotony has been copied onto sequence j.
• To solve it, at the beginning the distribution of the monotonies must be performed without inspecting last[j].
• The rest of the monotonies are distributed in conformity with [3.3.4.k].
• It is assumed that the assignment of last[j] is performed inside procedure CopyMonotony.
-----------------------------------------------------------
{Polyphase sort – initial distribution of monotonies – refinement step 2}
WHILE NOT f0.endProc DO
  BEGIN
    SelectSequence;
    IF last[j] <= f0.current.key THEN
      BEGIN {continue the old monotony}
        CopyMonotony;                                      [3.3.4.k]
        IF f0.endProc THEN
          d[j] := d[j]+1
        ELSE
          CopyMonotony
      END
    ELSE
      CopyMonotony
  END;
-----------------------------------------------------------
• The structure of the polyphase sort is quite similar to that of the balanced n-way merge sort:
• An external loop which merges monotonies until all the sources are exhausted.
• A loop nested in the previous one, which merges a single monotony from each source sequence.
• Finally, a third loop, nested in the previous one, which selects the keys and sends the involved items to the destination sequence.
• The principal differences with respect to balanced merging are the following:
• (1) Instead of n/2, there is only one output sequence in each pass.
• (2) Instead of switching n/2 input and n/2 output sequences after each pass, the sequences are rotated. This is achieved by using a sequence index map t.
• (3) The number of input sequences varies from monotony to monotony.
At the start of each monotony, it is determined from the counts di of dummy monotonies.
• If di > 0 for all i, then n-1 dummy monotonies are pseudo-merged into a single dummy monotony by merely incrementing the count dn of the output sequence.
• Otherwise, one monotony is merged from all sources with di = 0, and di is decremented for all other sequences, indicating that one dummy monotony was taken off.
• We denote the number of input sequences involved in a merge by k1.
• (4) It is impossible to detect the termination of a phase by testing the end-of-file status of the (n-1)-th sequence, because further merges involving dummy monotonies from that source might still be necessary.
• Instead, the theoretically necessary number of monotonies is determined from the coefficients ai.
• The coefficients ai(k) were computed during the distribution phase. They can now be recomputed backward.
• Taking into account all these observations, the merge phase of the polyphase sort algorithm is presented in [3.3.4.l]. We presume that:
• All the n-1 source sequences containing the initial monotonies have been reset.
• The sequence index array t is initialized with t[i] = i, for all i.
-----------------------------------------------------------
{Polyphase sort – merge phase}
REPEAT {merge from F[t[1]],...,F[t[n-1]] onto F[t[n]]}
  z := a[n-1]; d[n] := 0;
  Rewrite(F[t[n]].sequence);
  REPEAT {merge one monotony}
    {determine the number k1 of active input sequences}
    k1 := 0;
    FOR i := 1 TO n-1 DO
      IF d[i] > 0 THEN
        d[i] := d[i]-1
      ELSE
        BEGIN
          k1 := k1+1;                                      [3.3.4.l]
          td[k1] := t[i]
        END;
    IF k1 = 0 THEN
      d[n] := d[n]+1
    ELSE
      *merge a monotony from each of F[td[1]],...,F[td[k1]];
    z := z-1
  UNTIL z = 0;
  Reset(F[t[n]].sequence);
  *rotate the sequences in array t;
  *calculate a[i] for the next level;
  level := level-1
UNTIL level = 0;
{the sorted elements are on F[t[1]]}
-----------------------------------------------------------
• The program Polyphase Sort is similar to the balanced n-way merge sort.
• The actual merge operation is almost identical to that of the n-way merge sort, the only difference being that the sequence elimination algorithm is somewhat simpler.
• The rotation of the sequence index map t and of the corresponding counts di, as well as the down-level recomputation of the coefficients ai, are straightforward and detailed in [3.3.4.m].
• In fact, the polyphase sort algorithm is presented in its entirety in [3.3.4.m].
-----------------------------------------------------------
{Polyphase sort – final variant}
PROGRAM Polyphase; {polyphase sort with n sequences}
CONST n = 6; {number of sequences}                         [3.3.4.m]
TYPE TypeElement = RECORD
                     key: integer
                   END;
     TypeSequence = RECORD
                      sequence: FILE OF TypeElement;
                      current: TypeElement;
                      endProc: boolean
                    END;
     NrSequence = 1..n;
VAR dim,aleat,tmp: integer; {used for generating the initial sequence}
    buf: TypeElement;
    f0: TypeSequence; {input sequence containing random numbers}
    F: ARRAY[NrSequence] OF TypeSequence;

PROCEDURE List(VAR f: TypeSequence; n: NrSequence);
VAR z: integer;
BEGIN
  WriteLn(' sequence ',n);
  z := 0;
  WHILE NOT Eof(f.sequence) DO
    BEGIN
      Read(f.sequence,buf);
      Write(buf.key);
      z := z+1
    END;
  WriteLn;
  Reset(f.sequence)
END; {List}

PROCEDURE PolyphaseSort;
VAR i,j,mx,tn: NrSequence;
    k1,level: integer;
    a,d: ARRAY[NrSequence] OF integer;
      {a[j] - the ideal number of monotonies on sequence j}
      {d[j] - the number of dummy monotonies on sequence j}
    dn,x,min,z: integer;
    last: ARRAY[NrSequence] OF integer;
      {last[j] = the key of the last element of sequence j}
    t,td: ARRAY[NrSequence] OF NrSequence;
      {mapping arrays for the sequence numbers}

  PROCEDURE SelectSequence;
  VAR i: NrSequence;
      z: integer;
  BEGIN
    IF d[j] < d[j+1] THEN
      j := j+1
    ELSE
      BEGIN
        IF d[j] = 0 THEN
          BEGIN
            level := level+1;
            z := a[1];
            FOR i := 1 TO n-1 DO
              BEGIN
                d[i] := z+a[i+1]-a[i];
                a[i] := z+a[i+1]
              END
          END;
        j := 1
      END;
    d[j] := d[j]-1
  END; {SelectSequence}

  PROCEDURE CopyMonotony;
  VAR buf: TypeElement;
  BEGIN {copy a monotony from f0 onto sequence j}
    REPEAT
      buf := f0.current;
      IF Eof(f0.sequence) THEN
        f0.endProc := true
      ELSE
        Read(f0.sequence,f0.current);
      Write(F[t[j]].sequence,buf)
    UNTIL (buf.key > f0.current.key) OR f0.endProc;
    last[j] := buf.key
  END; {CopyMonotony}

BEGIN {initial distribution of monotonies}
  FOR i := 1 TO n DO t[i] := i;
  FOR i := 1 TO n-1 DO
    BEGIN
      a[i] := 1; d[i] := 1;
      Rewrite(F[i].sequence)
    END;
  level := 1; j := 1; a[n] := 0; d[n] := 0;
  REPEAT
    SelectSequence;
    CopyMonotony
  UNTIL f0.endProc OR (j = n-1);
  WHILE NOT f0.endProc DO
    BEGIN
      SelectSequence;
      IF last[j] <= f0.current.key THEN
        BEGIN {continue the old monotony}
          CopyMonotony;
          IF f0.endProc THEN
            d[j] := d[j]+1
          ELSE
            CopyMonotony
        END
      ELSE
        CopyMonotony
    END;
  FOR i := 1 TO n-1 DO
    BEGIN {rewind the sources and prime the look-ahead buffers}
      Reset(F[i].sequence);
      F[i].endProc := false;
      Read(F[i].sequence,F[i].current)
    END;
  REPEAT {merge from F[t[1]],...,F[t[n-1]] onto F[t[n]]}
    z := a[n-1]; d[n] := 0;
    Rewrite(F[t[n]].sequence);
    REPEAT {merge one monotony}
      k1 := 0;
      FOR i := 1 TO n-1 DO
        IF d[i] > 0 THEN
          d[i] := d[i]-1
        ELSE
          BEGIN
            k1 := k1+1;
            td[k1] := t[i]
          END;
      IF k1 = 0 THEN
        d[n] := d[n]+1
      ELSE
        BEGIN {merge one monotony from F[td[1]],...,F[td[k1]] onto F[t[n]]}
          REPEAT
            i := 1; mx := 1;
            min := F[td[1]].current.key;
            WHILE i < k1 DO
              BEGIN
                i := i+1;
                x := F[td[i]].current.key;
                IF x < min THEN
                  BEGIN
                    min := x; mx := i
                  END
              END;
            {F[td[mx]] holds the minimum element; move it onto F[t[n]]}
            buf := F[td[mx]].current;
            IF Eof(F[td[mx]].sequence) THEN
              F[td[mx]].endProc := true
            ELSE
              Read(F[td[mx]].sequence,F[td[mx]].current);
            Write(F[t[n]].sequence,buf);
            IF (buf.key > F[td[mx]].current.key) OR F[td[mx]].endProc THEN
              BEGIN {the monotony on this sequence is finished; omit it}
                td[mx] := td[k1];
                k1 := k1-1
              END
          UNTIL k1 = 0
        END;
      z := z-1
    UNTIL z = 0;
    Reset(F[t[n]].sequence);
    List(F[t[n]],t[n]);
    IF level > 1 THEN
      BEGIN {F[t[n]] becomes a source in the next pass: prime its buffer}
        F[t[n]].endProc := false;
        Read(F[t[n]].sequence,F[t[n]].current)
      END;
    {rotate the sequences}
    tn := t[n]; dn := d[n]; z := a[n-1];
    FOR i := n DOWNTO 2 DO
      BEGIN
        t[i] := t[i-1];
        d[i] := d[i-1];
        a[i] := a[i-1]-z
      END;
    t[1] := tn; d[1] := dn; a[1] := z;
    level := level-1
  UNTIL level = 0
  {the sorted elements are on F[t[1]]}
END; {PolyphaseSort}

BEGIN {generation of a random sequence of numbers}
  dim := 200; aleat := 7789;
  Rewrite(f0.sequence);
  REPEAT
    aleat := (131071*aleat) MOD 2147483647;
    tmp := aleat DIV 214784;
    buf.key := tmp;
    Write(f0.sequence,buf);
    dim := dim-1
  UNTIL dim = 0;
  Reset(f0.sequence);
  f0.endProc := false;
  Read(f0.sequence,f0.current);
  PolyphaseSort
END.
-----------------------------------------------------------

3.3.5. Conclusions
• The complexity of the presented external sorting methods makes it quite difficult to formulate general conclusions. At the same time, the performance analysis of this kind of algorithm is extremely complicated.
• Still, some observations can be formulated:
• (1) There is an intimate connection between an algorithm and its underlying data structure and, in particular, a very strong influence of the latter on the former.
• This is perfectly illustrated by the external sorting methods, which are completely different in approach from the internal sorting methods.
• (2) Generally, increasing the performance of an algorithm requires considerable sophistication, even when advanced and dedicated data structures are used.
• Paradoxically, the high-performance algorithms are more complicated, more difficult to understand, require more lines of code and more memory, and use advanced, specialized data structures.