Data Structures and Algorithms
Rabie A. Ramadan
[email protected]

Part II
Data Structures and Algorithms

Algorithms and Programs
• Algorithm: a mechanical procedure written in such a way that human beings can understand it (pseudocode)
• Program: a mechanical procedure written in such a way that a computer can execute it
Given an algorithm, we are led to ask:
• What is it supposed to do?
• Does it really do what it is supposed to do?
• How efficiently does it do it?

Computing the Running Time
Algorithm goals:
• Easy to understand, code, and debug.
• Makes efficient use of the computer's resources; in particular, runs as fast as possible.
The running time depends on factors such as:
• the input to the program: its size and its nature (e.g. the size of the array to be sorted)
• the quality of the code generated by the compiler used to create the object program
• the nature and speed of the instructions on the machine used to execute the program
• the time complexity of the algorithm underlying the program

Computing the Running Time
For an input of size n, T(n) is the running time, measured as the number of instructions executed.
• Worst case: the maximum number of instructions is executed
• Best case: the minimum number of instructions is executed
• The average case, Tavg(n), is usually the most representative measure of the running time

Big-Oh and Big-Omega Notation
Big O
• We say that T(n) is O(f(n)) if there are positive constants c and n0 such that T(n) <= c*f(n) whenever n >= n0.
• Defines an upper bound on the running time of an algorithm/program
Omega
• T(n) is Ω(g(n)) means that there exists a positive constant c such that T(n) >= c*g(n) infinitely often (for an infinite number of values of n).

Example
Find the complexity of the following equation:

Group Activity
Find the complexity

Procedure Calls
• What is a procedure?
• Count all of the calls to the procedure

Example
      function fact(n: integer): integer;
      { fact(n) computes n! }
      begin
(1)       if n <= 1 then
(2)           fact := 1
          else
(3)           fact := n * fact(n-1)
      end; { fact }
Let T(n) be the running time for fact(n). The running time for lines (1) and (2) is O(1); for line (3) it is O(1) + T(n-1). That is, T(n) = c + T(n-1) for n > 1 and T(1) = d for some constants c and d, which expands to T(n) = c(n-1) + d, so T(n) is O(n).

Basic Data Structures
• A way of organizing information
• There are many different data structures
• Different problems may require different data structures
• Each data structure has unique properties that make it well suited to give a certain view of the data.
• There are many different ways of creating the same data structure in a computer.

Our Objectives
• Show how data structures are represented in the computer
• Identify linear and nonlinear data structures
• Manipulate data structures with basic operations
• Compare different implementations of the same data structure

Computer Memory
• Every piece of data that is stored in a computer is kept in a memory cell with a specific address.
• The computer can store many different types of data in its memory.

Computer Memory (Cont.)
• The string 'apple' stored in the computer's memory might look like this.
• A stored list of names might look like this.

Computer Memory (Cont.)
Can we represent the following tree the same way? It does not make sense. Right!!

Stack (Call Back)
• Last-in, first-out (LIFO)
• Insert: PUSH
• Delete: POP

Queue (Call Back)
• First-in, first-out (FIFO)
• Insert: ENQUEUE
• Delete: DEQUEUE
• It has a head and a tail

Queue (Call Back)
Priority queue: an abstract data type that supports the following operations:
• Add an element to the queue with an associated priority
• Remove the element from the queue that has the highest priority, and return it
• O(1) to insert an element and O(n) to return an element (for example, when the queue is kept as an unsorted list)

Linked Lists
• Objects are arranged in a linear order.
• The order in a linked list is determined by a pointer in each object.
• Linked lists provide a simple, flexible representation for dynamic sets.

Linked List
• Could be a singly or doubly linked list
• Could be a circular list
• Sorted or unsorted
Linked list operations
• Searching: O(n)
• Inserting: O(1)
• Deleting: O(1); O(n) if sorted

Trees
• Binary tree: a list with two pointers, right and left

Complete Binary Tree
• If the height is h, the number of nodes is at most 2^(h+1) - 1.
• The missing nodes can only be at the bottom of the tree.

Rooted trees with unbounded branching
• The number of children is k
• The number of children is not known ahead of time
• The space problem appears when creating the unbounded tree
• The solution: left-child[x] points to the leftmost child of node x, and right-sibling[x] points to the sibling of x immediately to the right.

Heaps
A binary tree with the following properties:
• It is a complete binary tree; that is, each level of the tree is completely filled, except possibly the bottom level. At this level, it is filled from left to right.
• It satisfies the heap-order property: the data item stored in each node is greater than or equal to the data items stored in its children.

Heap Example
• Heap
• Not a heap: not complete
• Not a heap: complete, but does not satisfy the heap property

Heap Representation
• As an array
• An array A that represents a heap is an array with two attributes
  – length, the number of elements in the array
  – heap-size, the number of heap elements stored in the array
• Viewed as a binary tree and as an array: the root of the tree is stored at A[0], its left child at A[1], its right child at A[2], etc.

Heap Operations
• The height of a node in a tree is the number of edges on the longest simple downward path from the node to a leaf (i.e. the maximum depth from that node).
• The height of an n-element heap based on a binary tree is floor(log2 n).
• The basic operations on heaps run in time at most proportional to the height of the tree and thus take O(log n) time.
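The array layout described above (root at A[0], left child at A[1], right child at A[2]) can be sketched in Python as follows. This is only an illustration under the assumption of 0-based indexing; the helper names `parent`, `left_child`, `right_child`, `heap_height` and `is_max_heap` are hypothetical and do not come from the slides.

```python
def parent(i):
    """Index of the parent of node i (0-based array layout)."""
    return (i - 1) // 2


def left_child(i):
    """Index of the left child of node i."""
    return 2 * i + 1


def right_child(i):
    """Index of the right child of node i."""
    return 2 * i + 2


def heap_height(n):
    """Height of an n-element heap (n >= 1), i.e. floor(log2(n))."""
    return n.bit_length() - 1


def is_max_heap(a):
    """Check the heap-order property: every node is >= its children.

    Completeness is implied by storing the tree in a plain array,
    so only the ordering has to be checked.
    """
    for i in range(1, len(a)):
        if a[parent(i)] < a[i]:
            return False
    return True
```

For instance, `is_max_heap([16, 14, 10, 8, 7, 9, 3, 2, 4, 1])` holds, while a list such as `[16, 14, 10, 8, 15]` fails the check because 15 is larger than its parent 14.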
Maintaining the Heap Property
• One of the more basic heap operations is converting a complete binary tree to a heap. Such an operation is called "Heapify".
• Its inputs are an array A and an index i into the array.
• When Heapify is called, it is assumed that the binary trees rooted at LeftChild(i) and RightChild(i) are heaps, but that A[i] may be smaller than its children, thus violating the heap-order property.
• The function of Heapify is to let the value at A[i] "float down" in the heap so that the subtree rooted at index i becomes a heap.
• The action required from Heapify is as follows:

Sorting Algorithms
Algorithm         Complexity
Bubble sort       O(n^2)
Insertion sort    O(n^2)
Selection sort    O(n^2)
Shell sort        O(n^2)
Heap sort         O(n log n)
Merge sort        O(n log n)
Quick sort        O(n log n) on average

Bubble Sort
• The slowest of the sorts covered here
• Compares each item in the list with the item next to it, swapping them if required.
• The algorithm repeats this process until it makes a pass all the way through the list without swapping any items.

Efficiency
• Pros: Simplicity and ease of implementation.
• Cons: Horribly inefficient. O(n^2)

Heap Sort
• Build a heap out of the data set, given an array A[1..n].
• Since the elements in the subarray A[floor(n/2)+1 .. n] are all leaves, only the nodes at indices floor(n/2) down to 1 need Heapify to build the heap.
• Use Heapify to sort the array.

Heap Sort
• Pros: In-place and non-recursive, making it a good choice for extremely large data sets. O(n log n)
• Cons: Typically slower than the merge and quick sorts.

Insertion Sort
• The basic form requires two lists; an in-place sort is used to save space.
• Based on the technique used by card players to arrange a hand of cards
  – The player keeps the cards that have been picked up so far in sorted order
  – When the player picks up a new card, the player makes room for the new card and then inserts it in its proper place

Analysis
• Pros: Relatively simple and easy to implement. O(n^2)
• Cons: Inefficient for large lists.

Merge Sort
• A divide-and-conquer algorithm
• The list to be sorted is split into two equal halves, which are placed in separate arrays.
• Each array is recursively sorted, then the two are merged back together to form the final sorted list.

Example

Analysis
• Pros: Marginally faster than the heap sort for larger sets. O(n log n)
• Cons: At least twice the memory requirements of the other sorts; recursive.

Selection Sort
• Selects the smallest unsorted item remaining in the list,
• then swaps it with the item in the next position to be filled.

Example:

Analysis
• Pros: Simple and easy to implement. O(n^2)
• Cons: Inefficient for large lists. It is so similar to the more efficient insertion sort that the insertion sort should be used in its place.

Shell Sort
• Shell sort works by comparing elements that are distant from each other rather than adjacent elements in the array or list.
• Shell sort makes multiple passes through a list and sorts a number of equally sized sets using the insertion sort.

Shellsort Examples
Sort: 18 32 12 5 38 33 16 2
8 numbers to be sorted; Shell's increment will be floor(n/2)
* floor(8/2) = floor(4) = 4
Increment 4 (each column is one group to be sorted):
   1   2   3   4
  18  32  12   5
  38  33  16   2
Step 1) Only look at 18 and 38 and sort them in order; 18 and 38 stay in their current positions because they are in order.
Step 2) Only look at 32 and 33 and sort them in order; 32 and 33 stay in their current positions because they are in order.
Step 3) Only look at 12 and 16 and sort them in order; 12 and 16 stay in their current positions because they are in order.
Step 4) Only look at 5 and 2 and sort them in order; 2 and 5 need to be switched to be in order.
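The gapped passes of this example can be sketched in Python as below. It is a minimal sketch, assuming Shell's original increments floor(n/2), floor(n/4), ..., 1 as used in the example; the function name `shell_sort` and the `passes` trace it returns are hypothetical additions for illustration.

```python
def shell_sort(items):
    """Shell sort with increments floor(n/2), floor(n/4), ..., 1.

    Returns the sorted list together with a trace of the list
    after each increment's pass.
    """
    data = list(items)          # work on a copy, leave the input untouched
    passes = []
    gap = len(data) // 2
    while gap > 0:
        # Insertion sort on each group of elements that are `gap` apart.
        for i in range(gap, len(data)):
            current = data[i]
            j = i
            while j >= gap and data[j - gap] > current:
                data[j] = data[j - gap]   # shift the larger element up
                j -= gap
            data[j] = current
        passes.append((gap, list(data)))
        gap //= 2
    return data, passes


result, passes = shell_sort([18, 32, 12, 5, 38, 33, 16, 2])
for gap, snapshot in passes:
    print("increment", gap, ":", snapshot)
```

With the eight numbers of the example, the first entry of `passes` is the list after the increment-4 pass, in which only 5 and 2 have changed places.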
Shellsort Examples (cont.)
Sort: 18 32 12 5 38 33 16 2
Resulting numbers after the increment 4 pass: 18 32 12 2 38 33 16 5
* floor(4/2) = floor(2) = 2
Increment 2 (each column is one group to be sorted):
   1   2
  18  32
  12   2
  38  33
  16   5
Step 1) Look at 18, 12, 38, 16 and sort them into their appropriate locations: 12 32 16 2 18 33 38 5
Step 2) Look at 32, 2, 33, 5 and sort them into their appropriate locations: 12 2 16 5 18 32 38 33

Shellsort Examples (cont.)
Sort: 18 32 12 5 38 33 16 2
* floor(2/2) = floor(1) = 1
Increment 1:
  Before: 12 2 16 5 18 32 38 33
  After:  2 5 12 16 18 32 33 38
The last increment or phase of Shellsort is basically an insertion sort.

Analysis
• Pros: Efficient for medium-size lists. O(n^2)
• Cons: Somewhat complex algorithm, not nearly as efficient as the merge, heap, and quick sorts.

Quick Sort
• If there are one or fewer elements in the array to be sorted, return immediately.
• Pick an element in the array to serve as a "pivot" point.
• Split the array into two parts: one with elements larger than the pivot and the other with elements smaller than the pivot.
• Recursively repeat the algorithm for both parts of the original array.

Example

Analysis
• Its complexity is affected by the choice of pivot point.
• The worst-case efficiency of the quick sort is O(n^2).
• As long as the pivot point is chosen randomly, the expected complexity is O(n log n).
• Pros: Extremely fast.
• Cons: Very complex algorithm, massively recursive.

Bucket Sort
• It assumes that the input is generated by a random process that distributes elements uniformly over the interval [0, 1).
• Divide the interval [0, 1) into n equal-sized subintervals, or buckets, and then distribute the n input numbers into the buckets.
• Simply sort the numbers in each bucket and then go through the buckets in order, listing the elements in each.
• Expected running time under the uniform-input assumption: O(n)

Example
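The bucket sort steps described above can be sketched in Python as follows. This is a minimal sketch under the slide's assumption that every input lies in [0, 1); the function name `bucket_sort` is hypothetical, and using Python's built-in `list.sort` for the per-bucket step is an assumption, since the slides only say to "simply sort" each bucket.

```python
def bucket_sort(values):
    """Bucket sort for numbers assumed to lie in the interval [0, 1)."""
    n = len(values)
    if n == 0:
        return []

    # One bucket per input element: bucket k covers [k/n, (k+1)/n).
    buckets = [[] for _ in range(n)]
    for x in values:
        if not 0 <= x < 1:
            raise ValueError("bucket_sort expects values in [0, 1)")
        # min() guards against floating-point rounding pushing the index to n.
        index = min(int(n * x), n - 1)
        buckets[index].append(x)

    # Sort each bucket (assumption: built-in sort), then list buckets in order.
    result = []
    for bucket in buckets:
        bucket.sort()
        result.extend(bucket)
    return result


sample = [0.78, 0.17, 0.39, 0.26, 0.72, 0.94, 0.21, 0.12, 0.23, 0.68]
print(bucket_sort(sample))
```

When the inputs really are spread uniformly, each bucket holds only a few elements on average, which is where the linear expected running time comes from; heavily clustered inputs would put most of the work into a single bucket.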