LECTURE-1

Course Description: Introduction to the design and analysis of algorithms; growth of functions; recurrences and their solution by the substitution, recursion-tree and master methods; merge sort, quick sort and binary search; divide-and-conquer algorithms. The heap sort algorithm and priority queues. Dynamic programming algorithms (matrix-chain multiplication, elements of dynamic programming, longest common subsequence). Greedy algorithms (assembly-line scheduling, activity-selection problem, elements of the greedy strategy, fractional knapsack problem, Huffman codes). Data structures for disjoint sets: disjoint-set operations, linked-list representation, disjoint-set forests. Graph algorithms: breadth-first and depth-first search, minimum spanning trees (Kruskal's and Prim's algorithms), single-source shortest paths (Bellman-Ford and Dijkstra's algorithms), all-pairs shortest paths (Floyd-Warshall algorithm). Backtracking, branch and bound. String matching (Rabin-Karp algorithm), NP-completeness and reducibility, approximation algorithms (vertex-cover problem, travelling salesman problem).

Course Objective: In this subject students will learn the following:
1. To become familiar with the basic concepts of algorithms and their use.
2. To learn how to design an algorithm in detail and how to analyse it.
3. To learn about the different algorithm-design techniques.
4. To understand the divide-and-conquer technique.
5. To understand the dynamic programming technique.
6. To understand greedy algorithms.
7. To learn about data compression using Huffman coding.
8. To learn how to compute shortest paths.
9. To understand the branch-and-bound method.
10. To understand approximation algorithms.

Course Outcome:
a. Students will know which problems are solved by which algorithms.
b. Students will understand searching techniques, which help to search data from a database efficiently.
c. Students will understand how to reduce the number of operations in matrix multiplication.
d. Students will understand how to find the longest common subsequence of two given strings, with applications such as DNA comparison.
e. Students will know how to compress a data file.
f. Students will know how to find shortest paths in the single-source and all-pairs cases.
g. Students will know how to execute the maximum number of activities in a fixed period of time out of N activities, as arises in real life.
h. Students will know how to apply cutting of raw materials in real-world decision-making processes.
i. Students will know how to solve the N-queens problem.
j. Students will know how a salesman can cover all the cities in a geographic area, visiting each city exactly once, and return to the starting point.
k. Graduates will develop confidence for self-education and the ability for lifelong learning.
l. Graduates can participate and succeed in competitive examinations such as GATE, GRE, DRDO, and software-company recruitment tests.
m. Graduates will appreciate the impact of engineering solutions.

LECTURE-2

Introduction to algorithms
The word "algorithm" derives from the name of Muhammad ibn Musa al-Khwarizmi, a Persian mathematician who worked on algebra, geometry and astronomy. An algorithm is a sequence of unambiguous instructions for solving a problem, i.e., for obtaining a required output for any legitimate input in a finite amount of time.
Alternatively, an algorithm is defined as a formula or a finite set of steps for solving a particular problem; we can also say that it is a well-defined computational procedure that takes some values as input and produces some values as output. Unlike programs, algorithms do not depend on a particular programming language, machine, system, or compiler. They are mathematical entities, which can be thought of as running on an idealized computer with an infinite random-access memory and an unlimited word size.

Every algorithm has the following criteria:
• INPUT – Zero or more quantities are externally supplied. All algorithms should have some input; the logic of the algorithm works on this input to give the desired result.
• OUTPUT – At least one output should be produced by the algorithm based on the input given.
• DEFINITENESS – Every step of the algorithm should be clear and unambiguous. Ambiguity means doubtfulness or uncertainty of interpretation.
• FINITENESS – Every algorithm should have a proper end. If we trace out the instructions of an algorithm, then for all cases the algorithm terminates after a finite number of steps.
• EFFECTIVENESS – Every step in the algorithm should be easy to understand and implementable in any programming language.

Programming is a very complex task, and there are a number of aspects of programming that make it so. The first is that most programming projects are very large, requiring the coordinated efforts of many people (this is the topic of courses like software engineering). The next is that many programming projects involve storing and accessing large quantities of data efficiently (the topic of courses on data structures and databases). The last is that many programming projects involve solving complex computational problems, for which simplistic or naive solutions may not be efficient enough. The complex problems may involve numerical data (the subject of courses on numerical analysis), but often they involve discrete data. This is where the topic of algorithm design and analysis is important.

Algorithms as a technology
Suppose computers were infinitely fast and computer memory was free. Would you have any reason to study algorithms? The answer is yes, if for no other reason than that you would still like to demonstrate that your solution method terminates and does so with the correct answer. If computers were infinitely fast, any correct method for solving a problem would do. You would probably want your implementation to be within the bounds of good software engineering practice (i.e., well designed and documented), but you would most often use whichever method was the easiest to implement. Of course, computers may be fast, but they are not infinitely fast, and memory may be cheap, but it is not free. Computing time is therefore a bounded resource, and so is space in memory. These resources should be used wisely, and algorithms that are efficient in terms of time or space will help you do so.

Issues in algorithm design
Algorithms are mathematical objects (in contrast to the much more concrete notion of a computer program implemented in some programming language and executing on some machine). As such, we can reason about the properties of algorithms mathematically. When designing an algorithm there are two fundamental issues to be considered: correctness and efficiency. It is important to justify an algorithm's correctness mathematically.
For very complex algorithms, this typically requires a careful mathematical proof, which may require the proof of many lemmas and properties of the solution upon which the algorithm relies. Establishing efficiency is a much more involved endeavour. Intuitively, an algorithm's efficiency is a function of the amount of computational resources it requires, measured typically as execution time and the amount of space, or memory, that the algorithm uses. The amount of computational resources can be a complex function of the size and structure of the input set. In order to reduce matters to their simplest form, it is common to consider efficiency as a function of input size.

Some of the techniques used in designing algorithms are the brute force approach, the divide and conquer approach, the greedy approach, and the dynamic programming approach.
Brute force approach: one of the most widely applicable algorithmic techniques, covering a wide range of problems. The algorithm follows the problem statement directly and considers the complete range of input values in its search for a solution.
Divide and conquer approach: a technique which divides the problem into smaller units, solves each smaller unit, and then combines the solutions of the smaller units to obtain the final solution of the problem.
Greedy approach: a technique used to find an optimal solution to the given problem by making a sequence of locally best choices.
Dynamic programming approach: a technique which divides the problem into subproblems and then solves them; the subproblems are overlapping in nature.

Issues in algorithm analysis
Any algorithm, when implemented, uses the computer's primary memory (RAM) to hold the program and data and the CPU for execution. To analyze an algorithm we need to determine how much TIME the algorithm takes when implemented and how much SPACE it occupies in primary memory. Analysis of algorithms needs good mathematical skills. It enables
1. Quantitative studies (studies with facts and numbers) of algorithms.
2. Knowledge of whether the software will meet the qualitative requirements.
Before starting the analysis of an algorithm, it is necessary to know the kind of computer on which the algorithm is expected to run. If an algorithm is assumed to run on a standard computer, analysis of the algorithm maps to the number of operations executed by it.

Analysis of an algorithm is concerned with three cases:
1. Worst case complexity: the function defined by the maximum number of steps taken on any instance of size n.
2. Best case complexity: the function defined by the minimum number of steps taken on any instance of size n.
3. Average case complexity: the function defined by the average number of steps taken over instances of size n.

Each of these defines a numerical function.
a. Time complexity: The time complexity of an algorithm is the amount of computer time it needs to run to completion. The time T(p) taken by a program p is the sum of the compile time and the run time. The time complexity of an algorithm can depend on various factors, such as
1. The input to the program.
2. The quality of code generated by the compiler used to create the object program.
3. The nature and speed of the instructions on the machine used to execute the program.
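To make the three cases concrete, here is a small illustrative sketch (the function name linearSearch and the assumption that every position is equally likely to hold the key are choices made for this example, not part of the notes):

#include <vector>

// Returns the index of key in data, or -1 if key is not present.
// Best case: key is at position 0, so the loop body runs once      -> O(1).
// Worst case: key is absent or at the last position, n iterations  -> O(n).
// Average case: if the key is equally likely to be at any of the n
// positions, the loop runs about (n + 1) / 2 times on average      -> O(n).
int linearSearch(const std::vector<int>& data, int key) {
    for (std::size_t i = 0; i < data.size(); ++i) {
        if (data[i] == key) {
            return static_cast<int>(i);
        }
    }
    return -1;
}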
The time complexity is denoted by tp, meaning the tp of an algorithm is the number of steps taken by the algorithm to compute the function it was written for. The number of steps is itself a function of the instance characteristics. There are two ways to determine the number of steps needed to solve a particular problem. In the first method we introduce a variable count into the program; this is a global variable with initial value zero that is incremented as statements execute. In the second method we build a table recording the total number of steps contributed by each statement. Here we first determine the number of steps per execution (s/e) of the statement and the total number of times each statement is executed (its frequency). The steps per execution of a statement is the amount by which the count changes as a result of one execution of that statement.

b. Space complexity: The space complexity of an algorithm is the amount of memory it needs to run to completion. The space requirement s(p) of any algorithm p may therefore be written as s(p) = c + sp, where c is a constant and sp is the instance characteristic. Generally we concentrate solely on estimating sp, deciding which instance characteristic to use to measure the space requirements. The space needed by any algorithm is the sum of the following components:
1. A fixed part that is independent of the characteristics (e.g. number, size) of the inputs and outputs. This part includes the space for the code, space for simple variables and fixed-size component variables, space for constants, and so on.
2. A variable part that consists of the space needed by component variables whose size depends on the particular problem instance being solved, the space needed by referenced variables, the recursion stack space, etc.

LECTURE-3

Asymptotic notation
There are notations for determining the order of magnitude of algorithms during a priori analysis, i.e., analysis of an algorithm done before running it on any computer. These notations are called asymptotic notations. "Asymptotic" relates to, or is of the nature of, an asymptote, which is a line whose distance to a given curve tends to zero; an asymptote may or may not intersect its associated curve. Asymptotic notation describes an algorithm's efficiency and performance in a meaningful way, and describes the behaviour of the time or space complexity for large instance characteristics. Asymptotic notation provides us with a way to simplify the functions that arise in analyzing algorithm running times by ignoring constant factors and concentrating on the trends for large values of n.
Ignore constant factors: multiplicative constant factors are ignored; constant factors appearing in exponents, however, cannot be ignored.
Focus on large n: asymptotic analysis means that we consider trends for large values of n; thus the fastest-growing term of a function of n is the only one that needs to be considered.
The asymptotic running time of an algorithm is defined in terms of functions whose domains are the set of natural numbers or the set of real numbers. When we look at input sizes large enough to make only the order of growth of the running time relevant, we are studying the asymptotic efficiency of algorithms. That is, we are concerned with how the running time of an algorithm increases with the size of the input in the limit, as the size of the input increases without bound. Usually, an algorithm that is asymptotically more efficient will be the best choice for all but very small inputs.
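As a small worked illustration of this last point (the constants 100 and 2 are chosen only for the example and are not from the notes): suppose one algorithm performs exactly 100n operations and a competing algorithm performs 2n² operations. Then

    2n² < 100n  if and only if  n < 50,

so the 2n² algorithm wins only on inputs of size less than 50; for every n ≥ 50 the 100n algorithm is faster, and its advantage grows without bound as n increases.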
The notations we use to describe the asymptotic running time of an algorithm are defined in terms of functions whose domains are the set of natural numbers N = {0, 1, 2, ...}. Such notations are convenient for describing the worst-case running-time function T(n), which is usually defined only on integer input sizes.

Θ-notation
For a given function g(n), we denote by Θ(g(n)) the set of functions
Θ(g(n)) = {f(n) : there exist positive constants c1, c2, and n0 such that 0 ≤ c1 g(n) ≤ f(n) ≤ c2 g(n) for all n ≥ n0}.
A function f(n) belongs to the set Θ(g(n)) if there exist positive constants c1 and c2 such that it can be "sandwiched" between c1 g(n) and c2 g(n) for sufficiently large n. Because Θ(g(n)) is a set, we could write "f(n) ∈ Θ(g(n))" to indicate that f(n) is a member of Θ(g(n)). Instead, we will usually write "f(n) = Θ(g(n))" to express the same notion. The definition of Θ(g(n)) requires that every member f(n) ∈ Θ(g(n)) be asymptotically nonnegative, that is, that f(n) be nonnegative whenever n is sufficiently large. (An asymptotically positive function is one that is positive for all sufficiently large n.) Consequently, the function g(n) itself must be asymptotically nonnegative, or else the set Θ(g(n)) is empty. We shall therefore assume that every function used within Θ-notation is asymptotically nonnegative.

Let us briefly justify this intuition by using the formal definition to show that (1/2)n² − 3n = Θ(n²). To do so, we must determine positive constants c1, c2, and n0 such that c1 n² ≤ (1/2)n² − 3n ≤ c2 n² for all n ≥ n0. Dividing by n² yields c1 ≤ 1/2 − 3/n ≤ c2. The right-hand inequality can be made to hold for any value of n ≥ 1 by choosing c2 ≥ 1/2. Likewise, the left-hand inequality can be made to hold for any value of n ≥ 7 by choosing c1 ≤ 1/14. Thus, by choosing c1 = 1/14, c2 = 1/2, and n0 = 7, we can verify that (1/2)n² − 3n = Θ(n²). Certainly, other choices for the constants exist, but the important thing is that some choice exists. Note that these constants depend on the function (1/2)n² − 3n; a different function belonging to Θ(n²) would usually require different constants.

We can also use the formal definition to verify that 6n³ ≠ Θ(n²). Suppose for the purpose of contradiction that c2 and n0 exist such that 6n³ ≤ c2 n² for all n ≥ n0. But then n ≤ c2/6, which cannot possibly hold for arbitrarily large n, since c2 is constant. Intuitively, the lower-order terms of an asymptotically positive function can be ignored in determining asymptotically tight bounds because they are insignificant for large n. A tiny fraction of the highest-order term is enough to dominate the lower-order terms. Thus, setting c1 to a value that is slightly smaller than the coefficient of the highest-order term and setting c2 to a value that is slightly larger permits the inequalities in the definition of Θ-notation to be satisfied. The coefficient of the highest-order term can likewise be ignored, since it only changes c1 and c2 by a constant factor equal to the coefficient. Since any constant is a degree-0 polynomial, we can express any constant function as Θ(n⁰), or Θ(1). We shall often use the notation Θ(1) to mean either a constant or a constant function with respect to some variable.

O-notation
Θ-notation asymptotically bounds a function from above and below. When we have only an asymptotic upper bound, we use O-notation.
For a given function g(n), we denote by O(g(n)) (pronounced "big-oh of g of n" or sometimes just "oh of g of n") the set of functions
O(g(n)) = {f(n) : there exist positive constants c and n0 such that 0 ≤ f(n) ≤ c g(n) for all n ≥ n0}.
We use O-notation to give an upper bound on a function, to within a constant factor. We write f(n) = O(g(n)) to indicate that a function f(n) is a member of the set O(g(n)). Note that f(n) = Θ(g(n)) implies f(n) = O(g(n)), since Θ-notation is a stronger notion than O-notation. Using O-notation, we can often describe the running time of an algorithm merely by inspecting the algorithm's overall structure. When we say "the running time is O(n²)," we mean that there is a function f(n) that is O(n²) such that for any value of n, no matter what particular input of size n is chosen, the running time on that input is bounded from above by the value f(n). Equivalently, we mean that the worst-case running time is O(n²).

Ω-notation
Just as O-notation provides an asymptotic upper bound on a function, Ω-notation provides an asymptotic lower bound. For a given function g(n), we denote by Ω(g(n)) (pronounced "big-omega of g of n" or sometimes just "omega of g of n") the set of functions
Ω(g(n)) = {f(n) : there exist positive constants c and n0 such that 0 ≤ c g(n) ≤ f(n) for all n ≥ n0}.

(Graphic examples of the Θ, O, and Ω notations. In each part, the value of n0 shown is the minimum possible value; any greater value would also work. (a) Θ-notation bounds a function to within constant factors: we write f(n) = Θ(g(n)) if there exist positive constants n0, c1, and c2 such that to the right of n0, the value of f(n) always lies between c1 g(n) and c2 g(n) inclusive. (b) O-notation gives an upper bound for a function to within a constant factor: we write f(n) = O(g(n)) if there are positive constants n0 and c such that to the right of n0, the value of f(n) always lies on or below c g(n). (c) Ω-notation gives a lower bound for a function to within a constant factor: we write f(n) = Ω(g(n)) if there are positive constants n0 and c such that to the right of n0, the value of f(n) always lies on or above c g(n).)

o-notation
The asymptotic upper bound provided by O-notation may or may not be asymptotically tight. The bound 2n² = O(n²) is asymptotically tight, but the bound 2n = O(n²) is not. We use o-notation to denote an upper bound that is not asymptotically tight. We formally define o(g(n)) ("little-oh of g of n") as the set
o(g(n)) = {f(n) : for any positive constant c > 0, there exists a constant n0 > 0 such that 0 ≤ f(n) < c g(n) for all n ≥ n0}.
For example, 2n = o(n²), but 2n² ≠ o(n²). The definitions of O-notation and o-notation are similar. The main difference is that in f(n) = O(g(n)), the bound 0 ≤ f(n) ≤ c g(n) holds for some constant c > 0, but in f(n) = o(g(n)), the bound 0 ≤ f(n) < c g(n) holds for all constants c > 0. Intuitively, in o-notation the function f(n) becomes insignificant relative to g(n) as n approaches infinity; that is, lim (n→∞) f(n)/g(n) = 0.

ω-notation
By analogy, ω-notation is to Ω-notation as o-notation is to O-notation. We use ω-notation to denote a lower bound that is not asymptotically tight. One way to define it is: f(n) ∈ ω(g(n)) if and only if g(n) ∈ o(f(n)). Formally, however, we define ω(g(n)) ("little-omega of g of n") as the set
ω(g(n)) = {f(n) : for any positive constant c > 0, there exists a constant n0 > 0 such that 0 ≤ c g(n) < f(n) for all n ≥ n0}.
For example, n²/2 = ω(n), but n²/2 ≠ ω(n²). The relation f(n) = ω(g(n)) implies that lim (n→∞) f(n)/g(n) = ∞, if the limit exists.
That is, f(n) becomes arbitrarily large relative to g(n) as n approaches infinity.

Comparison of functions
Many of the relational properties of real numbers (such as transitivity and reflexivity) apply to asymptotic comparisons as well. For the following, assume that f(n) and g(n) are asymptotically positive.

Asymptotic notations (summary)
Definition ("at most"): f(n) = O(g(n)) iff there exist constants c > 0 and n0 such that |f(n)| ≤ c|g(n)| for all n ≥ n0.
Example: f(n) = 3n² + 2, g(n) = n²; with n0 = 2 and c = 4, f(n) = O(n²). Similarly, f(n) = n³ + n = O(n³).
Definition ("at least", a lower bound): f(n) = Ω(g(n)) iff there exist constants c > 0 and n0 such that |f(n)| ≥ c|g(n)| for all n ≥ n0.
Definition: f(n) = Θ(g(n)) iff there exist constants c1, c2 > 0 and n0 such that c1|g(n)| ≤ |f(n)| ≤ c2|g(n)| for all n ≥ n0.
Definition: f(n) = o(g(n)) iff lim (n→∞) f(n)/g(n) = 1; e.g. f(n) = 3n² + n = o(3n²).

The following table shows typical values of common time-complexity functions as the problem size grows:

Problem size n:   10         10^2        10^3      10^4
log2 n            3.3        6.6         10        13.3
n                 10         10^2        10^3      10^4
n log2 n          0.33x10^2  0.7x10^3    10^4      1.3x10^5
n^2               10^2       10^4        10^6      10^8
2^n               1024       1.3x10^30   >10^100   >10^100
n!                3x10^6     >10^100     >10^100   >10^100

Figure: rate of growth of common computing time functions, in increasing order:
O(1) < O(log n) < O(n) < O(n log n) < O(n²) < O(n³) < O(2ⁿ) < O(n!) < O(nⁿ)

Example: log(n!) = log(n(n−1)···1) = log 2 + log 3 + ··· + log n ≥ ∫(1 to n) log x dx = log e · ∫(1 to n) ln x dx = log e · [x ln x − x](1 to n) = log e · (n ln n − n + 1) = n log n − n log e + 1.44 ≈ n log n − 1.44n = Θ(n log n).

LECTURE-4

Recurrences: solving recurrences by the substitution method
When an algorithm contains a recursive call to itself, its running time can often be described by a recurrence. A recurrence is an equation or inequality that describes a function in terms of its value on smaller inputs. For example, the worst-case running time T(n) of the MERGE-SORT procedure can be described by the recurrence
T(n) = Θ(1) if n = 1, and T(n) = 2T(n/2) + Θ(n) if n > 1,
whose solution is T(n) = Θ(n lg n).
This chapter offers three methods for solving recurrences, that is, for obtaining asymptotic "Θ" or "O" bounds on the solution. In the substitution method, we guess a bound and then use mathematical induction to prove our guess correct. The recursion-tree method converts the recurrence into a tree whose nodes represent the costs incurred at various levels of the recursion; we use techniques for bounding summations to solve the recurrence. The master method provides bounds for recurrences of the form T(n) = aT(n/b) + f(n), where a ≥ 1, b > 1, and f(n) is a given function; it requires memorization of three cases, but once you do that, determining asymptotic bounds for many simple recurrences is easy.

Technicalities
In practice, we neglect certain technical details when we state and solve recurrences. A good example of a detail that is often glossed over is the assumption of integer arguments to functions. Normally, the running time T(n) of an algorithm is only defined when n is an integer, since for most algorithms the size of the input is always an integer. For example, the recurrence describing the worst-case running time of MERGE-SORT is really
T(n) = Θ(1) if n = 1, and T(n) = T(⌈n/2⌉) + T(⌊n/2⌋) + Θ(n) if n > 1.
Boundary conditions represent another class of details that we typically ignore. Since the running time of an algorithm on a constant-sized input is a constant, the recurrences that arise from the running times of algorithms generally have T(n) = Θ(1) for sufficiently small n. Consequently, for convenience, we shall generally omit statements of the boundary conditions of recurrences and assume that T(n) is constant for small n. For example, we normally state the merge sort recurrence simply as T(n) = 2T(n/2) + Θ(n), without explicitly giving values for small n. The reason is that although changing the value of T(1) changes the solution to the recurrence, the solution typically doesn't change by more than a constant factor, so the order of growth is unchanged.
When we state and solve recurrences, we often omit floors, ceilings, and boundary conditions. We forge ahead without these details and later determine whether or not they matter. They usually don't, but it is important to know when they do. Experience helps, and so do some theorems stating that these details don't affect the asymptotic bounds of many recurrences encountered in the analysis of algorithms.

The substitution method
The substitution method for solving recurrences entails two steps:
1. Guess the form of the solution.
2. Use mathematical induction to find the constants and show that the solution works.
The name comes from the substitution of the guessed answer for the function when the inductive hypothesis is applied to smaller values. This method is powerful, but it obviously can be applied only in cases when it is easy to guess the form of the answer. The substitution method can be used to establish either upper or lower bounds on a recurrence.

As an example, let us determine an upper bound on the recurrence
T(n) = 2T(⌊n/2⌋) + n,
which is similar to the merge sort recurrences above. We guess that the solution is T(n) = O(n lg n). Our method is to prove that T(n) ≤ cn lg n for an appropriate choice of the constant c > 0. We start by assuming that this bound holds for ⌊n/2⌋, that is, that T(⌊n/2⌋) ≤ c⌊n/2⌋ lg(⌊n/2⌋). Substituting into the recurrence yields
T(n) ≤ 2(c⌊n/2⌋ lg(⌊n/2⌋)) + n ≤ cn lg(n/2) + n = cn lg n − cn lg 2 + n = cn lg n − cn + n ≤ cn lg n,
where the last step holds as long as c ≥ 1.

Mathematical induction now requires us to show that our solution holds for the boundary conditions. Typically, we do so by showing that the boundary conditions are suitable as base cases for the inductive proof. For this recurrence, we must show that we can choose the constant c large enough so that the bound T(n) ≤ cn lg n works for the boundary conditions as well. This requirement can sometimes lead to problems. Let us assume, for the sake of argument, that T(1) = 1 is the sole boundary condition of the recurrence. Then for n = 1, the bound T(n) ≤ cn lg n yields T(1) ≤ c·1·lg 1 = 0, which is at odds with T(1) = 1. Consequently, the base case of our inductive proof fails to hold.

This difficulty in proving an inductive hypothesis for a specific boundary condition can be easily overcome. We take advantage of the fact that asymptotic notation only requires us to prove T(n) ≤ cn lg n for n ≥ n0, where n0 is a constant of our choosing. The idea is to remove the difficult boundary condition T(1) = 1 from consideration in the inductive proof. Observe that for n > 3, the recurrence does not depend directly on T(1). Thus, we can use T(2) and T(3) instead of T(1) as the base cases in the inductive proof, letting n0 = 2. Note that we make a distinction between the base case of the recurrence (n = 1) and the base cases of the inductive proof (n = 2 and n = 3). We derive from the recurrence that T(2) = 4 and T(3) = 5. The inductive proof that T(n) ≤ cn lg n for some constant c ≥ 1 can now be completed by choosing c large enough so that T(2) ≤ c·2 lg 2 and T(3) ≤ c·3 lg 3. As it turns out, any choice of c ≥ 2 suffices for the base cases n = 2 and n = 3 to hold. For most of the recurrences we shall examine, it is straightforward to extend boundary conditions to make the inductive assumption work for small n.

Making a good guess
Unfortunately, there is no general way to guess the correct solutions to recurrences. Guessing a solution takes experience and, occasionally, creativity.
Fortunately, though, there are some heuristics that can help you become a good guesser. You can also use recursion trees to generate good guesses. If a recurrence is similar to one you have seen before, then guessing a similar solution is reasonable. As an example, consider the recurrence
T(n) = 2T(⌊n/2⌋ + 17) + n,
which looks difficult because of the added "17" in the argument to T on the right-hand side. Intuitively, however, this additional term cannot substantially affect the solution to the recurrence. When n is large, the difference between T(⌊n/2⌋) and T(⌊n/2⌋ + 17) is not that large: both cut n nearly evenly in half. Consequently, we make the guess that T(n) = O(n lg n), which you can verify as correct by using the substitution method.

Another way to make a good guess is to prove loose upper and lower bounds on the recurrence and then reduce the range of uncertainty. For example, we might start with a lower bound of T(n) = Ω(n) for the recurrence T(n) = 2T(⌊n/2⌋) + n, since we have the term n in the recurrence, and we can prove an initial upper bound of T(n) = O(n²). Then we can gradually lower the upper bound and raise the lower bound until we converge on the correct, asymptotically tight solution of T(n) = Θ(n lg n).

Subtleties
There are times when you can correctly guess at an asymptotic bound on the solution of a recurrence, but somehow the math doesn't seem to work out in the induction. Usually, the problem is that the inductive assumption isn't strong enough to prove the detailed bound. When you hit such a snag, revising the guess by subtracting a lower-order term often permits the math to go through. Consider the recurrence
T(n) = T(⌊n/2⌋) + T(⌈n/2⌉) + 1.
We guess that the solution is O(n), and we try to show that T(n) ≤ cn for an appropriate choice of the constant c. Substituting our guess into the recurrence, we obtain
T(n) ≤ c⌊n/2⌋ + c⌈n/2⌉ + 1 = cn + 1,
which does not imply T(n) ≤ cn for any choice of c. It's tempting to try a larger guess, say T(n) = O(n²), which can be made to work, but in fact our guess that the solution is T(n) = O(n) is correct. In order to show this, however, we must make a stronger inductive hypothesis. Intuitively, our guess is nearly right: we're only off by the constant 1, a lower-order term. Nevertheless, mathematical induction doesn't work unless we prove the exact form of the inductive hypothesis. We overcome our difficulty by subtracting a lower-order term from our previous guess. Our new guess is T(n) ≤ cn − b, where b ≥ 0 is a constant. We now have
T(n) ≤ (c⌊n/2⌋ − b) + (c⌈n/2⌉ − b) + 1 = cn − 2b + 1 ≤ cn − b,
as long as b ≥ 1. As before, the constant c must be chosen large enough to handle the boundary conditions. Most people find the idea of subtracting a lower-order term counterintuitive. After all, if the math doesn't work out, shouldn't we be increasing our guess? The key to understanding this step is to remember that we are using mathematical induction: we can prove something stronger for a given value by assuming something stronger for smaller values.

LECTURE-5

The master method
The master method provides a "cookbook" method for solving recurrences of the form
T(n) = aT(n/b) + f(n),
where a ≥ 1 and b > 1 are constants and f(n) is an asymptotically positive function. The master method requires memorization of three cases, but then the solution of many recurrences can be determined quite easily, often without pencil and paper.
The recurrence describes the running time of an algorithm that divides a problem of size n into a subproblems, each of size n/b, where a and b are positive constants. The a subproblems are solved recursively, each in time T(n/b). The cost of dividing the problem and combining the results of the subproblems is described by the function f(n). For example, the recurrence arising from the MERGE-SORT procedure has a = 2, b = 2, and f(n) = Θ(n).

As a matter of technical correctness, the recurrence isn't actually well defined because n/b might not be an integer. Replacing each of the a terms T(n/b) with either T(⌊n/b⌋) or T(⌈n/b⌉) doesn't affect the asymptotic behavior of the recurrence, however.

The master theorem
The master method depends on the following theorem. Let a ≥ 1 and b > 1 be constants, let f(n) be a function, and let T(n) be defined on the nonnegative integers by the recurrence T(n) = aT(n/b) + f(n), where we interpret n/b to mean either ⌊n/b⌋ or ⌈n/b⌉. Then T(n) can be bounded asymptotically as follows.
1. If f(n) = O(n^(log_b a − ε)) for some constant ε > 0, then T(n) = Θ(n^(log_b a)).
2. If f(n) = Θ(n^(log_b a)), then T(n) = Θ(n^(log_b a) lg n).
3. If f(n) = Ω(n^(log_b a + ε)) for some constant ε > 0, and if a f(n/b) ≤ c f(n) for some constant c < 1 and all sufficiently large n, then T(n) = Θ(f(n)).

Before applying the master theorem to some examples, let's spend a moment trying to understand what it says. In each of the three cases, we are comparing the function f(n) with the function n^(log_b a). Intuitively, the solution to the recurrence is determined by the larger of the two functions. If, as in case 1, the function n^(log_b a) is the larger, then the solution is T(n) = Θ(n^(log_b a)). If, as in case 3, the function f(n) is the larger, then the solution is T(n) = Θ(f(n)). If, as in case 2, the two functions are the same size, we multiply by a logarithmic factor, and the solution is T(n) = Θ(n^(log_b a) lg n).

Beyond this intuition, there are some technicalities that must be understood. In the first case, not only must f(n) be smaller than n^(log_b a), it must be polynomially smaller. That is, f(n) must be asymptotically smaller than n^(log_b a) by a factor of n^ε for some constant ε > 0. In the third case, not only must f(n) be larger than n^(log_b a), it must be polynomially larger and in addition satisfy the "regularity" condition that a f(n/b) ≤ c f(n). This condition is satisfied by most of the polynomially bounded functions that we shall encounter.

It is important to realize that the three cases do not cover all the possibilities for f(n). There is a gap between cases 1 and 2 when f(n) is smaller than n^(log_b a) but not polynomially smaller. Similarly, there is a gap between cases 2 and 3 when f(n) is larger than n^(log_b a) but not polynomially larger. If the function f(n) falls into one of these gaps, or if the regularity condition in case 3 fails to hold, the master method cannot be used to solve the recurrence.

Using the master method
To use the master method, we simply determine which case (if any) of the master theorem applies and write down the answer. As a first example, consider T(n) = 9T(n/3) + n. For this recurrence we have a = 9, b = 3, f(n) = n, and thus n^(log_b a) = n^(log_3 9) = Θ(n²). Since f(n) = O(n^(log_3 9 − ε)) with ε = 1, we can apply case 1 of the master theorem and conclude that the solution is T(n) = Θ(n²).

Now consider T(n) = T(2n/3) + 1, in which a = 1, b = 3/2, f(n) = 1, and n^(log_b a) = n^(log_{3/2} 1) = n⁰ = 1. Case 2 applies, since f(n) = Θ(n^(log_b a)) = Θ(1), and thus the solution to the recurrence is T(n) = Θ(lg n).

For the recurrence T(n) = 3T(n/4) + n lg n, we have a = 3, b = 4, f(n) = n lg n, and n^(log_b a) = n^(log_4 3) = O(n^0.793). Since f(n) = Ω(n^(log_4 3 + ε)), where ε ≈ 0.2, case 3 applies if we can show that the regularity condition holds for f(n).
For sufficiently large n, a f(n/b) = 3(n/4) lg(n/4) ≤ (3/4)n lg n = c f(n) for c = 3/4. Consequently, by case 3, the solution to the recurrence is T(n) = Θ(n lg n).

The master method does not apply to the recurrence T(n) = 2T(n/2) + n lg n, even though it has the proper form: a = 2, b = 2, f(n) = n lg n, and n^(log_b a) = n. It might seem that case 3 should apply, since f(n) = n lg n is asymptotically larger than n^(log_b a) = n. The problem is that it is not polynomially larger: the ratio f(n)/n^(log_b a) = (n lg n)/n = lg n is asymptotically less than n^ε for any positive constant ε. Consequently, the recurrence falls into the gap between case 2 and case 3.

Example: T(n) = 9T(n/3) + n. Here a = 9, b = 3, f(n) = n, and n^(log_b a) = n^(log_3 9) = Θ(n²). Since f(n) = O(n^(log_3 9 − ε)), where ε = 1, case 1 applies, and thus the solution is T(n) = Θ(n²).

As another example, consider T(n) = 2T(n/2) + n² log n, solved by repeated expansion of the recurrence. The cost contributed at level i of the expansion is 2^i (n/2^i)² log(n/2^i) ≤ (n² log n)/2^i, so
T(n) ≤ n² log n + (n² log n)/2 + (n² log n)/4 + ... + (n² log n)/2^(log n)
≤ 2 n² log n (summing the geometric series, and using 2^(log n) = n for the last term).
Since the first term alone is n² log n, T(n) = Θ(n² log n).

LECTURE-6

The recursion-tree method
A recursion tree models the costs (time) of a recursive execution of an algorithm. In a recursion tree, each node represents the cost of a single subproblem somewhere in the set of recursive function invocations. We sum the costs within each level of the tree to obtain a set of per-level costs, and then we sum all the per-level costs to determine the total cost of all levels of the recursion. Recursion trees are particularly useful when the recurrence describes the running time of a divide-and-conquer algorithm. A recursion tree is best used to generate a good guess, which is then verified by the substitution method. When using a recursion tree to generate a good guess, you can often tolerate a small amount of "sloppiness," since you will be verifying your guess later on. If you are very careful when drawing out a recursion tree and summing the costs, however, you can use a recursion tree as a direct proof of a solution to a recurrence.

Consider the construction of a recursion tree for the recurrence T(n) = 3T(n/4) + cn². (In the accompanying figure, part (a) shows T(n), which is progressively expanded in parts (b)-(d) to form the recursion tree.) The fully expanded tree has height log₄ n (it has log₄ n + 1 levels). Because subproblem sizes decrease as we get further from the root, we eventually must reach a boundary condition. How far from the root do we reach one? The subproblem size for a node at depth i is n/4^i. Thus, the subproblem size hits n = 1 when n/4^i = 1 or, equivalently, when i = log₄ n. Thus, the tree has log₄ n + 1 levels (0, 1, 2, ..., log₄ n).

Next we determine the cost at each level of the tree. Each level has three times more nodes than the level above, and so the number of nodes at depth i is 3^i. Because subproblem sizes reduce by a factor of 4 for each level we go down from the root, each node at depth i, for i = 0, 1, 2, ..., log₄ n − 1, has a cost of c(n/4^i)². Multiplying, we see that the total cost over all nodes at depth i, for i = 0, 1, 2, ..., log₄ n − 1, is 3^i c(n/4^i)² = (3/16)^i cn². The last level, at depth log₄ n, has 3^(log₄ n) = n^(log₄ 3) nodes, each contributing cost T(1), for a total cost of Θ(n^(log₄ 3)). Now we add up the costs over all levels to determine the cost for the entire tree:
T(n) = Σ (i = 0 to log₄ n − 1) (3/16)^i cn² + Θ(n^(log₄ 3)) < (16/13) cn² + Θ(n^(log₄ 3)) = O(n²).
Thus, we have derived a guess of T(n) = O(n²) for our original recurrence T(n) = 3T(⌊n/4⌋) + Θ(n²). In this example, the coefficients of cn² form a decreasing geometric series, and the sum of these coefficients is bounded from above by the constant 16/13.
Since the root's contribution to the total cost is cn², the root contributes a constant fraction of the total cost. In other words, the total cost of the tree is dominated by the cost of the root.

As another, more intricate example, consider the recursion tree for the recurrence T(n) = T(n/3) + T(2n/3) + cn, where, as before, c represents the constant factor in the O(n) term. (Again, we omit floor and ceiling functions for simplicity.) When we add the values across the levels of the recursion tree, we get a value of cn for every level. The longest path from the root to a leaf is n → (2/3)n → (2/3)²n → ··· → 1. Since (2/3)^k n = 1 when k = log_{3/2} n, the height of the tree is log_{3/2} n. Intuitively, we expect the solution to the recurrence to be at most the number of levels times the cost of each level, or O(cn log_{3/2} n) = O(n lg n). The total cost is evenly distributed throughout the levels of the recursion tree. There is a complication here: we have yet to consider the cost of the leaves. If this recursion tree were a complete binary tree of height log_{3/2} n, there would be 2^(log_{3/2} n) = n^(log_{3/2} 2) leaves. Since the cost of each leaf is a constant, the total cost of all leaves would then be Θ(n^(log_{3/2} 2)), which is ω(n lg n). This recursion tree is not a complete binary tree, however, and so it has fewer than n^(log_{3/2} 2) leaves. Moreover, as we go down from the root, more and more internal nodes are absent. Consequently, not all levels contribute a cost of exactly cn; levels toward the bottom contribute less. We could work out an accurate accounting of all costs, but remember that we are just trying to come up with a guess to use in the substitution method. Let us tolerate the sloppiness and attempt to show that a guess of O(n lg n) for the upper bound is correct.

LECTURE-7
Doubt Clearing Class

LECTURE-8

Divide and Conquer: Binary Search

Algorithm BinarySearch(data, key):
Input: a sorted array of integers (data) and a key (key)
Output: the position of the key in the array (-1 if not found)
  low ← 0
  high ← data.length - 1
  while low <= high do
    mid ← (low + high) / 2
    if (data[mid] < key)
      low ← mid + 1
    else if (data[mid] > key)
      high ← mid - 1
    else
      return mid
  // couldn't find the key
  return -1

We wish to understand the average-case running time of binary search. What do we mean by average case? Let us consider the case where each key in the array is equally likely to be searched for, and ask what the average time is to find such a key. To make the analysis simpler, let us assume that n = 2^k − 1 for some integer k ≥ 1. (Why does this make the analysis simpler? Because the search tree constructed below is then a complete binary tree with exactly k levels.) We first notice that in the two lines of pseudocode before the while loop, 3 primitive operations always get executed (an assignment, a subtraction, and then an assignment). However, since these operations happen no matter what the input is, we will ignore them for now. Focusing our attention on the while loop, we notice that each time the program enters the while loop, we execute 6 primitive operations (a <= comparison, an addition, a division, an assignment, an array index, and a < comparison) before the program might branch, depending on the result of the first if statement. Depending on the result of the conditional, the program will execute different numbers of primitive operations. If data[mid] < key, the program executes 2 more primitive operations. If data[mid] > key, the program executes 4 more primitive operations (an array index and a comparison in the next if, and then a subtraction and an assignment).
If data[mid] == key, then we execute 3 more primitive operations (2 operations for the two if tests and then 1 operation to return mid). In other words, the number of primitive operations executed in an iteration of the loop is
10, if data[mid] > key,
9, if data[mid] == key, and
8, if data[mid] < key.

We can now construct a tree that summarizes how many operations are executed in the while loop, depending on the number of times the while loop is executed and the results of the comparisons between data[mid] and key in each iteration. Each node in the tree represents an exit from the algorithm because data[mid] == key. A node is a left child of its parent if, in the previous iteration of the while loop, the search was restricted to the left part of the subarray (that is, when data[mid] > key). A node is a right child of its parent if, in the previous iteration of the while loop, the search was restricted to the right part of the subarray (that is, when data[mid] < key). The first three levels of the tree are:
level 1:   9
level 2:   10 + 9          8 + 9
level 3:   10 + 10 + 9     10 + 8 + 9     8 + 10 + 9     8 + 8 + 9

The root of the tree corresponds to the situation where data[mid] == key in the first iteration of the while loop. Nodes on the ith level of the tree correspond to finding the key after executing i iterations of the while loop (level 1 is the root of the tree). Notice that the number of terms at each node is equal to the level it is on. Since n = 2^k − 1, the number of levels in the tree is k = log(n + 1).

Now consider the number of operations that occur at each level. Notice that every node has a 9 in it (which comes from the last iteration of the while loop, when the item is found), and that for every node with some number of 10s and 8s in it, there is a corresponding node on the same level that has exactly the same number of 8s and 10s, respectively. Since we are only interested in adding all of the numbers in the tree, we can simplify our calculations by simply making all of the numbers 9s.

Let T be the sum of all of the numbers at all of the nodes in the tree. Except for the 3 operations we ignored earlier, T is the total amount of time it would take to execute binary search n times, once for each item in the array. So T / n is the average time to find a key. Each node at level i has i 9s in it, and there are 2^(i−1) nodes at level i. It is easy to see that
T / n = (9 / n) · Σ (i = 1 to log(n+1)) i · 2^(i−1).
We now simply need to compute Σ (i = 1 to log(n+1)) i · 2^(i−1). Let us write out the terms in the summation, one row for each copy of the powers of 2:
Σ (i = 1 to log(n+1)) i · 2^(i−1) = 1·2^0 + 2·2^1 + 3·2^2 + 4·2^3 + … + log(n+1)·2^(log(n+1) − 1)
line (1):  2^0 + 2^1 + 2^2 + 2^3 + … + 2^(log(n+1) − 1)
line (2):        2^1 + 2^2 + 2^3 + … + 2^(log(n+1) − 1)
line (3):              2^2 + 2^3 + … + 2^(log(n+1) − 1)
line (4):                    2^3 + … + 2^(log(n+1) − 1)
…
line (log(n+1)):                       2^(log(n+1) − 1)
Note how the summation has been broken up. The terms on line (1) form a geometric series that grows by a factor of 2:
line (1) = 2^0 + 2^1 + … + 2^(log(n+1) − 1) = 2^(log(n+1)) − 2^0.
Line (2) can be calculated in a similar way:
line (2) = 2^1 + 2^2 + … + 2^(log(n+1) − 1) = 2^(log(n+1)) − 2^1.
In general, the sum on line i is 2^(log(n+1)) − 2^(i−1) = (n + 1) − 2^(i−1), and there are log(n + 1) lines. To compute the entire sum, we therefore need to compute
Σ (i = 1 to log(n+1)) ((n + 1) − 2^(i−1)) = (n + 1) log(n + 1) − Σ (i = 1 to log(n+1)) 2^(i−1) = (n + 1) log(n + 1) − (2^(log(n+1)) − 1)
= (n + 1) log(n + 1) − ((n + 1) − 1) = (n + 1) log(n + 1) − n.
So T = 9[(n + 1) log(n + 1) − n], and
T / n = 9 · ((n + 1)/n) · log(n + 1) − 9 ≈ 9 log n − 9.
This means that the average number of primitive operations on a successful search is about 9 log n − 6. (We need to add back the three operations we ignored at the beginning of our analysis.) Note that in the worst case, a successful search executes approximately 10 log n steps, since the bottommost, leftmost node of the tree has value 10 log(n + 1) − 1. The average time for a successful search is only about log n steps smaller than the worst case because almost exactly half of the nodes are at the bottom of the tree, and those nodes have an average value of 9 log n. So, even though there are many nodes near the top of the tree that have a very small value, they influence the average only by a constant (9 log n − 9 on average over all nodes, as opposed to 9 log n on average for just the leaves of the tree), because there are so many more nodes at the bottom of the tree.

LECTURE-9

Quick Sort
The array A[p..r] is partitioned into two (possibly empty) subarrays A[p..q−1] and A[q+1..r] such that each element of A[p..q−1] is less than or equal to A[q], which in turn is less than or equal to each element of A[q+1..r]. The index q is computed as part of this partitioning procedure.

Quicksort algorithm:
QUICKSORT(A, p, r)
1. if p < r
2.   then q ← PARTITION(A, p, r)
3.        QUICKSORT(A, p, q−1)
4.        QUICKSORT(A, q+1, r)

PARTITION(A, p, r)
1. x ← A[r]
2. i ← p − 1
3. for j ← p to r − 1
4.   do if A[j] ≤ x
5.        then i ← i + 1
6.             exchange A[i] ↔ A[j]
7. exchange A[i+1] ↔ A[r]
8. return i + 1

Loop invariant: at the beginning of each iteration of the loop of lines 3-6, for any array index k,
1. if p ≤ k ≤ i, then A[k] ≤ x;
2. if i + 1 ≤ k ≤ j − 1, then A[k] > x;
3. if k = r, then A[k] = x.

Time complexity:
Worst case: O(n²)   (T(n) = T(n−1) + n)
Best case: O(n log n)   (T(n) = 2T(n/2) + n)
Average case: O(n log n)

To sort an array of n distinct elements, quicksort takes O(n log n) time in expectation, averaged over all n! permutations of n elements with equal probability. Why? For a start, it is not hard to see that the partition operation takes O(n) time. In the most unbalanced case, each time we perform a partition we divide the list into two sublists of sizes 0 and n − 1 (for example, if all elements of the array are equal). This means each recursive call processes a list of size one less than the previous list. Consequently, we can make n − 1 nested calls before we reach a list of size 1, so the call tree is a linear chain of n − 1 nested calls. The ith call does O(n − i) work to do the partition, and Σ (i = 0 to n−1) (n − i) = O(n²), so in that case quicksort takes O(n²) time. That is the worst case; given knowledge of which comparisons are performed by the sort, there are adaptive algorithms that are effective at generating worst-case input for quicksort on the fly, regardless of the pivot selection strategy. In the most balanced case, however, each partition splits the list into two pieces of nearly equal size, so the recursion is only O(log n) levels deep and the calls on each level take O(n) time in total; so in that case the total running time is O(n log n).

LECTURE-10

Merge Sort
Merge sort is an O(n log n) comparison-based sorting algorithm. The merge sort algorithm is based on the divide-and-conquer paradigm. It was invented by John von Neumann in 1945. It operates as follows:
DIVIDE: Partition the n-element sequence to be sorted into two subsequences of n/2 elements each.
CONQUER: Sort the two subsequences recursively using merge sort.
COMBINE: Merge the two sorted subsequences of size n/2 each to produce the sorted sequence consisting of n elements.
The mergesort algorithm is thus based on a divide-and-conquer strategy: first, the sequence to be sorted is decomposed into two halves (divide); each half is sorted independently (conquer); then the two sorted halves are merged into a sorted sequence (combine) (Figure 1).
Figure 1: Mergesort(n)

The following procedure mergesort sorts a sequence a from index lo to index hi (here a is the global array being sorted and b is a global auxiliary array of the same size).

void mergesort(int lo, int hi)
{
    if (lo < hi)
    {
        int m = (lo + hi) / 2;
        mergesort(lo, m);
        mergesort(m + 1, hi);
        merge(lo, m, hi);
    }
}

First, the index m in the middle between lo and hi is determined. Then the first part of the sequence (from lo to m) and the second part (from m+1 to hi) are sorted by recursive calls of mergesort, and the two sorted halves are merged by procedure merge. Recursion ends when lo = hi, i.e. when a subsequence consists of only one element. The main work of the mergesort algorithm is performed by function merge. There are different possibilities to implement this function.

void merge(int lo, int m, int hi)
{
    int i, j, k;

    i = 0;
    j = lo;
    // copy first half of array a to auxiliary array b
    while (j <= m)
        b[i++] = a[j++];

    i = 0;
    k = lo;
    // copy back next-greatest element at each time
    while (k < j && j <= hi)
        if (b[i] <= a[j])
            a[k++] = b[i++];
        else
            a[k++] = a[j++];

    // copy back remaining elements of first half (if any)
    while (k < j)
        a[k++] = b[i++];
}

Time complexity analysis
The straightforward version of function merge requires at most 2n steps (n steps for copying the sequence to the intermediate array b, and at most n steps for copying it back to array a). The time complexity of mergesort is therefore
T(n) ≤ 2n + 2 T(n/2), with T(1) = 0.
The solution of this recursion yields
T(n) ≤ 2n log(n) ∈ O(n log(n)).
Thus, the mergesort algorithm is optimal, since the lower bound of Ω(n log(n)) for the sorting problem is attained. In the more efficient variant, function merge requires at most 1.5n steps (n/2 steps for copying the first half of the sequence to the intermediate array b, n/2 steps for copying it back to array a, and at most n/2 steps for processing the second half). This yields a running time of mergesort of at most 1.5n log(n) steps.

Conclusions
Algorithm mergesort has a time complexity of Θ(n log(n)), which is optimal. A drawback of mergesort is that it needs an additional space of Θ(n) for the temporary array b.

LECTURE-11 and 12

Heap Sort
Heap sort is a comparison-based sorting algorithm. Heap sort begins by building a heap out of the data set, and then removing the largest item and placing it at the end of the array. After removing the largest item, it reconstructs the heap, removes the largest remaining item, and places it in the next position from the end of the array. This is repeated until there are no items left in the heap.

Heaps (binary heap)
The binary heap data structure is an array object that can be viewed as a nearly complete binary tree.
Heap property:
Max-heap: A[parent(i)] ≥ A[i]
Min-heap: A[parent(i)] ≤ A[i]
The height of a node in a tree is the number of edges on the longest simple downward path from the node to a leaf; the height of a tree is the height of its root; the height of a heap is O(log n).
Basic procedures on a heap: the Max-Heapify procedure, the Build-Max-Heap procedure, the Heapsort procedure, the Max-Heap-Insert procedure, the Heap-Extract-Max procedure, the Heap-Increase-Key procedure, and the Heap-Maximum procedure.

Maintaining the heap property
Heapify is an important subroutine for manipulating heaps. Its inputs are an array A and an index i in the array. When Heapify is called, it is assumed that the binary trees rooted at LEFT(i) and RIGHT(i) are heaps; if A[i] is smaller than one of its children, then it violates the max-heap property, and Heapify lets the value at A[i] "float down" until the subtree rooted at i is a heap.
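Since the notes refer to the Max-Heapify, Build-Max-Heap and Heapsort procedures without listing them, here is a small illustrative sketch in C++ (the function names and the 0-based indexing are choices made for this sketch, not taken from the notes):

#include <vector>
#include <algorithm>

// Let the value at index i "float down" until the subtree rooted at i
// satisfies the max-heap property. heapSize is the number of valid
// elements in a (0-based indexing: the children of i are 2i+1 and 2i+2).
void maxHeapify(std::vector<int>& a, int heapSize, int i) {
    int left = 2 * i + 1;
    int right = 2 * i + 2;
    int largest = i;
    if (left < heapSize && a[left] > a[largest]) largest = left;
    if (right < heapSize && a[right] > a[largest]) largest = right;
    if (largest != i) {
        std::swap(a[i], a[largest]);
        maxHeapify(a, heapSize, largest);   // continue floating down
    }
}

// Build a max-heap bottom-up: leaves are already heaps, so start from
// the last internal node and work towards the root.
void buildMaxHeap(std::vector<int>& a) {
    for (int i = static_cast<int>(a.size()) / 2 - 1; i >= 0; --i)
        maxHeapify(a, static_cast<int>(a.size()), i);
}

// Heap sort: repeatedly move the maximum (the root) to the end of the
// array and restore the heap property on the remaining prefix.
void heapSort(std::vector<int>& a) {
    buildMaxHeap(a);
    for (int heapSize = static_cast<int>(a.size()); heapSize > 1; --heapSize) {
        std::swap(a[0], a[heapSize - 1]);   // largest element reaches its final place
        maxHeapify(a, heapSize - 1, 0);     // restore the heap on the remaining elements
    }
}

Each call to maxHeapify costs O(log n); building the heap costs at most O(n log n) (in fact O(n) with a tighter analysis), and the n − 1 extraction steps cost O(log n) each, which gives the O(n log₂ n) bound for heap sort quoted below.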
Building a heap and the full heap sort algorithm follow the Heapify idea described above; the overall time complexity of heap sort is O(n log₂ n).

LECTURE-13

Priority Queue
A priority queue is an abstract data type which is like a regular queue or stack data structure, but where additionally each element has a "priority" associated with it. In a priority queue, an element with high priority is served before an element with low priority. If two elements have the same priority, they are served according to their order in the queue.

Operations
A priority queue must at least support the following operations:
Insert with priority: add an element to the queue with an associated priority.
Pull highest-priority element: remove the element from the queue that has the highest priority, and return it. This is also known as "pop element", "get maximum element" or "get front(most) element". Some conventions reverse the order of priorities, considering lower values to be higher priority, so this may also be known as "get minimum element", and is often referred to as "get-min" in the literature. This may instead be specified as separate "peek at highest priority element" and "delete element" functions, which can be combined to produce "pull highest priority element".

A priority queue is thus a data structure for maintaining a set S of elements with the following operations:
Maximum(S): returns the element in S with the largest key.
Insert(S, x): inserts an element x into S.
Extract-Max(S): removes and returns the element in S with the largest key.
The following operation is also useful in many applications of priority queues:
Increase-Key(S, i, k): increases the key of element i to k.

Basic idea: use a max-heap.

Build a heap
Problem: convert an array A[1..n] of integers into a max-heap.
Procedure Build-Max-Heap(A[1..n])
1. heap-size[A] ← n
2. for i ← parent(n) downto 1 do
3.     Max-Heapify(A, i)
Running time: O(n log n). All of the above priority queue operations can be done in O(log n) time, where n is the size of S.

Procedure Increase-Key(A[1..n], i, k)   /* increase A[i] to k */
1. if (k < A[i]) then error "new key smaller than current key"
2. A[i] ← k
3. while (i > 1) and (A[parent(i)] < A[i]) do
4.     exchange A[i] ↔ A[parent(i)]
5.     i ← parent(i)

A priority queue is often considered to be a "container data structure". The Standard Template Library (STL), and the C++ 1998 standard, specifies priority_queue as one of the STL container adaptor class templates. It implements a max-priority-queue and is parameterized by a comparison object for ordering, such as a function object (defaults to less<T> if unspecified), and by the underlying container for storing the data (defaults to std::vector<T>); its constructors can additionally take two iterators to the beginning and end of a sequence. Unlike actual STL containers, it does not allow iteration of its elements (it strictly adheres to its abstract data type definition). The STL also has utility functions for manipulating another random-access container as a binary max-heap. The Boost C++ libraries also have an implementation in the Boost.Heap library.

LECTURE-14

Sorting: lower bound analysis

Insertion Sort
Each time we add a new element to the list, we compare it with the elements of the already sorted part and insert it into the right place.
Example: insertion sort on 34, 8, 64, 51, 32, 21 (trace how the sorting process works).

void insertionSort(vector<Comparable> & a)
{
    int j;
    for (int p = 1; p < a.size(); p++)
    {
        Comparable tmp = a[p];
        for (j = p; j > 0 && tmp < a[j-1]; j--)
            a[j] = a[j-1];
        a[j] = tmp;
    }
}

The performance of insertion sort is at most
2 + 3 + 4 + … + N = Σ (i = 2 to N) i = O(N²).

There is a collection of simple sorting algorithms based on comparing two adjacent elements and exchanging them if they are not in the right order, e.g. insertion sort, selection sort, bubble sort, etc.

A lower bound for simple sorting algorithms based on comparing adjacent elements
Definition: given two elements x and y with x appearing before y in the list, the pair (x, y) is called an inversion if x > y.
Given a list of N integers, what is the average number of inversions in it? Given a list L, reverse it to get a new list Lr. Any pair of two distinct elements (x, y) represents an inversion in exactly one of L and Lr. The total number of all such inversions in L and Lr together is N(N − 1)/2, so the average number of inversions in a list L is N(N − 1)/4.
The average number of inversions in a list gives a lower bound for these algorithms: every inversion must be exchanged away, and each exchange of adjacent elements removes at most one inversion. So the performance of any simple sorting algorithm based on comparing and exchanging adjacent elements is Ω(N²). This lower bound tells us that if we want to break the Ω(N²) barrier, we need to compare elements at a distance.

Heap Sort
We still remember what a heap is, right? For a (min-)heap, we know that the minimum element is at the root. We can exploit this property to do sorting:
1) Build a heap.
2) Remove the root and re-heapify.
3) Repeat step 2 until the heap is empty.
Performance of heap sort: step 1 is O(N); steps 2-3 are O(N log N); total O(N log N).

Quick sort is the fastest known sorting algorithm in practice. Its average running time is O(N log N); its worst-case performance is O(N²). It is a recursive algorithm based on divide and conquer:
- If the number of elements in S is 0 or 1, then return.
- Pick any element v in S. This is called the pivot.
- Partition S − {v} into two disjoint groups S1 = {x ∈ S − {v} | x ≤ v} and S2 = {x ∈ S − {v} | x ≥ v}.
- Return {quicksort(S1) followed by v followed by quicksort(S2)}.

A decision tree is a binary tree. Each internal node represents a comparison of two integers; its two children represent, respectively, the true/false outcomes of the comparison. Every leaf node represents a sorted list. The path from the root to a leaf represents the sorting process that results in that sorted list. (Some examples of decision trees can be drawn for small lists.) Given n distinct integers, there are n! different ways to arrange them in sorted order, which implies that any decision tree for sorting a list of n integers has at least n! leaf nodes. The lower bound for comparison-based sorting is the lower bound on the length of a path from the root to a leaf, i.e. on the height of the tree. Since the decision tree has at least n! leaf nodes, any such tree has height at least log₂(N!) ≈ N log N, giving the lower bound Ω(N log₂ N). Can we sort faster? Possibly, but ways other than comparisons must then be used.

LECTURE-15
Doubt Clearing Class / BPUT question papers solved

LECTURE-16

Dynamic Programming: Elements of Dynamic Programming
• Like divide-and-conquer, dynamic programming solves a problem by combining the solutions to subproblems.
• Differences between divide-and-conquer and DP:
 – In divide-and-conquer the subproblems are independent; we solve the subproblems independently and recursively (so the same sub(sub)problems may be solved repeatedly).
 – In DP the subproblems are dependent, i.e. subproblems share sub-subproblems; every sub(sub)problem is solved just once, and the solutions to sub(sub)problems are stored in a table and used for solving higher-level subproblems.
One technique that attempts to solve problems by dividing them into subproblems is called dynamic programming.
LECTURE-15
Doubt Clearing Class / BPUT Question paper solved

LECTURE-16
Dynamic Programming: Elements of Dynamic Programming
• Like divide-and-conquer, dynamic programming (DP) solves a problem by combining the solutions to subproblems.
• Differences between divide-and-conquer and DP:
  – In divide-and-conquer the subproblems are independent and are solved independently and recursively (so the same sub(sub)problems may be solved repeatedly).
  – In DP the subproblems are dependent, i.e., subproblems share sub-subproblems; every sub(sub)problem is solved just once, and the solutions to sub(sub)problems are stored in a table and used for solving higher-level subproblems.
One technique that attempts to solve problems by dividing them into subproblems is called dynamic programming. It uses a "bottom-up" approach in that the subproblems are arranged and solved in a systematic fashion, which leads to a solution to the original problem. This bottom-up implementation is more efficient than a "top-down" counterpart mainly because duplicated computation of the same subproblems is eliminated. The technique is typically applied to optimization problems, although it is not limited to them. Dynamic programming typically involves two steps: (1) develop a recursive strategy for solving the problem; and (2) develop a "bottom-up" implementation without recursion.
Application domain of DP: optimization problems, i.e., find a solution with optimal (maximum or minimum) value. We speak of an optimal solution, not the optimal solution, since there may be more than one optimal solution; any one of them is acceptable.
Typical steps of DP:
• Characterize the structure of an optimal solution.
• Recursively define the value of an optimal solution.
• Compute the value of an optimal solution in a bottom-up fashion.
• Construct an optimal solution from computed/stored information.
• Example: Compute the binomial coefficients C(n, k) defined by the following recursive formula:
    C(n, k) = 1                            if k = 0 or k = n;
    C(n, k) = C(n-1, k) + C(n-1, k-1)      if 0 < k < n;
    C(n, k) = 0                            otherwise.
  The following "call tree" demonstrates the repeated (duplicated) computations performed by a straightforward recursive implementation.
• Example: Solve the make-change problem using dynamic programming. Suppose there are n types of coin denominations, d1, d2, ..., dn. (We may assume one of them is the penny.) There is an infinite supply of coins of each type. To make change for an arbitrary amount j using the minimum number of coins, we first apply the following recursive idea:
• If there are only pennies, the problem is simple: use j pennies to make change for the total amount j. More generally, if coin types 1 through i are available, let C[i, j] stand for the minimum number of coins needed to make change for amount j. Considering coin denomination i, there are two cases: either we use at least one coin of denomination i, or we do not use coin type i at all.
• In the first case, the total number of coins must be 1 + C[i, j - di], because the amount is reduced to j - di after using one coin of value di, and the rest of the coin selection in the solution of C[i, j] must be an optimal solution to the reduced problem with the reduced amount, still using coin types 1 through i.
• In the second case, no coin of denomination i is used in an optimal solution, so the best solution is identical to solving the same problem for amount j using only coin types 1 through i - 1, i.e., C[i - 1, j]. Therefore, the overall best solution must be the better of the two alternatives, resulting in the following recurrence:
    C[i, j] = min(1 + C[i, j - di], C[i - 1, j])
• The boundary conditions are: when i = 0 (no coin types available) or j < 0, let C[i, j] = infinity; when j = 0, let C[i, j] = 0. The answer is C[n, j] for the amount j in question.
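A compact bottom-up sketch of this recurrence follows. The denominations and the amount in main are made-up illustrative values; the table C[i][j] is indexed exactly as in the recurrence above, with a large sentinel standing in for infinity.

#include <algorithm>
#include <iostream>
#include <vector>

int main() {
    const int INF = 1000000;                       // stands in for "infinity"
    std::vector<int> d = {1, 4, 6};                // illustrative denominations d1..dn
    int amount = 8;                                // illustrative target amount j
    int n = d.size();

    // C[i][j] = minimum number of coins to make amount j using coin types 1..i.
    std::vector<std::vector<int>> C(n + 1, std::vector<int>(amount + 1, INF));
    for (int i = 0; i <= n; ++i) C[i][0] = 0;      // boundary: amount 0 needs 0 coins

    for (int i = 1; i <= n; ++i)
        for (int j = 1; j <= amount; ++j) {
            C[i][j] = C[i - 1][j];                               // do not use coin type i
            if (j - d[i - 1] >= 0 && C[i][j - d[i - 1]] != INF)  // use one coin of type i
                C[i][j] = std::min(C[i][j], 1 + C[i][j - d[i - 1]]);
        }

    std::cout << "Minimum coins for " << amount << " = " << C[n][amount] << '\n';  // 2 (4 + 4)
    return 0;
}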
The Principle of Optimality: In solving optimization problems which require making a sequence of decisions, such as the change-making problem, we often apply the following principle in setting up a recursive algorithm. Suppose an optimal solution made decisions d1, d2, ..., dn. Then the subproblem starting after decision point di and ending at decision point dj has also been solved with an optimal solution, made up of the decisions di through dj. That is, any subsequence of an optimal solution constitutes an optimal sequence of decisions for the corresponding subproblem. This is known as the principle of optimality; it can be illustrated by shortest paths in weighted graphs: any subpath of a shortest path is itself a shortest path between its two endpoints.
The Partition Problem: Given a set of positive integers A = {a1, a2, ..., an}, select a subset B of A such that the sum of the numbers in B equals the sum of the numbers not in B, i.e., sum(B) = sum(A - B). We may assume that the sum of all numbers in A is 2K, an even number.
We now propose a dynamic programming solution. For 1 <= i <= n and 0 <= j <= K, define
    P[i, j] = True if there exists a subset of the first i numbers a1 through ai whose sum equals j; False otherwise.
Thus, P[1, j] = True if either j = 0 or j = a1. When i > 1, we have the following recurrence:
    P[i, j] = P[i-1, j] or (P[i-1, j - ai] if j - ai >= 0)
That is, for P[i, j] to be true, either there exists a subset of the first i-1 numbers whose sum equals j, or there exists a subset of the first i-1 numbers whose sum equals j - ai (this latter case uses the solution of P[i-1, j - ai] and adds the number ai). The value P[n, K] is the answer.

LECTURE-17 and 18
Matrix Chain Multiplication (MCM)
• Problem: given A1, A2, ..., An, compute the product A1 A2 ... An and find the fastest way (i.e., the minimum number of scalar multiplications) to compute it.
• Given two matrices A(p, q) and B(q, r), their product C(p, r) is computed in p*q*r multiplications:
    for i = 1 to p
        for j = 1 to r
            C[i, j] = 0
    for i = 1 to p
        for j = 1 to r
            for k = 1 to q
                C[i, j] = C[i, j] + A[i, k] * B[k, j]
• Different parenthesizations give different numbers of multiplications for a product of several matrices.
• Example: A(10, 100), B(100, 5), C(5, 50)
  – ((A B) C): 10*100*5 + 10*5*50 = 5,000 + 2,500 = 7,500 multiplications
  – (A (B C)): 100*5*50 + 10*100*50 = 25,000 + 50,000 = 75,000 multiplications
• The first way is ten times faster than the second!
• Denote A1, A2, ..., An by <p0, p1, p2, ..., pn>, i.e., A1(p0, p1), A2(p1, p2), ..., Ai(pi-1, pi), ..., An(pn-1, pn).
• Intuitive brute-force solution: count the parenthesizations by exhaustively checking all possibilities.
• Let P(n) denote the number of alternative parenthesizations of a sequence of n matrices:
    P(n) = 1                                       if n = 1
    P(n) = sum for k = 1 to n-1 of P(k) * P(n-k)   if n >= 2
• The solution to this recurrence is Omega(2^n), so brute force will not work.
• Step 1: structure of an optimal parenthesization.
  – Let Ai..j (i <= j) denote the matrix resulting from Ai Ai+1 ... Aj.
  – Any parenthesization of Ai Ai+1 ... Aj must split the product between Ak and Ak+1 for some k (i <= k < j). The cost = cost of computing Ai..k + cost of computing Ak+1..j + cost of multiplying Ai..k by Ak+1..j.
  – If k is the split position in an optimal parenthesization of Ai Ai+1 ... Aj, then the parenthesization of the "prefix" subchain Ai Ai+1 ... Ak within it must itself be an optimal parenthesization of Ai Ai+1 ... Ak (and similarly for Ak+1 ... Aj).
• Step 2: a recursive relation.
  – Let m[i, j] be the minimum number of multiplications for Ai Ai+1 ... Aj; m[1, n] is the answer.
    m[i, j] = 0                                                                if i = j
    m[i, j] = min over i <= k < j of { m[i, k] + m[k+1, j] + p(i-1)*p(k)*p(j) }  if i < j
• Step 3: computing the optimal cost.
  – A direct recursive algorithm takes exponential time, Omega(2^n) (refer to p. 346 for the proof), no better than brute force.
  – Total number of subproblems: (n choose 2) + n = Theta(n^2).
  – The recursive algorithm encounters the same subproblem many times. If we table the answers to subproblems, each subproblem is solved only once: this exploits the overlapping subproblems and solves every subproblem just once.
  – Use an array m[1..n, 1..n], where m[i, j] records the optimal cost for Ai Ai+1 ... Aj, and an array s[1..n, 1..n], where s[i, j] records the index k that achieved the optimal cost when computing m[i, j].
  – Suppose the input to the algorithm is p = <p0, p1, ..., pn>.
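The following is a minimal bottom-up sketch of steps 2 and 3, filling m and s from the dimension vector p; the function name matrixChainOrder is just a label, and main uses the A(10,100), B(100,5), C(5,50) example from above.

#include <iostream>
#include <limits>
#include <vector>

// Bottom-up matrix-chain order: m[i][j] = minimum multiplications for Ai..Aj,
// s[i][j] = split index k achieving that minimum. Matrices are 1-indexed;
// Ai has dimensions p[i-1] x p[i].
void matrixChainOrder(const std::vector<long long>& p,
                      std::vector<std::vector<long long>>& m,
                      std::vector<std::vector<int>>& s) {
    int n = (int)p.size() - 1;
    m.assign(n + 1, std::vector<long long>(n + 1, 0));
    s.assign(n + 1, std::vector<int>(n + 1, 0));
    for (int len = 2; len <= n; ++len)                 // chain length
        for (int i = 1; i + len - 1 <= n; ++i) {
            int j = i + len - 1;
            m[i][j] = std::numeric_limits<long long>::max();
            for (int k = i; k < j; ++k) {              // try every split position
                long long cost = m[i][k] + m[k + 1][j] + p[i - 1] * p[k] * p[j];
                if (cost < m[i][j]) { m[i][j] = cost; s[i][j] = k; }
            }
        }
}

int main() {
    std::vector<long long> p = {10, 100, 5, 50};       // A1(10x100), A2(100x5), A3(5x50)
    std::vector<std::vector<long long>> m;
    std::vector<std::vector<int>> s;
    matrixChainOrder(p, m, s);
    std::cout << "minimum multiplications: " << m[1][3] << '\n';   // 7500
    std::cout << "optimal split after A" << s[1][3] << '\n';       // 2, i.e. ((A1 A2) A3)
    return 0;
}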
MCM DP: order of matrix computations.
• Step 4: constructing a parenthesization order for the optimal solution. Since s[1..n, 1..n] has been computed, and s[i, j] is the split position for Ai Ai+1 ... Aj, i.e., Ai ... As[i,j] and As[i,j]+1 ... Aj, the parenthesization order can be obtained from s[1..n, 1..n] recursively, beginning with s[1, n].

LECTURE-19
Longest Common Subsequence (LCS)
The longest common subsequence (LCS) problem is to find the longest subsequence common to all sequences in a set of sequences (often just two).
• DNA analysis: comparison of two DNA strings.
• DNA string: a sequence of the symbols A, C, G, T, e.g. S = ACCGGTCGAGCTTCGAAT.
• Subsequence (of X): X with some symbols left out. Z = CGTC is a subsequence of X = ACGCTAC.
• Common subsequence Z (of X and Y): a subsequence of X that is also a subsequence of Y. Z = CGA is a common subsequence of both X = ACGCTAC and Y = CTGACA.
• Longest common subsequence (LCS): the longest of the common subsequences. Z' = CGCA is an LCS of the above X and Y.
• LCS problem: given X = <x1, x2, ..., xm> and Y = <y1, y2, ..., yn>, find their LCS.
The LCS problem has what is called an "optimal substructure": the problem can be broken down into smaller, simpler subproblems, which can be broken down into yet simpler subproblems, and so on, until, finally, the solution becomes trivial. The LCS problem also has "overlapping subproblems": the solution to a higher subproblem depends on the solutions to several of the lower subproblems. Problems with these two properties (optimal substructure and overlapping subproblems) can be approached by the problem-solving technique called dynamic programming, in which the solution is built up starting from the simplest subproblems.
LCS DP, step 1: Optimal Substructure
• Characterize the optimal substructure of LCS.
• Theorem 15.1: Let X = <x1, x2, ..., xm> (= Xm) and Y = <y1, y2, ..., yn> (= Yn), and let Z = <z1, z2, ..., zk> (= Zk) be any LCS of X and Y. Then:
  – 1. if xm = yn, then zk = xm = yn, and Zk-1 is an LCS of Xm-1 and Yn-1;
  – 2. if xm != yn, then zk != xm implies Z is an LCS of Xm-1 and Yn;
  – 3. if xm != yn, then zk != yn implies Z is an LCS of Xm and Yn-1.
LCS DP, step 2: Recursive Solution
• What the theorem says:
  – If xm = yn, find an LCS of Xm-1 and Yn-1, then append xm.
  – If xm != yn, find an LCS of Xm-1 and Yn and an LCS of Xm and Yn-1, and take whichever is longer.
• Overlapping substructure: both the LCS of Xm-1 and Yn and the LCS of Xm and Yn-1 require solving the LCS of Xm-1 and Yn-1.
• Let c[i, j] be the length of an LCS of Xi and Yj.
LCS DP, step 3: Computing the Length of an LCS
• Use a table c[0..m, 0..n], where c[i, j] is defined as above; c[m, n] is the answer (the length of an LCS).
• Use a table b[1..m, 1..n], where b[i, j] points to the table entry corresponding to the optimal subproblem solution chosen when computing c[i, j]. From b[m, n], work backward to recover the LCS itself.
The problem with the naive recursive solution is that the same subproblems get called many different times. A subproblem consists of a call to lcs_length with two suffixes of X and Y as arguments, so there are exactly (m+1)(n+1) possible subproblems (a relatively small number). If there are nearly 2^n recursive calls, some of these subproblems must be solved over and over. The dynamic programming solution is to check, whenever we want to solve a subproblem, whether we have already done it before; if so, we look up the solution instead of recomputing it. Implemented in the most direct way, we just add some code to the recursive algorithm to do this lookup; this "top-down", recursive version of dynamic programming is known as "memoization". In the LCS problem, subproblems consist of a pair of suffixes of the two input strings. To make it easier to store and look up subproblem solutions, we index each subproblem by the pair of positions (i, j) at which the two suffixes begin, so that the solutions fit in a two-dimensional table.
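A minimal bottom-up sketch of step 3 follows (it fills only the c table; the b table of back-pointers is omitted for brevity). The strings in main are the X and Y used in the examples above, and the function name is illustrative.

#include <algorithm>
#include <iostream>
#include <string>
#include <vector>

// Bottom-up LCS length: c[i][j] = length of an LCS of the prefixes X[0..i-1] and Y[0..j-1].
int lcsLength(const std::string& X, const std::string& Y) {
    int m = X.size(), n = Y.size();
    std::vector<std::vector<int>> c(m + 1, std::vector<int>(n + 1, 0));
    for (int i = 1; i <= m; ++i)
        for (int j = 1; j <= n; ++j) {
            if (X[i - 1] == Y[j - 1])
                c[i][j] = c[i - 1][j - 1] + 1;                 // xm == yn: extend the LCS
            else
                c[i][j] = std::max(c[i - 1][j], c[i][j - 1]);  // take the longer of the two cases
        }
    return c[m][n];                                            // length of an LCS
}

int main() {
    std::cout << lcsLength("ACGCTAC", "CTGACA") << '\n';       // 4, e.g. CGCA
    return 0;
}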
LECTURE-20
Greedy Algorithms
Definition: A greedy algorithm always takes the best immediate, or local, solution while finding an answer. Greedy algorithms find the overall, or globally, optimal solution for some optimization problems, but may find less-than-optimal solutions for some instances of other problems.
An optimization problem is one in which you want to find not just a solution, but the best solution. A "greedy algorithm" sometimes works well for optimization problems. A greedy algorithm works in phases; at each phase we take the best we can get right now, without regard for future consequences, hoping that by choosing a local optimum at each step we will end up at a global optimum.
We have completed data structures and now turn to algorithm design methods. Often we are looking at optimization problems whose naive solutions take exponential time. For an optimization problem, we are given a set of constraints and an optimization function. Solutions that satisfy the constraints are called feasible solutions. A feasible solution for which the optimization function has the best possible value is called an optimal solution.
Cheapest lunch possible: works by getting the cheapest meat, fruit, vegetable, etc.
In a greedy method we attempt to construct an optimal solution in stages. At each stage we make the decision that appears best (under some criterion) at the time. A decision made at one stage is not changed at a later stage, so each decision should preserve feasibility.
Consider getting the best major: what is best now may be worst later.
Consider change making: given a coin system and an amount to make change for, we want a minimal number of coins. A greedy criterion could be: at each stage, increase the total amount of change constructed as much as possible; in other words, always pick the coin with the greatest value at every step. A greedy solution is optimal for some change systems (a sketch follows at the end of this lecture).
Machine scheduling: we have a number of jobs to be done on a minimum number of machines; each job has a start and an end time. Order the jobs by start time. If an old machine becomes available by the start time of the task to be assigned, assign the task to this machine; if not, assign it to a new machine. This is an optimal solution.
Note that our Huffman tree algorithm is an example of a greedy algorithm: pick the least-weight trees to combine.
Structure of a Greedy Algorithm
Initially the set of chosen items (the solution set) is empty. At each step:
o an item is considered for addition to the solution set, chosen by a selection function;
o IF the set would no longer be feasible, reject the item under consideration (it is never considered again);
o ELSE IF the set is still feasible, THEN add the current item.
A feasible set (of candidates) is promising if it can be extended to produce not merely a solution, but an optimal solution to the problem. In particular, the empty set is always promising. Why? Because an optimal solution always exists.
Unlike dynamic programming, which solves the subproblems bottom-up, a greedy strategy usually progresses in a top-down fashion, making one greedy choice after another and reducing each problem instance to a smaller one.
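Returning to the change-making criterion described earlier (always pick the largest coin that still fits), here is a minimal sketch of that greedy rule; the denominations in main are illustrative, and, as noted above, this rule is optimal for some coin systems but not for all.

#include <iostream>
#include <vector>

// Greedy change making: at each stage add the largest denomination that keeps the
// solution feasible (does not overshoot the amount). Decisions are never revisited.
std::vector<int> greedyChange(int amount, const std::vector<int>& coins /* sorted descending */) {
    std::vector<int> used;
    for (int c : coins) {
        while (amount >= c) {      // selection function: biggest coin that still fits
            used.push_back(c);
            amount -= c;
        }
    }
    return used;                   // may be non-optimal for some coin systems
}

int main() {
    for (int c : greedyChange(63, {25, 10, 5, 1}))   // 25 25 10 1 1 1
        std::cout << c << ' ';
    std::cout << '\n';
    return 0;
}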
Greedy-Choice Property
The "greedy-choice property" and "optimal substructure" are the two ingredients in a problem that lend it to a greedy strategy. The greedy-choice property says that a globally optimal solution can be arrived at by making a locally optimal choice.
Greedy algorithms can be characterized as being 'short-sighted' and 'non-recoverable'. They are ideal only for problems which have optimal substructure. Despite this, greedy algorithms are best suited for simple problems (e.g. giving change). It is important, however, to note that a greedy algorithm can be used as a selection algorithm to prioritize options within a search or branch-and-bound algorithm. There are a few variations of the greedy algorithm:
  Pure greedy algorithms
  Orthogonal greedy algorithms
  Relaxed greedy algorithms
Applications
Greedy algorithms mostly (but not always) fail to find the globally optimal solution, because they usually do not operate exhaustively on all the data. They can make commitments to certain choices too early, which prevents them from finding the best overall solution later. For example, all known greedy coloring algorithms for the graph coloring problem, and for all other NP-complete problems, do not consistently find optimal solutions. Nevertheless, they are useful because they are quick to devise and often give good approximations to the optimum. If a greedy algorithm can be proven to yield the global optimum for a given problem class, it typically becomes the method of choice because it is faster than other optimization methods such as dynamic programming. Examples of such greedy algorithms are Kruskal's algorithm and Prim's algorithm for finding minimum spanning trees, Dijkstra's algorithm for finding single-source shortest paths, and the algorithm for finding optimal Huffman trees. The theory of matroids, and the more general theory of greedoids, provide whole classes of such algorithms.
Greedy algorithms appear in network routing as well. Using greedy routing, a message is forwarded to the neighboring node which is "closest" to the destination. The notion of a node's location (and hence "closeness") may be determined by its physical location, as in geographic routing used by ad-hoc networks. Location may also be an entirely artificial construct, as in small-world routing and distributed hash tables.

LECTURE-21
Knapsack Problem
Statement: A thief robbing a store can carry a maximum weight of W in a knapsack. There are n items; the i-th item weighs wi and is worth vi dollars. Which items should the thief take? There are two versions of the problem.
I. Fractional knapsack problem
The setup is the same, but the thief can take fractions of items, meaning that the items can be broken into smaller pieces, so the thief may decide to carry only a fraction xi of item i, where 0 <= xi <= 1.
  Exhibits the greedy-choice property: a greedy algorithm exists.
  Exhibits the optimal-substructure property.
II. 0-1 knapsack problem
The setup is the same, but the items may not be broken into smaller pieces, so the thief must decide either to take an item or to leave it (a binary choice); a fraction of an item cannot be taken.
  Does not exhibit the greedy-choice property: no greedy algorithm is guaranteed to be optimal.
  Exhibits the optimal-substructure property. Only a dynamic programming algorithm exists.
Greedy Solution to the Fractional Knapsack Problem
There are n items in a store. For i = 1, 2, ..., n, item i has weight wi > 0 and worth vi > 0. The thief can carry a maximum weight of W pounds in the knapsack. In this version of the problem the items can be broken into smaller pieces, so the thief may decide to carry only a fraction xi of object i, where 0 <= xi <= 1.
Item i contributes xi*wi to the total weight in the knapsack and xi*vi to the value of the load. In symbols, the fractional knapsack problem can be stated as follows:
    maximize   sum for i = 1 to n of xi*vi
    subject to sum for i = 1 to n of xi*wi <= W
It is clear that an optimal solution must fill the knapsack exactly, for otherwise we could add a fraction of one of the remaining objects and increase the value of the load. Thus in an optimal solution, sum of xi*wi = W.

Greedy-Fractional-Knapsack(w, v, W)
    for i = 1 to n do
        x[i] = 0
    weight = 0
    while weight < W do
        i = best remaining item (largest ratio v[i]/w[i])
        if weight + w[i] <= W then
            x[i] = 1
            weight = weight + w[i]
        else
            x[i] = (W - weight) / w[i]
            weight = W
    return x

Analysis: If the items are already sorted into decreasing order of vi/wi, then the while-loop takes time O(n); therefore, the total time including the sort is O(n log n). If we keep the items in a heap with the largest vi/wi at the root, then creating the heap takes O(n) time and each iteration of the while-loop takes O(log n) time (since the heap property must be restored after the removal of the root). Although this data structure does not alter the worst case, it may be faster if only a small number of items are needed to fill the knapsack.
One variant of the 0-1 knapsack problem arises when the order of the items sorted by increasing weight is the same as their order sorted by decreasing value. The optimal solution to this variant is to sort the items by value in decreasing order, then pick the most valuable item (which also has the least weight) if its weight does not exceed the remaining capacity, deduct its weight from the remaining capacity, pick the most valuable item among those remaining, and keep following the same strategy until the thief cannot carry any more items (due to weight).
Proof: One way to prove the correctness of the above algorithm is to prove the greedy-choice property and the optimal-substructure property. It consists of two steps. First, prove that there exists an optimal solution that begins with the greedy choice given above. Second, prove that if A is an optimal solution to the original problem S, then A - {a} is also an optimal solution to the problem S - {s}, where a is the item the thief picked by the greedy choice and S - {s} is the subproblem that remains after the first greedy choice has been made. The second part is easy to prove, since the more valuable items have less weight: if the item with the best ratio v'/w' is not taken, it can replace any other item, keeping the solution feasible because w' < w while increasing the value because v' > v.
Theorem: The fractional knapsack problem has the greedy-choice property.
Proof: Let the ratio v'/w' be maximal. This supposition implies that v'/w' >= v/w for every pair (v, w), so v'*w/w' >= v for every (v, w). Now suppose a solution does not contain the full weight w' of the best-ratio item. Then replacing an amount of any other item's weight with more of w' will improve the value.
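For comparison with the pseudocode above, here is a minimal C++ sketch that realizes the "best remaining item" selection by first sorting the items by decreasing value-to-weight ratio (the O(n log n) variant of the analysis); the item data in main are made up for illustration.

#include <algorithm>
#include <iostream>
#include <vector>

struct Item { double weight, value; };

// Greedy fractional knapsack: take items in decreasing order of value/weight,
// taking only a fraction of the first item that no longer fits.
double fractionalKnapsack(std::vector<Item> items, double W) {
    std::sort(items.begin(), items.end(), [](const Item& a, const Item& b) {
        return a.value / a.weight > b.value / b.weight;   // best ratio first
    });
    double load = 0.0, total = 0.0;
    for (const Item& it : items) {
        if (load + it.weight <= W) {           // whole item fits: x_i = 1
            load += it.weight;
            total += it.value;
        } else {                               // take the fraction (W - load)/w_i and stop
            total += it.value * (W - load) / it.weight;
            break;
        }
    }
    return total;
}

int main() {
    // Illustrative data: weights and values are made up.
    std::vector<Item> items = {{10, 60}, {20, 100}, {30, 120}};
    std::cout << fractionalKnapsack(items, 50) << '\n';   // 240 (all of items 1, 2 plus 2/3 of item 3)
    return 0;
}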
LECTURE-22
Huffman Codes
Huffman coding is a technique for compressing data. Huffman's greedy algorithm looks at the frequency of occurrence of each character and represents each character by a binary string in an optimal way.
Example: Suppose we have a file of 100,000 characters that we want to compress, and the characters occur with the following frequencies.

    Character             a       b       c       d       e       f
    Frequency             45,000  13,000  12,000  16,000  9,000   5,000

Consider the problem of designing a "binary character code" in which each character is represented by a unique binary string.
Fixed-Length Code
A fixed-length code needs 3 bits to represent six (6) characters:

    Character             a       b       c       d       e       f
    Frequency             45,000  13,000  12,000  16,000  9,000   5,000
    Fixed-length code     000     001     010     011     100     101

This method requires 300,000 bits to code the entire file. How do we get 300,000? The total number of characters is 45,000 + 13,000 + 12,000 + 16,000 + 9,000 + 5,000 = 100,000, and each character is assigned a 3-bit codeword, so 3 * 100,000 = 300,000 bits.
Conclusion: the fixed-length code requires 300,000 bits while the variable-length code derived below requires 224,000 bits, a saving of approximately 25%.
Prefix Codes
A prefix code is one in which no codeword is a prefix of another codeword. The reason prefix codes are desirable is that they simplify encoding (compression) and decoding.
Can we do better than the fixed-length code? A variable-length code can, by giving frequent characters short codewords and infrequent characters long codewords:

    Character             a       b       c       d       e       f
    Frequency             45,000  13,000  12,000  16,000  9,000   5,000
    Variable-length code  0       101     100     111     1101    1100

Character 'a' occurs 45,000 times and is assigned a 1-bit codeword: 1 * 45,000 = 45,000 bits. Characters b, c, d occur 13,000 + 12,000 + 16,000 = 41,000 times and are each assigned a 3-bit codeword: 3 * 41,000 = 123,000 bits. Characters e, f occur 9,000 + 5,000 = 14,000 times and are each assigned a 4-bit codeword: 4 * 14,000 = 56,000 bits. The total is therefore 45,000 + 123,000 + 56,000 = 224,000 bits.
Encoding: concatenate the codewords representing each character of the file.

    String    Encoding
    TEA       10 00 010
    SEA       011 00 010
    TEN       10 00 110

Decoding: Since no codeword is a prefix of another, the codeword that begins an encoded file is unambiguous. To decode (translate back to the original characters), identify the initial codeword, remove it from the encoded file, and repeat. For example, with the variable-length code above, the string 001011101 parses uniquely as 0 . 0 . 101 . 1101, which decodes to aabe.
The decoding process can be represented by a binary tree whose leaves are characters. We interpret the binary codeword for a character as the path from the root to that character, where 0 means "go to the left child" and 1 means "go to the right child". Note that an optimal code for a file is always represented by a full binary tree (every internal node has two children).
Theorem: A binary tree that is not full cannot correspond to an optimal prefix code.
Proof: Let T be a binary tree corresponding to a prefix code such that T is not full. Then there must exist an internal node, say x, that has only one child, y. Construct another binary tree T' which has the same leaves as T and in which every leaf has the same depth as in T, except for the leaves in the subtree rooted at y: those leaves have strictly smaller depth in T', which implies T cannot correspond to an optimal prefix code. To obtain T', simply merge x and y into a single node z, where z is a child of the parent of x (if a parent exists) and z is a parent of any children of y. Then T' has the desired properties: it corresponds to a code on the same alphabet, and the leaves that were in the subtree rooted at y in T have depth in T' strictly less (by one) than their depth in T. This completes the proof.
Summary of the two codes:

    Character             a       b       c       d       e       f
    Frequency             45,000  13,000  12,000  16,000  9,000   5,000
    Fixed-length code     000     001     010     011     100     101
    Variable-length code  0       101     100     111     1101    1100

Constructing a Huffman Code
A greedy algorithm that constructs an optimal prefix code is called a Huffman code. The algorithm builds the tree T corresponding to the optimal code in a bottom-up manner. It begins with a set of |C| leaves and performs |C| - 1 "merging" operations to create the final tree.
Data structure used: a priority queue Q.

Huffman(C)
1. n = |C|
2. Q = C
3. for i = 1 to n - 1 do
4.     z = Allocate-Node()
5.     x = left[z] = Extract-Min(Q)
6.     y = right[z] = Extract-Min(Q)
7.     f[z] = f[x] + f[y]
8.     Insert(Q, z)
9. return Extract-Min(Q)

Analysis: Q is implemented as a binary heap. Line 2 can be performed using BUILD-HEAP (p. 145, CLR) in O(n) time. The for loop executes n - 1 times, and since each heap operation requires O(lg n) time, the for loop contributes (n - 1) * O(lg n) = O(n lg n). Thus the total running time of Huffman on a set of n characters is O(n lg n).

LECTURE-23
An Activity-Selection Problem
Activity selection is the problem of scheduling a resource among several competing activities.
Problem statement: Given a set S of n activities, where si is the start time and fi the finish time of the i-th activity, find a maximum-size set of mutually compatible activities.
Compatible activities: Activities i and j are compatible if the half-open intervals [si, fi) and [sj, fj) do not overlap, that is, if si >= fj or sj >= fi.
Greedy Algorithm for the Selection Problem
I. Sort the input activities by increasing finishing time: f1 <= f2 <= ... <= fn.
II. Call GREEDY-ACTIVITY-SELECTOR(s, f):
1. n = length[s]
2. A = {1}
3. j = 1
4. for i = 2 to n do
5.     if si >= fj then
6.         A = A U {i}
7.         j = i
8. return set A
Operation of the algorithm: Let 11 activities be given, S = {p, q, r, s, t, u, v, w, x, y, z}, with start and finish times (1, 4), (3, 5), (0, 6), (5, 7), (3, 8), (5, 9), (6, 10), (8, 11), (8, 12), (2, 13) and (12, 14).
    A = {p}            initialization at line 2
    A = {p, s}         line 6, first time the condition at line 5 holds
    A = {p, s, w}      line 6, second time the condition holds
    A = {p, s, w, z}   line 6, third time the condition holds
    Out of the for loop; return A = {p, s, w, z}.
Analysis: Part I requires O(n lg n) time (use merge sort or heap sort). Part II requires Theta(n) time, assuming that the activities were already sorted in Part I by their finish times.
Correctness: Greedy algorithms do not always produce optimal solutions, but GREEDY-ACTIVITY-SELECTOR does.
Theorem: Algorithm GREEDY-ACTIVITY-SELECTOR produces a solution of maximum size for the activity-selection problem.
Proof idea: Show that the activity-selection problem satisfies
I. the greedy-choice property, and
II. the optimal-substructure property.
Proof:
I. Let S = {1, 2, ..., n} be the set of activities, ordered by finish time, so activity 1 has the earliest finish time. Suppose A, a subset of S, is an optimal solution, with the activities in A also ordered by finish time, and let the first activity in A be k. If k = 1, then A begins with the greedy choice and we are done (or, to be precise, there is nothing to prove here). If k != 1, we show that there is another optimal solution B that begins with the greedy choice, activity 1. Let B = (A - {k}) U {1}. Because f1 <= fk, the activities in B are disjoint, and since B has the same number of activities as A, i.e., |A| = |B|, B is also optimal.
II. Once the greedy choice is made, the problem reduces to finding an optimal solution to the remaining subproblem. If A is an optimal solution to the original problem S, then A' = A - {1} is an optimal solution to the activity-selection problem S' = {i in S : si >= f1}. Why? Because if we could find a solution B' to S' with more activities than A', then adding activity 1 to B' would yield a solution B to S with more activities than A, thereby contradicting the optimality of A.
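A minimal C++ sketch of GREEDY-ACTIVITY-SELECTOR follows, including the Part I sort by finish time; the activity list in main is the 11-activity example traced above, and the struct and function names are just illustrative.

#include <algorithm>
#include <iostream>
#include <vector>

struct Activity { int start, finish; };

// GREEDY-ACTIVITY-SELECTOR: after sorting by finish time, repeatedly pick the first
// activity whose start time is no earlier than the finish time of the last one picked.
// Returns indices (into the sorted order) of the chosen activities.
std::vector<int> selectActivities(std::vector<Activity> a) {
    std::sort(a.begin(), a.end(), [](const Activity& x, const Activity& y) {
        return x.finish < y.finish;                  // Part I: sort by finish time
    });
    std::vector<int> A;
    if (a.empty()) return A;
    A.push_back(0);                                  // A = {1}
    int j = 0;
    for (int i = 1; i < (int)a.size(); ++i)
        if (a[i].start >= a[j].finish) {             // compatible with the last chosen activity
            A.push_back(i);
            j = i;
        }
    return A;
}

int main() {
    // The 11 activities p..z from the example above.
    std::vector<Activity> acts = {{1,4},{3,5},{0,6},{5,7},{3,8},{5,9},
                                  {6,10},{8,11},{8,12},{2,13},{12,14}};
    std::vector<int> chosen = selectActivities(acts);
    std::cout << chosen.size() << " activities selected\n";   // 4: (1,4) (5,7) (8,11) (12,14)
    return 0;
}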
As an example, consider scheduling a set of activities among lecture halls: schedule all the activities using as few lecture halls as possible. In order to determine which activity should use which lecture hall, the algorithm uses GREEDY-ACTIVITY-SELECTOR to calculate the activities for the first lecture hall. If there are activities yet to be scheduled, a new lecture hall is selected and GREEDY-ACTIVITY-SELECTOR is called again. This continues until all activities have been scheduled.

GREEDY-ACTIVITY-SELECTOR(s, f, n)
    j = first(s)
    A = {j}
    for i = j + 1 to n do
        if s[i] != "-" then
            if s[i] >= f[j] then
                A = A U {i}
                s[i] = "-"
                j = i
    return A

Analysis: In the worst case, the number of lecture halls required is n, and GREEDY-ACTIVITY-SELECTOR runs in Theta(n), so the running time of this algorithm is O(n^2).
Two important observations:
Choosing the activity of least duration will not always produce an optimal solution. For example, with the set of activities {(3, 5), (6, 8), (1, 4), (4, 7), (7, 10)}, either (3, 5) or (6, 8) will be picked first, which prevents the optimal solution {(1, 4), (4, 7), (7, 10)} from being found.
Choosing the activity with the least overlap will not always produce an optimal solution either. For example, with the set of activities {(0, 4), (4, 6), (6, 10), (0, 1), (1, 5), (5, 9), (9, 10), (0, 3), (0, 2), (7, 10), (8, 10)}, the activity with the least overlap with the others is (4, 6), so it is picked first. But that prevents the optimal solution {(0, 1), (1, 5), (5, 9), (9, 10)} from being found.

LECTURE-24
Assembly-Line Scheduling (ALS)
Find the structure of the fastest way through the factory. Consider the fastest way from the starting point through station S1,j (the argument for S2,j is the same):
• j = 1: only one possibility.
• j = 2, 3, ..., n: two possibilities, from S1,j-1 or from S2,j-1.
  – from S1,j-1: additional time a1,j;
  – from S2,j-1: additional time t2,j-1 + a1,j.
• Suppose the fastest way through S1,j is through S1,j-1. Then the chassis must have taken a fastest way from the starting point through S1,j-1. Why? Otherwise we could substitute a faster way through S1,j-1 and obtain a faster way through S1,j, a contradiction. Similarly for S2,j-1.
Step 1: Find the optimal structure.
• An optimal solution to the problem contains within it optimal solutions to subproblems:
• the fastest way through station Si,j contains within it the fastest way through station S1,j-1 or S2,j-1.
• Thus we can construct an optimal solution to the problem from optimal solutions to subproblems.
Step 2: A recursive solution.
• Let fi[j] (i = 1, 2 and j = 1, 2, ..., n) denote the fastest possible time to get a chassis from the starting point through station Si,j.
• Let f* denote the fastest time for a chassis all the way through the factory. Then
    f* = min(f1[n] + x1, f2[n] + x2)
• f1[1] = e1 + a1,1 is the fastest time to get through S1,1. The full recursive solution is:
    f* = min(f1[n] + x1, f2[n] + x2)
    f1[j] = e1 + a1,1                                             if j = 1
    f1[j] = min(f1[j-1] + a1,j, f2[j-1] + t2,j-1 + a1,j)          if j > 1
    f2[j] = e2 + a2,1                                             if j = 1
    f2[j] = min(f2[j-1] + a2,j, f1[j-1] + t1,j-1 + a2,j)          if j > 1
• fi[j] (i = 1, 2; j = 1, 2, ..., n) records the optimal values of the subproblems.
• To keep track of the fastest way, introduce li[j] to record the line number (1 or 2) whose station j-1 is used in a fastest way through Si,j, and introduce l* to be the line whose station n is used in a fastest way through the factory.
Step 3: Computing the fastest time.
• One option is a direct recursive algorithm. Let ri(j) be the number of references made to fi[j]:
    r1(n) = r2(n) = 1
    r1(j) = r2(j) = r1(j+1) + r2(j+1)
    so ri(j) = 2^(n-j).
• Thus f1[1] is referred to 2^(n-1) times, the total number of references to all fi[j] is Theta(2^n), and the running time of the direct recursion is exponential. The remedy is to compute the fi[j] values bottom-up, each exactly once.
Step 4: Construct the fastest way through the factory.
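Since the direct recursion is exponential, the standard remedy (consistent with the DP steps above) is to fill f1, f2 and the line-tracking arrays l1, l2 bottom-up, each entry once, and then trace l* backwards for step 4. The station counts and times in the following sketch are made-up illustrative numbers.

#include <algorithm>
#include <iostream>
#include <vector>

int main() {
    // Illustrative instance (all numbers made up): n stations on each of 2 lines.
    int n = 4;
    std::vector<std::vector<int>> a = {{7, 9, 3, 4},   // a[i][j]: processing time at station j+1 of line i+1
                                       {8, 5, 6, 4}};
    std::vector<std::vector<int>> t = {{2, 3, 1},      // t[i][j]: transfer time after station j+1 of line i+1
                                       {2, 1, 2}};
    int e1 = 2, e2 = 4;                                // entry times
    int x1 = 3, x2 = 2;                                // exit times

    std::vector<int> f1(n), f2(n), l1(n), l2(n);       // fastest times and the lines they came from
    f1[0] = e1 + a[0][0];
    f2[0] = e2 + a[1][0];
    for (int j = 1; j < n; ++j) {
        // f1[j] = min(f1[j-1] + a1,j, f2[j-1] + t2,j-1 + a1,j), recording the line used.
        if (f1[j-1] <= f2[j-1] + t[1][j-1]) { f1[j] = f1[j-1] + a[0][j];              l1[j] = 1; }
        else                                { f1[j] = f2[j-1] + t[1][j-1] + a[0][j];  l1[j] = 2; }
        if (f2[j-1] <= f1[j-1] + t[0][j-1]) { f2[j] = f2[j-1] + a[1][j];              l2[j] = 2; }
        else                                { f2[j] = f1[j-1] + t[0][j-1] + a[1][j];  l2[j] = 1; }
    }
    int fstar = std::min(f1[n-1] + x1, f2[n-1] + x2);
    int lstar = (f1[n-1] + x1 <= f2[n-1] + x2) ? 1 : 2;
    std::cout << "fastest time through the factory: " << fstar << '\n';

    // Step 4: construct the fastest way by tracing l* and li[j] backwards.
    std::vector<int> line(n);
    line[n-1] = lstar;
    for (int j = n - 1; j > 0; --j)
        line[j-1] = (line[j] == 1) ? l1[j] : l2[j];
    for (int j = 0; j < n; ++j)
        std::cout << "station " << j + 1 << " on line " << line[j] << '\n';
    return 0;
}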
LECTURE-25
Doubt clearing class

LECTURE: 26, 27
Data Structures for Disjoint Sets
A simple approach to creating a disjoint-set data structure is to create a linked list for each set. The element at the head of each list is chosen as its representative. MakeSet creates a list of one element. Union appends the two lists, a constant-time operation. The drawback of this implementation is that Find requires O(n), or linear, time to traverse the list backwards from a given element to the head of the list.
This can be avoided by including in each linked-list node a pointer to the head of the list; then Find takes constant time, since this pointer refers directly to the set representative. However, Union now has to update each element of the list being appended so that it points to the head of the new combined list, requiring Omega(n) time.
When the length of each list is tracked, the required time can be improved by always appending the smaller list to the longer one. Using this weighted-union heuristic, a sequence of m MakeSet, Union, and Find operations on n elements requires O(m + n log n) time. For asymptotically faster operations, a different data structure is needed.
We now describe some methods for maintaining a collection of disjoint sets. Each set is represented as a pointer-based data structure, with one node per element. Each set has a 'leader' element, which uniquely identifies the set. (Since the sets are always disjoint, the same object cannot be the leader of more than one set.) We want to support the following operations.
MakeSet(x): Create a new set {x} containing the single element x. The element x must not appear in any other set in our collection. The leader of the new set is obviously x.
Find(x): Find (the leader of) the set containing x.
Union(A, B): Replace two sets A and B in our collection with their union A U B. For example, Union(A, MakeSet(x)) adds a new element x to an existing set A. The sets A and B are specified by arbitrary elements, so Union(x, y) has exactly the same behavior as Union(Find(x), Find(y)).
Disjoint-set data structures have many applications. For instance, Kruskal's minimum spanning tree algorithm relies on such a data structure to maintain the components of the intermediate spanning forest. Another application is maintaining the connected components of a graph as new vertices and edges are added. In both of these applications we can use a disjoint-set data structure in which we keep a set for each connected component, containing that component's vertices.
Reversed Trees
One of the easiest ways to store sets is using trees. Each object points to another object, called its parent, except for the leader of each set, which points to itself and is thus the root of the tree. MakeSet is trivial. Find traverses the parent pointers up to the leader. Union just redirects the parent pointer of one leader to the other. Notice that, unlike in most tree data structures, objects do not have pointers down to their children.
MakeSet clearly takes Theta(1) time, and Union requires only O(1) time in addition to the two Finds. The running time of Find(x) is proportional to the depth of x in the tree. It is not hard to come up with a sequence of operations that results in a tree that is a long chain of nodes, so that Find takes Theta(n) time in the worst case.
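A minimal sketch of the reversed-tree representation just described follows: parent pointers only, Find walking up to the leader, and a naive Union that redirects one leader to the other (the union-by-depth refinement discussed next is deliberately left out). The struct and member names are illustrative.

#include <iostream>
#include <vector>

// "Reversed tree" disjoint-set representation: parent[x] points toward the leader;
// the leader (root) points to itself. Union naively redirects one leader to the other.
struct DisjointSets {
    std::vector<int> parent;

    int makeSet() {                        // create a new singleton set, return its element
        parent.push_back((int)parent.size());
        return (int)parent.size() - 1;
    }
    int find(int x) const {                // follow parent pointers up to the leader
        while (parent[x] != x) x = parent[x];
        return x;
    }
    void unite(int x, int y) {             // Union(Find(x), Find(y))
        int rx = find(x), ry = find(y);
        if (rx != ry) parent[rx] = ry;     // redirect one leader to the other
    }
};

int main() {
    DisjointSets ds;
    int a = ds.makeSet(), b = ds.makeSet(), c = ds.makeSet();
    ds.unite(a, b);
    std::cout << (ds.find(a) == ds.find(b)) << '\n';   // 1: same set
    std::cout << (ds.find(a) == ds.find(c)) << '\n';   // 0: different sets
    return 0;
}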
However, there is an easy change we can make to our Union algorithm, called union by depth, so that the trees always have logarithmic depth: whenever we need to merge two trees, we always make the root of the shallower tree a child of the deeper one. This requires us to also maintain the depth of each tree, but that is quite easy. With this simple change, Find and Union both run in Theta(log n) time in the worst case.
Shallow Threaded Trees
Alternatively, we could just have every object keep a pointer to the leader of its set. Thus, each set is represented by a shallow tree, where the leader is the root and all the other elements are its children. With this representation, MakeSet and Find are completely trivial; both operations clearly run in constant time. Union is a little more difficult, but not much. Our algorithm sets all the leader pointers in one set to point to the leader of the other set. To do this, we need a method to visit every element in a set; we will 'thread' a linked list through each set, starting at the set's leader. The two threads are merged in the Union algorithm in constant time.
The worst-case running time of Union is a constant times the size of the larger set. Thus, if we merge a one-element set with another n-element set, the running time can be Theta(n). Generalizing this idea, it is quite easy to come up with a sequence of n MakeSet and n - 1 Union operations that requires Theta(n^2) time to create the set {1, 2, ..., n} from scratch.

LECTURE: 28
0-1 Knapsack Problem
1. Partition the problem into subproblems. 2. Solve the subproblems. 3. Combine the solutions to solve the original one.
Remark: If the subproblems are not independent, i.e., the subproblems share sub-subproblems, then a divide-and-conquer algorithm repeatedly solves the common sub-subproblems.
The Idea of Developing a DP Algorithm
Step 1: Structure. Characterize the structure of an optimal solution. Decompose the problem into smaller problems, and find a relation between the structure of the optimal solution of the original problem and the solutions of the smaller problems.
Step 2: Principle of optimality. Recursively define the value of an optimal solution. Express the solution of the original problem in terms of optimal solutions of smaller problems.
Step 3: Bottom-up computation. Compute the value of an optimal solution in a bottom-up fashion using a table structure.
Step 4: Construction of an optimal solution. Construct an optimal solution from the computed information.
Steps 3 and 4 may often be combined.
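Although the notes above give only the general DP recipe for this lecture, a standard bottom-up formulation of the 0-1 knapsack itself (take or leave each item exactly once) illustrates all four steps in miniature. The weights, values and capacity below are made-up illustrative data, and this is one common formulation rather than necessarily the one developed in class.

#include <algorithm>
#include <iostream>
#include <vector>

int main() {
    // Illustrative instance (weights, values, capacity are made up).
    std::vector<int> w = {2, 3, 4, 5};     // item weights
    std::vector<int> v = {3, 4, 5, 8};     // item values
    int W = 9;                             // knapsack capacity
    int n = w.size();

    // Steps 1-2: K[i][j] = best value using items 1..i with capacity j;
    // either leave item i, or (if it fits) take it once.
    std::vector<std::vector<int>> K(n + 1, std::vector<int>(W + 1, 0));

    // Step 3: fill the table bottom-up.
    for (int i = 1; i <= n; ++i)
        for (int j = 0; j <= W; ++j) {
            K[i][j] = K[i - 1][j];                                          // leave item i
            if (j >= w[i - 1])
                K[i][j] = std::max(K[i][j], K[i - 1][j - w[i - 1]] + v[i - 1]); // take item i
        }
    std::cout << "best value: " << K[n][W] << '\n';    // 13 for this data (items 3 and 4)

    // Step 4: construct an optimal item set from the stored table.
    for (int i = n, j = W; i >= 1; --i)
        if (K[i][j] != K[i - 1][j]) {                  // item i was taken
            std::cout << "take item " << i << " (w=" << w[i-1] << ", v=" << v[i-1] << ")\n";
            j -= w[i - 1];
        }
    return 0;
}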