DAA Lecture Notes
LECTURE-1
Course Description:
Introduction to design and analysis of algorithms, Growth of Functions Recurrences,
solution of recurrences by substitution, recursion tree and Master methods, Merge
sort, Quick sort and Binary search, Divide and conquer algorithms. The heap sort
algorithm, Priority Queue,
Dynamic programming algorithms (Matrix-chain multiplication, Elements of dynamic
programming, Longest common subsequence) Greedy Algorithms - (Assembly-line
scheduling, Activity- selection Problem, Elements of Greedy strategy, Fractional
knapsack problem, Huffman codes).
Data structure for disjoint sets:- Disjoint set operations, Linked list representation,
Disjoint set forests.
Graph Algorithms: Breadth first and depth-first search, Minimum Spanning Trees,
Kruskal and Prim's algorithms, single- source shortest paths (Bellman-ford and
Dijkstra's algorithms), All-pairs shortest paths (Floyd-Warshall Algorithm). Backtracking, Branch and Bound.
String matching (Rabin-Karp algorithm), NP - Completeness and reducibility,
Approximation algorithms (Vertex-Cover Problem, Traveling Salesman Problem).
Course Objective:
In this subject students will learn about the following points:
1. To make students familiar with the basic concepts of algorithms and their use.
2. To learn about the detailed design of algorithms and how to analyze them.
3. To learn about the different algorithm design techniques.
4. To understand the divide and conquer technique.
5. To understand the dynamic programming technique.
6. To understand greedy algorithms.
7. To learn about data compression using Huffman coding.
8. To learn how to calculate shortest paths.
9. To understand the branch and bound method.
10. To understand approximation algorithms.
Course Outcome:
a. Students will know which algorithm is suited to which problem.
b. Students will understand searching techniques, which help to search data in a database efficiently.
c. Students will be able to understand how to reduce the number of operations in matrix-chain multiplication.
d. Students will be able to understand how to find the longest common subsequence of two given strings, and its relevance to DNA analysis.
e. Students will know how to compress a data file.
f. Students will know how to find shortest paths in the single-source and all-pairs cases.
g. Students will know how to execute the maximum number of activities in a fixed period of time out of N given activities in real life.
h. Students will know how to use the cutting of raw materials in real-world decision-making processes.
i. Students will know how to solve the N-queens problem.
j. Students will know how a salesman can cover all the cities in a geographic area (visiting each city exactly once) and return to the starting point.
k. Graduates will develop confidence for self-education and the ability for lifelong learning.
l. Graduates can participate and succeed in competitive examinations such as GATE, GRE, DRDO, and software company recruitment.
m. Graduates will be able to show the impact of engineering solutions.
LECTURE-2
Introduction to algorithm
The word "algorithm" is derived from the name of the Persian mathematician Abu Abdullah Muhammad ibn Musa al-Khwarizmi, a great mathematician who worked on algebra, geometry and astronomy.
An algorithm is a sequence of unambiguous instructions for solving a problem,
i.e., for obtaining a required output for any legitimate input in a finite amount of
time.
or
Algorithm is defined as a formula or set of finite steps for solving a particular
problem or we can say that it is a well-defined computational procedure that takes
some values as input and produces some values as output. Unlike programs,
algorithms are not dependent on a particular programming language, machine,
system, or compiler. They are mathematical entities, which can be thought of as
running on some sort of idealized computer with an infinite random access
memory and an unlimited word size.
Every algorithm must satisfy the following criteria:
• INPUT – Zero or more quantities are externally supplied. All algorithms should have some input. The logic of the algorithm should work on this input to give the desired result.
• OUTPUT – At least one output should be produced by the algorithm based on the input given.
• DEFINITENESS – Every step of the algorithm should be clear and unambiguous. Ambiguity means doubtfulness or uncertainty as regards interpretation.
• FINITENESS – Every algorithm should have a proper end. If we trace out the instructions of an algorithm, then for all cases the algorithm terminates after a finite number of steps.
• EFFECTIVENESS – Every step in the algorithm should be easy to understand and implementable in any programming language.
Programming is a very complex task, and there are a number of aspects of
programming that make it so complex. The first is that most programming projects
are very large, requiring the coordinated efforts of many people. (This is the topic of a course like software engineering.) The next is that many programming projects
involve storing and accessing large quantities of data efficiently. (This is the topic
of courses on data structures and databases.) The last is that many programming
projects involve solving complex computational problems, for which simplistic or
naive solutions may not be efficient enough. The complex problems may involve
numerical data (the subject of courses on numerical analysis), but often they
involve discrete data. This is where the topic of algorithm design and analysis is
important.
Algorithms as a technology
Suppose computers were infinitely fast and computer memory was free. Would
you have any reason to study algorithms? The answer is yes, if for no other reason
than that you would still like to demonstrate that your solution method
terminates and does so with the correct answer.
If computers were infinitely fast, any correct method for solving a problem would
do. You would probably want your implementation to be within the bounds of
good software engineering practice (i.e., well designed and documented), but you
would most often use whichever method was the easiest to implement.
Of course, computers may be fast, but they are not infinitely fast. And memory
may be cheap, but it is not free. Computing time is therefore a bounded resource,
and so is space in memory. These resources should be used wisely, and algorithms
that are efficient in terms of time or space will help you do so.
Issues in algorithm design
Algorithms are mathematical objects (in contrast to the much more concrete
notion of a computer program implemented in some programming language and
executing on some machine). As such, we can reason about the properties of
algorithms mathematically. When designing an algorithm there are two
fundamental issues to be considered: correctness and efficiency. It is important to
justify an algorithm’s correctness mathematically. For very complex algorithms,
this typically requires a careful mathematical proof, which may require the proof
of many lemmas and properties of the solution, upon which the algorithm relies.
Establishing efficiency is a much more complex endeavor. Intuitively, an
algorithm’s efficiency is a function of the amount of computational resources it
requires, measured typically as execution time and the amount of space, or
memory, that the algorithm uses. The amount of computational resources can be a
complex function of the size and structure of the input set. In order to reduce
matters to their simplest form, it is common to consider efficiency as a function of
input size.
Some of the techniques used in designing algorithms are brute force
approach, divide and conquer approach, greedy approach, and dynamic
programming approach.
Brute force approach: This is one of the most popular algorithmic techniques and covers a wide range of problems. In this approach, the algorithm follows directly from the problem statement and considers the complete range of input values during execution.
Divide and conquer approach: A technique which divides the problem into smaller
units. After dividing the problem, the technique tries to solve the smaller unit.
Once the smaller units are solved, the solution of smaller units is combined to get
the final solution of the problem.
Greedy approach: A technique that builds a solution step by step, at each step making the choice that looks best at the moment, in the hope of arriving at an optimal solution to the given problem.
Dynamic programming approach: A technique which divides the problem into sub-problems and then solves them. The sub-problems are overlapping in nature, so each sub-problem is solved only once and its solution is reused.
Issues in algorithm analysis
Any algorithm when implemented, uses the computer’s primary memory(RAM) to
hold the program and data and CPU for execution. To analyze an algorithm there is
need to determine how much TIME an algorithm takes when implemented and
how much SPACE it occupies in primary memory. Analysis of algorithms needs good mathematical skills. It enables
1. Quantitative studies (studies based on facts and numbers) of algorithms.
2. Knowledge of whether the software will meet the qualitative requirements.
Before starting analysis of algorithms, it is necessary to know the kind of
computer on which algorithms are expected to run. If an algorithm is assumed to
run on a standard computer, analysis of the algorithm maps to the number of operations executed on it.
Analysis of an algorithm is concerned with three cases:
1. Worst case complexity: The worst case complexity of an algorithm is the
function defined by the maximum number of steps taken on any
instance of size n.
2. Best case complexity: The best case complexity of the algorithm is the
function defined by the minimum number of steps taken on any
instance of size n.
3. Average case complexity: The average case complexity of the algorithm
is the function defined by an average number of steps taken on an
instance of size n.
Each of these cases is described by a numerical function of the input size, measured in terms of two resources:
a. Time complexity: The time complexity of an algorithm is the amount of
computer time it needs to run to completion. The time T(p) taken by a
program p is the sum of the compile time and the run time. The time
complexity of an algorithm can depend on various factors like
1. The input to the program.
2. The quality of code generated by the compiler used to create the
object program.
3. The nature & speed of the instructions on the machine used to
execute the program.
The run time is denoted by tp; the tp of an algorithm is given by the number of steps taken by the algorithm to compute the function it was written for. The number of steps is itself a function of the instance characteristics.
There are two ways to determine the number of steps needed to solve a particular problem.
In the first method, we introduce a variable count into the program. This is a global variable with initial value zero; it is incremented alongside every statement we wish to charge a step to.
In the second method, we build a table in which we record the total number of steps contributed by each statement. Here we first determine the number of steps per execution (s/e) of the statement and the total number of times each statement is executed (frequency).
The steps per execution of a statement is the amount by which the count changes as a result of the execution of that statement.
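To make the first method concrete, here is a small illustrative C++ sketch (not from the original notes): a global counter, called count_steps below, is incremented next to every statement we charge a step to, so running the instrumented function reports its step count as a function of n. For this function the counter reports 2n + 3 steps, which is also the total one would obtain from the s/e × frequency table of the second method.

#include <iostream>
#include <vector>
using namespace std;

long long count_steps = 0;   // global step counter (first method)

// Sum the n elements of a vector, instrumented with the counter.
int sum(const vector<int>& a, int n) {
    int s = 0;        count_steps++;   // assignment: 1 step
    for (int i = 0; i < n; i++) {
        count_steps++;                 // loop control: 1 step per iteration
        s += a[i];    count_steps++;   // addition/assignment: 1 step per iteration
    }
    count_steps++;                     // final (failing) loop test: 1 step
    count_steps++;                     // return: 1 step
    return s;
}

int main() {
    vector<int> a(10, 1);
    sum(a, (int)a.size());
    cout << "steps for n = 10: " << count_steps << endl;   // prints 2n + 3 = 23
    return 0;
}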
b. Space complexity: The space complexity of an algorithm is the amount of memory it needs to run to completion. The space requirement s(p) of any algorithm p may therefore be written as s(p) = c + sp, where c is a constant and sp is the instance characteristic. Generally we concentrate solely on estimating sp, i.e., deciding which instance characteristics to use to measure the space requirements.
The space needed by any algorithm is seen to be the sum of the
following components.
1. A fixed part that is independent of the characteristics (e.g., number, size) of the inputs and outputs. This part includes the space for the code, space for simple variables and fixed-size component variables, space for constants, and so on.
2. A variable part that consists of the space needed by component variables whose size is dependent on the particular problem instance being solved, the space needed by referenced variables, the recursion stack space, etc.
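As a small illustration of the fixed and variable parts (a sketch added here, not part of the original notes): an iterative sum uses only a fixed amount of extra space, while the recursive version adds a recursion-stack contribution that grows with n, so its s(p) = c + sp has sp proportional to n.

#include <vector>
using namespace std;

// Fixed extra space: a few scalar variables, independent of n.
int sumIterative(const vector<int>& a) {
    int s = 0;
    for (size_t i = 0; i < a.size(); i++)
        s += a[i];
    return s;                                   // extra space: O(1)
}

// Variable extra space: one stack frame per element, so the recursion
// stack contributes space proportional to n.
int sumRecursive(const vector<int>& a, int n) {
    if (n == 0) return 0;
    return a[n - 1] + sumRecursive(a, n - 1);   // extra space: O(n) stack
}

int main() {
    vector<int> v(100, 1);
    return sumIterative(v) == sumRecursive(v, (int)v.size()) ? 0 : 1;
}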
LECTURE-3
Asymptotic notation
There are notations for determining the order of magnitude of algorithms during a priori analysis, i.e. analysis of an algorithm done before running the algorithm on any computer. These notations are called asymptotic notations. The word asymptotic
relates to or of the nature of an asymptote which means a line whose distance to a
given curve tends to zero. An asymptote may or may not intersect its associated
curve. The asymptotic notation describes the algorithm efficiency and performance
in a meaningful way and also describes the behaviour of time or space complexity
for large instance characteristics.
Asymptotic notation provides us with a way to simplify the functions that
arise in analyzing algorithm running times by ignoring constant factors and
concentrating on the trends for large values of n.
Ignore constant factors: Multiplicative constant factors are ignored. Constant factors appearing in exponents, however, cannot be ignored.
Focus on large n: Asymptotic analysis means that we consider trends for large
values of n. Thus, the fastest growing function of n is the only one that needs to
be considered.
The asymptotic running time of an algorithm is defined in terms of functions whose domains are the set of natural numbers (and sometimes the set of real numbers).
When we look at input sizes large enough to make only the order of growth of the
running time relevant, we are studying the asymptotic efficiency of algorithms.
That is, we are concerned with how the running time of an algorithm increases
with the size of the input in the limit, as the size of the input increases without
bound. Usually, an algorithm that is asymptotically more efficient will be the best
choice for all but very small inputs. The notations we use to describe the
asymptotic running time of an algorithm are defined in terms of functions whose
domains are the set of natural numbers N = {0, 1, 2, ...}. Such notations are
convenient for describing the worst-case running-time function T (n), which is
usually defined only on integer input sizes.
Θ-notation
For a given function g(n), we denote by Θ(g(n)) the set of functions
Θ(g(n)) = {f(n) : there exist positive constants c1, c2, and n0 such that
0 ≤ c1g(n) ≤ f(n) ≤ c2g(n) for all n ≥ n0}.
A function f(n) belongs to the set Θ(g(n)) if there exist positive
constants c1 and c2 such that it can be "sandwiched" between c1g(n) and c2g(n), for
sufficiently large n. Because Θ(g(n)) is a set, we could write "f(n) ∈ Θ(g(n))" to
indicate that f(n) is a member of Θ(g(n)). Instead, we will usually write "f(n)
= Θ(g(n))" to express the same notion.
The definition of Θ(g(n)) requires that every member f(n) ∈ Θ(g(n))
be asymptotically nonnegative, that is, that f(n) be nonnegative whenever n is
sufficiently large. (An asymptotically positive function is one that is positive for all
sufficiently large n.) Consequently, the function g(n) itself must be asymptotically
nonnegative, or else the set Θ(g(n)) is empty. We shall therefore assume that every
function used within Θ-notation is asymptotically nonnegative.
Let us briefly justify this intuition by using the formal definition to show that (1/2)n² − 3n = Θ(n²). To do so, we must determine positive constants c1, c2, and n0 such that
c1n² ≤ (1/2)n² − 3n ≤ c2n²
for all n ≥ n0. Dividing by n² yields
c1 ≤ 1/2 − 3/n ≤ c2.
The right-hand inequality can be made to hold for any value of n ≥ 1 by choosing c2 ≥ 1/2. Likewise, the left-hand inequality can be made to hold for any value of n ≥ 7 by choosing c1 ≤ 1/14. Thus, by choosing c1 = 1/14, c2 = 1/2, and n0 = 7, we can verify that (1/2)n² − 3n = Θ(n²). Certainly, other choices for the constants exist, but the important thing is that some choice exists. Note that these constants depend on the function (1/2)n² − 3n; a different function belonging to Θ(n²) would usually require different constants.
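As a quick numerical sanity check of the constants chosen above (an illustrative C++ sketch, not part of the original text), the sandwich c1·g(n) ≤ f(n) ≤ c2·g(n) with c1 = 1/14, c2 = 1/2 and n0 = 7 can be verified directly; integer arithmetic is used to avoid rounding issues.

#include <cassert>

int main() {
    // f(n) = (1/2)n^2 - 3n,  g(n) = n^2,  c1 = 1/14,  c2 = 1/2,  n0 = 7.
    // To avoid fractions, check the equivalent integer inequalities
    //   14*f(n) >= g(n)  and  2*f(n) <= g(n)  for all n >= 7.
    for (long long n = 7; n <= 1000; n++) {
        long long two_f = n * n - 6 * n;   // 2*f(n)
        long long g = n * n;               // g(n)
        assert(7 * two_f >= g);            // c1*g(n) <= f(n)
        assert(two_f <= g);                // f(n) <= c2*g(n)
    }
    return 0;
}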
We can also use the formal definition to verify that 6n³ ≠ Θ(n²). Suppose for the purpose of contradiction that c2 and n0 exist such that 6n³ ≤ c2n² for all n ≥ n0. But then n ≤ c2/6, which cannot possibly hold for arbitrarily large n, since c2 is constant.
Intuitively, the lower-order terms of an asymptotically positive function can be
ignored in determining asymptotically tight bounds because they are insignificant
for large n. A tiny fraction of the highest-order term is enough to dominate
the lower-order terms. Thus, setting c1 to a value that is slightly smaller than the
coefficient of the highest-order term and setting c2 to a value that is slightly larger
permits the inequalities in the definition of Θ-notation to be satisfied. The
coefficient of the highest-order term can likewise be ignored, since it only
changes c1 and c2 by a constant factor equal to the coefficient.
Since any constant is a degree-0 polynomial, we can express any constant function as Θ(n⁰), or Θ(1). We shall often use the notation Θ(1) to mean either a constant or a constant function with respect to some variable.
O-notation
The Θ-notation asymptotically bounds a function from above and below. When
we have only an asymptotic upper bound, we use O-notation. For a given
function g(n), we denote by O(g(n)) (pronounced "big-oh of g of n" or sometimes
just "oh of g of n") the set of functions
O(g(n)) = {f(n): there exist positive constants c and n0 such that 0 ≤ f(n) ≤ cg(n) for
all n ≥ n0}.
We use O-notation to give an upper bound on a function, to within a constant
factor.
We write f(n) = O(g(n)) to indicate that a function f(n) is a member of the
set O(g(n)).
Note that f(n) = Θ(g(n)) implies f(n) = O(g(n)), since Θ-notation is a stronger
notation than O-notation.
Using O-notation, we can often describe the running time of an algorithm merely by inspecting the algorithm's overall structure. When we say "the running time is O(n²)," we mean that there is a function f(n) that is O(n²) such that for any value of n, no matter what particular input of size n is chosen, the running time on that input is bounded from above by the value f(n). Equivalently, we mean that the worst-case running time is O(n²).
Ω-notation
Just as O-notation provides an asymptotic upper bound on a function, Ω-notation
provides an asymptotic lower bound. For a given function g(n), we denote
by Ω(g(n)) (pronounced "big-omega of g of n" or sometimes just "omega
of g of n") the set of functions
Ω(g(n)) = {f(n): there exist positive constants c and n0 such that 0 ≤ cg(n) ≤ f(n) for
all n ≥ n0}.
(Graphic examples of the Θ, O, and Ω notations. In each part, the value
of n0 shown is the minimum possible value; any greater value would also work.)
(a) Θ-notation bounds a function to within constant factors. We
write f(n) = Θ(g(n)) if there exist positive constants n0, c1,
and c2 such that to the right of n0, the value of f(n) always lies
between c1g(n) and c2g(n) inclusive.
(b) O-notation gives an upper bound for a function to within a
constant factor. We write f(n) = O(g(n)) if there are positive
constants n0 and c such that to the right of n0, the value of f(n)
always lies on or below cg(n).
(c) Ω-notation gives a lower bound for a function to within a
constant factor. We write f(n) = Ω(g(n)) if there are positive
constants n0 and c such that to the right of n0, the value of f(n)
always lies on or above cg(n).
o-notation
The asymptotic upper bound provided by O-notation may or may not be asymptotically tight. The bound 2n² = O(n²) is asymptotically tight, but the bound 2n = O(n²) is not. We use o-notation to denote an upper bound that is not asymptotically tight. We formally define o(g(n)) ("little-oh of g of n") as the set
o(g(n)) = {f(n) : for any positive constant c > 0, there exists a constant n0 > 0 such that 0 ≤ f(n) < cg(n) for all n ≥ n0}.
For example, 2n = o(n²), but 2n² ≠ o(n²).
The definitions of O-notation and o-notation are similar. The main difference is that in f(n) = O(g(n)), the bound 0 ≤ f(n) ≤ cg(n) holds for some constant c > 0, but in f(n) = o(g(n)), the bound 0 ≤ f(n) < cg(n) holds for all constants c > 0. Intuitively, in the o-notation, the function f(n) becomes insignificant relative to g(n) as n approaches infinity; that is,
lim n→∞ f(n)/g(n) = 0.
ω-notation
By analogy, ω-notation is to Ω-notation as o-notation is to O-notation. We use ω-notation to denote a lower bound that is not asymptotically tight. One way to define it is by
f(n) ∈ ω(g(n)) if and only if g(n) ∈ o(f(n)).
Formally, however, we define ω(g(n)) ("little-omega of g of n") as the set
ω(g(n)) = {f(n): for any positive constant c > 0, there exists a constant n0 > 0 such that 0 ≤ cg(n) < f(n) for all n ≥ n0}.
For example, n²/2 = ω(n), but n²/2 ≠ ω(n²). The relation f(n) = ω(g(n)) implies that
lim n→∞ f(n)/g(n) = ∞,
if the limit exists. That is, f(n) becomes arbitrarily large relative to g(n) as n approaches infinity.
Comparison of functions
Many of the relational properties of real numbers apply to asymptotic
comparisons as well. For the following, assume that f(n) and g(n) are
asymptotically positive.
Asymptotic notations (summary of definitions)
Definition: f(n) = O(g(n)) ("at most", upper bound):
∃ c and n0 such that |f(n)| ≤ c|g(n)| for all n ≥ n0.
Example: f(n) = 3n² + 2, g(n) = n²; with n0 = 2 and c = 4, f(n) = O(n²).
e.g. f(n) = n³ + n = O(n³)
Def: f(n) = Ω(g(n)) ("at least", lower bound):
∃ c and n0 such that |f(n)| ≥ c|g(n)| for all n ≥ n0.
Def: f(n) = Θ(g(n)):
∃ c1, c2 and n0 such that c1|g(n)| ≤ |f(n)| ≤ c2|g(n)| for all n ≥ n0.
Def (as given in some texts): f(n) ≅ o(g(n)) iff lim n→∞ f(n)/g(n) = 1.
e.g. f(n) = 3n² + n ≅ o(3n²)
(Note: this limit-equals-1 condition expresses asymptotic equality; it is a different convention from the little-o defined earlier, for which the limit is 0.)
Problem size     n = 10      n = 10²       n = 10³      n = 10⁴
log₂n            3.3         6.6           10           13.3
n                10          10²           10³          10⁴
n log₂n          0.33×10²    0.7×10³       10⁴          1.3×10⁵
n²               10²         10⁴           10⁶          10⁸
2ⁿ               1024        1.3×10³⁰      >10¹⁰⁰       >10¹⁰⁰
n!               3×10⁶       >10¹⁰⁰        >10¹⁰⁰       >10¹⁰⁰
Time Complexity Functions
Figure: Rate of growth of common computing time functions
O(1) < O(log n) < O(n) < O(n log n) < O(n²) < O(n³) < O(2ⁿ) < O(n!) < O(nⁿ)
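The figures in the table above can be reproduced with a short program (an illustrative sketch added here; for n = 10³ and 10⁴ the value of 2ⁿ overflows a double and prints as infinity, which is itself the point of the comparison, and n! is reported as a power of 10 via lgamma).

#include <cmath>
#include <cstdio>

int main() {
    const double sizes[] = {10, 1e2, 1e3, 1e4};
    std::printf("%8s %8s %12s %12s %12s %12s\n",
                "n", "log2 n", "n log2 n", "n^2", "2^n", "n! (approx)");
    for (double n : sizes) {
        double log2n = std::log2(n);
        double log10_factorial = std::lgamma(n + 1.0) / std::log(10.0); // log10(n!)
        std::printf("%8.0f %8.1f %12.3g %12.3g %12.3g   10^%.0f\n",
                    n, log2n, n * log2n, n * n, std::pow(2.0, n), log10_factorial);
    }
    return 0;
}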
Ex: log(n!) = log(n(n−1)···1)
            = log 2 + log 3 + ... + log n
            ≥ ∫₁ⁿ log x dx
            = log e · ∫₁ⁿ ln x dx
            = log e · [x ln x − x]₁ⁿ
            = log e · (n ln n − n + 1)
            = n log n − n log e + 1.44
            ≈ n log n − 1.44n,
so log(n!) = Θ(n log n).
LECTURE-4
Recurrences
Solve recurrences using substitution method
When an algorithm contains a recursive call to itself, its running time can often be
described by a recurrence. A recurrence is an equation or inequality that describes
a function in terms of its value on smaller inputs. For example, the worst-case running time T (n) of the MERGE-SORT procedure could be described by the recurrence
T(n) = Θ(1) if n = 1,   T(n) = 2T(n/2) + Θ(n) if n > 1,
whose solution was claimed to be T (n) = Θ(n lg n).
This chapter offers three methods for solving recurrences-that is, for obtaining
asymptotic "Θ" or "O" bounds on the solution. In the substitution method, we
guess a bound and then use mathematical induction to prove our guess correct.
The recursion-tree method converts the recurrence into a tree whose nodes
represent the costs incurred at various levels of the recursion; we use techniques
for bounding summations to solve the recurrence. The master method provides
bounds for recurrences of the form
T (n) = aT (n/b) + f (n),
where a ≥ 1, b > 1, and f (n) is a given function; it requires memorization of three
cases, but once you do that, determining asymptotic bounds for many simple
recurrences is easy.
Technicalities
In practice, we neglect certain technical details when we state and solve
recurrences. A good example of a detail that is often glossed over is the assumption
of integer arguments to functions. Normally, the running time T (n) of an algorithm
is only defined when n is an integer, since for most algorithms, the size of the input
is always an integer. For example, the recurrence describing the worst-case running time of MERGE-SORT is really
T(n) = Θ(1) if n = 1,   T(n) = T(⌈n/2⌉) + T(⌊n/2⌋) + Θ(n) if n > 1.
Boundary conditions represent another class of details that we typically ignore.
Since the running time of an algorithm on a constant-sized input is a constant, the
recurrences that arise from the running times of algorithms generally have T(n)
= Θ(1) for sufficiently small n. Consequently, for convenience, we shall generally
omit statements of the boundary conditions of recurrences and assume that T (n) is
constant for small n. For example, we normally state the merge-sort recurrence as T(n) = 2T(n/2) + Θ(n),
without explicitly giving values for small n. The reason is that although changing
the value of T (1) changes the solution to the recurrence, the solution typically
doesn't change by more than a constant factor, so the order of growth is unchanged.
When we state and solve recurrences, we often omit floors, ceilings, and boundary
conditions. We forge ahead without these details and later determine whether or
not they matter. They usually don't, but it is important to know when they do.
Experience helps, and so do some theorems stating that these details don't affect
the asymptotic bounds of many recurrences encountered in the analysis of
algorithms.
The substitution method
The substitution method for solving recurrences entails two steps:
1. Guess the form of the solution.
2. Use mathematical induction to find the constants and show that the solution
works.
The name comes from the substitution of the guessed answer for the function when
the inductive hypothesis is applied to smaller values. This method is powerful, but
it obviously can be applied only in cases when it is easy to guess the form of the
answer.
The substitution method can be used to establish either upper or lower bounds on a
recurrence. As an example, let us determine an upper bound on the recurrence
T(n) = 2T(⌊n/2⌋) + n,
which is similar to the merge-sort recurrences above. We guess that the solution is T (n)
= O(n lg n). Our method is to prove that T (n) ≤ cn lg n for an appropriate choice
of the constant c > 0. We start by assuming that this bound holds for ⌊n/2⌋, that is,
that T (⌊n/2⌋) ≤ c ⌊n/2⌋ lg(⌊n/2⌋). Substituting into the recurrence yields
T(n) ≤ 2(c⌊n/2⌋ lg(⌊n/2⌋)) + n
     ≤ cn lg(n/2) + n
     = cn lg n − cn lg 2 + n
     = cn lg n − cn + n
     ≤ cn lg n,
where the last step holds as long as c ≥ 1.
Mathematical induction now requires us to show that our solution holds for the
boundary conditions. Typically, we do so by showing that the boundary conditions
are suitable as base cases for the inductive proof. For the recurrence above, we must
show that we can choose the constant c large enough so that the bound T(n) ≤ cn lg n works for the boundary conditions as well. This requirement can sometimes lead to problems. Let us assume, for the sake of argument, that T (1) = 1 is the sole boundary condition of the recurrence. Then for n = 1, the bound T (n) ≤ cn lg n yields T (1) ≤ c · 1 · lg 1 = 0, which is at odds with T (1) = 1. Consequently, the base case of our inductive proof fails to hold.
This difficulty in proving an inductive hypothesis for a specific boundary condition can be easily overcome. For example, in the recurrence above, we take advantage of asymptotic notation only requiring us to prove T (n) ≤ cn lg n for n ≥ n0, where n0 is a constant of our choosing. The idea is to remove the difficult boundary
condition T (1) = 1 from consideration in the inductive proof. Observe that
for n > 3, the recurrence does not depend directly on T (1). Thus, we can
replace T (1) by T (2) and T (3) as the base cases in the inductive proof, letting n0 =
2. Note that we make a distinction between the base case of the recurrence (n = 1)
and the base cases of the inductive proof (n = 2 and n = 3). We derive from the
recurrence that T (2) = 4 and T (3) = 5. The inductive proof that T (n) ≤ cn lg n for
some constant c ≥ 1 can now be completed by choosing c large enough so
that T (2) ≤ c2 lg 2 and T (3) ≤ c3 lg 3. As it turns out, any choice of c ≥ 2 suffices
for the base cases of n = 2 and n = 3 to hold. For most of the recurrences we shall
examine, it is straightforward to extend boundary conditions to make the inductive
assumption work for small n.
Making a good guess
Unfortunately, there is no general way to guess the correct solutions to recurrences.
Guessing a solution takes experience and, occasionally, creativity. Fortunately,
though, there are some heuristics that can help you become a good guesser. You
can also use recursion trees to generate good guesses.
If a recurrence is similar to one you have seen before, then guessing a similar
solution is reasonable. As an example, consider the recurrence
T (n) = 2T (⌊n/2⌋ + 17) + n ,
which looks difficult because of the added "17" in the argument to T on the right-hand side. Intuitively, however, this additional term cannot substantially affect the
solution to the recurrence. When n is large, the difference between T (⌊n/2⌋)
and T (⌊n/2⌋ + 17) is not that large: both cut n nearly evenly in half. Consequently,
we make the guess that T (n) = O(n lg n), which you can verify as correct by using
the substitution method.
Another way to make a good guess is to prove loose upper and lower bounds on
the recurrence and then reduce the range of uncertainty. For example, we might
start with a lower bound of T (n) = Ω(n) for the recurrence T(n) = 2T(⌊n/2⌋) + n above, since we have the term n in the recurrence, and we can prove an initial upper bound of T (n) = O(n²).
Then, we can gradually lower the upper bound and raise the lower bound until we
converge on the correct, asymptotically tight solution of T (n) = Θ(n lg n).
Subtleties
There are times when you can correctly guess at an asymptotic bound on the
solution of a recurrence, but somehow the math doesn't seem to work out in the
induction. Usually, the problem is that the inductive assumption isn't strong enough
to prove the detailed bound. When you hit such a snag, revising the guess by
subtracting a lower-order term often permits the math to go through.
Consider the recurrence
T (n) = T (⌊n/2⌋) + T (⌈n/2⌉) + 1.
We guess that the solution is O(n), and we try to show that T (n) ≤ cn for an
appropriate choice of the constant c. Substituting our guess in the recurrence, we
obtain
T(n) ≤ c⌊n/2⌋ + c⌈n/2⌉ + 1
     = cn + 1,
which does not imply T (n) ≤ cn for any choice of c. It's tempting to try a larger
guess, say T (n) = O(n²), which can be made to work, but in fact, our guess that the
solution is T (n) = O(n) is correct. In order to show this, however, we must make a
stronger inductive hypothesis.
Intuitively, our guess is nearly right: we're only off by the constant 1, a lower-order
term. Nevertheless, mathematical induction doesn't work unless we prove the exact
form of the inductive hypothesis. We overcome our difficulty by subtracting a
lower-order term from our previous guess. Our new guess is T(n) ≤ cn − b, where b ≥ 0 is a constant. We now have
T(n) ≤ (c⌊n/2⌋ − b) + (c⌈n/2⌉ − b) + 1
     = cn − 2b + 1
     ≤ cn − b,
as long as b ≥ 1. As before, the constant c must be chosen large enough to handle
the boundary conditions.
Most people find the idea of subtracting a lower-order term counterintuitive. After
all, if the math doesn't work out, shouldn't we be increasing our guess? The key to
understanding this step is to remember that we are using mathematical induction:
we can prove something stronger for a given value by assuming something
stronger for smaller values.
LECTURE-5
The master method
The master method provides a "cookbook" method for solving recurrences of the
form
T(n) = aT(n/b) + f(n),
where a ≥ 1 and b > 1 are constants and f (n) is an asymptotically positive function.
The master method requires memorization of three cases, but then the solution of
many recurrences can be determined quite easily, often without pencil and paper.
The recurrence describes the running time of an algorithm that divides a problem
of size n into a subproblems, each of size n/b, where a and b are positive constants.
The a subproblems are solved recursively, each in time T (n/b). The cost of
dividing the problem and combining the results of the subproblems is described by
the function f (n). For example, the recurrence arising from the MERGE-SORT
procedure has a = 2, b = 2, and f (n) = Θ(n).
As a matter of technical correctness, the recurrence isn't actually well defined
because n/b might not be an integer. Replacing each of the a terms T (n/b) with
either T (⌊n/b⌋) or T (⌈n/b⌉) doesn't affect the asymptotic behavior of the
recurrence, however.
The master theorem
The master method depends on the following theorem.
Let a ≥ 1 and b > 1 be constants, let f (n) be a function, and let T (n) be defined on
the nonnegative integers by the recurrence
T(n) = aT(n/b) + f(n),
where we interpret n/b to mean either ⌊n/b⌋ or ⌈n/b⌉. Then T (n) can be bounded
asymptotically as follows.
1. If f(n) = O(n^(log_b a − ε)) for some constant ε > 0, then T(n) = Θ(n^(log_b a)).
2. If f(n) = Θ(n^(log_b a)), then T(n) = Θ(n^(log_b a) lg n).
3. If f(n) = Ω(n^(log_b a + ε)) for some constant ε > 0, and if a f(n/b) ≤ c f(n) for some constant c < 1 and all sufficiently large n, then T(n) = Θ(f(n)).
Before applying the master theorem to some examples, let's spend a moment trying to understand what it says. In each of the three cases, we are comparing the function f (n) with the function n^(log_b a). Intuitively, the solution to the recurrence is determined by the larger of the two functions. If, as in case 1, the function n^(log_b a) is the larger, then the solution is T(n) = Θ(n^(log_b a)). If, as in case 3, the function f (n) is the larger, then the solution is T (n) = Θ(f (n)). If, as in case 2, the two functions are the same size, we multiply by a logarithmic factor, and the solution is T(n) = Θ(n^(log_b a) lg n).
Beyond this intuition, there are some technicalities that must be understood. In the first case, not only must f (n) be smaller than n^(log_b a), it must be polynomially smaller. That is, f (n) must be asymptotically smaller than n^(log_b a) by a factor of n^ε for some constant ε > 0. In the third case, not only must f (n) be larger than n^(log_b a), it must be polynomially larger and in addition satisfy the "regularity" condition that af (n/b) ≤ cf(n). This condition is satisfied by most of the polynomially bounded functions that we shall encounter.
It is important to realize that the three cases do not cover all the possibilities for f (n). There is a gap between cases 1 and 2 when f (n) is smaller than n^(log_b a) but not polynomially smaller. Similarly, there is a gap between cases 2 and 3 when f (n) is larger than n^(log_b a) but not polynomially larger. If the function f (n) falls into one of these gaps, or if the regularity condition in case 3 fails to hold, the master method cannot be used to solve the recurrence.
Using the master method
To use the master method, we simply determine which case (if any) of the master
theorem applies and write down the answer.
As a first example, consider
T (n) = 9T(n/3) + n.
For this recurrence, we have a = 9, b = 3, f (n) = n, and thus n^(log_b a) = n^(log_3 9) = Θ(n²). Since f(n) = O(n^(log_3 9 − ε)), where ε = 1, we can apply case 1 of the master theorem and conclude that the solution is T (n) = Θ(n²).
Now consider
T (n) = T (2n/3) + 1,
in which a = 1, b = 3/2, f (n) = 1, and n^(log_b a) = n^(log_{3/2} 1) = n⁰ = 1. Case 2 applies, since f(n) = Θ(n^(log_b a)) = Θ(1), and thus the solution to the recurrence is T(n) = Θ(lg n).
For the recurrence
T(n) = 3T(n/4) + n lg n,
we have a = 3, b = 4, f (n) = n lg n, and n^(log_b a) = n^(log_4 3) = O(n^0.793). Since f(n) = Ω(n^(log_4 3 + ε)), where ε ≈ 0.2, case 3 applies if we can show that the regularity condition holds for f (n). For sufficiently large n, af (n/b) = 3(n/4) lg(n/4) ≤ (3/4) n lg n = cf (n) for c = 3/4. Consequently, by case 3, the solution to the recurrence is T(n) = Θ(n lg n).
The master method does not apply to the recurrence
T(n) = 2T(n/2) + n lg n,
even though it has the proper form: a = 2, b = 2, f(n) = n lg n, and n^(log_b a) = n. It might seem that case 3 should apply, since f (n) = n lg n is asymptotically larger than n^(log_b a) = n. The problem is that it is not polynomially larger. The ratio f(n)/n^(log_b a) = (n lg n)/n = lg n is asymptotically less than n^ε for any positive constant ε. Consequently, the recurrence falls into the gap between case 2 and case 3.
Example (restated): T (n) = 9T (n/3) + n
a = 9, b = 3, f (n) = n
n^(log_b a) = n^(log_3 9) = Θ(n²)
Since f(n) = O(n^(log_3 9 − ε)), where ε = 1, case 1 applies.
Thus the solution is T (n) = Θ(n²).
Recurrences that the master method cannot handle can instead be solved by iteration (repeated expansion). For example, for T(n) = 4T(n/2) + n² lg n, which also falls into the gap between cases 2 and 3 (n² lg n is larger than n^(log_2 4) = n² but not polynomially larger), expansion gives
T (n) = n² lg n + n² lg(n/2) + n² lg(n/4) + ... + n² lg(n/2^(lg n))
      = n² (lg n + lg(n/2) + lg(n/4) + ... + lg(n/2^(lg n)))
      = n² ((lg n) + (lg n − 1) + ... + 1 + 0)
      = n² (lg n)(lg n + 1)/2.
Thus, T (n) = Θ(n² lg² n).
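As a rough aid (an illustrative sketch, not part of the notes), the case analysis can be mechanized for the special case f(n) = n^k, i.e. f a pure power of n: simply compare k with log_b a. For such f the three cases are exhaustive and the regularity condition of case 3 holds automatically; recurrences whose f(n) contains a log factor, like the n lg n examples above, are outside this simplified helper.

#include <cmath>
#include <cstdio>

// Simplified master theorem for recurrences T(n) = a T(n/b) + n^k,
// i.e. f(n) a pure power of n (no log factors).
void masterCase(double a, double b, double k) {
    const double e   = std::log(a) / std::log(b);   // log_b a
    const double eps = 1e-9;                        // tolerate rounding in log_b a
    if (k < e - eps)
        std::printf("case 1: T(n) = Theta(n^%.3f)\n", e);
    else if (std::fabs(k - e) <= eps)
        std::printf("case 2: T(n) = Theta(n^%.3f lg n)\n", e);
    else
        std::printf("case 3: T(n) = Theta(n^%.3f)\n", k);
}

int main() {
    masterCase(9, 3, 1);      // T(n) = 9T(n/3) + n    -> case 1, Theta(n^2)
    masterCase(1, 1.5, 0);    // T(n) = T(2n/3) + 1    -> case 2, Theta(lg n)
    masterCase(8, 2, 2);      // T(n) = 8T(n/2) + n^2  -> case 1, Theta(n^3)
    masterCase(2, 2, 2);      // T(n) = 2T(n/2) + n^2  -> case 3, Theta(n^2)
    return 0;
}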
LECTURE-6
The recursion-tree method
A recursion tree models the costs (time) of a recursive execution of an algorithm.
In a recursion tree, each node represents the cost of a single sub problem
somewhere in the set of recursive function invocations. We sum the costs within
each level of the tree to obtain a set of per-level costs, and then we sum all the per-level costs to determine the total cost of all levels of the recursion. Recursion trees are particularly useful when the recurrence describes the running time of a divide-and-conquer algorithm.
A recursion tree is best used to generate a good guess, which is then verified by the
substitution method. When using a recursion tree to generate a good guess, you can
often tolerate a small amount of "sloppiness," since you will be verifying your
guess later on. If you are very careful when drawing out a recursion tree and
summing the costs, however, you can use a recursion tree as a direct proof of
a solution to a recurrence.
(Figure: the construction of a recursion tree for the recurrence T(n) = 3T(n/4) + cn². Part (a) shows T(n), which is progressively expanded in (b)-(d) to form the recursion tree. The fully expanded tree in part (d) has height log₄ n, i.e. it has log₄ n + 1 levels.)
Because subproblem sizes decrease as we get further from the root, we eventually must reach a boundary condition. How far from the root do we reach one? The subproblem size for a node at depth i is n/4^i. Thus, the subproblem size hits n = 1 when n/4^i = 1 or, equivalently, when i = log₄ n. Thus, the tree has log₄ n + 1 levels (0, 1, 2, ..., log₄ n).
Next we determine the cost at each level of the tree. Each level has three times more nodes than the level above, and so the number of nodes at depth i is 3^i. Because subproblem sizes reduce by a factor of 4 for each level we go down from the root, each node at depth i, for i = 0, 1, 2, ..., log₄ n − 1, has a cost of c(n/4^i)². Multiplying, we see that the total cost over all nodes at depth i, for i = 0, 1, 2, ..., log₄ n − 1, is 3^i · c(n/4^i)² = (3/16)^i · cn². The last level, at depth log₄ n, has 3^(log₄ n) = n^(log₄ 3) nodes, each contributing cost T (1), for a total cost of n^(log₄ 3) T (1), which is Θ(n^(log₄ 3)).
Now we add up the costs over all levels to determine the cost for the entire tree:
T (n) = cn² + (3/16)cn² + (3/16)²cn² + ... + (3/16)^(log₄ n − 1) cn² + Θ(n^(log₄ 3))
      < (1 + 3/16 + (3/16)² + ...) cn² + Θ(n^(log₄ 3))
      = (16/13) cn² + Θ(n^(log₄ 3))
      = O(n²).
Thus, we have derived a guess of T (n) = O(n²) for our original recurrence T (n) = 3T (⌊n/4⌋) + Θ(n²). In this example, the coefficients of cn² form a decreasing geometric series and the sum of these coefficients is bounded from above by the constant 16/13. Since the root's contribution to the total cost is cn², the root contributes a constant fraction of the total cost. In other words, the total cost of the tree is dominated by the cost of the root.
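The guess can also be checked numerically (an illustrative sketch, not from the notes): compute T(n) = 3T(⌊n/4⌋) + n² directly for sample values, taking c = 1 and an arbitrary boundary value for small n, and compare T(n) with n².

#include <cstdio>
#include <map>

// T(n) = 3 T(n/4) + n^2, with T(n) = 1 for n < 4 (boundary value chosen arbitrarily).
long long T(long long n, std::map<long long, long long>& memo) {
    if (n < 4) return 1;
    auto it = memo.find(n);
    if (it != memo.end()) return it->second;
    long long v = 3 * T(n / 4, memo) + n * n;
    memo[n] = v;
    return v;
}

int main() {
    std::map<long long, long long> memo;
    for (long long n = 1; n <= 1000000; n *= 10)
        std::printf("n = %8lld   T(n)/n^2 = %.4f\n",
                    n, (double)T(n, memo) / ((double)n * (double)n));
    // For these values the ratio stays below 16/13 ~ 1.2308, matching the bound.
    return 0;
}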
As another, more intricate example, consider the recursion tree for T (n) = T(n/3) + T(2n/3) + O(n).
A recursion tree for the recurrence T(n) = T (n/3) + T (2n/3) + cn.
(Again, we omit floor and ceiling functions for simplicity.) As before, we
let c represent the constant factor in the O(n) term. When we add the values
across the levels of the recursion tree, we get a value of cn for every level. The
longest path from the root to a leaf is n → (2/3)n → (2/3)²n → ··· → 1. Since (2/3)^k n = 1 when k = log3/2 n, the height of the tree is log3/2 n.
Intuitively, we expect the solution to the recurrence to be at most the number of
levels times the cost of each level, or O(cn log3/2 n) = O(n lg n). The total cost is
evenly distributed throughout the levels of the recursion tree. There is a
complication here: we have yet to consider the cost of the leaves. If this recursion
tree were a complete binary tree of height log3/2 n, there would be 2^(log3/2 n) = n^(log3/2 2) leaves. Since the cost of each leaf is a constant, the total cost of all leaves would then be Θ(n^(log3/2 2)), which is ω(n lg n). This recursion tree is not a complete binary tree, however, and so it has fewer than n^(log3/2 2) leaves. Moreover,
as we go down from the root, more and more internal nodes are absent.
Consequently, not all levels contribute a cost of exactly cn; levels toward the
bottom contribute less. We could work out an accurate accounting of all costs,
but remember that we are just trying to come up with a guess to use in the
substitution method. Let us tolerate the sloppiness and attempt to show that a
guess of O(n lg n) for the upper bound is correct.
LECTURE-7
Doubt Clearing Class
LECTURE-8
Divide and Conquer: Binary Search
Algorithm BinarySearch (data, key):
Input: a sorted array of integers (data) and a key (key)
Output: the position of the key in the array (-1 if not found)

#include <vector>
using namespace std;

int binarySearch(const vector<int>& data, int key) {
    int low = 0;
    int high = (int)data.size() - 1;
    while (low <= high) {
        int mid = (low + high) / 2;
        if (data[mid] < key)
            low = mid + 1;
        else if (data[mid] > key)
            high = mid - 1;
        else
            return mid;
    }
    // couldn't find the key
    return -1;
}
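A brief usage sketch of the function above (it assumes binarySearch is in scope):

#include <cstdio>
#include <vector>

// binarySearch as given above is assumed to be in scope here.

int main() {
    std::vector<int> data = {2, 5, 8, 12, 16, 23, 38};   // must be sorted
    std::printf("%d\n", binarySearch(data, 23));   // prints 5 (index of 23)
    std::printf("%d\n", binarySearch(data, 7));    // prints -1 (not present)
    return 0;
}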
We wish to understand the average case running time of binary search. What do
we mean by average case? Let us consider the case where each key in the array is
equally likely to be searched for, and what the average time is to find such a key.
To make the analysis simpler, let us assume that n = 2^k − 1, for some integer k ≥ 1.
(Why does it make the analysis simpler?)
We first notice that in the two lines of pseudo code before the while loop, 3
primitive operations always get executed (an assignment, a subtraction, and then
an assignment). However, since these operations happen no matter what the
input is, we will ignore them for now.
Focusing our attention on the while loop, we notice that each time the program
enters the while loop, we execute 6 primitive operations (a <= comparison, an
addition, a divide, an assignment, an array index, and a < comparison), before the
program might branch depending on the result of the first if statement.
Depending on the result of the conditional, the program will execute different
numbers of primitive operations. If data [mid] < key, then the program executes
2 more primitive operations. If data [mid] > key, then the program executes 4
more primitive operations (an array index and a < comparison in the next if, and
then a subtraction and assignment). If data [mid] = = key, then we execute 3
more primitive operations (2 operations for the if and then 1 operation to return
mid). In other words, the number of primitive operations executed in an iteration
of the loop is
10, if data[mid] > key,
9, if data[mid] == key, and
8, if data[mid] < key.
We can now construct a tree that summarizes how many operations are executed
in the while loop, depending on the number of times the while loop is executed
and the result of the comparisons between data[mid] and key in each iteration.
Each node in the tree represents the exit from the algorithm because data [mid] =
= key. A node is a left child of its parent if in the previous iteration of the while
loop, the search was restricted to the left part of the sub array (that is, when data
[mid] > key). A node is a right child of its parent if in the previous iteration of the
while loop, the search was restricted to the right part of the sub array (that is,
when data [mid] < key). Here are the first three levels of the tree:
                              9
              10 + 9                        8 + 9
  10 + 10 + 9    10 + 8 + 9      8 + 10 + 9    8 + 8 + 9
      ...            ...             ...           ...
The root of the tree corresponds to the situation where data[mid] = = key in the
first iteration of the while loop. Nodes on the ith level of the tree correspond to
finding the key after executing i iterations of the while loop (level 1 is the root of
the tree). Notice that the number of numbers at each node is equal to what level
it is on. Since n = 2^k − 1, the number of levels in the tree is k = log(n + 1).
Now consider the number of operations that occur at each level. Notice that
every node has a 9 in it (which comes from the last iteration of the while loop
when the item is found) and that for every node with some number of 10s and 8s
in it, there is a corresponding node on the same level that has exactly same
number of 8s and 10s, respectively. Since we are only interested in adding all of
the numbers in the tree, we can simplify our calculations by simply making all of
the numbers 9s.
Let T be the sum of all of the numbers at all of the nodes in the tree. Except for
the 3 operations we ignored earlier, T is the total amount of time it would take to
execute binary search n times, one for each item in the array. So, T / n is the
average time to find a key.
Each node at level i has i 9s in it, and there are 2^(i−1) nodes in level i. It is easy to see that T / n has the value

(9/n) · Σ (i = 1 to log(n+1)) i · 2^(i−1).

We now simply need to compute what Σ (i = 1 to log(n+1)) i · 2^(i−1) is. Let us write out the terms in the summation:

Σ (i = 1 to log(n+1)) i · 2^(i−1)
= 1·2⁰ + 2·2¹ + 3·2² + 4·2³ + … + log(n+1)·2^(log(n+1)−1)
= 1·2⁰ + 1·2¹ + 1·2² + 1·2³ + … + 1·2^(log(n+1)−1)      (1)
        + 1·2¹ + 1·2² + 1·2³ + … + 1·2^(log(n+1)−1)      (2)
                + 1·2² + 1·2³ + … + 1·2^(log(n+1)−1)      (3)
                        + 1·2³ + … + 1·2^(log(n+1)−1)      (4)
                                  …
                                    + 1·2^(log(n+1)−1)

Note how the summation has been broken up. What we do now is notice that the terms on line (1) are the form of a geometric series that grows by a factor of 2:

1·2⁰ + 1·2¹ + 1·2² + 1·2³ + … + 1·2^(log(n+1)−1)
= Σ (i = 0 to log(n+1)−1) 2^i
= 2^(log(n+1)) − 2⁰.

Line (2) can be calculated in a similar way:

1·2¹ + 1·2² + 1·2³ + … + 1·2^(log(n+1)−1)
= Σ (i = 1 to log(n+1)−1) 2^i
= 2^(log(n+1)) − 2¹.

In general, the sum on line i is 2^(log(n+1)) − 2^(i−1) = (n + 1) − 2^(i−1), and there are log(n + 1) lines. To compute the entire sum, we need to compute

Σ (i = 1 to log(n+1)) ((n + 1) − 2^(i−1))
= Σ (i = 1 to log(n+1)) (n + 1)  −  Σ (i = 1 to log(n+1)) 2^(i−1)
= (n + 1) · log(n + 1) − (2^(log(n+1)) − 1)
= (n + 1) · log(n + 1) − ((n + 1) − 1)
= (n + 1) · log(n + 1) − n.

So, T / n = 9 · ((n + 1)/n) · log(n + 1) − 9 ≈ 9 log n − 9.
This means that the average number of primitive operations on a successful
search is about 9 log n – 6. (We need to add the three operations we ignored at
the beginning of our analysis.)
Note that in the worst case, a successful search executes approximately 10 log n
steps, since the bottommost, leftmost node of the tree has value 10 log (n + 1) –
1. The average time for a successful search is only log n steps smaller than the
worst case because almost exactly half of the nodes are at the bottom of the tree,
and those nodes have an average value of 9 log n. So, even though there are
many nodes near the top of the tree that have a very small value, they influence
the average by only a constant (9 log n – 9 on average for all nodes, as opposed to
9 log n on average for just the leaves of the tree) because there are so many more
nodes at the bottom of the tree.
LECTURE-9
Quick Sort:
The array A[p..r] is partitioned into two (possibly empty) subarrays A[p..q−1] and A[q+1..r] such that each element of A[p..q−1] is less than or equal to A[q], which is in turn less than or equal to each element of A[q+1..r]. The index q is computed as part of this partitioning procedure.
Quick Algorithm
QUICKSORT (A, p, r)
1. if p < r
2.     then q = PARTITION(A, p, r)
3.          QUICKSORT(A, p, q−1)
4.          QUICKSORT(A, q+1, r)
PARTITION (A, p, r)
At the beginning of each iteration of the partitioning loop (the for loop that scans j from p to r−1), for any array index k,
1. if p ≤ k ≤ i, then A[k] ≤ x.
2. if i + 1 ≤ k≤ j -1, then A[k] > x.
3. if k = r, then A[k] = x.
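The body of PARTITION is not reproduced in these notes; below is a C++ sketch of the standard Lomuto partition scheme (as in CLRS), which is the procedure the loop invariant above describes: x = A[r] is the pivot, and the scanning for loop maintains the three conditions listed. (Hoare's original partition scheme is a common alternative.)

#include <algorithm>   // std::swap
#include <vector>

// Lomuto partition: uses A[r] as the pivot x and returns its final index q.
int partition(std::vector<int>& A, int p, int r) {
    int x = A[r];                          // pivot
    int i = p - 1;                         // A[p..i] holds elements <= x
    for (int j = p; j <= r - 1; j++) {     // the scanning loop of the invariant
        if (A[j] <= x) {
            i = i + 1;
            std::swap(A[i], A[j]);
        }
    }
    std::swap(A[i + 1], A[r]);             // place the pivot between the regions
    return i + 1;                          // q
}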
Example:
Time complexity:
Worst case: O(n²)    (T(n) = T(n−1) + n)
Best case: O(n log n)    (T(n) = 2T(n/2) + n)
Average case: O(n log n)
To sort an array of n distinct elements, quick sort takes O(n log n) time in expectation, averaged over all n! permutations of n elements with equal probability. Why? For a start, it is not hard to see that the partition operation takes O(n) time.
In the most unbalanced case, each time we perform a partition we divide the list into two sub-lists of size 0 and n − 1 (for example, if all elements of the array are equal). This means each recursive call processes a list of size one less than the previous list. Consequently, we can make n − 1 nested calls before we reach a list of size 1. This means that the call tree is a linear chain of n − 1 nested calls. The ith call does O(n − i) work to do the partition, and Σ (i = 0 to n−1) (n − i) = O(n²), so in that case quick sort takes O(n²) time. That is the worst case: given knowledge of which comparisons are performed by the sort, there are adaptive algorithms that are effective at generating worst-case input for quick sort on-the-fly, regardless of the pivot selection strategy. In the average case, however, the partitions are reasonably balanced, and the total running time is O(n log n).
LECTURE-10
Merge Sort
Merge sort is an O (n log n) comparison-based sorting algorithm.
The merge sort algorithm is based on divide-and-conquer paradigm.
It was invented by John von Neumann in 1945.
It operates as follows:
DIVIDE: Partition the n-element sequence to be sorted into two subsequences of
n/2 elements each.
CONQUER: Sort the two subsequences recursively using the merge sort.
COMBINE: Merge the two sorted subsequences of size n/2 each to produce the
sorted sequence consisting of n elements.
Mergesort algorithm is based on a divide and conquer strategy. First, the
sequence to be sorted is decomposed into two halves (Divide). Each half is sorted
independently (Conquer). Then the two sorted halves are merged to a sorted
sequence (Combine) (Figure 1).
Figure 1: Mergesort(n)
The following procedure mergesort sorts a sequence a from index lo to index hi.
void mergesort(int lo, int hi)
{
    if (lo < hi)
    {
        int m = (lo + hi) / 2;
        mergesort(lo, m);
        mergesort(m + 1, hi);
        merge(lo, m, hi);
    }
}
First, index m in the middle between lo and hi is determined. Then the first part of
the sequence (from lo to m) and the second part (from m+1 to hi) are sorted by
recursive calls of mergesort. Then the two sorted halves are merged by procedure
merge. Recursion ends when lo = hi, i.e. when a subsequence consists of only one
element.
The main work of the Mergesort algorithm is performed by function merge. There
are different possibilities to implement this function.
void merge(int lo, int m, int hi)
{
    int i, j, k;

    i = 0; j = lo;
    // copy first half of array a to auxiliary array b
    while (j <= m)
        b[i++] = a[j++];

    i = 0; k = lo;
    // copy back next-greatest element at each time
    while (k < j && j <= hi)
        if (b[i] <= a[j])
            a[k++] = b[i++];
        else
            a[k++] = a[j++];

    while (k < j)
        a[k++] = b[i++];
}
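The two procedures above assume that a (the sequence being sorted) and b (the auxiliary array) are globals visible to both. A minimal driver illustrating that assumption (a sketch, not part of the original notes) might look like this, with the two procedure definitions from above placed where indicated:

#include <iostream>

const int N = 8;
int a[N] = {5, 2, 9, 1, 7, 3, 8, 4};   // the sequence to be sorted
int b[N];                               // the auxiliary array used by merge

void merge(int lo, int m, int hi);      // definitions: as given above
void mergesort(int lo, int hi);

int main() {
    mergesort(0, N - 1);
    for (int i = 0; i < N; i++)
        std::cout << a[i] << " ";
    std::cout << std::endl;             // prints: 1 2 3 4 5 7 8 9
    return 0;
}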
Example:-
Time complexity:
Analysis
The straightforward version of function merge requires at most 2n steps (n steps
for copying the sequence to the intermediate array b, and at most n steps for
copying it back to array a). The time complexity of mergesort is therefore
T(n) ≤ 2n + 2·T(n/2)  and  T(1) = 0.
The solution of this recursion yields
T(n) ≤ 2n·log(n), i.e. T(n) ∈ O(n log(n)).
Thus, the mergesort algorithm is optimal, since the lower bound for the sorting
problem of Ω(n log(n)) is attained.
In the more efficient variant, function merge requires at most 1.5n steps (n/2
steps for copying the first half of the sequence to the intermediate array b, n/2
steps for copying it back to array a, and at most n/2 steps for processing the
second half). This yields a running time of mergesort of at most 1.5n log(n) steps.
Conclusions
Algorithm mergesort has a time complexity of Θ(n log(n)) which is optimal. A
drawback of mergesort is that it needs an additional space of Θ(n) for the
temporary array b.
LECTURE-11 and 12
Heap Sort
Heap sort is a comparison-based sorting algorithm. Heap sort begins by building a
heap out of the data set, and then removing the largest item and placing it at the
end of the array. After removing the largest item, it reconstructs the heap,
removes the largest remaining item, and places it in the next position from the
end of the array. This is repeated until there are no items left in the heap.
Heaps (Binary heap)
The binary heap data structure is an array object that can be viewed as a complete binary tree.
Heap property:
• Max-heap: A[parent(i)] ≥ A[i]
• Min-heap: A[parent(i)] ≤ A[i]
• The height of a node in a tree: the number of edges on the longest simple downward path from the node to a leaf.
• The height of a tree: the height of the root.
• The height of a heap: O(log n).
Basic procedures on heaps:
Max-Heapify, Build-Max-Heap, Heapsort, Max-Heap-Insert, Heap-Extract-Max, Heap-Increase-Key, and Heap-Maximum.
Maintaining the heap property:
Heapify is an important subroutine for manipulating heaps. Its inputs are an array A and an index i in the array. When Heapify is called, it is assumed that the binary trees rooted at LEFT(i) and RIGHT(i) are max-heaps. If A[i] is smaller than one of its children, then A[i] violates the max-heap property, and Heapify lets the value at A[i] "float down" so that the subtree rooted at i becomes a max-heap.
Building a heap:
Heap sort Algorithm:
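The bodies of these procedures are not reproduced in the notes; a compact C++ sketch following the usual CLRS structure is given below (0-based array indexing, so the children of node i are at 2i+1 and 2i+2).

#include <algorithm>   // std::swap
#include <vector>

// MAX-HEAPIFY: restore the max-heap property at index i, assuming the
// subtrees rooted at its children are already max-heaps.
void maxHeapify(std::vector<int>& A, int heapSize, int i) {
    int l = 2 * i + 1, r = 2 * i + 2, largest = i;
    if (l < heapSize && A[l] > A[largest]) largest = l;
    if (r < heapSize && A[r] > A[largest]) largest = r;
    if (largest != i) {
        std::swap(A[i], A[largest]);
        maxHeapify(A, heapSize, largest);
    }
}

// BUILD-MAX-HEAP: heapify all internal nodes, bottom-up.
void buildMaxHeap(std::vector<int>& A) {
    for (int i = (int)A.size() / 2 - 1; i >= 0; i--)
        maxHeapify(A, (int)A.size(), i);
}

// HEAPSORT: repeatedly move the maximum to the end and shrink the heap.
void heapSort(std::vector<int>& A) {
    buildMaxHeap(A);
    for (int heapSize = (int)A.size(); heapSize > 1; heapSize--) {
        std::swap(A[0], A[heapSize - 1]);   // largest item to its final place
        maxHeapify(A, heapSize - 1, 0);     // restore the heap on the remainder
    }
}

int main() {
    std::vector<int> v = {4, 1, 3, 2, 16, 9, 10, 14, 8, 7};
    heapSort(v);                            // v is now sorted in increasing order
    return 0;
}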
Time complexity: O(n log₂ n)
LECTURE-13
Priority Queue
A priority queue is an abstract data type which is like a regular queue or stack
data structure, but where additionally each element has a "priority" associated
with it. In a priority queue, an element with high priority is served before an
element with low priority. If two elements have the same priority, they are served
according to their order in the queue.
Operations
A priority queue must at least support the following operations:
• Insert with priority: add an element to the queue with an associated priority.
• Pull highest priority element: remove the element from the queue that has the highest priority, and return it. This is also known as "pop element (off)", "get maximum element" or "get front(most) element". Some conventions reverse the order of priorities, considering lower values to be higher priority, so this may also be known as "get minimum element", and is often referred to as "get-min" in the literature. This may instead be specified as separate "peek at highest priority element" and "delete element" functions, which can be combined to produce "pull highest priority element".
A data structure for maintaining a set S of elements with the following
Operations:
Maximum(S): returns the element in S with the largest key.
Insert(S, x): inserts an element x into S.
Extract-Max(S): removes and returns the element in S with the largest key.
The following function is useful in many applications of priority queues:
Increase-Key(S, i, k): increases the key of element i to k.
Basic idea: use a max-heap.
Build a heap
Problem: Convert an array A[1..n] of integers into a max-heap.
Procedure Build-Max-Heap(A[1..n])
1. heap-size[A] ← n
2. for i ← parent(n) downto 1 do
3.     Max-Heapify(A, i)
Running time: O(n log n) (a tighter analysis shows that building a heap in fact takes only O(n) time).
All the above operations can be done in O(log n) time, where n is the size of S.
Procedure Increase-Key(A[1..n], i, k)
/* increase A[i] to k */
1. if (k < A[i]) then error "new key smaller than current key"
2. A[i] ← k
3. while (i > 1) and (A[parent(i)] < A[i]) do
4.     exchange A[i] ↔ A[parent(i)]
5.     i ← parent(i)
A priority queue is often considered to be a "container data structure".
The Standard Template Library (STL), and the C++ 1998 standard, specifies priority_queue as one of the STL container adaptor class templates. It implements a max-priority-queue, and is parameterized by a comparison object for sorting, such as a function object (defaults to less<T> if unspecified), and by the underlying container for storing the data (defaults to std::vector<T>); its constructor can also take two iterators to the beginning and end of a sequence. Unlike actual STL containers, it does not allow iteration of its elements (it strictly adheres to its abstract data type definition).
STL also has utility functions for manipulating another random-access container as a binary max-heap (std::make_heap, std::push_heap, std::pop_heap and std::sort_heap). The Boost C++ libraries also have an implementation, in the Boost.Heap library.
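A brief usage sketch of the STL adaptor described above (a max-priority-queue by default):

#include <iostream>
#include <queue>
#include <vector>

int main() {
    std::priority_queue<int> pq;          // max-heap by default (std::less<int>)
    pq.push(3);                           // insert with priority
    pq.push(10);
    pq.push(7);
    while (!pq.empty()) {
        std::cout << pq.top() << " ";     // peek at highest-priority element
        pq.pop();                         // delete it; together: "pull highest"
    }
    std::cout << std::endl;               // prints: 10 7 3
    // For a min-priority-queue instead:
    //   std::priority_queue<int, std::vector<int>, std::greater<int>> minpq;
    return 0;
}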
LECTURE-14
Sorting [Lower Bound Analysis]
Insertion Sort: - Each time when you add a new element to the list, compare it
with the rest of the list, and add it to the right place.
 Example: - Insert Sort 34, 8, 64, 51, 32, and 21
- Show how sorting process works.
void insertionSort(vector<Comparable>& a)
{
    int j;
    for (int p = 1; p < a.size(); p++)
    {
        Comparable tmp = a[p];
        for (j = p; j > 0 && tmp < a[j-1]; j--)
            a[j] = a[j-1];
        a[j] = tmp;
    }
}
 The performance of insertion sort:
Σ (p = 2 to N) p = 2 + 3 + 4 + ... + N = O(N²)
 A collection of simple sorting algorithms is based on comparing two adjacent elements and exchanging them if they are not in the right order. Examples: insertion sort, selection sort, bubble sort, etc.
A lower bound for simple sorting algorithms based on comparing adjacent elements
Definition: Given two elements x and y in a list with x appearing before y, the pair (x, y) is called an inversion if x > y.
Given a list of N integers, what is the average number of inversions in it?
Given a list L, we reverse it to get a new list Lr. Any pair of two distinct elements (x, y) represents an inversion in exactly one of L and Lr.
 The total number of all those inversions in L and Lr together is N(N−1)/2.
 So, the average number of inversions in the list L is N(N−1)/4.
 The average number of inversions in a list tells us the lower bound for such algorithms.
 You need to exchange all the inversions!
 So, the performance of any simple sorting algorithm based on comparing and exchanging two adjacent elements is Ω(N²).
 The above lower bound tells us that if we want to break the Ω(N²) barrier, we need to compare elements that are far apart (at a distance).
 Heap Sort
We still remember what a heap is, right?
In a heap, the minimum (for a min-heap) or maximum (for a max-heap) element is at the root.
We can exploit this property to do sorting:
1) Build a heap.
2) Remove the root and re-heapify.
3) Repeat step 2 until the heap is empty.
Performance of Heap Sort:
Step 1 = O(N)
Steps 2, 3 = O(N log N)
Total = O(N log N)
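The sketch below illustrates steps 1–3 with an explicit max-heap on a 0-based array (the function names are ours, not from any library); each removal places the current maximum at the end of the array, so the array ends up sorted in increasing order:

#include <utility>
#include <vector>
using std::vector;

// Sift the element at index i down until the max-heap property holds
// for the first n elements of a (0-based indexing).
void maxHeapify(vector<int>& a, int n, int i) {
    while (true) {
        int largest = i, l = 2 * i + 1, r = 2 * i + 2;
        if (l < n && a[l] > a[largest]) largest = l;
        if (r < n && a[r] > a[largest]) largest = r;
        if (largest == i) break;
        std::swap(a[i], a[largest]);
        i = largest;
    }
}

void heapSort(vector<int>& a) {
    int n = a.size();
    // Step 1: build a max-heap, O(N).
    for (int i = n / 2 - 1; i >= 0; --i) maxHeapify(a, n, i);
    // Steps 2 and 3: repeatedly move the root (maximum) to the end and re-heapify.
    for (int end = n - 1; end > 0; --end) {
        std::swap(a[0], a[end]);
        maxHeapify(a, end, 0);
    }
}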
 Quick sort – the fastest known sorting algorithm in practice. Its average
running time is O(N log N); its worst-case performance is O(N²).
A recursive algorithm based on divide and conquer:
- If the number of elements in S is 0 or 1, then return.
- Pick any element v in S. This is called the pivot.
- Partition S – {v} into two disjoint groups:
  S1 = {x ∈ S – {v} | x ≤ v}, and
  S2 = {x ∈ S – {v} | x ≥ v}.
- Return { quicksort(S1) followed by v followed by quicksort(S2) }.
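A compact in-place sketch of the same idea, using the last element as the pivot (one common choice among many):

#include <utility>
#include <vector>
using std::vector;

// Partition a[lo..hi] around the pivot a[hi]; return the pivot's final index.
int partition(vector<int>& a, int lo, int hi) {
    int pivot = a[hi];
    int i = lo;
    for (int j = lo; j < hi; ++j)
        if (a[j] <= pivot) std::swap(a[i++], a[j]);
    std::swap(a[i], a[hi]);
    return i;
}

void quickSort(vector<int>& a, int lo, int hi) {
    if (lo >= hi) return;          // 0 or 1 elements: nothing to do
    int p = partition(a, lo, hi);
    quickSort(a, lo, p - 1);       // sort S1 (elements <= pivot)
    quickSort(a, p + 1, hi);       // sort S2 (elements >= pivot)
}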
A decision tree is a binary tree. Each internal node represents a comparison of two
elements; its two children represent, respectively, the true/false
outcomes of the comparison.
 Every leaf node represents a sorted list. The path from the root to a leaf
represents the sorting process that results in that sorted list.
 Show some example of decision trees.
 Given n distinct integers, there are n! different ways to arrange them. This implies
that any decision tree for sorting a list of n integers has at least n! leaf nodes.
 The lower bound for sorting is a lower bound on the length of a path from
the root to a leaf.
 Since the decision tree has at least n! leaf nodes, some path from the root to a
leaf has length at least log₂(N!) ≈ N log N, i.e., comparison-based sorting takes Ω(N log₂ N) comparisons.
 Can we do better? Possibly, but then methods other than comparisons must be used.
LECTURE-15
Doubt Clearing Class/BPUT Question paper solved
LECTURE-16
Dynamic Programming: Elements of Dynamic Programming
• Like divide-and-conquer, DP solves a problem by combining the solutions to subproblems.
• Differences between divide-and-conquer and DP:
– Divide-and-conquer: independent sub-problems, solved independently and
recursively (so the same sub(sub)problems may be solved repeatedly).
– DP: sub-problems are dependent, i.e., sub-problems share sub-sub-problems; every
sub(sub)problem is solved just once, and the solutions to sub(sub)problems are stored in a
table and used for solving higher-level sub-problems.
One technique that attempts to solve problems by dividing them into sub
problems is called dynamic programming. It uses a “bottom-up” approach in that
the sub problems are arranged and solved in a systematic fashion, which leads to
a solution to the original problem. This bottom-up approach implementation is
more efficient than a “top-down” counterpart mainly because duplicated
computation of the same problems is eliminated. This technique is typically
applied to solving optimization problems, although it is not limited to only
optimization problems.
Dynamic programming typically involves two steps: (1) develop a recursive
strategy for solving the problem; and (2) develop a “bottom-up” implementation
without recursion
Application domain of DP:
Optimization problem: find a solution with optimal (maximum or minimum) value.
We seek an optimal solution, not the optimal solution, since there may be more than one
optimal solution; any one of them is acceptable.
Typical steps of DP:
• Characterize the structure of an optimal solution.
• Recursively define the value of an optimal solution.
• Compute the value of an optimal solution in a bottom-up fashion.
• Compute an optimal solution from computed/stored information.
• Example: Compute the binomial coefficients C(n, k) defined by the
following recursive formula:
if k  0 or k  n;
 1,

C (n, k )  C (n  1, k )  C (n  1, k  1), if 0  k  n;
 0,
otherwise.

• The following “call tree” demonstrates repeated (duplicated) computations
in a straightforward recursive implementation
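A minimal bottom-up sketch of this recurrence, which fills a table instead of repeating the recursive calls shown in the call tree (the function name and types are illustrative):

#include <algorithm>
#include <cstdint>
#include <vector>
using std::vector;

// Bottom-up computation of C(n, k) using Pascal's rule.
std::uint64_t binomial(int n, int k) {
    if (k < 0 || k > n) return 0;
    vector<vector<std::uint64_t>> C(n + 1, vector<std::uint64_t>(k + 1, 0));
    for (int i = 0; i <= n; ++i) {
        for (int j = 0; j <= std::min(i, k); ++j) {
            if (j == 0 || j == i) C[i][j] = 1;              // base cases
            else C[i][j] = C[i - 1][j] + C[i - 1][j - 1];   // Pascal's rule
        }
    }
    return C[n][k];
}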
• Example: Solve the make-change problem using dynamic programming.
Suppose there are n types of coin denominations, d1, d2, …, and dn. (We
may assume one of them is penny.) There are an infinite supply of coins of
each type. To make change for an arbitrary amount j using the minimum
number of coins, we first apply the following recursive idea:
•
If there are only pennies, the problem is simple: simply use j pennies
to make change for the total amount j. More generally, if there are coin
types 1 through i, let C[i, j] stands for the minimum number of coins for
making change of amount j. By considering coin denomination i, there are
two cases: either we use at least one coin denomination i, or we don’t use
coin type i at all.
• In the first case, the total number of coins must be 1 + C[i, j – di], because after using one
coin of denomination di the remaining amount is j – di, and the rest of the coin selection in an
optimal solution of C[i, j] must itself be an optimal solution to the reduced problem with the
reduced amount, still using coin types 1 through i.
• In the second case, i.e., suppose no coins of denomination i will be used in
an optimal solution. Thus, the best solution is identical to solving the same
problem with the total amount j but using coin types 1 through i – 1, i.e. C[i
–1 , j]. Therefore, the overall best solution must be the better of the two
alternatives, resulting in the following recurrence:
•
C[i, j] = min (1 + C[i, j – di ], C[i –1 , j])
• The boundary conditions are when i ≤ 0 or when j < 0 (in which case let C[i,
j] = ∞), and when j = 0 (let C[i, j] = 0).
The Principle of Optimality:
In solving optimization problems which require making a sequence of decisions,
such as the change making problem, we often apply the following principle in
setting up a recursive algorithm: Suppose an optimal solution made decisions d1,
d2, and …, dn. The subproblem starting after decision point di and ending at
decision point dj, also has been solved with an optimal solution made up of the
decisions di through dj. That is, any subsequence of an optimal solution
constitutes an optimal sequence of decisions for the corresponding subproblem.
This is known as the principle of optimality, which can be illustrated by
shortest paths in weighted graphs: any subpath of a shortest path is itself a shortest path between its endpoints.
The Partition Problem:
Given a set of positive integers, A = {a1, a2, …, an}. The question is to select a
subset B of A such that the sum of the numbers in B equals the sum of the
numbers not in B, i.e., Σ_{ai ∈ B} ai = Σ_{ai ∈ A−B} ai. We may assume that the sum of all numbers in A
is 2K, an even number. We now propose a dynamic programming solution. For 1
≤ i ≤ n and 0 ≤ j ≤ K, define
P[i, j] = True  if there exists a subset of the first i numbers a1 through ai whose sum equals j;
          False otherwise.
following recurrence:
P[i, j] = P[i – 1, j] or (P[i – 1, j – ai], if j – ai ≥ 0)
That is, in order for P[i, j] to be true, either there exists a subset of the first i – 1
numbers whose sum equals j, or there exists one whose sum equals j – ai (this latter case
uses the solution of P[i – 1, j – ai] and adds the number ai). The value P[n, K] is the
answer.
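A direct sketch of the P[i, j] recurrence (the function name is illustrative); it returns whether the set can be split into two parts of equal sum:

#include <numeric>
#include <vector>
using std::vector;

// P[i][j] == true iff some subset of a[0..i-1] sums to exactly j.
bool canPartition(const vector<int>& a) {
    int total = std::accumulate(a.begin(), a.end(), 0);
    if (total % 2 != 0) return false;              // the sum must be even (2K)
    int K = total / 2;
    int n = a.size();
    vector<vector<bool>> P(n + 1, vector<bool>(K + 1, false));
    for (int i = 0; i <= n; ++i) P[i][0] = true;   // the empty subset sums to 0
    for (int i = 1; i <= n; ++i)
        for (int j = 1; j <= K; ++j) {
            P[i][j] = P[i - 1][j];                 // skip a[i-1]
            if (j >= a[i - 1])
                P[i][j] = P[i][j] || P[i - 1][j - a[i - 1]];  // or take a[i-1]
        }
    return P[n][K];                                // the answer: P[n, K]
}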
LECTURE-17 and 18
Matrix Chain Multiplication: (MCM)
• Problem: given A1, A2, …,An, compute the product: A1A2…An , find the
fastest way (i.e., minimum number of multiplications) to compute it.
• Suppose two matrices A(p,q) and B(q,r); their product C(p,r) is computed in p × q × r
multiplications:
– for i=1 to p, for j=1 to r: C[i,j] = 0
– for i=1 to p
     for j=1 to r
         for k=1 to q: C[i,j] = C[i,j] + A[i,k]·B[k,j]
• Different parenthesizations will have different number of multiplications
for product of multiple matrices
• Example: A(10,100), B(100,5), C(5,50)
– If ((A B) C), 10 100 5 +10 5 50 =7500
– If (A (B C)), 10 100 50+100 5 50=75000
• The first way is ten times faster than the second !!!
• Denote A1, A2, …,An by < p0,p1,p2,…,pn>
– i.e, A1(p0,p1), A2(p1,p2), …, Ai(pi-1,pi),… An(pn-1,pn)
• Intuitive brute-force solution: Counting the number of parenthesizations by
exhaustively checking all possible parenthesizations.
• Let P(n) denote the number of alternative parenthesizations of a sequence
of n matrices:
– P(n) = 1                                if n = 1
         Σ_{k=1}^{n-1} P(k)·P(n-k)        if n ≥ 2
• The solution to this recurrence is Ω(2ⁿ).
• So brute force will not work.
• Step 1: structure of an optimal parenthesization
• Let Ai..j (ij) denote the matrix resulting from AiAi+1…Aj
• Any parenthesization of AiAi+1…Aj must split the product between
Ak and Ak+1 for some k (i ≤ k < j). The cost = cost of computing Ai..k + cost of
computing Ak+1..j + cost of multiplying Ai..k × Ak+1..j.
• If k is the position for an optimal parenthesization, the
parenthesization of “prefix” subchain AiAi+1…Ak within this
optimal parenthesization of AiAi+1…Aj must be an optimal
parenthesization of AiAi+1…Ak.
AiAi+1…Ak  Ak+1…Aj
• Step 2: a recursive relation
– Let m[i,j] be the minimum number of multiplications for AiAi+1…Aj
– m[1,n] will be the answer
– m[i,j] = 0                                                      if i = j
           min_{i ≤ k < j} { m[i,k] + m[k+1,j] + pi-1·pk·pj }     if i < j
• Step 3, Computing the optimal cost
– A direct recursive algorithm takes exponential time Ω(2ⁿ) (ref. to P.346 for the
proof), no better than brute force.
– Total number of distinct subproblems: one for each pair i ≤ j, i.e., C(n,2) + n = Θ(n²).
– The recursive algorithm encounters the same subproblem many times.
– By tabling the answers to subproblems, each subproblem is solved only once.
– This is the key idea of DP: identify the overlapping subproblems and solve every subproblem just once.
– array m[1..n,1..n], with m[i,j] records the optimal cost for
AiAi+1…Aj .
– array s[1..n,1..n], s[i,j] records index k which achieved the optimal
cost when computing m[i,j].
– Suppose the input to the algorithm is p=< p0 , p1 ,…, pn >.
MCM DP—order of matrix computations:
• Step 4, constructing a parenthesization order for the optimal solution.
Since s[1..n,1..n] is computed, and s[i,j] is the split position for AiAi+1…Aj , i.e,
Ai…As[i,j] and As[i,j] +1…Aj , thus, the parenthesization order can be obtained from
s[1..n,1..n] recursively, beginning from s[1,n].
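A bottom-up sketch of steps 3 and 4, following the recurrence for m[i, j]; here p has length n+1 and matrix Ai has dimensions p[i−1] × p[i] (names are illustrative):

#include <climits>
#include <iostream>
#include <vector>
using std::vector;

// m[i][j]: minimum multiplications for A_i ... A_j (1-based indexing);
// s[i][j]: split position k achieving that minimum.
void matrixChainOrder(const vector<long long>& p,
                      vector<vector<long long>>& m, vector<vector<int>>& s) {
    int n = p.size() - 1;
    m.assign(n + 1, vector<long long>(n + 1, 0));
    s.assign(n + 1, vector<int>(n + 1, 0));
    for (int len = 2; len <= n; ++len)            // chain length
        for (int i = 1; i + len - 1 <= n; ++i) {
            int j = i + len - 1;
            m[i][j] = LLONG_MAX;
            for (int k = i; k < j; ++k) {
                long long q = m[i][k] + m[k + 1][j] + p[i - 1] * p[k] * p[j];
                if (q < m[i][j]) { m[i][j] = q; s[i][j] = k; }
            }
        }
}

// Step 4: print an optimal parenthesization recursively from s, beginning at (1, n).
void printOrder(const vector<vector<int>>& s, int i, int j) {
    if (i == j) { std::cout << "A" << i; return; }
    std::cout << "(";
    printOrder(s, i, s[i][j]);
    printOrder(s, s[i][j] + 1, j);
    std::cout << ")";
}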
LECTURE-19
Longest Common Subsequence (LCS):
The longest common subsequence (LCS) problem is to find the longest
subsequence common to all sequences in a set of sequences (often just two).
• DNA analysis, two DNA string comparisons.
• DNA string: a sequence of symbols A,C,G,T.
S=ACCGGTCGAGCTTCGAAT
• Subsequence (of X): is X with some symbols left out.
Z=CGTC is a subsequence of X=ACGCTAC.
• Common subsequence Z (of X and Y): a subsequence of X and also a
subsequence of Y.
Z=CGA is a common subsequence of both X=ACGCTAC and
Y=CTGACA.
• Longest Common Subsequence (LCS): the longest one of common
subsequences.
Z' =CGCA is the LCS of the above X and Y.
• LCS problem: given X=<x1, x2,…, xm> and Y=<y1, y2,…, yn>, find their LCS.
The LCS problem has what is called an "optimal substructure": the problem can be
broken down into smaller, simple "sub problems", which can be broken down into
yet simpler sub problems, and so on, until, finally, the solution becomes trivial.
The LCS problem also has what are called "overlapping sub problems": the
solution to a higher sub problem depends on the solutions to several of the lower
sub problems.
Problems with these two properties—optimal substructure and overlapping sub
problems—can be approached by a problem-solving technique called dynamic
programming, in which the solution is built up starting with the simplest sub
problems.
LCS DP –step 1: Optimal Substructure
• Characterize optimal substructure of LCS.
• Theorem 15.1: Let X = <x1, x2,…, xm> (= Xm) and Y = <y1, y2,…, yn> (= Yn), and let
Z = <z1, z2,…, zk> (= Zk) be any LCS of X and Y. Then:
– 1. if xm = yn, then zk = xm = yn, and Zk-1 is an LCS of Xm-1 and Yn-1.
– 2. if xm ≠ yn, then zk ≠ xm implies Z is an LCS of Xm-1 and Yn.
– 3. if xm ≠ yn, then zk ≠ yn implies Z is an LCS of Xm and Yn-1.
LCS DP –step 2: Recursive Solution
• What the theorem says:
– If xm= yn, find LCS of Xm-1 and Yn-1, then append xm.
– If xm  yn, find LCS of Xm-1 and Yn and LCS of Xm and Yn-1, take which one
is longer.
• Overlapping substructure:
– Both LCS of Xm-1 and Yn and LCS of Xm and Yn-1 will need to solve LCS of
Xm-1 and Yn-1.
• c[i,j] is the length of an LCS of Xi and Yj. The theorem gives the recurrence:
  c[i,j] = 0                             if i = 0 or j = 0
           c[i-1,j-1] + 1                if i, j > 0 and xi = yj
           max(c[i-1,j], c[i,j-1])       if i, j > 0 and xi ≠ yj
LCS DP-- step 3:Computing the Length of LCS
• c[0..m,0..n], where c[i,j] is defined as above.
– c[m,n] is the answer (length of LCS).
• b[1..m,1..n], where b[i,j] points to the table entry corresponding to the
optimal subproblem solution chosen when computing c[i,j].
– From b[m,n] backward to find the LCS.
The problem with the recursive solution is that the same subproblems get called
many different times. A subproblem consists of a call to lcs_length with two suffixes
of X and Y as arguments, so there are exactly (m+1)(n+1) possible
subproblems (a relatively small number). If there are nearly 2ⁿ recursive calls,
some of these subproblems must be being solved over and over.
The dynamic programming solution is to check whenever we want to solve a
subproblem, whether we've already done it before. If so we look up the solution
instead of recomputing it. Implemented in the most direct way, we just add some
code to our recursive algorithm to do this look up -- this "top down", recursive
version of dynamic programming is known as "memoization".
In the LCS problem, subproblems consist of a pair of suffixes of the two input
strings. To make it easier to store and look up subproblem solutions, we index each
subproblem by the two lengths involved and keep the computed answers in a two-dimensional table.
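A bottom-up sketch of the c table, followed by the backward walk that reads off one LCS (names are illustrative):

#include <algorithm>
#include <string>
#include <vector>
using std::string; using std::vector;

// Returns one longest common subsequence of X and Y.
string lcs(const string& X, const string& Y) {
    int m = X.size(), n = Y.size();
    // c[i][j] = length of an LCS of X[0..i-1] and Y[0..j-1].
    vector<vector<int>> c(m + 1, vector<int>(n + 1, 0));
    for (int i = 1; i <= m; ++i)
        for (int j = 1; j <= n; ++j)
            if (X[i - 1] == Y[j - 1]) c[i][j] = c[i - 1][j - 1] + 1;
            else c[i][j] = std::max(c[i - 1][j], c[i][j - 1]);

    // Walk backwards from c[m][n] to reconstruct the subsequence.
    string z;
    for (int i = m, j = n; i > 0 && j > 0; ) {
        if (X[i - 1] == Y[j - 1]) { z.push_back(X[i - 1]); --i; --j; }
        else if (c[i - 1][j] >= c[i][j - 1]) --i;
        else --j;
    }
    std::reverse(z.begin(), z.end());
    return z;
}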
LECTURE-20
Greedy Algorithms
Definition: An algorithm that always takes the best immediate, or local, solution
while finding an answer. Greedy algorithms find the overall, or globally, optimal
solution for some optimization problems, but may find less-than-optimal solutions
for some instances of other problems.
 An optimization problem is one in which you want to find, not just a
solution, but the best solution
 A “greedy algorithm” sometimes works well for optimization problems
 A greedy algorithm works in phases. At each phase:
o we take the best we can get right now, without regard for future
consequences
We hope that by choosing a local optimum at each step, we will end up at a
global optimum.
We have completed data structures.
 We now are going to look at algorithm design methods.
 Often we are looking at optimization problems whose performance is
exponential.
 For an optimization problem, we are given a set of constraints and an
optimization function.
 Solutions that satisfy the constraints are called feasible solutions.
 A feasible solution for which the optimization function has the best possible
value is called an optimal solution.
 Cheapest lunch possible: Works by getting the cheapest meat, fruit,
vegetable, etc.
 In a greedy method we attempt to construct an optimal solution in stages.
 At each stage we make a decision that appears to be the best (under some
criterion) at the time.
 A decision made at one stage is not changed in a later stage, so each
decision should assure feasibility.
 Consider getting the best major: What is best now, may be worst later.
 Consider change making: Given a coin system and an amount to make
change for, we want minimal number of coins.
 A greedy criterion could be, at each stage increase the total amount of
change constructed as much as possible.
 In other words, always pick the coin with the greatest value at every step.
 A greedy solution is optimal for some change systems.
 Machine scheduling:
 Have a number of jobs to be done on a minimum number of machines.
Each job has a start and end time.
 Order jobs by start time.
 If an old machine becomes available by the start time of the task to be
assigned, assign the task to this machine; if not assign it to a new machine.
 This is an optimal solution.
 Note that our Huffman tree algorithm is an example of a greedy algorithm:
 Pick least weight trees to combine.
Structure of a Greedy Algorithm
 Initially the set of chosen items (the solution set) is empty.
 At each step:
o an item is chosen for possible addition to the solution set using a selection function.
o IF the resulting set would no longer be feasible
 reject the item under consideration (it is never considered again).
o ELSE IF the set is still feasible THEN
 add the current item.
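As a tiny illustration of this structure, here is a sketch of greedy change-making (always pick the largest coin that still fits); as noted earlier, this greedy rule is optimal for some coin systems but not for all:

#include <algorithm>
#include <vector>
using std::vector;

// Greedy change making: repeatedly pick the largest coin that still fits.
// Returns the multiset of coins used (may be sub-optimal for non-canonical systems).
vector<int> makeChange(vector<int> coins, int amount) {
    std::sort(coins.rbegin(), coins.rend());   // largest denomination first
    vector<int> used;                          // the "solution set"
    for (int c : coins)
        while (amount >= c) {                  // still feasible => add the item
            used.push_back(c);
            amount -= c;
        }
    return used;                               // amount > 0 here means no exact change
}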
A feasible set (of candidates) is promising if it can be extended to produce not
merely a solution, but an optimal solution to the problem. In particular, the
empty set is always promising why? (because an optimal solution always exists)
Unlike Dynamic Programming, which solves the subproblems bottom-up, a greedy
strategy usually progresses in a top-down fashion, making one greedy choice after
another, reducing each problem to a smaller one.
Greedy-Choice Property
The "greedy-choice property" and "optimal substructure" are two ingredients in
the problem that lend to a greedy strategy.
Greedy-Choice Property
It says that a globally optimal solution can be arrived at by making a locally
optimal choice.
Greedy algorithms can be characterized as being 'short-sighted' and 'non-recoverable'. They are ideal only for problems which have 'optimal substructure'.
Despite this, greedy algorithms are best suited for simple problems (e.g. giving
change). It is important, however, to note that the greedy algorithm can be used
as a selection algorithm to prioritize options within a search, or branch and bound
algorithm. There are a few variations to the greedy algorithm:
 Pure greedy algorithms
 Orthogonal greedy algorithms
 Relaxed greedy algorithms
Applications
Greedy algorithms mostly (but not always) fail to find the globally optimal
solution, because they usually do not operate exhaustively on all the data. They
can make commitments to certain choices too early which prevent them from
finding the best overall solution later. For example, all known greedy coloring
algorithms for the graph coloring problem and all other NP-complete problems do
not consistently find optimum solutions. Nevertheless, they are useful because
they are quick to think up and often give good approximations to the optimum.
If a greedy algorithm can be proven to yield the global optimum for a given
problem class, it typically becomes the method of choice because it is faster than
other optimisation methods like dynamic programming. Examples of such greedy
algorithms are Kruskal's algorithm and Prim's algorithm for finding minimum
spanning trees, Dijkstra's algorithm for finding single-source shortest paths, and
the algorithm for finding optimum Huffman trees.
The theory of matroids, and the more general theory of greedoids, provide whole
classes of such algorithms.
Greedy algorithms appear in network routing as well. Using greedy routing, a
message is forwarded to the neighboring node which is "closest" to the
destination. The notion of a node's location (and hence "closeness") may be
determined by its physical location, as in geographic routing used by ad-hoc
networks. Location may also be an entirely artificial construct as in small world
routing and distributed hash table.
LECTURE-21
Knapsack Problem
Statement: A thief robbing a store can carry a maximal weight of W in his knapsack. There
are n items; the ith item weighs wi and is worth vi dollars. Which items should the thief take?
There are two versions of the problem:
I. Fractional knapsack problem
   The setup is the same, but the thief can take fractions of items, meaning
   that the items can be broken into smaller pieces so that the thief may
   decide to carry only a fraction xi of item i, where 0 ≤ xi ≤ 1.
   Exhibits the greedy-choice property.
    A greedy algorithm exists.
   Exhibits the optimal-substructure property.
II. 0-1 knapsack problem
   The setup is the same, but the items may not be broken into smaller
   pieces, so the thief may decide either to take an item or to leave it
   (a binary choice), but may not take a fraction of an item.
   Does not exhibit the greedy-choice property.
    No greedy algorithm exists.
   Exhibits the optimal-substructure property.
    Only a dynamic programming algorithm exists.
Greedy Solution to the Fractional Knapsack Problem
There are n items in a store. For i =1,2, . . . , n, item i has weight wi > 0 and worth
vi > 0. Thief can carry a maximum weight of W pounds in a knapsack. In this
version of a problem the items can be broken into smaller piece, so the thief may
decide to carry only a fraction xi of object i, where 0 ≤ xi ≤ 1. Item i contributes xiwi
to the total weight in the knapsack, and xivi to the value of the load.
In symbols, the fractional knapsack problem can be stated as follows:
maximize Σ_{i=1}^{n} xi·vi   subject to the constraint   Σ_{i=1}^{n} xi·wi ≤ W
It is clear that an optimal solution must fill the knapsack exactly, for otherwise we
could add a fraction of one of the remaining objects and increase the value of the
load. Thus in an optimal solution Σ_{i=1}^{n} xi·wi = W.
Greedy-fractional-knapsack (w, v, W)
  FOR i = 1 to n
      do x[i] = 0
  weight = 0
  while weight < W
      do i = best remaining item          // largest ratio v[i]/w[i]
         IF weight + w[i] ≤ W
            then x[i] = 1
                 weight = weight + w[i]
            else x[i] = (W - weight) / w[i]
                 weight = W
  return x
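A C++ sketch of the same greedy rule, sorting by the value/weight ratio first (the Item struct and names are illustrative):

#include <algorithm>
#include <vector>
using std::vector;

struct Item { double weight, value; };

// Returns the maximum value obtainable with capacity W,
// allowing fractions of items (greedy by value/weight ratio).
double fractionalKnapsack(vector<Item> items, double W) {
    std::sort(items.begin(), items.end(),
              [](const Item& a, const Item& b) {
                  return a.value / a.weight > b.value / b.weight;
              });
    double total = 0.0, weight = 0.0;
    for (const Item& it : items) {
        if (weight + it.weight <= W) {           // take the whole item
            weight += it.weight;
            total  += it.value;
        } else {                                 // take only the fraction that fits
            double frac = (W - weight) / it.weight;
            total += frac * it.value;
            break;
        }
    }
    return total;
}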
Analysis
If the items are already sorted into decreasing order of vi / wi, then
the while-loop takes a time in O(n);
Therefore, the total time including the sort is in O(n log n).
If we keep the items in a heap with the largest vi/wi at the root, then
 creating the heap takes O(n) time, and
 each iteration of the while-loop now takes O(log n) time (since the heap property must be restored
after the removal of the root).
Although this data structure does not alter the worst-case, it may be faster if only
a small number of items are need to fill the knapsack.
One variant of the 0-1 knapsack problem is when the order of the items sorted by
increasing weight is the same as their order when sorted by decreasing value.
The optimal solution to this variant is to sort the items by value in decreasing order
and then pick the most valuable item, which also has the least weight, provided its
weight is no more than the remaining capacity. Then reduce the remaining capacity by
the weight of the item just picked. The next item to pick is the most valuable item among
those remaining. Follow the same strategy until the thief cannot carry any more items
(due to the weight limit).
Proof
One way to prove the correctness of the above algorithm is to prove the greedy-choice
property and the optimal-substructure property. It consists of two steps. First,
prove that there exists an optimal solution that begins with the greedy choice given
above. The second part proves that if A is an optimal solution to the original
problem S, then A - a is also an optimal solution to the problem S - s, where a is
the item the thief picked as the greedy choice and S - s is the subproblem left after the
first greedy choice has been made. The second part is easy to prove, since here the more
valuable items have less weight.
Note that the item of maximal ratio v`/w` can replace (part of) any other item: because
w` < w it fits, and because v` > v it increases the value.
Theorem
The fractional knapsack problem has the greedy-choice property.
Proof: Let the ratio v`/w` be maximal. This supposition implies that v`/w`
≥ v/w for any pair (v, w), so v`·w/w` ≥ v for any (v, w). Now suppose a solution
does not contain the full w` weight of the best-ratio item. Then replacing an amount
of any other item w with more of w` will improve the value.
LECTURE-22
Huffman Codes
Huffman code is a technique for compressing data. Huffman's greedy algorithm
looks at the frequency of occurrence of each character and represents each character
as a binary string in an optimal way.
Example
Suppose we have a data consists of 100,000 characters that we want to compress.
The characters in the data occur with the following frequencies:

Character   a       b       c       d       e      f
Frequency   45,000  13,000  12,000  16,000  9,000  5,000
Consider the problem of designing a "binary character code" in which each
character is represented by a unique binary string.
Fixed Length Code
In a fixed-length code, 3 bits are needed to represent the six (6) characters:

Character          a       b       c       d       e      f
Frequency          45,000  13,000  12,000  16,000  9,000  5,000
Fixed-length code  000     001     010     011     100    101
This method requires 300,000 bits to code the entire file.
How do we get 300,000?
 The total number of characters is 45,000 + 13,000 + 12,000 + 16,000 + 9,000 +
5,000 = 100,000.
 Each character is assigned a 3-bit codeword => 3 × 100,000 = 300,000 bits.
Conclusion
Fixed-length code requires 300,000 bits while variable code requires 224,000 bits.
• Saving of approximately 25%.
Prefix Codes
A prefix code is one in which no codeword is a prefix of any other codeword. The reason
prefix codes are desirable is that they simplify encoding (compression) and decoding.
Can we do better?
A variable-length code can do better by giving frequent characters short
codewords and infrequent characters long codewords.
Character             a       b       c       d       e      f
Frequency             45,000  13,000  12,000  16,000  9,000  5,000
Variable-length code  0       101     100     111     1101   1100
Character 'a' occurs 45,000 times;
each 'a' is assigned a 1-bit codeword:
1 × 45,000 = 45,000 bits.
Characters b, c, d occur 13,000 + 12,000 + 16,000 = 41,000 times;
each is assigned a 3-bit codeword:
3 × 41,000 = 123,000 bits.
Characters e, f occur 9,000 + 5,000 = 14,000 times;
each is assigned a 4-bit codeword:
4 × 14,000 = 56,000 bits.
This implies that the total number of bits is 45,000 + 123,000 + 56,000 = 224,000 bits.
Encoding: Concatenate the code words representing each characters of the file.
String Encoding
TEA 10 00 010
SEA 011 00 010
TEN 10 00 110
Decoding
Since no codeword is a prefix of other, the codeword that begins an encoded file
is unambiguous.
To decode (translate back to the original characters), identify the initial codeword,
translate it back to the original character, remove it from the encoded file, and repeat.
For example, with the "variable-length codeword" table, the string 001011101 parses
uniquely as 0.0.101.1101, which decodes to aabe.
The representation of "decoding process" is binary tree, whose leaves are
characters. We interpret the binary codeword for a character as path from the
root to that character, where 0 means "go to the left child" and 1 means "go to
the right child". Note that an optimal code for a file is always represented by a full
binary tree (one in which every non-leaf node has two children).
Theorem: A binary tree that is not full cannot correspond to an optimal prefix code.
Proof
Let T be a binary tree corresponding to a prefix code such that T is not full.
Then there must exist an internal node, say x, such that x has only one child, y.
Construct another binary tree, T`, which has the same leaves as T and the same
depths as T, except for the leaves in the subtree rooted at y in T; those
leaves have smaller depth in T`, which implies T cannot correspond to an optimal
prefix code.
To obtain T`, simply merge x and y into a single node z, where z is a child of the parent of x
(if a parent exists) and z is the parent of any children of y. Then T` has the desired
properties: it corresponds to a code on the same alphabet as the code obtained from T,
and the leaves in the subtree rooted at y in T have depth in T` strictly less (by one)
than their depth in T.
This completes the proof.

Character             a       b       c       d       e      f
Frequency             45,000  13,000  12,000  16,000  9,000  5,000
Fixed-length code     000     001     010     011     100    101
Variable-length code  0       101     100     111     1101   1100
Constructing a Huffman code
A greedy algorithm that constructs an optimal prefix code called a Huffman code.
The algorithm builds the tree T corresponding to the optimal code in a bottom-up
manner. It begins with a set of |c| leaves and perform |c|-1 "merging" operations
to create the final tree.
Data Structure used: Priority queue = Q
Huffman (c)
  n = |c|
  Q = c
  for i = 1 to n-1
      do z = Allocate-Node()
         x = left[z] = EXTRACT_MIN(Q)
         y = right[z] = EXTRACT_MIN(Q)
         f[z] = f[x] + f[y]
         INSERT(Q, z)
  return EXTRACT_MIN(Q)
Analysis
 Q is implemented as a binary heap.
 Line 2 can be performed by using BUILD-HEAP (P. 145; CLR) in O(n) time.
 The FOR loop is executed n - 1 times, and each heap operation requires O(lg n) time,
   => the FOR loop contributes (n - 1)·O(lg n)
   => O(n lg n).
Thus the total running time of Huffman on a set of n characters is O(n lg n).
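A compact sketch of the construction with std::priority_queue playing the role of Q (the node layout and names are illustrative, and memory cleanup is omitted for brevity):

#include <map>
#include <queue>
#include <string>
#include <vector>

struct Node {
    long long freq;
    char ch;                 // meaningful only for leaves
    Node *left = nullptr, *right = nullptr;
};

struct ByFreq {              // gives a min-priority-queue on frequency
    bool operator()(const Node* a, const Node* b) const { return a->freq > b->freq; }
};

// Build the Huffman tree for the given character frequencies.
Node* huffman(const std::map<char, long long>& freq) {
    std::priority_queue<Node*, std::vector<Node*>, ByFreq> Q;
    for (auto& [c, f] : freq) Q.push(new Node{f, c});
    while (Q.size() > 1) {                     // |c| - 1 merge operations
        Node* x = Q.top(); Q.pop();            // Extract-Min
        Node* y = Q.top(); Q.pop();            // Extract-Min
        Q.push(new Node{x->freq + y->freq, 0, x, y});
    }
    return Q.top();                            // root of the code tree
}

// Read off the codewords: left edge = '0', right edge = '1'.
void codes(const Node* t, const std::string& prefix,
           std::map<char, std::string>& out) {
    if (!t->left && !t->right) { out[t->ch] = prefix; return; }
    if (t->left)  codes(t->left,  prefix + "0", out);
    if (t->right) codes(t->right, prefix + "1", out);
}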
LECTURE-23
An Activity Selection Problem
The activity-selection problem is the problem of scheduling a resource among several
competing activities.
Problem Statement
Given a set S of n activities, where si is the start time and fi the finish time of the
ith activity, find a maximum-size set of mutually compatible activities.
Compatible Activities
Activities i and j are compatible if the half-open intervals [si, fi) and [sj, fj)
do not overlap, that is, i and j are compatible if si ≥ fj or sj ≥ fi.
Greedy Algorithm for the Selection Problem
I. Sort the input activities by increasing finishing time: f1 ≤ f2 ≤ . . . ≤ fn
II. Call GREEDY-ACTIVITY-SELECTOR (s, f)
    1. n = length[s]
    2. A = {1}
    3. j = 1
    4. for i = 2 to n
    5.     do if si ≥ fj
    6.        then A = A ∪ {i}
    7.             j = i
    8. return set A
Operation of the algorithm
Let 11 activities be given, S = {p, q, r, s, t, u, v, w, x, y, z}, with start and finish times
(1, 4), (3, 5), (0, 6), (5, 7), (3, 8), (5, 9), (6, 10), (8, 11), (8,
12), (2, 13) and (12, 14).
A = {p} Initialization at line 2
A = {p, s} line 6 - 1st iteration of FOR - loop
A = {p, s, w} line 6 -2nd iteration of FOR - loop
A = {p, s, w, z} line 6 - 3rd iteration of FOR-loop
Out of the FOR-loop and Return A = {p, s, w, z}
Analysis
Part I requires O(n lg n) time (use merge sort or heap sort).
Part II requires θ(n) time assuming that activities were already sorted in part I by
their finish time.
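A direct C++ sketch of Parts I and II above, sorting by finish time and then scanning greedily (the struct and function names are illustrative):

#include <algorithm>
#include <vector>
using std::vector;

struct Activity { int start, finish; };

// Returns a maximum-size set of mutually compatible activities.
vector<Activity> selectActivities(vector<Activity> acts) {
    // Part I: sort by increasing finish time.
    std::sort(acts.begin(), acts.end(),
              [](const Activity& a, const Activity& b) { return a.finish < b.finish; });
    vector<Activity> A;
    if (acts.empty()) return A;
    A.push_back(acts[0]);                     // always take the earliest finisher
    int lastFinish = acts[0].finish;
    // Part II: take each activity compatible with the last one chosen.
    for (size_t i = 1; i < acts.size(); ++i)
        if (acts[i].start >= lastFinish) {
            A.push_back(acts[i]);
            lastFinish = acts[i].finish;
        }
    return A;
}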
Correctness
Note that greedy algorithms do not always produce optimal solutions, but GREEDY-ACTIVITY-SELECTOR does.
Theorem: Algorithm GREEDY-ACTIVITY-SELECTOR produces a solution of maximum
size for the activity-selection problem.
Proof Idea: Show that the activity-selection problem satisfies
I.  the greedy-choice property, and
II. the optimal-substructure property.
Proof
I.  Let S = {1, 2, . . . , n} be the set of activities, ordered by finish time, so that
    activity 1 has the earliest finish time.
    Suppose A ⊆ S is an optimal solution, with the activities in A ordered by
    finish time, and suppose the first activity in A is k.
    If k = 1, then A begins with the greedy choice and we are done (or, to be very
    precise, there is nothing to prove here).
    If k ≠ 1, we want to show that there is another optimal solution B that begins with the
    greedy choice, activity 1.
    Let B = (A - {k}) ∪ {1}. Because f1 ≤ fk, the activities in B are disjoint, and since
    B has the same number of activities as A, i.e., |A| = |B|, B is also optimal.
II. Once the greedy choice is made, the problem reduces to finding an optimal
    solution for the remaining subproblem. If A is an optimal solution to the original
    problem S, then A` = A - {1} is an optimal solution to the activity-selection
    problem S` = {i ∈ S : si ≥ f1}.
    Why? Because if we could find a solution B` to S` with more activities than
    A`, adding 1 to B` would yield a solution B to S with more activities than A,
    thereby contradicting the optimality of A.
As another example, consider scheduling a set of activities among lecture
halls: schedule all the activities using a minimal number of lecture halls.
In order to determine which activity should use which lecture hall, the algorithm
uses the GREEDY-ACTIVITY-SELECTOR to calculate the activities in the first lecture
hall. If there are some activities yet to be scheduled, a new lecture hall is selected
and GREEDY-ACTIVITY-SELECTOR is called again. This continues until all activities
have been scheduled.
GREEDY-ACTIVITY-SELECTOR (s, f, n)
  j = first unscheduled activity
  A = {j}
  for i = j + 1 to n
      if s[i] ≠ "-" then
          if s[i] ≥ f[j]
              then A = A ∪ {i}
                   s[i] = "-"      // mark activity i as scheduled
                   j = i
  return A
Analysis
In the worst case, the number of lecture halls required is n. Each call to GREEDY-ACTIVITY-SELECTOR runs in θ(n), so the running time of this algorithm is O(n²).
Two important Observations

Choosing the activity of least duration will not always produce an optimal
solution. For example, we have a set of activities {(3, 5), (6, 8), (1, 4), (4, 7),
(7, 10)}. Here, either (3, 5) or (6, 8) will be picked first, which will prevent the
optimal solution of {(1, 4), (4, 7), (7, 10)} from being found.

Choosing the activity with the least overlap will not always produce an optimal
solution. For example, we have a set of activities {(0, 4), (4, 6), (6, 10), (0, 1),
(1, 5), (5, 9), (9, 10), (0, 3), (0, 2), (7, 10), (8, 10)}. Here the one with the
least overlap with other activities is (4, 6), so it will be picked first. But that
would prevent the optimal solution of {(0, 1), (1, 5), (5, 9), (9, 10)} from
being found.
LECTURE-24
Assembly Line Scheduling (ALS):
Find the structure of the fastest way through factory
Consider the fastest way from starting point through station S1,j (same for S2,j)
• j=1, only one possibility
• j=2,3,…,n, two possibilities: from S1,j-1 or S2,j-1
– from S1,j-1, additional time a1,j
– from S2,j-1, additional time t2,j-1 + a1,j
• suppose the fastest way through S1,j is through S1,j-1, then the
chassis must have taken a fastest way from starting point
through S1,j-1. Why???
Similarly for S2,j-1.
Step 1: Find Optimal Structure
• An optimal solution to a problem contains within it an optimal solution to
sub problems.
• the fastest way through station Si,j contains within it the fastest way
through station S1,j-1 or S2,j-1 .
• Thus can construct an optimal solution to a problem from the optimal
solutions to subproblems.
Step 2: A recursive solution
• Let fi[j] (i=1,2 and j=1,2,…, n) denote the fastest possible time to get a
chassis from starting point through Si,j.
• Let f* denote the fastest time for a chassis all the way through the factory.
Then
•
f* = min(f1[n] +x1, f2[n] +x2)
• f1[1]=e1+a1,1, fastest time to get through S1,1
• f1[j]=min(f1[j-1]+a1,j, f2[j-1]+ t2,j-1+ a1,j)
• Similarly to f2[j].
• Recursive solution:
  f* = min(f1[n] + x1, f2[n] + x2)
  f1[j] = e1 + a1,1                                          if j = 1
          min(f1[j-1] + a1,j, f2[j-1] + t2,j-1 + a1,j)       if j > 1
  f2[j] = e2 + a2,1                                          if j = 1
          min(f2[j-1] + a2,j, f1[j-1] + t1,j-1 + a2,j)       if j > 1
• fi[j] (i=1,2; j=1,2,…,n) records optimal values to the subproblems.
• To keep the track of the fastest way, introduce li[j] to record the line
number (1 or 2), whose station j-1 is used in a fastest way through Si,j.
• Introduce l* to be the line whose station n is used in a fastest way through
the factory.
Step 3: Computing the fastest time
• One option: a recursive algorithm.
• Let ri(j) be the number of references made to fi[j]:
  r1(n) = r2(n) = 1
  r1(j) = r2(j) = r1(j+1) + r2(j+1)
  ri(j) = 2^(n-j)
• So f1[1] is referred to 2^(n-1) times.
• The total number of references to all fi[j] is Θ(2ⁿ).
• Thus, the running time of the recursive algorithm is exponential.
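A bottom-up sketch of step 3 that fills f1 and f2 from left to right in Θ(n) time, avoiding the exponential recursion (0-based indexing; the array names follow the text and are otherwise illustrative):

#include <algorithm>
#include <vector>
using std::vector;

// a[i][j]: processing time at station j of line i; t[i][j]: transfer time
// after station j of line i; e[i], x[i]: entry and exit times of line i.
// Returns the fastest total time f* through the factory.
long long fastestWay(const vector<vector<long long>>& a,
                     const vector<vector<long long>>& t,
                     const vector<long long>& e, const vector<long long>& x) {
    int n = a[0].size();
    vector<long long> f1(n), f2(n);
    f1[0] = e[0] + a[0][0];
    f2[0] = e[1] + a[1][0];
    for (int j = 1; j < n; ++j) {
        // Stay on the same line, or come over from the other line and pay the transfer time.
        f1[j] = std::min(f1[j - 1] + a[0][j], f2[j - 1] + t[1][j - 1] + a[0][j]);
        f2[j] = std::min(f2[j - 1] + a[1][j], f1[j - 1] + t[0][j - 1] + a[1][j]);
    }
    return std::min(f1[n - 1] + x[0], f2[n - 1] + x[1]);
}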
Step 4: Construct the fastest way through the factory
LECTURE-25
Doubt clearing class
LECTURE: 26, 27
Data Structures for Disjoint Sets:
A simple approach to creating a disjoint-set data structure is to create a linked list
for each set. The element at the head of each list is chosen as its representative.
MakeSet creates a list of one element. Union appends the two lists, a constant-time operation. The drawback of this implementation is that Find requires O(n) or
linear time to traverse the list backwards from a given element to the head of the
list.
This can be avoided by including in each linked list node a pointer to the head of
the list; then Find takes constant time, since this pointer refers directly to the set
representative. However, Union now has to update each element of the list being
appended to make it point to the head of the new combined list, requiring Ω(n)
time.
When the length of each list is tracked, the required time can be improved by
always appending the smaller list to the longer. Using this weighted-union
heuristic, a sequence of m MakeSet, Union, and Find operations on n elements
requires O(m + n log n) time. For asymptotically faster operations, a different data
structure is needed.
We describe some methods for maintaining a collection of disjoint sets. Each set is
represented as a pointer-based data structure, with one node per element. Each
set has a `leader' element, which uniquely identifies the set. (Since the sets are
always disjoint, the same object cannot be the leader of more than one set.) We
want to support the following operations.
MakeSet(x): Create a new set {x} containing the single element x. The element x
must not appear in any other set in our collection. The leader of the new set is
obviously x.
Find(x): Find (the leader of) the set containing x.
Union(A, B): Replace two sets A and B in our collection with their union A ∪ B.
For example, Union(A, MakeSet(x)) adds a new element x to an existing set A. The
sets A and B are specified by arbitrary elements, so Union(x; y) has exactly the
same behavior as Union (Find(x), Find (y)).
Disjoint set data structures have lots of applications. For instance, Kruskal's
minimum spanning tree algorithm relies on such a data structure to maintain the
components of the intermediate spanning forest. Another application might be
maintaining the connected components of a graph as new vertices and edges are
added. In both these applications, we can use a disjoint-set data structure, where
we keep a set for each connected component, containing that component's
vertices.
Reversed Trees
One of the easiest ways to store sets is using trees. Each object points to another
object, called its parent, except for the leader of each set, which points to itself
and thus is the root of the tree.
MakeSet is trivial. Find traverses the parent pointers up to the leader. Union just
redirects the parent pointer of one leader to the other. Notice that unlike most
tree data structures, objects do not have pointers down to their children.
Make-Set clearly takes Ө(1) time, and Union requires only O(1) time in addition to
the two Finds.
The running time of Find(x) is proportional to the depth of x in the tree. It is not
hard to come up with a sequence of operations that results in a tree that is a long
chain of nodes, so that Find takes Ө (n) time in the worst case.
However, there is an easy change we can make to our Union algorithm, called
union by depth, So that the trees always have logarithmic depth. Whenever we
need to merge two trees, we always make the root of the shallower tree a child of
the deeper one. This requires us to also maintain the depth of each tree, but this
is quite easy.
With this simple change, Find and Union both run in Ө (log n) time in the worst
case.
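A sketch of the reversed-tree representation with union by depth, as just described (0-based element indices; the struct and member names are illustrative):

#include <numeric>
#include <utility>
#include <vector>
using std::vector;

struct DisjointSets {
    vector<int> parent, depth;

    explicit DisjointSets(int n) : parent(n), depth(n, 0) {
        std::iota(parent.begin(), parent.end(), 0);   // MakeSet for every element
    }
    // Find: follow parent pointers up to the leader (the root of the tree).
    int find(int x) {
        while (parent[x] != x) x = parent[x];
        return x;
    }
    // Union by depth: hang the shallower tree under the root of the deeper one.
    void unite(int x, int y) {
        int a = find(x), b = find(y);
        if (a == b) return;
        if (depth[a] < depth[b]) std::swap(a, b);
        parent[b] = a;
        if (depth[a] == depth[b]) ++depth[a];
    }
};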
Shallow Threaded Trees
Alternately, we could just have every object keep a pointer to the leader of its set.
Thus, each set is represented by a shallow tree, where the leader is the root and
all the other elements are its children. With this representation, MakeSet and
Find are completely trivial. Both operations clearly run in constant time. Union is a
little more difficult, but not much. Our algorithm sets all the leader pointers in
one set to point to the leader of the other set. To do this, we need a method
to visit every element in a set; we will `thread' a linked list through each set,
starting at the set's leader. The two threads are merged in the Union algorithm in
constant time.
The worst-case running time of Union is a constant times the size of the larger set.
Thus, if we merge a one-element set with another n-element set, the running
time can be Θ(n). Generalizing this idea, it is quite easy to come up with a
sequence of n MakeSet and n - 1 Union operations that requires Ө(n²) time to
create the set {1, 2, . . . , n} from scratch.
LECTURE: 28
0-1 Knapsack Problem
The divide-and-conquer approach:
1. Partition the problem into subproblems.
2. Solve the subproblems.
3. Combine the solutions to solve the original one.
Remark: If the subproblems are not independent, i.e. subproblems share sub-subproblems,
then a divide-and-conquer algorithm repeatedly solves the common sub-subproblems.
The Idea of Developing a DP Algorithm
Step1: Structure: Characterize the structure of an optimal solution.
Decompose the problem into smaller problems, and find a relation between the structure of
the optimal solution of the original problem and the solutions of the smaller problems.
Step2: Principle of Optimality: Recursively define the value of an optimal solution.
Express the solution of the original problem in terms of optimal solutions for smaller problems
Step 3: Bottom-up computation: Compute the value of an optimal solution in a bottom-up
fashion by using a table structure.
Step 4: Construction of optimal solution: Construct an optimal solution from computed
information.
Steps 3 and 4 may often be combined
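Applying these four steps to the 0-1 knapsack problem gives the familiar table V[i][w], the best value achievable with the first i items and capacity w; a minimal sketch of steps 1–3 (step 4 would trace back through the table to list the chosen items):

#include <algorithm>
#include <vector>
using std::vector;

// 0-1 knapsack by dynamic programming.
// wt[i], val[i]: weight and value of item i (0-based); W: knapsack capacity.
int knapsack01(const vector<int>& wt, const vector<int>& val, int W) {
    int n = wt.size();
    // V[i][w] = maximum value achievable with items 0..i-1 and capacity w.
    vector<vector<int>> V(n + 1, vector<int>(W + 1, 0));
    for (int i = 1; i <= n; ++i)
        for (int w = 0; w <= W; ++w) {
            V[i][w] = V[i - 1][w];                        // skip item i-1
            if (w >= wt[i - 1])                           // or take it, if it fits
                V[i][w] = std::max(V[i][w], V[i - 1][w - wt[i - 1]] + val[i - 1]);
        }
    return V[n][W];
}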