Download Chapter 1--Algorithm Analysis

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts
no text concepts found
Transcript
CSC401 – Analysis of Algorithms
Chapter 1
Algorithm Analysis
Objectives:
Introduce algorithm and algorithm analysis
Discuss algorithm analysis methodologies
Introduce pseudo code of algorithms
Asymptotic notion of algorithm efficiency
Mathematics foundation for algorithm analysis
Amortization analysis techniques
What is an algorithm ?
Input
Algorithm
Output
An algorithm is a
step-by-step procedure
for solving a problem
in a finite amount of
time.
What is algorithm analysis ?
Two aspects:
• Running time – How much time is taken to
complete the algorithm execution?
• Storage requirement – How much memory is
required to execute the program?
2
Running Time
– Easier to analyze
– Crucial to applications such
as games, finance and
robotics
best case
average case
worst case
120
100
Running Time
Most algorithms
transform input objects
into output objects.
The running time of an
algorithm typically grows
with the input size.
Average case time is
often difficult to
determine.
We focus on the worst
case running time.
80
60
40
20
0
1000
2000
3000
4000
Input Size
3
How Is An Algorithm Analyzed?
Two methodologies
9000
8000
– Experimental analysis
Time (ms)
Method
7000
– Write a program implementing the
algorithm
– Run the program with inputs of
varying size and composition
– Use a method like System.currentTimeMillis()
to get an accurate measure of the
actual running time
– Plot the results
6000
5000
4000
3000
2000
1000
0
0
50
100
Input Size
Limitations
– It is necessary to implement the algorithm, which may
be difficult
– Results may not be indicative of the running time on
other inputs not included in the experiment.
– In order to compare two algorithms, the same
hardware and software environments must be used
4
How Is An Algorithm Analyzed?
– Theoretical Analysis
Method
– Uses a high-level description of the algorithm
instead of an implementation
– Characterizes running time as a function of the
input size, n.
– Takes into account all possible inputs
– Allows us to evaluate the speed of an algorithm
independent of the hardware/software
environment
Characteristics
– A description language -- Pseudo code
– Mathematics
5
Pseudocode
Example: find max
element of an array
High-level description
of an algorithm
More structured than Algorithm arrayMax(A, n)
English prose
Input array A of n integers
Output maximum element of A
Less detailed than a
program
currentMax  A[0]
Preferred notation for
for i  1 to n  1 do
describing algorithms
if A[i]  currentMax then
Hides program design
currentMax  A[i]
issues
return currentMax
6
Pseudocode Details
Control flow
–
–
–
–
–
–
if … then … [else …]
switch …
while … do …
repeat … until …
for … do …
Indentation replaces
braces
Method declaration
Algorithm method (arg [, arg…])
Input …
Output …
Method call
var.method (arg [, arg…])
Return value
return expression
Expressions
 Assignment
(like  in Java)
 Equality testing
(like  in Java)
n2 Superscripts and other
mathematical
formatting allowed
Array indexing: A[i]
7
The Random Access Machine (RAM) Model
A CPU
An potentially
unbounded bank of
memory cells, each of
which can hold an
arbitrary number or
character
2
0
1
Memory cells are numbered and
accessing any cell in memory takes
unit time.
8
Primitive Operations
Basic computations
performed by an algorithm
Identifiable in pseudocode
Largely independent from
the programming language
Exact definition not
important (we will see why
later)
Assumed to take a
constant amount of time in
the RAM model
Examples:
– Evaluating an
expression
– Assigning a value to
a variable
– Performing an
arithmetic operation
– Comparing two
numbers
– Indexing into an
array
– Following an object
reference
– Calling a method
– Returning from a
method
9
Counting Primitive Operations
By inspecting the pseudocode, we can determine
the maximum number of primitive operations
executed by an algorithm, as a function of the
input size
Algorithm arrayMax(A, n)
currentMax  A[0]
for i  1 to n  1 do
if A[i]  currentMax then
currentMax  A[i]
{ increment counter i }
return currentMax
# operations
2
1+n
2(n  1)
2(n  1)
2(n  1)
1
Total 7n  2
10
Estimating Running Time
Algorithm arrayMax executes 7n  2 primitive
operations in the worst case.
Define:
a = Time taken by the fastest primitive operation
b = Time taken by the slowest primitive operation
Let T(n) be the worst-case time of arrayMax.
Then
a (7n  1)  T(n)  b(7n  1)
Hence, the running time T(n) is bounded by
two linear functions
11
Growth Rate of Running Time
Changing the hardware/ software environment
– Affects T(n) by a constant factor, but
– Does not alter the growth rate of T(n)
– Linear  n
– Quadratic  n2
– Cubic  n3
In a log-log chart, the
slope of the line
corresponds to the
growth rate of the
function
T (n )
The linear growth rate of the running time T(n) is
an intrinsic property of algorithm arrayMax
1E+30
Growth rates of
1E+28
Cubic
1E+26
functions:
Quadratic
1E+24
1E+22
1E+20
1E+18
1E+16
1E+14
1E+12
1E+10
1E+8
1E+6
1E+4
1E+2
1E+0
1E+0
Linear
1E+2
1E+4
1E+6
n
1E+8
1E+10
12
Constant Factors
The growth rate is
not affected by
Examples
– 102n + 105 is a linear
function
– 105n2 + 108n is a
quadratic function
T (n )
– constant factors or
– lower-order terms
1E+26
1E+24
1E+22
1E+20
1E+18
1E+16
1E+14
1E+12
1E+10
1E+8
1E+6
1E+4
1E+2
1E+0
1E+0
Quadratic
Quadratic
Linear
Linear
1E+2
1E+4
1E+6
1E+8
1E+10
n
13
Big-Oh Notation
Given functions f(n) and g(n),
we say that f(n) is O(g(n)) if
there are positive constants
c and n0 such that
f(n)  cg(n) for n  n0
Example: 2n + 10 is O(n)
–
–
–
–
3n
2n+10
1,000
n
100
10
1
1
2n + 10  cn
(c  2) n  10
n  10/(c  2)
Pick c  3 and n0  10
Example: the function
not O(n)
10,000
10
100
1,000
n
1,000,000
n^2
100n
100,000
n2
is
– n2  cn
– n c
– The above inequality cannot
be satisfied since c must be
a constant
10n
n
10,000
1,000
100
10
14
1
1
10
100
1,000
More Big-Oh Examples
7n-2

7n-2 is O(n)
need c > 0 and n0  1 such that 7n-2  c•n for n  n0
this is true for c = 7 and n0 = 1
3n3 + 20n2 + 5



3n3 + 20n2 + 5 is O(n3)
need c > 0 and n0  1 such that 3n3 + 20n2 + 5  c•n3
for n  n0
this is true for c = 4 and n0 = 21
3 log n + log log n



3 log n + log log n is O(log n)
need c > 0 and n0  1 such that 3 log n + log log n 
c•log n for n  n0
this is true for c = 4 and n0 = 2
15
Big-Oh and Growth Rate
The big-Oh notation gives an upper bound on
the growth rate of a function
The statement “f(n) is O(g(n))” means that the
growth rate of f(n) is no more than the growth
rate of g(n)
We can use the big-Oh notation to rank
functions according to their growth rate
g(n) grows more
f(n) grows more
Same growth
f(n) is O(g(n))
g(n) is O(f(n))
Yes
No
Yes
No
Yes
Yes
16
Big-Oh Rules
If is f(n) a polynomial of degree d, then
f(n) is O(nd), i.e.,
1.Drop lower-order terms
2.Drop constant factors
Use the smallest possible class of
functions
– Say “2n is O(n)” instead of “2n is O(n2)”
Use the simplest expression of the class
– Say “3n + 5 is O(n)” instead of “3n + 5 is
O(3n)”
17
Asymptotic Algorithm Analysis
The asymptotic analysis of an algorithm
determines the running time in big-Oh notation
To perform the asymptotic analysis
– We find the worst-case number of primitive operations
executed as a function of the input size
– We express this function with big-Oh notation
Example:
– We determine that algorithm arrayMax executes at
most 7n  2 primitive operations
– We say that algorithm arrayMax “runs in O(n) time”
Since constant factors and lower-order terms are
eventually dropped anyhow, we can disregard
them when counting primitive operations
18
Summations
Summation:
b
 f (i)  f (a) + f (a + 1) + f (a + 2) +  + f (b)
i a
1  a n+1
a  1+ a + a ++ a 
Geometric summation: 
1 a
i 0

1
i
2
n
a  1+ a + a ++ a + 

1 a
i 0
n
i
2
n
n
Natural summation:  i  1 + 2 + 3 +  + n 
i 0
n(n + 1)
2

1
1 1
1
Harmonic number:   1 + + +  + +   ln n
2 3
n
i 0 i
Split summation:
n
k
n
i 0
i 1
i  k +1
 f (i)   f (i) +  f (i)
19
Logarithms and Exponents
Logarithms
– properties of logarithms:
logb(xy) = logbx + logby
logb (x/y) = logbx - logby
logbxa = alogbx
logba = logxa/logxb
Exponents
– properties of exponentials:
a(b+c) = aba c
abc = (ab)c
ab /ac = a(b-c)
b = a logab
bc = a c*logab
Floor and ceiling functions
–
–
x the largest integer less than or equal to x
x the smallest integer greater than or equal to x
20
Proof Techniques
By Example -- counterexample
Contrapositive
– Principle: To justify “if p is true, then q is true”, show “if
q is not true, then p is not true”.
– Example: To justify “if ab is odd, a is odd or b is even”,
assume “’a is odd or b is odd’ is not true”, then “a is
even and b is odd”, then “ab is even”, then “’ab is odd’
is not true”
Contradiction
– Principle: To justify “p is true”, show “if p is not true,
then there exists a contradiction”.
– Example: To justify “if ab is odd, then a is odd or b is
even”, let “ab is odd”, and assume “’a is odd or b is
even’ is not true”, then “a is even and b is odd”, thus
“ab is even” which is a contradiction to “ab is odd”.
21
Proof by Induction
Induction -- To justify S(n) for n>=n0
– Principle
Base cases: Justify S(n) is true for n0<=n<=n1
Assumption: Assume S(n) is true for n=N>=n1;
or
Assume S(n) is true for n1<=n<=N
Induction: Justify S(n) is true for n=N+1
– Example:
Question: Fibonacci sequence is defined as F(1)=1,
F(2)=1, and F(n)=F(n-1)+F(n-2) for n>2. Justify
F(n)<2^n for all n>=1
Proof by induction where n0=1. Let n1=2.
– Base cases: For n0<=n<=n1, n=1 or n=2. If n=1,
F(1)=1<2=2^1. If n=2, F(2)=2<4=2^2. It holds.
– Assumption: Assume F(n)<2^n for n=N>=2.
– Induction: For n=N+1, F(N+1)=F(N)+F(N1)<2^N+2^(N-1)<2 2^N=2^(N+1). It holds.
22
Proof by Loop Invariants
Principle:
– To prove a statement S about a loop is correct, define S in
terms of a series of smaller statements S0, S1, …, Sk,
where
The initial claim S0 is true before the loop begins
If Si-1 is true before iteration i begins, then show that
Si is true after iteration i is over
The final statement Sk is true
Example: Consider the algorithm arrayMax
– Statement S: max is the maximum number when finished
– A series of smaller statements:
Si: max is the maximum
Algorithm arrayMax(A, n)
in the first i+1 elements
max  A[0]
of the array
for i  1 to n  1 do
S0 is true before the loop
if A[i]  max then
If Si is true, then easy to
max  A[i]
return max
show Si+1 is true
23
S=Sn-1 is also true
Basic Probability
Sample space: the set of all possible outcomes from
some experiment
Probability space: a sample space S together with a
probability function that maps subsets of S to real numbers
between 0 and 1
Event: Each subset of A of S called an event
Properties of the probability function Pr
–
–
–
–
Pr(Ø)=0
Pr(S)=1
0<=Pr(A)<=1 for any subset A of S
If A and B are subsets of S and AB=, then
Pr(AB)=Pr(A)+Pr(B)
Independence
– A and B are independent if Pr(AB)=Pr(A)Pr(B)
– A1, A2, …, An are mutually independent if Pr(A1A2…
An)=Pr(A1)Pr(A2)…Pr(An)
24
Basic Probability
Conditional probability
– The conditional probability that A occurs, given B, is
defined as: Pr(A|B)=Pr(AB)/Pr(B), assuming Pr(B)>0
Random variables
– Intuitively, Variables whose values depend on the
outcomes of some experiment
– Formally, a function X that maps outcomes from some
sample space S to real numbers
– Indicator random variable: a variable that maps
outcomes to either 0 or 1
Expectation:a random variable has random values
– Intuitively, average value of a random variable
– Formally, expected value of a random variable X is
defined as E(X)=xxPr(X=x)
– Properties:
Linearity: E(X+Y) = E(X) + E(Y)
Independence: If X and Y are independent, that is,
Pr(X=x|Y=y)=Pr(X=x), then E(XY)=E(X)E(Y)
25
Computing Prefix Averages
We further illustrate
asymptotic analysis with
two algorithms for prefix
averages
The i-th prefix average of
an array X is average of
the first (i + 1) elements of
X:
A[i]  (X[0] + X[1] + … +
X[i])/(i+1)
Computing the array A of
prefix averages of
another array X has
applications to financial
analysis
35
30
X
A
25
20
15
10
5
0
1 2 3 4 5 6 7
26
Prefix Averages (Quadratic)
The following algorithm computes prefix
averages in quadratic time by applying the
definition
Algorithm prefixAverages1(X, n)
Input array X of n integers
Output array A of prefix averages of X #operations
A  new array of n integers
n
for i  0 to n  1 do
n
s  X[0]
n
for j  1 to i do
1 + 2 + …+ (n  1)
s  s + X[j]
1 + 2 + …+ (n  1)
A[i]  s / (i + 1)
n
return A
1
27
Arithmetic Progression
The running time of
prefixAverages1 is
O(1 + 2 + …+ n)
The sum of the first n
integers is n(n + 1) / 2
– There is a simple visual
proof of this fact
Thus, algorithm
prefixAverages1 runs in
O(n2) time
7
6
5
4
3
2
1
0
1
2
3
4
5
6
28
Prefix Averages (Linear)
The following algorithm computes prefix
averages in linear time by keeping a running
sum
Algorithm prefixAverages2(X, n)
Input array X of n integers
Output array A of prefix averages of X
A  new array of n integers
s0
for i  0 to n  1 do
s  s + X[i]
A[i]  s / (i + 1)
return A
#operations
n
1
n
n
n
1
Algorithm prefixAverages2 runs in O(n) time
29
Amortization
Amortization
– Typical data structure supports a wide variety of
operations for accessing and updating the elements
– Each operation takes a varying amount of running time
– Rather than focusing on each operation
– Consider the interactions between all the operations by
studying the running time of a series of these operations
– Average the operations’ running time
Amortized running time
– The amortized running time of an operation within a
series of operations is defined as the worst-case running
time of the series of operations divided by the number of
operations
– Some operations may have much higher actual running
time than its amortized running time, while some others
30
have much lower
The Clearable Table Data Structure
The clearable table
– An ADT
Storing a table of elements
Being accessing by their index in the table
– Two methods:
add(e) -- add an element e to the next available cell
in the table
clear() -- empty the table by removing all elements
Consider a series of operations (add and
clear) performed on a clearable table S
– Each add takes O(1)
– Each clear takes O(n)
– Thus, a series of operations takes O(n2),
because it may consist of only clears
31
Amortization Analysis
Theorem:
– A series of n operations on an initially empty clearable
table implemented with an array takes O(n) time
Proof:
– Let M0, M1, …, Mn-1 be the series of operations performed
on S, where k operations are clear
– Let Mi0, Mi1, …, Mik-1 be the k clear operations within the
series, and others be the (n-k) add operations
– Define i-1=-1,Mij takes ij-ij-1, because at most ij-ij-1-1
elements are added by add operations between Mij-1 and
Mij
– The total time of the series is:
(n-k) + ∑k-1j=0(ij-ij-1) = n-k + (ik-1 - i-1) <= 2n-k
Total time is O(n)
Amortized time is O(1)
32
Accounting Method
The method
– Use a scheme of credits and debits: each
operation pays an amount of cyber-dollar
– Some operations overpay --> credits
– Some operations underpay --> debits
– Keep the balance at any time at least 0
Example: the clearable table
– Each operation pays two cyber-dollars
– add always overpays one dollar -- one credit
– clear may underpay a variety of dollars
the underpaid amount equals the number of add
operations since last clear - 2
– Thus, the balance is at least 0
– So the total cost is 2n -- may have credits
33
Potential Functions
Based on energy model
– associate a value with the structure, Ø,
representing the current energy state
– each operation contributes to Ø a certain
amount of energy t’ and consumes a varying
amount of energy t
– Ø0 -- the initial energy, Øi -- the energy after
the i-th operation
– for the i-th operation
ti -- the actual running time
t’i -- the amortized running time
ti = t’i + Øi-1 - Øi
– Overall: T=∑(ti), T’=∑(t’i)
– The total actual time T = T’ + Ø0 - Øn
– As long as Ø0 <= Øn, T<=T’
34
Potential Functions
Example -- The clearable table
– the current energy state Ø is defined as the
number of elements in the table, thus Ø>=0
– each operation contributes to Ø t’=2
– Ø0 = 0
– Øi = Øi-1 + 1, if the i-th operation is add
add: t = t’+ Øi-1 - Øi = 2 - 1 = 1
– Øi = 0, if the i-th operation is clear
clear: t =t’+ Øi-1 = 2 + Øi-1
– Overall: T=∑ (t), T’=∑(t’)
– The total actual time T = T’ + Ø0 - Øn
– Because Ø0 = 0 <= Øn, T<=T’ =2n
35
Extendable Array
ADT: Extendable array is an array with extendable
size. One of its methods is add to add an element to
the array. If the array is not full, the element is
added to the first available cell. When the array is
full, the add method performs the following
–
–
–
–
Allocate a new array with double size
Copy elements from the old array to the new array
Add the element to the first available cell in the new array
Replace the old array with the new array
Question: What is the amortization time of add?
– Two situations of add:
add -- O(1)
Extend and add -- (n)
– Amortization analysis:
Each add deposits 3 dollars and spends 1 dollar
Each extension spends k dollars from the size k to 2k
(copy)
36
Amortization time of add is O(1)
Relatives of Big-Oh
big-Omega
– f(n) is (g(n)) if there is a constant c > 0
and an integer constant n0  1 such that
f(n)  c•g(n) for n  n0
big-Theta
– f(n) is (g(n)) if there are constants c’ > 0 and c’’ > 0
and an integer constant n0  1 such that c’•g(n)  f(n) 
c’’•g(n) for n  n0
little-oh
– f(n) is o(g(n)) if, for any constant c > 0, there is an
integer constant n0  0 such that f(n)  c•g(n) for n  n0
little-omega
– f(n) is (g(n)) if, for any constant c > 0, there is an
integer constant n0  0 such that f(n)  c•g(n) for n  n0
37
Intuition for Asymptotic Notation
Big-Oh
– f(n) is O(g(n)) if f(n) is asymptotically less than or
equal to g(n)
big-Omega
– f(n) is (g(n)) if f(n) is asymptotically greater
than or equal to g(n)
big-Theta
– f(n) is (g(n)) if f(n) is asymptotically equal to
g(n)
little-oh
– f(n) is o(g(n)) if f(n) is asymptotically strictly less
than g(n)
little-omega
– f(n) is (g(n)) if is asymptotically strictly greater
than g(n)
38
Example Uses of the Relatives of Big-Oh



5n2 is (n2)
f(n) is (g(n)) if there is a constant c > 0 and an
integer constant n0  1 such that f(n)  c•g(n) for
n  n0 let c = 5 and n0 = 1
5n2 is (n)
f(n) is (g(n)) if there is a constant c > 0 and an
integer constant n0  1 such that f(n)  c•g(n) for
n  n0 let c = 1 and n0 = 1
5n2 is (n)
f(n) is (g(n)) if, for any constant c > 0, there is
an integer constant n0  0 such that f(n)  c•g(n)
for n  n0 need 5n02  c•n0  given c, the n0 that
satisfies this is n0  c/5  0
39