Data Structures and Algorithms
Week 2
Dr. Ken Cosh
Week 1 Review
• Introduction to Data Structures and Algorithms
• Background
  • Computer Programming in C++
  • Mathematical Background
Week 2 Topics
• Complexity Analysis
  • Computational and Asymptotic Complexity
  • Big-O Notation
  • Properties of Big-O Notation
  • Amortized Complexity
  • NP-Completeness
Computational Complexity
• The same problem can be solved using many different algorithms;
  • Factorials can be calculated iteratively or recursively.
  • Sorting can be done using shellsort, heapsort, quicksort, etc.
• So how do we know which algorithm is best? And why do we need to know? (A sketch of the factorial example follows below.)
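To make the first bullet concrete, here is a minimal C++ sketch of both factorial styles (the function names are ours, not from the slides); both compute the same value but with different structures:

#include <cstdint>

// Iterative: one loop, constant extra space.
std::uint64_t factorial_iter(unsigned n) {
    std::uint64_t result = 1;          // overflows past n = 20 in 64 bits
    for (unsigned i = 2; i <= n; ++i)
        result *= i;
    return result;
}

// Recursive: n nested calls, so O(n) stack space.
std::uint64_t factorial_rec(unsigned n) {
    return (n <= 1) ? 1 : n * factorial_rec(n - 1);
}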
Why do we need to know?
• If searching for a value amongst 10 values, as with many of the exercises we have encountered while learning computer programming, the efficiency of the program is perhaps less significant than simply getting the job done.
• However, if we are looking for a value amongst several trillion values, perhaps as only one step in a longer algorithm, establishing the most efficient searching algorithm is very significant.
How do we find the most efficient algorithm?
• To compare the efficiency of algorithms, computational complexity can be used.
• Computational complexity is a measure of how much effort is needed to apply an algorithm, or how much it costs.
• An algorithm's cost can be considered in different ways, but for our purposes time and space are critical, time being the most significant.
Computational Complexity Considerations
• Computational complexity is both platform/system and language dependent;
  • An algorithm will run faster on my PC at home than on the PCs in the lab.
  • A precompiled program written in C++ is likely to be much faster than the same program written in Basic.
• Therefore, to compare algorithms, all should be run on the same machine.
Computational Complexity Considerations II
• When comparing algorithm efficiencies, real-time units such as nanoseconds need not be used.
• Instead, logical units representing the relationship between 'n', the size of a file, and 't', the time taken to process the data, should be used.
Time / Size relationships
• Linear
  • If t = cn, then an increase in the size of the data increases the execution time by the same factor.
• Logarithmic
  • If t = log₂n, then doubling the size 'n' increases 't' by only one time unit (e.g. going from n = 1,024 to n = 2,048 raises t from 10 to 11).
Asymptotic Complexity
• Functions relating 'n' and 't' are normally much more complex, but calculating such a function precisely is only important when considering large bodies of data, i.e. large 'n'.
• Ergo, any terms which don't significantly affect the outcome of the function can be eliminated, producing a function which approximates the function's efficiency. This is called asymptotic complexity.
Example I
• Consider this example;
  • f(n) = n² + 100n + log₁₀n + 1000
• For small values of n, the final term is the most significant.
• However, as n grows, the first term becomes the most significant. Hence for large 'n' it isn't worth considering the final term; how about the penultimate term? (A sketch that tabulates each term's contribution follows below.)
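As a quick check of this claim, the short C++ sketch below (our own illustration) evaluates each term of f(n) and prints its percentage contribution; it reproduces the pattern of the table on the next slide:

#include <cmath>
#include <cstdio>

int main() {
    // f(n) = n^2 + 100n + log10(n) + 1000, term by term.
    for (double n = 1; n <= 100000; n *= 10) {
        double t1 = n * n, t2 = 100 * n, t3 = std::log10(n), t4 = 1000;
        double f = t1 + t2 + t3 + t4;
        std::printf("n=%7.0f  f=%14.0f  n^2=%6.2f%%  100n=%6.2f%%  log=%8.4f%%  1000=%6.2f%%\n",
                    n, f, 100*t1/f, 100*t2/f, 100*t3/f, 100*t4/f);
    }
}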
Example II

n       | f(n)           | n² (value, %)         | 100n (value, %)    | log₁₀n (value, %) | 1000 (value, %)
1       | 1,101          | 1 (0.1)               | 100 (9.1)          | 0 (0.0)           | 1,000 (90.83)
10      | 2,101          | 100 (4.76)            | 1,000 (47.6)       | 1 (0.05)          | 1,000 (47.6)
100     | 21,002         | 10,000 (47.6)         | 10,000 (47.6)      | 2 (0.001)         | 1,000 (4.76)
1,000   | 1,101,003      | 1,000,000 (90.8)      | 100,000 (9.1)      | 3 (0.0003)        | 1,000 (0.09)
10,000  | 101,001,004    | 100,000,000 (99)      | 1,000,000 (0.99)   | 4 (0.0)           | 1,000 (0.001)
100,000 | 10,010,001,005 | 10,000,000,000 (99.9) | 10,000,000 (0.099) | 5 (0.0)           | 1,000 (0.00)
Big-O Notation
• Given 2 positively-valued functions f() and g();
  • f(n) is O(g(n)) if there exist c > 0 and N > 0 such that f(n) ≤ c·g(n) for all n ≥ N.
  • (In other words) f is big-O of g if there is a positive number c such that f is not larger than cg for sufficiently large n (all n larger than some number N).
  • The relationship between f and g is that g(n) is an upper bound of f(n), or that in the long run f grows at most as fast as g.
Big-O Notation problems
• The problem with the definition is that while c and N must exist, no help is given towards calculating them.
• No restrictions are placed on these values.
• No guidance is given for choosing values when more than one pair exists.
• The choice for g() is infinite! (So when dealing with big-O, the smallest g() is chosen.)
Example I
• Consider;
  • f(n) = 2n² + 3n + 1 = O(n²)
• When g(n) = n², candidates for c and N can be calculated using the following inequality;
  • 2n² + 3n + 1 ≤ cn²
  • 2 + (3/n) + 1/n² ≤ c
  • If n = 1, c ≥ 6. If n = 2, c ≥ 3.75. If n = 3, c ≥ 3.111. If n = 4, c ≥ 2.8125…
Example II
• So which pair of c and N?
  • Choose the best pair by determining when a term in f becomes the largest and stays the largest. In our equation only 2n² and 3n are candidates. Comparing them, 2n² > 3n holds true for all n > 1, hence N = 2 can be chosen.
• But what's the practical significance of c and N?
  • For any g an infinite number of pairs of c and N can be calculated.
  • g is 'almost always' greater than or equal to f when multiplied by a constant; 'almost always' means when n is greater than N. The constant then depends on the value of N chosen. (A numerical sanity check for one such pair follows below.)
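For a numerical sanity check (not a proof), this C++ sketch of ours verifies the pair (c, N) = (3.75, 2) against f(n) = 2n² + 3n + 1 and g(n) = n² for a few values of n ≥ N:

#include <cstdio>

int main() {
    const double c = 3.75;   // paired with N = 2, from the analysis above
    for (long n = 2; n <= 1000000; n *= 10) {
        double f  = 2.0*n*n + 3.0*n + 1.0;
        double cg = c * (double)n * (double)n;
        std::printf("n=%8ld  f(n)=%16.0f  c*g(n)=%16.0f  f<=cg: %s\n",
                    n, f, cg, f <= cg ? "yes" : "no");
    }
}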
Big-O
• Big-O is used to give an asymptotic upper bound for a function, i.e. an approximation of the upper bound of a function which is difficult to formulate exactly.
• Just as there is an upper bound, there is a lower bound (big-Ω); we'll come on to that shortly…
• But first, some useful properties of big-O.
Fact 1 - Transitivity
• If f(n) is O(g(n)) and g(n) is O(h(n)), then f(n) is O(h(n)); i.e. O(O(g(n))) is O(g(n)).
• Proof:
  • c1 and N1 exist so that f(n) ≤ c1·g(n) for all n ≥ N1.
  • c2 and N2 exist so that g(n) ≤ c2·h(n) for all n ≥ N2.
  • Hence f(n) ≤ c1·g(n) ≤ c1c2·h(n) for all n ≥ N, where N is the larger of N1 and N2.
  • So if c = c1c2, then f(n) ≤ c·h(n) for all n ≥ N.
  • Therefore f(n) is O(h(n)).
Fact 2
• If f(n) is O(h(n)) and g(n) is O(h(n)), then f(n) + g(n) is O(h(n)).
• Proof:
  • f(n) ≤ c1·h(n) and g(n) ≤ c2·h(n) for sufficiently large n, so after setting c = c1 + c2, f(n) + g(n) ≤ c·h(n).
Fact 3
• The function a·n^k is O(n^k).
• Proof:
  • For the inequality a·n^k ≤ c·n^k to hold, c ≥ a is necessary (and sufficient).
Fact 4
• The function n^k is O(n^(k+j)) for any positive j.
• Proof:
  • This is true if c = N = 1.
• From this, it is clear that every polynomial is big-O of n raised to its largest power;
  • f(n) = aₖn^k + aₖ₋₁n^(k−1) + … + a₁n + a₀ is O(n^k)
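One way to see this corollary in action: the ratio f(n)/n^k settles towards the leading coefficient as n grows, so the lower-order terms stop mattering. A small C++ sketch of ours, using the running example f(n) = 2n² + 3n + 1:

#include <cstdio>

int main() {
    // The ratio approaches 2, the leading coefficient, which is why f is O(n^2).
    for (double n = 1; n <= 1e6; n *= 10)
        std::printf("n=%8.0f  f(n)/n^2 = %.6f\n", n, (2*n*n + 3*n + 1) / (n*n));
}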
Big-O and Logarithms
• First, let's state that if the complexity of an algorithm is on the order of a logarithmic function, it is very good! (Check the Time / Size relationships slide earlier.)
• Second, let's state that despite that, there are an infinite number of better functions; however very few are useful, e.g. O(lg lg n) or O(1).
• Therefore, it is important to understand big-O when it comes to logarithms.
Fact 5 - Logarithms
• The function log_a n is O(log_b n) for any positive a and b ≠ 1.
• This means that regardless of their bases, all logarithmic functions are big-O of each other; i.e. all have the same rate of growth.
• Proof:
  • Let log_a n = x and log_b n = y, i.e. aˣ = n and bʸ = n.
  • Taking ln of both sides gives x ln a = ln n and y ln b = ln n.
  • Hence x ln a = y ln b.
  • Substituting back: (ln a)·log_a n = (ln b)·log_b n.
  • So log_a n = (ln b / ln a)·log_b n = c·log_b n.
  • Hence log_a n and log_b n are constant multiples of each other.
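This constant-multiple relationship is easy to observe numerically; the following C++ sketch of ours checks that log₂n / log₁₀n is the fixed constant ln 10 / ln 2 ≈ 3.3219 for every n:

#include <cmath>
#include <cstdio>

int main() {
    for (double n = 10; n <= 1e6; n *= 10)
        std::printf("n=%8.0f  log2(n)/log10(n) = %.4f  (ln10/ln2 = %.4f)\n",
                    n, std::log2(n) / std::log10(n), std::log(10.0) / std::log(2.0));
}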
Fact 5 (cont.)
• Because the base of a logarithm is irrelevant in terms of big-O, we can use just one base;
  • log_a n is O(lg n) for any positive a ≠ 1, where lg n = log₂n.
Big-Ω
• Big-O refers to the upper bound of functions. The opposite of this is a definition for the lower bound of functions, known as big-Ω (big omega);
  • f(n) is Ω(g(n)) if there exist c > 0 and N > 0 such that f(n) ≥ c·g(n) for all n ≥ N.
  • (In other words) f is big-Ω of g if there is a positive number c such that f is at least equal to cg for almost all n (all n larger than some number N).
  • The relationship between f and g is that g(n) is a lower bound of f(n), or that in the long run f grows at least as fast as g.
Big-Ω example
• Consider:
  • f(n) = 2n² + 3n + 1 = Ω(n²)
• When g(n) = n², candidates for c and N can be calculated using the following inequality;
  • 2n² + 3n + 1 ≥ cn²
  • 2 + (3/n) + 1/n² ≥ c
  • As we saw before, the left-hand side tends towards 2 as n grows, hence the proposal is true for any c ≤ 2.
Big-Ω
• f(n) is Ω(g(n)) iff g(n) is O(f(n)).
• There is a clear relationship between big-Ω and big-O, and the same problems and facts (in reverse) hold true in both cases;
  • There is still an infinite number of valid big-Ω bounds.
• Therefore we can explore the relationship between big-O and big-Ω further by introducing big-Θ (theta), which restricts the sets of possible upper and lower bounds.
Big-Θ
• f(n) is Θ(g(n)) if c1, c2, N > 0 exist such that c1·g(n) ≤ f(n) ≤ c2·g(n) for all n ≥ N.
• From this, f(n) is Θ(g(n)) if both functions grow at the same rate in the long run.
O, Ω & Θ
• For the function;
  • f(n) = 2n² + 3n + 1
• Options for big-O include;
  • g(n) = n², g(n) = n³, g(n) = n⁴, etc.
• Options for big-Ω include;
  • g(n) = n², g(n) = n, g(n) = n^½
• Options for big-Θ include;
  • g(n) = n², g(n) = 2n², g(n) = 3n²
• Therefore, while there is still an infinite number of functions to choose from, it is obvious which should be chosen: g(n) = n².
Possible problems with Big-O
• Given the rules of big-O, a function g(n) can be chosen such that f(n) ≤ c·g(n) for almost any f, provided the constant c is large enough.
• As c grows, the number of exceptions (values of n below N) is reduced.
• If c = 10⁸, g(n) might not be very useful for approximating f(n), as our algorithm may never need to perform 10⁸ operations.
• This may lead to algorithms being rejected unnecessarily.
• If c is too large for practical significance, g(n) is said to be OO of f(n) (double-O); however 'too large' depends upon the application.
Why Complexity Analysis?
• Today's computers can perform millions of operations per second at relatively low cost, so why complexity analysis?
• With a PC that can perform 1 million operations per second and 1 million items to be processed:
  • A quadratic algorithm O(n²) would take 11.6 days.
  • A cubic algorithm O(n³) would take 31,709 years.
  • An exponential algorithm O(2ⁿ) is not worth thinking about.
Why Complexity Analysis
• Even with a 1,000-fold improvement in processing power (consider Moore's Law):
  • The cubic algorithm would still take over 31 years.
  • The quadratic would still take over 16 minutes.
• To make scalable programs, algorithm complexity does need to be analysed.
Complexity Classes
1 operation per μsec (microsecond), 10 operations to be completed.
• Constant = O(1) = 1 μsec
• Logarithmic = O(lg n) = 3 μsec
• Linear = O(n) = 10 μsec
• O(n lg n) = 33.2 μsec
• Quadratic = O(n²) = 100 μsec
• Cubic = O(n³) = 1 msec
• Exponential = O(2ⁿ) ≈ 1 msec
Complexity Classes
1 operation per μsec (microsecond), 10² operations to be completed.
• Constant = O(1) = 1 μsec
• Logarithmic = O(lg n) = 7 μsec
• Linear = O(n) = 100 μsec
• O(n lg n) = 664 μsec
• Quadratic = O(n²) = 10 msec
• Cubic = O(n³) = 1 sec
• Exponential = O(2ⁿ) = 3.17×10¹⁷ yrs
Complexity Classes
1 operation per μsec (microsecond), 10³ operations to be completed.
• Constant = O(1) = 1 μsec
• Logarithmic = O(lg n) = 10 μsec
• Linear = O(n) = 1 msec
• O(n lg n) = 10 msec
• Quadratic = O(n²) = 1 sec
• Cubic = O(n³) = 16.7 min
• Exponential = O(2ⁿ) = ……
Complexity Classes
1 operation per μsec (microsecond), 10⁴ operations to be completed.
• Constant = O(1) = 1 μsec
• Logarithmic = O(lg n) = 13 μsec
• Linear = O(n) = 10 msec
• O(n lg n) = 133 msec
• Quadratic = O(n²) = 1.7 min
• Cubic = O(n³) = 11.6 days
Complexity Classes
1 operation per μsec (microsecond), 10⁵ operations to be completed.
• Constant = O(1) = 1 μsec
• Logarithmic = O(lg n) = 17 μsec
• Linear = O(n) = 0.1 sec
• O(n lg n) = 1.6 sec
• Quadratic = O(n²) = 2.8 hr
• Cubic = O(n³) = 31.7 years
Complexity Classes
1 operation per μsec (microsecond), 10⁶ operations to be completed.
• Constant = O(1) = 1 μsec
• Logarithmic = O(lg n) = 20 μsec
• Linear = O(n) = 1 sec
• O(n lg n) = 20 sec
• Quadratic = O(n²) = 11.6 days
• Cubic = O(n³) = 31,709 years
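The figures in the six slides above can be regenerated with a few lines of C++ (a rough sketch of ours; times in μsec, assuming 1 operation per μsec as stated):

#include <cmath>
#include <cstdio>

int main() {
    const double ns[] = {10, 1e2, 1e3, 1e4, 1e5, 1e6};
    for (double n : ns) {
        // 2^n overflows a double past n ~ 1024 and prints as "inf",
        // which is a fair summary of the exponential row anyway.
        std::printf("n=%8.0f  lg n=%5.1f  n lg n=%12.1f  n^2=%.3g  n^3=%.3g  2^n=%.3g (usec)\n",
                    n, std::log2(n), n * std::log2(n), n*n, n*n*n, std::pow(2.0, n));
    }
}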
Asymptotic Complexity Example
• Consider this simple code;

  for (i = sum = 0; i < n; i++)
      sum += a[i];

• First, two variables are initialised.
• The loop executes n times, with 2 assignments each time (one updates sum and one updates i).
• Thus there are 2 + 2n assignments for this code, and so an asymptotic complexity of O(n). (A counting sketch follows below.)
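The count can be verified empirically; this C++ sketch of ours instruments the same loop with an assignment counter:

#include <cstdio>

int main() {
    for (int n = 10; n <= 10000; n *= 10) {
        long assignments = 2;        // i = 0 and sum = 0
        for (int i = 0; i < n; ) {
            assignments++;           // sum += a[i]
            i++;
            assignments++;           // i++
        }
        std::printf("n=%6d  counted=%ld  formula 2+2n=%d\n",
                    n, assignments, 2 + 2*n);
    }
}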
Asymptotic Complexity Example 2
• Consider this code;

  for (i = 0; i < n; i++) {
      for (j = 1, sum = a[0]; j <= i; j++)
          sum += a[j];
      cout << "sum for subarray 0 through " << i << " is " << sum << endl;
  }

• Before the loop starts there is 1 initialisation (i).
• The outer loop executes n times, each time calling the inner loop and making 3 assignments (sum, i and j).
• The inner loop executes i times for each i ∈ {1, …, n−1}, with 2 assignments in each case (sum and j).
Asymptotic Complexity Example 2 (cont.)
• Therefore there are;
  • 1 + 3n + 2(1 + 2 + … + (n−1)) = 1 + 3n + n(n−1), i.e. O(n²)
• assignments before the program completes.
Asymptotic Complexity 3
• Consider this refinement;

  for (i = 4; i < n; i++) {
      for (j = i - 3, sum = a[i-4]; j <= i; j++)
          sum += a[j];
      cout << "sum for subarray " << i-4 << " through " << i << " is " << sum << endl;
  }

• How would you calculate the asymptotic complexity of this code? (An empirical counting sketch follows below.)
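One way in (a sketch of ours, not the only approach): count the assignments the same way as before. Note that the inner loop now always runs exactly 4 times (j goes from i−3 to i), which changes the class of the answer:

#include <cstdio>

int main() {
    for (int n = 10; n <= 10000; n *= 10) {
        long assignments = 1;                    // i = 4
        for (int i = 4; i < n; i++) {
            assignments += 1;                    // i++
            assignments += 2;                    // j = i-3 and sum = a[i-4]
            for (int j = i - 3; j <= i; j++)
                assignments += 2;                // sum += a[j] and j++
        }
        std::printf("n=%6d  counted=%ld  (1 + 11(n-4): linear, so O(n))\n",
                    n, assignments);
    }
}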
The Number Game
• I've picked a number between 1 and 10 – can you guess what it is?
• Take a guess, and I'll tell you if it's higher or lower than your guess.
The Number Game
• There are several approaches you could take;
  • Guess 1; if wrong, guess 2; if wrong, guess 3; etc.
  • Alternatively, guess the midpoint, 5. If the number is lower, guess halfway between 1 and 5, maybe 3, etc.
• Which is better?
  • It depends on what the number was! But for each option there is a best, worst and average case.
Average Case Complexity
• Best Case;
  • Number of steps is smallest.
• Worst Case;
  • Number of steps is maximum.
• Average Case;
  • Somewhere in between.
  • Could be calculated as the sum of the number of steps for each input divided by the number of inputs.
  • But this assumes each input has equal probability, so we weight the calculation with the probability of each input.
Method 1
• Choose 1; if wrong, choose 2; if wrong, choose 3…
  • Probability of success on the 1st try = 1/n
  • Probability of success on the 2nd try = 1/n
  • Probability of success on the nth try = 1/n
• Average;
  • (1 + 2 + … + n) / n = (n+1)/2
Method 2
• Picking midpoints;
  • Method 2 is actually like searching a binary tree, so we will leave a full calculation until week 6, as right now the maths could get complicated.
  • But for n = 10 you should be able to calculate the average case – try it! (When n = 10, I make it 1.9 times as efficient.) A simulation sketch follows below.
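Here is a small C++ simulation of ours that averages the guess counts of both methods over every possible secret value from 1 to 10, assuming each is equally likely:

#include <cstdio>

int main() {
    const int n = 10;
    double linear = 0, midpoint = 0;
    for (int secret = 1; secret <= n; secret++) {
        linear += secret;                  // method 1 takes 'secret' guesses
        int lo = 1, hi = n, guesses = 0;   // method 2: halve the remaining range
        for (;;) {
            int mid = (lo + hi) / 2;
            guesses++;
            if (mid == secret) break;
            if (mid < secret) lo = mid + 1; else hi = mid - 1;
        }
        midpoint += guesses;
    }
    std::printf("method 1 average: %.2f\n", linear / n);     // (n+1)/2 = 5.50
    std::printf("method 2 average: %.2f\n", midpoint / n);   // 2.90
    std::printf("ratio: %.2f\n", linear / midpoint);         // ~1.9
}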
Average Case Complexity
• Calculating average case complexity can be difficult, even if the probabilities are equal, so calculating approximations in the form of big-O, big-Ω and big-Θ can simplify the task.
Amortized Complexity
• Thus far we have considered simple algorithms independently from any others; however, it's more likely these algorithms are part of a larger problem.
• To calculate the best, worst and average case for the whole sequence, we could simply add the best, worst and average cases for each algorithm in the sequence;
  • Cworst(op1, op2, op3, …) = Cworst(op1) + Cworst(op2) + Cworst(op3) + …
Grades Case
• Suppose I create an array in which to store student grades. I then enter the midterm grades and sort the array best to worst. Next I enter the coursework grades, and then sort the array best to worst. Finally I enter the final exam grades and sort the array best to worst.
• This is a sequence of algorithms;
  • Input Values
  • Sort Array
  • Input Values
  • Sort Array
  • Input Values
  • Sort Array
Grades Case
• However, is it fair to calculate the worst case for this program by adding the worst cases for each step?
• Is it fair to use the worst-case 'Sort Array' cost for sorting the array every time, even after it has only changed slightly?
• Is it likely that the array will need a complete rearrangement after the coursework grade has been added? I.e. is it likely that the student who receives the lowest midterm grade then has the highest score after midterm and coursework are included?
Grades Case
• In reality it is unlikely that the worst-case scenario will ever occur for the 2nd and 3rd array sorts, so how do we approximate an accurate worst case when combining a sequence of operations?
  • Steal from the rich, and give to the poor.
  • Add a little to the quick operations and take a little from the expensive operations.
  • Overcharge cheap operations, undercharge expensive ones.
Bangkok
• I want to drive to Bangkok – how long will it take?
  • Average Case?
  • Best Case?
  • Worst Case?
• How do you come to your answer?
<Vector>
• <vector> is a library we first encountered during Computer Programming 1. It defines a data structure – remember how it worked?
  • Add elements to the vector when there is space through push_back.
  • When capacity is reached, add to capacity through reserve.
• Suppose each time the capacity is full we double the size of the vector – how can we estimate an amortized cost of filling the vector? (See the sketch below.)
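You can watch this behaviour in a real std::vector; note that the growth factor is implementation-defined (libstdc++ doubles, other libraries may use 1.5), so the doubling in these slides is an assumption:

#include <cstdio>
#include <vector>

int main() {
    std::vector<int> v;
    std::size_t last = 0;
    for (int i = 0; i < 100; i++) {
        v.push_back(i);
        if (v.capacity() != last) {          // capacity changed: a reallocation happened
            last = v.capacity();
            std::printf("size=%zu  capacity=%zu\n", v.size(), last);
        }
    }
}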
<Vector>
• Case of adding an element to a vector with space:
  • Copy the new value into the first available cell.
  • O(1)
• Case of adding an element to a full vector:
  • Copy the existing values to the new space.
  • Add the new value.
  • O(size(vector) + 1)
  • I.e. if the vector's capacity and size are both 4, the cost of adding an element would be 4 + 1.
Amortized cost = 2

Size | Capacity | Amortized Cost | Cost | Units Left
   1 |        1 |              2 |  0+1 |   1
   2 |        2 |              2 |  1+1 |   1
   3 |        4 |              2 |  2+1 |   0
   4 |        4 |              2 |    1 |   1
   5 |        8 |              2 |  4+1 |  -2
   6 |        8 |              2 |    1 |  -1
   7 |        8 |              2 |    1 |   0
   8 |        8 |              2 |    1 |   1
   9 |       16 |              2 |  8+1 |  -6
  10 |       16 |              2 |    1 |  -5
  17 |       32 |              2 | 16+1 | -14
Amortized cost = 3

Size | Capacity | Amortized Cost | Cost | Units Left
   1 |        1 |              3 |  0+1 |   2
   2 |        2 |              3 |  1+1 |   3
   3 |        4 |              3 |  2+1 |   3
   4 |        4 |              3 |    1 |   5
   5 |        8 |              3 |  4+1 |   3
   6 |        8 |              3 |    1 |   5
   7 |        8 |              3 |    1 |   7
   8 |        8 |              3 |    1 |   9
   9 |       16 |              3 |  8+1 |   3
  10 |       16 |              3 |    1 |   5
  17 |       32 |              3 | 16+1 |   3
Amortized Cost
• From the previous 2 tables it can be seen that an amortized cost of 2 is not enough.
• With an amortized cost of 3, there are sufficient units left over to cover the expensive operations.
• Finding an acceptable amortized cost is, however, not always that easy. (A short simulation of this accounting follows below.)
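The accounting from the two tables can be simulated directly; this C++ sketch of ours charges a flat number of units per push and reports whether the running balance ever goes negative:

#include <cstdio>

int main() {
    for (long charge = 2; charge <= 3; charge++) {
        long capacity = 1, balance = 0;
        bool ok = true;
        for (long size = 1; size <= 1000000; size++) {
            long cost = 1;                 // write the new element
            if (size > capacity) {         // vector full: copy the old elements first
                cost += capacity;
                capacity *= 2;
            }
            balance += charge - cost;
            if (balance < 0) ok = false;
        }
        std::printf("charge %ld per push: balance %s negative\n",
                    charge, ok ? "never went" : "went");
    }
}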
Difficult Problems
• It would be ideal if all problems were of class constant, linear or logarithmic.
• However, many problems we will look at are polynomial-class problems (quadratic, cubic or worse) – class P.
• Unfortunately, there are many important problems whose best algorithms are very complex, sometimes taking exponential time (and in fact sometimes worse!).
• As well as EXPTIME problems there is another class of problems called NP-Complete, which is bad news: 'evidence' that some problems just can't be solved easily.
NP-Complete
• Why worry about it?
  • Knowing that some problems are NP-Complete saves you from blindly trying to find an efficient solution to them.
NP-Complete
• Background
  • P refers to the class of problems which can be solved in polynomial time.
  • NP stands for "Non-deterministic Polynomial time".
  • Essentially, here we can test whether a proposed solution is correct fairly quickly, but finding a solution is difficult. An NP problem would pose no difficulty if we could only guess the right solution!
NP Examples
• Long Simple Path
  • Finding a path through a graph from A to B traveling over every vertex once and only once is very difficult, but if I tell you a solution path it is relatively simple for you to check it. The Traveling Salesman Problem is a long ongoing problem, with huge financial rewards for a successful solution!
• Cracking Cryptography
  • It's difficult to break encryption, but if I give you a solution, it is easy to test that it works.
• Infinite Loop Checking
  • Ever wondered why your compiler doesn't tell you you've got an infinite loop? This problem is actually much harder than NP – a class of complexity known as 'undecidable'.
P vs NP
• Arguably one of the most famous current debates in theoretical computer science concerns whether P = NP, with many theoreticians divided.
  • While all P problems are NP, is the reverse true?
  • If it is always easy to check a solution, should it also be easy to find the solution? Can you prove it either way?
• This leads us to a complexity framework for problems which are known to be NP but cannot be proven not to be in P, and this is where NP-Complete fits in.
NP-Complete
• NP-Complete problems are the hardest problems within NP; they are not known to have solutions in polynomial time.
• We are still left with the problem of identifying NP-Complete problems.
  • How can we prove that a problem is "not known to" have a solution in polynomial time?
  • (Rather than it just being a problem we haven't solved?)
Reduction
• We can often reduce problems;
  • Problem A can be solved by an algorithm involving a number of calls to Algorithm B.
  • The number of calls could be 1, a constant, or polynomial.
  • If Algorithm B is P, then this demonstrates that Problem A is also P.
• The same theory applies to NP-Complete problems.
  • A problem is NP-Complete if it is NP, and every other NP problem can be polynomially reduced to it.
  • The astute will realise that proving a problem is NP-Complete takes a problem which has already been proved to be NP-Complete. A kind of chicken-and-egg scenario, where fortunately Cook's satisfiability problem came first.
• We will encounter more NP-Complete problems when dealing with graphs later in the course.