Introduction to Algorithms
Rabie A. Ramadan
[email protected]
http://www.rabieramadan.org
Some of the slides are taken from different sources to clarify the topic.
Importance of algorithms
Algorithms are used in every aspect of our lives.
Let's take an example.
Example
Suppose you are implementing a spreadsheet program, in which you must maintain a grid of cells. Some cells of the spreadsheet contain numbers, but other cells contain expressions that depend on other cells for their value. However, the expressions are not allowed to form a cycle of dependencies: for example, if the expression in cell E1 depends on the value of cell A5, and the expression in cell A5 depends on the value of cell C2, then C2 must not depend on E1.
Example
• Describe an algorithm for making sure that no cycle of dependencies exists (or finding one and complaining to the spreadsheet user if it does exist).
• If the spreadsheet changes, all its expressions may need to be recalculated. Describe an efficient method for sorting the expression evaluations, so that each cell is recalculated only after the cells it depends on have been recalculated.
Another Example
Order the following items in a food chain
tiger
human
fish
sheep
shrimp
plankton
wheat
Solving the Topological Sorting Problem
• Solution: verify whether a given digraph is a DAG and, if it is, produce an ordering of its vertices.
• There are two algorithms for solving the problem. They may give different (alternative) solutions.
DFS-based algorithm
• Perform a DFS traversal and note the order in which vertices become dead ends (that is, are popped off the traversal stack).
• Reversing this order yields the desired solution, provided that no back edge has been encountered during the traversal.
Example
Complexity: same as DFS
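For illustration only, here is a minimal Python sketch of the DFS-based approach described above; the adjacency-dictionary representation and the function name are assumptions, not part of the lecture.

```python
def dfs_topological_sort(graph):
    """Return vertices in topological order, or raise if a back edge is found."""
    WHITE, GRAY, BLACK = 0, 1, 2           # unvisited, on the traversal stack, dead end
    color = {v: WHITE for v in graph}
    finished = []                          # vertices in the order they become dead ends

    def dfs(v):
        color[v] = GRAY
        for w in graph.get(v, ()):
            if color[w] == GRAY:           # back edge encountered: not a DAG
                raise ValueError("cycle detected; no topological order exists")
            if color[w] == WHITE:
                dfs(w)
        color[v] = BLACK
        finished.append(v)                 # popped off the traversal stack

    for v in graph:
        if color[v] == WHITE:
            dfs(v)
    return list(reversed(finished))        # reverse of the dead-end order

# Spreadsheet example: E1 depends on A5, which depends on C2
print(dfs_topological_sort({"C2": ["A5"], "A5": ["E1"], "E1": []}))  # ['C2', 'A5', 'E1']
```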
Solving the Topological Sorting Problem
Source removal algorithm
• Identify a source, which is a vertex with no incoming edges, and delete it along with all edges outgoing from it.
• There must be at least one source for the problem to be solvable.
• Repeat this process on the remaining digraph.
• The order in which the vertices are deleted yields the desired solution.
Example
Source removal algorithm efficiency
• Efficiency: same as the efficiency of the DFS-based algorithm, but how would you identify a source?
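One common answer is to keep an in-degree count for every vertex, so "deleting" a vertex only decrements its successors' counts. The sketch below (a standard formulation often called Kahn's algorithm; the names are assumptions) illustrates the source-removal idea:

```python
from collections import deque

def source_removal_topological_sort(graph):
    """Repeatedly remove a source (vertex with in-degree 0) and record the order."""
    indegree = {v: 0 for v in graph}
    for v in graph:
        for w in graph[v]:
            indegree[w] += 1

    sources = deque(v for v, d in indegree.items() if d == 0)
    order = []
    while sources:
        v = sources.popleft()              # remove a source
        order.append(v)
        for w in graph[v]:                 # delete its outgoing edges
            indegree[w] -= 1
            if indegree[w] == 0:           # w has become a source
                sources.append(w)

    if len(order) != len(graph):           # some vertices never became sources
        raise ValueError("cycle detected; no topological order exists")
    return order
```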
A big Problem
Analysis of algorithms
Issues:
• Correctness
• Space efficiency
• Time efficiency
• Optimality
Approaches:
• Theoretical analysis
• Empirical analysis
Space Analysis
• When considering space complexity, algorithms are divided into those that need extra space to do their work and those that work in place.
• Space analysis examines all of the data being stored to see if there are more efficient ways to store it.
• Example: as a developer, how do you store real numbers?
  • Suppose we are storing a real number that has only one place of precision after the decimal point and ranges between -10 and +10.
  • How many bytes do you need?
Space Analysis
• Example: as a developer, how do you store real numbers?
  • Suppose we are storing a real number that has only one place of precision after the decimal point and ranges between -10 and +10.
  • How many bytes do you need?
• Most computers will use between 4 and 8 bytes of memory.
• If we first multiply the value by 10, we can then store it as an integer between -100 and +100. This needs only 1 byte, a savings of 3 to 7 bytes.
• A program that stores 1000 of these values can save 3000 to 7000 bytes.
• This makes a big difference when programming mobile devices or PDAs, or when you have large input.
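A minimal sketch of this fixed-point trick, using Python's struct module for the 1-byte encoding (the function names are illustrative, not from the slides):

```python
import struct

def pack_tenths(value):
    """Store a number in [-10.0, 10.0] with one decimal place in a single byte."""
    scaled = round(value * 10)             # e.g. -9.7 -> -97, fits in [-100, 100]
    return struct.pack("b", scaled)        # 'b' = signed char, exactly 1 byte

def unpack_tenths(data):
    return struct.unpack("b", data)[0] / 10

packed = pack_tenths(-9.7)
print(len(packed), unpack_tenths(packed))  # 1 byte, -9.7 (instead of 4-8 bytes for a float)
```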
Theoretical analysis of time efficiency
Time efficiency is analyzed by determining the number of repetitions of the basic operation as a function of input size.
• Basic operation: the operation that contributes the most towards the running time of the algorithm.
T(n) ≈ c_op C(n)
where n is the input size, T(n) is the running time, c_op is the execution time (cost) of the basic operation, and C(n) is the number of times the basic operation is executed.
Note: different basic operations may cost differently!
Why are Input Classes Important?
• Input determines what the path of execution through an algorithm will be.
• If we are interested in finding the largest value in a list of N numbers, we can use the following algorithm:
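The algorithm itself is not included in this transcript; the following Python sketch shows the usual find-the-largest loop the slide refers to:

```python
def find_largest(values):
    largest = values[0]                    # one assignment before the loop starts
    for v in values[1:]:
        if v > largest:
            largest = v                    # executed only when a new maximum is seen
    return largest

print(find_largest([3, 1, 4, 1, 5, 9, 2]))  # 9
```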
Why are Input Classes Important?
• If the list is in decreasing order, there will only be one assignment done before the loop starts.
• If the list is in increasing order, there will be N assignments (one before the loop starts and N-1 inside the loop).
• Our analysis must consider more than one possible set of input, because if we only look at one set of input, it may be the set that is solved the fastest (or slowest).
Input size and basic operation examples
• Searching for a key in a list of n items: input size is the number of the list's items, i.e. n; basic operation is a key comparison.
• Multiplication of two matrices: input size is the matrix dimensions or the total number of elements; basic operation is the multiplication of two numbers.
• Checking primality of a given integer n: input size is n's size, i.e. the number of digits in its binary representation; basic operation is division.
• Typical graph problem: input size is the number of vertices and/or edges; basic operation is visiting a vertex or traversing an edge.
Importance of the analysis
• It gives an idea of how fast the algorithm is:
  T(n) ≈ c_op C(n)
• If the number of basic operations is
  C(n) = ½ n(n-1) = ½ n² - ½ n ≈ ½ n²,
  how much longer does the algorithm take if its input size doubles?
  T(2n)/T(n) ≈ (c_op C(2n)) / (c_op C(n)) ≈ (½ (2n)²) / (½ n²) = 4
• Increasing the input size increases the complexity.
• We tend to omit the constants because they do not affect the order of growth for large inputs.
• Everything is based on estimation.
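A quick numeric illustration (not from the slides) of this doubling argument for C(n) = n(n-1)/2:

```python
def C(n):
    return n * (n - 1) // 2                # number of basic operations

for n in (100, 1_000, 10_000):
    print(n, C(2 * n) / C(n))              # ratio approaches 4 as n grows
```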
Empirical analysis of time efficiency
• Select a specific (typical) sample of inputs.
• Use a physical unit of time (e.g., milliseconds), or count the actual number of basic operation executions.
• Analyze the empirical/experimental data.
Cases to consider in Analysis
Best-case, average-case, worst-case
For some algorithms, efficiency depends on the form of the input:
• Worst case: Cworst(n), the maximum over inputs of size n
• Best case: Cbest(n), the minimum over inputs of size n
• Average case: Cavg(n), the "average" over inputs of size n (the toughest to do)
Best-case, average-case, worst-case
• Average case: Cavg(n), the "average" over inputs of size n. To compute it:
  • Determine the number of different groups into which all possible input sets can be divided.
  • Determine the probability that the input will come from each of these groups.
  • Determine how long the algorithm will run for each of these groups.
• Then Cavg(n) = p1·t1 + p2·t2 + ... + pm·tm, where:
  n is the size of the input,
  m is the number of groups,
  pi is the probability that the input will be from group i,
  ti is the time that the algorithm takes for input from group i.
Example: Sequential search
• Worst case: n key comparisons
• Best case: 1 comparison
• Average case: (n+1)/2 comparisons, assuming the key K is in the list A
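For reference, a minimal Python sketch of sequential search over a list A for a key K (the slide's own pseudocode is not in the transcript):

```python
def sequential_search(A, K):
    """Return the index of the first occurrence of K in A, or -1 if absent."""
    for i, item in enumerate(A):
        if item == K:                      # basic operation: key comparison
            return i
    return -1
```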
Computing the Average Case for the Sequential Search
• Neither the worst case nor the best case reflects the actual performance of an algorithm on random input; the average case does.
• Assume that:
  • The probability of a successful search is p (0 ≤ p ≤ 1).
  • The probability of the first match occurring in the ith position is the same for every i, i.e. p/n.
  • In the case of an unsuccessful search, the number of comparisons is n, with probability (1-p).
Cavg(n) = [1·(p/n) + 2·(p/n) + 3·(p/n) + ... + n·(p/n)] + n·(1-p)
        = (p/n)·[1 + 2 + 3 + ... + n] + n·(1-p)
        = (p/n)·(n(n+1)/2) + n·(1-p)
        = p(n+1)/2 + n·(1-p)
Computing the Average Case for the Sequential Search
Cavg(n) = p(n+1)/2 + n·(1-p)
• If p = 1 (the key K is always found), the average number of comparisons is (n+1)/2.
• If p = 0 (the key is never found), the average number of key comparisons is n.
• The average case is more difficult to analyze than the best and worst cases.
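As a sanity check (not part of the original slides), a small simulation comparing measured comparison counts against Cavg(n) = p(n+1)/2 + n(1-p); the setup (list of size 100, match equally likely at any position) is an assumption for illustration:

```python
import random

def comparisons(A, K):
    count = 0
    for item in A:
        count += 1                         # one key comparison per iteration
        if item == K:
            break
    return count

n, p, trials = 100, 0.5, 100_000
total = 0
for _ in range(trials):
    A = list(range(n))
    if random.random() < p:                # successful search
        K = random.randrange(n)            # key equally likely at any position
    else:                                  # unsuccessful search: n comparisons
        K = -1
    total += comparisons(A, K)

print(total / trials, p * (n + 1) / 2 + n * (1 - p))   # both ~75.25
```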
Mathematical Background
Mathematical Background
• Logarithms
Logarithms
• Which base?
  log_a n = (log_a b)(log_b n)
  log_a n = c · log_b n
• In terms of complexity, we tend to ignore the constant.
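A numeric check of this base-change identity (illustration only; the chosen values of a, b, and n are arbitrary):

```python
import math

a, b, n = 2, 10, 1_000_000
lhs = math.log(n, a)                       # log_2 n
rhs = math.log(b, a) * math.log(n, b)      # (log_2 10) * (log_10 n)
print(lhs, rhs)                            # both ~19.93: bases differ only by a constant factor
```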
Types of formulas for basic operation's count
• Exact formula, e.g., C(n) = n(n-1)/2
• Formula indicating order of growth with a specific multiplicative constant, e.g., C(n) ≈ 0.5 n²
• Formula indicating order of growth with an unknown multiplicative constant, e.g., C(n) ≈ c·n²
Order of growth
Order of growth
• Of greater concern is the rate of increase in operations for an algorithm to solve a problem as the size of the problem increases.
• This is referred to as the rate of growth of the algorithm.
Order of growth
• The function based on x² increases slowly at first, but as the problem size gets larger, it begins to grow at a rapid rate.
• The functions that are based on x both grow at a steady rate for the entire length of the graph.
• The function based on log x seems to not grow at all, but this is because it is actually growing at a very slow rate.
Values of some important functions as n → ∞
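The table of values for this slide is not in the transcript; the sketch below prints representative values of the standard efficiency classes instead:

```python
import math

print(f"{'n':>6} {'log n':>7} {'n log n':>10} {'n^2':>10} {'n^3':>14} {'2^n':>10}")
for n in (10, 100, 1000):
    print(f"{n:>6} {math.log2(n):>7.1f} {n * math.log2(n):>10.0f} "
          f"{n**2:>10} {n**3:>14} {2.0**n:>10.1e}")
```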
Order of growth
Second point to consider:
• Because the faster-growing functions increase at such a significant rate, they quickly dominate the slower-growing functions.
• This means that if we determine that an algorithm's complexity is a combination of two of these classes, we will frequently ignore all but the fastest-growing of these terms.
• Example: if the complexity is, say, x² + 30x, we tend to ignore the 30x term.
Classification of Growth
Asymptotic order of growth
A way of comparing functions that ignores constant factors and small input sizes.
• O(g(n)): class of functions f(n) that grow no faster than g(n)
  • All functions with a smaller or the same order of growth as g(n)
  • Examples: n ∈ O(n²), 100n + 5 ∈ O(n²), 0.5n(n-1) ∈ O(n²), but n³ ∉ O(n²)
• Ω(g(n)): class of functions f(n) that grow at least as fast as g(n)
  • All functions that are larger or have the same order of growth as g(n)
  • Examples: n³ ∈ Ω(n²), 0.5n(n-1) ∈ Ω(n²), but 100n + 5 ∉ Ω(n²)
• Θ(g(n)): class of functions f(n) that grow at the same rate as g(n)
  • The set of functions that have the same order of growth as g(n)
  • Example: an² + bn ∈ Θ(n²)
Big-oh
• O(g(n)): class of functions t(n) that grow no faster than g(n), i.e. there exist some positive constant c and some nonnegative n0 such that
  t(n) ≤ c·g(n) for all n ≥ n0
Example: prove that 100n + 5 ∈ O(n²).
  100n + 5 ≤ 100n + n = 101n (for all n ≥ 5)
  101n ≤ 101n²
  so c = 101 and n0 = 5.
You may come up with a different c and n0.
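A small numeric spot-check (not a proof) of the inequality used above:

```python
# Check 100n + 5 <= 101 * n^2 for a range of n starting at n0 = 5.
c, n0 = 101, 5
assert all(100 * n + 5 <= c * n**2 for n in range(n0, 10_000))
print("100n + 5 <= 101 n^2 holds for every sampled n >= 5")
```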
Big-omega
Ω(g(n)): class of functions t(n) that grow at least as fast as g(n), i.e.
  t(n) ≥ c·g(n) for all n ≥ n0
Example: prove that n³ ∈ Ω(n²).
  n³ ≥ n² for all n ≥ 0,
  so c = 1 and n0 = 0.
Big-theta
Θ(g(n)): class of functions t(n) that grow at the same rate as g(n), i.e.
  c2·g(n) ≤ t(n) ≤ c1·g(n) for all n ≥ n0
You need to find c1, c2, and n0.
Summary (relative to g(n))
• ≥ : Ω(g(n)), functions that grow at least as fast as g(n)
• = : Θ(g(n)), functions that grow at the same rate as g(n)
• ≤ : O(g(n)), functions that grow no faster than g(n)
Theorem
If t1(n) ∈ O(g1(n)) and t2(n) ∈ O(g2(n)), then
  t1(n) + t2(n) ∈ O(max{g1(n), g2(n)}).
• The analogous assertions are true for the Ω-notation and Θ-notation.
Implication: the algorithm's overall efficiency will be determined by the part with the larger order of growth, i.e., its least efficient part.
• For example, 5n² + 3n log n ∈ O(n²).
Proof. There exist constants c1, c2, n1, n2 such that
  t1(n) ≤ c1·g1(n) for all n ≥ n1
  t2(n) ≤ c2·g2(n) for all n ≥ n2
Define c3 = c1 + c2 and n3 = max{n1, n2}. Then
  t1(n) + t2(n) ≤ c3·max{g1(n), g2(n)} for all n ≥ n3.
Some properties of asymptotic order of growth
• f(n) ∈ O(f(n))
• f(n) ∈ O(g(n)) iff g(n) ∈ Ω(f(n))
• If f(n) ∈ O(g(n)) and g(n) ∈ O(h(n)), then f(n) ∈ O(h(n))
• If f1(n) ∈ O(g1(n)) and f2(n) ∈ O(g2(n)), then f1(n) + f2(n) ∈ O(max{g1(n), g2(n)})
• Also, Σ(1≤i≤n) Θ(f(i)) = Θ(Σ(1≤i≤n) f(i))
Orders of growth of some important functions
• All logarithmic functions log_a n belong to the same class Θ(log n), no matter what the logarithm's base a > 1 is, because log_a n = log_b n / log_b a.
• All polynomials of the same degree k belong to the same class: a_k n^k + a_(k-1) n^(k-1) + … + a_0 ∈ Θ(n^k).
• Exponential functions a^n have different orders of growth for different values of a.
Basic asymptotic efficiency classes
• 1: constant
• log n: logarithmic
• n: linear
• n log n: n-log-n
• n²: quadratic
• n³: cubic
• 2^n: exponential
• n!: factorial