Download part 2.2 - UCL Computer Science
3. RECURSIVE ALGORITHMS AND RECURRENCE RELATIONS
In discussing the example of finding the determinant of a matrix, an
algorithm was outlined that defined det(M) for an n×n matrix in
terms of the determinants of n matrices of size (n-1)×(n-1).
If D(n) is the work required to evaluate the determinant of an n×n
matrix using this method, then
D(n) = n·D(n-1)
To solve this -- in the sense of ending up with an expression for
D(n) that makes no reference to other occurrences of the
function D() on the right-hand side -- we can use progressive
substitutions as follows:
D(n) = n·D(n-1)
     = n·(n-1)·D(n-2)
     = n·(n-1)·(n-2)·…·3·2·D(1)
Since D(1) is a constant (the cost of returning a number), D(n) is
proportional to n!
⇒ D(n) ∈ O(n!)
COMP1004: Part 2.2
27
Now consider
ALGORITHM Factorial(n)
// Recursively calculates n! for positive integer n
if n=1
    return 1
else
    return n*Factorial(n-1)
The only possible choice for the elementary operation here is
multiplication, so we choose as a cost function
F(n) = number of multiplications to evaluate Factorial(n)
The algorithm as defined then implies that
F(1) = 0
F(n) = F(n-1) + 1, n>1
The general-n form can be rewritten as
F(n)-F(n-1)=1
Letting n → n-1 in the above, we also have
F(n-1)-F(n-2)=1
We can continue to do this until the row beginning with F(2), which
contains the first mention of the base case (n-1 = 2-1 = 1).
  F(n)   - F(n-1) = 1
+ F(n-1) - F(n-2) = 1
+ F(n-2) - F(n-3) = 1
+ ...
+ F(2)   - F(1)   = 1
---------------------------   (n-1 equations)
  F(n)   - F(1)   = n-1

('Method of differences' or 'ladder method'.)
Adding up the rungs of the ladder, there are cancellations on the
left-hand side for every term except F(n) and F(1), while on the
right-hand side the sum is just n-1.
→ F(n) = F(1) + n - 1
→ F(n) = n-1, since F(1)=0
→ F(n) ∈ O(n).
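As a quick sanity check, the result F(n) = n-1 can be verified by instrumenting the recursive algorithm with a multiplication counter -- a minimal sketch (the `counter` argument is ours, not part of the original algorithm):

```python
def factorial(n, counter):
    # Recursively computes n! as in ALGORITHM Factorial, counting multiplications.
    if n == 1:
        return 1
    counter[0] += 1                 # one multiplication per recursive step
    return n * factorial(n - 1, counter)

for n in (1, 5, 10):
    c = [0]
    factorial(n, c)
    assert c[0] == n - 1            # matches F(n) = n - 1
```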
Aside:
Is this reasonable -- is evaluating n! really only O(n)?
The analysis ignored the fact that large values build up quickly,
so in this case it was not ideal to count integer multiplications
as elementary. However, it is a simple example that demonstrates
techniques which can also be used in much more complex cases.
* * *
Expressions like 'D(n) = n·D(n-1)' and 'F(n) = F(n-1) + 1'
are known as recurrence relations. A recurrence relation is a
formula that allows us to compute the members of a sequence one
after the other, given one or more starting values.
These examples are first order recurrence relations because a
reference is made to only one smaller sized instance.
A first order recurrence requires only one ‘starting value’ -- in the
first of these cases D(1), in the second F(1) -- in order to obtain a
unique solution for general n.
Recurrence relations arise naturally in the analysis of recursive
algorithms, where the starting values are the work required to
compute base cases of the algorithm.
Consider as another example of a first order recurrence relation
f(0) = 1    (the base case is for n=0)
f(n) = c·f(n-1),  n > 0
(It doesn't matter here what hypothetical algorithm may have
generated the definition of f(n). Most of the examples in this
section will be starting directly from a recurrence relation and will
be focused primarily on developing the mathematical tools for
solving such relations.)
By inspection, f(n) = c·f(n-1)
                    = c^2·f(n-2)
                    ...
                    = c^n·f(0) = c^n
f(n) = c^n is the solution of this recurrence relation.
It is purely a function of the input size variable n; it makes no
reference on the right-hand side to the original cost function
applied to any other input sizes.
The expression
D(n)= n. D(n-1)
is clearly also a first order recurrence of the same kind, but here
the multiplying coefficient depends on n. This is the next step up in
complication, where for a general n-dependent multiplier b(n)
f(n) = b(n)·f(n-1)    (1)
(defined for n > a, with base case value f(a) given)
Again by inspection
f(n) = b(n)·f(n-1)
     = b(n)·b(n-1)·f(n-2)
     = b(n)·b(n-1)·...·b(a+1)·f(a)
(note it was not possible to go any further because there would
then be a reference to f(a-1), which is not defined)
giving as a general solution to (1)

f(n) = ( ∏_{i=a+1}^{n} b(i) ) · f(a)

In the determinant example a=1, f(1)=1, b(i)=i, giving
f(n) = ∏_{i=2}^{n} i = 2·3·...·n = n!
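The product formula can be checked numerically by iterating the recurrence f(i) = b(i)·f(i-1) up from the base case; with b(i) = i, a = 1, f(1) = 1 this reproduces n!. (A small sketch; the function names are ours.)

```python
import math

def iterate_first_order(b, a, fa, n):
    # Iterates f(i) = b(i) * f(i-1) upward from the base case f(a).
    f = fa
    for i in range(a + 1, n + 1):
        f = b(i) * f
    return f

# Determinant example: b(i) = i, a = 1, f(1) = 1 gives f(n) = n!
for n in range(1, 8):
    assert iterate_first_order(lambda i: i, 1, 1, n) == math.factorial(n)
```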
(1) is the most general case of a first order (it makes reference to
only one smaller sized instance), homogeneous (there is nothing
on the r.h.s. that doesn't multiply a smaller sized instance, such as
a polynomial in n) recurrence relation.
What about the factorial function recurrence
F(n) = F(n-1) + 1 ?
This is a simple example of a first order, inhomogeneous
recurrence relation, for which the general form (with non-constant
coefficients b(n), c(n)) is
f(n) = b(n)·f(n-1) + c(n)    (2)
(n > a, f(a) given)
If we try to do this one by progressive substitution, as for the
determinant case, we quickly get into difficulties:
f(n) = b(n)·b(n-1)·f(n-2) + b(n)·c(n-1) + c(n)
     = b(n)·b(n-1)·b(n-2)·f(n-3) + b(n)·b(n-1)·c(n-2) + b(n)·c(n-1) + c(n)
     = ...
In cases like this we need to define instead a new function g(n) by
f(n) = b(a+1)xb(a+2)x…xb(n)x g(n) , n>a
f(a) ≡ g(a) (the two functions agree on the base case)
Substituting in this way for f(n) in (2):
b(a+1)·b(a+2)·…·b(n)·g(n) = b(n)·b(a+1)·b(a+2)·…·b(n-1)·g(n-1) + c(n)
(both instances of g() now carry the same coefficient)
Then dividing by the product b(a+1)·b(a+2)·…·b(n), the common
multiplier of both instances of the function g(), gives
g(n) = g(n-1) + d(n)    (3)

where d(n) = c(n) / ( b(a+1)·b(a+2)·...·b(n) )
(3) is now in a much simpler form (the important thing is that the
multiplier of the smaller-input instance of the function on the r.h.s.
is now 1) and can be solved by the ladder method we used in the
case of F(n) (where d(n) =1, independent of n) to give the solution
f(n) = ( ∏_{i=a+1}^{n} b(i) ) · ( f(a) + Σ_{j=a+1}^{n} d(j) )

In the factorial example a=1, f(1)=0, b(i)=1, d(j)=1, giving
f(n) = n-1 (as Σ_{j=2}^{n} d(j) = Σ_{j=2}^{n} 1 = n-1).
It's possible, and not incorrect, to memorise and quote the above
formula and substitute into it in order to solve a given recurrence
relation, as the formula can be applied to inhomogeneous first
order recurrences of any degree of complexity.
However it’s more instructive, and often easier, to consider
inhomogeneous recurrences on a case-by-case basis. In the
examples you will be given, the base case input ‘a’ will normally
be 0 or 1, and the d-summation above -- which could in principle
be hard and require an approximation technique -- will need only
the series summation formulae you have already seen.
Example: f(0) = 0    (base case a=0)
         f(n) = 3f(n-1) + 1,  n > 0
Change variable:
f(n) = 3^n g(n)    (b(i) = 3 for all i, so ∏_{i=1}^{n} b(i) = 3^n)
This gives
3^n g(n) = 3·3^(n-1) g(n-1) + 1
Dividing through by 3^n,
→ g(n) = g(n-1) + 1/3^n
Now use the method of differences (ladder method) to obtain

g(n) = Σ_{i=1}^{n} 1/3^i
In detail:
g(n)   - g(n-1) = 1/3^n
g(n-1) - g(n-2) = 1/3^(n-1)
...
g(1)   - g(0)   = 1/3^1
------------------------------------
g(n)   - g(0)   = Σ_{i=1}^{n} 1/3^i
(When you are familiar with these techniques you don’t
need to show all the details.)
Multiplying both sides of the solution for g(n) by 3^n gives the
solution for f(n):

f(n) = 3^n · Σ_{i=1}^{n} 1/3^i

We evaluate the sum using the formula for a geometric series,
with a = 1/3:

Σ_{i=1}^{n} 1/3^i = [ Σ_{i=1}^{n} a^i ]_{a=1/3} = [ a(1 - a^n)/(1 - a) ]_{a=1/3} = (1/2)(1 - 1/3^n)

Hence

f(n) = (3^n/2)·(1 - 1/3^n) = (1/2)(3^n - 1) ∈ O(3^n)
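The closed form (3^n - 1)/2 agrees with direct iteration of the recurrence, as this small check (function name ours) illustrates:

```python
def f_iter(n):
    # Iterates f(0) = 0, f(n) = 3 f(n-1) + 1 directly.
    v = 0
    for _ in range(n):
        v = 3 * v + 1
    return v

for n in range(10):
    assert f_iter(n) == (3**n - 1) // 2   # the closed-form solution
```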
SOLVING HIGHER ORDER RECURRENCES USING THE CHARACTERISTIC EQUATION
This is a technique which can be applied to linear recurrences of
arbitrary order but is less general than some of the techniques
illustrated for first order recurrences in that we will henceforth
assume that
• anything multiplying an instance (for some input size) of the
function is a constant
• the recurrence relations are homogeneous -- they don’t
contain any term that isn’t a multiple of an instance of the
function we are trying to obtain a 'closed' solution for
Hence, we are here trying to solve kth order recurrences of the
general form
a_k f(n) + a_(k-1) f(n-1) + ... + a_0 f(n-k) = 0
where the {a_i} are constants.
(Putting everything on the l.h.s. and equating it to zero rather than
writing 'f(n) = ...' is a convenience, as will be seen later.)
Try the solution f(n) = α^n (see Brassard and Bratley p.65 for a
justification of this choice):
a_k α^n + a_(k-1) α^(n-1) + ... + a_0 α^(n-k) = 0
α^(n-k) can be seen to be a common factor:
α^(n-k) ( a_k α^k + a_(k-1) α^(k-1) + ... + a_0 ) = 0
Setting the bracketed polynomial to zero gives the characteristic
equation.
We are not interested in the trivial solution α = 0, as this would
imply f(n) = 0 -- in an algorithmic context, that the work needed
for an input of any size n was zero -- but only in the non-trivial
solutions of the characteristic equation above.
The characteristic equation is a polynomial of degree k and would
in general be expected to have k distinct roots (the case where
some of these are not distinct is considered later).
Assuming that the k roots ρ_1, ρ_2, ..., ρ_k of the characteristic
equation are all distinct, any linear combination of {ρ_i^n}

f(n) = Σ_{i=1}^{k} c_i ρ_i^n

(Brassard and Bratley p.65) solves the homogeneous recurrence,
where the {c_i} are constants determined by the k initial conditions
that define the base cases of the recurrence.
Example (k=2):
f(0) = 0
f(1) = 1
f(n) = 3f(n-1) - 2f(n-2),  n > 1

Try a solution of the form f(n) = α^n:
α^n - 3α^(n-1) + 2α^(n-2) = 0
Divide by α^(n-2) to get the characteristic equation
α^2 - 3α + 2 = 0
→ (α - 1)(α - 2) = 0
The roots ρ_1 = 1, ρ_2 = 2 give a general solution of the form
f(n) = c_1·1^n + c_2·2^n
Using the initial conditions:
c_1 + c_2 = 0    (n=0)
c_1 + 2c_2 = 1   (n=1)
→ c_1 = -1, c_2 = 1
→ f(n) = 2^n - 1
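The solution 2^n - 1 can be confirmed against the recurrence itself -- a minimal check (function name ours):

```python
def f_rec(n):
    # f(0) = 0, f(1) = 1, f(n) = 3 f(n-1) - 2 f(n-2).
    prev2, prev1 = 0, 1
    if n == 0:
        return prev2
    for _ in range(n - 1):
        prev2, prev1 = prev1, 3 * prev1 - 2 * prev2
    return prev1

for n in range(12):
    assert f_rec(n) == 2**n - 1   # solution from the characteristic equation
```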
Example (k=3):
f(0) = 1
f(1) = 5
f(2) = 19
f(n) = 7f(n-1) - 14f(n-2) + 8f(n-3),  n > 2

The characteristic equation is
α^3 - 7α^2 + 14α - 8 = 0
→ (α - 1)(α - 2)(α - 4) = 0
Roots ρ_1 = 1, ρ_2 = 2, ρ_3 = 4 give a general solution of the form
f(n) = c_1·1^n + c_2·2^n + c_3·4^n
Using the initial conditions:
c_1 + c_2 + c_3 = 1      (n=0)
c_1 + 2c_2 + 4c_3 = 5    (n=1)
c_1 + 4c_2 + 16c_3 = 19  (n=2)
→ c_1 = -1, c_2 = 1, c_3 = 1
→ f(n) = 4^n + 2^n - 1 ∈ O(4^n)
(Note that under exam conditions you would not be expected to
solve k > 2 examples. This is because it’s only for k=2 that there is
a simple formula for finding the roots of the (then quadratic)
characteristic equation; for k > 2 the roots might be considerably
more difficult to find.)
If the characteristic equation has a root ρ with multiplicity m,
then (Brassard and Bratley p.67)
x_n = ρ^n,  x_n = n·ρ^n,  ...,  x_n = n^(m-1)·ρ^n
are all possible solutions of the recurrence.
In this case the general solution is a linear combination (with k
constant coefficients to be determined by the initial conditions) of
these terms together with those contributed by the other roots of
the characteristic equation.
Example (k=2):
f(0) = 1
f(1) = 5
f(n) = 2f(n-1) - f(n-2),  n > 1

The characteristic equation is
α^2 - 2α + 1 = 0
→ (α - 1)^2 = 0
The repeated root ρ = 1 gives a general solution of the form
f(n) = c_1·1^n + c_2·n·1^n
Using the initial conditions:
c_1 = 1        (n=0)
c_1 + c_2 = 5  (n=1)
→ c_1 = 1, c_2 = 4
→ f(n) = 4n + 1
Example (k=3):
f(0) = 2
f(1) = 9
f(2) = 29
f(n) = 5f(n-1) - 8f(n-2) + 4f(n-3),  n > 2

The characteristic equation is
α^3 - 5α^2 + 8α - 4 = 0
→ (α - 1)(α - 2)^2 = 0
Roots ρ_1 = 1 and ρ_2 = 2 (m=2) give a general solution of the form
f(n) = c_1·1^n + c_2·2^n + c_3·n·2^n
Using the initial conditions:
c_1 + c_2 = 2            (n=0)
c_1 + 2c_2 + 2c_3 = 9    (n=1)
c_1 + 4c_2 + 8c_3 = 29   (n=2)
→ c_1 = 1, c_2 = 1, c_3 = 3
→ f(n) = 3n·2^n + 2^n + 1 ∈ O(n·2^n)
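Again the closed form can be checked mechanically against the recurrence (a sketch; the function name is ours):

```python
def f_rec(n):
    # f(0)=2, f(1)=9, f(2)=29, f(n) = 5 f(n-1) - 8 f(n-2) + 4 f(n-3).
    vals = [2, 9, 29]
    if n < 3:
        return vals[n]
    for _ in range(n - 2):
        vals = [vals[1], vals[2], 5 * vals[2] - 8 * vals[1] + 4 * vals[0]]
    return vals[2]

for n in range(12):
    assert f_rec(n) == 3 * n * 2**n + 2**n + 1   # closed form derived above
```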
Example: Fibonacci numbers

The Fibonacci numbers are a famous sequence
0, 1, 1, 2, 3, 5, 8, 13, 21, ...
defined by the recurrence
Fib(n) = Fib(n-1) + Fib(n-2)
with
Fib(0) = 0
Fib(1) = 1
Patterns incorporating this sequence appear to be widespread in
nature, for example in the florets in the head of a sunflower or in
the growth of pine cones.
[Figures: Helianthus flower, L Shyamal 2006; young cones of a
Colorado spruce, http://www.noodlesnacks.com 2008]
The sequence was first introduced to Europe in a treatise on the
growth of rabbit populations by Leonardo Fibonacci in 1202
(though it may have been known in India for much longer).
The size of the Fibonacci numbers grows exponentially -- the
49th number is 7,778,742,049, more than the total number of
people in the world.
We can show the exponential growth of the Fibonacci numbers by
solving the recurrence relation for the sequence values Fib(n)
using the characteristic equation method.
The general-n case is
Fib(n) - Fib(n-1) - Fib(n-2) = 0
with associated characteristic equation
α^2 - α - 1 = 0
Unfortunately this doesn't factorise neatly with integer solutions
like the previous examples, and it's necessary to use the formula
for the roots of a quadratic equation to give the solutions

ρ_{1,2} = (1 ± √5)/2

and hence a general solution for the Fibonacci sequence values of
the form
Fib(n) = c_1·ρ_1^n + c_2·ρ_2^n
Using the base case values Fib(0) = 0, Fib(1) = 1:
c_1 + c_2 = 0            (n=0)
c_1·ρ_1 + c_2·ρ_2 = 1    (n=1)
→ c_1 = 1/√5, c_2 = -1/√5
and hence

Fib(n) = (1/√5)·((1+√5)/2)^n - (1/√5)·((1-√5)/2)^n = (1/√5)·(φ^n - φ̂^n)

in which
φ = (1+√5)/2 ≈ 1.61803,  φ̂ = -1/φ ≈ -0.61803
It can easily be seen that
Fib(n) ∈ O(φ^n)
since, as |φ̂| < 1, the contribution from the oscillating second part
of the solution falls to zero very rapidly as n increases.
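This closed form (often called Binet's formula) can be checked in floating point against the recurrence; rounding absorbs the tiny φ̂^n term. A sketch, with all names ours:

```python
import math

def fib(n):
    # Iterates Fib(0) = 0, Fib(1) = 1, Fib(n) = Fib(n-1) + Fib(n-2).
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

phi = (1 + math.sqrt(5)) / 2
phi_hat = (1 - math.sqrt(5)) / 2
for n in range(30):
    closed = (phi**n - phi_hat**n) / math.sqrt(5)
    assert round(closed) == fib(n)   # closed form matches the recurrence
```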
Rabbits really do breed like rabbits!
(Though Fibonacci assumed, among other things, that they were
also immortal and hence his model of rabbit population growth
would not nowadays be thought very realistic.)
But does this necessarily mean that computing the Fibonacci
numbers will take an exponentially growing amount of time?
Consider first a naïve procedure based directly on the definition:
ALGORITHM Fib1(n)
// Calculates the nth Fibonacci number for integer n ≥ 0
if n ≤ 1
    return n
else
    return Fib1(n-1)+Fib1(n-2)
Taking addition here as the elementary operation the cost F(n) in
terms of the number of additions can be seen to be
F(0) = 0
F(1) = 0
F(n) = F(n-1) + F(n-2) + 1
Unfortunately this is an inhomogeneous second order recurrence,
and the characteristic equation method presented here doesn't
extend to these cases.
(In fact it does fall within the slightly expanded range of functional
forms considered in Brassard and Bratley (p.68) but we will not
cover these more general methods of solution.)
But there's a trick: notice that if we define G(n) = F(n)+1 then the
recurrence relation for the number of additions becomes
G(0) = 1
G(1) = 1
G(n) = G(n-1) + G(n-2)
which is homogeneous and so soluble using our usual methods.
Even better, it's not actually necessary to solve this recurrence as
it can also be observed that G(n) = Fib(n+1) and hence that
F(n) = G(n) - 1 = Fib(n+1) - 1 = (1/√5)·(φ^(n+1) - φ̂^(n+1)) - 1
So it is the case that F(n) ∈ O(φ^n) -- the amount of work required
by the definition-derived algorithm grows exponentially in the same
way as the sequence values Fib(n) themselves.
However it's possible to do better.
It might have been predicted that Fib1 was not an efficient
algorithm from inspection of its structure. Working through some
examples for small n (such as the n=5 case on p.82 of Levitin) will
convince you there is massive inefficiency in the recomputation of
already-known values -- for example in the case of n=5 Fib1(4) is
computed four times over.
There is a simple iterative alternative that takes only linear time as
measured by the number of integer additions, storing already-known
values in an array A[0..n]:
ALGORITHM Fib2(n)
// Calculates the nth Fibonacci number for integer n ≥ 0
A[0] <− 0
A[1] <− 1
for i <− 2 to n do
    A[i] <− A[i-1]+A[i-2]
return A[n]
This clearly does only n-1 additions, so is in O(n).
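In Python, the iterative algorithm instrumented to count additions bears out the n-1 figure (the counting is our addition to Fib2, not part of the original):

```python
def fib2(n):
    # Bottom-up Fibonacci storing values in a table A[0..n];
    # returns (Fib(n), number of integer additions performed).
    A = [0, 1]
    additions = 0
    for i in range(2, n + 1):
        A.append(A[i - 1] + A[i - 2])
        additions += 1
    return A[n], additions

value, adds = fib2(10)
assert value == 55
assert adds == 9          # n - 1 additions for n = 10
```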
BUT -- be careful -- though the second algorithm will certainly be
preferable, considering additions in either case as having unit cost
may not be realistic because of the size of the numbers the
Fibonacci sequence generates.
It's the same situation as for the factorial function recurrence,
found to be O(n) in terms of integer multiplications but where the
integers in question would quickly become very large.
Realistically one would need to think about single-bit operations,
the size of an instance n then being given as 1 + ⌊log_2 n⌋ bits.
CHANGE OF VARIABLE
Consider the problem of raising a number, a, to some power n.
The naïve algorithm for doing this is
ALGORITHM Exp1(a,n)
// Computes a^n for positive integer n
if n=1
    return a
else
    return a*Exp1(a,n-1)
Exp1 has the same general structure as the factorial function we
looked at earlier and by a similar argument can be demonstrated
also to be O(n) (counting multiplications at unit cost).
Can we improve on this simple exponentiation algorithm?
Consider the following sequence for computing a^32:
a → a^2 → a^4 → a^8 → a^16 → a^32
Each term is obtained by squaring the previous one, so only 5
multiplications are required, not 31.
It's possible to base an alternate algorithm on this successive
squaring idea:
ALGORITHM Exp2(a,n)
// Computes a^n for positive integer n
if n=1
    return a
else
    if even(n)
        return [Exp2(a,n/2)]^2
    else
        return a*[Exp2(a,(n-1)/2)]^2
(the divisions in each case give an integer result)
Exp2(a,n) gets broken into smaller (Exp2) instances and then
squared to give the required result.
Let E(n) be the work required to compute Exp2(a,n) and use
multiplication as the unit-cost elementary operation:
E(1) = 0
E(n) = E(n/2) + 1        if n even  (squaring of the smaller-input instance)
E(n) = E((n-1)/2) + 2    if n odd   (squaring of the smaller-input
                                     instance, plus multiplication of
                                     the result by a)
Suppose for the moment that n is not only even but is a power
of 2. Make the change of variable
n → 2^k
Then the even-n recurrence takes the form
E(2^k) = E(2^k / 2) + 1 = E(2^(k-1)) + 1
or more compactly, writing E(2^k) ≡ E_k,
E_k - E_(k-1) = 1
This can now easily be solved by the ladder method:
  E_k     - E_(k-1) = 1
+ E_(k-1) - E_(k-2) = 1
+ ...
+ E_1     - E_0     = 1
-----------------------   (k equations)
  E_k     - E_0     = k
So
E_k = E_0 + k,  where E_0 ≡ E(2^0) = E(1) = 0
→ E(2^k) = k
Since
n = 2^k ⇔ k = log_2 n
we have
E(n) = log_2 n
Thus E(n) ∈ O(log n | n is a power of 2) -- an example of
conditional asymptotic notation. (Remember that bases of logs
can be dropped under ‘O’.)
It can be shown (Brassard and Bratley, p.46) that
f(n) ∈ O( g(n) | n is a power of 2 ) ⇒ f(n) ∈ O(g(n))
if
(a) g is eventually non-decreasing
(b) g(2n) ∈ O( g(n) ) -- 'smoothness property'
log n is an increasing function for all n, and since
log(2n) = log 2 + log n ∈ O(log n)
we can say finally that for all n
E(n) ∈ O(log n)
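A direct transcription of Exp2, counting multiplications, bears out the logarithmic growth (the counter and the tuple return convention are ours, not part of the original algorithm):

```python
import math

def exp2(a, n):
    # Exponentiation by successive squaring; returns (a**n, multiplications).
    if n == 1:
        return a, 0
    if n % 2 == 0:
        v, m = exp2(a, n // 2)
        return v * v, m + 1          # one squaring
    v, m = exp2(a, (n - 1) // 2)
    return a * v * v, m + 2          # squaring plus a multiply by a

v, m = exp2(3, 32)
assert v == 3**32 and m == 5         # a -> a^2 -> a^4 -> a^8 -> a^16 -> a^32

for n in range(2, 200):
    v, m = exp2(2, n)
    assert v == 2**n
    assert m <= 2 * math.floor(math.log2(n))   # E(n) grows like log n
```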
Exp2 is thus significantly more efficient, as n grows, than the naïve
algorithm Exp1 -- though it is always advisable to check that the
computational overheads in implementing a more sophisticated
algorithm don’t outweigh the benefits for the size of input instances
you are likely to encounter.
Example:
f(1) = 1
f(n) = 4f(n/2) + n^2,  n > 1

Change variables n → 2^k to obtain the recurrence
f_k = 4f_(k-1) + 4^k
Make the further substitution (because of the multiplier ‘4’ in front
of f_(k-1))
f_k = 4^k g_k,  f_0 = g_0
→ 4^k g_k = 4·4^(k-1) g_(k-1) + 4^k
Dividing by 4^k and constructing the ladder:
  g_k     - g_(k-1) = 1
+ g_(k-1) - g_(k-2) = 1
+ ...
+ g_1     - g_0     = 1
-----------------------
  g_k     - g_0     = k
→ g_k = g_0 + k
(where k and n are related by n = 2^k, k = log_2 n)
Multiplying the solution for g by 4^k:
f_k = 4^k f_0 + 4^k·k
→ f(n) = n^2 f(1) + n^2 log_2 n
(f_0 = f(2^0) = f(1); 4^k = (2^k)^2 = n^2)
f(n) ∈ O(n^2 log n | n is a power of 2)
Since n^2 log n is a non-decreasing function for all n > 1 (it has a
turning point between 0 and 1), and
(2n)^2 log(2n) = 4n^2 (log 2) + 4n^2 (log n) ∈ O(n^2 log n)
it can be concluded that for all n
f(n) ∈ O(n^2 log n)
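For powers of two, the solution n^2 (1 + log_2 n) can be verified directly against the recurrence (a sketch; the function name is ours):

```python
def f_rec(n):
    # f(1) = 1, f(n) = 4 f(n/2) + n^2, for n a power of 2.
    if n == 1:
        return 1
    return 4 * f_rec(n // 2) + n * n

for k in range(10):
    n = 2**k
    assert f_rec(n) == n * n * (1 + k)   # f(n) = n^2 (1 + log2 n)
```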
Example:
f(1) = 1
f(2) = 2
f(n) = 4f(n/2) - 4f(n/4),  n > 2

Change variables n → 2^k (noting that as n/2 → 2^(k-1), n/4 → 2^(k-2))
to obtain the recurrence for the new variable k
f_k = 4f_(k-1) - 4f_(k-2)
This is a homogeneous 2nd order recurrence, and should be
solved using the characteristic equation method.
The characteristic equation for the above recurrence (variable k) is
α^2 - 4α + 4 = 0
→ (α - 2)^2 = 0
The repeated root ρ = 2 gives the general solution for k
f_k = c_1·2^k + c_2·k·2^k
In terms of the original variable n (2^k = n, k = log_2 n), this
general solution is
f(n) = c_1·n + c_2·n·log_2 n
Is this O(n log n)? It depends on the base cases, which determine
c_1 and c_2:
f(1) = c_1 + c_2·log_2 1 = c_1 = 1              (log_2 1 = 0)
f(2) = 2c_1 + 2c_2·log_2 2 = 2(c_1 + c_2) = 2   (log_2 2 = 1)
→ c_1 = 1, c_2 = 0
→ f(n) = n,  which is O(n), not O(n log n).
Strassen’s algorithm for multiplication

We now look for ways to improve the efficiency of multiplication of
two n-bit numbers -- the methods we have seen so far are O(n^2).
We can split each of the two n-bit numbers to be multiplied into
two n/2-bit halves, writing the first as a·2^(n/2) + b and the second
as c·2^(n/2) + d, where a, b, c and d are n/2-bit numbers. Then

(a·2^(n/2) + b) × (c·2^(n/2) + d) = a·c·2^n + (a·d + b·c)·2^(n/2) + b·d

Multiplying a·c by 2^n is just putting n zeros after the value a·c,
so is O(n); likewise multiplying (a·d + b·c) by 2^(n/2) is putting
n/2 zeros after it, so is also O(n).

Base 10 (decimal) example for n=6 digits:
123456_10 = 123000_10 + 456_10 = 123_10 × 10^3 + 456_10    (a = 123, b = 456)

Let M(n) be the cost of multiplying two n-bit numbers using
Strassen’s algorithm:
M(1) = a_0    (the cost of multiplying 2 bits)
M(n) = 4M(n/2) + a_1·n + a_2
where the term 4M(n/2) covers the 4 multiplications of pairs of
n/2-bit numbers (a·c, b·d, a·d and b·c), and a_1·n + a_2 covers all
other arithmetic operations -- note that addition of two n-bit
numbers is O(n).
This recurrence is similar to the one on p.47 and we proceed
similarly:
First, set n = 2^k, M(2^k) ≡ M_k to get
M_k = 4M_(k-1) + a_1·2^k + a_2
Now set M_k = 4^k N_k, M_0 = N_0 (this is just doing both necessary
substitutions in one step, a shortened form of the method in the
previous example):
4^k N_k = 4·4^(k-1) N_(k-1) + a_1·2^k + a_2
Divide by 4^k and construct the ladder:
  N_k     - N_(k-1) = a_1/2^k + a_2/4^k
+ N_(k-1) - N_(k-2) = a_1/2^(k-1) + a_2/4^(k-1)
+ ...
+ N_1     - N_0     = a_1/2^1 + a_2/4^1
----------------------------------------------------
  N_k     - N_0     = Σ_{i=1}^{k} ( a_1/2^i + a_2/4^i )

Multiplying by 4^k (with M_0 = M(1) = a_0):

M_k = 4^k ( a_0 + a_1 Σ_{i=1}^{k} (1/2)^i + a_2 Σ_{i=1}^{k} (1/4)^i )
From the earlier lecture on geometric series,

Σ_{i=1}^{k} (1/2)^i = (1/2)(1 - (1/2)^k) / (1 - 1/2) = 1 - 1/2^k < 1

Σ_{i=1}^{k} (1/4)^i = (1/4)(1 - (1/4)^k) / (1 - 1/4) = (1/3)(1 - 1/4^k) < 1/3

Hence
M(2^k) < 4^k ( a_0 + a_1 + a_2/3 )
→ M(n) < n^2 ( a_0 + a_1 + a_2/3 )
Since n^2 is non-decreasing, and (2n)^2 ∈ O(n^2), we can say that
M(n) ∈ O(n^2).
But this is the same time-complexity as shift-and-add multiplication
and the à la Russe method, so at first sight Strassen’s algorithm
looks no better than our previous ones.
However if we use the following observation, due to Strassen,
(a - b)·(d - c) = (a·d + b·c) - a·c - b·d
we can compute a·c, b·d and (a·d + b·c) in just three
multiplications:
a·c, b·d and (a-b)·(d-c)
(we assume subtractions are of negligible (O(n)) cost compared to
multiplications).
Repeating the above analysis with
M(n) = 3M(n/2) + a_1·n + a_2
we end up with

M(2^k) = 3^k ( a_0 + a_1 Σ_{i=1}^{k} (2/3)^i + a_2 Σ_{i=1}^{k} (1/3)^i )

where
Σ_{i=1}^{k} (2/3)^i = 2(1 - (2/3)^k) < 2
Σ_{i=1}^{k} (1/3)^i = (1/2)(1 - (1/3)^k) < 1/2
and hence
M(2^k) < 3^k ( a_0 + 2a_1 + a_2/2 )
M(n) < n^(log_2 3) ( a_0 + 2a_1 + a_2/2 )
using
3^(log_2 n) ≡ (2^(log_2 3))^(log_2 n) = (2^(log_2 n))^(log_2 3) = n^(log_2 3)
Therefore M(n) ∈ O( n^(log_2 3) | n is a power of 2 ).
So, since n^(log_2 3) is non-decreasing, and
(2n)^(log_2 3) ∈ O( n^(log_2 3) ), this result can be generalised to
all n, and hence
M(n) ∈ O( n^(log_2 3) ) = O( n^1.59... )
This is a time-complexity intermediate between the O(n^2) of the
elementary algorithms and the O(n) lower bound on the
single-bit-operation cost of multiplying two n-bit numbers.
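The three-multiplication scheme can be sketched recursively in Python; arbitrary-precision integers make it a toy here, but the structure mirrors the analysis (split into n/2-bit halves, three recursive multiplications per level). Everything below is our illustrative reconstruction, not code from the course:

```python
import random

def mul3(x, y, n):
    # Multiply two n-bit non-negative integers (n a power of 2) using
    # three recursive multiplications per level: a.c, b.d and (a-b).(d-c).
    if n == 1:
        return x * y
    half = n // 2
    a, b = x >> half, x & ((1 << half) - 1)    # x = a*2^(n/2) + b
    c, d = y >> half, y & ((1 << half) - 1)    # y = c*2^(n/2) + d
    ac = mul3(a, c, half)
    bd = mul3(b, d, half)
    mid = mul3(abs(a - b), abs(d - c), half)   # |a-b|, |d-c| fit in n/2 bits
    sign = 1 if (a >= b) == (d >= c) else -1   # sign of (a-b)(d-c)
    ad_plus_bc = sign * mid + ac + bd          # (a-b)(d-c) = ad+bc-ac-bd
    return (ac << n) + (ad_plus_bc << half) + bd

for _ in range(100):
    x, y = random.randrange(1 << 16), random.randrange(1 << 16)
    assert mul3(x, y, 16) == x * y
```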
How much further could a Strassen-like method be pushed
toward an O(n) performance?
Dividing into 3 parts of n/3 bits, and using a similar -- but more
complicated -- trick to reduce the number of multiplications in this
case to just 5 (as opposed to the 9 that would naïvely be
required), leads by similar techniques to
M(n) ∈ O( n^(log_3 5) ) = O( n^1.46... )
But relatively speaking this is a much smaller improvement than
was gained by going from the elementary algorithms to breaking
the numbers into two n/2-bit parts, and the associated O(n) costs
(from the 'trick') are greater.
If we try to break the numbers into n/4-bit, n/5-bit, ... parts and use
a Strassen-like method, then the relative improvements each time
get smaller still, and the O(n) overheads become still greater, so
that such elaborate algorithms would be better in practice only for
input sizes so large that they would not really be of use even to
cryptographers. The moral is that the single-minded pursuit of an
algorithm with the best possible asymptotic (n→∞) behaviour may
not always be a sensible thing in practice.