SQUARE 2 MAGAZINE
Number 17. December 2008.
Welcome to our latest edition. We remind you that news about what you are doing should
be sent to the Editor, David Penman, email address [email protected]. Note that the
web address has changed slightly in the not-too-distant past: it is now
http://www.essex.ac.uk/maths/dept/square2/index.shtm.
You will be able to access all copies of the magazine there.
Departmental News.
Several postgraduate students in the Department have recently been approved for their
Ph.D.s. (In some cases, the award may yet be subject to minor corrections to the thesis.)
Rong Gao, who was an undergraduate student in the Department, was recently awarded
her Ph.D. for a thesis entitled “Some colouring problems for pseudo-random graphs”. Dr.
Penman was her supervisor.
Also recently awarded a Ph.D. (to go with an earlier one in geophysics) is Dan Brawn,
for a thesis on fitting gamma distributions to observed drop size distributions. Prof.
Upton was his supervisor.
Tim Earl's thesis was on levels of nutrients in water; he was jointly supervised by
Prof. Upton and Prof. Nedwell (Biology).
Caroline Johnston has also recently received her Ph.D. for a thesis about microarray
analysis. Dr. Harrison was her supervisor.
Wavelets and their applications.
The aim of this article is to introduce, in a rather informal way, the theory of wavelets
which has been of some importance recently. We will also talk about some of the areas in
which these ideas are applied. The main sources culled for this article are the websites
http://www.amara.com/IEEEwave/IEEEwavelet.html and
http://www.pacm.princeton.edu/~ingrid/parlez-vous%20wavelets.pdf for basics, with
some additional corroborative detail taken mainly from the textbook by Daubechies.
These websites suggest further reading too.
Once upon a time there was Fourier analysis. Remember the idea: suppose you are given
a function (in engineer-speak, a signal) $f(t)$, which has period (let us say) $2\pi$ – the
period doesn't really matter, it's just a matter of scaling. What functions can you think of
with period $2\pi$? If you don't think of $\cos(nt)$ and $\sin(nt)$ for integers n, please change to
Media Studies: if you don't then think of linear combinations of the functions, please
take a course in linear algebra. At this point, it becomes rather more justifiable to run out
of ideas: the rough idea is that all “sensible” periodic functions can be obtained as infinite
sums of these functions with appropriate weights. If we are going to write
$$f(t) = a_0 + \sum_{n=1}^{\infty} \big( a_n \cos(nt) + b_n \sin(nt) \big) \qquad (*)$$
then (if one is suitably cavalier about exchanging infinite sums and integrals, and uses
standard trigonometric identities), it is not hard to show that we “must” have (for n>0)
$$a_n = \frac{1}{\pi} \int_0^{2\pi} f(x) \cos(nx)\,dx, \qquad b_n = \frac{1}{\pi} \int_0^{2\pi} f(x) \sin(nx)\,dx, \qquad a_0 = \frac{1}{2\pi} \int_0^{2\pi} f(x)\,dx.$$
One can then show various rigorous results on when the series on the right-hand side of
(*) converges, and whether it converges to $f(t)$: we will mostly avoid concerns of rigour
in this article, but cannot resist mentioning a key (and hard) result of Carleson, proved as
recently as 1966, that for “square-integrable” functions on [0,1] (a class which
(comfortably) includes continuous functions) the partial sums converge almost
everywhere (i.e. except on a negligible (measure zero) set) to the original function. Thus
the rough idea is largely true, and in some sense reduces the study of the (possibly very
complex) function to understanding the simpler set of coefficients $a_n$ and $b_n$. (One can do
better for some special functions: e.g. for a function which is twice differentiable with
continuous second derivative, the Fourier series converges (uniformly) for all values of t,
and we will work through this later.)
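For readers who like to experiment, here is a minimal Python sketch (entirely our own, assuming only numpy; the function name and the grid sizes are arbitrary choices) which computes the coefficients by numerical integration and evaluates a partial sum of (*):

    import numpy as np

    def fourier_partial_sum(f, t, N, n_grid=4096):
        # Coefficients computed as above: a_0 = (1/(2 pi)) * integral of f,
        # a_n = (1/pi) * integral of f(x) cos(nx), and b_n likewise with sin
        x = np.linspace(0.0, 2*np.pi, n_grid, endpoint=False)
        fx = f(x)
        dx = 2*np.pi / n_grid
        s = np.full_like(t, fx.sum() * dx / (2*np.pi))   # the a_0 term
        for n in range(1, N + 1):
            a_n = (fx * np.cos(n*x)).sum() * dx / np.pi
            b_n = (fx * np.sin(n*x)).sum() * dx / np.pi
            s = s + a_n*np.cos(n*t) + b_n*np.sin(n*t)
        return s

    # Example: approximate the continuous 2*pi-periodic function |t - pi|
    t = np.linspace(0.0, 2*np.pi, 9)
    print(fourier_partial_sum(lambda x: np.abs(x - np.pi), t, N=50))

Plotting the output against f itself for increasing N is a pleasant way to watch the convergence (and, for a discontinuous f, the Gibbs phenomenon).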
The next thing one normally looks at is the Fourier transform of a general (not
necessarily periodic) signal $f(t)$. This is (ignoring detailed convergence questions, and up to
small tweakings of sign conventions and constant factors: of course $i = \sqrt{-1}$)
$$(\mathcal{F}f)(\omega) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} f(t)\, e^{-i\omega t}\,dt.$$
This is the same idea: it is changing round to representing the frequency content of the
function f. Note that if f is only non-zero on $[0, 2\pi]$ then there is a very simple
relationship between the numbers $(\mathcal{F}f)(n)$ for integers n and the $a_n$ and $b_n$ in the Fourier
series before.
A feature of Fourier analysis is that the Fourier transform of a highly peaked function
tends to be flat, and vice versa. For example, if $f(x) = e^{-ax^2}$ is (a constant times) the
probability density function of a normal with mean 0 and variance $1/(2a)$, a function which
gets more and more sharply peaked as $a \to \infty$, then we can easily check that
$$(\mathcal{F}f)(\omega) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-at^2} e^{-i\omega t}\,dt = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-a(t + i\omega/(2a))^2} e^{a(i\omega/(2a))^2}\,dt = \frac{1}{\sqrt{2\pi}}\, e^{-\omega^2/(4a)} \int_{-\infty}^{\infty} e^{-a(t + i\omega/(2a))^2}\,dt = \frac{1}{\sqrt{2a}}\, e^{-\omega^2/(4a)},$$
where you are urged to check the details of the argument; familiarity with simple
properties of normals should see you through. This of course will be very flat as $a \to \infty$.
This feature can be problematic: it is not easy to read off information about “blips” in a
graph from the Fourier coefficients. (A blip is obvious in the graph of the original
function/signal, but since the transform of the blip will be nearly flat, it will be hard to
detect if you are looking only at the coefficients.)
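If you would rather see the peaked-versus-flat phenomenon numerically than verify the contour-shifting by hand, here is a little Python check (our own throwaway script; the truncation of the integral to [-50, 50] and the grid size are arbitrary choices):

    import numpy as np

    a = 10.0                      # large a: sharply peaked f(x) = exp(-a x^2)
    t = np.linspace(-50.0, 50.0, 200001)
    for w in (0.0, 1.0, 2.0):
        # (Ff)(w) = (1/sqrt(2 pi)) * integral of exp(-a t^2) exp(-i w t) dt
        numeric = np.trapz(np.exp(-a*t**2) * np.exp(-1j*w*t), t) / np.sqrt(2*np.pi)
        exact = np.exp(-w**2 / (4*a)) / np.sqrt(2*a)
        print(w, numeric.real, exact)

The numerical and closed-form values agree, and barely change as w moves: the transform of the peaked Gaussian really is nearly flat.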
Partly, this is dealt with by “time-localisation”: “windowing” the function by
choosing a “window function” g and calculating the windowed Fourier transform
$$(\mathcal{F}^{\mathrm{win}} f)(\omega, t) = \int_{-\infty}^{\infty} f(s)\, g(s - t)\, e^{-i\omega s}\,ds.$$
The naïve guess for the window function g might well be a function which is (say) one on
some interval [a,b] and zero elsewhere, but this upsets the smoothness of the functions,
and so smoother window functions are desirable. Certainly it is desirable that both $g$ and
$\hat{g}$ should be concentrated near zero, as then one can (informally) say that
$(\mathcal{F}^{\mathrm{win}} f)(\omega, t)$ will provide a description of the function near time t and frequency $\omega$.
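In code, the windowed transform is just one extra factor in the integrand. A minimal sketch (ours; the Gaussian window and the name windowed_ft are purely for illustration):

    import numpy as np

    def windowed_ft(f_vals, ts, omega, t0, sigma=1.0):
        # Approximates the integral of f(s) g(s - t0) exp(-i omega s) ds on a
        # grid, using the smooth window g(s) = exp(-s^2 / (2 sigma^2))
        g = np.exp(-((ts - t0)**2) / (2.0 * sigma**2))
        dt = ts[1] - ts[0]
        return np.sum(f_vals * g * np.exp(-1j * omega * ts)) * dt

    # A burst of frequency 5 near s = 0 shows up at (omega, t0) = (5, 0)
    # but not at t0 = 10, which is the whole point of windowing
    ts = np.linspace(-20.0, 20.0, 40001)
    f_vals = np.cos(5.0 * ts) * np.exp(-ts**2)
    print(abs(windowed_ft(f_vals, ts, omega=5.0, t0=0.0)),
          abs(windowed_ft(f_vals, ts, omega=5.0, t0=10.0)))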
We now turn (at last…) to the wavelet transform. The absolutely basic summary of a
wavelet transform is that it allows one to cut up data into different frequency
components, and then analyse each component according to its scale. In mathematical
terms, we will have
$$(\mathcal{F}^{\mathrm{wav}} f)(\varpi, t) = \frac{1}{\sqrt{\varpi}} \int_{-\infty}^{\infty} f(s)\, \psi\!\left(\frac{s - t}{\varpi}\right) ds$$
for a well-chosen function $\psi(t)$ with the property that $\int_{-\infty}^{\infty} \psi(t)\,dt = 0$. We deliberately
obfuscate for now the choice of $\psi$, which rejoices in the (sexist) title of the mother
wavelet, but the flexibility will be part of the point. The functions
$$\psi_{a,b}(s) = \frac{1}{\sqrt{a}}\, \psi\!\left(\frac{s - b}{a}\right)$$
are the eponymous wavelets. The rough idea is that the time
analysis is carried out with a contracted, high-frequency version of the wavelet and
frequency analysis with a dilated, low-frequency version of the wavelet. Crudely, one can
see both the wood and the trees.
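To make the formula concrete, here is a small Python sketch of the transform by direct numerical integration (ours, not a library routine), using the “Mexican hat” – the second derivative of a Gaussian, which visibly integrates to zero – as the mother wavelet:

    import numpy as np

    def mexican_hat(t):
        # A standard choice of mother wavelet: (1 - t^2) exp(-t^2 / 2);
        # its integral over the whole line is zero, as required
        return (1.0 - t**2) * np.exp(-t**2 / 2.0)

    def wavelet_transform(f_vals, ts, scale, shift):
        # Approximates (1/sqrt(scale)) * integral f(s) psi((s - shift)/scale) ds
        dt = ts[1] - ts[0]
        return np.sum(f_vals * mexican_hat((ts - shift) / scale)) * dt / np.sqrt(scale)

    # Small scales respond to fine detail, large scales to broad trends
    ts = np.linspace(-10.0, 10.0, 20001)
    f_vals = np.exp(-ts**2 / 0.01)          # a narrow "blip" at s = 0
    for scale in (0.1, 1.0, 10.0):
        print(scale, wavelet_transform(f_vals, ts, scale, shift=0.0))

Running this, the response is largest at the scale comparable to the width of the blip: the transform is looking at the signal through lenses of different magnification.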
There are both similarities and differences between the wavelet transform and the Fourier
transform. Both involve simplifying (in some sense) the description of a function by
turning it into a simpler (in some ways) object. Both are, for those of a pure-mathematical bent, inner products of the function f being analysed with a family of
functions with two labels. Both have well-defined, and well-understood, inversion
formulae. Very importantly from the point of view of applications, both have discretized
versions (the “discrete Fourier transform” and “discrete wavelet transform”) to use in
practice, and even more importantly, these can be computed quickly and efficiently. (In
the case of the Fourier transform, this is the Fast Fourier transform, which could easily
fill an edition of Square2 all by itself: for now, we merely note that the key idea in
speeding up the process, due to Cooley and Tukey in the 1960s, of expressing the discrete
Fourier transform of “size” $N = N_1 N_2$ (we omit the detailed definition of size) in terms of
discrete Fourier transforms of “size” $N_1$ and $N_2$, was really little more than a trick used
by Gauss in 1805 to interpolate the orbits of the asteroids Pallas and Juno. In fact the idea
had also been discovered before Cooley and Tukey by Good and Yates in the context of
experimental design and by Danielson and Lanczos in the context of X-ray scattering: see
http://www.wisdom.weizmann.ac.il/~naor/COURSE/fft-lecture.pdf.
We are digressing…)
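To give the flavour of the Cooley–Tukey idea in its simplest case ($N_1 = 2$, $N_2 = N/2$, applied recursively), here is a toy Python sketch (ours, a teaching toy rather than an industrial-strength FFT):

    import cmath

    def fft(x):
        # Radix-2 Cooley-Tukey: a DFT of size N from two DFTs of size N/2;
        # len(x) must be a power of 2
        n = len(x)
        if n == 1:
            return list(x)
        even = fft(x[0::2])                    # DFT of the even-indexed terms
        odd = fft(x[1::2])                     # DFT of the odd-indexed terms
        out = [0j] * n
        for k in range(n // 2):
            w = cmath.exp(-2j * cmath.pi * k / n)   # "twiddle factor"
            out[k] = even[k] + w * odd[k]
            out[k + n // 2] = even[k] - w * odd[k]
        return out

    print(fft([1, 2, 3, 4]))   # matches the size-4 discrete Fourier transform

Splitting like this at every level gives about N log N operations rather than the N² of the naive method, which is the whole point.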
There are, however, also key differences. A principal one is that wavelet functions are
localized in space, whereas the sine and cosine functions used in Fourier analysis are not.
This means that often, when we take the wavelet transform, we end up with functions
which are, in some sense, “sparse” – this in turn leads to the applications we shall discuss
below.
Historically, the first wavelet appeared in (an appendix to) Haar's thesis in
1909. (Yes, the same Haar, for those in the know, as the invariant measure on locally
compact topological groups: see
http://www-groups.dcs.st-and.ac.uk/~history/Biographies/Haar.html).
By about the 1980s, the mathematician Yves Meyer was interested in the problem, as was the
geophysicist Jean Morlet (who was trying to model seismic phenomena – an archetypal
home of “sudden shock” behaviour). Indeed Meyer constructed a wavelet with two good
properties: smoothness (the mother function is, basically, differentiable infinitely often)
and orthogonality (i.e. the actual wavelets are orthogonal functions). Eventually the
Belgian mathematician Ingrid Daubechies extended the ideas of Meyer to get a family of
wavelets which were not only smooth and orthogonal, but also had compact support.
These points are very helpful in getting the theory to work in a slick way.
What does one do with these things? One absolutely basic idea is to use them to “get rid
of noise”. This follows on from the point above about being able to look at things on the
scale appropriate to them. Take a wavelet transform of your original signal (function).
Take the attitude that wavelet coefficients (the discretization referred to above is tacitly
being used here) of magnitude less than a certain magic number – the “threshold” – are just “noise” and
should be set equal to zero. Take the inverse wavelet transform – hopefully you should
have a “cleaned-up” version of the signal/function which you are better placed to look at.
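Here, as a toy illustration of the recipe (entirely our own sketch; real applications use deeper decompositions and cleverer thresholds), is a one-level Haar-wavelet denoiser in Python:

    import numpy as np

    def haar_denoise(x, threshold):
        # One level of the Haar transform: averages ("smooth") and
        # differences ("detail"), each scaled by 1/sqrt(2)
        s = (x[0::2] + x[1::2]) / np.sqrt(2.0)
        d = (x[0::2] - x[1::2]) / np.sqrt(2.0)
        d[np.abs(d) < threshold] = 0.0   # small detail coefficients are "noise"
        # Invert the transform
        y = np.empty_like(x)
        y[0::2] = (s + d) / np.sqrt(2.0)
        y[1::2] = (s - d) / np.sqrt(2.0)
        return y

    rng = np.random.default_rng(1)
    clean = np.sin(np.linspace(0.0, 6.0, 256))
    noisy = clean + 0.1 * rng.standard_normal(256)
    print(np.abs(noisy - clean).mean(),
          np.abs(haar_denoise(noisy, 0.2) - clean).mean())

The second number printed (the error after thresholding) comes out smaller than the first: the smooth signal survives, while much of the noise lived in the discarded detail coefficients.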
This applies in particular in music: for example, researchers at Yale University took a
very old recording (made by the inventor Thomas Edison) of Brahms playing, in 1889,
one of his Hungarian Dances.[1] It was recorded on a wax cylinder which partially melted:
this, plus the low-tech equipment available at the time, made it extremely hard to hear Brahms. With
the wavelet techniques, it became much clearer. Similarly, one can remove noise from
visual images - the FBI in the United States is using a wavelet-based standard for
computerising its fingerprint files (a difficult process, as obviously the detail in the
images cannot be lost but at the same time the image has to be compressed if it is to be
sent by computer).

[1] In general, it is striking how many of the great composers of the late 19th century and very early 20th century did leave some, albeit usually rather rudimentary, recordings in which they play, or record, their own works. The list includes, in addition to Brahms, (in no particular order) Mahler, Fauré, Verdi… Slightly later, there are various recordings by Debussy and Richard Strauss: Elgar conducted some of his pieces more than once (with notably different interpretations on the two occasions…). Some of the famous performers from early recording technology include the violinist Joseph Joachim (for whom Brahms wrote his Violin Concerto). You can apparently hear the Brahms recording we talk about in the main article at http://www.youtube.com/watch?v=BZXL3I7GPCY Note that crackly old recordings also extend to (e.g.) Browning and Tennyson reading their own poetry: see http://www.poetryarchive.org/poetryarchive/historicRecordings.do We are digressing…
An interesting recent idea is an attempt to distinguish the paintings of Van Gogh from
those of other famous painters by examining the wavelet transforms of the paintings. (A
painting is of course a 2-dimensional signal, if one measures each colour by (say) its
frequency). It has been a cliché amongst many people interested in art history for some
time that Van Gogh’s paintings “feel different” from those of most other painters. See
http://www.pacm.princeton.edu/~ingrid/VG_swirling_movie/ for this: there does seem to
be some evidence that Van Gogh’s paintings are revealed as “different” from those of a
number of other painters. Quite how far this idea will run is not yet clear, but again the
basic idea of removing unimportant noise to find “hard-core structure” seems, at least in
this case, to yield some suggestive results.
Another key area is in quantum mechanics – indeed, one of the original impetuses for the
development of the theory came from the effort to understand coherent states. For a
comparatively straightforward example in this direction, look at
http://arxiv.org/abs/cond-mat?papernum=9511063
Applications of any particular piece of theory, of course, include applications in other
parts of mathematics. We indicate only two areas where wavelet ideas have been used.
One of these is in so-called discrepancy theory. Suppose we have a collection of n points
$\{u_i\}_{i=1}^n$ in the unit square in 2 dimensions, so that $u_i = (u_i^{(1)}, u_i^{(2)})$ with $u_i^{(j)} \in [0,1]$. How
well spread out can these points be: or, essentially equivalently, how far can the number
of them in a given rectangle vary from the number that one would expect to find in that
rectangle? For $\alpha = (\alpha_1, \alpha_2) \in [0,1] \times [0,1]$ define
$$D(\alpha) = \big| \{\, u_i : u_i \in [0, \alpha_1] \times [0, \alpha_2] \,\} \big| - n \alpha_1 \alpha_2,$$
i.e. the actual number in the bottom left-hand corner rectangle minus the number one
would expect to be in that rectangle if they were evenly spread out. The basic lower
bound estimate on this problem was proved by K. F. Roth: he showed that
$$\log(n) = O\!\left( \int_0^1 \!\! \int_0^1 D(\alpha)^2 \, d\alpha_1 \, d\alpha_2 \right).$$
Montgomery (http://www.nato-us.org/analysis2000/papers/montgomery.pdf) states that
the proof is “a construction reminiscent of wavelets”, and also refers to work on related
problems by Pollington which does use wavelets.
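For the curious, $D(\alpha)$ is easy to play with numerically; a tiny Python sketch (ours) for random points:

    import numpy as np

    def discrepancy(points, alpha1, alpha2):
        # Number of points in the corner rectangle [0, alpha1] x [0, alpha2],
        # minus the n * alpha1 * alpha2 expected under perfect spread
        inside = np.sum((points[:, 0] <= alpha1) & (points[:, 1] <= alpha2))
        return inside - len(points) * alpha1 * alpha2

    rng = np.random.default_rng(0)
    pts = rng.random((1000, 2))        # 1000 uniform random points in the square
    print(discrepancy(pts, 0.5, 0.5))

For independent uniform random points this quantity fluctuates on the order of $\sqrt{n}$, which is far worse than the $\log$-type behaviour that carefully constructed low-discrepancy point sets can achieve.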
A second area of application to other parts of mathematics is in numerical analysis,
especially solving (partial) differential equations. To some extent the idea of looking on
things at several scales (“multiscale methods”) had already taken hold in the numerical
analysis community, and perhaps the progress here has been less striking than in the
fields of image processing discussed earlier, but the fundamental idea of approximating a
function by a small number of coefficients is of course a staple in approximation theory
and numerical analysis, and there do seem to be some cases where the wavelet mentality
helps.
Problems Corner
Recall our problems from last time:
Problem 1. (This arises from the article on Borsuk’s problem earlier in Edition 16).
Show carefully (using sups etc. correctly) that a 1-dimensional set of diameter 1 can be
written as the union of 2 sets of diameter strictly less than 1. Not hard, but try to make
sure you get all the details in.
Solution. Since the diameter is finite, the set (S, say) must in particular be non-empty.
(Otherwise the supremum in the definition of diameter would be over an empty set and
by convention the supremum of an empty set is minus infinity).
Let $s \in S$: as the diameter is 1, we have $d(z, s) \le \sup_{x, y \in S} d(x, y) = 1$ for all $z \in S$. (Here
of course d is just the normal distance between two points on the line.) In particular
$S \subseteq [s - 1, s + 1]$, so S is bounded. Thus (as S is non-empty and bounded) it has a (finite)
infimum m and a (finite) supremum M. The idea is now basically that we will partition S
into the two sets $S \cap [m, (m + M)/2)$ and $S \cap [(m + M)/2, M]$ (it is obvious that this is
indeed a partition of S) and show that both these sets are indeed of smaller diameter.
Note that $m < M$, as otherwise S would consist only of the common value $m = M$ and so
would not have diameter 1. Next note that, given $\varepsilon > 0$, there are $x, y \in S$ such that
$x \le m + \varepsilon$ and $y \ge M - \varepsilon$; in particular $x - \varepsilon \le m$ and $y + \varepsilon \ge M$. Since $m < M$ we can,
by taking $\varepsilon$ small enough, ensure that $x < y$. Thus
$$d(m, M) \le d(x - \varepsilon, y + \varepsilon) = (y + \varepsilon) - (x - \varepsilon) = y - x + 2\varepsilon = d(x, y) + 2\varepsilon \le 1 + 2\varepsilon,$$
where in the last step we used the fact that S has diameter 1. This holds for all $\varepsilon > 0$, so
we conclude that $d(m, M) \le 1$. Since $m < M$, we have
$$d\left(m, \frac{m + M}{2}\right) = \frac{M - m}{2} \le \frac{1}{2} \quad \text{and} \quad d\left(\frac{m + M}{2}, M\right) = \frac{M - m}{2} \le \frac{1}{2},$$
so each of the two sets in the partition lies in an interval of length at most 1/2, and hence
has diameter at most 1/2, strictly less than 1. All claims have now been proved.
Problem 2. (Again, arising from the article.) Prove that, for non-negative integers $k \le n$,
we have
$$\binom{n}{k} \le \left(\frac{ne}{k}\right)^{\!k}.$$
[Hint: First prove (easily) that $\binom{n}{k} \le \frac{n^k}{k!}$. Then think about the series for the exponential and
how it might be relevant.]
Solution. For the first step, note that, just crudely bounding each $n - j$ above by n, we
have
$$\binom{n}{k} = \frac{n!}{k!\,(n-k)!} = \frac{n(n-1)(n-2)\cdots(n-k+1)\,(n-k)!}{k!\,(n-k)!} \le \frac{n \cdot n \cdot n \cdots n \,(n-k)!}{k!\,(n-k)!} = \frac{n^k}{k!}.$$
Thus it remains to prove that
$$\frac{n^k}{k!} \le \left(\frac{ne}{k}\right)^{\!k}, \quad \text{or equivalently (noting that $n$ and $k$ are non-negative) that} \quad \frac{k^k}{k!} \le e^k.$$
Now recall that $e^x = \sum_{n=0}^{\infty} \frac{x^n}{n!}$. Thus, taking $x = k$, one of the terms in the series for $e^k$ is
$\frac{k^k}{k!}$. Since all the terms in the series are non-negative, we deduce $\frac{k^k}{k!} \le e^k$ as required.
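(Those who distrust inequalities until a computer has nodded along can run the following quick Python spot-check, which of course verifies only the cases tried:)

    from math import comb, e

    # Spot-check binom(n, k) <= (n*e/k)^k for 1 <= k <= n <= 60
    assert all(comb(n, k) <= (n * e / k) ** k
               for n in range(1, 61) for k in range(1, n + 1))
    print("bound verified on all cases tried")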
Problem 3. (Unsolved problem.) This is, to the best of my knowledge, still unsolved.
New Problems
As usual, we do not want to know about your solutions to Problems 1 and 2, which are
standard (in a sense). Your valid solutions to Problem 3 are very welcome, but the Editor
will not be holding his breath.
Problem 1. (This arises from the article about Fourier analysis/wavelets earlier.) Show
that a twice continuously differentiable function (on $[-\pi, \pi]$, say)[2] has a Fourier series
which is convergent everywhere. Somewhat more precisely, show that if f is a function
on $[-\pi, \pi]$ and
$$a_k = \frac{1}{\pi} \int_{-\pi}^{\pi} f(t) \cos(kt)\,dt \quad \text{for } k \ge 0, \qquad b_k = \frac{1}{\pi} \int_{-\pi}^{\pi} f(t) \sin(kt)\,dt \quad \text{for } k \ge 0,$$
then we have that, for any $x \in [-\pi, \pi]$,
$$\frac{a_0}{2} + a_1 \cos(x) + a_2 \cos(2x) + \dots + b_1 \sin(x) + b_2 \sin(2x) + \dots$$
is a convergent series and that it converges to the original function f. You should adapt
the level of rigour of your proof to the level you can cope with, and if you know about
uniform convergence you should show that the convergence is also uniform.
Hint: Use integration by parts (those acquainted with rigour: why is this justified?) to
show that having a continuous derivative implies
$$\pi a_k = -\frac{1}{k} \int_{-\pi}^{\pi} f'(t) \sin(kt)\,dt,$$
and that having a continuous second derivative implies that
$$\int_{-\pi}^{\pi} f'(t) \sin(kt)\,dt = \frac{1}{k} \int_{-\pi}^{\pi} f''(t) \cos(kt)\,dt.$$
(For the boundary terms in the second step to vanish, think of f as extended
$2\pi$-periodically, so that $f'(-\pi) = f'(\pi)$.)
Now let $M = \max_{[-\pi, \pi]} |f''(t)|$. (Why does this maximum exist?) Deduce an upper bound on
$\int_{-\pi}^{\pi} f''(t) \cos(kt)\,dt$, and hence show that $|a_k| \le \frac{2M}{k^2}$. A similar bound holds for $|b_k|$: now how do
we deduce that the series is convergent? (Rigourists: why is it uniformly convergent?)
You should note that we have not yet proven that the series converges to the original function
– for this, the usual approach is to take the Fourier series of the function
$$g(x) = \frac{a_0}{2} + a_1 \cos(x) + a_2 \cos(2x) + \dots + b_1 \sin(x) + b_2 \sin(2x) + \dots,$$
which we have just shown is convergent, and show that its Fourier coefficients are the same as
those of the original f. Now think why this should (with the above restrictions on the
functions) lead to the two functions being equal. Those doing it rigorously will have a
number of things to justify en route….

[2] When one has the result for this interval, it is obviously very easy to move it to any other interval of your choice.
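Again for experimenters: a quick numerical look at the decay of the coefficients, in Python. The test function is our own choice, picked so that its periodic extension really is twice continuously differentiable:

    import numpy as np

    # f(t) = (t^2 - pi^2)^2: f and f' agree at the endpoints -pi and pi
    t = np.linspace(-np.pi, np.pi, 400001)
    f = (t**2 - np.pi**2)**2
    # max of |f''(t)| = |12 t^2 - 4 pi^2| on [-pi, pi] is 8 pi^2, at t = +-pi
    M = 8.0 * np.pi**2
    for k in (1, 2, 4, 8, 16):
        a_k = np.trapz(f * np.cos(k * t), t) / np.pi
        print(k, abs(a_k), 2.0 * M / k**2)   # |a_k| should sit below 2M/k^2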
Problem 2. This idea is due to the Hungarian mathematician György Pólya (usually
anglicised as George Pólya:
http://www-gap.dcs.st-and.ac.uk/~history/Biographies/Polya.html).
We are going to give an alternative proof of the fact that there are infinitely many primes,
using Fermat numbers.
Recall that the Fermat numbers are defined by
$$F_n = 2^{2^n} + 1.$$
Readers of Square2 should not need reminding that this is, so to speak,
$$F_n = 2^{(2^n)} + 1$$
rather than the much smaller number
$$(2^2)^n + 1.$$
For example, $F_0 = 3$, $F_1 = 5$, $F_2 = 17$, $F_3 = 257$, $F_4 = 65537$. These numbers are (with
varying degrees of ease) all checked to be prime numbers, and Fermat seems to have
conjectured that all these numbers are prime. However Euler observed that
$$F_5 = 2^{32} + 1 = 4294967297 = 641 \times 6700417,$$
and it turns out that every further Fermat number whose status is known is composite. It remains an
open problem whether there are infinitely many Fermat numbers which are prime, or
indeed whether there are infinitely many Fermat numbers which are composite, and this
seems very hard.
However, our problem for today is much simpler, namely use the Fermat numbers to
show that there are infinitely many primes. Try doing this one without the hints first.
Hints: explain first why it is enough to show that the numbers $F_n$ and $F_m$, for $m \ne n$, are
coprime. Without loss of generality, $n > m$. Consider $F_n - 2$. Factorise (hard). What
small number would any common divisor of $F_n$ and $F_m$ have to divide? Finish off.
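(A quick empirical sanity check in Python, which of course proves nothing for general n but is reassuring; note that Python's integers are happy with numbers the size of $F_7 = 2^{128} + 1$:)

    from math import gcd

    fermat = [2**(2**n) + 1 for n in range(8)]   # F_0, ..., F_7
    # Distinct Fermat numbers should be pairwise coprime
    assert all(gcd(fermat[i], fermat[j]) == 1
               for i in range(8) for j in range(i + 1, 8))
    print("F_0, ..., F_7 are pairwise coprime")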
For bonus marks, those who know a little more number theory might like to upgrade their
proof to a proof that there are infinitely many prime numbers congruent to 1 mod 4. You
will need to know about the quadratic character of -1 (including what this means…).
(There are many proofs that there are infinitely many primes – see
http://primes.utm.edu/notes/proofs/infinite/ for a list of some of them, and the book by
Ribenboim for more).
Problem 3 (unsolved problem). If you, like many of us, are frequently tempted to believe
that everything about finite-dimensional linear algebra is already known, consider the
following simple-to-state conjecture due to G. C. Rota.
Problem. Suppose V is an n-dimensional vector space, and suppose $B_1, B_2, \dots, B_n$ are n
disjoint bases of V. Then there are n disjoint bases $C_1, C_2, \dots, C_n$ of V such that
$|C_i \cap B_j| = 1$ for all i and j.
An equivalent formulation is: suppose we are given any $n^2$ vectors in V, and that one can
arrange them as an $n \times n$ matrix in such a way that each column is a basis. Then the
entries inside the columns can be permuted in such a way that each row is also a basis.
There are slightly more general forms of the conjecture too.
Those who have the word “transversal” in their vocabulary can reformulate the
conjecture in that language. Those who have the word “matroid” in their vocabulary can
ask the more general question in that language. However the problem can be savoured
and thought about even without these refinements.
The conjecture is true for $n = 1$, and even you lot should be able to do that one. The case
$n = 2$ is also quite easy. Let the first basis be $\{e_1, e_2\}$ and the second $\{f_1, f_2\}$. The basic
observation to bear in mind is that in a 2-dimensional vector space, given any nonzero vector v,
any vector which is not a scalar multiple of v can be used to extend v to a basis.
Now if $f_1$ is a scalar multiple of $e_1$ then $f_2$ cannot be a scalar multiple of $e_1$
(otherwise the span of $\{f_1, f_2\}$ would just be 1-dimensional). Thus $\{e_1, f_2\}$ is a basis. It
now remains (in this case) to show that $\{e_2, f_1\}$ is a basis: if it were not, then $e_2$ would
be a scalar multiple of $f_1$, and hence of $e_1$, contradicting the fact that $\{e_1, e_2\}$ is a basis.
We can then just take $\{e_1, f_2\}$ and $\{e_2, f_1\}$ as the bases.
The other case is when $f_1$ is not a scalar multiple of $e_1$: in that case $\{e_1, f_1\}$ is a basis,
and the only way in which $\{e_2, f_2\}$ could fail to be a basis is if $e_2$ is a scalar multiple of
$f_2$. However, in that case we can just take $\{e_2, f_1\}$ and $\{e_1, f_2\}$ to be the bases as before.
However even the case n = 3 appears to be non-trivial: see
http://www-math.mit.edu/~tchow/dinitz.pdf
In general, it is known that one can (thinking in terms of permuting the vectors in the
columns to make as many rows as possible bases too) obtain on the order of $\sqrt{n}$ of the
rows being bases.
The case where n is even and the field over which the vector space is defined has
characteristic zero, i.e. no finite subfield, is a consequence of the following conjecture:
For a positive even integer n, we consider Latin squares of order n, i.e. $n \times n$ matrices all
of whose entries are from $\{1, 2, \dots, n\}$ with, for each $1 \le i \le n$, precisely one occurrence of i
in each row and precisely one occurrence of i in each column. A Latin square is either
even or odd. To make clear which is which, consider each row of the Latin square as a
permutation (we are using here notation and ideas discussed in Square 2 Edition 11): it is
then a product of disjoint cycles, and the sign of a permutation is the product of the signs
of its disjoint cycles, where (cutting a long story short) the sign of a cycle $(a_1, \dots, a_r)$ is $-1$ if r
is even and $1$ if r is odd. If we then take the product of the signs of all the rows of our
Latin square, we will clearly get $1$ or $-1$. If we get $1$, the Latin square is even; if $-1$, the
Latin square is odd. We then have the following
Conjecture (the Alon–Tarsi conjecture). For even positive n, the number of even Latin squares of order n and the
number of odd Latin squares of order n are not equal.
As just noted, this conjecture would imply that the case n even and characteristic zero of
Rota’s conjecture is true.
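For small n one can simply count; the following brute-force Python sketch (ours – do not try it much beyond n = 5) tallies even and odd Latin squares of order n, using the sign convention just described:

    from itertools import permutations

    def sign(p):
        # Sign of a permutation in one-line notation, via its cycle lengths:
        # a cycle of length r contributes -1 if r is even, +1 if r is odd
        seen, s = [False] * len(p), 1
        for i in range(len(p)):
            if not seen[i]:
                j, r = i, 0
                while not seen[j]:
                    seen[j] = True
                    j, r = p[j], r + 1
                if r % 2 == 0:
                    s = -s
        return s

    def even_odd_counts(n):
        rows, counts = list(permutations(range(n))), {1: 0, -1: 0}
        def extend(square):
            if len(square) == n:
                prod = 1
                for row in square:
                    prod *= sign(row)
                counts[prod] += 1
                return
            for r in rows:   # column condition: no repeated entry in a column
                if all(r[c] != q[c] for q in square for c in range(n)):
                    extend(square + [r])
        extend([])
        return counts

    print(even_odd_counts(4))   # the two counts differ, as the conjecture predicts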
The problem turns out to be closely linked with certain combinatorial problems related to
the so-called Combinatorial Nullstellensatz developed by Noga Alon and co-workers
(another topic which could easily fill an edition of Square 2…), and this link allows one
to prove the conjecture whenever $n = 2^r p$ or $n = 2^r (p + 1)$ for an odd prime number p.
See http://garden.irmacs.sfu.ca/?q=op/rotas_basis_conjecture by Matt de Vos for more
information on this.