Contents

0.1 General textbooks
0.2 Statistics
0.3 Fourier analysis
0.4 Matrices
0.5 Other

1 Fourier Series in Complex Form
  1.1 Functions
    1.1.1 Periodic functions
    1.1.2 Even functions
    1.1.3 Odd functions
    1.1.4 Sine and cosine of period T0
  1.2 The Σ-notation
  1.3 Specifying periodic functions
  1.4 Complex Fourier series
  1.5 An application to filters

2 Fourier transforms
  2.1 The Fourier Transform
  2.2 Fourier transform pairs
  2.3 Discrete and continuous spectra
  2.4 A special case — f(t) is real and even
  2.5 The Dirac δ function

3 Fourier transform properties
  3.1 Linearity (also known as superposition)
  3.2 Time scaling
  3.3 Time shifting
  3.4 Differentiation
  3.5 Integration
  3.6 Frequency shifting
  3.7 Convolution
    3.7.1 Definition of convolution
    3.7.2 Fourier transform of the convolution of two functions

4 Fourier transforms without integration
  4.1 Definition
  4.2 Recipe

5 Cross correlation and autocorrelation
  5.1 Reminder: complex conjugate
  5.2 Definitions
  5.3 Correlation properties
    5.3.1 The autocorrelation function is always even
    5.3.2 Calculating cross correlation either way round
    5.3.3 The maximum of the autocorrelation function
    5.3.4 Autocorrelation of a periodic function
  5.4 Worked examples
  5.5 Power and energy signals
  5.6 Correlation demonstrations
    5.6.1 Cross correlation
    5.6.2 Autocorrelation
    5.6.3 Practical applications

6 Introductory probability
  6.1 Definition of probability
  6.2 Addition of probabilities — mutually exclusive case
  6.3 Addition of probabilities — general case
  6.4 Multiplication of probabilities

7 Discrete variables: the p.d.f., mean and variance
  7.1 Random variables
  7.2 Definitions: a set of N discrete values
    7.2.1 Mean of x, x̄
    7.2.2 Standard deviation of x, σx
  7.3 The probability density function
    7.3.1 Normalisation
    7.3.2 Other names for p.d.f.
  7.4 What does the p.d.f. mean?
  7.5 The cumulative distribution function
  7.6 Mean & standard deviation: when the p.d.f. is known
    7.6.1 Mean, x̄
    7.6.2 Standard deviation, σx

8 Continuous distributions
  8.1 Continuous random variables
  8.2 Those definitions again
    8.2.1 Mean of x, x̄
    8.2.2 Standard deviation of x, σx
  8.3 Application to signal power
  8.4 The p.d.f. for continuous variables
  8.5 The c.d.f., F(x)
  8.6 Definitions: when the p.d.f. is known
    8.6.1 Mean of x, x̄
    8.6.2 Standard deviation of x, σx
    8.6.3 The mean of any function of x

9 Theoretical distributions
  9.1 The Gaussian distribution
  9.2 The Gaussian probability distribution
  9.3 The Poisson distribution
  9.4 The binomial distribution

10 The method of least squares
  10.1 Gauss
  10.2 A data fitting problem
  10.3 The method of least squares
  10.4 Calculating m and c
  10.5 Fitting to a parabola

11 Complex frequency
  11.1 Complex frequency
    11.1.1 σ < 0
    11.1.2 σ = 0
    11.1.3 σ > 0
  11.2 Linear homogeneous differential equations
    11.2.1 a2 > 1
    11.2.2 a2 = 1
    11.2.3 a2 < 1

12 The Laplace Transform
  12.1 The Laplace transform
  12.2 The Laplace transform of a derivative

13 Differential Equations and the Laplace Transform
  13.1 Inhomogeneous differential equations
  13.2 Solving a d.e. by Laplace transform — overview
  13.3 The Laplace transform of a differential equation
  13.4 Inverse Laplace transform using tables
  13.5 Inverse Laplace transform by partial fractions

14 The Z transform: definition, examples
  14.1 Introduction and Definitions
    14.1.1 Sampling
    14.1.2 The connection with Laplace transforms
    14.1.3 The two ways of writing down z-transforms
  14.2 z-transform examples
    14.2.1 f(n) = aⁿ
    14.2.2 f(n) = δ(n), the unit impulse
    14.2.3 f(n) = u(n), the unit step function
    14.2.4 f(n) = aⁿ
    14.2.5 f(n) = cos an

15 The Z transform: properties, inversion
  15.1 z-transform properties
    15.1.1 Linearity/superposition
    15.1.2 Time delay
    15.1.3 Time advance
    15.1.4 Multiplication by an exponential sequence
    15.1.5 Differentiation property
    15.1.6 Initial Value Theorem
    15.1.7 Final Value Theorem
  15.2 Inversion of the z-transform
    15.2.1 Finite sequences

16 The Z transform: applications
  16.1 Introduction
  16.2 Difference equations
  16.3 Solving difference equations
  16.4 A FIR filter

17 Matrices I
  17.1 The basics
  17.2 Matrix equality
  17.3 Matrix addition
  17.4 Matrix multiplication
    17.4.1 Scalar × matrix = matrix
    17.4.2 Matrix × vector = vector
    17.4.3 Matrix × matrix = matrix
  17.5 Determinants
  17.6 Solving two linear equations
    17.6.1 Properties of the unit matrix
  17.7 Application — Z and Y parameters

18 Matrices II
  18.1 Matrix inversion: Pi to T conversion
  18.2 Solving n linear equations
  18.3 Inverting an n × n matrix
  18.4 The equation matrix × vector = 0
  18.5 Application of matrix × vector = 0
  18.6 Eigenvalues and eigenvectors
  18.7 Applications of eigenvalues/eigenvectors

19 The wave equation
  19.1 Partial differential equations
  19.2 Derivation of the wave equation
  19.3 The d'Alembert solution of the wave equation
  19.4 Boundary conditions
  19.5 What does it all mean?
Recommended books (for reference)
0.1 General textbooks
• Mary L. Boas, Mathematical Methods in the Physical Sciences
(2nd ed) ISBN 0-471-09960-0 Wiley
(The library has a couple of copies.)
Contains most of what you need (a little thin on Fourier transforms; useful reference for differential equations, matrix algebra
etc.)
• Erwin Kreyszig Advanced Engineering Maths
(Getting very fat and middle-aged. 1271 pages. I pity students
who have to cycle to the university carrying this one.)
• Weltner, Grosjean, Schuster and Weber, Mathematics for Engineers and Scientists.
(Newly arrived on the scene. It’s thin, light, reasonably priced
and seems to cover a lot of good stuff. No Fourier transforms.)
• K.A. Stroud, Further Engineering Mathematics
(Concentrates on ‘programmed learning’ and if you like this approach, this book may be the one for you.)
0.2 Statistics
There are many possible additional books for statistics. I suggest you borrow from the library G.M. Clarke and D. Cooke, 'A Basic Course in Statistics', if you want some readable background information. Kreyszig is also strong on statistics; Weltner et al. is not bad; Boas less so.
0.3 Fourier analysis
G. Stephenson’s ‘Mathematical methods for Science Students’ is good
for Fourier series, and worth having a look at. (The library has many
copies.) It doesn’t even mention Fourier transforms; for these, I can
recommend wholeheartedly S. Haykin’s ‘Communication Systems’,
chapter 2. The library has about a dozen copies.
0.4 Matrices
These are well covered by all the general textbooks, and also by
Stephenson.
0.5 Other
C.R. Wylie, Differential Equations has a very good section on the
wave equation.
Note on the problems
Problems whose numbers have an asterisk are a bit harder than
those without. Look on them as a challenge.
Web page

Materials that support the course are available on the WWW; the address is

http://personal.maths.surrey.ac.uk/st/J.Deane/Teach/em2
Chapter 1
Fourier Series in Complex Form
1.1 Functions

1.1.1 Periodic functions
A function f (t) is said to be periodic with period T0 if T0 is the
smallest positive number for which
f (t + T0) = f (t)
holds for all t. Examples include:
(1) f (t) = sin t, which has period 2π because sin(t + 2π) = sin t
(2) f (t) = cos 4t, which has period π/2, since cos 4(t+π/2) = cos 4t.
Note: although sin(t + 4π) = sin t, the period is not 4π, because
there is a smaller positive number that has this property (i.e. 2π).
1.1.2 Even functions
A function f (t) is said to be even if
f (t) = f (−t)
for all t. Such functions are symmetrical about t = 0.
Examples: 1 + cos t, t² − 3t⁴, |t|.
1.1.3 Odd functions

A function f(t) is said to be odd if
$$f(t) = -f(-t)$$

for all t. Such functions are anti-symmetric about t = 0.
Examples: sin t, t + 3/t³, t(1 + cos 3t).
1.1.4 Sine and cosine of period T0

In the rest of this chapter, you will see many references to

$$\sin\frac{2n\pi t}{T_0} \qquad\text{and}\qquad \cos\frac{2n\pi t}{T_0}$$

where n is an integer (a whole number). Using the fact that the periods of sin t and cos t are both 2π, you should be able to see that the period of sin(2πt/T0) is T0, and so the period of sin(2nπt/T0) is T0/n — and that the same is true for cos(2nπt/T0).
1.2 The Σ-notation
As a reminder, it is useful at this point to give some examples of the Σ-notation, which is a neat shorthand way of representing the sum of a set of numbers. Some examples are:

$$t_1 + t_2 + \ldots + t_N = \sum_{i=1}^{N} t_i$$

$$a_0 + a_1 x + a_2 x^2 + \ldots = \sum_{m=0}^{\infty} a_m x^m$$

$$\sin t - \frac{1}{3}\sin 3t + \frac{1}{5}\sin 5t - \frac{1}{7}\sin 7t + \ldots = \sum_{k=0}^{\infty} \frac{(-1)^k \sin(2k+1)t}{2k+1}$$
Study the examples and make sure you understand why the right-hand side represents the left-hand side.
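If you like checking such things by computer, the Σ-notation translates directly into a loop or a library sum. The sketch below is our addition, not part of the original notes, and assumes Python with NumPy; it evaluates the second example with the illustrative (assumed) choice a_m = 1/m!, whose infinite sum is the Taylor series of e^x:

```python
import numpy as np
from math import factorial

# sum_{m=0}^{M} a_m x^m, written out as a loop.
# Illustrative assumption (not from the notes): a_m = 1/m!, so the
# infinite sum is the Taylor series of exp(x).
x, M = 0.5, 20
partial_sum = sum(x**m / factorial(m) for m in range(M + 1))
print(partial_sum, np.exp(x))   # both print 1.6487212707...: the partial sum has converged
```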
1.3 Specifying periodic functions
You will often meet a specification of a periodic function like

$$f(t) = \begin{cases} 0 & -\pi < t \le 0 \\ -1 & 0 < t \le \pi/2 \\ 1 & \pi/2 < t \le \pi \end{cases} \qquad\text{with}\qquad f(t + 2\pi) = f(t).$$
It is important that you know how to turn this into a picture of the
function. Check that you can do this by sketching the function on
the axes below, being careful to label the relevant t and f (t) values.
[Blank axes for your sketch: f(t) against t.]
1.4 Complex Fourier series
There is a neat form in which to write the Fourier coefficients, which involves complex numbers. We derive this in two steps.

Step 1 — rewrite the Fourier series in terms of complex numbers.

Let us start with the definition of a Fourier series in trigonometric form:
$$f(t) = \frac{1}{2}a_0 + \sum_{n=1}^{\infty} a_n \cos\frac{2n\pi t}{T_0} + \sum_{n=1}^{\infty} b_n \sin\frac{2n\pi t}{T_0}$$

where

$$a_n = \frac{2}{T_0}\int_{-T_0/2}^{T_0/2} f(t)\cos\frac{2n\pi t}{T_0}\,dt, \qquad b_n = \frac{2}{T_0}\int_{-T_0/2}^{T_0/2} f(t)\sin\frac{2n\pi t}{T_0}\,dt. \tag{1.1}$$

Now write sin and cos in terms of complex exponentials, so that

$$f(t) = \frac{a_0}{2} + \sum_{n=1}^{\infty} a_n\,\frac{e^{2jn\pi t/T_0} + e^{-2jn\pi t/T_0}}{2} - \sum_{n=1}^{\infty} jb_n\,\frac{e^{2jn\pi t/T_0} - e^{-2jn\pi t/T_0}}{2}$$

which can be rewritten as

$$f(t) = \frac{1}{2}a_0 + \sum_{n=1}^{\infty} \frac{a_n - jb_n}{2}\,e^{2jn\pi t/T_0} + \sum_{n=1}^{\infty} \frac{a_n + jb_n}{2}\,e^{-2jn\pi t/T_0}.$$

Define the complex numbers

$$\alpha_0 = \frac{a_0}{2}, \qquad \alpha_n = \frac{a_n - jb_n}{2}, \qquad \alpha_{-n} = \frac{a_n + jb_n}{2}.$$
Step 2 — rewrite the definitions of an and bn.

Expanding cos(2nπt/T0) and sin(2nπt/T0), we can rewrite the definitions of an and bn (equations 1.1) as

$$a_n = \frac{2}{T_0}\int_{-T_0/2}^{T_0/2} f(t)\,\frac{e^{2jn\pi t/T_0} + e^{-2jn\pi t/T_0}}{2}\,dt$$

$$jb_n = \frac{2}{T_0}\int_{-T_0/2}^{T_0/2} f(t)\,\frac{e^{2jn\pi t/T_0} - e^{-2jn\pi t/T_0}}{2}\,dt$$

Now, adding these two equations gives

$$a_n + jb_n = \frac{2}{T_0}\int_{-T_0/2}^{T_0/2} f(t)\,e^{2jn\pi t/T_0}\,dt$$

and subtracting them gives

$$a_n - jb_n = \frac{2}{T_0}\int_{-T_0/2}^{T_0/2} f(t)\,e^{-2jn\pi t/T_0}\,dt.$$

Using our definition of αn, we have, finally,

$$f(t) = \sum_{n=-\infty}^{\infty} \alpha_n e^{2jn\pi t/T_0} \qquad\text{with}\qquad \alpha_n = \frac{1}{T_0}\int_{-T_0/2}^{T_0/2} f(t)\,e^{-2jn\pi t/T_0}\,dt$$

where the definition of αn is true for all n.
Example 1.1 Find the Fourier series in the complex form for the square wave

$$s(t) = \begin{cases} -1 & -1 \le t < -1/2 \\ 1 & -1/2 \le t < 1/2 \\ -1 & 1/2 \le t < 1 \end{cases} \qquad\text{with}\qquad s(t + 2) = s(t).$$

We are being asked to calculate αn for all n. Hence, we need to calculate

$$\alpha_n = \frac{1}{T_0}\int_{-T_0/2}^{T_0/2} f(t)\,e^{-2jn\pi t/T_0}\,dt,$$

which in this case (T0 = 2) is

$$\frac{1}{2}\left[-\int_{-1}^{-1/2} e^{-jn\pi t}\,dt + \int_{-1/2}^{1/2} e^{-jn\pi t}\,dt - \int_{1/2}^{1} e^{-jn\pi t}\,dt\right]$$

$$= \frac{-1}{2jn\pi}\left[-e^{-jn\pi t}\Big|_{-1}^{-1/2} + e^{-jn\pi t}\Big|_{-1/2}^{1/2} - e^{-jn\pi t}\Big|_{1/2}^{1}\right]$$

$$= \frac{-1}{2jn\pi}\left[-e^{jn\pi/2} + e^{jn\pi} + e^{-jn\pi/2} - e^{jn\pi/2} - e^{-jn\pi} + e^{-jn\pi/2}\right]$$

$$= \frac{-1}{2jn\pi}\left[e^{jn\pi} - e^{-jn\pi} + 2e^{-jn\pi/2} - 2e^{jn\pi/2}\right] = \frac{-1}{n\pi}\left[\sin n\pi - 2\sin\frac{n\pi}{2}\right] = \frac{2\sin(n\pi/2)}{n\pi}.$$

This expression is valid for all n except n = 0. We have α0 = the area under s(t) over one period, divided by the period, and this equals (−1/2 + 1 − 1/2)/2 = 0. Hence, α0 = 0.
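A quick numerical cross-check of this result is easy to set up. The sketch below is our addition (it assumes Python with NumPy, and is not part of the original notes); it approximates the defining integral for αn by a Riemann sum and compares it with 2 sin(nπ/2)/(nπ):

```python
import numpy as np

T0 = 2.0
N = 200_000
t = (np.arange(N) + 0.5) * (T0 / N) - T0/2       # midpoints covering (-1, 1)
dt = T0 / N

def s(t):
    # the square wave of example 1.1: 1 on (-1/2, 1/2), -1 elsewhere in (-1, 1)
    return np.where(np.abs(t) < 0.5, 1.0, -1.0)

for n in range(6):
    alpha_n = np.sum(s(t) * np.exp(-2j*n*np.pi*t/T0)) * dt / T0
    exact = 0.0 if n == 0 else 2*np.sin(n*np.pi/2) / (n*np.pi)
    print(n, np.round(alpha_n, 6), round(exact, 6))
```

The imaginary parts come out as zero to rounding error, as they must, since s(t) is real and even.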
1.5 An application to filters
From problem 6 you will see that the Fourier series in the complex form for a square wave voltage v(t), period T0, defined by

$$v(t) = \begin{cases} 1 & -T_0/2 < t < 0 \\ 0 & 0 < t < T_0/2 \end{cases} \qquad\text{with}\qquad v(t + T_0) = v(t)$$

is

$$v(t) = \frac{1}{2} + \frac{j}{\pi}\sum_{n=-\infty}^{\infty} \frac{1}{2n+1}\,e^{j(2n+1)2\pi t/T_0}.$$
Example 1.2 What is the output vo (t) if this square wave voltage is applied
to the low pass filter in figure 1.1?
[Figure 1.1: An RC filter fed by a square wave — input v(t), series resistor R, shunt capacitor C, output vo(t).]
From circuit theory, we know that the output of this filter when the input is vin e^{jωt} is

$$v_{\text{out}} = \frac{1}{1 + j\omega\tau}\,v_{\text{in}}\,e^{j\omega t}$$

with τ = RC. This is true for any frequency ω. Since the filter is a linear system, its output, when the input is a square wave, is the sum of {the individual sine waves in the input × the transfer function, (1 + jωτ)⁻¹}: we deduce this from a property known as superposition. The frequencies in the input are (2n + 1)2π/T0, so

$$v_o(t) = \frac{1}{2} + \frac{j}{\pi}\sum_{n=-\infty}^{\infty} \frac{e^{j(2n+1)2\pi t/T_0}}{1 + j(2n+1)2\pi\tau/T_0}\cdot\frac{1}{2n+1}. \tag{1.2}$$
Figure 1.2 shows the input and output waveforms for T0 = 1 and τ = 0.2.
[Figure 1.2: The input and output waveforms for the RC-filter example, for T0 = 1 and τ = 0.2.]
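If you want to reproduce something like figure 1.2 yourself, truncating the sum in equation (1.2) works well. This is a sketch of ours, assuming Python with NumPy (plotting left out), not part of the original notes:

```python
import numpy as np

T0, tau = 1.0, 0.2
t = np.linspace(0.0, 2.0, 1000)
vo = np.full(t.shape, 0.5, dtype=complex)         # the 1/2 term
for n in range(-200, 200):                        # truncate the doubly infinite sum
    w = (2*n + 1) * 2*np.pi / T0                  # frequencies present in the input
    vo += (1j/np.pi) * np.exp(1j*w*t) / ((1 + 1j*w*tau) * (2*n + 1))

print(np.abs(vo.imag).max())                      # ~ 0: the terms pair up, the sum is real
vo = vo.real                                      # the smoothed square wave of figure 1.2
```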
Problems, chapter 1
1. (a) Sketch the even and odd example functions in sections 1.1.2 and 1.1.3.
(b) Let E1(t), E2(t) be two even functions of t and O1 (t), O2(t) be two
odd functions of t. Are the following functions even or odd?
(i) E1(t) × E2(t) (ii) E1(t) × O1(t) (iii) O1 (t) × O2(t) (iv) E1(t) + E2(t)
(v) O1 (t) + O2 (t) (vi) O1 (t) + O1 (−t)
[Even: (i), (iii), (iv) and (vi). The rest are odd.]
2. Here are some useful formulae for simplifying Fourier series results. In all cases, n is an integer. Prove them.

(i) $e^{jn\pi} = \cos n\pi = (-1)^n$.

(ii) $$e^{jn\pi/2} = \begin{cases} (-1)^{n/2} & n \text{ even} \\ j(-1)^{(n-1)/2} & n \text{ odd} \end{cases}$$

(iii) For any set of numbers a0, a1, a2, . . .,

$$\sum_{n=0}^{\infty} [1 + (-1)^n]\,a_n = 2\sum_{n=0}^{\infty} a_{2n} \qquad \sum_{n=0}^{\infty} [1 - (-1)^n]\,a_n = 2\sum_{n=0}^{\infty} a_{2n+1}$$

(iv) Show that

$$\int_{-\pi}^{\pi} e^{jnt}\,dt = \begin{cases} 0 & n \ne 0 \\ 2\pi & n = 0 \end{cases}$$
3. (i) Sketch the following function over at least two periods:

$$f(t) = \begin{cases} 0 & -\pi < t < 0 \\ \sin t & 0 < t < \pi \end{cases} \qquad f(t + 2\pi) = f(t)$$

(ii) Find its Fourier series in the complex form.

[(ii) $\alpha_{\pm 1} = \pm 1/(4j)$, $\alpha_n = [(-1)^{n+1} - 1]/[2\pi(n^2 - 1)]$]
4. Find the Fourier series in the complex form for the function

$$f(t) = 1 + t, \quad -1 < t \le 1, \qquad\text{with}\qquad f(t + 2) = f(t).$$

[$\alpha_n = j(-1)^n/(n\pi)$ if $n \ne 0$; $\alpha_0 = 1$.]
5. Find the complex Fourier series for the following waveform:

$$v(t) = e^{kt}, \quad -T_0/2 < t \le T_0/2, \qquad\text{where}\qquad v(t + T_0) = v(t).$$

[$\alpha_n = (-1)^n(e^{kT_0/2} - e^{-kT_0/2})/(kT_0 - 2jn\pi)$]
6. Find the complex Fourier series for

$$v(t) = \begin{cases} 1 & -T_0/2 < t < 0 \\ 0 & 0 < t < T_0/2 \end{cases} \qquad\text{with}\qquad v(t + T_0) = v(t).$$

[Answer: see section 1.5]
7. From the answer to problem 3(ii) above, deduce (i.e. do not re-do the integrals to find the coefficients) the complex Fourier coefficients for the full-wave rectified sine wave

$$g(t) = \begin{cases} -\sin t & -\pi < t \le 0 \\ \sin t & 0 < t \le \pi \end{cases}$$

Simplify your answer as far as possible.

Hint: if f(t) is a half-wave rectified sine wave, then first show that g(t) = f(t) + f(−t).

[$g(t) = \dfrac{2}{\pi} - \dfrac{4}{\pi}\left(\dfrac{\cos 2t}{2^2 - 1} + \dfrac{\cos 4t}{4^2 - 1} + \dfrac{\cos 6t}{6^2 - 1} + \ldots\right)$]
Chapter 2
Fourier transforms
2.1 The Fourier Transform
In the previous chapter, we discussed periodic functions which satisfied some conditions, known as the Dirichlet conditions¹, which allow them to be expanded in a Fourier series. In this chapter, we deal with non-periodic functions that satisfy the Dirichlet conditions. For such functions, the Fourier transform can be calculated, which enables us to express a function of time f(t) as a function of frequency, F(ω), instead.

¹Specifically, if the periodic function is f(t) and has period T0, then the conditions are (i) $\int_{-T_0/2}^{T_0/2} f(t)\,dt$ is finite; (ii) f(t) must have a finite number of turning points and finite discontinuities in a period; and (iii) f(t) itself must be finite for all t.

Recall from the last chapter that

$$f(t) = \sum_{n=-\infty}^{\infty} \alpha_n e^{2jn\pi t/T_0} \qquad\text{and}\qquad \alpha_n = \frac{1}{T_0}\int_{-T_0/2}^{T_0/2} f(t)\,e^{-2jn\pi t/T_0}\,dt$$

where f(t) is a function of time t with period T0, i.e. f(t + T0) = f(t). The definition of αn in words is

αn is the mean value, over the range −T0/2 ≤ t ≤ T0/2 (one period), of [f(t) × e^{−2jnπt/T0}].

If the function is not periodic, then the period T0 → ∞, which leads us to consider the integral
$$F(\omega) = \int_{-\infty}^{\infty} f(t)\,e^{-j\omega t}\,dt \tag{2.1}$$

and this is the definition of the Fourier transform of f(t). Given F(ω) we can recover the original f(t). By analogy with the Fourier series expression for f(t), when the sum is replaced by an integral,

$$f(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} F(\omega)\,e^{j\omega t}\,d\omega. \tag{2.2}$$

The two boxed equations show us how to calculate the Fourier transform/inverse Fourier transform for a given function. All the material in this chapter is based on just these two equations.
(See problem 1)
We occasionally need to use the following notation for the Fourier transform of a function f(t):

$$F(\omega) = \mathcal{F}[f(t)] \qquad\text{and}\qquad f(t) = \mathcal{F}^{-1}[F(\omega)].$$

2.2 Fourier transform pairs
We will always stick to the convention that a lower case letter stands for the function of time t and the corresponding upper case letter for the function of angular frequency, ω.

The two functions f(t) and F(ω) constitute a Fourier transform pair, which we write as

$$f(t) \rightleftharpoons F(\omega).$$

This means that

$$F(\omega) = \int_{-\infty}^{\infty} f(t)\,e^{-j\omega t}\,dt \qquad\text{and}\qquad f(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} F(\omega)\,e^{j\omega t}\,d\omega$$

or, in words,
F (ω) is the Fourier transform of f (t) and f (t) is the inverse
Fourier transform of F (ω).
2.3 Discrete and continuous spectra
We have used Fourier series to express periodic functions, which have a discrete spectrum, i.e. one in which only certain frequencies are present. These frequencies were generally of the form ωn = n(2π/T0), with n = 0, 1, 2, . . .. Similarly, in order to describe non-periodic functions, it is necessary to use a continuous spectrum, i.e. one in which all frequencies are present. Equation 2.1 tells us how to calculate this spectrum.
Example 2.1 Let us first define the function rect(t), the rectangular pulse, as

$$\mathrm{rect}(t) = \begin{cases} 1 & -\tfrac{1}{2} < t < \tfrac{1}{2} \\ 0 & \text{otherwise.} \end{cases}$$

We can now find the Fourier transform of f(t) = rect(t/2T), which is

$$\mathrm{rect}\left(\frac{t}{2T}\right) = \begin{cases} 0 & t < -T \\ 1 & -T < t < T \\ 0 & t > T \end{cases}$$

[Sketch: f(t) = 1 between t = −T and t = T, zero elsewhere.]
Answer By definition, the Fourier transform F(ω) is given by

$$F(\omega) = \int_{-\infty}^{\infty} f(t)\,e^{-j\omega t}\,dt = \int_{-T}^{T} 1 \times e^{-j\omega t}\,dt = \frac{1}{-j\omega}\,e^{-j\omega t}\Big|_{-T}^{T}$$

$$= \frac{e^{-j\omega T} - e^{j\omega T}}{-j\omega} = \frac{2}{\omega}\left(\frac{e^{j\omega T} - e^{-j\omega T}}{2j}\right) = \frac{2}{\omega}\sin\omega T.$$

Without doing the integral, we know straight away that the inverse Fourier transform of (2/ω) sin ωT will give us the original rectangular pulse, that is

$$\mathrm{rect}\left(\frac{t}{2T}\right) \rightleftharpoons \frac{2}{\omega}\sin\omega T$$

are a Fourier transform pair.
(See problem 2)
[Figure 2.1: The Fourier transform of a rectangular pulse of width 2T — the function (2/ω) sin ωT, which peaks at 2T at ω = 0 and has zeros at multiples of π/T.]
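As a sanity check on the pair just derived, one can evaluate the defining integral numerically and compare it with (2/ω) sin ωT. The following is a sketch of ours (assuming Python with NumPy), not part of the original notes:

```python
import numpy as np

T = 1.0
N = 100_000
t = (np.arange(N) + 0.5) * (2*T / N) - T          # midpoints covering (-T, T), where f = 1
dt = 2*T / N

for w in [0.5, 1.0, 2.0, 5.0]:                    # avoid w = 0, where the limit is 2T
    F_numeric = np.sum(np.exp(-1j*w*t)) * dt      # integral of e^{-jwt} over (-T, T)
    print(round(F_numeric.real, 6), round(2*np.sin(w*T)/w, 6))
```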
2.4 A special case — f(t) is real and even
In most cases that you will come across, f(t) will be a real function of time. It can be shown that this implies that

Re F(ω) is an even, and Im F(ω) an odd, function of ω

that is, the real part of F(ω) is an even function of ω and the imaginary part of F(ω) is an odd function of ω.

If f(t) is also an even function of t, that is

$$f(t) = f(-t)$$

then the Fourier transform of f(t) is

$$F(\omega) = \int_{0}^{\infty} e^{-j\omega t}f(t)\,dt + \int_{-\infty}^{0} e^{-j\omega t}f(t)\,dt$$

Substituting −t for t in the right-hand half gives

$$F(\omega) = \int_{0}^{\infty} e^{-j\omega t}f(t)\,dt + \int_{0}^{\infty} e^{j\omega t}f(-t)\,dt$$

Using the fact that f(t) is even, this becomes

$$F(\omega) = \int_{0}^{\infty} 2\cos\omega t\,f(t)\,dt$$

which is real. Hence

The Fourier transform of an even, real function of time is a real, even function of ω.
(See problem 3)
2.5 The Dirac δ function
The function δ(t) is an infinitely narrow spike, with unit area, located at t = 0. Since the area under it is 1, its height must be infinite since its width is zero. The fact that the area under it is one tells us that

$$\int_{-\infty}^{\infty} \delta(t)\,dt = 1.$$

It is helpful to visualise δ(t) as the limit of a rectangular pulse as its width tends to zero, with a height such that its area = 1.
What is the Fourier transform of δ(t)? In the light of the above, it is given by

$$\mathcal{F}[\delta(t)] = \lim_{T\to 0}\,\mathcal{F}\left[\frac{1}{2T}\,\mathrm{rect}\left(\frac{t}{2T}\right)\right]$$

The factor 1/2T multiplying rect(t/2T) makes the area equal to unity. We already know the Fourier transform of rect(t/2T): it is (2/ω) sin ωT. Hence (by l'Hospital's rule)

$$\mathcal{F}[\delta(t)] = \lim_{T\to 0}\,\frac{2}{2\omega T}\sin\omega T = 1$$

Hence

$$\delta(t) \rightleftharpoons 1$$

(See problem 6)
Note that δ(t − t0) is an infinitely narrow, infinitely high spike with unit area, occurring at t = t0. From this we can deduce that for any function of time f(t)

$$\int_{-\infty}^{\infty} f(t)\,\delta(t - t_0)\,dt = f(t_0). \tag{2.3}$$

In other words, the delta function can be used to sample a function of time, f(t), at a particular time t0.

Incidentally, the sampling property allows us to derive the Fourier transform of δ(t) in one line: by putting f(t) = e^{−jωt} and t0 = 0 in equation (2.3). Since e⁰ = 1, the Fourier transform of δ(t) must also be 1.
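The sampling property (2.3) can also be seen numerically, by standing in for δ(t − t0) with ever narrower unit-area rectangles. A sketch of ours, assuming Python with NumPy; f(t) = cos t and t0 = 0.7 are arbitrary illustrative choices:

```python
import numpy as np

f, t0 = np.cos, 0.7                                # arbitrary smooth test function and time
M = 1000
for width in [1e-1, 1e-2, 1e-3]:
    # a unit-area rectangle of this width centred on t0 stands in for delta(t - t0);
    # the integral of f(t) * (1/width) over it is just the mean of f over the rectangle
    t = t0 - width/2 + (np.arange(M) + 0.5) * width / M
    integral = np.mean(f(t))
    print(width, round(integral, 8), round(f(t0), 8))  # integral -> f(t0) as width -> 0
```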
Problems, chapter 2
1. Write the following in terms of the Fourier transforms of the given functions:

(i) $\displaystyle\int_{-\infty}^{\infty} 10\,e^{-j\omega t}h(t)\,dt$

(ii) $\displaystyle\int_{-\infty}^{\infty} \beta e^{-j\Omega x}f(x)\,dx - \int_{-\infty}^{\infty} \beta e^{-j\Omega x}g(x)\,dx$ (β is a constant.)

(iii) $\displaystyle\int_{\infty}^{-\infty} e^{-jky}a(y)\,dy + \int_{-\infty}^{\infty} e^{-jkz}a(z)\,dz$

(iv) $\displaystyle\int_{-\infty}^{\infty} e^{-j\omega v}\left[\int_{-\infty}^{\infty} C(\alpha)e^{j\alpha v}\,d\alpha\right]dv$

[(i) 10H(ω), (ii) β[F(Ω) − G(Ω)], (iii) 0, (iv) 2πC(ω)]
2. Sketch the following functions and find their Fourier transforms:

(i) $\mathrm{rect}\left(\dfrac{t}{4T}\right)$

(ii) $f(t) = \begin{cases} at & 0 < t < T \\ 0 & \text{otherwise} \end{cases}$ (a is a constant)

(iii) $f(t) = \begin{cases} \cos\pi t & -1 < t < 1 \\ 0 & \text{otherwise} \end{cases}$

(iv) $\mathrm{rect}\left(1 + \dfrac{t}{T}\right)$

[(i) $(2/\omega)\sin 2\omega T$ (ii) $a[e^{-j\omega T}(1 + j\omega T) - 1]/\omega^2$ (iii) $2\omega\sin\omega/(\pi^2 - \omega^2)$ (iv) $je^{j\omega T/2}[1 - e^{j\omega T}]/\omega$]
3. Prove that the Fourier transform of an odd, real function f(t) is imaginary.

4. Prove that, for any real f(t), Re F(ω) is an even function of ω, and Im F(ω) is an odd function of ω.
5. Using the sampling property of the Dirac delta function, equation 2.3, find

(i) $\displaystyle\int_{-\infty}^{\infty} \delta(t - \pi/2)\sin t\,dt$

(ii) The constant $t_0$ such that $\displaystyle\int_{-\infty}^{\infty} \delta(t - t_0)\,e^{kt}\,dt = e^2$

(iii) $\displaystyle\int_{-\infty}^{\infty} [\delta(t + a) + \delta(t - a)]\,e^{j\omega t}\,dt.$

[(i) 1, (ii) 2/k, (iii) 2 cos ωa]
6.* The Fourier transform of the delta function can be expressed as the Fourier transform of the limit of any function of t whose width tends to zero and whose height tends to infinity at t = 0, in such a way that the area is unity.

Using the result of question 2(ii), show that this is true for the triangular pulse defined there. (Hint: You will need to define the constant a such that the area under the triangular pulse is 1 regardless of the value of T. To take the limit as T tends to zero, you will need to use the Taylor series for e^{−jωT} up to and including the term in ω²T².)

[Well done if you get this right.]
Chapter 3
Fourier transform properties
Introduction

Many of the useful applications of the Fourier transform come about because it has the properties which are discussed in this chapter. In reading this chapter, you must remember the meaning of the symbol ⇌, which was defined in the previous chapter in section 2.2.
3.1 Linearity (also known as superposition)
Let

$$f_1(t) \rightleftharpoons F_1(\omega) \qquad\text{and}\qquad f_2(t) \rightleftharpoons F_2(\omega)$$

be two Fourier transform pairs. Then, for constants c1 and c2,

$$\int_{-\infty}^{\infty} [c_1 f_1(t) + c_2 f_2(t)]\,e^{-j\omega t}\,dt = c_1\int_{-\infty}^{\infty} f_1(t)\,e^{-j\omega t}\,dt + c_2\int_{-\infty}^{\infty} f_2(t)\,e^{-j\omega t}\,dt = c_1 F_1(\omega) + c_2 F_2(\omega),$$

so

$$c_1 f_1(t) + c_2 f_2(t) \rightleftharpoons c_1 F_1(\omega) + c_2 F_2(\omega).$$

This property allows us to find the Fourier transform of two functions added together, if we know the Fourier transform of each of the functions individually.
3.2 Time scaling

Let f(t) ⇌ F(ω). Then

$$f(at) \rightleftharpoons \frac{1}{|a|}\,F\left(\frac{\omega}{a}\right)$$

where a is a constant.

We prove this, assuming a > 0, by writing the Fourier transform of f(at) as

$$\mathcal{F}[f(at)] = \int_{-\infty}^{\infty} f(at)\,e^{-j\omega t}\,dt,$$

and substituting u = at, with a > 0. Then t = u/a, and as t → +∞, u → +∞, since a > 0, so the limits on the integral stay the same. Hence,

$$\text{F.T.} = \int_{-\infty}^{\infty} f(u)\,e^{-j\omega u/a}\,du/a = \frac{1}{a}\int_{-\infty}^{\infty} f(u)\,e^{-ju\frac{\omega}{a}}\,du = \frac{1}{a}\,F\left(\frac{\omega}{a}\right).$$

(For the case a < 0 see problem 1.)
Example 3.1 The two properties of superposition and time scaling can be used to calculate the Fourier transform of f(t) shown in the figure below:

[Sketch: f(t) is a staircase pulse of height 2 for |t| < T and height 1 for T < |t| < 2T, zero elsewhere.]
The key to this problem is to realise that f(t) is the sum of two rectangular pulses, rect(t/2T), of width 2T, and rect(t/4T), of width 4T. In other words, f(t) = rect(t/2T) + rect(t/4T).

Now, we know from example 2.1 that the transform of rect(t/2T) is (2/ω) sin ωT. But, rect(t/2T) and rect(t/4T) are related by

$$\mathrm{rect}(t/4T) = \mathrm{rect}(at/2T) \qquad\text{with}\qquad a = \tfrac{1}{2}$$

so, using the time scaling property, we can immediately say that

$$F(\omega) = \frac{2}{\omega}\sin\omega T + \left(\tfrac{1}{2}\right)^{-1}\frac{2}{2\omega}\sin 2\omega T = \frac{2}{\omega}\sin\omega T + \frac{2}{\omega}\sin 2\omega T.$$
3.3 Time shifting
If f(t) ⇌ F(ω) then

$$f(t - t_0) \rightleftharpoons e^{-j\omega t_0}\,F(\omega).$$

The Fourier transform of f(t − t0) is

$$\int_{-\infty}^{\infty} f(t - t_0)\,e^{-j\omega t}\,dt.$$

Substituting u = t − t0, we have t = u + t0 and dt = du, so

$$\int_{-\infty}^{\infty} f(u)\,e^{-j\omega(u + t_0)}\,du = e^{-j\omega t_0}\int_{-\infty}^{\infty} f(u)\,e^{-j\omega u}\,du$$

from which the time shifting property follows.
Example 3.2 Find the Fourier transform of f (t) defined in the figure below.
[Sketch: f(t) consists of two rectangular pulses of unit height and width 2T, centred at t = −2T and t = +2T.]
Answer The function f(t) is the sum of two time-shifted rectangular pulses. We know that the Fourier transform of rect(t/2T) is (2/ω) sin ωT. The left- and right-hand pulses are given by

$$\mathrm{rect}\left(\frac{t + 2T}{2T}\right) \qquad\text{and}\qquad \mathrm{rect}\left(\frac{t - 2T}{2T}\right)$$

respectively — be sure you understand why. Hence, the Fourier transform of the left-hand pulse is

$$e^{2j\omega T} \times (2/\omega)\sin\omega T.$$

Similarly, the Fourier transform of the right-hand pulse is $e^{-2j\omega T} \times (2/\omega)\sin\omega T$ and so, using superposition, the Fourier transform for the pair of pulses is

$$\frac{2}{\omega}\left(e^{2j\omega T} + e^{-2j\omega T}\right)\sin\omega T = \frac{4}{\omega}\cos 2\omega T\,\sin\omega T.$$
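The time shifting property is easy to test numerically. In this sketch (ours, assuming Python with NumPy, not from the original notes) the right-hand pulse of example 3.2, rect((t − 2T)/2T), is transformed directly and compared with the property's prediction:

```python
import numpy as np

T, w = 1.0, 1.3
t0 = 2*T                                           # the shift of the right-hand pulse
N = 200_000
t = (np.arange(N) + 0.5) * (2*T / N) + (t0 - T)    # midpoints covering (t0 - T, t0 + T)
dt = 2*T / N

direct = np.sum(np.exp(-1j*w*t)) * dt              # transform of the shifted pulse
via_property = np.exp(-1j*w*t0) * (2*np.sin(w*T)/w)
print(np.round(direct, 6), np.round(via_property, 6))   # the two agree
```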
3.4 Differentiation
If f(t) ⇌ F(ω) then

$$\frac{df(t)}{dt} \rightleftharpoons j\omega F(\omega).$$

We prove this by writing down the inverse Fourier transform of F(ω), which is, by definition,

$$f(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} F(\omega)\,e^{j\omega t}\,d\omega.$$

Differentiating with respect to t,

$$\frac{df(t)}{dt} = \frac{1}{2\pi}\int_{-\infty}^{\infty} [j\omega F(\omega)]\,e^{j\omega t}\,d\omega$$

where the right hand side is the inverse Fourier transform of jωF(ω). Finding the Fourier transform of both sides now proves the result.
3.5 Integration
If f(t) ⇌ F(ω) then

$$\int_{-\infty}^{t} f(u)\,du \rightleftharpoons \frac{1}{j\omega}\,F(\omega)$$

provided that F(0) = 0, which implies that $\int_{-\infty}^{\infty} f(t)\,dt = 0$.

The Fourier transform of the integral of f(t) is

$$\int_{-\infty}^{\infty} e^{-j\omega t}\left[\int_{-\infty}^{t} f(u)\,du\right]dt.$$

Integrating by parts gives

$$\left[\frac{1}{-j\omega}\,e^{-j\omega t}\int_{-\infty}^{t} f(u)\,du\right]_{-\infty}^{\infty} - \frac{1}{-j\omega}\int_{-\infty}^{\infty} f(t)\,e^{-j\omega t}\,dt = \frac{1}{j\omega}\,F(\omega)$$

provided that the first term on the right hand side is zero. This will be so if $\int_{-\infty}^{\infty} f(t)\,dt = 0$ — why? In the case $\int_{-\infty}^{\infty} f(t)\,dt \ne 0$, see Haykin, Chapter 2.

In the next chapter we use this formula a great deal, along with the differentiation and time shifting formulae.
3.6 Frequency shifting

If f(t) ⇌ F(ω) then

$$e^{j\omega_0 t}f(t) \rightleftharpoons F(\omega - \omega_0).$$

This is proved in a similar way to the time shifting property.

Example 3.3 Given that f(t) ⇌ F(ω), what is the Fourier transform of f(t) cos ω0t?

Answer Using the fact that

$$\cos\omega_0 t = \frac{e^{j\omega_0 t} + e^{-j\omega_0 t}}{2}$$

and using the frequency shifting property above, we have

$$f(t)\cos\omega_0 t \rightleftharpoons \frac{1}{2}\left[F(\omega - \omega_0) + F(\omega + \omega_0)\right].$$

This example relates to amplitude modulation.
3.7 Convolution

3.7.1 Definition of convolution

Given two functions of time, f1(t) and f2(t), their convolution, written f1 ⋆ f2(τ), is defined as

$$f_1 \star f_2(\tau) = \int_{-\infty}^{\infty} f_1(t)\,f_2(\tau - t)\,dt.$$

Notice that this is a function of τ only. The importance of convolution becomes clear when we find the Fourier transform of f1 ⋆ f2(τ).
3.7.2 Fourier transform of the convolution of two functions

Let us find the Fourier transform of the convolution of two functions of t. This is given by

$$\int_{-\infty}^{\infty} e^{-j\omega\tau}\,f_1 \star f_2(\tau)\,d\tau = \int_{-\infty}^{\infty} e^{-j\omega\tau}\left[\int_{-\infty}^{\infty} f_1(t)\,f_2(\tau - t)\,dt\right]d\tau.$$

Call this expression FTC (Fourier Transform of the Convolution). Swap the order of integration (w.r.t. τ first, then w.r.t. t):

$$\mathrm{FTC} = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f_1(t)\,e^{-j\omega\tau}f_2(\tau - t)\,d\tau\,dt.$$

Since f1(t) depends only on t, we can write this as

$$\mathrm{FTC} = \int_{-\infty}^{\infty} f_1(t)\left[\int_{-\infty}^{\infty} e^{-j\omega\tau}f_2(\tau - t)\,d\tau\right]dt.$$

Now, applying the time shift property to the τ integral, we have

$$\mathrm{FTC} = \int_{-\infty}^{\infty} f_1(t)\,e^{-j\omega t}F_2(\omega)\,dt = F_2(\omega)\int_{-\infty}^{\infty} f_1(t)\,e^{-j\omega t}\,dt,$$

and so

$$\mathrm{FTC} = F_1(\omega)F_2(\omega).$$

Hence,

$$F_1(\omega)F_2(\omega) = \mathcal{F}[f_1 \star f_2]$$

or, in words,

The Fourier transform of the convolution of two functions f1(t) and f2(t) is the product of the Fourier transforms of the individual functions.

This amazing result is known as the Convolution Theorem.
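The theorem can also be demonstrated numerically (this is exactly the setting of problem 9 below). The sketch that follows is our addition, assuming Python with NumPy; it convolves two sampled rectangular pulses and compares transforms at a single frequency:

```python
import numpy as np

N, L = 4000, 8.0
t = (np.arange(N) + 0.5) * (L/N) - L/2             # grid covering (-4, 4)
dt = L / N
f1 = np.where(np.abs(t) < 1.0, 1.0, 0.0)           # two unit-height pulses on (-1, 1)
f2 = f1.copy()

conv = np.convolve(f1, f2, mode="same") * dt       # samples of f1 * f2 (the triangle)

w = 0.9
FT = lambda g: np.sum(g * np.exp(-1j*w*t)) * dt    # Fourier transform at one frequency
print(FT(conv))                                    # F[f1 * f2](w)
print(FT(f1) * FT(f2))                             # F1(w) F2(w): equal, within grid error
```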
Problems, chapter 3
1. Show that the time scaling property is also true for a < 0
2. Prove the result of example 3.1 by transforming the function directly.
3. Prove the frequency shifting property.
4. If f(t) ⇌ F(ω), show that F(0) = 0 implies that $\int_{-\infty}^{\infty} f(t)\,dt = 0$.

5. Find the Fourier transform of

$$f(t) = \begin{cases} (T + t)/T & -T < t < 0 \\ (T - t)/T & 0 < t < T \\ 0 & \text{otherwise.} \end{cases}$$

(a) by direct calculation, and

(b) by finding the Fourier transform of the derivative of f(t) and then using the integration property to find the Fourier transform of f(t).

[Both give $2(1 - \cos\omega T)/\omega^2 T$]
6. Find the Fourier transform of

$$f(t) = \begin{cases} 1 & -T_0/2 < t < T_0/2 \\ 0 & \text{otherwise.} \end{cases}$$

Hence, using the frequency shift property (and doing no integration), show that the Fourier transform of

$$g(t) = \begin{cases} \sin\omega_0 t & -T_0/2 < t < T_0/2 \\ 0 & \text{otherwise} \end{cases}$$

is

$$G(\omega) = \frac{\sin[(\omega - \omega_0)T_0/2]}{j(\omega - \omega_0)} - \frac{\sin[(\omega + \omega_0)T_0/2]}{j(\omega + \omega_0)}.$$

[$F(\omega) = (2/\omega)\sin(\omega T_0/2)$]
7. (i) Use the time scaling property to show that if f(t) ⇌ F(ω), then f(−t) ⇌ F(−ω). (ii) Hence, using the fact that the Fourier transform of f(t) = at, 0 ≤ t ≤ T, is $a[e^{-j\omega T}(1 + j\omega T) - 1]/\omega^2$, find the Fourier transform of

$$g(t) = \begin{cases} -at & -T \le t < 0 \\ at & 0 \le t < T \end{cases}$$

[$G(\omega) = 2a(\cos\omega T + \omega T\sin\omega T - 1)/\omega^2$]
8. Find f ⋆ g(τ) if

(i) $f(t) = \begin{cases} 1 & -T < t < T \\ 0 & \text{otherwise} \end{cases}$ and g(t) = sin ω0t.

(ii) $f(t) = \begin{cases} 1 - t & 0 < t < 1 \\ 0 & \text{otherwise} \end{cases}$ and g(t) = cos ω0t.

[(i) $[\cos\omega_0(\tau - T) - \cos\omega_0(\tau + T)]/\omega_0$, (ii) $[\omega_0\sin\omega_0\tau - \cos\omega_0(\tau - 1) + \cos\omega_0\tau]/\omega_0^2$]
9.* A demonstration of the Convolution Theorem.

(a) Show graphically that the convolution of two rectangular pulses of unit height, stretching between t = −T and t = +T, is given by

$$\text{Convolution} = \begin{cases} \tau + 2T & -2T < \tau < 0 \\ -\tau + 2T & 0 < \tau < 2T \\ 0 & \text{otherwise.} \end{cases}$$

(b) Find the Fourier transform of the convolution in (a).

(c) Hence demonstrate that the convolution theorem is true in this case, i.e., that the Fourier transform of the convolution of the two rectangular pulses = the product of the Fourier transforms of the two rectangular pulses.

[(b) $2(1 - \cos 2\omega T)/\omega^2$ (c) $(2\sin\omega T/\omega)^2$, which is the same, since $1 - \cos 2x = 2\sin^2 x$. Congratulations if you got there.]
Chapter 4
Fourier transforms without integration
Aims
This chapter concentrates on a technique for calculating Fourier
transforms of piecewise polynomial functions which are zero as
t → ±∞, without using integration. There are distinct advantages
to doing it this way, once you have mastered the technique.
I give an outline of the technique here, and some problems for you
to practise on. A detailed explanation and some further worked
examples will be given in lectures.
4.1 Definition
A piecewise polynomial function f(t) is one which can be expressed as a set of polynomials in t, each applying over a different range of t. Examples include

$$f(t) = \begin{cases} 1 & -\tau < t < \tau \\ 0 & \text{otherwise} \end{cases} \qquad\text{(rectangular pulse)}$$

$$f(t) = \begin{cases} \dfrac{T + t}{T} & -T < t < 0 \\ 0 & \text{otherwise} \end{cases} \qquad\text{(half a triangular pulse)}$$

$$f(t) = \begin{cases} at^2 + bt + c & -t_1 < t \le 0 \\ -c & 0 < t \le t_2 \\ t - t^2 & t_2 < t < 5t_2 \\ 0 & \text{otherwise} \end{cases} \qquad\text{(a nasty mess).}$$
The important thing about functions of this type is that by differentiating with respect to t sufficiently many times, nothing remains
except a set of δ-functions and their derivatives, at various times.
Loosely speaking, such functions can be ‘differentiated away’ into
nothing but a set of (derivatives of) δ-functions.
Why do this? The answer is that it is very easy to find the Fourier
transform of a set of δ-functions, and from this the Fourier transform of the original function can be deduced by using the integration
property derived in the previous chapter.
Believe me, this method can often be a lot less trouble than the
alternative — for example, integrating by parts.
4.2 Recipe
Taking as an example the half triangular pulse defined above, the following steps allow us to find its Fourier transform without integrating anything. In order to follow the argument it will help you greatly if you sketch f(t) and its derivatives.

1. Differentiate f(t) w.r.t. t, which gives

$$\frac{df(t)}{dt} = -\delta(t) + \begin{cases} \dfrac{1}{T} & -T < t < 0 \\ 0 & \text{otherwise.} \end{cases}$$

The δ-function arises because f(t) goes instantaneously from 1 to 0 at t = 0.

We haven't differentiated enough yet — the result is not zero everywhere — so. . .

2. . . . differentiate again to get

$$\frac{d^2 f(t)}{dt^2} = \frac{1}{T}\,\delta(t + T) - \frac{d\delta(t)}{dt} - \frac{1}{T}\,\delta(t).$$

3. Now we only have δ-functions and their derivatives, so we are ready to find the Fourier transform of $\frac{d^2 f(t)}{dt^2}$. Using the differentiation property, the Fourier transform of $\frac{d\delta(t)}{dt}$ is jω. (See problem 1.) Using the time shifting property, the Fourier transform of δ(t + T) is $e^{j\omega T}$. Hence,

$$\mathcal{F}\left[\frac{d^2 f(t)}{dt^2}\right] = \frac{1}{T}\,e^{j\omega T} - j\omega - \frac{1}{T} = \frac{e^{j\omega T} - j\omega T - 1}{T}.$$

(Check that the dimensions are consistent.)

4. Finally, to find the Fourier transform of f(t), use the integration property twice, i.e. divide by (jω)², which gives

$$F(\omega) = \mathcal{F}[f(t)] = \frac{1 + j\omega T - e^{j\omega T}}{\omega^2 T}.$$
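To convince yourself that the recipe gives the right answer, compare it with a direct numerical evaluation of the transform of the half triangular pulse. A sketch of ours (assuming Python with NumPy), not part of the original notes:

```python
import numpy as np

T = 1.0
N = 200_000
t = (np.arange(N) + 0.5) * (T / N) - T             # midpoints covering (-T, 0)
dt = T / N
f = (T + t) / T                                    # the half triangular pulse

for w in [0.5, 1.0, 2.0]:
    direct = np.sum(f * np.exp(-1j*w*t)) * dt      # brute-force integral
    recipe = (1 + 1j*w*T - np.exp(1j*w*T)) / (w**2 * T)
    print(np.round(direct, 6), np.round(recipe, 6))   # identical to ~6 decimals
```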
Problems, chapter 4
1.* Show that the Fourier transform of the derivative of the δ-function is jω

(i) by using the differentiation property (easy), and

(ii) by finding the Fourier transform of the derivative of f(t), where

$$f(t) = \begin{cases} \dfrac{1}{T} & -\dfrac{T}{2} < t < \dfrac{T}{2} \\ 0 & \text{otherwise} \end{cases}$$

and then finding the limit of this as T → 0 (harder). Note that f(t) as defined here has unit area, and therefore, as T → 0, tends to δ(t).
2. Using the method outlined in this chapter, find the Fourier transform of

$$f(t) = \begin{cases} h & -T < t < T \\ 0 & \text{otherwise} \end{cases}$$

[$F(\omega) = (2h/\omega)\sin\omega T$]
3. Find the Fourier transform of the function

$$f(t) = \begin{cases} \dfrac{T + t}{T} & -T < t < 0 \\ 1 & 0 < t < T \\ 0 & \text{otherwise} \end{cases}$$

[$F(\omega) = (1 + j\omega T e^{-j\omega T} - e^{j\omega T})/\omega^2 T$]
4. Find the Fourier transform of

$$f(t) = \begin{cases} 1 - \left(\dfrac{t}{T}\right)^2 & -T < t < T \\ 0 & \text{otherwise} \end{cases}$$

[$F(\omega) = 4(\sin\omega T - \omega T\cos\omega T)/\omega^3 T^2$]
5.* Find the Fourier transform of the function

$$f(t) = \begin{cases} \left(\dfrac{t}{T}\right)^3 & 0 \le t \le T \\ 0 & \text{otherwise} \end{cases}$$

[$F(\omega) = [6 - e^{-j\omega T}(6 + 6j\omega T - 3\omega^2 T^2 - j\omega^3 T^3)]/(\omega^4 T^3)$]
6.* Have fun finding the Fourier transform of f(t) as drawn below.

[Sketch: a piecewise linear waveform built from triangular segments of heights a and 2a, with breakpoints at t = 0, 2T, 4T, 5T, 7T, 8T, 9T, 10T, 11T and 13T.]

[$F(\omega) = (-a/\omega^2 T)\left\{1 - 2X^2 + X^4 + X^5 - X^7 - X^8 + 2X^9 - X^{10} - X^{11} + X^{13}\right\}$ where $X = e^{-j\omega T}$]
Chapter 5
Cross correlation and autocorrelation
Cross correlation and autocorrelation are related to convolution, and
in this chapter, we define what they are, explain some of their properties, and give some practical examples of their applications.
5.1 Reminder: complex conjugate
Before we define and discuss correlation, it will be useful to remember the definition of the complex conjugate of a complex number, z. If z = x + jy, then the complex conjugate of z, written z*, is defined as z* = x − jy. If z happens to be written in polar form, that is, z = re^{jθ}, then z* = re^{−jθ}. You might notice that the same rule works in both (in fact, in all) cases: to find the complex conjugate, change the sign of j.

You also need to remember that zz* = |z|² = (x + jy)(x − jy) = x² + y² is the squared modulus of the complex number z, and is ≥ 0 and always real.
5.2 Definitions

Given two real functions of time, f(t) and g(t), say, their cross correlation function, written Corr(f, g)(τ), is defined as

$$\mathrm{Corr}(f, g)(\tau) = \int_{-\infty}^{\infty} f(t)\,g(t + \tau)\,dt. \tag{5.1}$$

This is very similar to the convolution of f and g, but it is not quite the same — the argument of the second function is t + τ and not τ − t.

The autocorrelation function is simply the cross correlation of a real function with itself, in other words

$$\text{Autocorrelation of } f = \mathrm{Corr}(f, f)(\tau) = \int_{-\infty}^{\infty} f(t)\,f(t + \tau)\,dt.$$
You might expect, because of the similarity of correlation to convolution, that there should be a correlation theorem which is like the Convolution Theorem (see section 3.7.2), and indeed this is the case. Here it is.

Let us try to find FC, the Fourier transform of the cross correlation of two functions, f(t) and g(t). By definition, this is

$$\mathrm{FC} = \int_{-\infty}^{\infty} e^{-j\omega\tau}\left[\int_{-\infty}^{\infty} f(t)\,g(t + \tau)\,dt\right]d\tau.$$

Changing the order of integration, we have

$$\mathrm{FC} = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{-j\omega\tau}f(t)\,g(t + \tau)\,d\tau\,dt.$$

Substituting u = t + τ, so dτ = du, we find

$$\mathrm{FC} = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{-j\omega(u - t)}f(t)\,g(u)\,du\,dt = \int_{-\infty}^{\infty} e^{-j\omega u}g(u)\,du \times \int_{-\infty}^{\infty} e^{j\omega t}f(t)\,dt.$$

The first part is clearly the Fourier transform of g(t), G(ω). The second is not quite F(ω) because f(t) is multiplied by e^{+jωt} and not e^{−jωt}. So what is it? Provided f(t) is real, and bearing in mind the "change the sign of j" rule, you should be able to see that it is the complex conjugate of F(ω), i.e. F*(ω). Hence

$$\mathrm{Corr}(f, g)(\tau) \rightleftharpoons F^*(\omega)G(\omega)$$

are a Fourier transform pair: this is the Correlation Theorem.

Setting f(t) = g(t), we have

$$\mathrm{Corr}(f, f)(\tau) \rightleftharpoons F^*(\omega)F(\omega) = |F(\omega)|^2,$$

where the function |F(ω)|² is always real, greater than or equal to zero, and is known as the power spectral density. It is a measure of how the power in a signal is distributed over different frequencies.
5.3 Correlation properties

We give four properties here, and prove three of them.

5.3.1 The autocorrelation function is always even
It is easy to show that the autocorrelation function is always an even function of τ. By definition,

$$\mathrm{Corr}(f, f)(\tau) = \int_{-\infty}^{\infty} f(t)\,f(t + \tau)\,dt$$

and so

$$\mathrm{Corr}(f, f)(-\tau) = \int_{-\infty}^{\infty} f(t)\,f(t - \tau)\,dt.$$

Substituting u = t − τ in the above gives

$$\mathrm{Corr}(f, f)(-\tau) = \int_{-\infty}^{\infty} f(u + \tau)\,f(u)\,du = \mathrm{Corr}(f, f)(\tau).$$

Hence, we have Corr(f, f)(−τ) = Corr(f, f)(τ) and so Corr(f, f)(τ) is an even function of τ.
5.3.2 Calculating cross correlation either way round

We show that

$$\mathrm{Corr}(f, g)(\tau) = \mathrm{Corr}(g, f)(-\tau),$$

which is a useful property to bear in mind since sometimes it is easier to calculate the correlation one way round than the other.

Starting from the definition, we have

$$\mathrm{Corr}(f, g)(\tau) = \int_{-\infty}^{\infty} f(t)\,g(t + \tau)\,dt$$

and substituting u = t + τ, we have

$$\mathrm{Corr}(f, g)(\tau) = \int_{-\infty}^{\infty} f(u - \tau)\,g(u)\,du.$$

By the definition of correlation, this is just Corr(g, f)(−τ): the −τ term comes from the fact that the argument of f is u − τ and not u + τ.

It is now easy to see (again) that the autocorrelation function is an even function of τ — substituting g = f in the above, we have Corr(f, f)(τ) = Corr(f, f)(−τ).
5.3.3 The maximum of the autocorrelation function

We do not prove this property, but state it thus: for any τ,

$$\mathrm{Corr}(f, f)(0) \ge \mathrm{Corr}(f, f)(\tau),$$

i.e. the autocorrelation function has a maximum at τ = 0. There may be other maxima for τ > 0, but they will never be greater than the one at τ = 0.
5.3.4 Autocorrelation of a periodic function

Let f(t) be a periodic function with period T0, so f(t + T0) = f(t). Then the autocorrelation function of f(t) is also periodic, with the same period.

This is easily proved as follows. From the definition

$$\mathrm{Corr}(f, f)(\tau) = \int_{-\infty}^{\infty} f(t)\,f(t + \tau)\,dt$$

we have

$$\mathrm{Corr}(f, f)(\tau + T_0) = \int_{-\infty}^{\infty} f(t)\,f(t + \tau + T_0)\,dt$$

which, since f(t) has period T0, is equal to

$$\int_{-\infty}^{\infty} f(t)\,f(t + \tau)\,dt = \mathrm{Corr}(f, f)(\tau).$$
5.4 Worked examples

Example 5.1 Define

$$r(t) = \begin{cases} 1 & 0 \le t \le T \\ 0 & \text{otherwise} \end{cases}$$

and let f(t) = sin ωt. Then

$$\mathrm{Corr}(r, f)(\tau) = \int_{-\infty}^{\infty} r(t)\,f(t + \tau)\,dt = \int_{0}^{T} 1 \times \sin\omega(t + \tau)\,dt$$

$$= -\frac{1}{\omega}\cos\omega(t + \tau)\Big|_{0}^{T} = \frac{\cos\omega\tau - \cos\omega(T + \tau)}{\omega}.$$
Example 5.2 Now for a slightly harder example. Find Corr(f, f)(τ) where

$$f(t) = \begin{cases} e^{-kt} & t \ge 0 \\ 0 & \text{otherwise} \end{cases}$$

Plots of f(t), f(t + τ) for τ < 0, and f(t + τ) for τ > 0 are shown in the figure below. Note carefully that, when τ < 0, f(t) is shifted to the right, and when τ > 0, it is shifted to the left.

[Figure 5.1: The function f(t) used in example 5.2 — top: f(t); middle: f(t + τ) for τ < 0, which starts at t = −τ > 0; bottom: f(t + τ) for τ > 0, which starts at t = −τ < 0.]

Case 1: τ < 0.

We first work out Corr(f, f)(τ) when τ < 0. Notice from figure 5.1 (top) that f(t) = 0 for t < 0, and so f(t + τ) = 0 for t < −τ. Why −τ? Because τ is negative: but from the figure, you can see that f(t + τ) = 0 for t less than some positive number. It would be easy to get this wrong, and sketching a figure helps to avoid falling into a trap.
Hence, for τ < 0, we have
$$\mathrm{Corr}(f, f)(\tau) = \int_{-\infty}^{\infty} f(t)\,f(t + \tau)\,dt = \int_{-\tau}^{\infty} e^{-kt}\,e^{-k(t + \tau)}\,dt = \int_{-\tau}^{\infty} e^{-2kt}\,e^{-k\tau}\,dt$$

$$= -\frac{e^{-k\tau}}{2k}\,e^{-2kt}\Big|_{-\tau}^{\infty} = \frac{e^{k\tau}}{2k}.$$

Case 2: τ > 0.

We now work out Corr(f, f)(τ) when τ > 0. It is still true that f(t) = 0 for t < 0, but now we have f(t + τ) = 0 for t < −τ, the logic being the same as before — see the bottom panel of figure 5.1.

Hence, for τ > 0, we have

$$\mathrm{Corr}(f, f)(\tau) = \int_{-\infty}^{\infty} f(t)\,f(t + \tau)\,dt = \int_{0}^{\infty} e^{-kt}\,e^{-k(t + \tau)}\,dt = \int_{0}^{\infty} e^{-2kt}\,e^{-k\tau}\,dt$$

$$= -\frac{e^{-k\tau}}{2k}\,e^{-2kt}\Big|_{0}^{\infty} = \frac{e^{-k\tau}}{2k}.$$

Note that the lower limit on the integral was 0 in this case.

Summarising,

$$\mathrm{Corr}(f, f)(\tau) = \begin{cases} \dfrac{e^{k\tau}}{2k} & \tau \le 0 \\[2mm] \dfrac{e^{-k\tau}}{2k} & \tau > 0. \end{cases}$$

[Figure 5.2: The autocorrelation function Corr(f, f)(τ), k = 1, from example 5.2. Note that Corr(f, f)(τ) is an even function of τ.]

Notice, too, from figure 5.2, that Corr(f, f)(τ) is an even function of τ, and its maximum is at τ = 0, as discussed in section 5.3.
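Before moving on, it is reassuring to reproduce this result numerically. The sketch below is our addition (assuming Python with NumPy, not part of the original notes); it approximates the correlation integral on a grid and compares it with e^{−k|τ|}/(2k):

```python
import numpy as np

k, dt = 1.0, 0.001
t = (np.arange(20_000) + 0.5) * dt                 # f is zero for t < 0 and ~0 beyond t = 20
f = np.exp(-k * t)
n = f.size

for tau in [0.0, 0.3, 1.0, 2.0]:
    s = int(round(tau / dt))
    corr = np.sum(f[:n - s] * f[s:]) * dt          # ~ integral of f(t) f(t + tau) dt
    print(tau, round(corr, 5), round(np.exp(-k*tau) / (2*k), 5))
# negative tau need not be computed: by section 5.3.1 the result is even in tau
```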
Example 5.3 Show that the Autocorrelation Theorem is true for the function
f (t) defined in example 5.2.
We’ve done the hard work, which was to compute the autocorrelation function. The Autocorrelation Theorem tells us that the Fourier transform of
Corr(f, f )(τ ) is the same as the modulus of the Fourier transform of f (t),
squared.
To check this, first find the Fourier transform of Corr(f, f)(τ), which is

$$\int_{-\infty}^{\infty} e^{-j\omega\tau}\,\mathrm{Corr}(f, f)(\tau)\,d\tau = \frac{1}{2k}\int_{-\infty}^{0} e^{-j\omega\tau}e^{k\tau}\,d\tau + \frac{1}{2k}\int_{0}^{\infty} e^{-j\omega\tau}e^{-k\tau}\,d\tau$$

$$= \frac{1}{2k}\int_{-\infty}^{0} e^{(k - j\omega)\tau}\,d\tau + \frac{1}{2k}\int_{0}^{\infty} e^{-(k + j\omega)\tau}\,d\tau = \frac{e^{(k - j\omega)\tau}}{2k(k - j\omega)}\Bigg|_{-\infty}^{0} - \frac{e^{-(k + j\omega)\tau}}{2k(k + j\omega)}\Bigg|_{0}^{\infty}$$

$$= \frac{1}{2k(k - j\omega)} + \frac{1}{2k(k + j\omega)} = \frac{1}{2k}\cdot\frac{2k}{k^2 + \omega^2} = \frac{1}{k^2 + \omega^2}.$$

Let us now find the Fourier transform of f(t), which is

$$F(\omega) = \int_{-\infty}^{\infty} e^{-j\omega t}f(t)\,dt = \int_{0}^{\infty} e^{-j\omega t}e^{-kt}\,dt = -\frac{e^{-(k + j\omega)t}}{k + j\omega}\Bigg|_{0}^{\infty} = \frac{1}{k + j\omega}.$$

Now,

$$|F(\omega)|^2 = F(\omega)F^*(\omega) = \frac{1}{k + j\omega} \times \frac{1}{k - j\omega} = \frac{1}{k^2 + \omega^2}$$

which is indeed equal to the Fourier transform of Corr(f, f)(τ).
5.5 Power and energy signals

At this point, it will be useful to mention the difference between power signals and energy signals.

An energy signal, v(t) say, contains a finite amount of energy, E. Mathematically, this means that $E = \int_{-\infty}^{\infty} |v(t)|^2\,dt$ is finite. Hence, all signals that last a finite time, for example, rectangular or triangular pulses, are energy signals; but so, too, are signals like f(t) in example 5.2, which has infinite duration, but decreases rapidly enough as t → ∞ that its integral is finite.

By contrast, a power signal, v(t), 'goes on for ever' and thus contains an infinite amount of energy, although the power is finite. Examples include sin t, cos t or any periodic function that has a Fourier series. It makes sense here to define power by

$$P = \lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{T} |v(t)|^2\,dt.$$

The definition for cross correlation given earlier in equation (5.1) is correct for energy signals but not for power signals. For power signals f(t) and g(t), say, we instead calculate the correlation by

$$\mathrm{Corr}(f, g)(\tau) = \lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{T} f(t)\,g(t + \tau)\,dt. \tag{5.2}$$
5.6 Correlation demonstrations
We now look at two demonstrations of correlation, their purpose
being to give a feel for how correlation is useful in practical situations.
5.6.1 Cross correlation
We compute the cross correlation of two signals g(t) and h(t), each of
which consists of a sum of four sine waves. Three of the frequencies
are different in each case, but g(t) and h(t) also contain one common
frequency. Cross correlation picks out the period of this common
frequency.
[Figure 5.3: Cross correlation being used to pick out an underlying common periodic signal in the presence of other periodic signals — top: g(t); middle: h(t); bottom: Corr(g, h)(τ), with peaks at τ = 84, 204, 324, 444, a spacing of 120.]

Specifically, in figure 5.3,

$$g(t) = 0.1\sin\frac{2\pi t}{120} + 0.5\sin\frac{2\pi t}{17} + 0.7\sin\frac{2\pi t}{59} + 1.0\sin\frac{2\pi t}{173}$$

and

$$h(t) = 0.05\sin\frac{2\pi t}{120} + 0.4\sin\frac{2\pi t}{31} + 0.8\sin\frac{2\pi t}{131} + 1.2\sin\frac{2\pi t}{203}$$

although these exact details are not important — just the fact that the signals contain one common frequency and the other three are unrelated. The common frequency, 2π/120, corresponds to a period of 120, and this component also happens to have a rather smaller amplitude than the other terms. It would be difficult to pick out what this period actually is by eye — see figure 5.3, top (g(t)) and middle (h(t)) panels.

The bottom of figure 5.3 shows the cross correlation Corr(g, h)(τ), calculated using equation (5.2), since these are power signals. As can easily be read from the figure, the common period of the two signals, 120, is also the period of Corr(g, h)(τ).
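The following sketch (ours, assuming Python with NumPy) repeats the essence of this demonstration. The record is made long so that the averaging in equation (5.2) suppresses the unrelated components; the exact peak position wobbles a little with the record length, but it sits close to the common period:

```python
import numpy as np

t = np.arange(1_000_000, dtype=float)              # long record: approximates T -> infinity
g = (0.1*np.sin(2*np.pi*t/120) + 0.5*np.sin(2*np.pi*t/17)
     + 0.7*np.sin(2*np.pi*t/59) + 1.0*np.sin(2*np.pi*t/173))
h = (0.05*np.sin(2*np.pi*t/120) + 0.4*np.sin(2*np.pi*t/31)
     + 0.8*np.sin(2*np.pi*t/131) + 1.2*np.sin(2*np.pi*t/203))

taus = np.arange(60, 181)                          # search near the expected period
corr = np.array([np.mean(g[:g.size - s] * h[s:]) for s in taus])
print(taus[np.argmax(corr)])                       # close to 120, the common period
```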
5.6.2 Autocorrelation
Autocorrelation can be used to pick out the period (and hence the frequency) of a periodic signal buried in noise, and figure 5.4 illustrates this. The signal is f(t) = 1.4 r(t) + 0.05 sin(2πt/T0), where r(t) consists of normally distributed random numbers with standard deviation approximately 0.3 — see the Theoretical Distributions chapter for what this means. Note that the periodic signal has much smaller amplitude than the noise. The signal is plotted in the upper half of figure 5.4. You are unlikely to be able to pick out by eye the sine wave in the presence of this much noise (whose amplitude is about 28 times bigger than the periodic signal). However, the autocorrelation function, shown in the lower half, reveals that there is an underlying periodicity, and furthermore, that the period is about 120 units.
[Figure 5.4: The autocorrelation function being used to pick out an underlying periodic signal in the presence of noise — top: the signal f(t); bottom: Corr(f, f)(τ), with clear peaks at τ = 0, 120, 240, 360 and 480.]
5.6.3 Practical applications
More details will be given in a lecture, but some examples of practical applications of autocorrelation and cross correlation are:

• Loudspeaker evaluation. White noise is fed into a loudspeaker and a microphone is placed to pick up the sound from the speaker. The autocorrelation function of the output of the microphone shows the resonant frequencies of the speaker and its housing.

• Leak location. Two sound sensors are attached to a buried water pipe, one on the upstream side of a leak and one on the downstream side. Assume that the velocity of sound along the pipe is known. The cross correlation of the two signals is calculated, the peak of which then gives the value of x − y, where x and y are the distances between the sensors and the leak. The distance x + y can be measured directly — it is the distance between the sensors. From this, x and y can be found and hence the leak can be located.

• Correlation is also used in cross correlation flow meters, GPS, and multipath interference measurements.
Problems, chapter 5
1. Given two functions of time, f (t) and g(t), write down the definition of
their convolution, f ⋆ g(τ ) and their correlation, Corr(f, g)(τ ). Directly
from these definitions, show that, if g(t) is an even function of time,
then
f ⋆ g(τ ) = Corr(f, g)(−τ ).
2. Define the following functions:

$$r(t) = \begin{cases} 1 & 0 \le t \le T \\ 0 & \text{otherwise} \end{cases} \qquad q(t) = \begin{cases} t & 0 \le t \le T \\ 0 & \text{otherwise} \end{cases}$$

$$c(t) = \cos\omega t, \qquad s(t) = \sin\omega t$$

where T and ω are positive constants. Calculate

(i) Corr(r, c)(τ)

(ii) Corr(q, s)(τ)

(iii) Corr(q, q)(τ) (Hint: follow example 5.2.)

[(i) $[\sin\omega(T + \tau) - \sin\omega\tau]/\omega$ (ii) $[\sin\omega(T + \tau) - \sin\omega\tau - \omega T\cos\omega(T + \tau)]/\omega^2$ (iii) $(2T^3 + 3\tau T^2 - \tau^3)/6$ for $-T \le \tau < 0$; $(2T^3 - 3\tau T^2 + \tau^3)/6$ for $0 \le \tau < T$; 0 otherwise]
3.* (i) For r(t) as defined in the previous question, show that

$$\mathrm{Corr}(r, r)(\tau) = \begin{cases} T + \tau & -T \le \tau \le 0 \\ T - \tau & 0 < \tau \le T \\ 0 & \text{otherwise} \end{cases}$$

(ii) Check that the Autocorrelation Theorem is true in this case by finding the Fourier transform of Corr(r, r)(τ), and also of r(t), squaring the latter, and comparing.

[$R(\omega) = (1 - e^{-j\omega T})/(j\omega)$; $|R(\omega)|^2 = 2(1 - \cos\omega T)/\omega^2$, which is also the f.t. of Corr(r, r)(τ)]
4. For power signals, for instance, sin t and cos t, the autocorrelation is defined as
Corr(f, f)(τ) = lim_{T→∞} (1/2T) ∫_{−T}^{T} f(t) f(t + τ) dt.
(If we did not divide by 2T, the answer would usually be infinite for power signals.)
Use this definition to calculate Corr(f, f)(τ) when f(t) = cos t.
Hint: cos x cos y = ½ cos(x + y) + ½ cos(x − y)
[(1/2) cos τ]
Chapter 6
Introductory probability
6.1 Definition of probability
In everyday English, we have many different ways of saying how likely
an event is, e.g.
It will certainly rain today.
It will probably rain today.
It may/might rain today.
It is unlikely to rain today.
It will not rain today.
These are five ways of saying roughly how likely it is to rain. They are not quantitative though — no numbers are put on the likelihood of rain. The part of mathematics that deals with how likely something is, is called probability.
Example 6.1 In tossing a fair coin, there are two possible outcomes: heads
or tails. The probability of a head is
P(head) = (No. of outcomes that result in a head)/(total number of possible outcomes) = 1/2.
Example 6.2 Walkers Crisps claim to have put a cheque for £10,000 in
‘selected packets’. Suppose there are 10 cheques in 8,000,000 packets. The
probability of buying a winning packet is
P(win) = (No. of outcomes that result in a win)/(total number of possible outcomes) = 1/800,000
. . . so you can be fairly certain you won't be lucky.
Both these probabilities are obtained ‘in the limit’ as the coin is
tossed more and more times, or more and more crisps are bought. For
instance, if you were to toss the coin 1,000,000 times, you would be
fairly unlikely to get exactly 500,000 heads (about one time in 1,253
— we shall see how to calculate this from the Binomial distribution)
— but you would expect to obtain around 500,000 almost all the
time. To estimate experimentally the probability of a head, you
would need to find
P(head) = lim_{N→∞} (No. of heads)/N, where N is the number of times the coin has been tossed.
In calculating probabilities, you need to be careful to evaluate all
possibilities, as illustrated below.
Example 6.3 A coin is tossed three times. What is the probability that
(a) heads are obtained twice and tails, once? (b) the result is heads, heads,
tails, in that order? (c) If at least two are heads, what is the probability that
all are heads?
Answer. First write down all possible outcomes, which are hhh, hht, hth,
htt, thh, tht, tth, ttt (8 in all). (a) The outcomes that consist of two heads
and one tail are hht, hth, thh so
P(2 heads, 1 tail) = 3/8.
(b) There is only one possibility in this case: hht. Hence,
P(hht in that order) = 1/8.
(c) Outcomes in which there are at least two heads are hhh, hht, hth, thh (4 in all). Of these, only 1 is all heads, so
P(hhh given at least two heads) = 1/4.
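Counting arguments like these are easy to check by brute force when the number of outcomes is small. The following C sketch enumerates all 2³ = 8 outcomes of three coin tosses (one bit per toss is just a convenient encoding) and reproduces the three probabilities above.

    /* Enumerate all 2^3 outcomes of three coin tosses and count events.
       Bit i of the integer `toss` represents toss i (1 = head, 0 = tail). */
    #include <stdio.h>

    int main(void)
    {
        int twoHeads = 0, hht = 0, atLeastTwo = 0, allHeads = 0;

        for (int toss = 0; toss < 8; toss++) {
            int heads = (toss & 1) + ((toss >> 1) & 1) + ((toss >> 2) & 1);
            if (heads == 2)
                twoHeads++;                    /* (a) two heads, one tail   */
            if (toss == 3)                     /* bits 011: h, h, then t    */
                hht++;                         /* (b) h, h, t in that order */
            if (heads >= 2) {
                atLeastTwo++;
                if (heads == 3)
                    allHeads++;                /* (c) numerator             */
            }
        }
        printf("P(2 heads, 1 tail)   = %d/8\n", twoHeads);
        printf("P(hht in that order) = %d/8\n", hht);
        printf("P(hhh | >= 2 heads)  = %d/%d\n", allHeads, atLeastTwo);
        return 0;
    }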
We can summarise the above examples in the following definition:
If all outcomes of an experiment are (a) equally likely and (b) mutually exclusive, the probability of an event E is
P(E) = (number of outcomes favourable to E)/(total number of outcomes).    (6.1)
In these examples, we have calculated the probability of an event E,
which we shall write P (E). By convention
• P (E) = 1 means E is certain to happen
• P (E) = 0 means it is certain not to happen.
Hence, all probabilities must lie between 0 and 1. Furthermore, let
us write the probability of an event E not happening as P (not E).
Then
P (E) + P (not E) = 1.
This equation is saying that the probability of an event happening,
plus the probability of it not happening, is one. It is certain that it
either happens or it doesn’t.
Example 6.4 A fair die is thrown. The probability of throwing a 4 is P(4) = 1/6. The probability of not throwing a 4 is P(not 4) = P(1) + P(2) + P(3) +
P (5) + P (6) = 5/6. So
P (4) + P (not 4) = 1/6 + 5/6 = 1.
6.2 Addition of probabilities — mutually exclusive case
We can take this further. To do so, we need to know that, if A and
B are mutually exclusive events — ones that cannot both happen
— then the probability of A or B happening, written P (A or B),
= P (A) + P (B). Examples of mutually exclusive events are: tossing
a coin — the outcome can only be heads or tails; rolling a die — the
outcome can be precisely one of the integers 1 . . . 6, so an outcome
of, say, 4 precludes any other outcome.
For general n, rather than just 2 events, the addition formula becomes
P (E1 or E2 . . . or En) = P (E1) + P (E2) + . . . + P (En) (6.2)
or, in words,
The probability of event E1 or E2 or . . . En happening, where
E1 . . . En are mutually exclusive events, is the sum of the
individual probabilities P (E1), P (E2). . . P (En).
Example 6.5 Two dice are rolled. What is the probability that they both
show the same number?
Answer.
Let E1 be the event that both dice show 1; E2, that they both
show 2, etc. Then P(E1) = 1/36 (one outcome favourable to E1 out of 6² = 36 possible outcomes). Similarly, P(E2) = . . . = P(E6) = 1/36. Now
E1 . . . E6 are mutually exclusive events — both dice showing 3, say, excludes
any other outcome — so the probability of obtaining the same number is
1/36 + 1/36 + 1/36 + 1/36 + 1/36 + 1/36 = 6/36 = 1/6.
6.3 Addition of probabilities — general case
If E1 and E2 are now any two (i.e. not necessarily mutually exclusive)
events then
P(E1 or E2) = P(E1) + P(E2) − P(E1 and E2).    (6.3)
The proof of this is easily seen by using a Venn diagram — figure 6.1.
The diagram is drawn inside a rectangle whose area is 1. Then P (E1)
is the area of the circle labelled E1 and P (E2) is the area of the circle
labelled E2. You should be able to see that the sum P(E1) + P(E2) counts the area common to E1 and E2 — the intersection in figure 6.1 — twice. Therefore the total area enclosed by E1 and E2 is given by P(E1) + P(E2) − P(E1 and E2).
Figure 6.1: The proof of formula (6.3): two overlapping regions, labelled E1 and E2, whose intersection is the event 'E1 and E2'.
Example 6.6 A card is drawn at random from a standard pack of cards.
What is the probability that the card drawn will be a diamond or an ace?
Answer:
P (diamond) = 13/52 = 1/4, P (ace) = 4/52 = 1/13 and
P (diamond and ace) = P (ace of diamonds) = 1/52.
Hence P (diamond or ace) = 1/4 + 1/13 − 1/52 = 4/13.
6.4 Multiplication of probabilities
If two events A and B are independent, then one event has no effect
on the other. We now calculate the probability that two independent
events both happen.
Example 6.7 Suppose we toss a coin and roll a die. What is the probability
of the coin showing heads and the die showing 3?
Answer We assume that for the coin, P (heads) = 1/2 and for the die,
P (3) = 1/6. We also know that there are 6 × 2 = 12 possible outcomes,
only one of which is the desired one of heads and 3. Hence, we would expect
P (heads and 3) = 1/12. But this result is also given by P (heads and 3) =
P (heads) × P (3) = 1/12.
This is a general rule for ‘independent AND events’:
For two independent events A and B, with probabilities P (A)
and P (B) respectively, the probability
P (A and B) = P (A)P (B).
(6.4)
Problems, chapter 6
1. One card is taken randomly from a shuffled pack of 52. What is the
probability that it is (a) an ace? (b) a spade? (c) a red queen?
[(a) 1/13 (b) 1/4 (c) 1/26]
2. A cow has two calves. If male and female calves are equally likely, what is
the probability that (a) both calves are female? (b) there is at least one
male? (c) Given that there is at least one male, what is the probability
that both are male?
[(a) 1/4 (b) 3/4 (c) 1/3]
3. Two dice are rolled. What is the probability of obtaining (a) two sixes?
(b) two even numbers? (c) both numbers greater than or equal to 5?
[(a) 1/36 (b) 1/4 (c) 1/9]
4. In a large batch of resistors, it is found that 1 in 50 is out of tolerance;
for capacitors, it is found that 1 in 21 is out of tolerance. If I build an
RC filter using one resistor and one capacitor, what are the probabilities
that (a) the resistor, (b) the capacitor and (c) both components I select
will be in tolerance?
[(a) 49/50 (b) 20/21 (c) 14/15]
5. By drawing the analogous diagram to figure 6.1, show that for THREE
(not necessarily mutually exclusive) events,
P (E1 or E2 or E3 ) = P (E1) + P (E2) + P (E3 ) − P (E1 and E2)
−P (E1 and E3 ) − P (E2 and E3) + P (E1 and E2 and E3).
6. In the Italian Superenalotto, in order to win the jackpot, you have to
match 6 different numbers drawn from the range 1–90 inclusive. What
are the odds of winning the jackpot?
[Rather low at 1/622,614,630]
7. A die is rolled three times. What is the probability that the sum of the
numbers obtained is (i) 3? (ii) 4? (iii) 5?
[(i) 1/216, (ii) 3/216 = 1/72, (iii) 6/216 = 1/36]
8. From a standard pack of cards, two are drawn at random without replacement. What is the probability that both cards are face cards (jack,
queen or king of any of the four suits)?
[11/221]
9. A biscuit tin contains 100×2p coins, 50×5p coins and 30×10p coins.
Two coins are drawn from the tin at random and not replaced. What is
the probability that their combined value is greater than 10p?
[329/1074]
10. Two cards are taken successively, without replacement, from a standard
pack of 52. What is the probability that (a) both cards are greater than
3 and less than 9? (b) The first card is an ace and the second is a face
card? (c) The cards drawn are an ace and a face card (in either order)?
[(a) 95/663, (b) 4/221, (c) 8/221]
Chapter 7
Discrete variables: the p.d.f., mean and variance
Aims
By the end of this chapter, you should understand the terms
• random variable
• normalisation
• probability density function (p.d.f.) f (x)
• cumulative distribution function (c.d.f.) F (x)
• mean and standard deviation
and be able to calculate the last three.
7.1 Random variables
We have already come across the idea of a random variable in the
previous chapter. For instance, the number shown when a die is
rolled is a random variable, because we have no way of predicting it
in advance.
The actual number a fair die will show (let us call it x) is unknown
in advance, but this does not mean to say we cannot say something
about it: for instance,
• x lies between 1 and 6 (1 ≤ x ≤ 6)
• x is equally likely to be 1, 2, 3, 4, 5 or 6
• if we roll the die 1,000 times and calculate the average of the
numbers obtained, it is likely to be around 3.5
These items are what we could call statistical properties of the random variable x, the number shown by a die. We could do experiments
to verify that x actually has these statistical properties. (What sort
of experiments might we do?)
7.2 Definitions: a set of N discrete values
We define below the terms mean, standard deviation and variance
as applied to a set of N discrete values, x1, x2, . . . xN .
7.2.1 Mean of x, x̄
To calculate the mean of a set of N discrete values, add them up and divide by N:
x̄ = (1/N) Σ_{i=1}^{N} xᵢ.    (7.1)
7.2.2 Standard deviation of x, σx
To calculate the standard deviation of a set of N discrete values of x, first find the mean, x̄. Then the standard deviation is given by
σx = √[ (1/(N − 1)) Σ_{i=1}^{N} (xᵢ − x̄)² ].    (7.2)
The reason we divide by N − 1 and not N is subtle and has to
do with the fact that the formula, as given, generally gives a better
approximation to the standard deviation of the whole population,
even though you’re only looking at a sample of size N : this will be
further explained in a lecture.
If all the numbers xi were very close to the mean, then the standard
deviation would be small, so σx can be seen as a measure of how
widely scattered around the mean the values are. It is in fact the
root mean square (r.m.s.) deviation from the mean.
People sometimes also talk about the variance; this is defined as σx2 .
Example 7.1 Two dice were rolled 12 times and the sums of the two numbers
were: 12, 8, 5, 10, 8, 8, 9, 9, 5, 3, 4, 10. What are the mean and standard
deviation of these results?
Answers The mean is the sum of the numbers divided by 12, which is 91/12 (= 7.58). The standard deviation is the square root of
(1/11) { (12 − 91/12)² + (8 − 91/12)² + . . . + (10 − 91/12)² }
which is 2.75.
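Sums like this are tedious by hand. The C sketch below implements equations (7.1) and (7.2) directly for the data of example 7.1 and reproduces the answers.

    /* Mean and sample standard deviation of the twelve dice sums
       from example 7.1, using equations (7.1) and (7.2).          */
    #include <stdio.h>
    #include <math.h>

    int main(void)
    {
        const double x[] = {12, 8, 5, 10, 8, 8, 9, 9, 5, 3, 4, 10};
        const int N = sizeof x / sizeof x[0];

        double sum = 0.0;
        for (int i = 0; i < N; i++)
            sum += x[i];
        double mean = sum / N;                 /* equation (7.1) */

        double ss = 0.0;
        for (int i = 0; i < N; i++)
            ss += (x[i] - mean) * (x[i] - mean);
        double sigma = sqrt(ss / (N - 1));     /* equation (7.2) */

        printf("mean = %.2f, sigma = %.2f\n", mean, sigma);
        /* prints: mean = 7.58, sigma = 2.75 */
        return 0;
    }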
7.3 The probability density function
The random variable obtained by rolling a fair die is rather special, in
that it is equally likely to be any of the integers (whole numbers) 1 –
6. What about random variables that do not have this ‘equally likely’
property? For example, suppose the random variable x is obtained
by rolling a die twice and adding up the two numbers shown. What
can we say about this random variable?
We can easily see that the random variable obtained by adding the
numbers in this experiment does not have the ‘equally likely’ property. For instance, there is only one way that x = 2: when the first
throw gives 1 and the second throw also gives 1. There are, however, three ways that the result x = 10 can be obtained: 4,6 5,5 and 6,4. We would therefore expect to observe x = 10 more often than
x = 2 in a large number of trials. It is an important general principle
in probability that the more ways there are of obtaining a result, the
more likely that result is.
The information about how likely different results are is best displayed as a bar chart¹. Along the x-axis we plot the independent
variable, x, the sum of the two numbers in this example, and along
the y-axis we plot f (x), the probability of result x. So, for instance,
there are 3 ways to obtain x = 10. There are 6 × 6 = 36 different
possible outcomes from rolling a die twice, so f (10) = 3/36.
The function f (x) as we have defined it, is known as the probability
density function, abbreviated to p.d.f. Figure 7.1 explains how to
plot a bar chart of the p.d.f. for the two dice experiment. The steps
are
1. List all possible outcomes.
2. Calculate x for each one.
3. Calculate the relative frequency of each outcome — that is,
count how many times each different outcome is obtained and
divide by the number of possible outcomes (36 in this case).
4. Plot this number against x.
7.3.1 Normalisation
You will notice in figure 7.1 that the relative frequency is plotted
on the vertical axis, i.e. the frequency divided by the number of
possible outcomes. Suppose we add up all the values of f (x), that
is, we calculate f (2) + f (3) + . . . + f (12): what do we obtain?
¹A bar chart is a graph in which the variable plotted along the x-axis is discrete and so the y variable is plotted as a series of vertical bars of the appropriate height.
Figure 7.1: How to calculate the probability density function for the sum of numbers shown by rolling a die twice. (List all 36 possible outcomes (1,1), (1,2), . . . , (6,6); add each pair to get the sum x; then plot the relative frequency of each sum, which rises from 1/36 at x = 2 to 6/36 at x = 7 and falls back to 1/36 at x = 12.)
(1 + 2 + 3 + 4 + 5 + 6 + 5 + 4 + 3 + 2 + 1)/36 = 1
Is it a coincidence that these numbers add up to one? No! — to
see that it isn’t, recall the addition of probabilities formula for the
mutually exclusive case (6.2) in the previous chapter. That formula
applies here, since in the ‘roll a die twice’ experiment, the events ‘sum
of the numbers = x’ and ‘sum of the numbers = y’ are mutually
exclusive if x ≠ y. Now, the outcome of the experiment must be
precisely one of the numbers 2 . . . 12, and so the sum of the individual
probabilities of these numbers must be one.
A p.d.f. will always be normalised if we plot it in the way described
above, so the general rule is that all the probabilities added together
equal one, i.e.
Σᵢ f(xᵢ) = 1
where Σᵢ means 'sum over all relevant values of i'.
Example 7.2 On the axes below, plot the p.d.f. for x = the number of heads
obtained when 4 coins are tossed. Check that the probabilities add up to one.
(Blank axes, f(x) against x, left for you to fill in.)
7.3.2 Other names for p.d.f.
There are various different names for the p.d.f., including frequency
function, probability density and probability function, so be aware
of this when reading textbooks.
7.4 What does the p.d.f. mean?
We have looked at the p.d.f. for two examples, both of which are
discrete. That is, the variable we have called x only takes on integer
values. (You can never roll two dice and add the numbers up to get
3.4, neither can you obtain 1.5 heads in a coin tossing experiment.)
We will look at the p.d.f. in continuous cases in the next chapter.
Look back at figure 7.1. We have calculated the p.d.f. for the sum of
the numbers shown by two dice. Two questions we might ask are:
1. What does the bar chart mean?
2. How would we plot the bar chart experimentally for the two dice?
The answer to the first question is that f (x) is the relative frequency
of the value x — i.e. the number of times x occurs, divided by the
total number of observations. So for instance, looking at figure 7.1, we
see that a sum of 5 is twice as likely as a sum of 3, since f (5) = 4/36
and f (3) = 2/36. The p.d.f. cannot tell you what the next outcome
will be (because it is random) but it can tell you the probability of a
particular outcome. The p.d.f. also enables us to calculate numbers
like the mean and standard deviation, as we shall see in section 7.6.
Strictly speaking, the answer to the second question is to take two
dice and roll them N times, adding up the numbers shown and
recording them. We would then plot a bar chart of the number of
times we had obtained the result 2, 3, . . . 12, normalised by dividing
by N . The question remains though, how large does N have to be to
obtain an accurate result? I am not going to do this experiment in
the lecture — unless N only needs to be some small number like 20
say, you would get bored and so would I — but I can do a computer
simulation that amounts to the same thing. Using the C random
number generator, drand48(), I can simulate a die being thrown and
so produce the data for N = 200, 000 in about 90 milliseconds (on
my computer).
The table below shows the results. The p.d.f. has been normalised, by
dividing by N in each case, and then the result has been multiplied
by 36 so that the numbers agree with those in figure 7.1.
x = sum        p.d.f. for N =
of numbers     20        200       200,000
 2             0.0/36    0.54/36   1.0237/36
 3             1.8/36    2.16/36   1.9973/36
 4             3.6/36    3.42/36   2.9916/36
 5             5.4/36    2.52/36   4.0104/36
 6             5.4/36    4.14/36   5.0197/36
 7             9.0/36    5.58/36   5.9967/36
 8             3.6/36    7.02/36   4.9948/36
 9             1.8/36    3.60/36   3.9470/36
10             3.6/36    4.14/36   2.9921/36
11             3.6/36    2.16/36   2.0284/36
12             0.0/36    0.90/36   0.9985/36
Figure 7.2: Finding the p.d.f. for the two dice problem by computer simulation. (Three bar charts of f(x), in 36ths, for 20, 200 and 200,000 throws.)
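The simulation itself takes only a few lines of C. The sketch below is a reconstruction of the experiment described above, not the original program: it rolls a pair of simulated dice N times using drand48(), normalises the counts by N and scales by 36, so the output is directly comparable with figure 7.1.

    /* Simulating the two-dice p.d.f. with the C random number generator
       drand48(), as described in the text.                              */
    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        const long N = 200000;        /* number of double rolls */
        long count[13] = {0};         /* count[x] for x = 2..12 */

        srand48(1);                   /* any fixed seed will do */
        for (long i = 0; i < N; i++) {
            int d1 = 1 + (int)(6 * drand48());
            int d2 = 1 + (int)(6 * drand48());
            count[d1 + d2]++;
        }
        /* normalise by N, then scale by 36, as in the table above */
        for (int x = 2; x <= 12; x++)
            printf("%2d  %7.4f/36\n", x, 36.0 * count[x] / N);
        return 0;
    }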
7.5 The cumulative distribution function
As discussed in the previous section, the p.d.f., f (xi ) gives us the
probability that x = xi exactly. In many instances we want to know
something different, but related: what is the probability that x is less
than or equal to a given value? For instance, tubes of Smarties might
nominally contain 40, but in fact can contain anything between 37
and 44. We might want to know the probability that a tube contains
fewer than 39. As another example, we know the probability of
rolling a die and obtaining a given number — it is 1/6 (if the die is
fair) — but what about the probability that the result is less than,
say, 4?
Both these questions can be answered if we know the cumulative
distribution function, c.d.f., F (x), and F (x) can be easily worked
out if we know the p.d.f. As an example, let us calculate F (x) = the
probability that the number shown by a fair die is less than or equal
to x. We know the p.d.f. for this problem: it is f (x) = 1/6 for 1 ≤
x ≤ 6 and f (x) = 0 otherwise. Hence, F (1), the probability that
x ≤ 1 is 1/6; F (2), the probability that x ≤ 2 is 1/6 + 1/6 = 1/3;
F (3) = 1/6 + 1/6 + 1/6 = 1/2 and so on.
From this, you should be able to see that given f(x), we can calculate F(x) by
F(xᵢ) = probability that outcome x ≤ xᵢ = Σ_{xⱼ ≤ xᵢ} f(xⱼ).
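In code, F(x) is just a running sum of f(x). A minimal C sketch for the fair-die example above; it prints F(1) = 1/6, F(2) = 1/3, and so on up to F(6) = 1.

    /* Building the c.d.f. F(x) from the p.d.f. f(x) for a fair die:
       F is the running sum of f.                                    */
    #include <stdio.h>

    int main(void)
    {
        double f[7] = {0, 1.0/6, 1.0/6, 1.0/6, 1.0/6, 1.0/6, 1.0/6};
        double F = 0.0;

        for (int x = 1; x <= 6; x++) {
            F += f[x];                       /* F(x) = f(1) + ... + f(x) */
            printf("F(%d) = %.4f\n", x, F);
        }
        return 0;
    }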
7.6 Mean & standard deviation: when the p.d.f. is known
Suppose now, instead of giving you a list of numbers, I give you a plot
or a table of the p.d.f., f (x). It is possible to calculate directly from
this what the mean and standard deviation for a very large number
of observations would be.
7.6.1 Mean, x̄
Example 7.3 Let x be the number of heads obtained when three coins are
tossed. What is the mean value of x?
Answer First work out the p.d.f., f (x). You should be able to show that
f (0) = 1/8, f (1) = f (2) = 3/8 and f (3) = 1/8. (For other values of x,
f (x) = 0.) These figures could also be calculated using the binomial distribution — see the Theoretical Distributions chapter.
How many times, on average, will we obtain 2 heads? We know that f (2) =
3/8, so if we toss the three coins 8, 000 times, say, we would expect about
3/8 × 8, 000 = 3, 000 of these to result in 2 heads. Similarly, we would expect
to get 3 or 0 heads about 1,000 times each, and 1 head about 3,000 times.
The average number of heads per toss will therefore be
(0 × 1,000 + 1 × 3,000 + 2 × 3,000 + 3 × 1,000)/8,000 = 3/2 heads per toss.
From the above example, we deduce that the general formula for the
mean when the p.d.f. is known is
x̄ = Σᵢ xᵢ f(xᵢ)    (7.3)
7.6.2 Standard deviation, σx
The calculation of the standard deviation is done in the same way:
σx = √[ Σᵢ (xᵢ − x̄)² f(xᵢ) ]    (7.4)
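As a check on these formulas, the C sketch below computes the mean and standard deviation of the two-dice sum directly from its p.d.f.; writing the bar heights of figure 7.1 as f(x) = (6 − |x − 7|)/36 is just a compact encoding, and the output also confirms the answer quoted for problem 2 below.

    /* Mean and standard deviation of the two-dice sum, computed straight
       from the p.d.f. via equations (7.3) and (7.4).                     */
    #include <stdio.h>
    #include <stdlib.h>
    #include <math.h>

    int main(void)
    {
        double mean = 0.0, var = 0.0;

        /* f(x) = (6 - |x - 7|)/36 encodes the bar heights of figure 7.1 */
        for (int x = 2; x <= 12; x++)
            mean += x * (6.0 - abs(x - 7)) / 36.0;            /* (7.3) */
        for (int x = 2; x <= 12; x++)
            var += (x - mean) * (x - mean)
                   * (6.0 - abs(x - 7)) / 36.0;               /* (7.4) */

        printf("mean = %.2f, sigma = %.2f\n", mean, sqrt(var));
        /* prints: mean = 7.00, sigma = 2.42 */
        return 0;
    }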
Problems, chapter 7
1. Ten resistors, nominally 1kΩ, are measured and their values are found to
be 996, 1001, 1023, 997, 1004, 1010, 1008, 996, 990, 1007 Ω. Calculate
the mean and standard deviation of these values.
[R = 1003.2Ω, σR = 9.4Ω]
2. In the two dice experiment, calculate the mean and standard deviation
of x, where x is the sum of the two numbers. (Hint: figure 7.1 shows the
p.d.f.)
[x = 7, σx = 2.42]
3. Sketch the p.d.f. for
x = the number of heads − the number of tails
when four coins are tossed.
[f (4) = f (−4) = 1/16, f (2) = f (−2) = 4/16, f (0) = 6/16]
4. My research on Smarties indicates that the p.d.f., f (x) = the probability
that a tube contains x Smarties is as follows:
x      36    37    38    39    40    41    42    43
f(x)  1/12  1/12  2/12  2/12  3/12  1/12  1/12  1/12
For other values of x, f(x) = 0. Plot a bar chart of the c.d.f. for this
problem. What is the probability that a tube contains (a) 39 or fewer
(b) 41 or more Smarties?
[(a) 1/2, (b) 1/4]
5. A money box contains 80 × 10p and 120 × 20p coins. Two coins are taken
out at random without replacement. Calculate the p.d.f. f (x), where x
is the total monetary value of the coins taken out. Sketch this p.d.f. in
the form of a bar chart. If this experiment is repeated many times, what
is the mean value of the money withdrawn per experiment?
[f (20) = 0.1588, f (30) = 0.4824, f (40) = 0.3588; x̄ = 32p]
Chapter 8
Continuous distributions
Aims
By the end of this chapter, you should know about
• the p.d.f., f (x), and cumulative distribution function, c.d.f., F (x),
for continuous variables
• how to calculate the mean and standard deviation for continuous
variables
• applications to noise.
8.1 Continuous random variables
In the previous chapters we have looked at random variables x which
take on a discrete set of values, such as the number obtained by rolling
dice and so on. In this chapter we turn our attention to continuous
variables, i.e. ones which can take a continuous set of values — all
values in a range. Examples include
• the values of resistors whose nominal value is, say, 1MΩ — the
actual value might lie anywhere between about 0.9 and 1.1MΩ
(assuming 10% tolerance).
• the voltage produced by a noise source, sampled at discrete time
intervals.
8.2 Those definitions again
Before we discuss the p.d.f., we will define the mean and standard
deviation for continuous variables.
8.2.1 Mean of x, x̄
Suppose x(t) is a variable (e.g. voltage or current) that depends on
time. Then, by analogy with equation 7.1, the mean of x(t) in the
range 0 ≤ t ≤ T is
x̄ = (1/T) ∫₀^T x(t) dt.    (8.1)
In practice, T will often be set by the response time of the measuring
instrument.
To see why this definition is reasonable, remember that the integral
of a function between limits 0 and T is the area between a graph of
the function and the horizontal axis, with areas below the axis being
negative, and with t ranging from 0 to T . Suppose we were to squash
the graph of the function into a rectangular shape, but with the same
area and width (T ) as before. Then the height of this rectangle is
just x — which is an intuitively reasonable way to define the mean.
8.2.2 Standard deviation of x, σx
Similarly, by analogy with equation 7.2, we have
σx² = (1/T) ∫₀^T (x(t) − x̄)² dt.    (8.2)
Note that if x̄ = 0 then
σx² = (1/T) ∫₀^T x(t)² dt = mean value of x(t)².
8.3 Application to signal power
You are probably familiar with the fact that the power delivered by
a voltage of the form V₁ sin ωt to a load R is V₁²/2R. This is because in general,
signal power, P = (mean square voltage, v̄²)/(load resistance, R).
In the case of a sine wave, therefore, equation 8.1 can be used to find
the mean square voltage v̄². In this case, it is sensible to integrate
over one complete cycle, although the integral over any number of
complete cycles would give the same answer (why?). This, when
divided by R, gives the power:
v̄² = (1/T) ∫₀^T V₁² sin²(2πt/T) dt = (V₁²/2)(1/T) ∫₀^T (1 − cos 4πt/T) dt
= (V₁²/2)(1/T)[T] = V₁²/2,  so signal power = V₁²/2R.
Now, notice that if the mean of a signal v̄ = 0 then the variance, σv², is the same as the mean square (see equation 8.2). Hence, the total signal power delivered to a load R by a signal with zero mean and variance σv² is given by σv²/R. This is sometimes useful for noise power calculations.
8.4 The p.d.f. for continuous variables
Look back at figure 7.1. The bar chart of f (x) gives the probability
of obtaining the sum x when two dice are rolled, so that, for instance,
the probability that x = 1 is f (1) = 0, and the probability that x = 4
is f (4) = 3/36. In fact, the probability of obtaining x is equal to the
height of the strip, provided that the bar chart has been normalised.
The p.d.f. in the continuous case is the limit of the bar chart as the
width of the strips tends to zero and the number of measurements
tends to infinity. Unless we are prepared to do an infinite number
of measurements, therefore, we can only ever find (discrete) approximations to the p.d.f. by means of an experiment, and we discuss the
experimental techniques involved in the next chapter.
Example 8.1 What is the p.d.f. for the random numbers generated by the C
random number generator drand48()?
Answer The C random number generator generates pseudo-random numbers,
x, in the range 0–1. We would expect them to be uniformly distributed over
that range, so that there would be, in the long run, roughly the same number
lying in the range 0 – 0.1 as in the range 0.47 – 0.57 say. In other words,
we might expect the p.d.f., f(x), to be something like
f(x) = 1 for 0 ≤ x < 1, 0 otherwise.
Note that
• this is a continuous distribution (x can have any value between 0 and 1);
• it is normalised, that is
Z
∞
f (x)dx = 1.
−∞
You can easily write a program to produce data for a bar chart, with a given
number of strips, by generating a given number of random numbers.
Some sample results are shown in figure 8.1.
Figure 8.1: Visualising the p.d.f. for the C random number generator, using different numbers of strips and random numbers. (Three panels: the infinite limit, f(x) = 1 on 0 ≤ x < 1; 50 strips with 100,000 numbers; and 10 strips with 2,000 numbers.)
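The program suggested in example 8.1 might look like the C sketch below (10 strips and 2,000 numbers, to match the bottom panel of figure 8.1; both counts are adjustable). Each bar height is the fraction of samples in a strip divided by the strip width, so the area under the bar chart is one.

    /* Histogram ("bar chart") of drand48() output, as suggested in
       example 8.1.                                                  */
    #include <stdio.h>
    #include <stdlib.h>

    #define STRIPS 10
    #define NUM    2000

    int main(void)
    {
        long count[STRIPS] = {0};

        srand48(1);
        for (long i = 0; i < NUM; i++)
            count[(int)(STRIPS * drand48())]++;

        for (int k = 0; k < STRIPS; k++) {
            double width = 1.0 / STRIPS;
            /* normalised height: fraction of samples / strip width */
            printf("%.2f-%.2f  %.3f\n",
                   k * width, (k + 1) * width,
                   (double)count[k] / NUM / width);
        }
        return 0;   /* heights should all be close to 1 */
    }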
8.5 The c.d.f., F(x)
We defined the c.d.f., F (x), in the previous chapter. The definition
there was given as a sum for a discrete distribution, so you should
not be surprised that it is an integral for continuous distributions:
F(x₀) = probability that x ≤ x₀ = ∫_{−∞}^{x₀} f(x) dx.    (8.3)
Look at figure 8.2. This illustrates how we would calculate the probability that a measurement of the random variable x lies between x₀ and x₁:
P(x₀ ≤ x ≤ x₁) = ∫_{x₀}^{x₁} f(x) dx
provided that f(x) has been normalised. So, from the definition of F(x), we have
P(x₀ ≤ x ≤ x₁) = F(x₁) − F(x₀)
— see figure 8.2.
Figure 8.2: The probability that x lies between x₀ and x₁ is the area under the p.d.f. between x₀ and x₁.
Example 8.2 Find the probability that none of the three light bulbs in a
spotlight array will have to be replaced during the first 1,200 hours of use if
the lifetime of a light bulb can be modelled as a random variable with p.d.f.
given by
f(x) = 6(−x² + 3x − 2) for 1 ≤ x ≤ 2, 0 otherwise,
where x is measured in units of 1,000 hours.
Answer We are dealing with independent ‘and’ events here (one light bulb
does not affect another), so we need to find 1 − F (1.2), the probability that a
bulb is still working after 1.2 thousand hours. (It’s 1 − F (1.2) because F (1.2)
is the probability that a bulb has stopped working by 1,200 hours.)
The probability that all three are still working is then (1 − F(1.2))³.
We know the p.d.f., so bearing in mind equation 8.3,
F(1.2) = ∫_{−∞}^{1.2} f(x) dx = ∫₁^{1.2} 6(−x² + 3x − 2) dx = 13/125
which gives a probability of (1 − F(1.2))³ = 0.72, or 72% that all three are
still working after 1,200 hours.
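Had the integral been awkward, a numerical check would do. This C sketch integrates the given p.d.f. with the trapezium rule (step size and method are arbitrary choices) and reproduces F(1.2) ≈ 0.104 and the 72% answer.

    /* Numerical check of example 8.2: compute F(1.2) by integrating the
       p.d.f. with the trapezium rule, then the probability that all
       three bulbs survive.                                              */
    #include <stdio.h>
    #include <math.h>

    static double pdf(double x)            /* 6(-x^2 + 3x - 2) on [1,2] */
    {
        return 6.0 * (-x * x + 3.0 * x - 2.0);
    }

    int main(void)
    {
        const int    n = 1000;
        const double a = 1.0, b = 1.2, h = (b - a) / n;

        double F = 0.5 * (pdf(a) + pdf(b));
        for (int i = 1; i < n; i++)
            F += pdf(a + i * h);
        F *= h;                            /* F(1.2) ~ 13/125 = 0.104 */

        double p = pow(1.0 - F, 3.0);      /* all three still working */
        printf("F(1.2) = %.4f, P(all working) = %.2f\n", F, p);
        return 0;
    }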
8.6 Definitions: when the p.d.f. is known
If you look back at section 7.4, you will remember that we said that
the p.d.f. in the discrete case tells us a lot about the outcome of an
experiment. This is equally true in the continuous case, so that the
mean and standard deviation can easily be calculated, just as in the
discrete case.
8.6.1 Mean of x, x̄
x̄ = ∫_{−∞}^{∞} x f(x) dx    (8.4)
8.6.2 Standard deviation of x, σx
σx² = ∫_{−∞}^{∞} (x − x̄)² f(x) dx    (8.5)
8.6.3 The mean of any function of x
From the previous two subsections, it should come as no surprise to
you that you can calculate the mean of any function, G(x) say, from
G̅(x) = ∫_{−∞}^{∞} G(x) f(x) dx.    (8.6)
For instance, if you wanted to know the mean value of x³, this can be computed from
mean of x³ = ∫_{−∞}^{∞} x³ f(x) dx.
Example 8.3 What are the mean and standard deviation of the numbers produced by the C random number generator?
Answer Assume that the p.d.f. is
f(x) = 1 for 0 ≤ x ≤ 1, 0 otherwise.
Then
x̄ = ∫_{−∞}^{∞} x f(x) dx = ∫₀¹ x × 1 dx = [x²/2]₀¹ = 1/2
and
σx² = ∫_{−∞}^{∞} (x − x̄)² f(x) dx = ∫₀¹ (x − ½)² dx = [x³/3 − x²/2 + x/4]₀¹ = 1/12
so the standard deviation, σx, is √(1/12).
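You can confirm this experimentally: generate a large sample from drand48() and compare the sample mean and standard deviation with 1/2 and 1/√12 ≈ 0.2887. A minimal C sketch:

    /* Monte Carlo check of example 8.3: the sample mean and standard
       deviation of drand48() output approach 1/2 and 1/sqrt(12).     */
    #include <stdio.h>
    #include <stdlib.h>
    #include <math.h>

    int main(void)
    {
        const long N = 1000000;
        double sum = 0.0, sumsq = 0.0;

        srand48(1);
        for (long i = 0; i < N; i++) {
            double x = drand48();
            sum   += x;
            sumsq += x * x;
        }
        double mean  = sum / N;
        double sigma = sqrt(sumsq / N - mean * mean);

        printf("mean  = %.4f (theory 0.5000)\n", mean);
        printf("sigma = %.4f (theory %.4f)\n", sigma, 1.0 / sqrt(12.0));
        return 0;
    }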
Problems, chapter 8
1. A voltage v(t) is given by
v(t) = V0 + V1 sin ωt
(a) What is the mean, v?
(b) What is the standard deviation, σv ?
(c) What is the mean power delivered by this signal to a load of resistance
R?
[(a) V₀, (b) V₁/√2, (c) (V₀² + V₁²/2)/R]
2. In example 8.2 show that the p.d.f. given for light bulb failures is normalised.
3. A particular noise voltage v(t) has p.d.f.
f(v) = 1/(v ln 2) for 1 ≤ v ≤ 2, 0 otherwise
Calculate (a) the mean, (b) the standard deviation and (c) the average
power delivered to a 50Ω load.
[(a) 1.44V, (b) 0.29V, (c) 43mW]
4. The shelf-life, x, in months, of batteries can be modelled as a random variable with p.d.f.
f(x) = a/(x + 5)³ for x ≥ 0, 0 otherwise
(a) Find the value of a.
(b) Find the probability that a single battery will have a shelf-life of (i)
at least 20 months and (ii) anywhere between 10 and 40 months.
[(a) 50, (b)(i) 1/25 or 4%, (ii) 8/81 or 9.9%]
5. The waiting time in a Post Office queue, in minutes, x, is modelled as a continuous random variable with cumulative distribution function
F(x) = 1 − e^{−x/4} for x ≥ 0, 0 otherwise
(a) Calculate the probability of waiting (i) less than 12 minutes, (ii) more
than 5 minutes, and (iii) between 2 and 4 minutes.
(b) Derive the p.d.f., f (x), and hence calculate the mean waiting time,
x̄.
[(a)(i) 0.95, (ii) 0.29, (iii) 0.24, (b) x̄ = 4 minutes]
6.∗ The lifetime of a light bulb is a random variable with p.d.f. given by
f(x) = x − 1 for 1 ≤ x ≤ 2; 3 − x for 2 ≤ x ≤ 3; 0 otherwise
(x is measured in 1000 hours.)
(a) Sketch the p.d.f.
(b) Sketch the probability that a bulb has stopped working after a time
t, with t in the range 0–4000 hours.
(c) Calculate the probability that a bulb has stopped working after 2200
hours.
(d) A circuit consists of two such bulbs in (i) series and (ii) in parallel.
What is the probability that these arrangements are open circuit after
2,200 hours of operation?
[(c) 68% (d) (i) 90% (ii) 46%]
Chapter 9
Theoretical distributions
Aims
By the end of this chapter, you should know about
• the Gaussian (also known as normal) distribution and its properties
• the Poisson distribution and its properties
• the Binomial distribution and its properties.
9.1 The Gaussian distribution
Suppose that we were to measure the actual resistance of a large
number (say 1,000) resistors whose value was supposed to be 1kΩ.
We should not expect to obtain 1,000 values of exactly 1kΩ, because
the manufacturing process for resistors isn’t perfect. In other words,
we would expect to obtain some resistances greater than 1kΩ, some
less.
I have actually done these measurements, for 75 rather than 1,000
resistors, and a bar chart of the results is shown in figure 9.1. The
shape is more-or-less what you might have expected: a large number
around the middle and fewer further away. The mode (most popular value), however, is not 1kΩ, perhaps unexpectedly: it is 987.6Ω,
which provides evidence that the bridge I used needs re-calibrating.
R (Ω)    Count
980.6      1  *
982.0      1  *
983.4      3  ***
984.8      4  ****
986.2      8  ********
987.6     17  *****************
989.0     15  ***************
990.4     16  ****************
991.8      3  ***
993.2      1  *
994.6      4  ****
996.0      0
997.4      1  *
Mean = 988.6, Std. dev. = 2.93
Figure 9.1: Non-normalised bar chart for the resistance of 74 nominally 1kΩ resistors, measured on an RLC bridge.
I actually measured 76 resistors, one of which was 99.6Ω (I imagine
that it had escaped from the 100Ω drawer) and I also discounted a
single 1009Ω resistor from the calculations, on the grounds that it is
an exceptionally high value, known as an ‘outlier’.
The bar chart in figure 9.1 is an approximation to the p.d.f. f (R) for
the various values of resistance R, except that as shown it has not
been normalised: to normalise, we would need to divide each column height by 74 (the number of resistors) × 1.4Ω (the width of each class).
That would ensure that the area under the bar chart is one.
It is an experimental fact that the p.d.f. of a wide range of measurements of a variable subject to random errors is found to be well
approximated by a particular curve. The p.d.f. in question has a
particular mathematical form (see below) and data that follows this
description is said to be Gaussian or normally distributed.
The continuous distribution with p.d.f.
f(x) = (1/(σ√(2π))) e^{−(x−x̄)²/2σ²}    (9.1)
is known as the Gaussian or normal distribution. It crops up all over
the place. Some points to note are:
• The mean of the random variable x is just x̄
• The standard deviation is σ
• The distribution is normalised, that is, ∫_{−∞}^{∞} f(x) dx = 1
• The curve is symmetrical about x = x̄ and bell-shaped
It is useful to know how different values of the parameters x̄ and σ affect the shape of the Gaussian curve, and this is illustrated in figure 9.2. Notice that the smaller σ is, the narrower and higher the curve is; if it is narrower, it must also be higher because the total area has to be 1 (normalisation). Note also that the peak of the curve occurs at x = x̄ — that is, the most likely value (the mode) is also the mean: not all distributions have this property.
9.2 The Gaussian probability distribution
Look back at equation 8.3. That tells us that if a random variable x
is normally distributed, the probability that its value is less than a
number x₁, P(x ≤ x₁), is given by
F(x₁) = P(x ≤ x₁) = (1/(σ√(2π))) ∫_{−∞}^{x₁} e^{−(x−x̄)²/2σ²} dx    (9.2)
This defines the cumulative distribution function, F (x), for a Gaussian p.d.f. We can calculate the probability that x lies between x1
and x2 — it is given by F (x2) − F (x1).
80
2.0
1.5
f(x)
σ = 0.25
1.0
σ = 0.5
0.5
σ=
0.0
–2.0
–1.0
0.0
x
1.0
1
2.0
Figure 9.2: The Gaussian p.d.f. for x = 0 and different values of σ.
Unfortunately, the integral in (9.2) can’t be expressed in terms of
known functions like sine, exp, log etc. and so we generally have to
find its value from tables. If you look at the integral in 9.2 you will see
that its value depends on three parameters: x̄, σ and x₁. Obviously
it would be impractical to compile tables for all possible values of
these parameters, and this is not necessary. Instead we can use a
single table, which is on page 177 and which gives the area under the
Gaussian curve between 0 and z, where
z = |x − x̄|/σ
is the normalised variable. From this you can calculate F (z) in all
cases, as set out below.
Example 9.1 The amplitude of a noise voltage v is normally distributed with
mean 0.8V and variance 0.25V². What is the probability that the voltage
sampled at a particular instant (a) lies between 1 and 2V? (b) is less than
1.5V?
Answer (a) The probability required is P(1 ≤ v ≤ 2). We need to translate these values of v into the normalised variable z. First, the standard deviation σ = √0.25 = 0.5V. Given the mean and standard deviation, we can see that
v = 1V corresponds to z = 0.4 and v = 2V corresponds to z = 2.4. From
the table on page 177, F (0.4) = 0.5 + 0.1554, F (2.4) = 0.5 + 0.4918. Hence,
P (1 ≤ v ≤ 2) = 0.9918 − 0.6554 = 0.3364, which is the required answer. The
two numbers are subtracted because the two voltages, 1 and 2V, are on the
same side of the mean.
(b) Here we need to find P (−∞ ≤ v ≤ 1.5). The two values of v correspond
to z = −∞ and z = 1.4 respectively. We need to find the area under the
Gaussian curve between z = −∞ and z = 1.4. Since F (z) is symmetrical
about z = 0, and F (∞) = 1 (normalisation), we know that F (0) = 0.5.
Hence, the area between z = −∞ and z = 1.4 = 0.5 + area between z = 0
and z = 1.4, so P (−∞ ≤ v ≤ 1.5) = 0.5+F (1.4) = 0.9192. The two numbers
are added because the two voltages −∞ and 1.5V are on opposite sides of
the mean.
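Tables are the traditional route, but the C maths library's erf() gives the same areas directly, since the cumulative Gaussian is Φ(z) = ½(1 + erf(z/√2)). This sketch reproduces both answers of example 9.1.

    /* Checking example 9.1 with the C library's erf():
       Phi(z) = 0.5 * (1 + erf(z / sqrt(2))).           */
    #include <stdio.h>
    #include <math.h>

    static double Phi(double z)
    {
        return 0.5 * (1.0 + erf(z / sqrt(2.0)));
    }

    int main(void)
    {
        const double mean = 0.8, sigma = 0.5;     /* volts */

        /* (a) P(1 <= v <= 2) */
        double za = (1.0 - mean) / sigma;         /* 0.4 */
        double zb = (2.0 - mean) / sigma;         /* 2.4 */
        printf("(a) %.4f\n", Phi(zb) - Phi(za));  /* 0.3364 */

        /* (b) P(v <= 1.5) */
        double zc = (1.5 - mean) / sigma;         /* 1.4 */
        printf("(b) %.4f\n", Phi(zc));            /* 0.9192 */
        return 0;
    }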
9.3 The Poisson distribution
This is a discrete distribution and has many applications, e.g. in
computer networks and queueing theory.
The distribution arises in situations where a series of independent
random events occurs, and the probability of a single such event
occurring within a small time interval is proportional to the length
of that interval. In fact, it applies not just to events that happen
in time, but events distributed within any region. For example, the
Poisson distribution allows us to answer questions such as
• If an office receives on average 100 telephone calls per hour, what
is the probability of exactly 210 calls being received in a given two hour period? [2.2%]
• If a cyclist gets a flat tyre once every 5,000 miles on average,
what is the probability of having no flat tyres in 10,000 miles?
[13.5%]
• A typist makes an average of one mistake per page; what is
the likelihood of picking a page at random that contains three
mistakes? [6.1%]
We now derive the Poisson p.d.f.
Suppose that at the book issue desk of a library, the probability that
one person arrives during a small time interval from 0 to δt is λδt,
with λ a constant equal to the average number of arrivals per unit
time. We want to calculate Pn(t), which is the probability of exactly
n arrivals during a time interval t.
We can calculate this probability by considering Pn(t +δt), the probability of exactly n arrivals during the interval t + δt. For n > 0 this
is the sum of the probabilities of two mutually exclusive events, i.e.
[n arrivals in t and none in δt] and [n − 1 arrivals in t and 1 in δt].
That is
Pn(t + δt) = Pn(t) × P0(δt) + Pn−1(t) × P1(δt)
where we have assumed that δt is so small that P2(δt) ≈ 0.
Now, the probability of one arrival in δt is P1(δt) = λδt by definition,
so the probability of no arrivals in δt, P0(δt) = 1 − λδt. Using these
values in the above equation gives
Pn(t + δt) = Pn(t)(1 − λδt) + Pn−1(t)λδt.
Hence
[Pn(t + δt) − Pn(t)]/δt = dPn(t)/dt = λ[Pn−1(t) − Pn(t)]    (9.3)
where we have taken the limit as δt → 0. This is actually a differential-difference equation for Pn(t), which we can solve.
First let us consider n = 0. In 9.3, P−1(t) = 0 — we cannot have
−1 arrivals during any time interval, so 9.3 becomes
dP0(t)/dt = −λP0(t)
which has the solution P0(t) = P0(0)e^{−λt}. Since P0(0), the probability of no arrivals in zero time, is unity, this simplifies to
P0(t) = e^{−λt}.
Knowing P0(t), we can use this in 9.3 with n = 1 to find P1(t), which
turns out to be
P1(t) = λte^{−λt}
(see problems in the chapter entitled ‘Differential Equations and the
Laplace Transform’). Solving 9.3 for successive values of n gives
Pn(t) = ((λt)ⁿ/n!) e^{−λt}
which is the probability of exactly n arrivals in a time t when the
mean arrival rate is λ. This is the Poisson distribution.
Example 9.2 During the 8 hour period that a library is open, a total of 960
people join the queue at the book issues desk. (a) What is the average rate
at which people arrive in the queue, in people/minute? (b) What are the
probabilities of exactly 0, 1, 2, and 3 people arriving in the queue during any
given minute?
Answer (a) 8 hours = 480 minutes; hence λ = 960/480 = 2 people/minute
is the average arrival rate.
(b) ‘Per unit time’ in the context of this problem means ‘per minute’. The
Poisson distribution tells us that
P0 = ((λ×1)⁰/0!) e^{−λ×1} = 1 × e^{−2} = 13.5%
P1 = ((λ×1)¹/1!) e^{−λ×1} = 2 × e^{−2} = 27.1%
P2 = ((λ×1)²/2!) e^{−λ×1} = 2 × e^{−2} = 27.1%
P3 = ((λ×1)³/3!) e^{−λ×1} = 1.33 × e^{−2} = 18.0%
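These values are easy to generate in C: the recurrence P_{n+1} = P_n × λt/(n + 1) avoids factorials altogether. The sketch below reproduces the four percentages of example 9.2.

    /* Poisson probabilities P_n(t) = ((lambda t)^n / n!) e^{-lambda t}
       for the library queue of example 9.2: lambda = 2/minute, t = 1.  */
    #include <stdio.h>
    #include <math.h>

    int main(void)
    {
        const double lt = 2.0 * 1.0;       /* lambda * t       */
        double p = exp(-lt);               /* P_0(t)           */

        for (int n = 0; n <= 3; n++) {
            printf("P%d = %.1f%%\n", n, 100.0 * p);
            p *= lt / (n + 1);             /* P_{n+1} from P_n */
        }
        return 0;  /* prints 13.5%, 27.1%, 27.1%, 18.0% */
    }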
The Poisson distribution has mean λt, standard deviation √(λt) (see problems) and is of course normalised so that
Σ_{n=0}^{∞} Pn(t) = 1.
9.4 The binomial distribution
The binomial distribution is another discrete distribution. It applies
to situations with a discrete number of possible outcomes. It is defined as follows:
If the probability of an event occurring is p, and of it
not occurring is q (so q = 1 − p), then the probability
that the event will happen k times out of n is given
by the (k + 1)-th term in the binomial expansion of
(q + p)ⁿ.
Recall that
(q + p)ⁿ = qⁿ + nq^{n−1}p + (n(n − 1)/2!) q^{n−2}p² + . . . + (n!/((n − k)!k!)) q^{n−k}p^k + . . . + pⁿ.    (9.4)
Example 9.3 A die is rolled 4 times. What is the probability of obtaining
(a) 2 (b) 4 sixes?
Answer This is a classic example of the sort of problem to which the binomial
distribution applies. In this case, let p be the probability of obtaining a six with
one throw of the die, so p = 1/6, and so q, the probability of not obtaining a
six, is 5/6. Using the binomial expansion, we obtain
(5/6 + 1/6)⁴ = (5/6)⁴ + 4(5/6)³(1/6) + 6(5/6)²(1/6)² + 4(5/6)(1/6)³ + (1/6)⁴
so the probability of 2 sixes is 6 × (5/6)² × (1/6)² ≈ 11.6%. The probability of 4 sixes is (1/6)⁴ (as you'd expect) ≈ 0.08%.
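Evaluating the expansion term by term in C gives all five probabilities at once; the sketch below reproduces the two answers of example 9.3.

    /* Binomial probabilities P(k sixes in n rolls) = C(n,k) p^k q^(n-k)
       for the die of example 9.3 (p = 1/6, n = 4).                      */
    #include <stdio.h>
    #include <math.h>

    static double choose(int n, int k)     /* binomial coefficient C(n,k) */
    {
        double c = 1.0;
        for (int i = 1; i <= k; i++)
            c = c * (n - k + i) / i;
        return c;
    }

    int main(void)
    {
        const int    n = 4;
        const double p = 1.0 / 6.0, q = 1.0 - p;

        for (int k = 0; k <= n; k++)
            printf("P(%d sixes) = %6.3f%%\n",
                   k, 100.0 * choose(n, k) * pow(p, k) * pow(q, n - k));
        return 0;   /* k = 2 gives 11.574%, k = 4 gives 0.077% */
    }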
Problems, chapter 9
For areas under the Gaussian curve, see the table on page 177.
1. Resistor values are found to be normally distributed with mean R and
standard deviation σ. In a large batch of resistors of the same nominal
value, what percentage would be expected to lie within (a) ±σ, (b) ±2σ
of R? (c) Exactly half of all resistors lie within ± how many σ of R?
[(a) 68%, (b) 95%, (c) 0.67σ]
2. Nominally 1µF capacitors are found to have values that are normally
distributed. They are marked as being of ±5% tolerance, but 20% are
found to be outside this range. What is the standard deviation of the
production spread?
[0.039µF]
3. A d.c. signal of 100mV has added noise whose amplitude p.d.f. is Gaussian with zero mean and variance 10⁻⁶V². This signal is fed into a D.V.M.
with resolution 1mV, which rounds to the nearest 0.5mV. Calculate the
probability of the meter reading 101mV.
[0.2417]
4. In a binary signal, levels of 0 and 100mV correspond to logic 0 and 1
respectively. Suppose this signal has Gaussian noise with zero mean and
standard deviation of 50mV added to it. (a) What is the probability
of a bit error being produced, assuming that the threshold for deciding
between 0 and 1 is 50mV? (b) What is the probability of a 4 bit word
being correct?
[(a) 15.9%, (b) 50.1%]
5. Show that the Poisson distribution really does obey equation (9.3).
6. Show that the Poisson distribution is normalised.
7.∗ Show that the Poisson distribution has mean λt.
8. Crashes of the student file server are Poisson distributed, occurring at a
mean rate of 2 per day when there’s a deadline to be met. What is the
probability of (a) 0, (b) 2, (c) 4 crashes in one day?
[(a) 13.5%, (b) 27%, (c) 9%]
9. The average number of faults on a new car is 5. What is the probability
of (a) buying a new car with 0 faults and (b) buying two new cars with
a total of 4 faults between them? (c) What is the most likely number of
faults in one car?
[(a) 0.67%, (b) 1.9%, (c) 4 and 5 equally likely]
10. Calculate the probabilities for the three examples of the Poisson distribution on page 82. N.B. 210! ≈ 1.06 × 10³⁹⁸.
11. The premium bond problem. For every pound you invest in premium
bonds, you used to have a 1/15,000 chance of winning a prize each month.
Suppose you have invested £20,000. What is the probability of winning,
in a given month, (a) exactly one prize? (b) exactly two prizes? (c) at
least one prize?
(Hints. The binomial distribution applies. For part (c) probability of at
least one prize = 1− probability of no prizes.)
[(a) 0.3515 (b) 0.2343 (c) 0.7364]
12. A hundred samples of 5 resistors were taken from a large batch. 59
samples had no defective resistors, 33 had 1, 7 had 2, 1 had 3 and no
samples had 4 or 5 defective resistors. Show that the distribution is
approximately binomial and estimate the overall percentage of defective
components.
[about 10%]
Chapter 10
The method of least squares
Aims
By the end of this chapter, you should understand
• how the method of least squares works
• how to fit a straight line to given data
• how to fit simple curves to given data.
You will need to recall some facts about partial differentiation from
first year maths.
10.1 Gauss
Carl Friedrich Gauss was an important mathematician in his time. In
the Theoretical Distributions chapter we learnt about a probability
density function that was named after him; in this chapter, we discuss
the method of least squares, which was discovered by him.
10.2 A data fitting problem
Suppose we measure the current iₙ through a resistor R, for N different values of the voltage vₙ across it (so n = 1, 2, 3, . . . , N). If Ohm's law holds, the graph of iₙ against vₙ should be a straight line with slope 1/R. Experiments being what they are, there will
be some errors in the measurements and the resulting graph will not
be exactly a straight line. How do we then calculate the slope and
intercept of the ‘best’ straight line through the data, and what do
we mean by ‘best’ anyway?
The general 'straight line fit' problem is illustrated in figure 10.1.
Figure 10.1: A problem to be solved by the method of least squares: fit a straight line of the form y = mx + c to the given data set. The vertical distance from the i-th point (xᵢ, yᵢ) to the line is dᵢ.
10.3 The method of least squares
Almost any straight line¹ can be represented in the form y = mx + c,
where m is the gradient and c is the y-intercept. We want to find the
two numbers m and c that best represent a given set of data, with
the assumption that
The errors in x are much smaller than the errors in y.
Under this assumption, a way to do this is to
¹The exception is any vertical straight line.
Calculate m and c such that the sum of the squared
vertical distances of each of the points from the
straight line is a minimum.
This is known as the method of least squares, since we are trying to
minimise a sum of squares. We minimise the sum of squared vertical
distances because the errors in x are assumed to be much less than
the errors in y. We have also found one plausible answer to the
question “What do we mean by ‘best’ straight line?” — one that
minimises the sum of the squared vertical distances. Why squared
distances? There are two points to note here:
1. If just the distance were to be used, some cancelling out could
happen, since some of the distances could be positive and some
negative. In fact we could make the sum of distances equal to 0,
with a straight line that was a very poor fit to the data.
2. The calculation of c and m is straightforward for squared distances, as we shall see.
There are other ways this fit could be done, e.g. by minimising the
sum of the absolute values, or the fourth power of the distances,
neither of which can be carried out as easily.
If the errors in y are much smaller than the errors in x, then you
should swap the x and y values in what follows. The theory then
remains the same.
10.4 Calculating m and c
Let us write the i-th data point, with i going from 1 to N , as (xi , yi).
We then need to define S, the sum of the vertical squared distances
of all the points from the straight line y = mx + c. This is given by
S = Σ_{i=1}^{N} (yᵢ − mxᵢ − c)².
It is surprisingly easy to find the values of c and m that minimise S.
We partially differentiate S with respect to c and to m, and set the
derivatives equal to zero. This assumes that S as a function of c and
m has exactly one turning point, which is a minimum. This can be
proved by calculating second derivatives — see problems. The values
of c and m that satisfy the resulting pair of equations are the values
that minimise S. Hence
∂S/∂c = Σ_{i=1}^{N} −2(yᵢ − mxᵢ − c) = 0
and
∂S/∂m = Σ_{i=1}^{N} −2xᵢ(yᵢ − mxᵢ − c) = 0.
Using the fact that Σ_{i=1}^{N} c = Nc, the first equation gives
Σ_{i=1}^{N} yᵢ − m Σ_{i=1}^{N} xᵢ − Nc = 0    (10.1)
and the second
Σ_{i=1}^{N} xᵢyᵢ − m Σ_{i=1}^{N} xᵢ² − c Σ_{i=1}^{N} xᵢ = 0.    (10.2)
We now have two simultaneous linear equations and two unknowns,
c and m, so we can solve for c and m.
Example 10.1 Use the method of least squares to fit a straight line to the
five points
(−1, 3.2), (0, 1.4), (1, −0.8), (2, −2.9), (3, −3.8)
assuming that the x-values are accurate.
Answer We will use equations 10.1 and 10.2, so we first calculate
Σ xᵢ = 5,  Σ yᵢ = −2.9,  Σ xᵢ² = 15  and  Σ xᵢyᵢ = −21.2.
Equations 10.1 and 10.2 now become
−2.9 − 5m − 5c = 0
−21.2 − 15m − 5c = 0
which we can solve to obtain
m = −1.83 and c = 1.25.
The data and the least squares straight line fit to the data are shown in figure 10.2.
Figure 10.2: Five data points and a straight line fit to them, as calculated by the method of least squares.
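The whole procedure is a few lines of code: accumulate four sums, then solve the pair of simultaneous equations. The C sketch below reproduces m = −1.83 and c = 1.25 for the data of example 10.1.

    /* Least squares straight-line fit, equations (10.1) and (10.2),
       applied to the five points of example 10.1.                   */
    #include <stdio.h>

    int main(void)
    {
        const double x[] = {-1, 0, 1, 2, 3};
        const double y[] = {3.2, 1.4, -0.8, -2.9, -3.8};
        const int N = 5;

        double Sx = 0, Sy = 0, Sxx = 0, Sxy = 0;
        for (int i = 0; i < N; i++) {
            Sx  += x[i];
            Sy  += y[i];
            Sxx += x[i] * x[i];
            Sxy += x[i] * y[i];
        }
        /* Solve  Sy - m*Sx - N*c = 0  and  Sxy - m*Sxx - c*Sx = 0 */
        double m = (N * Sxy - Sx * Sy) / (N * Sxx - Sx * Sx);
        double c = (Sy - m * Sx) / N;

        printf("m = %.2f, c = %.2f\n", m, c);  /* m = -1.83, c = 1.25 */
        return 0;
    }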
10.5 Fitting to a parabola
In a similar way, we can calculate a least squares fit to a parabola of the form y = a + bx + cx². In this case there are three unknowns, a, b and c. We still define S as the sum of the squared vertical distances from the parabola to the points, and hence
S = Σ_{i=1}^{N} (yᵢ − a − bxᵢ − cxᵢ²)².
The three equations now are
∂S/∂a = Σ_{i=1}^{N} −2(yᵢ − a − bxᵢ − cxᵢ²) = 0,
∂S/∂b = Σ_{i=1}^{N} −2xᵢ(yᵢ − a − bxᵢ − cxᵢ²) = 0
and
∂S/∂c = Σ_{i=1}^{N} −2xᵢ²(yᵢ − a − bxᵢ − cxᵢ²) = 0.
As before, these equations can be solved for a, b and c.
Similar calculations can be used to fit any functions to a data set,
provided the functions are linear in the unknown parameters. For
example,
y = a sin x
y = ax + be^x
y = ax³ + b ln x + c
are all linear in the parameters a, b and c, and the method of least
squares can be used to find the parameters for a given set of data.
By contrast, the following
y = e^{(x−b)²/c²}
y = cos(a/x + bx + c)
y = ln(a + bx)
are not linear in a, b and c, and least squares cannot be used, at least
not directly.
Problems, chapter 10
1. By considering second derivatives, show that the values of c and m obtained by solving equations 10.1 and 10.2 are such that
S = Σ_{i=1}^{N} (yᵢ − mxᵢ − c)²
is a minimum (as opposed to a maximum).
2. Fit a straight line of the form y = ax + b to the following set of points,
by using the method of least squares:
(0, 12.3), (5, 14.5), (8, 15.0), (11, 17.6)
Assume that the x values are correct.
[a = 0.4500, b = 12.15]
3. A battery of nominal voltage V0 and internal resistance r is connected to
a variable resistance. Various values of the current i through and voltage
v across this load are given below
i  0    2    4    6    8    Amps
v  6.1  4.9  3.0  1.6  0.2  Volts
Assuming that the errors in the current readings are much smaller than
those in voltage, calculate V0 and r.
[V0 = 6.18V, r = 0.755Ω]
4. (i) Show that the value of a that gives the least squares fit of the function
y = ax² to a data set (x₁, y₁), . . . (x_N, y_N), is given by
a = (Σ_{i=1}^{N} xᵢ² yᵢ)/(Σ_{i=1}^{N} xᵢ⁴)
(ii) Some power, P , versus voltage, V , measurements for a resistor R are
given below.
V 1 1.5 2 2.5 3 Volts, ±0.3%
P 0.2 0.6 0.9 1.6 2.4 Watts, ±2%
Calculate the least squares value of R.
[R = 3.87Ω]
5.∗ (i) Show, by taking logs, that least squares fitting a function of the form
y = ae^{bx}
can be reduced to fitting a straight line.
(ii) A capacitor C is initially charged to 10V and then connected across
a resistor R. The current through R measured at 1 millisecond intervals
is
t  0    1    2    3    4    ms, ±0.5%
i  4.5  2.8  1.5  1.0  0.6  mA, ±2.5%
Using the method of least squares, find R and C.
[R = 2.2kΩ, C = 0.89µF]
6.∗ The average mass, y, of nails of length x obeys the law y = ax^b where a
and b are constants.
(i) Show that the problem of finding a and b from N data points can
be reduced to a least squares straight line fitting problem in which the
equations to be solved are
Σᵢ ln yᵢ − b Σᵢ ln xᵢ − N ln a = 0
and
Σᵢ ln xᵢ ln yᵢ − b Σᵢ (ln xᵢ)² − ln a Σᵢ ln xᵢ = 0
(ii) Given the following data:
x 1 2 4 6 inch
y 5 12 30 60 g
and assuming that the nail lengths are more accurately known than the
masses, estimate a and b.
[a = 4.81, b = 1.37]
Chapter 11
Complex frequency
11.1 Complex frequency
You should be familiar now with the idea of a transform since we
have looked at the Fourier transform in some detail. The purpose of
the Fourier transform is to represent a function of time, f (t), as a
function of angular frequency, F (ω). Both f (t) and F (ω) represent
the same function, but in terms of a different variable.
Similar to, but not the same as the Fourier transform is the Laplace
transform, which transforms a function of time, f (t), into a function
of the variable s, known as complex frequency. We define the Laplace
transform in the next chapter, but in this chapter we look at what s
means.
In general, s has both a real and imaginary part, and it is written
conventionally as
s = σ + jω
so that
e^{st} = e^{(σ+jω)t} = e^{σt}e^{jωt}.
We always assume that σ and ω are real.
We already know that e^{jωt} = cos ωt + j sin ωt is periodic with period
2π/ω. What is the meaning of σ, the real part of s?
There are three cases to consider: (1) σ < 0, (2) σ = 0 and (3)
σ > 0. We consider these in turn.
11.1.1 σ < 0
Here, e^{σt} is an exponentially decreasing function of time. If this is then multiplied by e^{jωt}, the real part of the result is a damped oscillation:
(Sketches: e^{σt}, decaying towards zero, and Re e^{st}, an oscillation of decaying amplitude.)
11.1.2 σ = 0
Here, e^{σt} = 1 is a constant function of time. If this is then multiplied by e^{jωt}, the real part of the result oscillates with constant amplitude:
(Sketches: e^{σt} = 1, a constant, and Re e^{st}, an oscillation of constant amplitude.)
11.1.3 σ > 0
Here, e^{σt} is an exponentially increasing function of time. If this is then multiplied by e^{jωt}, the real part of the result is an exponentially growing oscillation:
(Sketches: e^{σt}, growing, and Re e^{st}, an oscillation of growing amplitude.)
11.2 Linear homogeneous differential equations
Recall that a linear second order differential equation is an equation
of the form
d²v/dt² + 2aω₀ dv/dt + ω₀²v = f(t)
This equation is
• linear (only first powers of the unknown function, v, and its
derivatives appear)
• second order (the highest derivative that appears is the second)
It is assumed that the real constants a and ω0 are known, and also
the function (the ‘drive’) f (t) on the right hand side is given. The
98
problem then is to find the unknown function v(t) that satisfies the
differential equation for all times t and all initial conditions.
If f (t) = 0 then the equation becomes
d²v/dt² + 2aω₀ dv/dt + ω₀²v = 0    (11.1)
and is described as a homogeneous linear differential equation.¹ We
consider this case now. In order to solve 11.1 we assume that the
solution will be of the form
v(t) = V₀e^{st}
where V0 is a constant.
We then need to find the possible values of the complex frequency s.
Substituting our assumed solution into 11.1 and using the fact that
dv/dt = V₀se^{st}  and  d²v/dt² = V₀s²e^{st}
gives
(s² + 2aω₀s + ω₀²) V₀e^{st} = 0.
This has to be 0 for all times, t. Hence either V₀ = 0 (trivial solution, since this leads to v(t) = 0 for all t) or
s² + 2aω₀s + ω₀² = 0.
By assuming the general form of the solution, we have managed to
transform the original differential equation into a quadratic in s —
which of course we know how to solve:
s± = −ω₀a ± ω₀√(a² − 1).
This is just a shorthand way of writing the two values of s
s₊ = −ω₀(a + √(a² − 1))  and  s₋ = −ω₀(a − √(a² − 1)).
1
If f (t) 6= 0, then it is an inhomogeneous differential equation.
99
The most general solution to 11.1 will therefore be
v(t) = e^(−ω0 a t) [A e^(ω0 √(a² − 1) t) + B e^(−ω0 √(a² − 1) t)]
where A and B are arbitrary constants whose values can be found
from initial conditions.
We can now use the results of section 11.1 to describe the behaviour
of v(t) as defined by the differential equation (11.1). We can always
assume that ω0 > 0 (why?). The behaviour of v(t) then depends on
the value of a. There are three cases:
1. a² > 1
2. a² = 1
3. a² < 1
We consider these in turn.
11.2.1 a² > 1
In this case, a² − 1 > 0 and so √(a² − 1) is real. Furthermore, if a > 0, then a − √(a² − 1) > 0. Hence, the two numbers
s± = −ω0 (a ± √(a² − 1))
are both negative if a > 0, so the general solution is the sum of two damped exponentials.
Similarly, if a < 0, then a + √(a² − 1) < 0. Thus, the two numbers s± are both positive if a < 0, and in this case the general solution consists of the sum of two growing exponentials.
11.2.2 a² = 1
When a² = 1,
s± = −ω0 a
so the solution is exponentially decaying if a > 0 and exponentially
growing if a < 0. (In fact, the situation is a bit more complicated
than this, and the Laplace transform enables us to sort out the difficult cases easily.)
11.2.3 a² < 1
In this case, a² − 1 < 0 and so √(a² − 1) is imaginary. Hence
s± = −ω0 (a ± j√(1 − a²))
are both complex. Therefore, if the real part of s±, which is −ω0 a, is negative
— that is, a > 0 — the solution is damped oscillatory.
On the other hand, if a < 0, the solution is exponentially growing
and oscillatory.
All the above are summarised in the following diagram.
[Figure 11.1: All possible types of behaviour of solutions of equation (11.1), as a function of a: growing exponential for a ≤ −1; growing oscillatory for −1 < a < 0; steady state oscillation at a = 0; damped oscillatory for 0 < a < 1; damped exponential for a ≥ 1.]
Problems, chapter 11
[Circuit diagram: a series loop containing an inductor L, a resistor R, a switch S and a capacitor C.]
1. In the figure, L = 4H, C = 1F. Capacitor C is initially charged. Switch
S is then closed. For what value/range of values of R is the subsequent
behaviour
(i) damped oscillatory
(ii) damped exponential
(iii) oscillatory with constant amplitude
(iv) growing oscillatory?
What is the frequency in the case of an oscillatory solution with constant
amplitude?
Which of these would be physically realisable with passive components?
[(i) 0 < R < 4 Ω, (ii) R ≥ 4 Ω, (iii) R = 0, (iv) −4 < R < 0; 1/4π Hz; (i) and (ii) are realisable]
Chapter 12
The Laplace Transform
12.1 The Laplace transform
You should already be familiar with the idea of a transform, as we
have discussed the Fourier transform in previous lectures. Just as the
Fourier transform allows us to express a function of time as a function
of angular frequency ω, the Laplace transform allows us to express
a function of time in terms of complex frequency, s. (Some books
use p.) If the function of time is f (t), then its Laplace transform is
written F (s), or occasionally L[f (t)], and is defined by
F(s) = L[f(t)] = ∫_0^∞ e^(−st) f(t) dt.    (12.1)
Two important differences between the Laplace transform and the
Fourier transform are
1. In the Laplace transform, the function f (t) is assumed to start
from t = 0, whereas in the Fourier transform, it is assumed to
start from t = −∞.
2. In the Laplace transform, the new variable s has both real and
imaginary parts, whereas in the Fourier transform, jω is purely
imaginary.
Let us start by calculating some Laplace transforms.
Example 12.1 If f(t) = e^(−kt), then
F(s) = L[e^(−kt)] = ∫_0^∞ e^(−st) e^(−kt) dt = ∫_0^∞ e^(−(k+s)t) dt = [−e^(−(k+s)t)/(k + s)]_0^∞ = 1/(k + s).
So
L[e^(−kt)] = 1/(k + s).    (12.2)
Example 12.2 If f (t) = sin ωt then, using the fact that
sin ωt = (e^(jωt) − e^(−jωt))/(2j)
we see that
L[sin ωt] = ∫_0^∞ e^(−st) (e^(jωt) − e^(−jωt))/(2j) dt.
Using the previous result, that L[e^(−kt)] = 1/(k + s), we get
L[sin ωt] = (1/2j) [1/(−jω + s) − 1/(jω + s)].
Hence, simplifying,
L[sin ωt] = ω/(ω² + s²).
Example 12.3 What is the Laplace transform of cos ωt = (e^(jωt) + e^(−jωt))/2? As before,
L[cos ωt] = (1/2) [1/(−jω + s) + 1/(jω + s)] = s/(ω² + s²).
12.2 The Laplace transform of a derivative
The importance of the Laplace transform in solving differential equations becomes clear when we try to find the transform of the derivative of a function f(t) w.r.t. time:
L[df/dt] = ∫_0^∞ e^(−st) (df/dt) dt
Integrating by parts gives
L[df/dt] = [e^(−st) f(t)]_0^∞ − ∫_0^∞ (−s e^(−st)) f(t) dt = −f(0) + s L[f] = −f(0) + s F(s).
We have assumed that f(t) is such that lim_(t→∞) e^(−st) f(t) = 0. (If f(t) didn't have this property, it would not have a Laplace transform.)
What about second derivatives? Using the fact that
d²f/dt² = (d/dt)(df/dt)
we can use the result for the first derivative:
L[d²f/dt²] = −df(0)/dt + s L[df/dt]
Replacing L[df/dt] with [−f(0) + s F(s)] gives
L[d²f/dt²] = −s f(0) − df(0)/dt + s² F(s)
The two important results we have deduced are
L[df/dt] = −f(0) + s F(s),    L[d²f/dt²] = −s f(0) − df(0)/dt + s² F(s).    (12.3)
Note that f(0) is the value of f(t) at t = 0 and df(0)/dt is the value of the derivative of f at t = 0.
Example 12.4 Is this consistent with our previous examples? We've already found L[sin ωt] and L[cos ωt]. They are given by
L[sin ωt] = ω/(ω² + s²)  and  L[cos ωt] = s/(ω² + s²).
But
sin ωt = −(1/ω) d(cos ωt)/dt
so the Laplace transform of sin ωt should equal
L[−(1/ω) d(cos ωt)/dt] = −(1/ω)(−cos 0 + s L[cos ωt]) = −(1/ω)(−1 + s²/(ω² + s²)) = ω/(ω² + s²)
which is indeed the Laplace transform of sin ωt.
Problems, chapter 12
1. Show that the Laplace transform has the superposition property, i.e. if
L[f (t)] = F (s) and L[g(t)] = G(s), then
L[af (t) + bg(t)] = aF (s) + bG(s)
where a and b are constants.
2. Find the Laplace transform of
(i) f (t) = a, a is a constant.
(ii) f (t) = t
(iii) f (t) = a + bt, a and b constants. (Use superposition).
[(i) a/s, (ii) 1/s², (iii) (as + b)/s²]
3. Expand sin(ωt + φ) and hence show that
L[sin(ωt + φ)] = (ω cos φ + s sin φ)/(s² + ω²)
What is L[cos(ωt + φ)]?
[(s cos φ − ω sin φ)/(s² + ω²)]
4. Find the Laplace transform of e^(−kt) cos(ωt + φ) and e^(−kt) sin(ωt + φ) without integration, by
(i) deducing the Laplace transform of e^(−kt) e^(j(ωt+φ)) (use equation 12.2), then
(ii) finding the real and imaginary parts of this expression.
[((s + k) cos φ − ω sin φ)/((s + k)² + ω²), ((s + k) sin φ + ω cos φ)/((s + k)² + ω²)]
Chapter 13
Differential Equations and the Laplace Transform
13.1 Inhomogeneous differential equations
In the last but one chapter, we saw how to solve the following differential equation
d²v/dt² + 2aω0 dv/dt + ω0² v = f(t)    (13.1)
with f (t) = 0. We could (a) describe the solutions qualitatively (e.g.
damped oscillatory, growing exponential etc.) and (b) write down a
general, exact solution.
In this chapter, we discuss how to do (b) but now when f(t) ≠ 0, or,
in technical terms, when the differential equation is inhomogeneous.
Where does such an equation arise in practice?
[Figure 13.1: A series circuit with inductor L, resistor R, switch S and capacitor C; it is described by a homogeneous differential equation.]
[Figure 13.2: The same RLC circuit driven by a source f(t); it is described by an inhomogeneous differential equation.]
In figure 13.1 the capacitor is initially charged and switch S is open.
At t = 0, S is closed. The behaviour of the circuit is described by
the homogeneous differential equation
LC d²v/dt² + RC dv/dt + v = 0
where v = v(t) is the voltage across C.
In figure 13.2 the circuit is driven by an applied voltage f (t) — this
might, for instance, be a sine wave from a signal generator. The
circuit is now described by the inhomogeneous differential equation
LC d²v/dt² + RC dv/dt + v = f(t).    (13.2)
We are going to solve differential equations like this one by using the
Laplace transform technique.
13.2 Solving a d.e. by Laplace transform — overview
Recall that
to solve a differential equation in a function of time, v(t),
means
to find a function, v(t), that satisfies the differential equation for
all time, t.
You should also remember that the general solution of a second order
differential equation will have two arbitrary constants whose values
are determined from initial conditions.
A good way to solve equation 13.1 when f(t) ≠ 0 uses the Laplace
transform. This method requires us to
1. Find the Laplace transform of the differential equation;
2. solve the resulting (algebraic) equation for V (s), the Laplace
transform of v(t); then
3. find the inverse Laplace transform of V (s), which gives us v(t).
We look at each of these items in turn.
13.3 The Laplace transform of a differential equation
Using the rules for finding the Laplace transform of first and second
derivatives, we can immediately find the Laplace transform of the
differential equation for a driven RLC circuit, equation 13.2. It is
LC[−s v(0) − dv(0)/dt + s² V(s)] + RC[−v(0) + s V(s)] + V(s) = F(s)
This looks rather a mess! It is, however, just a linear equation for
V (s), the Laplace transform of the (as yet unknown) function v(t).
Solving for V(s) gives
V(s) (LCs² + RCs + 1) = F(s) + LCs v(0) + LC dv(0)/dt + RC v(0)
so
V(s) = [F(s) + LCs v(0) + LC dv(0)/dt + RC v(0)] / (LCs² + RCs + 1)    (13.3)
We haven’t found v(t) yet, but we’ve found something closely related
to it: the Laplace transform of v(t), V (s). And, what is more, this
is a completely general expression for V (s), valid for any
• drive function, f(t) (provided its Laplace transform exists)
• initial conditions v(0) and dv(0)/dt
• values of R, L and C.
Notice that the initial conditions (v(t) and its derivative at t = 0)
are automatically built into the expression for V (s).
Example 13.1 Let R = 3/2 Ω, C = 2 F and L = 1 H. Let f(t) = H(t) be the Heaviside function, also known as the unit step function, which is defined by
H(t) = { 0 for t < 0;  1 for t > 0 }
(H(t) is undefined at t = 0.)
If the initial capacitor voltage is v(0) = 0 V with initial rate of change dv(0)/dt = 0 V/s, find the Laplace transform of v(t), V(s).
Answer We have been asked to calculate the response of a circuit to an input
consisting of a unit step function.
According to equation 13.3, in order to find V (s), we need to find the Laplace
transform of the drive function f (t). This is H(t), whose Laplace transform
is given by
L[H(t)] = ∫_0^∞ e^(−st) × 1 dt = [−(1/s) e^(−st)]_(t=0)^(t=∞) = 1/s
Substituting the given values in 13.3, we get
V(s) = 1/(s(s + 1)(2s + 1))
In words,
the Laplace transform of the unit step response of the circuit is
1/[s(s + 1)(2s + 1)].
We are now left with the problem of finding v(t) from V (s), i.e. the
problem of inverting the Laplace transform.
13.4 Inverse Laplace transform using tables
There is an analytical way of inverting the Laplace transform, called
the Bromwich integral, which involves contour integration. It is dealt
with in, for instance, Boas (chapter 15). However, we are going to
adopt the simpler and more usual approach of using tables of Laplace
transforms. In what follows, I shall always refer to the tables on
pages 8–11 in the E.E. Department’s ‘Tables of constants, formulae
and transforms’. These tables are also included at the end of this
book, starting on page 178; we refer to them as the L.T. Tables.
Sometimes it is easy to find the Laplace transform that we need in
the L.T. Tables, as the following example shows:
Example 13.2 Let us finish off the previous example by finding the inverse
Laplace transform of V (s). In the L.T. Tables you will find that the Laplace
transform of
(1/ab)[1 − (b/(b − a)) e^(−at) + (a/(b − a)) e^(−bt)]    (13.4)
is
1/(s(s + a)(s + b)).
But V(s) is of this form. If we divide the numerator and denominator of V(s) by 2, we get
V(s) = (1/2)/(s(s + 1)(s + 1/2))
so, to make this look like the result in the L.T. Tables, put
a = 1,  b = 1/2.
Using these in 13.4 gives
v(t) = 1 + e^(−t) − 2e^(−t/2).
You should check that this (a) satisfies the differential equation 13.2, and (b) has the properties v(0) = 0 and dv(0)/dt = 0.
In this case, therefore, we have solved the differential equation 13.2.
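As a cross-check (a sketch, not part of the notes), sympy can invert V(s) directly and should reproduce v(t) = 1 + e^(−t) − 2e^(−t/2) for t > 0:

```python
import sympy as sp

s, t = sp.symbols('s t', positive=True)
V = 1 / (s * (s + 1) * (2*s + 1))
v = sp.inverse_laplace_transform(V, s, t)
# The result carries a Heaviside(t) factor, since f(t) = 0 for t < 0.
print(sp.simplify(v))
```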
Sometimes there is a little more effort involved, as in the following
example:
Example 13.3 Suppose that we have instead
V(s) = (s + 2)/(2s² + s + 1).
This looks quite like two entries in the L.T. Tables, at the top of the third page:
L[e^(−at) cos ωt] = (s + a)/(s² + 2as + b)  with  ω = √(b − a²)    (13.5)
and
L[(1/ω) e^(−at) sin ωt] = 1/(s² + 2as + b).    (13.6)
Notice that both the denominators are of the form s² + 2as + b, so let's bring out a factor of 1/2 from V(s):
V(s) = (1/2) (s + 2)/(s² + (1/2)s + 1/2).
Now we can find the values of a and b such that
s² + (1/2)s + 1/2 = s² + 2as + b.
Obviously,
a = 1/4  and  b = 1/2,  from which  ω = √(b − a²) = √7/4.
That's the denominator sorted out. What about the numerator? The numerator in equation (13.5) is s + a = s + 1/4, but we have a numerator of s + 2. How do we get this? The answer is that we rewrite V(s) as
V(s) = (1/2) [(s + 1/4)/(s² + (1/2)s + 1/2) + (7/4)/(s² + (1/2)s + 1/2)].
Now we can use 13.5 and 13.6 to obtain
v(t) = (1/2) e^(−t/4) cos(√7 t/4) + (√7/2) e^(−t/4) sin(√7 t/4).
13.5 Inverse Laplace transform by partial fractions
In some cases the function we want to invert isn't in the L.T. Tables, in which case it may be necessary to use the method of partial fractions. An example of this type is now given.
Example 13.4 Suppose that
V(s) = 1/(s²(s + 1));
what is v(t)?
The idea of using partial fractions is to re-write V(s) in the form
1/(s²(s + 1)) = (As + B)/s² + C/(s + 1)
with A, B and C constants that we have to find. If we can write V(s) in this form, we can invert each of the fractions individually. The general rule for partial fractions is that the degree of the numerator must be one less than that of the denominator, hence the As + B term in the numerator of the first fraction above.
We now need to find A, B and C. We do this by adding up the partial fractions:
1/(s²(s + 1)) = [(As + B)(s + 1) + Cs²]/(s²(s + 1))
The denominators are equal, so comparing the numerators,
1 ≡ (As + B)(s + 1) + Cs² = (A + C)s² + (A + B)s + B
which has to be true for all s; hence, comparing coefficients of powers of s,
B = 1,  A + B = 0,  A + C = 0
so A = −1, B = 1 and C = 1. Therefore,
V(s) = (−s + 1)/s² + 1/(s + 1) = −1/s + 1/s² + 1/(s + 1).
We can invert each part of this using the L.T. Tables. The answer is
v(t) = −H(t) + t + e^(−t)
where H(t) is the unit step function.
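The partial fraction step, and the inversion itself, can also be done by machine. A hedged sympy sketch (not part of the notes):

```python
import sympy as sp

s, t = sp.symbols('s t', positive=True)
V = 1 / (s**2 * (s + 1))
print(sp.apart(V, s))                          # -1/s + 1/s**2 + 1/(s + 1)
print(sp.inverse_laplace_transform(V, s, t))   # (t - 1 + exp(-t))*Heaviside(t)
```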
Problems, chapter 13
1. Solve the following differential equation by the Laplace transform method:
dy/dt − y = 2e^(−t)  with  y(0) = 3
[y(t) = 4e^t − e^(−t) = 3 cosh t + 5 sinh t = 3e^t + 2 sinh t]
2. Solve
d²x/dt² − 4 dx/dt + 4x = 4  with  x(0) = 0,  dx(0)/dt = −2
[x(t) = 1 − e^(2t)]
3. Solve
d²v/dt² + 16v = 8 cos 4t  with  v(0) = 0,  dv(0)/dt = 8
[v(t) = (2 + t) sin 4t]
4. In the circuit of figure 13.2, L = 1 H, C = 1/5 F, R = 2 Ω and f(t) = 2 sin t. Write down the differential equation that describes v(t), the voltage across the capacitor, and solve it with the initial conditions v(0) = 0 and dv(0)/dt = 3.
[v(t) = −cos t + 2 sin t + e^(−t)(cos 2t + sin 2t)]
5. Show that the solution to equation 9.3 in chapter 9, with n = 1, P0(t) = e^(−λt) and P1(0) = 0, is as given.
Chapter 14
The Z transform: definition, examples
14.1 Introduction and Definitions
14.1.1 Sampling
The z-transform is to sampled signals as the Laplace transform is
to continuous time signals. It is widely used in control theory and
digital signal processing. Throughout this and the next two chapters,
the sampling interval will be a fixed, positive time T .
We first show how the Laplace and z-transforms are connected.
[Figure 14.1: Left: a continuous function of time, f(t). Right: a 'comb' of equally-spaced Dirac delta functions, C(t) = Σ_(n=0)^∞ δ(t − nT), shown at t = 0, T, 2T, ..., 7T.]
Figure 14.1, left, shows a continuous function of time, f(t). Figure 14.1, right, shows the function C(t) = Σ_(n=0)^∞ δ(t − nT), a set of equally-spaced Dirac delta functions, occurring at t = 0, T, 2T, .... Figure 14.2 tries to show the product, f(t) × C(t), which we will call fs(t). This picture should be interpreted as follows: since each
[Figure 14.2: The sampled version of the function f(t), which is fs(t) = f(t) × C(t). The heights of the arrows are proportional to their areas; the labels f(0), f(T), f(2T) and so on refer to the areas under the Dirac delta functions at t = 0, T, 2T, ... respectively.]
Dirac δ(t − nT ) has unit area, but is infinite at t = nT and zero
everywhere else, fs(t) is also infinite at t = nT and zero everywhere
else. Figure 14.2 is therefore showing, by the height of the arrows, the
area under the Dirac delta functions at t = 0, T, 2T, . . ., these areas
being f (0), f (T ), f (2T ), . . . respectively. Hence, multiplying f (t)
by C(t) can be seen as a way of sampling f (t) at the equally-spaced
intervals t = 0, T, 2T, . . ., and so
fs(t) = Σ_(n=0)^∞ f(nT) δ(t − nT).
14.1.2 The connection with Laplace transforms
You now need to remember the important sampling property of the
Dirac delta function, equation (2.3), which is repeated here:
∫_(−∞)^∞ f(t) δ(t − t0) dt = f(t0),    (14.1)
true for any continuous function f(t). Using this result, you should immediately be able to see that the Laplace transform of fs(t) is
L[fs(t)] = ∫_0^∞ e^(−st) (Σ_(n=0)^∞ f(nT) δ(t − nT)) dt = Σ_(n=0)^∞ ∫_0^∞ e^(−st) f(nT) δ(t − nT) dt = Σ_(n=0)^∞ f(nT) e^(−nsT).
Defining z = e^(sT), we have the so-called z-transform of f(t), which is
Z[f(nT)] = F(z) = Σ_(n=0)^∞ f(nT) z^(−n).    (14.2)
In practice, you will often see the sampled version of f (t) written
as f (n), with the sampling interval T “built in” to f (n). See the
following section for a further explanation. Using this convention,
the z-transform is defined as follows:
Z[f(n)] = F(z) = Σ_(n=0)^∞ f(n) z^(−n).    (14.3)
In words: the z-transform of a function of time, f(t), is the Laplace transform of the sampled version of f(t), written fs(t). The function fs(t) is obtained from f(t) by multiplying it by the sum of Dirac delta functions Σ_(n=0)^∞ δ(t − nT).
Points to note about the z-transform
• We will always assume that f (t) = 0 for t < 0, so f (n) = 0 for
n < 0.
• The z-transform transforms a function of n, n = 0, 1, 2, . . . into
a function of z, where z = e sT and s is the complex frequency.
14.1.3 The two ways of writing down z-transforms
There are two slightly different ways of writing down z-transforms:
• CE, the way that is preferred by control engineers, is shown in
equation (14.2);
• DSP, the boxed definition, (14.3), which is generally used by
Digital Signal Processing people.
They are equivalent to each other, and, although we concentrate on
the DSP way in these notes, you should be familiar with both. Note
also that the Departmental Tables, the relevant section of which is quoted in section 15.2 below, use both ways.
Time and again, in dealing with z-transforms you will find you need
to use properties of power functions, which you have certainly seen
before, but which are repeated here — you need to be able to apply
these almost without thinking about it. In the expressions, x is a
positive real number and a, b are any real numbers. We have
x^a x^b = x^(a+b),  1/x^b = x^(−b),  x^a/x^b = x^(a−b),  (x^a)^b = x^(ab).
Of course, since e is a positive real number, all these also apply to the exponential function; so, for instance, e^a e^b = e^(a+b). Here is a good moment to note also that (−1)^n = 1, −1, 1, −1, ... for n = 0, 1, 2, 3, ...; and so (−1)^(n+1) = (−1)^(n−1) = −1, 1, −1, 1, ..., again for n = 0, 1, 2, 3, ....
Let us now look at four examples of z-transforms, taken from the Departmental Tables (quoted in section 15.2 below). In all cases, the sampling interval is T and n = 0, 1, 2, .... Each example shows that CE and DSP are equivalent, provided that we make the right choice of parameters (see the last column).
CE way                                   DSP way
f(t)              f(nT)                  f(n)         with...
sin ωt            sin ωnT                sin an       a = ωT
t                 nT                     an           a = T
e^(−at)           e^(−anT)               b^n          b = e^(−aT)
e^(−at) cos ωt    e^(−anT) cos ωnT       b^n cos cn   b = e^(−aT), c = ωT
For instance, take the first row in the table above. This is saying
that if f (t) = sin ωt, then f (nT ) = sin ωnT and we can then write
f (n) = sin an, by choosing the right definition for a, which in this
case is a = ωT .
14.2 z-transform examples
Before we compute the z-transforms of some well-known functions,
we will derive the formula for the sum of a geometric series — you
will have seen this before. You will see these formulae many times
when discussing z-transforms and it is well worth your while to learn
them, particularly the boxed ones.
Let S_k = Σ_(n=0)^k x^n = 1 + x + x² + ... + x^k. Then x S_k = x + x² + x³ + ... + x^(k+1). Hence S_k − x S_k = S_k (1 − x) =
1 + x + x² + ... + x^k − (x + x² + x³ + ... + x^(k+1)) = 1 − x^(k+1).
Therefore
S_k = 1 + x + x² + ... + x^k = Σ_(n=0)^k x^n = (1 − x^(k+1))/(1 − x).    (14.4)
Now let k tend to infinity. Provided that |x| < 1, the numerator tends to one, so we have
1 + x + x² + ... = Σ_(n=0)^∞ x^n = 1/(1 − x)  provided that |x| < 1.    (14.5)
Replace x with −x in the above to get
1 − x + x² − x³ + ... = Σ_(n=0)^∞ (−1)^n x^n = 1/(1 + x)  provided |x| < 1.    (14.6)
By differentiating the above expression, we obtain
d/dx (1 + x + x² + x³ + ...) = d/dx [1/(1 − x)] = 1/(1 − x)² = 1 + 2x + 3x² + ... = Σ_(n=0)^∞ n x^(n−1)
and multiplying both sides by x, we have
x/(1 − x)² = x + 2x² + 3x³ + ...
and hence
x + 2x² + 3x³ + ... = Σ_(n=1)^∞ n x^n = Σ_(n=0)^∞ n x^n = x/(1 − x)²    (14.7)
where it does not matter whether or not we include the n = 0 term; it is zero anyway.
Replacing x with −x in the above gives
x − 2x² + 3x³ − 4x⁴ + ... = Σ_(n=1)^∞ (−1)^(n+1) n x^n = x/(1 + x)².    (14.8)
We now compute the z-transforms of some well-known functions. In all the following, a, b are constants and n is an integer.
14.2.1 f(n) = an
This corresponds to a sampled version of f(t) = bt: then f(nT) = (bT)n = an with a = bT (here an means the product a × n). From the definition,
Z[an] = Σ_(n=0)^∞ an z^(−n) = a Σ_(n=1)^∞ n z^(−n),
where the second sum starts from 1 rather than 0 because the n = 0 term is zero. Now we can use equation (14.7) to obtain
Z[an] = a (1/z)/(1 − 1/z)² = az/(z − 1)².
14.2.2 f(n) = δ(n), the unit impulse
We need to be careful with definitions here: in the discrete case, the
function δ(n), which we call the unit impulse to avoid confusion
with the Dirac delta function, is defined as
δ(n) = { 1 for n = 0;  0 otherwise. }
Note that this is different from the Dirac delta function. With this definition in mind, it is easy to see that
Z[δ(n)] = Σ_(n=0)^∞ δ(n) z^(−n) = 1 · z⁰ = 1.
14.2.3 f(n) = u(n), the unit step function
The unit step function is defined as
u(n) = { 1 for n ≥ 0;  0 otherwise }
We then have
Z[u(n)] = Σ_(n=0)^∞ u(n) z^(−n) = Σ_(n=0)^∞ z^(−n) = 1/(1 − 1/z) = z/(z − 1).
Note that, since all our functions start at n = 0, a constant c is
written as c u(n) and so the z-transform of c is cz/(z − 1).
14.2.4 f(n) = a^n
This corresponds to a sampled version of f(t) = e^(−bt): then f(nT) = e^(−bnT) = (e^(−bT))^n, and putting a = e^(−bT), we have f(n) = a^n. Directly from the definition, equation (14.3), we have
Z[a^n] = Σ_(n=0)^∞ a^n z^(−n) = Σ_(n=0)^∞ (a/z)^n = 1/(1 − a/z) = z/(z − a)
where we have used equation (14.5) to calculate the infinite sum.
14.2.5 f(n) = cos an
Remember that cos x = (e^(jx) + e^(−jx))/2. Hence, the z-transform of cos an, by definition, is
Z[cos an] = Σ_(n=0)^∞ cos an z^(−n) = (1/2) Σ_(n=0)^∞ (e^(jan) + e^(−jan)) z^(−n).
We can therefore use the previous result:
2 Z[cos an] = z/(z − e^(ja)) + z/(z − e^(−ja)) = (2z² − z e^(−ja) − z e^(ja))/(z² − z e^(−ja) − z e^(ja) + 1).
Hence,
Z[cos an] = z(z − cos a)/(z² − 2z cos a + 1).
The z-transform of sin an can be found in an analogous way.
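Because the z-transform is just a power series in 1/z, results like Z[cos an] can be sanity-checked numerically: for |z| > 1 the series converges, and a truncated sum should agree with the closed form. A small Python sketch (ours; the test values are arbitrary):

```python
import math

a, z = 0.7, 1.5          # arbitrary values, with |z| > 1 for convergence
series = sum(math.cos(a*n) * z**(-n) for n in range(2000))
closed = z * (z - math.cos(a)) / (z**2 - 2*z*math.cos(a) + 1)
print(series, closed)    # the two numbers agree to many decimal places
```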
Problems, chapter 14
1. Find the z-transform of the sequences (i) a = [1, −1, 3, 0, 2] (read this as
a(0) = 1, a(1) = −1 etc. ); and (ii) b = [0, 0, 1, 0, 2].
[(i) A(z) = 1 − z^(−1) + 3z^(−2) + 2z^(−4), (ii) B(z) = z^(−2) + 2z^(−4)]
2. Find the z-transform of sin an.
[z sin a/(z² − 2z cos a + 1)]
3. Find the z-transform of b^n cos an. (Hint: easiest is to look at the real part of Σ_(n=0)^∞ (b e^(ja)/z)^n.)
[z(z − b cos a)/(z² − 2zb cos a + b²)]
4. Sketch the functions u(n), u(n − 2) and u(n) − u(n − 2). Hence deduce
Z [u(n) − u(n − 2)].
[1 + z^(−1)]
(You’ll see another way to solve this problem in the next chapter.)
5. Define the finite sequence f (0) = 1, f (1) = 2, f (2) = 3, f (i) = 0, i > 2.
Write this sequence (i) as a sum of delta functions (unit impulses) and
(ii) as a sum of unit step functions. A sketch may help.
Using the answer to part (i) and the definitions of the z-transform and
the unit impulse, find the z-transform of the sequence f (n).
[(i) δ(n) + 2δ(n − 1) + 3δ(n − 2), (ii) u(n) + u(n − 1) + u(n − 2) − 3u(n − 3). F(z) = 1 + 2z^(−1) + 3z^(−2)]
6. Find the z-transform of the infinite sequence f(n) = 1/n!, n ≥ 0. (Hint: e^x = 1/0! + x/1! + x²/2! + ...)
[F(z) = e^(1/z)]
7. Find the z-transform of the infinite sequence f(n) = 1/(n + 1), n ≥ 0. (Hint: ln(1 + x) = x − x²/2 + x³/3 − ...)
[F(z) = −z ln(1 − 1/z)]
Chapter 15
The Z transform: properties, inversion
15.1 z-transform properties
Like the Fourier and Laplace transforms, the z-transform has many
useful properties, some of which we derive in this chapter.
15.1.1 Linearity/superposition
If c1 and c2 are constants and f1(n) and f2(n) are given functions, then the linearity property states that
Z[c1 f1(n) + c2 f2(n)] = c1 F1(z) + c2 F2(z)    (15.1)
where F1(z) = Z[f1(n)] and F2(z) = Z[f2(n)]. This is proved directly from the definition of the z-transform.
Example 15.1 If g(t) = 1 − e^(−at), find G(z).
First of all, note that g(nT) = 1 − e^(−anT), so letting b = e^(−aT) we have g(n) = 1 − b^n. Remembering that our functions start at t = 0, we have that g(n) is the sum of the two functions u(n) and −(b^n). The z-transforms of these are z/(z − 1) and −z/(z − b) respectively, so, using superposition,
G(z) = z/(z − 1) − z/(z − b) = z(1 − b)/((z − 1)(z − b)) = z(1 − e^(−aT))/((z − 1)(z − e^(−aT))).
15.1.2 Time delay
This property is analogous to the Fourier transform time shift property. It is stated as follows:
If Z[f(n)] = F(z) then Z[f(n − m)] = z^(−m) F(z)
where m ≥ 0 is an integer. It can be proved as follows. By definition,
Z[f(n − m)] = Σ_(n=0)^∞ f(n − m) z^(−n) = z⁰ f(−m) + z^(−1) f(1 − m) + ... + z^(−m) f(0) + z^(−m−1) f(1) + ...
and, since f(n) = 0 for n < 0, we have
Z[f(n − m)] = z⁰ × 0 + z^(−1) × 0 + ... + z^(−m) Σ_(n=0)^∞ f(n) z^(−n) = z^(−m) F(z).
15.1.3 Time advance
This property is also analogous to the Fourier time shift property. In what follows, we only need time advances of 1 × T and 2 × T, in which case the time advance property is: If Z[f(n)] = F(z), then
Z[f(n + 1)] = z F(z) − z f(0)
and
Z[f(n + 2)] = z² F(z) − z² f(0) − z f(1).
The proof follows directly from the definition. We have
Z[f(n + 1)] = Σ_(n=0)^∞ f(n + 1) z^(−n) = f(1) + f(2) z^(−1) + f(3) z^(−2) + ... = z F(z) − z f(0),
using equation (14.3). The proof for Z[f(n + 2)] works in the same way.
15.1.4 Multiplication by an exponential sequence
This property is analogous to the Fourier time scaling property, and is stated as follows:
If Z[f(n)] = F(z) then Z[a^n f(n)] = F(z/a)
Again, the proof follows directly from the definition. We have that
Z[a^n f(n)] = Σ_(n=0)^∞ a^n f(n) z^(−n) = Σ_(n=0)^∞ f(n) (z/a)^(−n) = F(z/a).
Example 15.2 Given that Z[cos an] = (z² − z cos a)/(z² − 2z cos a + 1), we can deduce that
Z[b^n cos an] = ((z/b)² − (z/b) cos a)/((z/b)² − 2(z/b) cos a + 1) = (z² − zb cos a)/(z² − 2zb cos a + b²).
15.1.5 Differentiation property
Like Fourier and Laplace transforms, the z-transform has a differentiation property, which is stated as follows:
If Z[f(n)] = F(z) then Z[n f(n)] = −z dF(z)/dz.
The proof goes as follows. By definition,
F(z) = Σ_(n=0)^∞ f(n) z^(−n)  so  dF/dz = −Σ_(n=0)^∞ n f(n) z^(−n−1).
Multiplying by −z gives
−z dF(z)/dz = Σ_(n=0)^∞ {n f(n)} z^(−n),
which is clearly the z-transform of n f(n).
Example 15.3 Using Z[a^n] = z/(z − a), find Z[n a^n] and Z[n² a^n].
We have that Z[n a^n] = −z d/dz {z/(z − a)}. Using the derivative-of-a-quotient rule, this gives
Z[n a^n] = za/(z − a)².
Differentiating again and multiplying the result by −z, we find
Z[n² a^n] = za(z + a)/(z − a)³.
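The differentiation property makes this kind of computation easy to automate. A sympy sketch of example 15.3 (ours; the symbols are assumptions):

```python
import sympy as sp

z, a = sp.symbols('z a')
F = z / (z - a)                                        # Z[a^n], section 14.2.4
print(sp.simplify(-z * sp.diff(F, z)))                 # a*z/(z - a)**2
print(sp.simplify(-z * sp.diff(a*z/(z - a)**2, z)))    # a*z*(z + a)/(z - a)**3
```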
15.1.6 Initial Value Theorem
If you know the z-transform of a function f (t), you can compute the
value of the function at t = 0, using the Initial Value Theorem. This
is
If Z[f(n)] = F(z) then f(0) = lim_(z→∞) F(z).    (15.2)
The proof of this is straightforward. From the definition,
F (z) = f (0) + z −1 f (1) + z −2 f (2) + . . .
and if z → ∞, we are left with f (0).
Example 15.4 If f (t) = cos at so f (n) = cos an, then f (0) = cos 0 = 1.
The Initial Value Theorem gives the same result:
F(z) = z(z − cos a)/(z² − 2z cos a + 1) = (1 − (1/z) cos a)/(1 − (2/z) cos a + 1/z²)
and so
lim_(z→∞) F(z) = 1
as expected.
15.1.7 Final Value Theorem
The Final Value Theorem is a similar type of result to the Initial Value Theorem, but takes longer to prove. The theorem is
If Z[f(n)] = F(z) then lim_(n→∞) f(n) = lim_(z→1) (z − 1) F(z).    (15.3)
The proof requires the Time Advance Theorem, which we have already seen.
Let f(t) be a function of time whose z-transform exists, so that the series Σ_(n=0)^∞ f(n) z^(−n) converges. Now consider
Z[f(n + 1) − f(n)] = lim_(n→∞) Σ_(i=0)^n (z^(−i) f(i + 1) − z^(−i) f(i))
= lim_(n→∞) [−f(0) + {f(1) − z^(−1) f(1)} + {z^(−1) f(2) − z^(−2) f(2)} + {z^(−2) f(3) − z^(−3) f(3)} + ... + z^(−n) f(n + 1)]
= lim_(n→∞) [−f(0) + z^(−n) f(n + 1) + (1 − 1/z) f(1) + z^(−1)(1 − 1/z) f(2) + z^(−2)(1 − 1/z) f(3) + ...]
Therefore,
lim_(z→1) Z[f(n + 1) − f(n)] = lim_(n→∞) {−f(0) + f(n + 1)},    (15.4)
since, if z → 1, then (1 − 1/z) → 0 and z^i → 1 for any i.
Also, from the time advance property, we have the additional fact that
lim_(z→1) Z[f(n + 1) − f(n)] = lim_(z→1) {z F(z) − z f(0) − F(z)} = −f(0) + lim_(z→1) (z − 1) F(z),
and this is the same thing as equation (15.4). As f(0) is a constant, we have
lim_(n→∞) {−f(0) + f(n + 1)} = −f(0) + lim_(n→∞) f(n + 1) = −f(0) + lim_(z→1) (z − 1) F(z)
and, noting that lim_(n→∞) f(n + 1) is the same thing as lim_(n→∞) f(n), the Final Value Theorem follows.
Example 15.5 We have seen in example 15.1 that
Z[1 − a^n] = z(1 − a)/((z − 1)(z − a)).
Now, lim_(n→∞) (1 − a^n) = 1 if |a| < 1. The Final Value Theorem confirms this:
lim_(z→1) (z − 1) × z(1 − a)/((z − 1)(z − a)) = lim_(z→1) z(1 − a)/(z − a) = 1.
15.2 Inversion of the z-transform
Much like the approach we used for inverting Laplace transforms, we
can often use tables for inverting the z-transform. We can also use
the properties described above, as well as partial fractions and power
series — sometimes a combination of all of these is necessary. Finding
inverse transforms can be anything from easy to quite complicated.
Easy examples include the case where the transformed function is a
polynomial in z −1 . Examples of all kinds are given in what follows.
A small table of z-transforms, taken from the “Tables of Constants,
formulae and transforms” used in exams, is included below.
f(t)                                   | Laplace Transform F(s)     | Z Transform F(z)
δ(t) (unit impulse)                    | 1                          | 1
H(t) (Heaviside function or unit step)| 1/s                        | z/(z − 1)
t                                      | 1/s²                       | Tz/(z − 1)²
t²                                     | 2/s³                       | T²z(z + 1)/(z − 1)³
t^n                                    | n!/s^(n+1)                 | lim_(a→0) (−1)^n ∂^n/∂a^n [z/(z − e^(−aT))]
e^(−at)                                | 1/(s + a)                  | z/(z − e^(−aT))
t e^(−at)                              | 1/(s + a)²                 | T z e^(−aT)/(z − e^(−aT))²
sin ωt                                 | ω/(s² + ω²)                | z sin ωT/(z² − 2z cos ωT + 1)
cos ωt                                 | s/(s² + ω²)                | z(z − cos ωT)/(z² − 2z cos ωT + 1)
e^(−at) sin ωt                         | ω/((s + a)² + ω²)          | z e^(−aT) sin ωT/(z² − 2z e^(−aT) cos ωT + e^(−2aT))
e^(−at) cos ωt                         | (s + a)/((s + a)² + ω²)    | (z² − z e^(−aT) cos ωT)/(z² − 2z e^(−aT) cos ωT + e^(−2aT))
1 − e^(−at)                            | a/(s(s + a))               | z(1 − e^(−aT))/((z − 1)(z − e^(−aT)))
You should note carefully how the information is presented in this version of the table: in particular, f(t) is given in the left-hand column, not f(n). Look back at section 14.2 to see why, for instance, f(t) = e^(−bt) corresponds exactly to f(n) = a^n, by choosing a = e^(−bT).
f(n)                  | Z Transform F(z)
δ(n) [Unit impulse]   | 1
H(n) [Unit step]      | z/(z − 1)
n                     | z/(z − 1)²
n²                    | z(z + 1)/(z − 1)³
n^k                   | lim_(a→0) (−1)^k ∂^k/∂a^k [z/(z − e^(−a))]
b^n                   | z/(z − b)
n b^n                 | zb/(z − b)²
sin an                | z sin a/(z² − 2z cos a + 1)
cos an                | z(z − cos a)/(z² − 2z cos a + 1)
b^n sin an            | zb sin a/(z² − 2zb cos a + b²)
b^n cos an            | (z² − zb cos a)/(z² − 2zb cos a + b²)
1 − b^n               | z(1 − b)/((z − 1)(z − b))
By contrast, in this table, f (n) is given and not f (t) — the DSP
way, as opposed to the CE way.
15.2.1 Finite sequences
Example 15.6 Find the inverse z-transform (i.z.t.) of F(z) = 1 + 2z^(−1) − 7z^(−3).
• This is a finite degree polynomial in z^(−1). By the definition of the z-transform, it should be clear that
f(n) = { 1 for n = 0;  2 for n = 1;  −7 for n = 3;  0 otherwise. }
We now look at some examples in which the function to be inverted,
F (z), is close to a form that is in the tables. Always bear in mind also
the z-transform properties derived in the first part of this chapter.
Example 15.7 Find the i.z.t. of F(z) = z/(z + b).
• Note that Z[a^n] = z/(z − a).
• Now substitute b = −a to obtain Z[(−b)^n] = z/(z + b).
Hence, the i.z.t. of F(z) = z/(z + b) is f(n) = (−b)^n.
Example 15.8 Find the i.z.t. of F(z) = z/(z − b)².
• Use the fact that Z[n b^n] = zb/(z − b)² (see Example 15.3).
• Thus, using the linearity property, we can divide both sides by b to get Z[n b^n/b] = z/(z − b)².
Hence, the i.z.t. of F(z) = z/(z − b)² is f(n) = n b^(n−1).
Example 15.9 Find the i.z.t. of F(z) = z/(z² + b²).
• Use the fact that Z[b^n sin an] = zb sin a/(z² − 2zb cos a + b²) (from tables).
• To get rid of the 2zb cos a term in the denominator, set a = π/2, since cos π/2 = 0.
• Hence, Z[b^n sin(nπ/2)] = zb/(z² + b²), since sin π/2 = 1.
• Divide both sides by b to get the final result.
Hence, the i.z.t. of F(z) = z/(z² + b²) is f(n) = b^(n−1) sin(nπ/2).
Here is an example where we first use partial fractions, then use the
tables.
Example 15.10 Find the i.z.t. of F(z) = 2z/((z − 1)(z − 3)).
• First convert this to partial fractions. Note that you want, if possible, a form that is in the tables, so look for partial fractions in the form¹
2z/((z − 1)(z − 3)) = Az/(z − 1) + Bz/(z − 3).
• This gives A = −1, B = 1 (check this), so
F(z) = −z/(z − 1) + z/(z − 3).
• Each of these parts is in the tables: −z/(z − 1) corresponds to −u(n) and z/(z − 3), to 3^n.
Hence, the i.z.t. of F(z) = 2z/((z − 1)(z − 3)) is f(n) = 3^n − u(n). (If n ≥ 0, this is the same as 3^n − 1.)
[Footnote 1: F(z) is also equal to −1/(z − 1) + 3/(z − 3), but the form 1/(z − b) isn't in the tables.]
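sympy can do the partial fraction step for us. A common trick, used in the sketch below (ours, not from the notes), is to expand F(z)/z and multiply back by z, so that the terms come out in the tabulated z/(z − a) form:

```python
import sympy as sp

z = sp.symbols('z')
F = 2*z / ((z - 1)*(z - 3))
# apart() on F/z, then restore the factor of z:
print(sp.expand(sp.apart(F / z) * z))   # z/(z - 3) - z/(z - 1), so f(n) = 3**n - u(n)
```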
Here are a couple of examples where we use power series. This is
a good method to use, not too difficult, and often easier than the
alternatives. Always bear in mind equations (14.5) and (14.7) for
finding power series.
Example 15.11 Find the i.z.t. of F(z) = 1/(z + b)².
• From equation (14.8) we have that
(1 + x)^(−2) = x^(−1) Σ_(n=1)^∞ (−1)^(n+1) n x^n = Σ_(n=1)^∞ (−1)^(n+1) n x^(n−1).
• Hence
(z + b)^(−2) = z^(−2) (1 + b/z)^(−2) = z^(−2) Σ_(n=1)^∞ (−1)^(n+1) n (b/z)^(n−1) = z^(−2) − 2b z^(−3) + 3b² z^(−4) − 4b³ z^(−5) + ...
• Remember the definition of the z-transform. The previous equation is clearly the transform of a function of n which has the values f(0) = f(1) = 0 and
f(2) = 1, f(3) = −2b, f(4) = 3b², ..., f(n) = (n − 1)(−b)^(n−2),
provided that n ≥ 1 (remember that b⁰ = 1).
• Therefore, we are nearly right if we say that f(n) = (n − 1)(−b)^(n−2), but this gives the wrong value for n = 0. We want f(0) = 0, but substituting n = 0 in (n − 1)(−b)^(n−2) gives −b^(−2).
Hence the i.z.t. of F(z) = 1/(z + b)² is f(n) = (n − 1)(−b)^(n−2) + b^(−2) δ(n). Make sure you clearly understand how adding the term b^(−2) δ(n) makes things work out right.
Example 15.12 Find the i.z.t. of F(z) = 1/(z³ − 1).
• By substituting x = z^(−3) in equation (14.5), we have
(z³ − 1)^(−1) = z^(−3) (1 − z^(−3))^(−1) = z^(−3) Σ_(n=0)^∞ z^(−3n) = z^(−3) + z^(−6) + z^(−9) + ...
• Hence f(n) = 1 when n = 3, 6, 9, ... and is zero otherwise.
The answer in the form given above is perfectly adequate. However, if you want to be fancy, you could also write this as f(n) = (1 + 2 cos(2nπ/3))/3 − δ(n) (check that this gives you the right sequence), but this is not necessary.
Example 15.13 Find the i.z.t. of F(z) = z/(z − b)². (We have already done this using tables, see Example 15.8.)
• By substituting x = b z^(−1) in equation (14.7), we have
z(z − b)^(−2) = z^(−1) (1 − b z^(−1))^(−2) = z^(−1) (1 + 2b z^(−1) + 3b² z^(−2) + 4b³ z^(−3) + ...) = 0 + z^(−1) + 2b z^(−2) + 3b² z^(−3) + 4b³ z^(−4) + ...
• Therefore, f(n) = 0, 1, 2b, 3b², 4b³, ... for n = 0, 1, 2, 3, 4, ....
Hence the i.z.t. of F(z) = z/(z − b)² is f(n) = n b^(n−1), as before.
Enough examples: time for you to have a go.
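The power series method is also easy to mechanise: by definition (14.3), the coefficient of z^(−n) in the expansion of F(z) is f(n). A sympy sketch (ours) that substitutes z = 1/w and expands about w = 0, repeating example 15.13:

```python
import sympy as sp

z, w, b = sp.symbols('z w b')
F = z / (z - b)**2
# After z = 1/w, the coefficient of w**n is f(n).
series = sp.series(F.subs(z, 1/w), w, 0, 6)
print(series)   # w + 2*b*w**2 + 3*b**2*w**3 + ..., i.e. f(n) = n*b**(n-1)
```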
Problems, chapter 15
1. Find the z-transform of f (n) = n u(n) in two ways: (i) directly from the
definition; and (ii) by using the differentiation property.
[Both give F (z) = z/(z − 1)2.]
2. Find the inverse z-transform of F (z) = z −1 − z −2 + 2z −4 .
[f (n) = 0, 1, −1, 0, 2 for n = 0, 1, 2, 3, 4 and f (n) = 0 for n > 4.]
3. By using the “multiplication by an exponential sequence” property, deduce the z-transform of f (n) = bn sin an directly from the z-transform
of sin an.
[F (z) = bz sin a/(z 2 − 2bz cos a + b2)]
4. By using the linearity property, deduce the z-transform of f (n) = sin(an+
φ), φ constant, directly from the z-transforms of sin an, cos an.
[F (z) = z(z sin φ + sin(a − φ))/(z 2 − 2z cos a + 1)]
5. By using the linearity property, deduce the z-transform of f (n) = cosh an
directly from the z-transforms of e an , e −an.
[F (z) = z(z − cosh a)/(z 2 − 2z cosh a + 1)]
6. Find the inverse z-transform of F (z) = z 2 /(z +b)2. Use the time advance
property and the z-transform of nbn from the tables.
[f (n) = (n + 1)(−b)n]
7. Use partial fractions, then tables, to find the inverse z-transform of
F (z) = 3z 2/((z − 1)(z + 2)).
[Hint: try partial fractions in the form F (z) = Az/(z − 1) + Bz/(z + 2).]
[f (n) = u(n) + 2(−2)n]
8. Use partial fractions, followed by tables, to find the inverse z-transform
of F (z) = z 2 /(z 2 − 4).
[f (n) = (2n + (−2)n)/2]
9. Find the inverse z-transform of F (z) = (1 − z −1 )(1 − 2z −2).
[f (n) = 1, −1, −2, 2 for n = 0, 1, 2, 3 and is zero otherwise]
10. Use power series to find the inverse z-transform of F (z) = 1/[z(z − 1)].
[f (0) = f (1) = 0, f (n) = 1, n ≥ 2 or, equivalently, f (n) = u(n − 2)]
11. Use power series to find the inverse z-transform of F (z) = 2z/(2z − 1).
[f (n) = 2−n]
12. Use power series to find the inverse z-transform of F (z) = 1/(z + b).
[f (n) = (−b)n−1 + δ(n)/b]
13. Using power series, or otherwise, find the inverse z-transform of F (z) =
(z + 2)/(z + 1).
[f (n) = 1, 1, −1, 1, −1, . . ., or, equivalently, f (n) = 2δ(n) + (−1)n+1]
14. Use power series to find the inverse z-transform of F (z) = 1/(z 2 − 1).
[f (n) = 0, 0, 1, 0, 1, 0 . . . or f (n) = (1 + (−1)n)/2 − δ(n)]
Chapter 16
The Z transform: applications
16.1 Introduction
Having introduced a lot of new material about z-transforms in the previous two chapters, it is now time to see why they are important, by looking at what they can enable us to do. In Electronic Engineering, you are most likely to encounter z-transforms in Control Theory and Digital Signal Processing applications. We will therefore look at some very basic filtering problems in this chapter, but before that, we will discuss the use of the z-transform to solve difference equations.
16.2 Difference equations
Before we see how to solve them, here are a few examples of difference equations.
1. The present value of an annuity after n periods, x(n), obeys the
difference equation x(n + 1) = (x(n) + P )/(1 + r), where r is
the interest rate and P is the amount of each payment.
2. The repeated drug dose model, in which the amount of the drug
still in the body at the n-th period, x(n), obeys the difference
equation x(n + 1) = ax(n) + b. Here, a is the fraction of the
drug which is degraded by the body during one period, and b is
the dose given per period.
3. The cumulative average of a sampled signal. Let the sampled signal be x(n), with n = 0, 1, 2, ..., so that the cumulative average, y(n), is defined as
y(n) = (1/(n + 1)) Σ_(i=0)^n x(i).
We divide by n + 1 because there are n + 1 values on the right hand side. Suppose we want to compute y(n) for all n: this formula seems to be telling us that we need to store all n + 1 values x(0), ..., x(n) in order to do this. Eventually we will run out of memory. We can get around this problem by being clever and noting that (n + 1) y(n) = Σ_(i=0)^n x(i), and so
(n + 2) y(n + 1) = x(n + 1) + Σ_(i=0)^n x(i) = (n + 1) y(n) + x(n + 1).
Hence,
y(n + 1) = ((n + 1) y(n) + x(n + 1))/(n + 2).
This is a more complicated difference equation than the previous two, and its solution depends on the entire sequence x(n).
16.3 Solving difference equations
Difference equations are in several ways like differential equations, the
main difference being that, in a differential equation, the unknown
function, x(t), say, is a function of a continuous variable, t. The
continuous variable t can take on any real value. By contrast, in a
difference equation, the unknown function, x(n), say, is a function of
a discrete variable n, which it is assumed will only take on the values
0, 1, 2, . . ., the non-negative integers (although the solution may in
fact be meaningful for all integers).
Solving a difference equation poses a similar sort of problem to solving
a differential equation. For example, suppose that the difference
equation is x(n + 1) = 2x(n). A solution, if we can find one, will be
a function x(n) that satisfies this for all integers n ≥ 0. It should be
clear that, for a given value of x(0), we have
x(1) = 2x(0), x(2) = 2x(1) = 2² x(0), x(3) = 2x(2) = 2³ x(0), ...
and from this you should be able to spot the general pattern, which
is that x(n) = 2^n x(0). Note that
• This is a first order difference equation: x(n + 1) is a function of
x(n) only.
• Once we have specified a value of x(0), the solution is determined
for all integers n ≥ 0 — this is just like a first order differential
equation, where we need one initial condition to specify a particular solution.
Thus, suppose that we have an initial condition, x(0) = 5 say. Then
the difference equation x(n + 1) = 2x(n) with x(0) = 5 has the
solution¹ x(n) = 5 × 2^n.
[Footnote 1: The general solution is x(n) = x(0) 2^n; the particular solution, when x(0) = 5, is x(n) = 5 × 2^n. Even the terminology is the same as for differential equations.]
You might think that was a rather easy problem with an obvious solution, so consider instead the first order difference equation
Example 16.1
x(n + 1) = 2x(n) + 3^n.
This is harder, because we have an additional function of n on the right hand side. We use the z-transform in a way that should remind you of the use of Laplace transforms to solve differential equations, by going through the following steps:
1. Find the z-transform of the difference equation. In this case, we have
z X(z) − z x(0) = 2X(z) + z/(z − 3).
Note that we have used the Time Advance Property (see section 15.1.3) to find the z-transform of x(n + 1).
2. Solve this equation for X(z):
X(z) = z x(0)/(z − 2) + z/((z − 3)(z − 2)).
3. If necessary, manipulate this expression so that the inverse z-transform can easily be found. In this case, it is best to use partial fractions for the second term; we then have
z/((z − 3)(z − 2)) = z/(z − 3) − z/(z − 2)
(check this) and so
X(z) = z x(0)/(z − 2) + z/(z − 3) − z/(z − 2).
4. Now find the inverse z-transform, which will give us x(n). In this case,
x(n) = 2^n x(0) − 2^n + 3^n = 2^n (x(0) − 1) + 3^n.
(Look back at example 15.7 in the previous chapter if you need to remind yourself of the i.z.t. of z/(z − a).)
You can easily check this: if it's true, then, for any x(0),
x(n + 1) − 2x(n) = 2^(n+1)(x(0) − 1) + 3^(n+1) − 2^(n+1)(x(0) − 1) − 2 × 3^n = 3 × 3^n − 2 × 3^n = 3^n
which is what it should be, according to the difference equation.
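A direct simulation confirms the closed form. The following sketch (not from the notes) iterates the difference equation and compares it with 2^n(x(0) − 1) + 3^n:

```python
x0 = 5
xs = [x0]
for n in range(9):
    xs.append(2*xs[-1] + 3**n)        # x(n+1) = 2 x(n) + 3^n

closed = [2**n * (x0 - 1) + 3**n for n in range(10)]
print(xs == closed)                   # True
```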
As a second example, let's look at the repeated drug dose model.
Example 16.2 Find the general solution to the difference equation x(n + 1) = a x(n) + b. Under what conditions does x(n) tend to a finite limit as n → ∞, and what is this limit?
Bear in mind that the constant b on the right hand side, as far as the z-transform is concerned, is u(n) b. Then the z-transform of this equation is
z X(z) − z x(0) = a X(z) + bz/(z − 1)
(again, using the time advance property) so
X(z) = z x(0)/(z − a) + bz/((z − a)(z − 1)) = z x(0)/(z − a) + (b/(1 − a)) (z/(z − 1) − z/(z − a))
where we have used partial fractions to get the last form. Hence,
x(n) = a^n x(0) + b (u(n) − a^n)/(1 − a).
It is clear from this expression that x(n) tends to a finite limit as n → ∞ only if |a| < 1 (so a^n → 0), and then the limit is b/(1 − a); remember that u(n) = 1 for n ≥ 0.
You could also obtain this last result by applying the Final Value Theorem to X(z): try it and see.
As a third example, let's try a second order difference equation. You'll need to remember the Time Advance Property with an advance of 2 steps as well as 1 step.
Example 16.3 Find the general solution to the difference equation x(n + 2) = 2x(n + 1) + 3x(n), with general initial conditions (that is, x(0), x(1) can be anything). Under what conditions does this solution not blow up as n → ∞?
Follow through the steps in the usual way. The z-transform of the equation is
z² X(z) − z² x(0) − z x(1) = 2[z X(z) − z x(0)] + 3X(z).
Hence,
(z² − 2z − 3) X(z) = (z + 1)(z − 3) X(z) = z(z − 2) x(0) + z x(1)
and so
X(z) = x(0) z(z − 2)/((z + 1)(z − 3)) + x(1) z/((z + 1)(z − 3)).
As usual, we now need partial fractions. Again, we seek fractions of the form z/(z ± a), because we know that this form is in the tables: it has an inverse (∓a)^n. The partial fraction form is
X(z) = (x(0)/4) [z/(z − 3) + 3z/(z + 1)] + (x(1)/4) [z/(z − 3) − z/(z + 1)]
from which it is easy to see that
x(n) = (−1)^n (3x(0) − x(1))/4 + 3^n (x(0) + x(1))/4.
This will blow up (i.e. x(n) will tend to ∞ as n increases), because of the 3^n term, unless x(1) = −x(0). In that case only, x(n) = (−1)^n x(0), which remains finite for all n.
Finally, another second order difference equation.
Example 16.4 Find the general solution to the difference equation
x(n + 2) − 2x(n + 1) cos a + x(n) = 0,
with general initial conditions. Here, a is a constant.
Start in the usual way. The z-transform of the difference equation this time is
z² X(z) − z² x(0) − z x(1) − 2 cos a [z X(z) − z x(0)] + X(z) = 0
so
X(z) = (x(0) z² + z(x(1) − 2x(0) cos a))/(z² − 2z cos a + 1).
Looking in the tables, you will recognise the denominator from the z-transforms of both sin an and cos an. Let us guess that the inverse z-transform of X(z) is of the form x(n) = c1 sin an + c2 cos an, where c1, c2 are constants to be determined. Now, from the tables,
Z[c1 sin an + c2 cos an] = (c1 z sin a + c2 z(z − cos a))/(z² − 2z cos a + 1) = (c2 z² + c1 z sin a − c2 z cos a)/(z² − 2z cos a + 1),
and matching the coefficients of z in the numerator to those in the expression for X(z) (since the denominators are the same), we have
c2 = x(0)  and  c1 sin a − c2 cos a = x(1) − 2x(0) cos a
which we can solve for c1, c2. This gives, finally,
x(n) = ((x(1) − x(0) cos a)/sin a) sin an + x(0) cos an.
16.4 A FIR filter
We now look very briefly at a simple digital signal processing (DSP) application of the z-transform: a Finite Impulse Response (FIR) filter. The purpose of this section is to give you just a taste of why the z-transform is important in DSP applications; you will learn much more about this if you choose the relevant Year 3/MSc options.
The filter we will discuss is a band-stop filter, which is one that attenuates all frequencies within a range, while letting all other frequencies through. We have in fact already seen, in the last example of the previous section, the basis on which a band-stop filter works. The basis is this: imagine that a signal of the form x(n) = c1 sin an + c2 cos an, for a given a, is the input to a system which computes y(n) = x(n) − 2x(n − 1) cos a + x(n − 2). Then we know from example 16.4 that the output sequence y(n) will be zero. In other words, this system filters out the particular signal x(n) defined above (and signals which are close by).
[Figure 16.1: The filtering example. Top: input signal x(n) = x1(n) + x2(n) = 0.5 sin 0.5n + cos 0.05n. Bottom: the output signal y(n) = y1(n) + y2(n) = x(n) − √3 x(n − 1) + x(n − 2), showing that x1(n) has been almost filtered out.]
To see how this works, let us set a = π/6, so that 2 cos a = √3.
Then the output y(n), for n ≥ 2, will be given by
y(n) = x(n) − √3 x(n − 1) + x(n − 2).
We know, from example 16.4, that if the input is x(n) = c1 sin nπ/6 + c2 cos nπ/6, for any constants c1 and c2, then the output will be zero. This is easily checked. Let's set c1 = 1, c2 = 0 to simplify things. Then y(n) = sin nπ/6 − √3 sin((n − 1)π/6) + sin((n − 2)π/6), and expanding the terms, we have
y(n) = sin nπ/6 − √3 (sin nπ/6 cos π/6 − cos nπ/6 sin π/6) + sin nπ/6 cos π/3 − cos nπ/6 sin π/3
= (1 − 3/2 + 1/2) sin nπ/6 + (√3/2 − √3/2) cos nπ/6 = 0
as it should. This would work for any values of c1, c2.
What happens if the input consists of two signals, one close to sin nπ/6 and one far away? Let's take, as an example, x(n) = x1(n) + x2(n), where x1(n) = 0.5 sin 0.5n and x2(n) = cos 0.05n (see figure 16.1, top). Then x1(n) is close to sin nπ/6 (because 0.5 is close to π/6 ≈ 0.524). On the other hand, x2(n) is far away from cos nπ/6.
In this case, the output y(n) consists of a phase- and amplitude-modified version of x2(n), with almost no trace of x1(n): x1(n) has effectively been filtered out (see figure 16.1, bottom). We can calculate the amplitudes of both components as follows. Since this is a linear system, superposition applies and we can write the output y(n) = y1(n) + y2(n), where y1(n) is the response to x1(n) and y2(n), to x2(n). Then
y1(n) = x1(n) − √3 x1(n − 1) + x1(n − 2)
= 0.5 sin 0.5n − 0.5√3 sin 0.5(n − 1) + 0.5 sin 0.5(n − 2)
= 0.5(1 − √3 cos 0.5 + cos 1) sin 0.5n + 0.5(√3 sin 0.5 − sin 1) cos 0.5n
= 0.0102 sin 0.5n − 0.0056 cos 0.5n.
The amplitude of y1(n) is therefore √(0.0102² + 0.0056²) = 0.0116. You should also be able to estimate this amplitude from the magnified portion in figure 16.1.
Carrying out the same calculation for x2(n) shows that the amplitude of y2(n) is 0.26, about 20 times bigger than the amplitude of y1(n), which again agrees with figure 16.1.
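The whole filtering example is easy to reproduce numerically. The sketch below (ours; it approximates each output amplitude by the largest output sample) should print values close to 0.0116 and 0.26:

```python
import math

N = 500
x1 = [0.5 * math.sin(0.5 * n) for n in range(N)]   # close to sin(n*pi/6)
x2 = [math.cos(0.05 * n) for n in range(N)]        # far from cos(n*pi/6)

def fir(sig, n):
    """y(n) = x(n) - sqrt(3)*x(n-1) + x(n-2), the band-stop filter above."""
    return sig[n] - math.sqrt(3) * sig[n - 1] + sig[n - 2]

# By superposition, filter each component separately and compare the sizes.
y1 = [fir(x1, n) for n in range(2, N)]
y2 = [fir(x2, n) for n in range(2, N)]
print(max(abs(v) for v in y1))   # about 0.0116: x1 is almost filtered out
print(max(abs(v) for v in y2))   # about 0.26: x2 passes through, rescaled
```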
Problems, chapter 16
1. Use z-transforms to solve the following difference equations:
(i) x(n + 1) = 3x(n) with x(0) = 5
(ii) x(n + 1) = −2x(n) + 3u(n) with x(0) = 0
(iii) x(n + 2) = 5x(n + 1) − 6x(n) with x(0) = 1, x(1) = −1.
[(i) x(n) = 5 · 3^n, (ii) x(n) = 1 − (−2)^n, (iii) x(n) = 4 · 2^n − 3 · 3^n]
2. Solve x(n + 2) = 3x(n + 1) − 2x(n) by the z-transform method, with
general initial conditions. What relation must there be among the initial
conditions in order for the solution to be constant for n ≥ 0?
[x(n) = (2x(0) − x(1)) u(n) + (x(1) − x(0)) 2^n; constant if x(1) = x(0).]
3. Use the z-transform to solve the present value of an annuity difference
equation x(n), which is x(n + 1) = (x(n) + P )/(1 + r).
[x(n) = (1 + r)^(−n) (x(0) − P/r) + P u(n)/r]
4. Find the difference equation whose solution is x(n) = 5n − 3n, given that
it is of the form x(n + 2) + Bx(n + 1) + Cx(n) = 0. Find the initial
conditions that give rise to this solution.
[x(n + 2) − 8x(n + 1) + 15x(n) = 0, x(0) = 0, x(1) = 2]
5. Solve the difference equation x(n + 1) + 3x(n) = (−1)^n.
[x(n) = (1/2)(−1)^n + (−3)^n (x(0) − 1/2)]
6. A digital filter computes its output, y(n), from its input, x(n), according
to the formula y(n) = x(n) − x(n − 1) · 2 cos a + x(n − 2).
(i) Find a such that this system filters out a signal of the form x(n) =
c1 sin nπ/3.
(ii) Let the input signal be x1(n) + x2(n) = sin n + 8 sin(n/12). The
output is of the form y(n) = y1(n) + y2(n) where y1(n) = A1 sin(n + φ1)
and y2(n) = A2 sin(n/12 + φ2 ). Find the constants A1, A2, φ1 and φ2.
Sketch the input and output waveforms.
[(i) a = π/3, (ii) A1 = 0.0806, φ1 = −1.00, A2 = 7.94, φ2 = −0.0833]
Chapter 17
Matrices I
17.1 The basics
A matrix is an n × m array of numbers: n rows, m columns. Examples (written here row by row, with rows separated by semicolons):
1. [1 0; 0 1] ... the 2 × 2 unit matrix
2. [v1; v2; v3] ... a 3 × 1 column vector
3. [1.4 2 4; 5 1−3j 2] ... a 2 × 3 matrix
4. [a11 a12; a21 a22] ... a general 2 × 2 matrix
We use the convention upper case A, B, C etc. for matrices, underlined letters a, b, c etc. for vectors and ordinary letters, a, b, c etc.
for scalars.
We now go through some of the rules of matrix algebra.
17.2 Matrix equality
Two matrices A and B with the same number of rows and columns
are said to be equal to each other if and only if all their corresponding
elements are equal. For instance, if A and B are 2 × 2 matrices, then
they are equal only if
a11 = b11, a12 = b12, a21 = b21, and a22 = b22.
17.3 Matrix addition
If two matrices A and B have the same number of rows and columns, they can be added by adding together corresponding elements. For example, if
A = [a11 a12; a21 a22]  and  B = [b11 b12; b21 b22]
then
A + B = [a11 + b11, a12 + b12; a21 + b21, a22 + b22].
17.4 Matrix multiplication
17.4.1 Scalar × matrix = matrix
Given
A = [a11 a12; a21 a22]  and  c = a scalar
then
cA = [c a11, c a12; c a21, c a22]
i.e. the result is another matrix. Just multiply each element by c.
17.4.2 Matrix × vector = vector
Given
A = [a11 a12; a21 a22]  and  v = [v1; v2]
then
Av = [a11 v1 + a12 v2; a21 v1 + a22 v2]
i.e. the result is a column vector.
N.B. The number of columns in A must equal the number of rows in v.
17.4.3 Matrix × matrix = matrix
Given
A = [a11 a12; a21 a22]  and  B = [b11 b12; b21 b22]
then
AB = [a11 b11 + a12 b21, a11 b12 + a12 b22; a21 b11 + a22 b21, a21 b12 + a22 b22]
i.e. the result is a 2 × 2 matrix.
N.B. The number of columns in A must equal the number of rows in B. Note also that AB does not equal BA in general: 'matrices do not commute'.
(See problem 1)
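These rules are exactly what numpy implements. A quick sketch (ours) checking, in particular, that matrices do not commute:

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[0, 1], [1, 0]])
v = np.array([5, 6])

print(3 * A)                          # scalar x matrix
print(A @ v)                          # matrix x vector: [17, 39]
print(A @ B)                          # matrix x matrix
print(np.array_equal(A @ B, B @ A))   # False: AB != BA in general
```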
17.5 Determinants
You may have met these before. To recap, the determinant of a 2 × 2 matrix A, written as det A or |A|, is
det A = a11 a22 − a12 a21
i.e. the determinant of a matrix is a number.
What about a 3 × 3 matrix? This can be calculated as three 2 × 2 determinants as follows. Given
A = [a11 a12 a13; a21 a22 a23; a31 a32 a33]
then
det A = a11 det[a22 a23; a32 a33] − a12 det[a21 a23; a31 a33] + a13 det[a21 a22; a31 a32]
This is known as the Laplace development of a determinant.
Remember it as
1. Pick a row or column (used the first row in the above).
2. Taking each element in this row or column in turn, delete the
row and column in which it occurs, and find the determinant of
the remaining 2 × 2 matrix.
3. Multiply this determinant by the element in (2), with signs


+
−
+
.
.
.






+
−
.
.
.
−





+

−
+
.
.
.


 .

.
.
.. .. ..
and add up the three resulting numbers to obtain the determinant.
This works for n × n matrices, but involves a lot of work for n > 3.
(See problems 2 and 3)
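The Laplace development translates directly into a recursive function. The sketch below (ours; fine for small matrices, but the work grows roughly like n!, so use a library routine for anything large) expands along the first row:

```python
def det(M):
    """Determinant of a square matrix (list of lists), by Laplace development."""
    n = len(M)
    if n == 1:
        return M[0][0]
    total = 0
    for j in range(n):                                 # expand along row 0
        minor = [row[:j] + row[j+1:] for row in M[1:]] # delete row 0, column j
        total += (-1)**j * M[0][j] * det(minor)
    return total

print(det([[5, 3, 4], [1, -3, 0], [7, 2, -8]]))   # 236, as in problem 3 below
```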
17.6 Solving two linear equations
Matrix algebra provides a systematic way of solving a set of simultaneous linear equations. For example, given two linear equations
a11 x1 + a12 x2 = w1    (17.1)
and
a21 x1 + a22 x2 = w2    (17.2)
put these into matrix notation by defining
A = [a11 a12; a21 a22],  x = [x1; x2],  w = [w1; w2]
so that equations 17.1 and 17.2 together become
[a11 a12; a21 a22] [x1; x2] = [w1; w2]
or, in matrix notation,
A x = w.    (17.3)
(See problem 4)
Now, suppose that a11, ..., a22 and w1, w2 are given, with w1, w2 not both 0. To solve for x2,
1. Multiply 17.1 by a21 and 17.2 by a11 to get
a11 a21 x1 + a12 a21 x2 = a21 w1
a11 a21 x1 + a11 a22 x2 = a11 w2.
2. Subtract these to get
x2 (a11 a22 − a12 a21) = a11 w2 − a21 w1.
3. Note that a11 a22 − a12 a21 = det A, so
x2 = (a11 w2 − a21 w1)/det A.    (17.4)
Similarly for x1:
x1 = (a22 w1 − a12 w2)/det A.    (17.5)
Look at 17.4 and 17.5: they are the same form as 17.1 and 17.2. We can write 17.4 and 17.5 together in matrix notation:
(1/det A) [a22 −a12; −a21 a11] [w1; w2] = [x1; x2].
This equation looks just like 17.3, but with x and w swapped and matrix A replaced by
(1/det A) [a22 −a12; −a21 a11].
This new matrix is known as the inverse of A, written A⁻¹ or inv A. It has the property that
A⁻¹A = AA⁻¹ = I, the unit matrix [1 0; 0 1].
17.6.1 Properties of the unit matrix
1. The n × n unit matrix has 1s down the leading diagonal and 0s
everywhere else.
2. If A is any n × n matrix and I is the n × n unit matrix, then
AI = IA = A (just like multiplying numbers by 1).
3. For any column vector v with n rows, Iv = v.
(See problem 5)
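The 2 × 2 inverse formula is short enough to code directly. A sketch (ours; the example system is arbitrary):

```python
def inv2(a11, a12, a21, a22):
    """Inverse of a 2x2 matrix via (1/det A)[a22 -a12; -a21 a11]."""
    d = a11*a22 - a12*a21          # det A; must be non-zero
    return [[a22/d, -a12/d], [-a21/d, a11/d]]

# Solve 1*x1 + 2*x2 = 5 and 3*x1 + 4*x2 = 6 via x = A^(-1) w.
Ainv = inv2(1, 2, 3, 4)
w = [5, 6]
x = [Ainv[0][0]*w[0] + Ainv[0][1]*w[1],
     Ainv[1][0]*w[0] + Ainv[1][1]*w[1]]
print(x)   # [-4.0, 4.5]
```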
17.7
Application — Z and Y parameters
The 2-port Z parameters, z11 . . . z22, are defined with reference to
the figure below.
i1
v1
i2
Two port network
v2
The Z parameters are impedances z11 . . . z22 such that
v1 = z11i1 + z12i2
152
and
v2 = z21i1 + z22i2
or, in matrix/vector notation
v = Zi
(17.6)
where
v
v =  1,
v2

i
i =  1,
i2



z
z
Z =  11 12  .
z21 z22


The Y parameters, y11 . . . y22, are admittances (reciprocal impedances)
and are defined by:
(17.7)
i = Y v.
Now, pre-multiplying 17.6 by Z −1 gives
Z −1 v = Z −1 Zi = Ii = i
and, comparing with 17.7
Y = Z −1
Hence,
the Y (admittance) matrix is the inverse of the Z (impedance)
matrix (and vice versa).
(See problem 6)
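Numerically the conversion is a single matrix inversion; a short NumPy sketch (my addition; the admittance values are made up for illustration):

    import numpy as np

    Y = np.array([[0.5, -0.2],
                  [-0.2, 0.4]])   # an arbitrary admittance matrix, in siemens

    Z = np.linalg.inv(Y)          # impedance matrix: Z = Y^{-1}
    print(Z @ Y)                  # should be the 2 x 2 unit matrix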
We shall have more to say about Z and Y parameters in the next
chapter.
Problems, chapter 17
1. If

A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \quad \text{and} \quad B = \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix}

show that AB does not equal BA in general.
2. Evaluate the determinant of the following matrices:

(a) \begin{pmatrix} 2 & 4 \\ 3 & 5 \end{pmatrix}

(b) \begin{pmatrix} jωL & R \\ 1/R & -jωC \end{pmatrix}

(c) \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \\ 3 & 1 & 2 \end{pmatrix}

(d) \begin{pmatrix} a & 4a & 2b \\ c & d & e \\ 2a & 8a & 4b \end{pmatrix}

(e) \begin{pmatrix} 4.3 & -2.2 & 1.2 \\ 2.1 & 1.4 & 5.9 \\ 7.3 & 4.9 & -4.1 \end{pmatrix}

[(a) -2, (b) ω²LC - 1, (c) -18, (d) 0, (e) -262.607]
3. Evaluate the determinant of

\begin{pmatrix} 5 & 3 & 4 \\ 1 & -3 & 0 \\ 7 & 2 & -8 \end{pmatrix}

by expanding (a) along the second row, (b) down the first column.

[(a) and (b) 236]
4. If

A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}

calculate A^{-1} and show that AA^{-1} = A^{-1}A = I, the 2 × 2 unit matrix.
5. Put the following three linear equations into matrix form Ax = w:

2x_1 - x_2 + 5 = 3x_3
-x_3 + 8 = x_1 + x_2
x_2 + 3x_3 = 2 - x_1
6. The admittance matrix for a 2-port network is

Y = \begin{pmatrix} -3j & 0.5 \\ 2 & -0.5j \end{pmatrix}

What is its impedance matrix?

[z_{11} = 0.2j, z_{12} = 0.2, z_{21} = 0.8, z_{22} = 1.2j]
Chapter 18
Matrices II
18.1 Matrix inversion: Pi to T conversion
Given Y_a, Y_b and Y_c in the following Pi configuration

[Figure: a Pi circuit with series admittance Y_b between the two ports and shunt admittances Y_a and Y_c; port variables i_1, v_1 and i_2, v_2.]

the problem is to find Z_a, Z_b and Z_c such that the following T circuit is equivalent to the Pi.

[Figure: a T circuit with series impedances Z_a and Z_c and shunt impedance Z_b between them.]
Matrix manipulation provides a systematic solution to this problem, which is known as the Pi–T or Delta–star transformation.
The first thing to remember is that it is easy to write down
Y -parameters for a Pi circuit,
Z-parameters for a T circuit.
So, what are the Y-parameters for the Pi circuit? The definition we need is i = Yv, or, in full,

i_1 = y_{11}v_1 + y_{12}v_2
i_2 = y_{21}v_1 + y_{22}v_2.
So, for instance,

y_{11} = \frac{i_1}{v_1} \quad \text{when} \quad v_2 = 0

i.e.

y_{11} = Y_a + Y_b.
Similarly,

y_{12} = \frac{i_1}{v_2} \quad \text{when} \quad v_1 = 0

i.e.

y_{12} = -Y_b.
Repeating for y_{21} and y_{22} gives

Y = \begin{pmatrix} Y_a + Y_b & -Y_b \\ -Y_b & Y_b + Y_c \end{pmatrix}.
Now, what are the Z-parameters for the T circuit? By the same method, we find that

Z = \begin{pmatrix} Z_a + Z_b & Z_b \\ Z_b & Z_b + Z_c \end{pmatrix}.
Now, we know that Z = Y^{-1}, so

Z = \begin{pmatrix} Z_a + Z_b & Z_b \\ Z_b & Z_b + Z_c \end{pmatrix} = \frac{1}{\det Y}\begin{pmatrix} Y_b + Y_c & Y_b \\ Y_b & Y_a + Y_b \end{pmatrix}

where \det Y = (Y_a + Y_b)(Y_b + Y_c) - Y_b^2 = Y_aY_b + Y_bY_c + Y_cY_a.
Two matrices are equal only when all their elements are equal, so in order for the Pi and T circuits to be equivalent, Y^{-1} for the Pi must be equal to Z for the T, and so, considering each element in turn, the following must hold:

Z_b = Y_b/\det Y

Z_a + Z_b = (Y_b + Y_c)/\det Y, \quad \text{giving} \quad Z_a = Y_c/\det Y

Z_b + Z_c = (Y_a + Y_b)/\det Y, \quad \text{giving} \quad Z_c = Y_a/\det Y,
which is the required answer.
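The whole calculation can be done numerically. A minimal Python/NumPy sketch of the Pi-to-T conversion (my addition; the helper name pi_to_t and the test values are made up):

    import numpy as np

    def pi_to_t(Ya, Yb, Yc):
        # build the Y matrix of the Pi, invert it, then read off Za, Zb, Zc
        Y = np.array([[Ya + Yb, -Yb],
                      [-Yb, Yb + Yc]], dtype=complex)
        Z = np.linalg.inv(Y)
        Zb = Z[0, 1]          # the off-diagonal element is Zb
        Za = Z[0, 0] - Zb     # since Z[0,0] = Za + Zb
        Zc = Z[1, 1] - Zb     # since Z[1,1] = Zb + Zc
        return Za, Zb, Zc

    # three equal admittances of 1 S should give Za = Zb = Zc = 1/3 ohm
    print(pi_to_t(1.0, 1.0, 1.0))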
18.2 Solving n linear equations
Under certain conditions, for known matrix A and known vector w, the set of n linear equations

Ax = w    (18.1)

can be solved for unknown vector x:

x = A^{-1}w.
The condition is that the inverse of A exists. Looking back at the last
section of the previous chapter, we see that calculating the inverse of
A requires us to divide by det A. The condition for the inverse of A
to exist is therefore that
det A is not equal to 0.
Provided that this condition is met, we can find the inverse of A (in principle) and hence solve the n equations 18.1, as long as w ≠ 0.
Note that this is true regardless of how many equations there are.
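In practice one rarely forms A^{-1} explicitly; a library routine solves Ax = w directly. A minimal NumPy sketch (my addition; the numbers are arbitrary):

    import numpy as np

    A = np.array([[2.0, 4.0],
                  [3.0, 5.0]])
    w = np.array([2.0, 1.0])

    x = np.linalg.solve(A, w)   # solves Ax = w
    print(x, A @ x)             # A @ x should reproduce w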
18.3 Inverting an n × n matrix
We have seen how to invert a 2 × 2 matrix. How is this generalised
to larger matrices?
There are several ways of doing this, one of which is known as the
adjoint method, which is best shown by example.
Example 18.1 Use the adjoint method to invert a 3 × 3 matrix

A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}.

The inverse of A is given by

A^{-1} = \frac{\mathrm{Adj}\,A}{\det A}

where the adjoint of A is calculated in two steps:
1. Replace each element of A with its cofactor. To do this, for each element of A, cross out the row and column in which it appears, and find the determinant of the remaining matrix. Multiply this by +1 or -1, according to its position.

e.g. The cofactor of a_{11} is a_{22}a_{33} - a_{23}a_{32}.
e.g. The cofactor of a_{23} is -(a_{11}a_{32} - a_{12}a_{31}).

The signs we need to multiply by are

\begin{pmatrix} + & - & + \\ - & + & - \\ + & - & + \end{pmatrix}

for a 3 × 3 matrix.
The matrix of cofactors of A is therefore

\begin{pmatrix} a_{22}a_{33} - a_{23}a_{32} & -(a_{21}a_{33} - a_{23}a_{31}) & a_{21}a_{32} - a_{22}a_{31} \\ -(a_{12}a_{33} - a_{13}a_{32}) & a_{11}a_{33} - a_{13}a_{31} & -(a_{11}a_{32} - a_{12}a_{31}) \\ a_{12}a_{23} - a_{13}a_{22} & -(a_{11}a_{23} - a_{13}a_{21}) & a_{11}a_{22} - a_{12}a_{21} \end{pmatrix}
2. Transpose the matrix of cofactors — that is, reflect it about the leading diagonal — to give

\mathrm{Adj}\,A = \begin{pmatrix} a_{22}a_{33} - a_{23}a_{32} & -(a_{12}a_{33} - a_{13}a_{32}) & a_{12}a_{23} - a_{13}a_{22} \\ -(a_{21}a_{33} - a_{23}a_{31}) & a_{11}a_{33} - a_{13}a_{31} & -(a_{11}a_{23} - a_{13}a_{21}) \\ a_{21}a_{32} - a_{22}a_{31} & -(a_{11}a_{32} - a_{12}a_{31}) & a_{11}a_{22} - a_{12}a_{21} \end{pmatrix}

Dividing this by det A gives the inverse of A, provided that det A ≠ 0.
This method extends to n × n matrices, but involves a lot of work
for n > 3.
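A sketch of the adjoint method in Python/NumPy (mine, not part of the printed notes; the test matrix is the one from problem 2(c) of the last chapter):

    import numpy as np

    def adjoint_inverse(A):
        # invert a square matrix by the adjoint method described above
        n = A.shape[0]
        d = np.linalg.det(A)
        if abs(d) < 1e-12:
            raise ValueError("det A = 0: no inverse exists")
        cof = np.empty_like(A)
        for i in range(n):
            for j in range(n):
                minor = np.delete(np.delete(A, i, axis=0), j, axis=1)
                cof[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
        return cof.T / d      # the adjoint is the transpose of the cofactor matrix

    A = np.array([[1.0, 2.0, 3.0],
                  [2.0, 3.0, 1.0],
                  [3.0, 1.0, 2.0]])
    print(adjoint_inverse(A) @ A)   # should be close to the 3 x 3 unit matrix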
18.4 The equation matrix × vector = 0
We have solved Ax = w when w is not the zero vector, 0. What about the equation Ax = 0? (N.B. By '0' I mean the column vector with zeros everywhere.) That is, for a 2 × 2 matrix,

A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}, \quad x = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \quad \text{and} \quad 0 = \begin{pmatrix} 0 \\ 0 \end{pmatrix}

so Ax = 0 becomes the pair of equations

a_{11}x_1 + a_{12}x_2 = 0    (18.2)

and

a_{21}x_1 + a_{22}x_2 = 0.    (18.3)
There are two possibilities. The first is
(a) x1 = x2 = 0 (obviously).
However, another solution may also exist: first find x_1 from 18.2:

x_1 = -\frac{a_{12}x_2}{a_{11}}.    (18.4)
Substitute this in 18.3:

-\frac{a_{21}a_{12}x_2}{a_{11}} + a_{22}x_2 = \left(-\frac{a_{21}a_{12}}{a_{11}} + a_{22}\right)x_2 = 0

so we can see that x_2 is forced to be 0 unless

-\frac{a_{21}a_{12}}{a_{11}} + a_{22} = 0

or, in other words, unless

a_{11}a_{22} - a_{12}a_{21} = 0.
Recognise this? It’s det A, so the second possibility is that
(b) det A = 0, in which case x1 and x2 are not forced to be 0.
This is a general condition and applies to n linear equations, not just
two.
In case (b), det A = 0, only the ratio x_1/x_2 is defined by 18.2 and 18.3. From 18.2 this ratio is

\frac{x_1}{x_2} = -\frac{a_{12}}{a_{11}}.
18.5 Application of matrix × vector = 0

[Figure 18.1: An application of Ax = 0. A two-loop circuit with loop currents i_1 and i_2; each loop contains an inductor L and a capacitor C, and the loops share a common capacitor C.]
From Kirchhoff's voltage law, we know that the sum of the voltages around closed loops is zero, so for the circuit in figure 18.1

Loop 1:

i_1\left(jωL + \frac{1}{jωC}\right) + \frac{i_1 - i_2}{jωC} = 0    (18.5)

and Loop 2:

i_2\left(jωL + \frac{1}{jωC}\right) + \frac{i_2 - i_1}{jωC} = 0.    (18.6)
In matrix form,

Zi = 0

so

\frac{1}{jωC}\begin{pmatrix} 2 - ω^2LC & -1 \\ -1 & 2 - ω^2LC \end{pmatrix}\begin{pmatrix} i_1 \\ i_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.    (18.7)

The solution is either

(a) i_1 = i_2 = 0 (true, but trivial)

or

(b) det Z = 0, which gives

(2 - ω^2LC)^2 - 1 = 0
i.e.

2 - ω^2LC = ±1

so

ω^2 = \frac{1}{LC}, \quad \frac{3}{LC}.

This condition gives the two resonant frequencies of the circuit. As stated above, i_1 and i_2 aren't fixed, but their ratio is: from either 18.5 or 18.6,

\frac{i_2}{i_1} = 2 - ω^2LC = ±1.

The interpretation of this is that current i_1 can have any magnitude; then i_2 is of the same magnitude, but with the same or opposite sign (i.e. circulates in the same or the opposite direction).
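A short sketch using Python's sympy (an addition of mine) confirms the resonance condition symbolically:

    import sympy as sp

    w, L, C = sp.symbols('omega L C', positive=True)
    detZ = (2 - w**2 * L * C)**2 - 1       # det Z = 0 at resonance
    print(sp.solve(sp.Eq(detZ, 0), w))     # expect 1/sqrt(L*C) and sqrt(3)/sqrt(L*C)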
18.6 Eigenvalues and eigenvectors
The equation
(A - λI)x = 0    (18.8)
where A is an n × n matrix, I is the n × n unit matrix, λ is a
number and x is a vector, arises in problems in circuit theory and
other branches of electronic engineering.
Note the following about the nontrivial solutions λ and x to this
equation:
• There are n values of λ, which are known as the eigenvalues of
A.
• The eigenvalues can be real or complex, depending on A.
• To each eigenvalue there corresponds a vector x, known as an
eigenvector of A.
• If x1 is an eigenvector, then ax1, where a is any constant, is also
an eigenvector.
Example 18.2 Find the eigenvalues and corresponding eigenvectors of the matrix

A = \begin{pmatrix} 2 & 1 \\ -2 & 5 \end{pmatrix}.
Answer As we saw in the previous section, the equation Ax = 0 only has nontrivial solutions if det A = 0. Hence, nontrivial solutions to equation (18.8) can only be found if

\det\left[\begin{pmatrix} 2 & 1 \\ -2 & 5 \end{pmatrix} - λ\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}\right] = 0.

We can solve this for λ, the eigenvalues:

\det\left[\begin{pmatrix} 2 & 1 \\ -2 & 5 \end{pmatrix} - λ\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}\right] = \det\begin{pmatrix} 2-λ & 1 \\ -2 & 5-λ \end{pmatrix} = (2-λ)(5-λ) + 2 = λ^2 - 7λ + 12 = 0

which has solutions

λ = 3, 4.
To each of these values of λ there corresponds an eigenvector x, which is defined such that

(A - λI)x = 0.

Taking the eigenvalue λ = 3 gives

[A - 3I]x = \left[\begin{pmatrix} 2 & 1 \\ -2 & 5 \end{pmatrix} - \begin{pmatrix} 3 & 0 \\ 0 & 3 \end{pmatrix}\right]\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} -1 & 1 \\ -2 & 2 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = 0.

Multiplying this out gives

-x_1 + x_2 = 0 \quad \text{and} \quad -2x_1 + 2x_2 = 0

both of which tell us that x_1 = x_2. Hence, the eigenvector corresponding to λ = 3 is

x = a\begin{pmatrix} 1 \\ 1 \end{pmatrix}

for an arbitrary constant a.

The eigenvector corresponding to λ = 4 is calculated in the same way. It is

b\begin{pmatrix} 1 \\ 2 \end{pmatrix}

with b another arbitrary constant.
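Numerically, eigenvalues and eigenvectors come straight from a library call; a NumPy sketch for the matrix of example 18.2 (my addition):

    import numpy as np

    A = np.array([[2.0, 1.0],
                  [-2.0, 5.0]])
    vals, vecs = np.linalg.eig(A)
    print(vals)   # 3 and 4 (possibly in the other order)
    print(vecs)   # columns: unit-length multiples of (1, 1) and (1, 2)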
18.7 Applications of eigenvalues/eigenvectors
The resonance problem considered in section 18.5 can also be treated as an eigenvalue problem. Multiplying equation 18.7 by jωC gives

\begin{pmatrix} 2 - ω^2LC & -1 \\ -1 & 2 - ω^2LC \end{pmatrix}\begin{pmatrix} i_1 \\ i_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}

so

\Big[\underbrace{\begin{pmatrix} 2 & -1 \\ -1 & 2 \end{pmatrix}}_{A} - \underbrace{ω^2LC}_{λ}\,\underbrace{\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}}_{I}\Big]\underbrace{\begin{pmatrix} i_1 \\ i_2 \end{pmatrix}}_{x} = \underbrace{\begin{pmatrix} 0 \\ 0 \end{pmatrix}}_{0}.
You should recognise this as the eigenvalue equation again. The eigenvalues of the matrix A are LC × (the resonant frequencies of the circuit)². The eigenvectors of A are the currents i_1 and i_2. As pointed out before, only the ratio i_1/i_2 is fixed, not their actual values — which is also true for eigenvectors.
For another application, see problems.
Problems, chapter 18
1.∗ The T-Pi transformation. Use matrix algebra to find the values of Ya , Yb
and Yc , in terms of Za , Zb and Zc , that make the following two circuits
equivalent:
[Figure: a T circuit with series impedances Z_a and Z_c and shunt impedance Z_b, and a Pi circuit with series admittance Y_b and shunt admittances Y_a and Y_c.]
[Ya = Zc /D, Yb = Zb /D, Yc = Za /D, where D = Za Zb + Zb Zc + Zc Za ]
2. (a) Put the following equations into the form Az = b:
2z1 − 3z2 = 4
9z2 − 6z1 = −12
Find det A. Can you solve the equations? Why?
(b) Put b = 0. Now what can you say about z1 and z2 ?
[(a) det A = 0. No. The 2nd eqn. is just −3 × the first, so it gives us no
new information; det A = 0 is telling us this. (b) z1 /z2 = 3/2.]
3. Consider the following circuit:

[Figure: a two-loop circuit with loop currents i_1 and i_2; the components shown are capacitors 2C and C and inductors 2L, L and L.]

Find the resonant frequencies and the corresponding values of i_1/i_2.

[ω^2 = (4 ± √6)/(10LC), i_1/i_2 = (2 ∓ 2√6)/(4 ± √6) = -0.449, 4.449]
4. Find the eigenvalues and corresponding eigenvectors for the following matrices:

(a) \begin{pmatrix} 4 & 3 \\ 2 & 5 \end{pmatrix}

(b) \begin{pmatrix} 2 & 3 \\ -1 & 2 \end{pmatrix}

(c) \begin{pmatrix} 4 & 0 & 1 \\ 0 & 4 & 7 \\ -5 & 1 & 3 \end{pmatrix}

[(a) 2, (-3/2, 1) and 7, (1, 1); (b) 2 ± j√3, (1, ±j/√3); (c) 4, (1, 5, 0); 2, (1, 7, -2) and 5, (1, 7, 1)]
5.∗ (a) The two port network below has transmission parameter matrix T, which is defined so that

\begin{pmatrix} v_2 \\ i_2 \end{pmatrix} = \begin{pmatrix} t_{11} & t_{12} \\ t_{21} & t_{22} \end{pmatrix}\begin{pmatrix} v_1 \\ i_1 \end{pmatrix}.

[Figure: a two-port network with v_1, i_1 at port 1 and v_2, i_2 at port 2, and an external impedance Z_0 connected to port 2.]

An external impedance Z_0 is connected to port 2. If Z_0 is chosen so that the input impedance (v_1/i_1) is also Z_0, show that the eigenvalues of T are the ratios i_2/i_1.

What do the eigenvectors of T correspond to?

(b) The network above has parameters

T = \begin{pmatrix} 8 & 3 \\ 1 & 8 \end{pmatrix}.

Find the values of Z_0 such that the input impedance is Z_0 and the corresponding current ratios, i_2/i_1.

[Z_0 = ±√3, i_2/i_1 = 8 ± √3]
Chapter 19
The wave equation
Aims
By the end of this chapter, you should understand
• what a partial differential equation is
• how to derive the wave equation for a transmission line
• how to find a general solution to the wave equation
• why signals propagate along lines.
19.1 Partial differential equations
You have met differential equations in the first year and they appear
again in this course, in the Laplace transform chapters. In this chapter, we derive and discuss a partial differential equation, known as
the wave equation, that crops up frequently. The one-dimensional
wave equation is
\frac{∂^2v}{∂t^2} = c^2\frac{∂^2v}{∂x^2}.    (19.1)

Points to note about it are:
• The unknown function, v = v(x, t), is a function of more than
one variable. In this case v(x, t) is a function of distance, x, and
time, t.
• In ordinary differential equations, the unknown function depends
on only one variable; in partial differential equations, the unknown function depends on two or more variables.
• The constant c² has dimensions of velocity squared.
Other examples of partial differential equations include

• Laplace's equation

∇^2 v(x, y, z) = \frac{∂^2v}{∂x^2} + \frac{∂^2v}{∂y^2} + \frac{∂^2v}{∂z^2} = 0

• The three-dimensional wave equation

\frac{∂^2v}{∂t^2} = c^2∇^2 v(x, y, z, t)

both of which arise in electromagnetic problems.
In this chapter, we concentrate on the one-dimensional wave equation, 19.1.
19.2 Derivation of the wave equation

[Figure 19.1: A piece of coaxial cable, showing the inner conductor and the earthed shield.]
In this section we derive equation 19.1 for a coaxial cable which has
an inner conducting core and an earthed outer shield, as shown in
figure 19.1. We assume that there is no leakage between the inner
conductor and the shield, and that the conductor has zero resistance.
Suppose that the inner conductor has inductance L per unit length
and capacitance C between it and the shield, also per unit length.
Then a section of cable of length δx has inductance Lδx and capacitance Cδx.
Consider first the inductive behaviour of a length δx of cable, illustrated below.

[Figure: a section of line of length δx between positions x and x + δx, carrying current i, with voltage v at x and v + δv at x + δx.]

The voltage of the inner conductor is v at a distance x along the line, and v + δv at a distance x + δx. If the current is i, then, from the definition of inductance, we get

v - (v + δv) = (Lδx)\frac{∂i}{∂t}

(N.B. signs). Rearranging and letting δx → 0 gives
-\frac{∂v}{∂x} = L\frac{∂i}{∂t}.    (19.2)
Now consider the capacitive behaviour of the same piece of line.

[Figure: the same section of line of length δx, with current i at x, current i + δi at x + δx, and voltage v across the capacitance.]

The current in the core is i at a distance x along the line, and i + δi at a distance x + δx. If the voltage is v, then from the capacitance equation we get δQ = (Cδx)v, and using the fact that i = dQ/dt gives

i - (i + δi) = (Cδx)\frac{∂v}{∂t}.

Rearranging and letting δx → 0 gives

-\frac{∂i}{∂x} = C\frac{∂v}{∂t}.    (19.3)

We can now derive the wave equation from 19.2 and 19.3. Differentiating 19.2 with respect to x gives

-\frac{∂^2v}{∂x^2} = L\frac{∂^2i}{∂x∂t}
and differentiating 19.3 with respect to t gives

-\frac{∂^2i}{∂t∂x} = C\frac{∂^2v}{∂t^2}.

Combining these, and using the fact that \frac{∂^2i}{∂t∂x} = \frac{∂^2i}{∂x∂t}, gives

\frac{∂^2v}{∂t^2} = \frac{1}{LC}\frac{∂^2v}{∂x^2}.    (19.4)
This is the wave equation. Remembering that L and C are the inductance/capacitance per unit length, you should show that 1/(LC) has the dimensions of velocity squared. In fact it can be shown (see electromagnetism course notes) that

LC = ε_0ε_rμ_0μ_r.

In an air-filled cable the relative permittivity/permeability, ε_r = μ_r = 1, so the velocity is

c = \frac{1}{\sqrt{ε_0μ_0}} ≈ 3.0 × 10^8 m/s

which is the velocity of light.
19.3 The d'Alembert solution of the wave equation
Look again at the wave equation in the form 19.1 in which it was first given. Suppose f is a function, which is arbitrary except that it can be differentiated twice. Now consider f(x - ct). Differentiating:¹

\frac{∂f(x - ct)}{∂x} = f'(x - ct) \quad \text{and} \quad \frac{∂f(x - ct)}{∂t} = -cf'(x - ct)

\frac{∂^2f(x - ct)}{∂x^2} = f''(x - ct) \quad \text{and} \quad \frac{∂^2f(x - ct)}{∂t^2} = c^2f''(x - ct)

¹ I have used f'(x - ct) to mean "the derivative of f with respect to its argument, evaluated at x - ct". For instance, if f(x - ct) = (x - ct)^3, then f'(x - ct) = 3(x - ct)^2 and f''(x - ct) = 6(x - ct).
Looking at the second derivatives, we can see that f(x - ct) is a solution to the wave equation. That is,

\frac{∂^2f(x - ct)}{∂t^2} = c^2\frac{∂^2f(x - ct)}{∂x^2}.

The same is true of another arbitrary, twice-differentiable function, g(x + ct) (see problems). In fact, the most general possible solution to 19.1 is the sum of the two:

v(x, t) = f(x - ct) + g(x + ct).    (19.5)

This is known as the d'Alembert solution of the wave equation.
Points to note:
• f (x − c t) represents a wave travelling to the right — in the
direction of increasing x — with velocity c.
• g(x + c t) represents a wave travelling to the left — decreasing
x — with velocity c.
• Both f and g are arbitrary functions — hence, there is a very
wide range of possible solutions to the wave equation.
• Boundary conditions are needed to find f and g for a given situation — see below.
• All of this theory applies equally to plane electromagnetic waves
in free space, waves on stretched strings etc. as well as waves on
coaxial cables.
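You can verify the first point symbolically. A short sympy sketch (my addition) shows that f(x - ct) satisfies the wave equation for an arbitrary twice-differentiable f:

    import sympy as sp

    x, t, c = sp.symbols('x t c')
    f = sp.Function('f')               # arbitrary twice-differentiable function

    u = f(x - c*t)                     # right-travelling wave
    residual = sp.diff(u, t, 2) - c**2 * sp.diff(u, x, 2)
    print(sp.simplify(residual))       # prints 0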
19.4 Boundary conditions
As is the case with ordinary differential equations, some initial information is needed to solve the wave equation in a particular case. This
information is contained in the boundary conditions. Two boundary
conditions are needed:
1. the voltage on the line at t = 0 for all x, which we shall call
V0(x); and
2. the derivative of the voltage with respect to time, also at t = 0
and again for all x, which we shall call W0(x).
Given V0(x) and W0(x) we can find the solution, v(x, t) for all x and
t by finding the functions f and g appearing in 19.5. To put this in
the form of an equation
Given the general solution (f and g arbitrary)
apply boundary conditions, V0(x) and W0(x)
→ Particular solution, v(x, t).
So, how do we find the functions f and g given the functions V0(x)
and W0(x)? Let us write down the two things we know:
V_0(x) = f(x - ct) + g(x + ct)|_{t=0} = f(x) + g(x)    (19.6)
for the voltage at t = 0, and

W_0(x) = \left.\frac{∂f(x - ct)}{∂t} + \frac{∂g(x + ct)}{∂t}\right|_{t=0} = -cf'(x) + cg'(x)    (19.7)
for the derivative of voltage w.r.t. t at t = 0. We have two equations here, which we hope to solve for the two unknown functions f and g. To do this, first integrate (19.7) and divide by c to get

-f(x) + g(x) = \frac{1}{c}\int_{x_0}^{x} W_0(s)\,ds    (19.8)
where x_0 is an arbitrary constant which disappears later on. Now, subtracting (19.8) from (19.6) gives

f(x) = \frac{1}{2}V_0(x) - \frac{1}{2c}\int_{x_0}^{x} W_0(s)\,ds

and adding the same pair of equations gives

g(x) = \frac{1}{2}V_0(x) + \frac{1}{2c}\int_{x_0}^{x} W_0(s)\,ds.
Remembering that the solution, v(x, t), is f(x - ct) + g(x + ct), we obtain

v(x, t) = \frac{1}{2}[V_0(x - ct) + V_0(x + ct)] - \frac{1}{2c}\int_{x_0}^{x-ct} W_0(s)\,ds + \frac{1}{2c}\int_{x_0}^{x+ct} W_0(s)\,ds.

However,

-\int_{x_0}^{x-ct} W_0(s)\,ds = +\int_{x-ct}^{x_0} W_0(s)\,ds

(swapping the limits changes the sign), so combining the two integrals gives, finally,

v(x, t) = \frac{1}{2}[V_0(x - ct) + V_0(x + ct)] + \frac{1}{2c}\int_{x-ct}^{x+ct} W_0(s)\,ds.    (19.9)
To illustrate further, let us consider an example.
Example 19.1 A coaxial cable stretching to ±∞ has on it the initial voltage

V_0(x) = \frac{1}{1 + x^2}

and initial time derivative of voltage

W_0(x) = 0
at t = 0. Find the function v(x, t), which describes the voltage as a function of x and t, if the wave velocity for the cable is c.

Answer We are being asked to solve the wave equation with the boundary condition that, at t = 0,

V_0(x) = v(x, 0) = \frac{1}{1 + x^2}

and W_0(x) = 0. The required solution to the wave equation is

v(x, t) = \frac{1}{2}V_0(x - ct) + \frac{1}{2}V_0(x + ct) + \frac{1}{2c}\int_{x-ct}^{x+ct} 0\,ds

from 19.9. Hence,

v(x, t) = \frac{1}{2(1 + (x - ct)^2)} + \frac{1}{2(1 + (x + ct)^2)}

where c is the wave velocity for the cable. You can check that at t = 0, v(x, 0) = V_0(x), as it should be. You can also check that the time derivative is zero at t = 0, as required.
19.5 What does it all mean?
The above example shows why a signal (a time-varying voltage) fed
into one end of a coaxial cable of length l, comes out at the other
end a time l/c later.
The condition that f and g must be twice differentiable is automatically satisfied for all real voltage waveforms on transmission lines,
even supposedly rectangular pulses. This is because it is not possible
to create a perfectly sharp voltage edge, as this would imply a voltage with infinite first derivative. The reason this cannot happen is
that there is always some stray capacitance C_s around any conductor, and i = C_s dv/dt would be infinite for an infinitely sharp edge.
No source can generate an infinite current, which would require an
infinite number of electrons to flow past a point (there aren’t enough
electrons in the Universe) or a finite number of electrons to flow with
infinite velocity (which violates relativity). By a similar argument,
v'' cannot be infinite because that would imply an infinite di/dt for C_s. Such a current flowing through any stray inductance would give
rise to an infinite voltage across the inductance.
You can get a good idea what the solution to the example actually looks like by using a simple demonstration I have set up. This
uses the animation function in the algebraic manipulation program
xmaple. It animates the solution to example 19.1.
I recommend you look at this. You can also modify the program
yourself to see how the pictures change.
To see the demonstration
1. Copy
/vol/examples/teaching/engmaths2/wave eqn
to your home directory. Call it wave eqn.
2. Type xmaple.
3. When the xmaple window comes up, type read wave eqn;.
4. After a short while a plot of single-humped function will be produced. Click on the plot; a box appears around it and a second
row of buttons appears. Click on the play button ✄ to see the
two waves move off in opposite directions.
You can also use xmaple to differentiate v(x, t) with respect to t,
substitute t = 0 in the result: simplify(subs(t = 0, diff(v,
t))), and show that this is 0 as it should be.
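If you do not have access to xmaple, much the same animation can be produced with Python (numpy and matplotlib); this sketch is my addition, with c set to 1 for illustration:

    import numpy as np
    import matplotlib.pyplot as plt

    c = 1.0                                 # wave velocity (arbitrary units)
    x = np.linspace(-20, 20, 800)

    def v(x, t):
        # d'Alembert solution of example 19.1: two half-height humps
        return 0.5 / (1 + (x - c*t)**2) + 0.5 / (1 + (x + c*t)**2)

    for t in np.linspace(0, 10, 100):       # crude animation loop
        plt.cla()
        plt.plot(x, v(x, t))
        plt.ylim(0, 1.1)
        plt.title("t = %.2f" % t)
        plt.pause(0.05)                     # the humps move apart at speed c
    plt.show()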
Problems, chapter 19
1. Show that
v(x, t) = A sin(kx − ωt) + B cos(kx + ωt)
with A and B arbitrary constants, is a solution to the wave equation.
What is the wave velocity in this case?
[Velocity = ±ω/k]
2. Show that g(x + ct), where g is an arbitrary, twice-differentiable function, is a solution of the wave equation.
3. A string is given an initial displacement

V_0(x) = \frac{\sin x}{x}.

The initial velocity of the string is everywhere zero. Find v(x, t), the function that describes the motion of the string after it is released at t = 0. Describe in words and a sketch what the motion looks like. Show that v(x, t) has the properties (a) v(x, 0) = V_0(x) and (b) \frac{∂v(x,t)}{∂t} = 0 at t = 0.

To see what this solution looks like, use xmaple to animate v(x, t) for you.

[v(x, t) = 1/2[sin(x - ct)/(x - ct) + sin(x + ct)/(x + ct)]]
4.∗ The solution to the wave equation with initial displacement V_0(x) and initial time derivative W_0(x) is

v(x, t) = \frac{V_0(x - ct) + V_0(x + ct)}{2} + \frac{1}{2c}\int_{x-ct}^{x+ct} W_0(s)\,ds

If the initial voltage on an infinite coaxial cable is e^{-x^2}, what must the initial rate of change of voltage, W_0(s), be in order that v(x, t) consists of only a single pulse of height 1, with shape e^{-x^2}, moving to the right?

You are encouraged to use xmaple to animate this solution too.

[W_0(s) = 2cse^{-s^2}]
Area under the Gaussian error curve

The table below gives the area under the Gaussian error curve between 0 and z, where z = |x - x̄|/σ.

Example: For z = 1.72, area = 0.4573.
z     .00    .01    .02    .03    .04    .05    .06    .07    .08    .09
0.0   .0000  .0040  .0080  .0120  .0160  .0199  .0239  .0279  .0319  .0359
0.1   .0398  .0438  .0478  .0517  .0557  .0596  .0636  .0675  .0714  .0753
0.2   .0793  .0832  .0871  .0910  .0948  .0987  .1026  .1064  .1103  .1141
0.3   .1179  .1217  .1255  .1293  .1331  .1368  .1406  .1443  .1480  .1517
0.4   .1554  .1591  .1628  .1664  .1700  .1736  .1772  .1808  .1844  .1879
0.5   .1915  .1950  .1985  .2019  .2054  .2088  .2123  .2157  .2190  .2224
0.6   .2257  .2291  .2324  .2357  .2389  .2422  .2454  .2486  .2517  .2549
0.7   .2580  .2611  .2642  .2673  .2704  .2734  .2764  .2794  .2823  .2852
0.8   .2881  .2910  .2939  .2967  .2995  .3023  .3051  .3078  .3106  .3133
0.9   .3159  .3186  .3212  .3238  .3264  .3289  .3315  .3340  .3365  .3389
1.0   .3413  .3438  .3461  .3485  .3508  .3531  .3554  .3577  .3599  .3621
1.1   .3643  .3665  .3686  .3708  .3729  .3749  .3770  .3790  .3810  .3830
1.2   .3849  .3869  .3888  .3907  .3925  .3944  .3962  .3980  .3997  .4015
1.3   .4032  .4049  .4066  .4082  .4099  .4115  .4131  .4147  .4162  .4177
1.4   .4192  .4207  .4222  .4236  .4251  .4265  .4279  .4292  .4306  .4319
1.5   .4332  .4345  .4357  .4370  .4382  .4394  .4406  .4418  .4429  .4441
1.6   .4452  .4463  .4474  .4484  .4495  .4505  .4515  .4525  .4535  .4545
1.7   .4554  .4564  .4573  .4582  .4591  .4599  .4608  .4616  .4625  .4633
1.8   .4641  .4649  .4656  .4664  .4671  .4678  .4686  .4693  .4699  .4706
1.9   .4713  .4719  .4726  .4732  .4738  .4744  .4750  .4756  .4761  .4767
2.0   .4772  .4778  .4783  .4788  .4793  .4798  .4803  .4808  .4812  .4817
2.1   .4821  .4826  .4830  .4834  .4838  .4842  .4846  .4850  .4854  .4857
2.2   .4861  .4864  .4868  .4871  .4875  .4878  .4881  .4884  .4887  .4890
2.3   .4893  .4896  .4898  .4901  .4904  .4906  .4909  .4911  .4913  .4916
2.4   .4918  .4920  .4922  .4925  .4927  .4929  .4931  .4932  .4934  .4936
2.5   .4938  .4940  .4941  .4943  .4945  .4946  .4948  .4949  .4951  .4952
2.6   .4953  .4955  .4956  .4957  .4959  .4960  .4961  .4962  .4963  .4964
2.7   .4965  .4966  .4967  .4968  .4969  .4970  .4971  .4972  .4973  .4974
2.8   .4974  .4975  .4976  .4977  .4977  .4978  .4979  .4979  .4980  .4981
2.9   .4981  .4982  .4982  .4983  .4984  .4984  .4985  .4985  .4986  .4986
3.0   .4987  .4987  .4987  .4988  .4988  .4989  .4989  .4989  .4990  .4990
3.1   .4990  .4991  .4991  .4991  .4992  .4992  .4992  .4992  .4993  .4993
3.2   .4993  .4993  .4994  .4994  .4994  .4994  .4994  .4995  .4995  .4995
3.3   .4995  .4995  .4995  .4996  .4996  .4996  .4996  .4996  .4996  .4997
3.4   .4997  .4997  .4997  .4997  .4997  .4997  .4997  .4997  .4997  .4998
3.5   .4998  .4998  .4998  .4998  .4998  .4998  .4998  .4998  .4998  .4998
3.6   .4998  .4998  .4999  .4999  .4999  .4999  .4999  .4999  .4999  .4999
3.7   .4999  .4999  .4999  .4999  .4999  .4999  .4999  .4999  .4999  .4999
3.8   .4999  .4999  .4999  .4999  .4999  .4999  .4999  .5000  .5000  .5000
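The table entries can be reproduced with the error function; a short Python check (my addition):

    import math

    def gauss_area(z):
        # area under the Gaussian error curve between 0 and z
        return 0.5 * math.erf(z / math.sqrt(2))

    print(round(gauss_area(1.72), 4))   # 0.4573, as in the example above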
LAPLACE TRANSFORMS

F(s) = \int_0^∞ f(t)e^{-st}\,dt
f(t)                                           |  F(s)
af_1(t) + bf_2(t)                              |  aF_1(s) + bF_2(s)
df(t)/dt                                       |  sF(s) - f(0)
d^2f(t)/dt^2                                   |  s^2F(s) - sf(0) - f'(0)
d^nf(t)/dt^n                                   |  s^nF(s) - s^{n-1}f(0) - s^{n-2}f'(0) - ... - f^{(n-1)}(0)
\int_0^t f(u)\,du                              |  F(s)/s
\int_0^t \int_0^u f(v)\,dv\,du                 |  F(s)/s^2
tf(t)                                          |  -dF(s)/ds
t^n f(t),  n > 0                               |  (-1)^n d^nF(s)/ds^n
(1/t)f(t)                                      |  \int_s^∞ F(u)\,du
\int_0^t f(t - u)g(u)\,du = f(t) ∗ g(t)        |  F(s)G(s)
e^{at}f(t)                                     |  F(s - a)
f(t - a) with f(t) = 0 for t < 0,  a > 0       |  e^{-as}F(s)
(1/a)f(t/a),  a > 0                            |  F(as)
Re f(t)                                        |  Re F(s)
Im f(t)                                        |  Im F(s)
f(t), where f(t + a) = f(t)                    |  [1/(1 - e^{-as})] \int_0^a f(t)e^{-st}\,dt
f(t), where f(t + a) = -f(t)                   |  [1/(1 + e^{-as})] \int_0^a f(t)e^{-st}\,dt
Laplace Transforms of Simple Functions

f(t)                                           |  F(s)
H(t) (Heaviside function or unit step)         |  1/s
δ(t) (Dirac δ-function)                        |  1
t^{n-1}/(n-1)!                                 |  1/s^n   (n = 2, 3, 4, ...)
e^{-at}                                        |  1/(s + a)
te^{-at}                                       |  1/(s + a)^2
t^{n-1}e^{-at}/(n-1)!                          |  1/(s + a)^n
(e^{-at} - e^{-bt})/(b - a)                    |  1/((s + a)(s + b))
(1/a) sin at                                   |  1/(s^2 + a^2)
cos at                                         |  s/(s^2 + a^2)
(1/a) sinh at                                  |  1/(s^2 - a^2)
cosh at                                        |  s/(s^2 - a^2)
(πt)^{-1/2}                                    |  s^{-1/2}
2(t/π)^{1/2}                                   |  s^{-3/2}
(1/a^2)(1 - cos at)                            |  1/(s(s^2 + a^2))
(1/a^3)(at - sin at)                           |  1/(s^2(s^2 + a^2))
(1/(2a^3))(sin at - at cos at)                 |  1/(s^2 + a^2)^2
(t/(2a)) sin at                                |  s/(s^2 + a^2)^2
(1/(2a))(sin at + at cos at)                   |  s^2/(s^2 + a^2)^2
t cos at                                       |  (s^2 - a^2)/(s^2 + a^2)^2
Laplace Transforms of Simple Functions (continued)

f(t)                                                                 |  F(s)
e^{-at} cos ωt, where ω = √(b - a^2)                                 |  (s + a)/(s^2 + 2as + b)
(1/ω)e^{-at} sin ωt, where ω = √(b - a^2)                            |  1/(s^2 + 2as + b)
H(t - a) (Heaviside function starting at t = a)                      |  (1/s)e^{-as}
H(t) - H(t - a) (rectangular pulse, equal to 1 from 0 to a)          |  (1/s)(1 - e^{-as})
(1/a)(1 - e^{-at})                                                   |  1/(s(s + a))
(1/(ab))[1 - (be^{-at} - ae^{-bt})/(b - a)]                          |  1/(s(s + a)(s + b))
(1/(ab))[α - (b(α - a)e^{-at} - a(α - b)e^{-bt})/(b - a)]            |  (s + α)/(s(s + a)(s + b))
(1/(a - b))(ae^{-at} - be^{-bt})                                     |  s/((s + a)(s + b))
(1/(b - a))[(α - a)e^{-at} - (α - b)e^{-bt}]                         |  (s + α)/((s + a)(s + b))
e^{-at}/((b - a)(c - a)) + e^{-bt}/((c - b)(a - b)) + e^{-ct}/((a - c)(b - c))   |  1/((s + a)(s + b)(s + c))
(α - a)e^{-at}/((b - a)(c - a)) + (α - b)e^{-bt}/((c - b)(a - b)) + (α - c)e^{-ct}/((a - c)(b - c))   |  (s + α)/((s + a)(s + b)(s + c))
cos ωt + (α/ω) sin ωt                                                |  (s + α)/(s^2 + ω^2)
sin(ωt + φ)                                                          |  (s sin φ + ω cos φ)/(s^2 + ω^2)
[a - √(a^2 + ω^2) cos(ωt + φ)]/ω^2, where φ = arctan(ω/a)            |  (s + a)/(s(s^2 + ω^2))
e^{-at}/(a^2 + ω^2) + sin(ωt - φ)/(ω√(a^2 + ω^2)), where a > 0, φ = arctan(ω/a)   |  1/((s + a)(s^2 + ω^2))
e^{-at} cos ωt                                                       |  (s + a)/((s + a)^2 + ω^2)
(1/ω)e^{-at}(ω cos ωt + (b - a) sin ωt)                              |  (s + b)/((s + a)^2 + ω^2)
Laplace Transforms of Simple Functions (continued)

f(t)                                                                 |  F(s)
1/(a^2 + b^2) - e^{-at} sin(bt + φ)/(b√(a^2 + b^2)), where a > 0, φ = arctan(b/a)   |  1/(s((s + a)^2 + b^2))
α/(a^2 + b^2) + e^{-at}[(a^2 - αa + b^2) sin bt - αb cos bt]/(b(a^2 + b^2))   |  (s + α)/(s((s + a)^2 + b^2))
(1/ω^2)[1 - e^{-ζωt} sin(ωt√(1 - ζ^2) + φ)/√(1 - ζ^2)], where φ = arccos ζ   |  1/(s(s^2 + 2ζωs + ω^2))
[be^{-ct} + ((c - a) sin bt - b cos bt)e^{-at}]/(b[(c - a)^2 + b^2])   |  1/((s + c)((s + a)^2 + b^2))
1/(c(a^2 + b^2)) - e^{-ct}/(c[(c - a)^2 + b^2]) + e^{-at} sin(bt + φ)/(b√(a^2 + b^2)√((c - a)^2 + b^2)), where a > c > 0, φ = arctan(b/a) + arctan(b/(a - c))   |  1/(s(s + c)((s + a)^2 + b^2))
α/(c(a^2 + b^2)) + (c - α)e^{-ct}/(c[(c - a)^2 + b^2]) + √((α - a)^2 + b^2) e^{-at} sin(bt + φ)/(b√(a^2 + b^2)√((c - a)^2 + b^2)), where α > a > c > 0, φ = arctan(b/(α - a)) + arctan(b/a) + arctan(b/(a - c))   |  (s + α)/(s(s + c)((s + a)^2 + b^2))
(1/a^2)(at - 1 + e^{-at})                                            |  1/(s^2(s + a))
(1/a^2)(1 - e^{-at} - ate^{-at})                                     |  1/(s(s + a)^2)
(1/a^2)(α - αe^{-at} + a(a - α)te^{-at})                             |  (s + α)/(s(s + a)^2)
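Entries in these tables can be checked symbolically; for example, a short sympy sketch (my addition) verifying one of the simple-function transforms:

    import sympy as sp

    t, s, a = sp.symbols('t s a', positive=True)

    # check the table entry L{t e^{-at}} = 1/(s + a)^2
    F = sp.laplace_transform(t * sp.exp(-a*t), t, s, noconds=True)
    print(sp.simplify(F))   # (a + s)**(-2)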