Contents

Recommended books (for reference)
  0.1 General textbooks
  0.2 Statistics
  0.3 Fourier analysis
  0.4 Matrices
  0.5 Other

1 Fourier Series in Complex Form
  1.1 Functions
    1.1.1 Periodic functions
    1.1.2 Even functions
    1.1.3 Odd functions
    1.1.4 Sine and cosine of period T0
  1.2 The Σ-notation
  1.3 Specifying periodic functions
  1.4 Complex Fourier series
  1.5 An application to filters

2 Fourier transforms
  2.1 The Fourier Transform
  2.2 Fourier transform pairs
  2.3 Discrete and continuous spectra
  2.4 A special case — f(t) is real and even
  2.5 The Dirac δ function

3 Fourier transform properties
  3.1 Linearity (also known as superposition)
  3.2 Time scaling
  3.3 Time shifting
  3.4 Differentiation
  3.5 Integration
  3.6 Frequency shifting
  3.7 Convolution
    3.7.1 Definition of convolution
    3.7.2 Fourier transform of the convolution of two functions

4 Fourier transforms without integration
  4.1 Definition
  4.2 Recipe

5 Cross correlation and autocorrelation
  5.1 Reminder: complex conjugate
  5.2 Definitions
  5.3 Correlation properties
    5.3.1 The autocorrelation function is always even
    5.3.2 Calculating cross correlation either way round
    5.3.3 The maximum of the autocorrelation function
    5.3.4 Autocorrelation of a periodic function
  5.4 Worked examples
  5.5 Power and energy signals
  5.6 Correlation demonstrations
    5.6.1 Cross correlation
    5.6.2 Autocorrelation
    5.6.3 Practical applications

6 Introductory probability
  6.1 Definition of probability
  6.2 Addition of probabilities — mutually exclusive case
  6.3 Addition of probabilities — general case
  6.4 Multiplication of probabilities

7 Discrete variables: the p.d.f., mean and variance
  7.1 Random variables
  7.2 Definitions: a set of N discrete values
    7.2.1 Mean of x, x̄
    7.2.2 Standard deviation of x, σx
  7.3 The probability density function
    7.3.1 Normalisation
    7.3.2 Other names for p.d.f.
  7.4 What does the p.d.f. mean?
  7.5 The cumulative distribution function
  7.6 Mean & standard deviation: when the p.d.f. is known
    7.6.1 Mean, x̄
    7.6.2 Standard deviation, σx

8 Continuous distributions
  8.1 Continuous random variables
  8.2 Those definitions again
    8.2.1 Mean of x, x̄
    8.2.2 Standard deviation of x, σx
  8.3 Application to signal power
  8.4 The p.d.f. for continuous variables
  8.5 The c.d.f., F(x)
  8.6 Definitions: when the p.d.f. is known
    8.6.1 Mean of x, x̄
    8.6.2 Standard deviation of x, σx
    8.6.3 The mean of any function of x

9 Theoretical distributions
  9.1 The Gaussian distribution
  9.2 The Gaussian probability distribution
  9.3 The Poisson distribution
  9.4 The binomial distribution

10 The method of least squares
  10.1 Gauss
  10.2 A data fitting problem
  10.3 The method of least squares
  10.4 Calculating m and c
  10.5 Fitting to a parabola

11 Complex frequency
  11.1 Complex frequency
    11.1.1 σ < 0
    11.1.2 σ = 0
    11.1.3 σ > 0
  11.2 Linear homogeneous differential equations
    11.2.1 a2 > 1
    11.2.2 a2 = 1
    11.2.3 a2 < 1

12 The Laplace Transform
  12.1 The Laplace transform
  12.2 The Laplace transform of a derivative

13 Differential Equations and the Laplace Transform
  13.1 Inhomogeneous differential equations
  13.2 Solving a d.e. by Laplace transform — overview
  13.3 The Laplace transform of a differential equation
  13.4 Inverse Laplace transform using tables
  13.5 Inverse Laplace transform by partial fractions

14 The Z transform: definition, examples
  14.1 Introduction and Definitions
    14.1.1 Sampling
    14.1.2 The connection with Laplace transforms
    14.1.3 The two ways of writing down z-transforms
  14.2 z-transform examples
    14.2.1 f(n) = an
    14.2.2 f(n) = δ(n), the unit impulse
    14.2.3 f(n) = u(n), the unit step function
    14.2.4 f(n) = an
    14.2.5 f(n) = cos an

15 The Z transform: properties, inversion
  15.1 z-transform properties
    15.1.1 Linearity/superposition
    15.1.2 Time delay
    15.1.3 Time advance
    15.1.4 Multiplication by an exponential sequence
    15.1.5 Differentiation property
    15.1.6 Initial Value Theorem
    15.1.7 Final Value Theorem
  15.2 Inversion of the z-transform
    15.2.1 Finite sequences

16 The wave equation
  16.1 Partial differential equations
  16.2 Derivation of the wave equation
  16.3 The d'Alembert solution of the wave equation
  16.4 Boundary conditions
  16.5 What does it all mean?

17 Matrices I
  17.1 The basics
  17.2 Matrix equality
  17.3 Matrix addition
  17.4 Matrix multiplication
    17.4.1 Scalar × matrix = matrix
    17.4.2 Matrix × vector = vector
    17.4.3 Matrix × matrix = matrix
  17.5 Determinants
  17.6 Solving two linear equations
    17.6.1 Properties of the unit matrix
  17.7 Application — Z and Y parameters

18 Matrices II
  18.1 Matrix inversion: Pi to T conversion
  18.2 Solving n linear equations
  18.3 Inverting an n × n matrix
  18.4 The equation matrix × vector = 0
  18.5 Application of matrix × vector = 0
  18.6 Eigenvalues and eigenvectors
  18.7 Applications of eigenvalues/eigenvectors

19 The Z transform: applications
  19.1 Introduction
  19.2 Difference equations
  19.3 Solving difference equations
  19.4 A FIR filter
Recommended books (for reference)

0.1 General textbooks

• Mary L. Boas, Mathematical Methods in the Physical Sciences (2nd ed), Wiley, ISBN 0-471-09960-0. (The library has a couple of copies.) Contains most of what you need (a little thin on Fourier transforms; useful reference for differential equations, matrix algebra etc.)

• Erwin Kreyszig, Advanced Engineering Maths. (Getting very fat and middle-aged. 1271 pages. I pity students who have to cycle to the university carrying this one.)

• Weltner, Grosjean, Schuster and Weber, Mathematics for Engineers and Scientists. (Newly arrived on the scene. It's thin, light, reasonably priced and seems to cover a lot of good stuff. No Fourier transforms.)

• K.A. Stroud, Further Engineering Mathematics. (Concentrates on 'programmed learning', and if you like this approach, this book may be the one for you.)

0.2 Statistics

There are many possible additional books for statistics. I suggest you borrow G.M. Clarke and D. Cooke, 'A Basic Course in Statistics', from the library if you want some readable background information. Kreyszig is also strong on statistics; Weltner et al. is not bad; Boas less so.

0.3 Fourier analysis

G. Stephenson's 'Mathematical Methods for Science Students' is good for Fourier series, and worth having a look at. (The library has many copies.) It doesn't even mention Fourier transforms; for these, I can recommend wholeheartedly S. Haykin's 'Communication Systems', chapter 2. The library has about a dozen copies.

0.4 Matrices

These are well covered by all the general textbooks, and also by Stephenson.

0.5 Other

C.R. Wylie, Differential Equations, has a very good section on the wave equation.

Note on the problems

Problems whose numbers have an asterisk are a bit harder than those without. Look on them as a challenge.

Web page

Materials that support the course are available on the web, at
http://personal.maths.surrey.ac.uk/st/J.Deane/Teach/em2

Chapter 1
Fourier Series in Complex Form

1.1 Functions

1.1.1 Periodic functions

A function f(t) is said to be periodic with period T0 if T0 is the smallest positive number for which

f(t + T0) = f(t) holds for all t.

Examples include:

(1) f(t) = sin t, which has period 2π, because sin(t + 2π) = sin t;

(2) f(t) = cos 4t, which has period π/2, since cos 4(t + π/2) = cos(4t + 2π) = cos 4t.

Note: although sin(t + 4π) = sin t, the period is not 4π, because there is a smaller positive number that has this property (namely 2π).

1.1.2 Even functions

A function f(t) is said to be even if

f(t) = f(−t) for all t.

Such functions are symmetrical about t = 0. Examples: 1 + cos t, t² − 3t⁴, |t|.

1.1.3 Odd functions

A function f(t) is said to be odd if

f(t) = −f(−t) for all t.

Such functions are anti-symmetric about t = 0. Examples: sin t, t + 3/t³, t(1 + cos 3t).

1.1.4 Sine and cosine of period T0

In the rest of this chapter, you will see many references to sin(2nπt/T0) and cos(2nπt/T0), where n is an integer (a whole number). Using the fact that the periods of sin t and cos t are both 2π, you should be able to see that the period of sin(2πt/T0) is T0, and so the period of sin(2nπt/T0) is T0/n; the same is true for the corresponding cosines.
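The following short Python sketch (an addition to these notes, not part of the original text) checks the last claim numerically: for an arbitrary choice of T0 and n, shifting t by T0/n leaves sin(2nπt/T0) unchanged, up to rounding error.

    import numpy as np

    T0 = 2.0      # fundamental period -- arbitrary choice for illustration
    n = 3         # harmonic number    -- arbitrary choice for illustration
    t = np.linspace(0.0, 2 * T0, 1001)

    f = np.sin(2 * n * np.pi * t / T0)
    f_shift = np.sin(2 * n * np.pi * (t + T0 / n) / T0)   # t shifted by T0/n

    # maximum difference is at rounding-error level, so T0/n is a period
    print(np.max(np.abs(f - f_shift)))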
1.2 The Σ-notation As a reminder, it is useful at this point to give some examples of the Σ-notation, which is a neat shorthand way of representing the sum of a set of numbers. Some examples are: t1 + t2 + . . . + tN = N X ti i=1 a0 + a1 x + a2 x2 + . . . = ∞ X am xm m=0 ∞ X (−1)k sin(2k + 1)t 1 1 1 sin t − sin 3t + sin 5t − sin 7t + . . . = 3 5 7 2k + 1 k=0 Study the examples and make sure you understand why the righthand side represents the left-hand side. 8 1.3 Specifying periodic functions You will often meet a specification of a periodic function like with −π < t ≤ 0 0 < t ≤ π/2 π/2 < t ≤ π 0 f (t) = −1 1 f (t + 2π) = f (t). It is important that you know how to turn this into a picture of the function. Check that you can do this by sketching the function on the axes below, being careful to label the relevant t and f (t) values. f (t) ✻ ✲ t 1.4 Complex Fourier series There is a neat form in which to write the Fourier coefficients, which involves complex numbers. We derive this in two steps. Step 1 — rewrite the Fourier series in terms of complex numbers. Let us start the definition of a Fourier series in trigonometric form: 9 ∞ ∞ X 2nπt X 2nπt 1 + an cos bn sin f (t) = a0 + 2 T0 T0 n=1 n=1 where 2 an = T0 Z T0 2 2nπt 2 f (t) cos dt, bn = T T0 T0 − 20 Z T0 2 T − 20 f (t) sin 2nπt dt. T0 (1.1) Now write sin and cos in terms of complex exponentials, so that f (t) = ∞ ∞ X e2jnπt/T0 − e−2jnπt/T0 a0 X e2jnπt/T0 + e−2jnπt/T0 + an − jbn 2 2 2 n=1 n=1 which can be rewritten as ∞ ∞ X an − jbn 2jnπt/T0 X an + jbn −2jnπt/T0 1 e + e . f (t) = a0 + 2 2 2 n=1 n=1 Define the complex numbers an − jbn α0 = a0 αn = 2 α−n = an + jbn . 2 Step 2 — rewrite the definitions of an and bn Expanding cos 2nπt/T0 and sin 2nπt/T0, we can rewrite the definitions of an and bn (equations 1.1) as Z T0 2 e2jnπt/T0 + e−2jnπt/T0 2 dt f (t) an = T0 − T20 2 2 jbn = T0 Z T0 2 e2jnπt/T0 − e−2jnπt/T0 f (t) dt T 2 − 20 10 Now, adding these two equations gives Z T0 2 2 f (t)e2jnπt/T0 dt an + jbn = T0 − T20 and subtracting them gives 2 an − jbn = T0 Z T0 2 T − 20 f (t)e−2jnπt/T0 dt. Using our definition of αn , we have, finally, f (t) = ∞ X αn e2jnπt/T0 with n=−∞ αn = 1 T0 Z T0 2 T − 20 f (t)e−2jnπt/T0 dt where the definition of αn is true for all n. Example 1.1 Find the Fourier series in the complex form for the square wave −1 −1 ≤ t < −1/2 −1/2 ≤ t < 1/2 s(t) = 1 −1 1/2 ≤ t < 1 with s(t + 2) = s(t). We are being asked to calculate αn for all n. Hence, we need to calculate 1 αn = T0 Z T0 2 f (t)e−2jnπt/T0 dt, − T0 2 which in this case is " Z # Z 1/2 Z 1 −1/2 −jnπt 1 −jnπt −jnπt − e dt + e dt − e dt −1 −1/2 1/2 2 = = −1 −jnπt −1/2 −jnπt 1/2 −jnπt 1 −e +e −e 1/2 −1/2 −1 2jnπ i −1 h jnπ/2 −e + ejnπ + e−jnπ/2 − ejnπ/2 − e−jnπ + e−jnπ/2 2jnπ 11 i −1 h jnπ e − e−jnπ + 2e−jnπ/2 − 2ejnπ/2 2jnπ −1 2 sin nπ/2 = [sin nπ − 2 sin nπ/2] = . nπ nπ This expression is valid for all n except n = 0. We have α0 = the area under s(t) over one period, divided by the period, and this equals (−1/2 + 1 − 1/2)/2 = 0. Hence, α0 = 0. = 1.5 An application to filters From problem 6 you will see that the Fourier series in the complex form for a square wave voltage v(t), period T0, defined by is v(t) = 1 0 −T0/2 < t < 0 0 < t < T0 /2 with v(t + T0) = v(t) ∞ 1 1 j X ej(2n+1)2πt/T0 . v(t) = + 2 π n=−∞ 2n + 1 Example 1.2 What is the output vo (t) if this square wave voltage is applied to the low pass filter in figure 1.1? R v(t) C vo(t) Figure 1.1: An RC filter fed by a square wave. 
From circuit theory, we know that the output of this filter when the input is vinejωt is 1 vinejωt vout = 1 + jωτ 12 with τ = RC. This is true for any frequency ω. Since the filter is a linear system, its output, when the input is a square wave, is the sum of {the individual sine waves in the input × the transfer function, (1 + jωτ )−1}: we deduce this from a property known as superposition. The frequencies in the input are (2n + 1)2π/T0, so ∞ 1 ej(2n+1)2πt/T0 1 j X . vo(t) = + 2 π n=−∞ 1 + j(2n + 1)2πτ /T0 2n + 1 (1.2) Figure 1.2 shows the input and output waveforms for T0 = 1 and τ = 0.2. 1.0 vin 0.8 0.6 0.4 0.2 0.0 -0.2 1.0 vout 0.8 0.6 0.4 0.2 0.0 -0.2 0.0 0.5 1.0 1.5 t Figure 1.2: The input and output waveforms for the RC-filter example. 13 2.0 Problems, chapter 1 1. (a) Sketch the even and odd example functions in sections 1.1.2 and 1.1.3. (b) Let E1(t), E2(t) be two even functions of t and O1 (t), O2(t) be two odd functions of t. Are the following functions even or odd? (i) E1(t) × E2(t) (ii) E1(t) × O1(t) (iii) O1 (t) × O2(t) (iv) E1(t) + E2(t) (v) O1 (t) + O2 (t) (vi) O1 (t) + O1 (−t) [Even: (i), (iii), (iv) and (vi). The rest are odd.] 2. Here are some useful formulae for simplifying Fourier series results. In all cases, n is an integer. Prove them. (i) ejnπ = cos nπ = (−1)n. (ii) jnπ/2 e = ( (−1)n/2 n even j(−1)(n−1)/2 n odd (iii) For any set of numbers a0 , a1 , a2, . . ., ∞ X [1 + (−1)n] an = 2 n=0 (iv) Show that Z π −π a2n n=0 n=0 ∞ X ∞ X [1 − (−1)n] an = 2 0 2π ejnt dt = ∞ X a2n+1 n=0 n 6= 0 n=0 3. (i) Sketch the following function over at least two periods: f (t) = 0 sin t −π < t < 0 0<t<π f (t + 2π) = f (t) (ii) Find its Fourier series in the complex form. [(ii) α±1 = ±1/(4j), αn = [(−1)n+1 − 1]/[2π(n2 − 1)]] 14 4. Find the Fourier series in the complex form for the function f (t) = 1 + t −1<t≤1 with f (t + 2) = f (t). [αn = j(−1)n/(nπ) if n 6= 0; α0 = 1.] 5. Find the complex Fourier series for the following waveform: v(t) = ekt , −T0/2 < t ≤ T0/2 where v(t + T0) = v(t). [αn = (−1)n(ekT0/2 − e−kT0 /2)/(kT0 − 2jnπ)] 6. Find the complex Fourier series for v(t) = 1 0 −T0/2 < t < 0 0 < t < T0 /2 with v(t + T0) = v(t). [Answer on page 12] 7. From the answer to problem 3(ii) above, deduce (i.e. do not re-do the integrals to find the coefficients) the complex Fourier coefficients for the full-wave rectified sine wave g(t) = ( − sin t −π < t ≤ 0 sin t 0<t≤π Simplify your answer as far as possible. Hint: if f (t) is a half-wave rectified sine wave, then first show that g(t) = f (t) + f (−t). [g(t) = 15 2 π − 4 π cos 2t 22 −1 + cos 4t 42 −1 + cos 6t 62 −1 + ... ] Chapter 2 Fourier transforms 2.1 The Fourier Transform In the previous chapter, we discussed periodic functions which satisfied some conditions, known as the Dirichlet conditions,1 which allow them to be expanded in a Fourier series. In this chapter, we deal with non-periodic functions that satisfy the Dirichlet conditions. For such functions, the Fourier transform can be calculated, which enables us to express a function of time f (t) as a function of frequency, F (ω), instead. Recall from the last chapter that Z T0 ∞ X 2 1 2jnπt/T0 αn e and αn = f (t) = f (t)e−2jnπt/T0 dt T0 − T20 −∞ where f (t) is a function of time t with period T0, i.e. f (t+T0 ) = f (t). 
The definition of αn in words is αn is the mean value, over the range − T20 ≤ t ≤ period), of [f (t) × e−2jnπt/T0 ] T0 2 (one If the function is not periodic, then the period T0 → ∞, which leads us to consider the integral RT 1 /2 Specifically, if the periodic function is f (t) and has period T0 , then the conditions are (i) −T0 /2 f (t)dt is finite; 0 (ii) f (t) must have a finite number of turning points and finite discontinuities in a period; and (iii) f (t) itself must be finite for all t. 16 F (ω) = Z ∞ f (t)e−jωt dt (2.1) −∞ and this is the definition of the Fourier transform of f (t). Given F (ω) we can recover the original f (t). By analogy with the Fourier series expression for f (t), when the sum is replaced by an integral, Z ∞ 1 F (ω)ejωt dω. (2.2) f (t) = 2π −∞ The two boxed equations show us how to calculate the Fourier transform/inverse Fourier transform for a given function. All the material in this chapter is based on just these two equations. (See problem 1) We occasionally need to use the following notation for the Fourier transform of a function f (t): F (ω) = F [f (t)] and 2.2 f (t) = F −1 [F (ω)]. Fourier transform pairs We will always stick to the convention that a lower case letter stands for the function of time t and the corresponding upper case letter for the function of angular frequency, ω. The two functions f (t) and F (ω) constitute a Fourier transform pair, which we write as ↽ F (ω). f (t) ⇀ This means that Z ∞ f (t)e−jωt dt F (ω) = and −∞ or, in words, 17 1 f (t) = 2π Z ∞ −∞ F (ω)ejωt dω F (ω) is the Fourier transform of f (t) and f (t) is the inverse Fourier transform of F (ω). 2.3 Discrete and continuous spectra We have used Fourier series to express periodic functions, which have a discrete spectrum, i.e. one in which only certain frequencies are present. These frequencies were generally of the form ωn = n(2π/T0), with n = 0, 1, 2 . . .. Similarly, in order to describe nonperiodic functions, it is necessary to use a continuous spectrum, i.e. one in which all frequencies are present. Equation 2.1 tells us how to calculate this spectrum. Example 2.1 Let us first define the function rect(t), the rectangular pulse, as 1 − 12 < t < 21 rect(t) = 0 otherwise. We can now find the Fourier transform of f (t) = rect(t/2T ), which is t rect 2T (see sketch below). ! = 0 1 0 t < −T −T < t < T t>T ✻ f (t) 1 ✲ −T 0 18 T t Answer By definition, the Fourier transform F (ω) is given by F (ω) = Z ∞ −jωt f (t)e −∞ dt = Z T 1 −jωt T dt = e −T −jω −jωt −T 1×e 2 ejωT − e−jωT 2 e−jωT − ejωT = sin ωT. = = −jω ω 2j ω Without doing the integral, we know straight away that the inverse Fourier transform of (2/ω) sin ωT will give us the original rectangular pulse, that is t rect 2T ! ⇀ ↽ 2 sin ωT ω are a Fourier transform pair. (See problem 2) 2Τ F(ω) Τ 0 −Τ ω −π/Τ 0 π/Τ 2π/Τ Figure 2.1: The Fourier transform of a rectangular pulse of width 2T . 2.4 A special case — f (t) is real and even In most cases that you will come across, f (t) will be a real function of time. It can be shown that this implies that Re F (ω) is an even, and Im F (ω) an odd function of ω 19 that is, the real part of F (ω) is an even function of ω and the imaginary part of F (ω) is an odd function of ω. 
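Before the special case treated next, here is a quick numerical illustration of this statement (my addition, not part of the original notes). The real, one-sided function f(t) = e^(−t) for t ≥ 0 is used purely as an example; its transform is approximated by a Riemann sum at ±ω, and the real parts agree while the imaginary parts differ only in sign.

    import numpy as np

    t = np.linspace(0.0, 40.0, 400001)   # f(t) = exp(-t) for t >= 0, zero for t < 0
    dt = t[1] - t[0]
    f = np.exp(-t)

    def ft(w):
        # numerical Fourier transform: F(w) ~ sum of f(t) exp(-jwt) dt
        return np.sum(f * np.exp(-1j * w * t)) * dt

    for w in (0.5, 1.0, 2.0):
        print(ft(w), ft(-w))   # real parts equal; imaginary parts equal and opposite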
If f (t) is also an even function of t, that is f (t) = f (−t) then the Fourier transform of f (t) is Z Z ∞ e−jωtf (t) dt + F (ω) = 0 0 e−jωt f (t) dt −∞ Substituting −t for t in the right-hand half gives Z ∞ Z ∞ ejωtf (−t) dt e−jωtf (t) dt + F (ω) = 0 0 Using the fact that f (t) is even, this becomes Z ∞ 2 cos ωtf (t) dt F (ω) = 0 which is real. Hence The Fourier transform of an even, real function of time is a real, even function of ω. (See problem 3) 2.5 The Dirac δ function The function δ(t) is an infinitely narrow spike, with unit area, located at t = 0. Since the area under it is 1, its height must be infinite since its width is zero. The fact that the area under it is one tells us that Z ∞ δ(t) dt = 1. −∞ It is helpful to visualise δ(t) as the limit of a rectangular pulse as its width tends to zero, with a height such that its area = 1. 20 What is the Fourier transform of δ(t)? In the light of the above, it is given by t 1 rect F F [δ(t)] = Tlim →0 2T 2T The factor 1/2T multiplying rect(t/2T ) makes the area equal to unity. We already know the Fourier transform of rect(t/2T ): it is 2 sin ωT . Hence (by l’Hospital’s rule) ω F [δ(t)] = Tlim →0 2 sin ωT = 1 2ωT Hence ↽1 δ(t) ⇀ (See problem 6) Note that δ(t − t0) is an infinitely narrow, infinitely high spike with unit area, occurring at t = t0. From this we can deduce that for any function of time f (t) Z ∞ f (t)δ(t − t0) dt = f (t0). (2.3) −∞ In other words, the delta function can be used to sample a function of time, f (t), at a particular time t0. Incidentally, the sampling property allows us to derive the Fourier transform of δ(t) in one line: by putting f (t) = ejωt and t0 = 0 in equation (2.3). Since e0 = 1, the Fourier transform of δ(t) must also be 1. 21 Problems, chapter 2 1. Write the following in terms of the Fourier transforms of the given functions: (i) Z ∞ 10e−jωth(t) dt −∞ (ii) Z ∞ −jΩx βe −∞ f (x) dx − (β is a constant.) (iii) (iv) Z −∞ −jky e a(y) dy + ∞ Z ∞ −∞ Z ∞ ∞ Z βe−jΩx g(x) dx −∞ Z ∞ e−jkz a(z) dz −∞ C(α)ejαv dα −∞ e−jωv dv [(i) 10H(ω), (ii) β[F (Ω) − G(Ω)] (iii) 0 (iv) 2πC(ω) ] 2. Sketch the following functions and find their Fourier transforms: (i) t rect 4T ! (ii) (a is a constant) f (t) = at 0 0<t<T otherwise (iii) (iv) f (t) = cos πt 0 −1 < t < 1 otherwise t rect 1 + T ! [(i) (2/ω) sin 2ωT (ii) a[e−jωT (1 + jωT ) − 1]/ω 2 (iii) 2ω sin ω/(π 2 − ω 2) (iv) jejωT /2[1 − ejωT ]/ω ] 22 3. Prove that the Fourier transform of an odd, real function f (t) is imaginary. 4. Prove that, for any real f (t), Re F (ω) is an even function of ω, and Im(F (ω) is an odd function of ω. 5. Using the sampling property of the Dirac delta function, equation 2.3, find (i) R∞ −∞ δ(t − π/2) sin t dt R∞ −∞ δ(t jωt (ii) The constant t0 such that (iii) R∞ −∞ [δ(t + a) + δ(t − a)] e dt. − t0 )ekt dt = e2 [(i) 1, (ii) 2/k, (iii) 2 cos ωa] 6.∗ The Fourier transform of the delta function can be expressed as the Fourier transform of the limit of any function of t whose width tends to zero and whose height tends to infinity at t = 0, in such a way that the area is unity. Using the result of question 2(ii), show that this is true for the triangular pulse defined there. (Hint: You will need to define the constant a such that the area under the triangular pulse is 1 regardless of the value of T . To take the limit as T tends to zero, you will need to use the Taylor series for e−jωT up to and including the term in ω 2 T 2.) [Well done if you get this right.] 
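To round off this chapter, here is a short numerical cross-check of example 2.1 (my addition, not part of the original notes): the Fourier transform of rect(t/2T) is approximated by a Riemann sum on a fine grid and compared with the closed form (2/ω) sin ωT. The value T = 1 is an arbitrary choice.

    import numpy as np

    T = 1.0                                  # half-width of the pulse -- arbitrary
    t = np.linspace(-5 * T, 5 * T, 200001)
    dt = t[1] - t[0]
    f = np.where(np.abs(t) < T, 1.0, 0.0)    # f(t) = rect(t/2T)

    w = np.linspace(0.1, 20.0, 50)           # avoid w = 0, where 2 sin(wT)/w -> 2T
    # numerical transform: F(w) ~ sum of f(t) exp(-jwt) dt
    F_num = np.array([np.sum(f * np.exp(-1j * wi * t)) * dt for wi in w])
    F_exact = 2 * np.sin(w * T) / w

    print(np.max(np.abs(F_num - F_exact)))   # small; limited only by the grid spacing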
23 Chapter 3 Fourier transform properties Introduction Many of the useful applications of the Fourier transform come about because it has the properties which are discussed in this chapter. In reading this chapter, you must remember the meaning of the symbol ⇀ ↽, which was defined in the previous chapter in section 2.2. 3.1 Linearity (also known as superposition) Let ↽ F1(ω) f1(t) ⇀ and ↽ F2(ω) f2(t) ⇀ be two Fourier transform pairs. Then, for constants c1 and c2 , Z ∞ [c1 f1(t) + c2f2 (t)] e−jωt dt = c1 so Z ∞ −∞ −∞ f1(t)e −jωt dt + c2 Z ∞ f2(t)e−jωt dt = c1F1(ω) + c2F2(ω), −∞ ↽ c1 F1(ω) + c2F2(ω). c1 f1(t) + c2 f2(t) ⇀ This property allows us to find the Fourier transform of two functions added together, if we know the Fourier transform of each of the functions individually. 24 3.2 Time scaling ↽ F (ω). Then Let f (t) ⇀ ω! 1 ⇀ F f (at) ↽ |a| a where a is a constant. We prove this, assuming a > 0, by writing the Fourier transform of f (at) as Z ∞ f (at)e−jωt dt, F [f (at)] = −∞ and substituting u = at, with a > 0. Then t = u/a, and as t → +∞, u → +∞, since a > 0, so the limits on the integral stay the same. Hence, Z ∞ Z ∞ ω 1 f (u)e−ju a du f (u)e−jωu/a du/a = F.T. = a −∞ −∞ 1 ω! = F . a a (For the case a < 0 see problem 1.) Example 3.1 The two properties of superposition and time scaling can be used to calculate the Fourier transform of f (t) shown in the figure below: ✻f (t) 2 1 ✲ −2T −T T 25 2T t The key to this problem is to realise that f (t) is the sum of two rectangular pulses, rect(t/2T ), of width 2T , and rect(t/4T ), of width 4T . In other words, f (t) = rect(t/2T ) + rect(t/4T ). Now, we know from example 2.1 that the transform of rect(t/2T ) is (2/ω) sin ωT . But, rect(t/2T ) and rect(t/4T ) are related by rect(t/4T ) = rect(at/2T ) with a= 1 2 so, using the time scaling property, we can immediately say that F (ω) = (2/ω) sin ωT +(1/2)−1(2/2ω) sin 2ωT = (2/ω) sin ωT +(2/ω) sin 2ωT. 3.3 Time shifting ↽ F (ω) then If f (t) ⇀ ↽ e−jωt0 F (ω). f (t − t0) ⇀ The Fourier transform of f (t − t0) is Z ∞ f (t − t0)e−jωt dt. −∞ Substituting u = t − t0, we have t = u + t0 and dt = du, so Z ∞ Z ∞ f (u)ejωu du f (u)e−jω(u+t0) du = e−jωt0 −∞ −∞ from which the time shifting property follows. Example 3.2 Find the Fourier transform of f (t) defined in the figure below. 26 ✻f (t) 1 ✲ −3T −T T 3T t Answer The function f (t) is the sum of two time-shifted rectangular pulses. We know that the Fourier transform of rect(t/2T ) is (2/ω) sin ωT . The leftand right-hand pulses are given by t + 2T rect 2T ! and t − 2T rect 2T ! respectively — be sure you understand why. Hence, the Fourier transform of the left-hand pulse is e2jωT × (2/ω) sin ωT. Similarly, the Fourier transform of the right-hand pulse is e−2jωT ×(2/ω) sin ωT and so, using superposition, the Fourier transform for the pair of pulses is (2/ω)(e2jωT + e−2jωT ) sin ωT = (4/ω) cos 2ωT sin ωT. 3.4 Differentiation ↽ F (ω) then If f (t) ⇀ df (t) ⇀ ↽ jωF (ω). dt We prove this by writing down the inverse Fourier transform of F (ω), which is, by definition, 27 1 f (t) = 2π Z ∞ F (ω)ejωtdω. −∞ Differentiating with respect to t Z ∞ df (t) 1 [jωF (ω)] ejωtdω = dt 2π −∞ where the right hand side is the inverse Fourier transform of jωF (ω). Finding the Fourier transform of both sides now proves the result. 3.5 Integration ↽ F (ω) then If f (t) ⇀ Z t ↽ f (u)du ⇀ −∞ 1 F (ω) jω provided that F (0) = 0, which implies that R∞ −∞ f (t) dt = 0. The Fourier transform of the integral of f (t) is Z t Z ∞ f (u)du dt. 
e−jωt −∞ −∞ Integrating by parts gives ∞ Z t Z ∞ 1 −jωt 1 1 f (u)du − e f (t)e−jωt dt = F (ω) −jω −jω jω −∞ −∞ −∞ providedRthat the first term on the right hand side R ∞ is zero. This will ∞ be so if −∞ f (t) dt = 0 — why? In the case −∞ f (t) dt 6= 0, see Haykin, Chapter 2. In the next chapter we use this formula a great deal, along with the differentiation and time shifting formulae. 28 3.6 Frequency shifting ↽ F (ω) then If f (t) ⇀ ↽ F (ω − ω0). ejω0tf (t) ⇀ This is proved in a similar way to the time shifting property. Example 3.3 Given that f (t) ⇀ ↽ F (ω), what is the Fourier transform of f (t) cos ω0 t? Answer Using the fact that cos ω0t = ejω0 t + e−jω0 t 2 and using the frequency shifting property above, we have 1 f (t) cos ω0 t ⇀ ↽ [F (ω − ω0 ) + F (ω + ω0 )]. 2 This example relates to amplitude modulation. 3.7 3.7.1 Convolution Definition of convolution Given two functions of time, f1(t) and f2 (t), their convolution, written f1 ⋆ f2(τ ), is defined as Z ∞ f1 (t)f2(τ − t) dt. f1 ⋆ f2(τ ) = −∞ Notice that this is a function of τ only. The importance of convolution becomes clear when we find the Fourier transform of f1 ⋆ f2(τ ). 3.7.2 Fourier transform of the convolution of two functions Let us find the Fourier transform of the convolution of two functions of t. This is given by 29 Z ∞ e−jωτ f1 ⋆ f2(τ ) dτ = −∞ Z ∞ e−jωτ −∞ Z ∞ −∞ f1(t)f2(τ − t) dt dτ. Call this expression FTC (Fourier Transform of the Convolution). Swap the order of integration (w.r.t. τ first, then w.r.t. t): Z ∞Z ∞ f1(t)e−jωτ f2(τ − t) dτ dt. FTC = −∞ −∞ Since f1(t) depends only on t, we can write this as Z ∞ Z ∞ f1(t) e−jωτ f2(τ − t) dτ dt. FTC = −∞ −∞ Now, applying the time shift property to the τ integral, we have Z ∞ Z ∞ f1 (t)e−jωt dt, f1(t)e−jωtF2(ω) dt = F2(ω) FTC = −∞ −∞ and so FTC = F1(ω)F2(ω). Hence, F1(ω)F2(ω) = F [f1 ⋆ f2 ] or, in words, The Fourier transform of the convolution of two functions f1(t) and f2 (t) is the product of the Fourier transforms of the individual functions. This amazing result is known as the Convolution Theorem. 30 Problems, chapter 3 1. Show that the time scaling property is also true for a < 0 2. Prove the result of example 3.1 by transforming the function directly. 3. Prove the frequency shifting property. 4. If f (t) ⇀ ↽ F (ω), show that F (0) = 0 implies that 5. Find the Fourier transform of f (t) = (T + t)/T (T − t)/T 0 R∞ −∞ f (t) dt = 0. −T < t < 0 0<t<T otherwise. (a) by direct calculation, and (b) by finding the Fourier transform of the derivative of f (t) and then using the integration property to find the Fourier transform of f (t). [Both give 2(1 − cos ωT )/ω 2T ] 6. Find the Fourier transform of f (t) = 1 0 −T0/2 < t < T0/2 otherwise. Hence, using the frequency shift property (and doing no integration), show that the Fourier transform of is g(t) = G(ω) = sin ω0 t 0 −T0/2 < t < T0 /2 otherwise sin[(ω − ω0 )T0/2] sin[(ω + ω0)T0/2] − . j(ω − ω0) j(ω + ω0) [F (ω) = (2/ω) sin(ωT0/2)] 7. (i) Use the time scaling property to show that if f (t) ⇀ ↽ F (ω), then f (−t) ⇀ ↽ F (−ω). (ii) Hence, using the fact the the Fourier transform of f (t) = at, 0 ≤ t ≤ T , is a[e−jωT (1 + jωT ) − 1]/ω 2, find the Fourier transform of ( −at −T ≤ t < 0 g(t) = at 0≤t<T [G(ω) = 2a(cos ωT + ωT sin ωT − 1)/ω 2] 31 8. Find f ⋆ g(τ ) if (i) and g(t) = sin ω0t. 1 0 f (t) = −T < t < T otherwise (ii) and g(t) = cos ω0t. f (t) = 1−t 0 0<t<1 otherwise [(i) [cos ω0 (τ − T ) − cos ω0 (τ + T )]/ω0, (ii) [ω0 sin ω0τ − cos ω0 (τ − 1) + cos ω0τ ]/ω02] 9.∗ A demonstration of the Convolution Theorem. 
(a) Show graphically that the convolution of two rectangular pulses of unit height, stretching between t = −T and t = +T , is given by Convolution = τ + 2T −τ + 2T 0 −2T < τ < 0 0 < τ < 2T otherwise. (b) Find the Fourier transform of the convolution in (a). (c) Hence demonstrate that the convolution theorem is true in this case, i.e., that The Fourier transform of the convolution of the two rectangular pulses = the product of the Fourier transforms of the two rectangular pulses. [(b) 2(1 − cos 2ωT )/ω 2 (c) (2 sin ωT /ω)2, which is the same, since 1 − cos 2x = 2 sin2 x. Congratulations if you got there.] 32 Chapter 4 Fourier transforms without integration Aims This chapter concentrates on a technique for calculating Fourier transforms of piecewise polynomial functions which are zero as t → ±∞, without using integration. There are distinct advantages to doing it this way, once you have mastered the technique. I give an outline of the technique here, and some problems for you to practise on. A detailed explanation and some further worked examples will be given in lectures. 4.1 Definition A piecewise polynomial function f (t) is one which can be expressed as a set of polynomials in t, each applying over a different range of t. Examples include f (t) = f (t) = T +t T 0 1 0 −τ < t < τ otherwise −T < t < 0 otherwise 33 (rectangular pulse) (half a triangular pulse) at2 + bt + c −c f (t) = t − t2 0 −t1 < t ≤ 0 0 < t ≤ t2 t2 < t < 5t2 otherwise (a nasty mess). The important thing about functions of this type is that by differentiating with respect to t sufficiently many times, nothing remains except a set of δ-functions and their derivatives, at various times. Loosely speaking, such functions can be ‘differentiated away’ into nothing but a set of (derivatives of) δ-functions. Why do this? The answer is that it is very easy to find the Fourier transform of a set of δ-functions, and from this the Fourier transform of the original function can be deduced by using the integration property derived in the previous chapter. Believe me, this method can often be a lot less trouble than the alternative — for example, integrating by parts. 4.2 Recipe Taking as an example the half triangular pulse defined above, the following steps allow us to find its Fourier transform without integrating anything. In order to follow the argument it will help you greatly if you sketch f (t) and its derivatives. 1. Differentiate f (t) w.r.t. t, which gives 1 df (t) T = −δ(t) + 0 dt −T < t < 0 otherwise. The δ-function arises because f (t) goes instantaneously from 1 to 0 at t = 0. 34 We haven’t differentiated enough yet — the result is not zero everywhere — so. . . 2. . . . differentiate again to get d2f (t) dδ(t) 1 1 δ(t + T ) − − δ(t). = dt2 T dt T 3. Now we only have δ-functions and their derivatives, so we are 2 ready to find the Fourier transform of d dtf 2(t) . Using the differentiation property, the Fourier transform of dδ(t) dt = jω. (See problem 1.) Using the time shifting property, the Fourier transform of δ(t + T ) is ejωT . Hence, 2 1 ejωT − jωT − 1 1 jωT d f (t) = e − jω − = . F 2 dt T T T (Check that the dimensions are consistent.) 4. Finally, to find the Fourier transform of f (t), use the integration property twice, i.e. divide by (jω)2 , which gives 1 + jωT − ejωT . 
F (ω) = F [f (t)] = ω2T 35 Problems, chapter 4 1.∗ Show that the Fourier transform of the derivative of the δ-function is jω (i) by using the differentiation property (easy), and (ii) by finding the Fourier transform of the derivative of f (t), where f (t) = 1 T 0 − T2 < t < otherwise T 2 and then finding the limit of this as T → 0 (harder). Note that f (t) as defined here has unit area, and therefore, as T → 0, tends to δ(t). 2. Using the method outlined in this chapter, find the Fourier transform of f (t) = h 0 −T < t < T otherwise [F (ω) = (2h/ω) sin ωT ] 3. Find the Fourier transform of the function T +t T −T < t < 0 0<t<T otherwise f (t) = 1 0 [F (ω) = (1 + jωT e−jωT − ejωT )/ω 2T ] 4. Find the Fourier transform of f (t) = 1− 0 2 t T −T < t < T otherwise [F (ω) = 4(sin ωT − ωT cos ωT )/ω 3T 2] 5.∗ Find the Fourier transform of the function f (t) = t 3 T 0 0≤t≤T otherwise [F (ω) = [6 − e−jωT (6 + 6jωT − 3ω 2T 2 − jω 3 T 3)]/(ω 4T 3 )] 36 6.∗ Have fun finding the Fourier transform of f (t) as drawn below f (t) ✻ 2a a 0 ✁❆ ✁ ❆ ✁ ❆ ✁ ❆ ✁ ❆ ✁ ❆ ✁ ❆ ✁ ❆ ✁ ❆ ✁ ❆ ✁ ❆ ✁ ❆ 2T ✁ ✁ ✁ ✁ ✁ ✁ ✁ ✁ ✁ ✁ 4T 5T ✁ ✁ ❆ ❆ ❆ ❆ ✁ ❆❆ ✁ ✁ ✁ ✁✁ ❆ ❆ ❆ ❆ 7T 8T 9T 10T 11T ❆ ❆ ❆ ❆ ❆ ❆ ❆ ❆ 13T [F (ω) = (−a/ω 2T ) 1 − 2X 2 + X 4 + X 5 − X 7 n −X 8 + 2X 9 − X 10 − X 11 + X 13 where X = e−jωT ] 37 o ✲ t Chapter 5 Cross correlation and autocorrelation Cross correlation and autocorrelation are related to convolution, and in this chapter, we define what they are, explain some of their properties, and give some practical examples of their applications. 5.1 Reminder: complex conjugate Before we define and discuss correlation, it will be useful to remember the definition of the complex conjugate of a complex number, z. If z = x + jy, then the complex conjugate of z, written z ∗ , is defined as z ∗ = x − jy. If z happens to be written in polar form, that is, z = rejθ , then z ∗ = re−jθ . You might notice that the same rule works in both (in fact, in all) cases: to find the complex conjugate, change the sign of j. You also need to remember that zz ∗ = |z|2 = (x + jy)(x − jy) = x2 + y 2 is the squared modulus of the complex number z, and is ≥ 0 and always real. 5.2 Definitions Given two real functions of time, f (t) and g(t), say, their cross correlation function, written Corr(f, g)(τ ), is defined as Z ∞ f (t)g(t + τ ) dt. (5.1) Corr(f, g)(τ ) = −∞ 38 This is very similar to the convolution of f and g, but it is not quite the same — the argument of the second function is t + τ and not τ − t. The autocorrelation function is simply the cross correlation of a real function with itself, in other words Z ∞ f (t)f (t + τ ) dt. Autocorrelation of f = Corr(f, f )(τ ) = −∞ You might expect, because of the similarity of correlation to convolution, that there should be a correlation theorem which is like the Convolution Theorem (see page 30), and indeed this is the case. Here it is. Let us try to find FC, the Fourier transform of the cross correlation of two functions, f (t) and g(t). By definition, this is Z ∞ Z ∞ f (t)g(t + τ ) dt dτ. e−jωτ FC = −∞ −∞ Changing the order of integration, we have Z ∞Z ∞ e−jωτ f (t)g(t + τ ) dτ dt. FC = −∞ −∞ Substituting u = t + τ , so dτ = du, we find Z ∞Z ∞ e−jω(u−t)f (t)g(u) du dt FC = = Z ∞ −∞ −∞ e −jωu −∞ g(u) du × Z ∞ −∞ ejωtf (t) dt. The first part is clearly the Fourier transform of g(t), G(ω). The second is not quite F (ω) because f (t) is multiplied by e+jωt and not e−jωt. So what is it? 
Provided f (t) is real, and bearing in mind the “change the sign of j rule”, you should be able to see that it is the complex conjugate of F (ω), i.e. F ∗(ω). Hence 39 ↽ F ∗(ω)G(ω) Corr(f, g)(τ ) ⇀ are a Fourier transform pair: this is the Correlation Theorem. Setting f (t) = g(t), we have ↽ F ∗(ω)F (ω) = |F (ω)|2, Corr(f, f )(τ ) ⇀ where the function |F (ω)|2 is always real, greater than or equal to zero, and is known as the power spectral density. It is a measure of how the power in a signal is distributed over different frequencies. 5.3 Correlation properties We give four properties here, and prove three of them. 5.3.1 The autocorrelation function is always even It is easy to show that the autocorrelation function is always an even function of τ . By definition, Z ∞ f (t)f (t + τ ) dt Corr(f, f )(τ ) = −∞ and so Corr(f, f )(−τ ) = Z ∞ −∞ f (t)f (t − τ ) dt. Substituting u = t − τ in the above gives Z ∞ f (u + τ )f (u)dt = Corr(f, f )(τ ). Corr(f, f )(−τ ) = −∞ Hence, we have Corr(f, f )(−τ ) = Corr(f, f )(τ ) and so Corr(f, f )(τ ) is an even function of τ . 40 5.3.2 Calculating cross correlation either way round We show that Corr(f, g)(τ ) = Corr(g, f )(−τ ), which is a useful property to bear in mind since sometimes it is easier to calculate the correlation one way round than the other. Starting from the definition, we have Z ∞ f (t)g(t + τ )dt Corr(f, g)(τ ) = −∞ and substituting u = t + τ , we have Z ∞ f (u − τ )g(u)du. Corr(f, g)(τ ) = −∞ By the definition of correlation, this is just Corr(g, f )(−τ ): the −τ term comes from the fact that the argument of f is u − τ and not u + τ. It is now easy to see (again) that the autocorrelation function is an even function of τ — substituting g = f in the above, we have Corr(f, f )(τ ) = Corr(f, f )(−τ ). 5.3.3 The maximum of the autocorrelation function We do not prove this property, but state it thus: for any τ , Corr(f, f )(0) ≥ Corr(f, f )(τ ), i.e. the autocorrelation function has a maximum at τ = 0. There may be other maxima for τ > 0, but they will never be greater than the one at τ = 0. 5.3.4 Autocorrelation of a periodic function Let f (t) be a periodic function with period T0, so f (t + T0 ) = f (t). Then the autocorrelation function of f (t) is also periodic, with the same period. 41 This is easily proved as follows. From the definition Z ∞ f (t)f (t + τ ) dt Corr(f, f )(τ ) = −∞ we have Corr(f, f )(τ + T0) = Z ∞ f (t)f (t + τ + T0) dt −∞ which, since f (t) has period T0, is equal to Z ∞ f (t)f (t + τ ) dt = Corr(f, f )(τ ). −∞ 5.4 Worked examples Example 5.1 Define r(t) = 1 0 0≤t≤T otherwise and let f (t) = sin ωt. Then Corr(r, f )(τ ) = Z ∞ r(t)f (t + τ ) dt = −∞ Z T 0 1 × sin ω(t + τ ) dt T 1 cos ωτ − cos ω(T + τ ) = − cos ω(t + τ ) = . 0 ω ω Example 5.2 Now for a slightly harder example. Find Corr(f, f )(τ ) where f (t) = e−kt 0 t≥0 otherwise Plots of f (t), f (t+ τ ), τ < 0 and f (t+ τ ), τ > 0 are shown in the figure below. Note carefully that, when τ < 0, f (t) is shifted to the right, and when τ > 0, it is shifted to the left. Case 1: τ < 0. We first work out Corr(f, f )(τ ) when τ < 0. Notice from figure 5.1 (top) that f (t) = 0 for t < 0, and so f (t + τ ) = 0 for t < −τ . Why −τ ? Because τ is negative: but from the figure, you can see that f (t + τ ) = 0 for t less 42 1 f(t) 0 1 f(t+τ), τ < 0 -τ 0 1 f(t+τ), τ > 0 -τ 0 Figure 5.1: The function f (t) used in example 5.2. than some positive number. It would be easy to get this wrong, and sketching a figure helps to avoid falling into a trap. 
Hence, for τ < 0, we have Corr(f, f )(τ ) = Z ∞ f (t)f (t+τ ) dt = −∞ = Z ∞ −τ ∞ e−kτ −2kt − e 2k −τ e−kt e−k(t+τ ) dt = Z ∞ e−2kte−kτ dt −τ ekτ = . 2k Case 2: τ > 0. We now work out Corr(f, f )(τ ) when τ > 0. It is still true that f (t) = 0 for t < 0, but now we have f (t + τ ) = 0 for t < −τ , the logic being the same as before — see the bottom panel of figure 5.1. Hence, for τ > 0, we have Corr(f, f )(τ ) = Z ∞ −∞ f (t)f (t+τ ) dt = Z 0 43 ∞ −kt −k(t+τ ) e e dt = Z ∞ 0 e−2kte−kτ dt ∞ e−kτ e−kτ −2kt = e . =− 2k 2k 0 Note that the lower limit on the integral was 0 in this case. Summarising, ekτ τ ≤0 2k Corr(f, f )(τ ) = −kτ e τ > 0. 2k Corr(f,f)(t) 0.5 -2 -1 0.0 0 τ 1 2 Figure 5.2: The autocorrelation function Corr(f, f )(τ ), k = 1, from example 5.2. Note that Corr(f, f )(τ ) is an even function of τ . Notice, too, from figure 5.2, that Corr(f, f )(τ ) is an even function of τ , and its maximum is at τ = 0, as discussed in section 5.3. Example 5.3 Show that the Autocorrelation Theorem is true for the function f (t) defined in example 5.2. We’ve done the hard work, which was to compute the autocorrelation function. The Autocorrelation Theorem tells us that the Fourier transform of Corr(f, f )(τ ) is the same as the modulus of the Fourier transform of f (t), squared. To check this, first find the Fourier transform of Corr(f, f )(τ ), which is Z ∞ 1 e−jωτ Corr(f, f )(τ )dτ = 2k −∞ Z 0 1 e−jωτ ekτ dτ + 2k −∞ 44 Z ∞ 0 e−jωτ e−kτ dτ = 1 2k Z 0 e(k−jω)τ dτ + −∞ = 1 2k Z ∞ 0 0 ∞ −(k+jω)τ e − e e−(k+jω)τ dτ = 2k(k − jω) 2k(k + jω) −∞ 0 (k−jω)τ 2k 1 1 1 1 + = = . 2k(k − jω) 2k(k + jω) 2k k 2 + ω 2 k2 + ω2 Let us now find the Fourier transform of f (t), which is F (ω) = Z ∞ −∞ e−jωt f (t) dt = Z ∞ e−jωt e−kt 0 Now, ∞ e−kt dt = − k + jω = 0 1 . k + jω 1 1 1 × = 2 k + jω k − jω k + ω2 which is indeed equal to the Fourier transform of Corr(f, f )(τ ). |F (ω)|2 = F (ω)F ∗(ω) = 5.5 Power and energy signals At this point, it will be useful to mention the difference between power signals and energy signals. An energy signal, v(t) say, contains aR finite amount of energy, E. ∞ Mathematically, this means that E = −∞ |v(t)|2dt is finite. Hence, all signals that last a finite time, for example, rectangular or triangular pulses, are energy signals; but so, too, are signals like f (t) in example 5.2, which has infinite duration, but decreases rapidly enough as t → ∞ that its integral is finite. By contrast, a power signal, v(t), ‘goes on for ever’ and thus contains an infinite amount of energy, although the power is finite. Examples include sin t, cos t or any periodic function that has a Fourier series. It makes sense here to define power by Z T 1 |v(t)|2dt. P = lim T →∞ 2T −T The definition for cross correlation given earlier in equation (5.1) is correct for energy signals but not for power signals. For power 45 signals f (t) and g(t), say, we instead calculate the correlation by Z T 1 f (t)g(t + τ ) dt. (5.2) Corr(f, g)(τ ) = lim T →∞ 2T −T 5.6 Correlation demonstrations We now look at two demonstrations of correlation, their purpose being to give a feel for how correlation is useful in practical situations. 5.6.1 Cross correlation We compute the cross correlation of two signals g(t) and h(t), each of which consists of a sum of four sine waves. Three of the frequencies are different in each case, but g(t) and h(t) also contain one common frequency. Cross correlation picks out the period of this common frequency. 
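The sketch below (my addition, not part of the original notes) reproduces this demonstration in Python, before the figure is described. The two signals have the same general form as those used for figure 5.3 (four sinusoids each, sharing only the period-120 component; the amplitudes below are taken from the description that follows, though the exact pairing of amplitudes with periods is my reading of it). The cross correlation is the finite-record estimate of the power-signal definition, equation (5.2), and the printed values follow a cosine of period 120.

    import numpy as np

    t = np.arange(0.0, 2_000_000.0)       # long record, so the time average settles
    g = (0.1 * np.sin(2 * np.pi * t / 120) + 0.5 * np.sin(2 * np.pi * t / 17)
         + 0.7 * np.sin(2 * np.pi * t / 59) + 1.0 * np.sin(2 * np.pi * t / 173))
    h = (0.05 * np.sin(2 * np.pi * t / 120) + 0.4 * np.sin(2 * np.pi * t / 31)
         + 0.8 * np.sin(2 * np.pi * t / 131) + 1.2 * np.sin(2 * np.pi * t / 203))

    def corr(f1, f2, tau):
        # finite-record estimate of the power-signal cross correlation, eq. (5.2)
        return np.mean(f1[:-tau] * f2[tau:]) if tau else np.mean(f1 * f2)

    # only the common period-120 components survive the averaging, so, apart from
    # small finite-record leakage, Corr(g,h)(tau) ~ 0.0025 cos(2 pi tau / 120)
    for tau in (0, 30, 60, 90, 120, 150, 180, 210, 240):
        print(tau, round(corr(g, h, tau), 4))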
g(t) 3 2 1 0 -1 -2 -3 0 3 2 1 0 -1 -2 -3 0 0.02 1000 1500 2000 500 1000 1500 2000 Corr (g, h)(τ) h(t) 500 0.01 0 -0.01 -0.02 84 204 324 444 Figure 5.3: Cross correlation being used to pick out an underlying common periodic signal in the presence of other periodic signals. Specifically, in figure 5.3, 46 2πt 2πt 2πt + 0.5 sin 2πt g(t) = 0.1 sin 120 17 + 0.7 sin 59 + 1.0 sin 173 and 2πt 2πt 2πt + 0.4 sin 2πt h(t) = 0.05 sin 120 31 + 0.8 sin 131 + 1.2 sin 203 although these exact details are not important — just that fact that the signals contain one common frequency and the other three are unrelated. The common frequency, 2π/120, corresponds to a period of 120, and this component also happens to have a rather smaller amplitude than the other terms. It would be difficult to pick out what this period actually is by eye — see figure 5.3, top (g(t)) and middle (h(t)) panels. The bottom of figure 5.3 shows the cross correlation Corr(g, h)(τ ), calculated using equation (5.2), since these are power signals. As can easily be read from the figure, the common period of the two signals, 120, is also the period of Corr(g, h)(τ ). 5.6.2 Autocorrelation Autocorrelation can be used to pick out the period (and hence the frequency) of a periodic signal buried in noise, and figure 5.4 illustrates this. The signal f (t) = 1.4 r(t) + 0.05 sin 2πt/T0, where r(t) consists of normally distributed random numbers with standard deviation approximately 0.3 – see the Theoretical Distributions chapter for what this means. Note that the periodic signal has much smaller amplitude than the noise. The signal is plotted in the upper half of figure 5.4. You are unlikely to be able to pick out by eye the sine wave in the presence of this much noise (whose amplitude is about 28 times bigger than the periodic signal). However, the autocorrelation function, shown in the lower half, reveals that there is an underlying periodicity, and furthermore, that the period is about 120 units. 47 2.0 f(t) 1.0 0.0 -1.0 -2.0 0 500 1000 1500 2000 240 360 480 0.01 Corr(f, f)(τ) 0.005 0 0 120 Figure 5.4: The autocorrelation function being used to pick out an underlying periodic signal in the presence of noise. 5.6.3 Practical applications More details will be given in a lecture, but some examples of practical applications of autocorrelation and cross correlation are: • Loudspeaker evaluation. White noise is fed into a loudspeaker and a microphone is placed to pick up the sound from the speaker. The autocorrelation function of the output of the microphone shows the resonant frequencies of the speaker and its housing. • Leak location. Two sound sensors are attached to a buried water pipe, one on the upstream side of a leak and one on the downstream side. Assume that the velocity of sound along the pipe is known. The cross correlation of the two signals is calculated, the peak of which then gives the value of x − y, where x and y are the distances between the sensors and the leak. The distance x + y can be measured directly — it is the distance between the 48 sensors. From this, x and y can be found and hence the leak can be located. • Also used in cross correlation flow meter, GPS, Multipath interference measurements. Problems, chapter 5 1. Given two functions of time, f (t) and g(t), write down the definition of their convolution, f ⋆ g(τ ) and their correlation, Corr(f, g)(τ ). Directly from these definitions, show that, if g(t) is an even function of time, then f ⋆ g(τ ) = Corr(f, g)(−τ ). 2. 
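The next sketch (again my addition, not part of the original notes) reproduces this second demonstration in Python. The signal parameters follow those quoted above (noise 1.4 r(t) with r(t) of standard deviation about 0.3, plus 0.05 sin 2πt/T0 with T0 = 120); the record length and random seed are arbitrary choices. Away from τ = 0 the noise averages out of the autocorrelation estimate and the small periodic component shows through.

    import numpy as np

    rng = np.random.default_rng(0)
    T0 = 120
    N = 1_000_000                        # long record -- arbitrary choice
    t = np.arange(N)
    f = 1.4 * rng.normal(0.0, 0.3, N) + 0.05 * np.sin(2 * np.pi * t / T0)

    def acorr(x, tau):
        # finite-record estimate of the power-signal autocorrelation
        return np.mean(x[:-tau] * x[tau:]) if tau else np.mean(x * x)

    # at tau = 0 the (large) noise variance dominates; at the other lags the noise
    # has averaged away and the values roughly follow 0.00125 cos(2 pi tau / 120):
    # positive near 120 and 240, negative near 60 and 180
    for tau in (0, 60, 120, 180, 240):
        print(tau, round(acorr(f, tau), 5))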
Define the following functions: 1 0 0≤t≤T otherwise t 0 0≤t≤T otherwise r(t) = q(t) = c(t) = cos ωt, s(t) = sin ωt where T and ω are positive constants. Calculate (i) Corr(r, c)(τ ) (ii) Corr(q, s)(τ ) (iii) Corr(q, q)(τ ) (Hint: follow example 5.2.) [(i) [sin ω(T +τ )−sin ωτ ]/ω (ii) [sin ω(T +τ )−sin ωτ −ωT cos ω(T +τ )]/ω 2 (iii) (2T 3 + 3τ T 2 − τ 3 )/6 for −T ≤ τ < 0; (2T 3 − 3τ T 2 + τ 3 )/6 for 0 ≤ τ < T ; 0 otherwise] 3.∗ (i) For r(t) as defined in the previous question, show that Corr(r, r)(τ ) = T +τ T −τ 0 −T ≤ τ ≤ 0 0<τ ≤T otherwise (ii) Check that the Autocorrelation Theorem is true in this case by finding the Fourier transform of Corr(r, r)(τ ), and also of r(t), squaring the latter, and comparing. 49 [R(ω) = (1 − e−jωT )/(jω); |R(ω)|2 = 2(1 − cos ωT )/ω 2, which is also the f.t. of Corr(r, r)(τ )] 4. For power signals, for instance, sin t and cos t, the autocorrelation is defined as 1 Corr(f, f )(τ ) = lim T →∞ 2T Z T f (t)f (t + τ ) dt. −T (If we did not divide by 2T , the answer would usually be infinite for power signals.) Use this definition to calculate Corr(f, f )(τ ) when f (t) = cos t. Hint: cos x cos y = 12 cos(x + y) + 21 cos(x − y) 50 [(1/2) cos τ ] Chapter 6 Introductory probability 6.1 Definition of probability In everyday English, we have many different ways of saying how likely an event is, e.g. will certainly will probably It may/might rain today. is unlikely to will not These are five ways of saying roughly how likely it is to rain. They are not quantitative though — no numbers are put on the likelihood of rain. The part of mathematics that deals with how likely something is, is called probability. Example 6.1 In tossing a fair coin, there are two possible outcomes: heads or tails. The probability of a head is P (head) = No. of outcomes that result in a head 1 = . total number of possible outcomes 2 Example 6.2 Walkers Crisps claim to have put a cheque for £10,000 in ‘selected packets’. Suppose there are 10 cheques in 8,000,000 packets. The probability of buying a winning packet is P (win) = 1 No. of outcomes that result in a win = total number of possible outcomes 800, 000 51 . . . so you can be fairly certain you won’t be lucky. Both these probabilities are obtained ‘in the limit’ as the coin is tossed more and more times, or more and more crisps are bought. For instance, if you were to toss the coin 1,000,000 times, you would be fairly unlikely to get exactly 500,000 heads (about one time in 1,253 — we shall see how to calculate this from the Binomial distribution) — but you would expect to obtain around 500,000 almost all the time. To estimate experimentally the probability of a head, you would need to find No. of heads P (head) = lim . N→∞ Number of times coin has been tossed, N In calculating probabilities, you need to be careful to evaluate all possibilities, as illustrated below. Example 6.3 A coin is tossed three times. What is the probability that (a) heads are obtained twice and tails, once? (b) the result is heads, heads, tails, in that order? (c) If at least two are heads, what is the probability that all are heads? Answer. First write down all possible outcomes, which are hhh, hht, hth, htt, thh, tht, tth, ttt (8 in all). (a) The outcomes that consist of two heads and one tail are hht, hth, thh so P (2 heads, 1 tail) = 3 8 (b) There is only one possibility in this case: hht. Hence, P (hht in that order) = 1 8 (c) Outcomes in which there are at least two heads are hhh, hht, hth, thh (4 in all). 
Of these, only 1 is all heads, so P (hhh given at least two heads) = 1 4 We can summarise the above examples in the following definition: 52 If all outcomes of an experiment are (a) equally likely and (b) mutually exclusive, the probability of an event E is P (E) = number of outcomes favourable to E . total number of outcomes (6.1) In these examples, we have calculated the probability of an event E, which we shall write P (E). By convention • P (E) = 1 means E is certain to happen • P (E) = 0 means it is certain not to happen. Hence, all probabilities must lie between 0 and 1. Furthermore, let us write the probability of an event E not happening as P (not E). Then P (E) + P (not E) = 1. This equation is saying that the probability of an event happening, plus the probability of it not happening, is one. It is certain that it either happens or it doesn’t. Example 6.4 A fair die is thrown. The probability of throwing a 4, P (4), = 1/6. The probability of not throwing a 4, P (not 4) = P (1) + P (2) + P (3) + P (5) + P (6) = 5/6. So P (4) + P (not 4) = 1/6 + 5/6 = 1. 6.2 Addition of probabilities — mutually exclusive case We can take this further. To do so, we need to know that, if A and B are mutually exclusive events — ones that cannot both happen — then the probability of A or B happening, written P (A or B), = P (A) + P (B). Examples of mutually exclusive events are: tossing a coin — the outcome can only be heads or tails; rolling a die — the 53 outcome can be precisely one of the integers 1 . . . 6, so an outcome of, say, 4 precludes any other outcome. For general n, rather than just 2 events, the addition formula becomes P (E1 or E2 . . . or En) = P (E1) + P (E2) + . . . + P (En) (6.2) or, in words, The probability of event E1 or E2 or . . . En happening, where E1 . . . En are mutually exclusive events, is the sum of the individual probabilities P (E1), P (E2). . . P (En). Example 6.5 Two dice are rolled. What is the probability that they both show the same number? Answer. Let E1 be the event that both dice show 1; E2, that they both show 2, etc. Then P (E1) = 1/36 (one outcome favourable to E1 out of 62 = 36 possible outcomes. Similarly, P (E2) = . . . = P (E6) = 1/36. Now E1 . . . E6 are mutually exclusive events — both dice showing 3, say, excludes any other outcome — so the probability of obtaining the same number is 1/36 + 1/36 + 1/36 + 1/36 + 1/36 + 1/36 = 6/36 = 1/6. 6.3 Addition of probabilities — general case If E1 and E2 are now any two (i.e. not necessarily mutually exclusive) events then P (E1 or E2) = P (E1) + P (E2) − P (E1 and E2). (6.3) The proof of this is easily seen by using a Venn diagram — figure 6.1. The diagram is drawn inside a rectangle whose area is 1. Then P (E1) is the area of the circle labelled E1 and P (E2) is the area of the circle labelled E2. You should be able to see that the union of P (E1) and 54 E1 E1 and E2 E2 Figure 6.1: The proof of formula (6.3). P (E2) includes the area common to E1 and E2 — the intersection in Figure 6.1 — twice. Therefore the total area enclosed by E1 and E2 is given by P (E1) + P (E2) − P (E1 and E2). Example 6.6 A card is drawn at random from a standard pack of cards. What is the probability that the card drawn will be a diamond or an ace? Answer: P (diamond) = 13/52 = 1/4, P (ace) = 4/52 = 1/13 and P (diamond and ace) = P (ace of diamonds) = 1/52. Hence P (diamond or ace) = 1/4 + 1/13 − 1/52 = 4/13. 6.4 Multiplication of probabilities If two events A and B are independent, then one event has no effect on the other. 
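Before we look at independent events, it is worth noting that the addition rule (6.3), as applied in Example 6.6, can be checked by brute-force enumeration of the 52 cards. The short C sketch below does this; the card encoding (suit = card/13, rank = card%13, with suit 0 taken to be diamonds and rank 0 an ace) is my own choice for illustration, not anything specified in these notes.

/* Brute-force check of P(diamond or ace) = P(diamond) + P(ace) - P(ace of diamonds).
 * Card encoding (an assumption for this sketch): card = 0..51, suit = card / 13,
 * rank = card % 13, suit 0 = diamonds, rank 0 = ace.
 */
#include <stdio.h>

int main(void)
{
    int diamond = 0, ace = 0, both = 0, either = 0;

    for (int card = 0; card < 52; card++) {
        int is_diamond = (card / 13 == 0);
        int is_ace     = (card % 13 == 0);
        diamond += is_diamond;
        ace     += is_ace;
        both    += is_diamond && is_ace;
        either  += is_diamond || is_ace;
    }

    /* Both printed values should equal 16/52 = 4/13. */
    printf("P(diamond or ace)           = %d/52 = %f\n", either, either / 52.0);
    printf("P(diamond)+P(ace)-P(both)   = %f\n", (diamond + ace - both) / 52.0);
    return 0;
}

Both printed values come out as 16/52 = 4/13, in agreement with Example 6.6.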
We now calculate the probability that two independent events both happen. Example 6.7 Suppose we toss a coin and roll a die. What is the probability of the coin showing heads and the die showing 3? Answer We assume that for the coin, P (heads) = 1/2 and for the die, P (3) = 1/6. We also know that there are 6 × 2 = 12 possible outcomes, only one of which is the desired one of heads and 3. Hence, we would expect P (heads and 3) = 1/12. But this result is also given by P (heads and 3) = P (heads) × P (3) = 1/12. 55 This is a general rule for ‘independent AND events’: For two independent events A and B, with probabilities P (A) and P (B) respectively, the probability P (A and B) = P (A)P (B). (6.4) Problems, chapter 6 1. One card is taken randomly from a shuffled pack of 52. What is the probability that it is (a) an ace? (b) a spade? (c) a red queen? [(a) 1/13 (b) 1/4 (c) 1/26] 2. A cow has two calves. If male and female calves are equally likely, what is the probability that (a) both calves are female? (b) there is at least one male? (c) Given that there is at least one male, what is the probability that both are male? [(a) 1/4 (b) 3/4 (c) 1/3] 3. Two dice are rolled. What is the probability of obtaining (a) two sixes? (b) two even numbers? (c) both numbers greater than or equal to 5? [(a) 1/36 (b) 1/4 (c) 1/9] 4. In a large batch of resistors, it is found that 1 in 50 is out of tolerance; for capacitors, it is found that 1 in 21 is out of tolerance. If I build an RC filter using one resistor and one capacitor, what are the probabilities that (a) the resistor, (b) the capacitor and (c) both components I select will be in tolerance? [(a) 49/50 (b) 20/21 (c) 14/15] 5. By drawing the analogous diagram to figure 6.1, show that for THREE (not necessarily mutually exclusive) events, P (E1 or E2 or E3 ) = P (E1) + P (E2) + P (E3 ) − P (E1 and E2) −P (E1 and E3 ) − P (E2 and E3) + P (E1 and E2 and E3). 6. In the Italian Superenalotto, in order to win the jackpot, you have to match 6 different numbers drawn from the range 1–90 inclusive. What are the odds of winning the jackpot? [Rather low at 1/622,614,630] 56 7. A die is rolled three times. What is the probability that the sum of the numbers obtained is (i) 3? (ii) 4? (iii) 5? [(i) 1/216, (ii) 3/216 = 1/72, (iii) 6/216 = 1/36] 8. From a standard pack of cards, two are drawn at random without replacement. What is the probability that both cards are face cards (jack, queen or king of any of the four suits)? [11/221] 9. A biscuit tin contains 100×2p coins, 50×5p coins and 30×10p coins. Two coins are drawn from the tin at random and not replaced. What is the probability that their combined value is greater than 10p? [329/1074] 10. Two cards are taken successively, without replacement, from a standard pack of 52. What is the probability that (a) both cards are greater than 3 and less than 9? (b) The first card is an ace and the second is a face card? (c) The cards drawn are an ace and a face card (in either order)? [(a) 95/663, (b) 4/221, (c) 8/221] 57 Chapter 7 Discrete variables: the p.d.f., mean and variance Aims By the end of this chapter, you should understand the terms • random variable • normalisation • probability density function (p.d.f.) f (x) • cumulative distribution function (c.d.f.) F (x) • mean and standard deviation and be able to calculate the last three. 7.1 Random variables We have already come across the idea of a random variable in the previous chapter. 
For instance, the number shown when a die is rolled is a random variable, because we have no way of predicting it in advance. The actual number a fair die will show (let us call it x) is unknown in advance, but this does not mean to say we cannot say something about it: for instance, 58 • x lies between 1 and 6 (1 ≤ x ≤ 6) • x is equally likely to be 1, 2, 3, 4, 5 or 6 • if we roll the die 1,000 times and calculate the average of the numbers obtained, it is likely to be around 3.5 These items are what we could call statistical properties of the random variable x, the number shown by a die. We could do experiments to verify that x actually has these statistical properties. (What sort of experiments might we do?) 7.2 Definitions: a set of N discrete values We define below the terms mean, standard deviation and variance as applied to a set of N discrete values, x1, x2, . . . xN . 7.2.1 Mean of x, x To calculate the mean of a set of N discrete values, add them up and divide by N : N 1 X xi . x= N i=1 7.2.2 (7.1) Standard deviation of x, σx To calculate the standard deviation of a set of N discrete values of x, first find the mean, x. Then the standard deviation is given by v u u u t N 1 X (xi − x)2. σx = N − 1 i=1 59 (7.2) The reason we divide by N − 1 and not N is subtle and has to do with the fact that the formula, as given, generally gives a better approximation to the standard deviation of the whole population, even though you’re only looking at a sample of size N : this will be further explained in a lecture. If all the numbers xi were very close to the mean, then the standard deviation would be small, so σx can be seen as a measure of how widely scattered around the mean the values are. It is in fact the root mean square (r.m.s.) deviation from the mean. People sometimes also talk about the variance; this is defined as σx2 . Example 7.1 Two dice were rolled 12 times and the sums of the two numbers were: 12, 8, 5, 10, 8, 8, 9, 9, 5, 3, 4, 10. What are the mean and standard deviation of these results? Answers The mean is the sum of the numbers divided by 12, which is 91/12 (= 7.58). The standard deviation is the square root of o 1 n (12 − 91/12)2 + (8 − 91/12)2 + . . . + (10 − 91/12)2 11 which is 2.75. 7.3 The probability density function The random variable obtained by rolling a fair die is rather special, in that it is equally likely to be any of the integers (whole numbers) 1 – 6. What about random variables that do not have this ‘equally likely’ property? For example, suppose the random variable x is obtained by rolling a die twice and adding up the two numbers shown. What can we say about this random variable? We can easily see that the random variable obtained by adding the numbers in this experiment does not have the ‘equally likely’ property. For instance, there is only one way that x = 2: when the first throw gives 1 and the second throw also gives one. There are, however three ways that the result x = 10 can be obtained: 4,6 5,5 and 60 6,4. We would therefore expect to observe x = 10 more often than x = 2 in a large number of trials. It is an important general principle in probability that the more ways there are of obtaining a result, the more likely that result is. The information about how likely different results are is best displayed as a bar chart1 . Along the x-axis we plot the independent variable, x, the sum of the two numbers in this example, and along the y-axis we plot f (x), the probability of result x. So, for instance, there are 3 ways to obtain x = 10. 
There are 6 × 6 = 36 different possible outcomes from rolling a die twice, so f (10) = 3/36. The function f (x) as we have defined it, is known as the probability density function, abbreviated to p.d.f. Figure 7.1 explains how to plot a bar chart of the p.d.f. for the two dice experiment. The steps are 1. List all possible outcomes. 2. Calculate x for each one. 3. Calculate the relative frequency of each outcome — that is, count how many times each different outcome is obtained and divide by the number of possible outcomes (36 in this case). 4. Plot this number against x. 7.3.1 Normalisation You will notice in figure 7.1 that the relative frequency is plotted on the vertical axis, i.e. the frequency divided by the number of possible outcomes. Suppose we add up all the values of f (x), that is, we calculate f (2) + f (3) + . . . + f (12): what do we obtain? 1 A bar chart is a graph in which the variable plotted along the x-axis is discrete and so the y variable is plotted as a series of vertical bars of the appropriate height. 61 List possible outcomes: 1,6 2,6 3,6 4,6 5,6 6,6 1,5 2,5 3,5 4,5 5,5 6,5 f(x) 1,4 2,4 3,4 4,4 5,4 6,4 6/36 1,3 2,3 3,3 4,3 5,3 6,3 5/36 1,2 2,2 3,2 4,2 5,2 6,2 4/36 1,1 2,1 3,1 4,1 5,1 6,1 3/36 Add 2/36 7 8 9 10 11 12 6 7 8 9 10 11 5 6 7 8 9 10 4 5 6 7 8 9 3 4 5 6 7 8 2 3 4 5 6 7 1/36 Plot frequency 0/36 0 1 2 3 4 5 6 7 8 9 10 11 12 sum, x Figure 7.1: How to calculate the probability density function for the sum of numbers shown by rolling a die twice. 1+2+3+4+5+6+5+4+3+2+1 =1 36 Is it a coincidence that these numbers add up to one? No! — to see that it isn’t, recall the addition of probabilities formula for the mutually exclusive case (6.2) in the previous chapter. That formula applies here, since in the ‘roll a die twice’ experiment, the events ‘sum of the numbers = x’ and ‘sum of the numbers = y’ are mutually exclusive if x 6= y. Now, the outcome of the experiment must be precisely one of the numbers 2 . . . 12, and so the sum of the individual probabilities of these numbers must be one. A p.d.f. will always be normalised if we plot it in the way described above, so the general rule is that all the probabilities added together equal one, i.e. X f (xi ) = 1 i 62 where X i means ‘sum over all relevant values of i’. Example 7.2 On the axes below, plot the p.d.f. for x = the number of heads obtained when 4 coins are tossed. Check that the probabilities add up to one. f (x) ✻ ✲ x 7.3.2 Other names for p.d.f. There are various different names for the p.d.f., including frequency function, probability density and probability function, so be aware of this when reading textbooks. 7.4 What does the p.d.f. mean? We have looked at the p.d.f. for two examples, both of which are discrete. That is, the variable we have called x only takes on integer values. (You can never roll two dice and add the numbers up to get 3.4, neither can you obtain 1.5 heads in a coin tossing experiment.) We will look at the p.d.f. in continuous cases in the next chapter. Look back at figure 7.1. We have calculated the p.d.f. for the sum of the numbers shown by two dice. Two questions we might ask are: 63 1. What does the bar chart mean? 2. How would we plot the bar chart experimentally for the two dice? The answer to the first question is that f (x) is the relative frequency of the value x — i.e. the number of times x occurs, divided by the total number of observations. 
So for instance, looking at figure 7.1, we see that a sum of 5 is twice as likely as a sum of 3, since f (5) = 4/36 and f (3) = 2/36. The p.d.f. cannot tell you what the next outcome will be (because it is random) but it can tell you the probability of a particular outcome. The p.d.f. also enables us to calculate numbers like the mean and standard deviation, as we shall see in section 7.6. Strictly speaking, the answer to the second question is to take two dice and roll them N times, adding up the numbers shown and recording them. We would then plot a bar chart of the number of times we had obtained the result 2, 3, . . . 12, normalised by dividing by N . The question remains though, how large does N have to be to obtain an accurate result? I am not going to do this experiment in the lecture — unless N only needs to be some small number like 20 say, you would get bored and so would I — but I can do a computer simulation that amounts to the same thing. Using the C random number generator, drand48(), I can simulate a die being thrown and so produce the data for N = 200, 000 in about 90 milliseconds (on my computer). The table below shows the results. The p.d.f. has been normalised, by dividing by N in each case, and then the result has been multiplied by 36 so that the numbers agree with those in figure 7.1. 64 x = sum of numbers 2 3 4 5 6 7 8 9 10 11 12 p.d.f. for N = 20 200 200,000 0.0/36 0.54/36 1.0237/36 1.8/36 2.16/36 1.9973/36 3.6/36 3.42/36 2.9916/36 5.4/36 2.52/36 4.0104/36 5.4/36 4.14/36 5.0197/36 9.0/36 5.58/36 5.9967/36 3.6/36 7.02/36 4.9948/36 1.8/36 3.60/36 3.9470/36 3.6/36 4.14/36 2.9921/36 3.6/36 2.16/36 2.0284/36 0.0/36 0.90/36 0.9985/36 P.D.F. , f(x), (36ths) 20 throws 200,000 throws 200 throws 10 10 10 8 8 8 6 6 6 4 4 4 2 2 2 0 0 0 2 4 6 8 10 12 2 4 6 8 10 12 x = Sum of numbers 2 4 6 8 10 12 Figure 7.2: Finding the p.d.f. for the two dice problem by computer simulation. 65 7.5 The cumulative distribution function As discussed in the previous section, the p.d.f., f (xi ) gives us the probability that x = xi exactly. In many instances we want to know something different, but related: what is the probability that x is less than or equal to a given value? For instance, tubes of Smarties might nominally contain 40, but in fact can contain anything between 37 and 44. We might want to know the probability that a tube contains fewer than 39. As another example, we know the probability of rolling a die and obtaining a given number — it is 1/6 (if the die is fair) — but what about the probability that the result is less than, say, 4? Both these questions can be answered if we know the cumulative distribution function, c.d.f., F (x), and F (x) can be easily worked out if we know the p.d.f. As an example, let us calculate F (x) = the probability that the number shown by a fair die is less than or equal to x. We know the p.d.f. for this problem: it is f (x) = 1/6 for 1 ≤ x ≤ 6 and f (x) = 0 otherwise. Hence, F (1), the probability that x ≤ 1 is 1/6; F (2), the probability that x ≤ 2 is 1/6 + 1/6 = 1/3; F (3) = 1/6 + 1/6 + 1/6 = 1/2 and so on. From this, you should be able to see that given f (x), we can calculate F (x) by F (xi) = probability that outcome x ≤ xi = 7.6 X f (xj ). xj ≤xi Mean & standard deviation: when the p.d.f. is known Suppose now, instead of giving you a list of numbers, I give you a plot or a table of the p.d.f., f (x). It is possible to calculate directly from 66 this what the mean and standard deviation for a very large number of observations would be. 
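As a concrete illustration, the short C sketch below evaluates these two quantities for the two-dice p.d.f. of figure 7.1. It is only a sketch of mine (the notes do not contain such a program), and it uses the two sums that are derived in the next two subsections, equations (7.3) and (7.4).

/* Mean and standard deviation computed directly from a tabulated p.d.f.,
 * here the two-dice p.d.f. of figure 7.1.  The formulas used are the ones
 * derived in the next two subsections; this sketch just evaluates the sums.
 */
#include <stdio.h>
#include <math.h>

int main(void)
{
    /* f(x) for x = 2..12, in units of 1/36 (read off figure 7.1) */
    const double f36[] = {1, 2, 3, 4, 5, 6, 5, 4, 3, 2, 1};

    double mean = 0.0, var = 0.0;

    for (int i = 0; i < 11; i++)            /* mean = sum of x f(x) */
        mean += (i + 2) * f36[i] / 36.0;

    for (int i = 0; i < 11; i++)            /* variance = sum of (x - mean)^2 f(x) */
        var += (i + 2 - mean) * (i + 2 - mean) * f36[i] / 36.0;

    printf("mean = %f, standard deviation = %f\n", mean, sqrt(var));
    return 0;
}

It prints a mean of 7 and a standard deviation of about 2.42, which you can compare with problem 2 at the end of this chapter.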
7.6.1 Mean, x Example 7.3 Let x be the number of heads obtained when three coins are tossed. What is the mean value of x? Answer First work out the p.d.f., f (x). You should be able to show that f (0) = 1/8, f (1) = f (2) = 3/8 and f (3) = 1/8. (For other values of x, f (x) = 0.) These figures could also be calculated using the binomial distribution — see the Theoretical Distributions chapter. How many times, on average, will we obtain 2 heads? We know that f (2) = 3/8, so if we toss the three coins 8, 000 times, say, we would expect about 3/8 × 8, 000 = 3, 000 of these to result in 2 heads. Similarly, we would expect to get 3 or 0 heads about 1,000 times each, and 1 head about 3,000 times. The average number of heads per toss will therefore be 0 × 1, 000 + 1 × 3, 000 + 2 × 3, 000 + 3 × 1, 000 3 = heads per toss 8, 000 2 From the above example, we deduce that the general formula for the mean when the p.d.f. is known is X x= xif (xi ) (7.3) i 7.6.2 Standard deviation, σx The calculation of the standard deviation is done in the same way: σx = sX i (xi − x)2f (xi) 67 (7.4) Problems, chapter 7 1. Ten resistors, nominally 1kΩ, are measured and their values are found to be 996, 1001, 1023, 997, 1004, 1010, 1008, 996, 990, 1007 Ω. Calculate the mean and standard deviation of these values. [R = 1003.2Ω, σR = 9.4Ω] 2. In the two dice experiment, calculate the mean and standard deviation of x, where x is the sum of the two numbers. (Hint: figure 7.1 shows the p.d.f.) [x = 7, σx = 2.42] 3. Sketch the p.d.f. for x = the number of heads − the number of tails when four coins are tossed. [f (4) = f (−4) = 1/16, f (2) = f (−2) = 4/16, f (0) = 6/16] 4. My research on Smarties indicates that the p.d.f., f (x) = the probability that a tube contains x Smarties is as follows: x 36 37 38 39 40 41 42 43 f (x) 1/12 1/12 2/12 2/12 3/12 1/12 1/12 1/12 For other values of x, f (x) = 0. Plot a bar chart of the c.d.f.for this problem. What is the probability that a tube contains (a) 39 or fewer (b) 41 or more Smarties? [(a) 1/2, (b) 1/4] 5. A money box contains 80 × 10p and 120 × 20p coins. Two coins are taken out at random without replacement. Calculate the p.d.f. f (x), where x is the total monetary value of the coins taken out. Sketch this p.d.f. in the form of a bar chart. If this experiment is repeated many times, what is the mean value of the money withdrawn per experiment? [f (20) = 0.1588, f (30) = 0.4824, f (40) = 0.3588; x̄ = 32p] 68 Chapter 8 Continuous distributions Aims By the end of this chapter, you should know about • the p.d.f., f (x), and cumulative distribution function, c.d.f., F (x), for continuous variables • how to calculate the mean and standard deviation for continuous variables • applications to noise. 8.1 Continuous random variables In the previous chapters we have looked at random variables x which take on a discrete set of values, such as the number obtained by rolling dice and so on. In this chapter we turn our attention to continuous variables, i.e. ones which can take a continuous set of values — all values in a range. Examples include • the values of resistors whose nominal value is, say, 1MΩ — the actual value might lie anywhere between about 0.9 and 1.1MΩ (assuming 10% tolerance). • the voltage produced by a noise source, sampled at discrete time intervals. 69 8.2 Those definitions again Before we discuss the p.d.f., we will define the mean and standard deviation for continuous variables. 8.2.1 Mean of x, x Suppose x(t) is a variable (e.g. voltage or current) that depends on time. 
Then, by analogy with equation 7.1, the mean of x(t) in the range 0 ≤ t ≤ T is 1 x= T Z T x(t)dt. (8.1) 0 In practice, T will often be set by the response time of the measuring instrument. To see why this definition is reasonable, remember that the integral of a function between limits 0 and T is the area between a graph of the function and the horizontal axis, with areas below the axis being negative, and with t ranging from 0 to T . Suppose we were to squash the graph of the function into a rectangular shape, but with the same area and width (T ) as before. Then the height of this rectangle is just x — which is an intuitively reasonable way to define the mean. 8.2.2 Standard deviation of x, σx Similarly, by analogy with equation 7.2, we have 1 σx2 = T Z 0 T (x(t) − x)2dt. Note that if x = 0 then 70 (8.2) 1 σx2 = T 8.3 Z T 0 x(t)2dt = mean value of x(t)2 . Application to signal power You are probably familiar with the fact that the power delivered by a voltage of the form V1 sin ωt to a load R is V12/2R. This is because in general, mean square voltage, v 2 . signal power, P = load resistance, R In the case of a sine wave, therefore, equation 8.1 can be used to find the mean square voltage v 2. In this case, it is sensible to integrate over one complete cycle, although the integral over any number of complete cycles would give the same answer (why?). This, when divided by R, gives the power: Z T Z T 2 V 1 1 V12 sin2 2πt/T dt = 1 (1 − cos 4πt/T )dt v2 = T 0 2 T 0 V12 V12 V12 1 [T ] = , so signal power = . = 2 T 2 2R Now, notice that if the mean of a signal v = 0 then the variance, σv2 is the same as the mean square (see equation 8.2). Hence, the total signal power delivered to a load R by a signal with zero mean and variance σv2 is given by σv2/R. This is sometimes useful for noise power calculations. 71 8.4 The p.d.f. for continuous variables Look back at figure 7.1. The bar chart of f (x) gives the probability of obtaining the sum x when two dice are rolled, so that, for instance, the probability that x = 1 is f (1) = 0, and the probability that x = 4 is f (4) = 3/36. In fact, the probability of obtaining x is equal to the height of the strip, provided that the bar chart has been normalised. The p.d.f. in the continuous case is the limit of the bar chart as the width of the strips tends to zero and the number of measurements tends to infinity. Unless we are prepared to do an infinite number of measurements, therefore, we can only ever find (discrete) approximations to the p.d.f. by means of an experiment, and we discuss the experimental techniques involved in the next chapter. Example 8.1 What is the p.d.f. for the random numbers generated by the C random number generator drand48()? Answer The C random number generator generates pseudo-random numbers, x, in the range 0–1. We would expect them to be uniformly distributed over that range, so that there would be, in the long run, roughly the same number lying in the range 0 – 0.1 as in the range 0.47 – 0.57 say. In other words, we might expect the p.d.f., f (x), to be something like Note that f (x) = 1 0 0≤x<1 otherwise. • this is a continuous distribution (x can have any value between 0 and 1); • it is normalised, that is Z ∞ f (x)dx = 1. −∞ You can easily write a program to produce data for a bar chart, with a given number of strips, by generating a given number of random numbers. Some sample results are shown in figure 8.1. 
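Such a program is easy to write. The C sketch below is a minimal version of my own, not the original program used for figure 8.1; the number of strips and of random numbers are chosen to match the middle panel of that figure, and the fixed seed is an arbitrary choice.

/* Approximate the p.d.f. of drand48() by binning samples into strips,
 * as described in Example 8.1.  Strip count, sample count and seed are
 * illustrative choices, not taken from the notes.
 */
#include <stdio.h>
#include <stdlib.h>

#define STRIPS  50
#define SAMPLES 100000

int main(void)
{
    long count[STRIPS] = {0};
    double width = 1.0 / STRIPS;      /* each strip covers 1/STRIPS of [0,1) */

    srand48(1);
    for (long i = 0; i < SAMPLES; i++) {
        int bin = (int)(drand48() * STRIPS);   /* which strip does this sample fall in? */
        count[bin]++;
    }

    /* Normalise so that the area under the bar chart is 1:
     * divide each count by (number of samples * strip width). */
    for (int b = 0; b < STRIPS; b++)
        printf("%.3f  %.3f\n", (b + 0.5) * width, count[b] / (SAMPLES * width));

    return 0;
}

Each printed value should be close to 1, the height of the ideal p.d.f., with the scatter shrinking as the number of samples grows.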
72 1.5 Infinite limit 1.0 0.5 0.0 −0.1 1.5 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 1.1 50 strips, 100,000 numbers f(x) 1.0 0.5 0.0 −0.1 1.5 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 1.1 1.0 1.1 10 strips, 2,000 numbers 1.0 0.5 0.0 −0.1 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 x Figure 8.1: Visualising the p.d.f. for the C random number generator, using different numbers of strips and random numbers. 8.5 The c.d.f., F (x) We defined the c.d.f., F (x), in the previous chapter. The definition there was given as a sum for a discrete distribution, so you should not be surprised that it is an integral for continuous distributions: F (x0) = probability that x ≤ x0 = Z x0 f (x)dx. (8.3) −∞ Look at figure 8.2. This illustrates how we would calculate the probability that a measurement of the random variable x lies between x0 and x1: Z x1 f (x)dx P (x0 ≤ x ≤ x1) = x0 provided that f (x) has been normalised. So, from the definition of 73 F (x), we have — see figure 8.2. P (x0 ≤ x ≤ x1) = F (x1) − F (x0) f(x) x0 x1 x Figure 8.2: The probability that x lies between x0 and x1 is the area under the p.d.f. between x0 and x1 . Example 8.2 Find the probability that none of the three light bulbs in a spotlight array will have to be replaced during the first 1,200 hours of use if the lifetime of a light bulb can be modelled as a random variable with p.d.f. given by f (x) = 6(−x2 + 3x − 2) for 1 ≤ x ≤ 2 0 otherwise, where x is measured in units of 1,000 hours. Answer We are dealing with independent ‘and’ events here (one light bulb does not affect another), so we need to find 1 − F (1.2), the probability that a bulb is still working after 1.2 thousand hours. (It’s 1 − F (1.2) because F (1.2) is the probability that a bulb has stopped working by 1,200 hours.) The probability that all three are still working is then (1 − F (1.2))3. We know the p.d.f., so bearing in mind equation 8.3, F (1.2) = Z 1.2 −∞ f (x)dx = Z 1.2 1 6(−x2 + 3x − 2)dx = 13/125 which gives a probability of (1 − F (1.2))3 = 0.72, or 72% that all three are still working after 1,200 hours. 74 8.6 Definitions: when the p.d.f. is known If you look back at section 7.4, you will remember that we said that the p.d.f. in the discrete case tells us a lot about the outcome of an experiment. This is equally true in the continuous case, so that the mean and standard deviation can easily be calculated, just as in the discrete case. 8.6.1 Mean of x, x x= 8.6.2 ∞ xf (x)dx (8.4) (x − x)2f (x)dx (8.5) −∞ Standard deviation of x, σx σx2 8.6.3 Z = Z ∞ −∞ The mean of any function of x From the previous two subsections, it should come as no surprise to you that you can calculate the mean of any function, G(x) say, from G(x) = Z ∞ G(x)f (x)dx. (8.6) −∞ For instance, if you wanted to know the mean value of x3, this can be computed from Z ∞ x3f (x)dx. x3 = −∞ 75 Example 8.3 What are the mean and standard deviation of the numbers produced by the C random number generator? Answer Assume that the p.d.f. is f (x) = Then x= and ∞ ∞ −∞ (x − x)2f (x)dx = so the standard deviation, σx , is q −∞ 0≤x≤1 otherwise xf (x)dx = Z σx2 = Z Z 1 0 Z 1 0 1 0 1 1 x x × 1dx = = 2 2 2 0 1 1 1 x x x (x − )2 dx = − + = 2 3 2 4 12 0 3 2 1/12. Problems, chapter 8 1. A voltage v(t) is given by v(t) = V0 + V1 sin ωt (a) What is the mean, v? (b) What is the standard deviation, σv ? (c) What is the mean power delivered by this signal to a load of resistance R? √ [(a) V0 , (b) V1/ 2, (c) (V02 + V12/2)/R] 2. In example 8.2 show that the p.d.f. given for light bulb failures is normalised. 3. 
A particular noise voltage v(t) has p.d.f. f (v) = 1 v ln 2 0 for 1 ≤ v ≤ 2 otherwise Calculate (a) the mean, (b) the standard deviation and (c) the average power delivered to a 50Ω load. [(a) 1.44V, (b) 0.29V, (c) 43mW] 76 4. The shelf-life, x, in months, of a batteries can be modelled as a random variable with p.d.f. a for x ≥ 0 f (x) = (x + 5)3 0 otherwise (a) Find the value of a. (b) Find the probability that a single battery will have a shelf-life of (i) at least 20 months and (ii) anywhere between 10 and 40 months. [(a) 50, (b)(i) 1/25 or 4%, (ii) 8/81 or 9.9%] 5. The waiting time in a Post Office queue, in minutes, x, is modelled as a continuous random variable with cumulative distribution function 1 − e−x/4 for x ≥ 0 0 otherwise F (x) = (a) Calculate the probability of waiting (i) less than 12 minutes, (ii) more than 5 minutes, and (iii) between 2 and 4 minutes. (b) Derive the p.d.f., f (x), and hence calculate the mean waiting time, x̄. [(a)(i) 0.95, (ii) 0.29, (iii) 0.24, (b) x̄ = 4 minutes] 6.∗ The lifetime of a light bulb is a random variable with p.d.f. given by f (x) = x − 1 for 1 ≤ x ≤ 2 3 − x for 2 ≤ x ≤ 3 0 otherwise (x is measured in 1000 hours.) (a) Sketch the p.d.f. (b) Sketch the probability that a bulb has stopped working after a time t, with t in the range 0–4000 hours. (c) Calculate the probability that a bulb has stopped working after 2200 hours. (d) A circuit consists of two such bulbs in (i) series and (ii) in parallel. What is the probability that these arrangements are open circuit after 2,200 hours of operation? [(c) 68% (d) (i) 90% (ii) 46%] 77 Chapter 9 Theoretical distributions Aims By the end of this chapter, you should know about • the Gaussian (also known as normal) distribution and its properties • the Poisson distribution and its properties • the Binomial distribution and its properties. 9.1 The Gaussian distribution Suppose that we were to measure the actual resistance of a large number (say 1,000) resistors whose value was supposed to be 1kΩ. We should not expect to obtain 1,000 values of exactly 1kΩ, because the manufacturing process for resistors isn’t perfect. In other words, we would expect to obtain some resistances greater than 1kΩ, some less. I have actually done these measurements, for 75 rather than 1,000 resistors, and a bar chart of the results is shown in figure 9.1. The shape is more-or-less what you might have expected: a large number around the middle and fewer further away. The mode (most popular value) however, is not 1kΩ, perhaps unexpectedly: it is 987.6Ω, which provides evidence that the bridge I used needs re-calibrating. 78 R 980.6 982.0 983.4 984.8 986.2 987.6 989.0 990.4 991.8 993.2 994.6 996.0 997.4 Count 1 * 1 * 3 *** 4 **** 8 ******** 17 ***************** 15 *************** 16 **************** 3 *** 1 * 4 **** 0 1 * Mean = 988.6, Std. dev. = 2.93 Figure 9.1: Non-normalised bar chart for the resistance of 74 nominally 1kΩ resistors, measured on an RLC bridge. I actually measured 76 resistors, one of which was 99.6Ω (I imagine that it had escaped from the 100Ω drawer) and I also discounted a single 1009Ω resistor from the calculations, on the grounds that it is an exceptionally high value, known as an ‘outlier’. The bar chart in figure 9.1 is an approximation to the p.d.f. f (R) for the various values of resistance R, except that as shown it has not been normalised: to normalise we would need to divide each column height by 74, the number of resistors, ×1.4Ω, the width of each class. 
That would ensure that the area under the bar chart is one. It is an experimental fact that the p.d.f. of a wide range of measurements of a variable subject to random errors is found to be well approximated by a particular curve. The p.d.f. in question has a particular mathematical form (see below) and data that follows this description is said to be Gaussian or normally distributed. The continuous distribution with p.d.f. (x−x)2 1 − f (x) = √ e 2σ2 σ 2π 79 (9.1) is known as the Gaussian or normal distribution. It crops up all over the place. Some points to note are: • The mean of the random variable x is just x • The standard deviation is σ • The distribution is normalised, that is, Z ∞ f (x)dx = 1 −∞ • The curve is symmetrical about x = x and bell-shaped It is useful to know how different values of the parameters x and σ affect the shape of the Gaussian curve, and this is illustrated in figure 9.2. Notice that the smaller σ is, the narrower and higher the curve is; if it is narrower, it must also be higher because the total area has to be 1 (normalisation). Note also that the peak of the curve occurs at x = x — that is, the most likely value (the mode) is also the mean: not all distributions have this property. 9.2 The Gaussian probability distribution Look back at equation 8.3. That tells us that if a random variable x is normally distributed, the probability that its value is less than a number x1, P (x ≤ x1), is given by Z x1 (x−x)2 1 − e 2σ2 dx F (x1) = P (x ≤ x1) = √ (9.2) σ 2π −∞ This defines the cumulative distribution function, F (x), for a Gaussian p.d.f. We can calculate the probability that x lies between x1 and x2 — it is given by F (x2) − F (x1). 80 2.0 1.5 f(x) σ = 0.25 1.0 σ = 0.5 0.5 σ= 0.0 –2.0 –1.0 0.0 x 1.0 1 2.0 Figure 9.2: The Gaussian p.d.f. for x = 0 and different values of σ. Unfortunately, the integral in (9.2) can’t be expressed in terms of known functions like sine, exp, log etc. and so we generally have to find its value from tables. If you look at the integral in 9.2 you will see that its value depends on three parameters: x, σ and x1. Obviously it would be impractical to compile tables for all possible values of these parameters, and this is not necessary. Instead we can use a single table, which is on page 177 and which gives the area under the Gaussian curve between 0 and z, where |x − x| z= σ is the normalised variable. From this you can calculate F (z) in all cases, as set out below. Example 9.1 The amplitude of a noise voltage v is normally distributed with mean 0.8V and variance 0.25V2. What is the probability that the voltage sampled at a particular instant (a) lies between 1 and 2V? (b) is less than 1.5V? 81 Answer (a) The probability required is P (1 ≤ v ≤ 2). We need to translate these√values of v into the normalised variable z. First, the standard deviation σ = 0.25 = 0.5V. Given the mean and standard deviation, we can see that v = 1V corresponds to z = 0.4 and v = 2V corresponds to z = 2.4. From the table on page 177, F (0.4) = 0.5 + 0.1554, F (2.4) = 0.5 + 0.4918. Hence, P (1 ≤ v ≤ 2) = 0.9918 − 0.6554 = 0.3364, which is the required answer. The two numbers are subtracted because the two voltages, 1 and 2V, are on the same side of the mean. (b) Here we need to find P (−∞ ≤ v ≤ 1.5). The two values of v correspond to z = −∞ and z = 1.4 respectively. We need to find the area under the Gaussian curve between z = −∞ and z = 1.4. Since F (z) is symmetrical about z = 0, and F (∞) = 1 (normalisation), we know that F (0) = 0.5. 
Hence, the area between z = −∞ and z = 1.4 = 0.5 + area between z = 0 and z = 1.4, so P (−∞ ≤ v ≤ 1.5) = 0.5+F (1.4) = 0.9192. The two numbers are added because the two voltages −∞ and 1.5V are on opposite sides of the mean. 9.3 The Poisson distribution This is a discrete distribution and has many applications, e.g. in computer networks and queueing theory. The distribution arises in situations where a series of independent random events occurs, and the probability of a single such event occurring within a small time interval is proportional to the length of that interval. In fact, it applies not just to events that happen in time, but events distributed within any region. For example, the Poisson distribution allows us to answer questions such as • If an office receives on average 100 telephone calls per hour, what is the probability exactly 210 calls being received in a given two hour period? [2.2%] • If a cyclist gets a flat tyre once every 5,000 miles on average, what is the probability of having no flat tyres in 10,000 miles? [13.5%] 82 • A typist makes an average of one mistake per page; what is the likelihood of picking a page at random that contains three mistakes? [6.1%] We now derive the Poisson p.d.f. Suppose that at the book issue desk of a library, the probability that one person arrives during a small time interval from 0 to δt is λδt, with λ a constant equal to the average number of arrivals per unit time. We want to calculate Pn(t), which is the probability of exactly n arrivals during a time interval t. We can calculate this probability by considering Pn(t +δt), the probability of exactly n arrivals during the interval t + δt. For n > 0 this is the sum of the probabilities of two mutually exclusive events, i.e. [n arrivals in t and none in δt] and [n − 1 arrivals in t and 1 in δt]. That is Pn(t + δt) = Pn(t) × P0(δt) + Pn−1(t) × P1(δt) where we have assumed that δt is so small that P2(δt) ≈ 0. Now, the probability of one arrival in δt is P1(δt) = λδt by definition, so the probability of no arrivals in δt, P0(δt) = 1 − λδt. Using these values in the above equation gives Hence Pn(t + δt) = Pn(t)(1 − λδt) + Pn−1(t)λδt Pn(t + δt) − Pn(t) dPn(t) = = λ[Pn−1(t) − Pn(t)] (9.3) δt dt where we have taken the limit as δt → 0. This is actually a differentialdifference equation for Pn(t), which we can solve. First let us consider n = 0. In 9.3, P−1(t) = 0 — we cannot have −1 arrivals during any time interval, so 9.3 becomes 83 dP0(t) = −λP0(t) dt which has the solution P0(t) = P0(0)e −λt . Since P0(0), the probability of no arrivals in zero time, is unity, this simplifies to P0(t) = e −λt . Knowing P0(t), we can use this in 9.3 with n = 1 to find P1(t), which turns out to be P1(t) = λte −λt (see problems in the chapter entitled ‘Differential Equations and the Laplace Transform’). Solving 9.3 for successive values of n gives (λt)n −λt e Pn(t) = n! which is the probability of exactly n arrivals in a time t when the mean arrival rate is λ. This is the Poisson distribution. Example 9.2 During the 8 hour period that a library is open, a total of 960 people join the queue at the book issues desk. (a) What is the average rate at which people arrive in the queue, in people/minute? (b) What are the probabilities of exactly 0, 1, 2, and 3 people arriving in the queue during any given minute? Answer (a) 8 hours = 480 minutes; hence λ = 960/480 = 2 people/minute is the average arrival rate. (b) ‘Per unit time’ in the context of this problem means ‘per minute’. 
The Poisson distribution tells us that 0 −λ×1 = 1 × e −2 = 13.5% P0 = (λ×1) 0! e 1 −λ×1 = 2 × e −2 = 27.1% P1 = (λ×1) 1! e 2 P2 = (λ×1) e −λ×1 = 2 × e −2 = 27.1% 2! 3 −λ×1 = 1.33 × e −2 = 18.0% P3 = (λ×1) 3! e The Poisson distribution has mean λt, standard deviation problems) and is of course normalised so that 84 √ λt (see ∞ X Pn(t) = 1. n=0 9.4 The binomial distribution The binomial distribution is another discrete distribution. It applies to situations with a discrete number of possible outcomes. It is defined as follows: If the probability of an event occurring is p, and of it not occurring is q (so q = 1 − p), then the probability that the event will happen k times out of n is given by the (k + 1)-th term in the binomial expansion of (q + p)n. Recall that (q + p)n = q n + nq n−1 p + n(n − 1) n−2 2 n! q p +...+ q n−k pk + . . . + pn. 2! (n − k)!k! (9.4) Example 9.3 A die is rolled 4 times. What is the probability of obtaining (a) 2 (b) 4 sixes? Answer This is a classic example of the sort of problem to which the binomial distribution applies. In this case, let p be the probability of obtaining a six with one throw of the die, so p = 1/6, and so q, the probability of not obtaining a six, is 5/6. Using the binomial expansion, we obtain 5 1 + 6 6 !4 5 = 6 !4 5 +4 6 !3 1 5 +6 6 6 ! !2 1 6 !2 5 +4 6 ! 1 6 !3 1 + 6 !4 so the probability of 2 sixes is 6 × (5/6)2 × (1/6)2 ≈11.6%. The probability of 4 sixes is (1/6)4 (as you’d expect) ≈ 0.08%. 85 Problems, chapter 9 For areas under the Gaussian curve, see the table on page 177. 1. Resistor values are found to be normally distributed with mean R and standard deviation σ. In a large batch of resistors of the same nominal value, what percentage would be expected to lie within (a) ±σ, (b) ±2σ of R? (c) Exactly half of all resistors lie within ± how many σ of R? [(a) 68%, (b) 95%, (c) 0.67σ] 2. Nominally 1µF capacitors are found to have values that are normally distributed. They are marked as being of ±5% tolerance, but 20% are found to be outside this range. What is the standard deviation of the production spread? [0.039µF] 3. A d.c. signal of 100mV has added noise whose amplitude p.d.f. is Gaussian with zero mean and variance 10−6V2. This signal is fed into a D.V.M. with resolution 1mV, which rounds to the nearest 0.5mV. Calculate the probability of the meter reading 101mV. [0.2417] 4. In a binary signal, levels of 0 and 100mV correspond to logic 0 and 1 respectively. Suppose this signal has Gaussian noise with zero mean and standard deviation of 50mV added to it. (a) What is the probability of a bit error being produced, assuming that the threshold for deciding between 0 and 1 is 50mV? (b) What is the probability of a 4 bit word being correct? [(a) 15.9%, (b) 50.1%] 5. Show that the Poisson distribution really does obey equation (9.3). 6. Show that the Poisson distribution is normalised. 7.∗ Show that the Poisson distribution has mean λt. 8. Crashes of the student file server are Poisson distributed, occurring at a mean rate of 2 per day when there’s a deadline to be met. What is the probability of (a) 0, (b) 2, (c) 4 crashes in one day? [(a) 13.5%, (b) 27%, (c) 9%] 9. The average number of faults on a new car is 5. What is the probability of (a) buying a new car with 0 faults and (b) buying two new cars with 86 a total of 4 faults between them? (c) What is the most likely number of faults in one car? [(a) 0.67%, (b) 1.9%, (c) 4 and 5 equally likely] 10. Calculate the probabilities for the three examples of the Poisson distribution on page 82. 
N.B. 210! ≈ 1.06 × 10398. 11. The premium bond problem. For every pound you invest in premium bonds, you used to have a 1/15,000 chance of winning a prize each month. Suppose you have invested £20,000. What is the probability of winning, in a given month, (a) exactly one prize? (b) exactly two prizes? (c) at least one prize? (Hints. The binomial distribution applies. For part (c) probability of at least one prize = 1− probability of no prizes.) [(a) 0.3515 (b) 0.2343 (c) 0.7364] 12. A hundred samples of 5 resistors were taken from a large batch. 59 samples had no defective resistors, 33 had 1, 7 had 2, 1 had 3 and no samples had 4 or 5 defective resistors. Show that the distribution is approximately binomial and estimate the overall percentage of defective components. [about 10%] 87 Chapter 10 The method of least squares Aims By the end of this chapter, you should understand • how the method of least squares works • how to fit a straight line to given data • how to fit simple curves to given data. You will need to recall some facts about partial differentiation from first year maths. 10.1 Gauss Carl Friedrich Gauss was an important mathematician in his time. In the Theoretical Distributions chapter we learnt about a probability density function that was named after him; in this chapter, we discuss the method of least squares, which was discovered by him. 10.2 A data fitting problem Suppose we measure the current in through a resistor R, for N different values of the voltage vn across it (so n = 1, 2, 3, . . . , N ). If Ohm’s law holds, the graph of in against vn should be a straight line with slope 1/R. Experiments being what they are, there will 88 be some errors in the measurements and the resulting graph will not be exactly a straight line. How do we then calculate the slope and intercept of the ‘best’ straight line through the data, and what do we mean by ‘best’ anyway? The general ‘straight line fit’ problem is illustrated in figure 10.1. y * ( xi , y i ) * * di * * m * }c x * * * * * * * Figure 10.1: A problem to be solved by the method of least squares: fit a straight line of the form y = mx + c to the given data set. The vertical distance from the i-th point (xi , yi ) to the line is di . 10.3 The method of least squares Almost any straight line1 can be represented in the form y = mx + c, where m is the gradient and c is the y-intercept. We want to find the two numbers m and c that best represent a given set of data, with the assumption that The errors in x are much smaller than the errors in y. Under this assumption, a way to do this is to 1 The exception is any vertical straight line. 89 Calculate m and c such that the sum of the squared vertical distances of each of the points from the straight line is a minimum. This is known as the method of least squares, since we are trying to minimise a sum of squares. We minimise the sum of squared vertical distances because the errors in x are assumed to be much less than the errors in y. We have also found one plausible answer to the question “What do we mean by ‘best’ straight line?” — one that minimises the sum of the squared vertical distances. Why squared distances? There are two points to note here: 1. If just the distance were to be used, some cancelling out could happen, since some of the distances could be positive and some negative. In fact we could make the sum of distances equal to 0, with a straight line that was a very poor fit to the data. 2. 
The calculation of c and m is straightforward for squared distances, as we shall see. There are other ways this fit could be done, e.g. by minimising the sum of the absolute values, or the fourth power of the distances, neither of which can be carried out as easily. If the errors in y are much smaller than the errors in x, then you should swap the x and y values in what follows. The theory then remains the same. 10.4 Calculating m and c Let us write the i-th data point, with i going from 1 to N , as (xi , yi). We then need to define S, the sum of the vertical squared distances of all the points from the straight line y = mx + c. This is given by 90 N X (yi − mxi − c)2. S= i=1 It is surprisingly easy to find the values of c and m that minimise S. We partially differentiate S with respect to c and to m, and set the derivatives equal to zero. This assumes that S as a function of c and m has exactly one turning point, which is a minimum. This can be proved by calculating second derivatives — see problems. The values of c and m that satisfy the resulting pair of equations are the values that minimise S. Hence N and ∂S X −2(yi − mxi − c) = 0 = ∂c i=1 N ∂S X −2xi(yi − mxi − c) = 0 = ∂m i=1 P Using the fact that N i=1 c = N c, the first equation gives N X i=1 yi − m N X i=1 xi − N c = 0 (10.1) and the second N X i=1 xi yi − m N X x2i i=1 −c N X xi = 0. (10.2) i=1 We now have two simultaneous linear equations and two unknowns, c and m, so we can solve for c and m. 91 Example 10.1 Use the method of least squares to fit a straight line to the five points (−1, 3.2), (0, 1.4), (1, −0.8), (2, −2.9), (3, −3.8) assuming that the x-values are accurate. Answer We will use equations 10.1 and 10.2, so we first calculate N X xi = 5, N X i=1 i=1 yi = −2.9, x2i = 15, and N X i=1 i=1 Equations 10.1 and 10.2 now become −2.9 − 5m − 5c = 0 N X xiyi = −21.2 − 21.2 − 15m − 5c = 0 which we can solve to obtain m = −1.83 and c = 1.25. The data and the least squares straight line fit to the data are shown in figure 10.2. 4.0 2.0 0.0 -2.0 -4.0 -1.0 0.0 1.0 2.0 3.0 Figure 10.2: Five data points and a straight line fit to them, as calculated by the method of least squares. 92 10.5 Fitting to a parabola In a similar way, we can calculate a least squares fit to a parabola of the form y = a + bx + cx2 . In this case there are three unknowns, a, b and c. We still define S as the sum of the squared vertical distances from the parabola to the points, and hence N X (yi − a − bxi − cx2i )2 S= i=1 The three equations now are N ∂S X −2(yi − a − bxi − cx2i ) = 0 = ∂a i=1 N and ∂S X −2xi(yi − a − bxi − cx2i ) = 0 = ∂b i=1 N ∂S X −2x2i (yi − a − bxi − cx2i ) = 0 = ∂c i=1 As before, these equations can be solved for a, b and c. Similar calculations can be used to fit any functions to a data set, provided the functions are linear in the unknown parameters. For example, y = a sin x y = ax + be x y = ax3 + b ln x + c are all linear in the parameters a, b and c, and the method of least squares can be used to find the parameters for a given set of data. By contrast, the following y = e (x−b) 2 /c2 y = cos(a/x + bx + c) y = ln(a + bx) are not linear in a, b and c, and least squares cannot be used, at least not directly. 93 Problems, chapter 10 1. By considering second derivatives, show that the values of c and m obtained by solving equations 10.1 and 10.2 are such that S= N X i=1 (yi − mxi − c)2 is a minimum (as opposed to a maximum). 2. 
Fit a straight line of the form y = ax + b to the following set of points, by using the method of least squares: (0, 12.3), (5, 14.5), (8, 15.0), (11, 17.6) Assume that the x values are correct. [a = 0.4500, b = 12.15] 3. A battery of nominal voltage V0 and internal resistance r is connected to a variable resistance. Various values of the current i through and voltage v across this load are given below 2 4 6 8 Amps i 0 v 6.1 4.9 3.0 1.6 0.2 Volts Assuming that the errors in the current readings are much smaller than those in voltage, calculate V0 and r. [V0 = 6.18V, r = 0.755Ω] 4. (i) Show that the value of a that gives the least squares fit of the function y = ax2 to a data set (x1, y1 ), . . . (xN , yN ), is given by a= PN 2 i=1 xi yi PN 4 i=1 xi (ii) Some power, P , versus voltage, V , measurements for a resistor R are given below. V 1 1.5 2 2.5 3 Volts, ±0.3% P 0.2 0.6 0.9 1.6 2.4 Watts, ±2% 94 Calculate the least squares value of R. [R = 3.87Ω] 5.∗ (i) Show, by taking logs, that least squares fitting a function of the form y = ae bx can be reduced to fitting a straight line. (ii) A capacitor C is initially charged to 10V and then connected across a resistor R. The current through R measured at 1 millisecond intervals is 1 2 3 4 ms, ±0.5% t 0 i 4.5 2.8 1.5 1.0 0.6 mA, ±2.5% Using the method of least squares, find R and C. [R = 2.2kΩ, C = 0.89µF] 6.∗ The average mass, y, of nails of length x obeys the law y = axb where a and b are constants. (i) Show that the problem of finding a and b from N data points can be reduced to a least squares straight line fitting problem in which the equations to be solved are X X ln yi − b ln xi − N ln a = 0 i i and X i ln xi ln yi − b (ln xi )2 − ln a X i X ln xi = 0 i (ii) Given the following data: x 1 2 4 6 inch y 5 12 30 60 g and assuming that the nail lengths are more accurately known than the masses, estimate a and b. [a = 4.81, b = 1.37] 95 Chapter 11 Complex frequency 11.1 Complex frequency You should be familiar now with the idea of a transform since we have looked at the Fourier transform in some detail. The purpose of the Fourier transform is to represent a function of time, f (t), as a function of angular frequency, F (ω). Both f (t) and F (ω) represent the same function, but in terms of a different variable. Similar to, but not the same as the Fourier transform is the Laplace transform, which transforms a function of time, f (t), into a function of the variable s, known as complex frequency. We define the Laplace transform in the next chapter, but in this chapter we look at what s means. In general, s has both a real and imaginary part, and it is written conventionally as s = σ + jω so that e st = e (σ+jω)t = e σte jωt. We always assume that σ and ω are real. We already know that e jωt = cos ωt + j sin ωt is periodic with period 2π/ω. What is the meaning of σ, the real part of s? There are three cases to consider: (1) σ < 0, (2) σ = 0 and (3) σ > 0. We consider these in turn. 96 11.1.1 σ<0 Here, eσt is an exponentially decreasing function of time. If this is then multiplied by e jωt, the real part of the result is a damped oscillation: e σt σ<0 t Re e st σ<0 t 11.1.2 σ=0 Here, eσt = 1 is a constant function of time. If this is then multiplied by e jωt, the real part of the result oscillates with constant amplitude: σ=0 e σt 1 t Re e st σ=0 t 97 11.1.3 σ>0 Here, eσt is an exponentially increasing function of time. 
If this is then multiplied by e jωt, the real part of the result is an exponentially growing oscillation: e σt σ>0 t Re e st σ>0 t 11.2 Linear homogeneous differential equations Recall that a linear second order differential equation is an equation of the form dv d2v 2 + ω v = f (t) + 2aω 0 0 dt2 dt This equation is • linear (only first powers of the unknown function, v, and its derivatives appear) • second order (the highest derivative that appears is the second) It is assumed that the real constants a and ω0 are known, and also the function (the ‘drive’) f (t) on the right hand side is given. The 98 problem then is to find the unknown function v(t) that satisfies the differential equation for all times t and all initial conditions. If f (t) = 0 then the equation becomes dv d2v 2 + ω v=0 (11.1) + 2aω 0 0 dt2 dt and is described as a homogeneous linear differential equation.1 We consider this case now. In order to solve 11.1 we assume that the solution will be of the form v(t) = V0e st where V0 is a constant. We then need to find the possible values of the complex frequency s. Substituting our assumed solution into 11.1 and using the fact that dv = V0se st dt d2v = V0s2e st 2 dt and gives 2 s + 2aω0s + ω02 V0e st = 0. This has to be 0 for all times, t. Hence either V0 = 0 (trivial solution, since this leads to v(t) = 0 for all t) or s2 + 2aω0s + ω02 = 0. By assuming the general form of the solution, we have managed to transform the original differential equation into a quadratic in s — which of course we know how to solve: √ s± = −ω0a ± ω0 a2 − 1. This is just a shorthand way of writing the two values of s √ √ and s− = −ω0(a − a2 − 1). s+ = −ω0(a + a2 − 1) 1 If f (t) 6= 0, then it is an inhomogeneous differential equation. 99 The most general solution to 11.1 will therefore be v(t) = e −ω0 at " Ae √ (ω0 a2 −1)t + Be # √ −(ω0 a2 −1)t where A and B are arbitrary constants whose values can be found from initial conditions. We can now use the results of section 11.1 to describe the behaviour of v(t) as defined by the differential equation (11.1). We can always assume that ω0 > 0 (why?). The behaviour of v(t) then depends on the value of a. There are three cases: 1. a2 > 1 2. a2 = 1 3. a2 < 1 We consider these in turn. 11.2.1 a2 > 1 √ In this case,√a − 1 > 0 and so a2 − 1 is real. Furthermore, if a > 0, a − a2 − 1 > 0. Hence, the two numbers √ s± = −ω0(a ± a2 − 1) 2 are both negative if a > 0 so the general solution is the sum of two damped exponentials. √ Similarly, if a < 0, a − a2 − 1 < 0. Thus, the two numbers √ s± = −ω0(a ± a2 − 1) are both positive if a < 0 and in this case, the general solution consists of the sum of two growing exponentials. 100 11.2.2 a2 = 1 When a2 = 1, s± = −ω0a so the solution is exponentially decaying if a > 0 and exponentially growing if a < 0. (In fact, the situation is a bit more complicated than this, and the Laplace transform enables us to sort out the difficult cases easily.) 11.2.3 a2 < 1 √ In this case, a − 1 < 0 and so a2 − 1 is imaginary. Hence √ s± = −ω0(a ± j 1 − a2) 2 are both complex. Therefore, if the real part of s± , −ω0a, is negative — that is, a > 0 — the solution is damped oscillatory. On the other hand, if a < 0, the solution is exponentially growing and oscillatory. All the above are summarised in the following diagram. Steady state oscillation Growing exponential Growing oscillatory −1 Damped oscillatory 0 Damped exponential 1 Figure 11.1: All possible types of behaviour of solutions of equation (11.1). 101 a Problems, chapter 11 L R S C 1. 
In the figure, L = 4H, C = 1F. Capacitor C is initially charged. Switch S is then closed. For what value/range of values of R is the subsequent behaviour (i) damped oscillatory (ii) damped exponential (iii) oscillatory with constant amplitude (iv) growing oscillatory? What is the frequency in the case of an oscillatory solution with constant amplitude? Which of these would be physically realisable with passive components? [(i) 0 < R < 4Ω (ii) R ≥ 4Ω (iii) R = 0 (iv) −4 < R < 0; 1/4π Hz; (i) and (ii) are realisable] 102 Chapter 12 The Laplace Transform 12.1 The Laplace transform You should already be familiar with the idea of a transform, as we have discussed the Fourier transform in previous lectures. Just as the Fourier transform allows us to express a function of time as a function of angular frequency ω, the Laplace transform allows us to express a function of time in terms of complex frequency, s. (Some books use p.) If the function of time is f (t), then its Laplace transform is written F (s), or occasionally L[f (t)], and is defined by F (s) = L[f (t)] = Z ∞ e −st f (t) dt. (12.1) 0 Two important differences between the Laplace transform and the Fourier transform are 1. In the Laplace transform, the function f (t) is assumed to start from t = 0, whereas in the Fourier transform, it is assumed to start from t = −∞. 2. In the Laplace transform, the new variable s has both real and imaginary parts, whereas in the Fourier transform, jω is purely imaginary. Let us start by calculating some Laplace transforms. 103 Example 12.1 If f (t) = e −kt, then F (s) = L e So −kt = ∞ Z e −st −kt e dt = Z ∞ e −(k+s)t dt 0 0 ∞ e −(k+s)t 1 . =− = k + s 0 k+s −kt = L e i h 1 . k+s (12.2) Example 12.2 If f (t) = sin ωt then, using the fact that sin ωt = e jωt − e −jωt 2j we see that L[sin ωt] = Z ∞ 0 h e −st (e jωt − e −jωt ) dt. 2j Using the previous result, that L e kt = 1/(k + s), we get i 1 1 1 . − L[sin ωt] = 2j −jω + s jω + s " # Hence, simplifying, L[sin ωt] = ω . ω 2 + s2 Example 12.3 What is the Laplace transform of cos ωt = before, 1 1 1 + L[cos ωt] = 2 −jω + s jω + s " = s . ω 2 + s2 104 # e jωt +e −jωt ? 2 As 12.2 The Laplace transform of a derivative The importance of the Laplace transform in solving differential equations becomes clear when we try to find the transform of the derivative of a function f (t) w.r.t. time: Z ∞ df df e −st dt = L dt dt 0 Integrating by parts gives ∞ Z ∞ df −st −se −st f (t) dt = e f (t) − L dt 0 0 = −f (0) + s L[f ] = −f (0) + sF (s). We have assumed that f (t) is such that limt→∞ e −st f (t) = 0. (If f (t) didn’t have this property, it would not have a Laplace transform.) What about second derivatives? Using the fact that d2f d df = dt2 dt dt we can use the result for the first derivative: 2 d f df df (0) + sL L 2 = − dt dt dt Replacing L df dt with [−f (0) + sF (s)] gives L 2 d f df (0) = −sf (0) − + s2F (s) 2 dt dt The two important results we have deduced are 2 df df (0) 2 d f = −sf (0)− +s F (s). = −f (0)+sF (s), L L dt dt2 dt (12.3) 105 Note that f (0) is the value of f (t) at t = 0 and the derivative of f at t = 0. df (0) dt is the value of Example 12.4 Is this consistent with our previous examples? We’ve already found L[sin ωt] and L[cos ωt]. They are given by ω s and [cos ωt] = L[sin ωt] = 2 L ω + s2 ω 2 + s2 But, sin ωt = − 1 d cos ωt ω dt so the Laplace transform of sin ωt should equal 1 1 s2 ω −1 d cos ωt = − (− cos 0 + s L[cos ωt]) = − (−1 + 2 )= 2 L 2 ω dt ω ω ω +s ω + s2 " # which is indeed the Laplace transform of sin ωt. 
106 Problems, chapter 12 1. Show that the Laplace transform has the superposition property, i.e. if L[f (t)] = F (s) and L[g(t)] = G(s), then L[af (t) + bg(t)] = aF (s) + bG(s) where a and b are constants. 2. Find the Laplace transform of (i) f (t) = a, a is a constant. (ii) f (t) = t (iii) f (t) = a + bt, a and b constants. (Use superposition). [(i) a/s, (ii) 1/s2, (iii) (as + b)/s2] 3. Expand sin(ωt + φ) and hence show that L[sin(ωt + φ)] = What is L[cos(ωt + φ)]? ω cos φ + s sin φ s2 + ω 2 [(s cos φ − ω sin φ)/(s2 + ω 2 )] 4. Find the Laplace transform of e −kt cos(ωt+φ) and e −kt sin(ωt+φ) without integration, by (i) deducing the Laplace transform of e −kt e j(ωt+φ) (use equation 12.2) then (ii) finding the real and imaginary parts of this expression. cos φ−ω sin φ [ (s+k) , (s+k)2 +ω 2 107 (s+k) sin φ+ω cos φ ] (s+k)2 +ω 2 Chapter 13 Differential Equations and the Laplace Transform 13.1 Inhomogeneous differential equations In the last but one chapter, we saw how to solve the following differential equation d2v dv 2 + ω v = f (t) (13.1) + 2aω 0 0 dt2 dt with f (t) = 0. We could (a) describe the solutions qualitatively (e.g. damped oscillatory, growing exponential etc.) and (b) write down a general, exact solution. In this chapter, we discuss how to do (b) but now when f (t) 6= 0, or, in technical terms, when the differential equation is inhomogeneous. Where does such an equation arise in practice? L R S C Figure 13.1: A circuit described by a homogeneous differential equation 108 L R f(t) C Figure 13.2: A circuit described by an inhomogeneous differential equation In figure 13.1 the capacitor is initially charged and switch S is open. At t = 0, S is closed. The behaviour of the circuit is described by the homogeneous differential equation d2v dv LC 2 + RC + v = 0 dt dt where v = v(t) is the voltage across C. In figure 13.2 the circuit is driven by an applied voltage f (t) — this might, for instance, be a sine wave from a signal generator. The circuit is now described by the inhomogeneous differential equation dv d2v (13.2) LC 2 + RC + v = f (t). dt dt We are going to solve differential equations like this one by using the Laplace transform technique. 13.2 Solving a d.e. by Laplace transform — overview Recall that to solve a differential equation in a function of time, v(t), means to find a function, v(t), that satisfies the differential equation for all time, t. 109 You should also remember that the general solution of a second order differential equation will have two arbitrary constants whose values are determined from initial conditions. A good way to solve equation 13.1 when f (t) 6= 0 uses the Laplace transform. This method requires us to 1. Find the Laplace transform of the differential equation; 2. solve the resulting (algebraic) equation for V (s), the Laplace transform of v(t); then 3. find the inverse Laplace transform of V (s), which gives us v(t). We look at each of these items in turn. 13.3 The Laplace transform of a differential equation Using the rules for finding the Laplace transform of first and second derivatives, we can immediately find the Laplace transform of the differential equation for a driven RLC circuit, equation 13.2. It is dv(0) LC[−sv(0) − + s2 V (s)] + RC[−v(0) + sV (s)] + V (s) = F (s) dt This looks rather a mess! It is, however, just a linear equation for V (s), the Laplace transform of the (as yet unknown) function v(t). 
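If you want convincing that this really is nothing worse than algebra, a computer algebra system will happily do the rearrangement. Here is a rough sketch in SymPy (our own symbol names; F(s) and V(s) are simply treated as symbols, and the initial conditions v(0), dv(0)/dt are the symbols v0 and vdot0) — the expression it returns is exactly the one derived next.

```python
import sympy as sp

s = sp.symbols('s')
L, C, R = sp.symbols('L C R', positive=True)
v0, vdot0 = sp.symbols('v0 vdot0')   # v(0) and dv(0)/dt
V, F = sp.symbols('V F')             # V(s) and F(s)

# Laplace transform of LC v'' + RC v' + v = f(t), using
# L[v'] = -v(0) + s V(s)  and  L[v''] = -s v(0) - v'(0) + s^2 V(s):
eqn = sp.Eq(L*C*(-s*v0 - vdot0 + s**2*V) + R*C*(-v0 + s*V) + V, F)

# The equation is linear in V, so solving it is one step:
V_of_s = sp.solve(eqn, V)[0]
print(sp.simplify(V_of_s))
# -> (F + L*C*s*v0 + L*C*vdot0 + R*C*v0) / (L*C*s**2 + R*C*s + 1), up to ordering
```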
Solving for V (s) gives dv(0) 2 + RCv(0) V (s) LCs + RCs + 1 = F (s) + LCsv(0) + LC dt so F (s) + LCsv(0) + LC dv(0) + RCv(0) dt V (s) = (13.3) LCs2 + RCs + 1 We haven’t found v(t) yet, but we’ve found something closely related to it: the Laplace transform of v(t), V (s). And, what is more, this is a completely general expression for V (s), valid for any 110 • drive function, f (t) (provided its Laplace transform exists) dv(0) dt • initial conditions v(0) and • values of R, L and C. Notice that the initial conditions (v(t) and its derivative at t = 0) are automatically built into the expression for V (s). Example 13.1 Let R = 3/2Ω, C = 2F and L = 1H. Let f (t) = H(t) be the Heaviside functions, also knowns as the unit step function, which is defined by H(t) = 0 t<0 1 t>0 (H(t) is undefined at t = 0.) If the initial capacitor voltage v(0) = 0V with initial rate of change dv(0) dt = 0V/s, find the Laplace transform of v(t), V (s). Answer We have been asked to calculate the response of a circuit to an input consisting of a unit step function. According to equation 13.3, in order to find V (s), we need to find the Laplace transform of the drive function f (t). This is H(t), whose Laplace transform is given by L[H(t)] = Z ∞ e −st 0 1 −st t=∞ 1 × 1 dt = − e = s s t=0 Substituting the given values in 13.3, we get V (s) = 1 s(s + 1)(2s + 1) In words, the Laplace transform of the unit step response of the circuit is 1/[s(s + 1)(2s + 1)]. We are now left with the problem of finding v(t) from V (s), i.e. the problem of inverting the Laplace transform. 111 13.4 Inverse Laplace transform using tables There is an analytical way of inverting the Laplace transform, called the Bromwich integral, which involves contour integration. It is dealt with in, for instance, Boas (chapter 15). However, we are going to adopt the simpler and more usual approach of using tables of Laplace transforms. In what follows, I shall always refer to the tables on pages 8–11 in the E.E. Department’s ‘Tables of constants, formulae and transforms’. These tables are also included at the end of this book, starting on page 178; we refer to them as the L.T. Tables. Sometimes it is easy to find the Laplace transform that we need in the L.T. Tables, as the following example shows: Example 13.2 Let us finish off the previous example by finding the inverse Laplace transform of V (s). In the L.T. Tables you will find that the Laplace transform of b −at a −bt 1 1− e + e ab b−a b−a " is # (13.4) 1 . s(s + a)(s + b) But V (s) is of this form. If we divide the numerator and denominator of V (s) by 2, we get V (s) = 1/2 s(s + 1)(s + 1/2) so, to make this look like the result in the L.T. Tables, put 1 b= . 2 a = 1, Using these in 13.4 gives v(t) = 1 + e −t − 2e −t/2. You should check that this (a) satisfies the differential equation 13.2, and (b) has the properties v(0) = 0 and dv(0) dt = 0. In this case, therefore, we have solved the differential equation 13.2. 112 Sometimes there is a little more effort involved, as in the following example: Example 13.3 Suppose that we have instead V (s) = s+2 . 2s2 + s + 1 This looks quite like two entries in the L.T. Tables, at the top of the third page: −at L e cos ωt = h i s+a 2 s + 2as + b with ω = √ b − a2 (13.5) and 1 1 −at . (13.6) L e sin ωt = 2 ω s + 2as + b Notice that both the denominators are of the form s2 + 2as + b, so let’s bring out a factor of 1/2 from V (s): # " V (s) = s+2 1 . 2 s2 + 21 s + 12 Now we can find the values of a and b such that 1 1 s2 + s + = s2 + 2as + b. 
2 2 Obviously, 1 a= 4 and 1 b = , from which 2 ω= √ b − a2 = √ 7 . 4 That’s the denominator sorted out. What about the numerator? The numerator in equation (13.5) is s + a = s + 1/4, but we have a numerator of s + 2. How do we get this? The answer is that we rewrite V (s) as s+ 1 1 V (s) = 2 1 4 2 s + 2s + 1 2 + s2 + 7 4 1 2s + 1 2 . Now we can use 13.5 and 13.6 to obtain √ √ √ 7t 7 −t/4 7t 1 −t/4 cos + e sin . v(t) = e 2 4 2 4 113 13.5 Inverse Laplace transform by partial fractions In some cases the function we want to invert isn’t in the L.T. Tables, in which case it may be necessary to use the method of partial fractions. An example of this type is now given. Example 13.4 Suppose that V (s) = 1 ; s2 (s + 1) what is v(t)? The idea of using partial fractions is to re-write V (s) in the form C As + B 1 + = s2 (s + 1) s2 s+1 with A, B and C constants that we have to find. If we can write V (s) in this form, we can invert each of the fractions individually. The general rule for partial fractions is that the degree of the numerator must be one less than that of the denominator, hence the As + B term in the numerator of the first fraction above. We now need to find A, B and C. We do this by adding up the partial fractions: (As + B)(s + 1) + Cs2 1 = s2 (s + 1) s2(s + 1) The denominators are equal, so comparing the numerators, 1 ≡ (As + B)(s + 1) + Cs2 = (A + C)s2 + (A + B)s + B which has to be true for all s; hence, comparing coefficients of powers of s, B = 1, A + B = 0, A+C =0 so A = −1, B = 1 and C = 1. Therefore, −s + 1 1 1 1 1 V (s) = + = − + + . s2 s+1 s s2 s + 1 We can invert each part of this using the L.T. Tables. The answer is v(t) = −H(t) + t + e −t where H(t) is the unit step function. 114 Problems, chapter 13 1. Solve the following differential equation by the Laplace transform method: dy − y = 2e −t dt with y(0) = 3 [y(t) = 4et − e−t = 3 cosh t + 5 sinh t = 3et + 2 sinh t] 2. Solve dx d2 x − 4 + 4x = 4 dt2 dt with x(0) = 0, dx(0) = −2 dt [x(t) = 1 − e 2t] 3. Solve d2 v + 16v = 8 cos 4t dt2 with v(0) = 0, dv(0) =8 dt [v(t) = (2 + t) sin 4t] 4. In the circuit of figure 13.2, L = 1H, C = 1/5F, R = 2Ω and f (t) = 2 sin t. Write down the differential equation that describes v(t), the voltage across the capacitor, and solve it with the initial conditions v(0) = 0 and dv(0) dt = 3. [v(t) = − cos t + 2 sin t + e −t(cos 2t + sin 2t)] 5. Show that the solution to equation 9.3 in chapter 9, with n = 1, P0 (t) = e−λt and P1 (0) = 0, is as given. 115 Chapter 14 The Z transform: definition, examples 14.1 Introduction and Definitions 14.1.1 Sampling f(t) Σ δ(t - nT) The z-transform is to sampled signals as the Laplace transform is to continuous time signals. It is widely used in control theory and digital signal processing. Throughout this and the next two chapters, the sampling interval will be a fixed, positive time T . We first show how the Laplace and z-transforms are connected. t 0T 1T 2T 3T 4T 5T 6T 7T t Figure 14.1: Left: a continuous function of time, f (t). Right: a ‘comb’ of equally-spaced Dirac P delta functions, C(t) = ∞ n=0 δ(t − nT ). Figure 14.1, left, shows a continuous function of time, f (t). FigP ure 14.1, right, shows the function C(t) = ∞ n=0 δ(t − nT ), a set of equally-spaced Dirac delta functions, occurring at t = 0, T, 2T, . . .. Figure 14.2 tries to show the product, f (t) × C(t), which we will call fs (t). 
This picture should be interpreted as follows: since each 116 f(0T) fs(t) = f(t) x C(t) f(T) 0T f(t) f(2T) 1T 2T 3T t 4T 5T 6T 7T Figure 14.2: The sampled version of the function f (t), which is f (t) × C(t). The heights of the arrows are proportional to their areas — hence, the labels f (0), f (T ) and so on refer to the areas under the Dirac delta functions at t = 0, T, . . . respectively. Dirac δ(t − nT ) has unit area, but is infinite at t = nT and zero everywhere else, fs(t) is also infinite at t = nT and zero everywhere else. Figure 14.2 is therefore showing, by the height of the arrows, the area under the Dirac delta functions at t = 0, T, 2T, . . ., these areas being f (0), f (T ), f (2T ), . . . respectively. Hence, multiplying f (t) by C(t) can be seen as a way of sampling f (t) at the equally-spaced intervals t = 0, T, 2T, . . ., and so fs (t) = ∞ X n=0 14.1.2 f (nT )δ(t − nT ). The connection with Laplace transforms You now need to remember the important sampling property of the Dirac delta function, equation (2.3), which is repeated here: 117 Z ∞ −∞ f (t)δ(t − t0) dt = f (t0), (14.1) true for any continuous function f (t). Using this result, you should immediately be able to see that the Laplace transform of fs(t) is ! Z ∞ ∞ X e −st f (nT )δ(t − nT ) dt L[fs(t)] = 0 = ∞ Z X n=0 0 ∞ n=0 e −st f (nT ) δ(t − nT )dt = ∞ X f (nT )e −nsT . n=0 Defining z = e sT , we have the so-called z-transform of f (t), which is Z [f (nT )] = F (z) = ∞ X f (nT )z −n . (14.2) n=0 In practice, you will often see the sampled version of f (t) written as f (n), with the sampling interval T “built in” to f (n). See the following section for a further explanation. Using this convention, the z-transform is defined as follows: Z [f (n)] = F (z) = ∞ X f (n)z −n . (14.3) n=0 In words: The z-transform of a function of time, f (t), is the Laplace transform of the sampled version of f (t), written fs(t). The function fs(t) is obtained from f (t) by multiplying it by the P∞ sum of Dirac delta functions n=0 δ(t − nT ). 118 Points to note about the z-transform • We will always assume that f (t) = 0 for t < 0, so f (n) = 0 for n < 0. • The z-transform transforms a function of n, n = 0, 1, 2, . . . into a function of z, where z = e sT and s is the complex frequency. 14.1.3 The two ways of writing down z-transforms There are two slightly different ways of writing down z-transforms: • CE, the way that is preferred by control engineers, is shown in equation (14.2); • DSP, the boxed definition, (14.3), which is generally used by Digital Signal Processing people. They are equivalent to each other, and, although we concentrate on the DSP way in these notes, you should be familiar with both. Note also that the Departmental Tables, the relevant section of which is quoted on page 131, use both ways. Time and again, in dealing with z-transforms you will find you need to use properties of power functions, which you have certainly seen before, but which are repeated here — you need to be able to apply these almost without thinking about it. In the expressions, x is a positive real number and a, b are any real numbers. We have xa xb = xa+b 1/xb = x−b xa /xb = xa−b (xa)b = xa×b . Of course, since e is a positive real number, all these also apply to the exponential function; so, for instance, e a e b = e a+b . Here is a good moment to note also that (−1)n = 1, −1, 1, −1, . . . for n = 0, 1, 2, 3 . . .; and so (−1)n+1 = (−1)n−1 = −1, 1, −1, 1, . . ., again for n = 0, 1, 2, 3 . . .. 
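The equivalence of the CE and DSP conventions is also easy to check numerically: sample a function at interval T, form the DSP sequence with the parameter “built in”, and evaluate a truncated version of the sum Σ f(n)z⁻ⁿ both ways. The sketch below, in Python/NumPy, does this for f(t) = e^{−bt}; the numerical values of T, b, N and z are arbitrary choices of ours, and the closed form z/(z − a) used for comparison is the one derived in the next section.

```python
import numpy as np

def truncated_ztransform(f, z):
    """Truncated z-transform: sum_{n=0}^{N-1} f(n) * z**(-n)."""
    n = np.arange(len(f), dtype=float)
    return np.sum(f * z**(-n))

T = 0.1        # sampling interval (arbitrary)
b = 2.0        # decay rate of f(t) = e^{-bt} (arbitrary)
N = 200        # number of terms kept in the truncated sum
n = np.arange(N)

f_ce = np.exp(-b * n * T)    # CE way: sample f(t) at t = nT
a = np.exp(-b * T)
f_dsp = a**n                 # DSP way: f(n) = a^n, with a = e^{-bT} built in

z = 1.5                      # any |z| > a, so that the series converges
print(truncated_ztransform(f_ce, z))    # both conventions give the same value...
print(truncated_ztransform(f_dsp, z))
print(z / (z - a))                      # ...matching the closed form z/(z - a)
```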
119 Let us now look at four examples of z-transforms, taken from the Departmental Tables on page 131. In all cases, the sampling interval is T and n = 0, 1, 2, . . .. Each example shows that CE and DSP are equivalent, provided that we make the right choice of parameters (see the last column). CE way f (t) f (nT ) sin ωt sin ωnT t nT −at e e −anT e −at cos ωt e −anT cos ωnT DSP way f (n) with. . . sin an a = ωT an a=T n b b = e −aT bn cos cn b = e −aT , c = ωT For instance, take the first row in the table above. This is saying that if f (t) = sin ωt, then f (nT ) = sin ωnT and we can then write f (n) = sin an, by choosing the right definition for a, which in this case is a = ωT . 14.2 z-transform examples Before we compute the z-transforms of some well-known functions, we will derive the formula for the sum of a geometric series — you will have seen this before. You will see these formulae many times when discussing z-transforms and it is well worth your while to learn them, particularly the boxed ones. P Let Sk = kn=0 xn = 1 + x + x2 + . . . + xk . Then xSk = x + x2 + x3 + . . . + xk+1. Hence Sk − xSk = Sk (1 − x) = 1 + x + x2 + . . . + xk − (x + x2 + x3 + . . . + xk+1) = 1 − xk+1. Therefore 120 k X 1 − xk+1 . x = Sk = 1 + x + x + . . . + x = 1 − x n=0 2 k n (14.4) Now let k tend to infinity. Provided that |x| < 1, the numerator tends to one, so we have 2 1 + x + x + ... = ∞ X xn = n=0 1 1−x provided that |x| < 1. (14.5) Replace x with −x in the above to get ∞ X 1−x+x2−x3+. . . = (−1)nxn = n=0 1 provided 1+x |x| < 1. (14.6) By differentiating the above expression, we obtain 1 d d 1 2 3 = 1 + x + x + x + ... = dx 1 − x (1 − x)2 dx = 1 + 2x + 3x2 + . . . = ∞ X nxn−1 n=0 and multiplying both sides by x, we have x = x + 2x2 + 3x3 + . . . 2 (1 − x) and hence x + 2x2 + 3x3 . . . = ∞ X nxn = n=1 ∞ X n=0 121 nxn = x (1 − x)2 (14.7) where it does not matter whether or not we include the n = 0 term — it is zero anyway. Replacing x with −x in the above gives 2 3 4 x−2x +3x −4x +. . . = ∞ X (−1)n+1nxn = n=1 x (14.8) (1 + x)2. We now compute the z-transforms of some well-known functions. In all the following, a, b are constants and n is an integer. 14.2.1 f (n) = an This corresponds to a sampled version of f (t) = bt — then f (nT ) = (bT )n = an with a = bT . From the definition, Z [an] = ∞ X an z −n = a n=0 ∞ X nz −n , n=1 where the second sum starts from 1 rather than 0 because the n = 0 term is zero. Now we can use equation (14.7) to obtain Z [an] = a 14.2.2 az 1/z = . (1 − 1/z)2 (z − 1)2 f (n) = δ(n), the unit impulse We need to be careful with definitions here: in the discrete case, the function δ(n), which we call the unit impulse to avoid confusion with the Dirac delta function, is defined as δ(n) = ( 1 n=0 0 otherwise. 122 Note that this is different from the Dirac delta function. With this definition in mind, it is easy to see that ∞ X Z [δ(n)] = δ(n) z −n = 1 · z 0 = 1. n=0 14.2.3 f (n) = u(n), the unit step function The unit step function is defined as ( 1 n≥0 u(n) = 0 otherwise We then have ∞ ∞ X X z 1 −n Z [u(n)] = u(n) z = = . z −n = 1 − 1/z z − 1 n=0 n=0 Note that, since all our functions start at n = 0, a constant c is written as c u(n) and so the z-transform of c is cz/(z − 1). 14.2.4 f (n) = an This corresponds to a sampled version of f (t) = e −bt : then f (nT ) = n e −bnT = e −bT , and putting a = e −bT , we have f (n) = an. 
Directly from the definition, equation (14.3), we have ∞ ∞ X X z 1 Z [an] = = anz −n = (a/z)n = 1 − a/z z − a n=0 n=0 where we have used equation (14.5) to calculate the infinite sum. 14.2.5 f (n) = cos an jx −jx /2. Hence, the z-transform of Remember that cos x = e + e cos an, by definition, is ∞ ∞ X 1 X jan −n e + e −jan z −n . Z [cos an] = cos an z = 2 n=0 n=0 123 We can therefore use the previous result: z z 2z 2 − ze −ja − ze ja 2Z [cos an] = . + = z − e ja z − e −ja z 2 − ze −ja − ze ja + 1 Hence, z(z − cos a) Z [cos an] = 2 . z − 2z cos a + 1 The z-transform of sin an can be found in an analogous way. Problems, chapter 14 1. Find the z-transform of the sequences (i) a = [1, −1, 3, 0, 2] (read this as a(0) = 1, a(1) = −1 etc. ); and (ii) b = [0, 0, 1, 0, 2]. [(i) A(z) = 1 − z −1 + 3z −2 + 2z −4, (ii) B(z) = z −2 + 2z −4] 2. Find the z-transform of sin an. 3. Find the z-transform of bn cos an. Hint: easiest is to look at the real part of [z sin a/(z 2 − 2z cos a + 1)] P∞ ja n n=0 (be /z) . 2 [z(z − b cos a)/(z − 2bz cos a + b2)] 4. Sketch the functions u(n), u(n − 2) and u(n) − u(n − 2). Hence deduce Z [u(n) − u(n − 2)]. [1 + z −1] (You’ll see another way to solve this problem in the next chapter.) 5. Define the finite sequence f (0) = 1, f (1) = 2, f (2) = 3, f (i) = 0, i > 2. Write this sequence (i) as a sum of delta functions (unit impulses) and (ii) as a sum of unit step functions. A sketch may help. Using the answer to part (i) and the definitions of the z-transform and the unit impulse, find the z-transform of the sequence f (n). [(i) δ(n) + 2δ(n − 1) + 3δ(n − 2), (ii) u(n) + u(n − 1) + u(n − 2) − 3u(n − 3). F (z) = 1 + 2z −1 + 3z −2] 6. Find the z-transform of the infinite sequence f (n) = 1/n!, n ≥ 0. (Hint: ex = 1/0! + x/1! + x2/2! + . . .) [F (z) = e(1/z)] 7. Find the z-transform of the infinite sequence f (n) = 1/(n + 1), n ≥ 0. (Hint: ln(1 + x) = x − x2/2 + x3/3 . . .] 124 [F (z) = −z ln(1 − 1/z)] Chapter 15 The Z transform: properties, inversion 15.1 z-transform properties Like the Fourier and Laplace transforms, the z-transform has many useful properties, some of which we derive in this chapter. 15.1.1 Linearity/superposition If c1 and c2 are constants and f1(n) and f2(n) are given functions, then the linearity property states that Z [c1 f1(n) + c2 f2(n)] = c1 F1(z) + c2F2(z) (15.1) where F1(z) = Z [f1(n)] and F2(z) = Z [f2(n)]. This is proved directly from the definition the the z-transform. Example 15.1 If g(t) = 1 − e −at , find G(z). First of all, note that g(nT ) = 1 − e −anT so letting b = e −aT we have g(n) = 1 − bn . Remembering that our functions start at t = 0, we have that g(n) is the sum of the two functions u(n) and −(bn). The z-transforms of these are z/(z − 1) and −z/(z − b) respectively, so, using superposition, z 1 − e −aT z z z(1 − b) G(z) = − = = . z − 1 z − b (z − 1)(z − b) (z − 1) (z − e −aT ) 125 15.1.2 Time delay This property is analogous to the Fourier transform time shift property. It is stated as follows: If Z [f (n)] = F (z) then Z [f (n − m)] = z −m F (z) where m ≥ 0 is an integer. It can be proved as follows. By definition, ∞ X Z [f (n − m)] = f (n − m)z −n n=0 = z 0f (−m) + z −1 f (1 − m) + . . . + z −m f (0) + z −m−1 f (1) + . . . and, since f (n) = 0 for n < 0, we have 0 Z [f (n − m)] = z × 0 + z −1 × 0 + ... + z −m = z −m F (z). 15.1.3 ∞ X f (n)z −n n=0 Time advance This property is also analogous to the Fourier time shift property. 
In what follows, we only need time advances of 1 × T and 2 × T , in which case the time advance property is: If Z [f (n)] = F (z), then Z [f (n + 1)] = zF (z) − zf (0) and Z [f (n + 2)] = z 2F (z) − z 2 f (0) − zf (1). The proof follows directly from the definition. We have Z [f (n + 1)] = ∞ X f (n + 1) z −n = f (1) + f (2)z −1 + f (3)z −2 + . . . n=0 126 = zF (z) − zf (0), using equation (14.3). The proof for Z [f (n + 2)] works in the same way. 15.1.4 Multiplication by an exponential sequence This property is analogous to the Fourier time scaling property, and is stated as follows: If Z [f (n)] = F (z) then Z [an f (n)] = F (z/a) Again, the proof follows directly from the definition. We have that Z [anf (n)] = ∞ X n a f (n) z −n = n=0 ∞ X f (n) (z/a)−n = F (z/a). n=0 Example 15.2 Given that Z [cos an] = (z 2 − z cos a)/(z 2 − 2z cos a + 1), we can deduce that Z [bn cos an] = 15.1.5 z 2 − zb cos a (z/b)2 − (z/b) cos a = . (z/b)2 − 2(z/b) cos a + 1 z 2 − 2zb cos a + b2 Differentiation property Like Fourier and Laplace transforms, the z-transform has a differentiation property, which is stated as follows: If Z [f (n)] = F (z) then Z [n f (n)] = −z dF (z) . dz The proof goes as follows. By definition, F (z) = ∞ X n=0 f (n) z −n ∞ X dF =− n f (n) z −n−1 . so dz n=0 127 Multiplying by −z gives ∞ dF (z) X = {nf (n)} z −n , −z dz n=0 which is clearly the z-transform of nf (n). Example 15.3 Using Z [an ] = z/(z − a), find Z [nan ] and Z n2an . We have that Z [nan ] = −z d/dz {z/(z − a)}. Using the derivative of a quotient rule, this gives za Z [nan ] = . (z − a)2 Differentiating again and multiplying the result by −z, we find h Z n2an = h 15.1.6 i i za(z + a) . (z − a)3 Initial Value Theorem If you know the z-transform of a function f (t), you can compute the value of the function at t = 0, using the Initial Value Theorem. This is If Z [f (n)] = F (z) then f (0) = z→∞ lim F (z). (15.2) The proof of this is straightforward. From the definition, F (z) = f (0) + z −1 f (1) + z −2 f (2) + . . . and if z → ∞, we are left with f (0). Example 15.4 If f (t) = cos at so f (n) = cos an, then f (0) = cos 0 = 1. The Initial Value Theorem gives the same result: F (z) = and so z(z − cos a) 1 − (1/z) cos a = z 2 − 2z cos a + 1 1 − (2/z) cos a + 1/z 2 lim F (z) = 1 z→∞ as expected. 128 15.1.7 Final Value Theorem The Final Value Theorem is a similar type of result to the Initial Value Theorem, but takes longer to prove. The theorem is If Z [f (n)] = F (z) then n→∞ lim f (n) = lim (z − 1)F (z). (15.3) z→1 The proof requires the Time Advance Theorem, which we have already seen. Let f (t) be a function of time whose z-transform exists, so that the P −n series ∞ converges. Now consider n=0 f (n) z Z [f (n + 1) − f (n)] = lim n→∞ n X i=0 z −i f (i + 1) − z −i f (i) = n→∞ lim −f (0) + {f (1) − z −1 f (1)} + {z −1 f (2) − z −2 f (2)} −2 2 −3 +{z f (3)/z − z f (3)} + . . . + z −n f (n + 1) = n→∞ lim −f (0) + z −n f (n + 1) + (1 − 1/z)f (1) n Therefore, −1 −2 +z (1 − 1/z)f (2) + z (1 − 1/z)f (3) + . . . lim Z [f (n + 1) − f (n)] = n→∞ lim {−f (0) + f (n + 1)} , z→1 (15.4) since, if z → 1, then (1 − 1/z) → 0 and z i → 1 for any i. Also, from the time advance property, we have the additional fact that lim Z [f (n + 1) − f (n)] = lim {zF (z) − zf (0) − F (z)} z→1 z→1 = −f (0) + lim (z − 1)F (z), z→1 129 and this is the same thing as equation (15.4). 
As f (0) is a constant, we have lim {−f (0) + f (n + 1)} = −f (0) + n→∞ lim f (n + 1) n→∞ = −f (0) + lim (z − 1)F (z) z→1 and, noting that limn→∞ f (n + 1) is the same thing as limn→∞ f (n), the Final Value Theorem follows. Example 15.5 We have seen in example 15.1 that Z [1 − an ] = z(1 − a) . (z − 1)(z − a) Now, limn→∞ 1 − an = 1 if |a| < 1. The Final Value Theorem confirms this: lim(z − 1) × z→1 15.2 z(1 − a) z(1 − a) = lim = 1. (z − 1)(z − a) z→1 (z − a) Inversion of the z-transform Much like the approach we used for inverting Laplace transforms, we can often use tables for inverting the z-transform. We can also use the properties described above, as well as partial fractions and power series — sometimes a combination of all of these is necessary. Finding inverse transforms can be anything from easy to quite complicated. Easy examples include the case where the transformed function is a polynomial in z −1 . Examples of all kinds are given in what follows. A small table of z-transforms, taken from the “Tables of Constants, formulae and transforms” used in exams, is included below. 130 f (t) Laplace Transform F (s) Z Transform F (z) δ(t) 1 1 Unit impulse H(t) Heaviside function or unit step 1 s z z−1 t 1 s2 Tz (z − 1)2 t2 2 s3 tn T 2 z(z + 1) (z − 1)3 n lim (−1)n ∂ n ∂a a→0 n! sn+1 z z − e −aT e −at 1 s+a z z − e −aT te −at 1 (s + a)2 T z e −aT (z − e −aT )2 sin ωt ω s2 + ω 2 z sin ωT z 2 − 2z cos ωT + 1 cos ωt s s2 + ω 2 z(z − cos ωT ) z 2 − 2z cos ωT + 1 e −at sin ωt ω (s + a)2 + ω 2 z e −aT sin ωT z − 2z e −aT cos ωT + e −2aT e −at cos ωt (s + a) (s + a)2 + ω 2 z 2 − z e −aT cos ωT z 2 − 2z e −aT cos ωT + e −2aT 1 − e −at a s(s + a) z(1 − e −aT ) (z − 1)(z − e −aT ) 2 You should note carefully how the information is presented in this version of the table: in particular, f (t) is given in the left-hand column, not f (n). Look back at section 14.2 to see why, for instance, f (t) = e −bt , corresponds exactly to f (n) = an, by choosing a = e −bT . 131 f (n) Z Transform F (z) δ(n) [Unit impulse] 1 H(n) [Unit step] z z−1 n z (z − 1)2 n2 z(z + 1) (z − 1)3 nk k lima→0 (−1)k ∂ k ∂a bn z z−b nbn zb (z − b)2 sin an z sin a z 2 − 2z cos a + 1 cos an z(z − cos a) z 2 − 2z cos a + 1 bn sin an zb sin a z 2 − 2zb cos a + b2 bn cos an z 2 − zb cos a z − 2zb cos a + b2 1 − bn z(1 − b) (z − 1)(z − b) z z − ea 2 By contrast, in this table, f (n) is given and not f (t) — the DSP way, as opposed to the CE way. 15.2.1 Finite sequences Example 15.6 Find the inverse z-transform (i.z.t.) of F (z) = 1 + 2z −1 − 7z −3. • This is a finite degree polynomial in z −1. By the definition of the ztransform, it should be clear that 1 2 n=0 n=1 f (n) = −7 n = 3 0 otherwise. 132 We now look at some examples in which the function to be inverted, F (z), is close to a form that is in the tables. Always bear in mind also the z-transform properties derived in the first part of this chapter. Example 15.7 Find the i.z.t. of F (z) = z/(z + b). • Note that Z [an ] = z/(z − a). • Now substitute b = −a to obtain Z [(−b)n] = z/(z + b). Hence, the i.z.t. of F (z) = z/(z + b) is f (n) = (−b)n. Example 15.8 Find the i.z.t. of F (z) = z/(z − b)2. • Use the fact that Z [nbn] = zb/(z − b)2 (see Example 15.3). • Thus, using the linearity property, we can divide both sides by b to get Z [n(bn)/b] = z/(z − b)2. Hence, the i.z.t. of F (z) = z/(z − b)2 is f (n) = n bn−1. Example 15.9 Find the i.z.t. of F (z) = z/(z 2 + b2 ). • Use the fact that Z [bn sin an] = zb sin a/(z 2 −2zb cos a+b2 ) (from tables). 
• To get rid of the 2zb cos a term in the denominator, set a = π/2, since cos π/2 = 0. • Hence, Z [bn sin(nπ/2)] = zb/(z 2 + b2), since sin π/2 = 1. • Divide both sides by b to get the final result. Hence, the i.z.t. of F (z) = z/(z 2 + b2 ) is f (n) = bn−1 sin(nπ/2). Here is an example where we first use partial fractions, then use the tables. Example 15.10 Find the i.z.t. of F (z) = 2z/((z − 1)(z − 3)). • First convert this to partial fractions. Note that you want, if possible, a form that is in the tables, so look for partial fractions in the form1 1 2z Az Bz = + . (z − 1)(z − 3) z − 1 z − 3 F (z) is also equal to 1/(z − 1) + 3/(z − 3), but the form 1/(z − b) isn’t in the tables. 133 • This gives A = −1, B = 1 (check this), so z −z + . F (z) = z−1 z−3 • Each of these parts is in the tables: −z/(z − 1) corresponds to −u(n) and z/(z − 3), to 3n . Hence, the i.z.t. of F (z) = 2z/((z − 1)(z − 3)) is f (n) = 3n − u(n). (If n ≥ 0, this is the same as 3n − 1.) Here are a couple of examples where we use power series. This is a good method to use, not too difficult, and often easier than the alternatives. Always bear in mind equations (14.5) and (14.7) for finding power series. Example 15.11 Find the i.z.t. of F (z) = 1/(z + b)2. • From equation (14.8) we have that (1 + x)−2 = x−1 ∞ X (−1)n+1nxn = (−1)n+1nxn−1. n=1 n=1 • Hence (z + b)−2 = z −2 (1 + b/z)−2 = z −2 ∞ X ∞ X n=1 (−1)n+1n(b/z)n−1 = z −2 − 2bz −3 + 3b2z −4 − 4b3z −5 . . . • Remember the definition of the z-transform. The previous equation is clearly the transform of a function of n which has the values f (0) = f (1) = 0 and f (2) = 1, f (3) = −2b, f (4) = 3b2 . . . , f (n) = (n − 1)(−b)n−2, provided that n ≥ 1 (remember that b0 = 1.) • Therefore, we are nearly right if we say that f (n) = (n − 1)(−b)n−2, but this gives the wrong value for n = 0. We want f (0) = 0, but substituting n = 0 in (n − 1)(−b)n−2 gives −b−2. Hence the i.z.t. of F (z) = 1/(z + b)2 is f (n) = (n − 1)(−b)n−2 + b−2δ(n). Make sure you clearly understand how adding the term b−2δ(n) makes things work out right. 134 Example 15.12 Find the i.z.t. of F (z) = 1/(z 3 − 1). • By substituting x = z −3 in equation (14.5), we have 3 (z − 1) −1 −3 −3 −1 = z (1 − z ) =z −3 ∞ X z −3n = z −3 + z −6 + z −9 + . . . n=0 • Hence f (n) = 1 when n = 3, 6, 9, . . . and is zero otherwise. The answer in the form given above is perfectly adequate. However, if you want to be fancy, you could also write this as f (n) = (1 + 2 cos(2nπ/3))/3 − δ(n) — check that this gives you the right sequence — but this is not necessary. Example 15.13 Find the i.z.t. of F (z) = z/(z − b)2. (We have already done this using tables — see Example 15.8.) • By substituting x = bz −1 in equation (14.7), we have z(z − b)−2 = z −1 (1 − bz −1 )−2 = z −1 1 + 2bz −1 + 3b2z −2 + 4b3z −3 . . . , = 0 + z −1 + 2bz −2 + 3b2z −3 + 4b3z −4 . . . • Therefore, f (n) = 0, 1, 2b, 3b2, 4b3 . . . for n = 0, 1, 2, 3, 4 . . .. Hence the i.z.t. of F (z) = z/(z − b)2 is f (n) = nbn−1, as before. Enough examples: time for you to have a go. Problems, chapter 15 1. Find the z-transform of f (n) = n u(n) in two ways: (i) directly from the definition; and (ii) by using the differentiation property. [Both give F (z) = z/(z − 1)2.] 2. Find the inverse z-transform of F (z) = z −1 − z −2 + 2z −4 . [f (n) = 0, 1, −1, 0, 2 for n = 0, 1, 2, 3, 4 and f (n) = 0 for n > 4.] 3. By using the “multiplication by an exponential sequence” property, deduce the z-transform of f (n) = bn sin an directly from the z-transform of sin an. 
[F (z) = bz sin a/(z 2 − 2bz cos a + b2)] 135 4. By using the linearity property, deduce the z-transform of f (n) = sin(an+ φ), φ constant, directly from the z-transforms of sin an, cos an. [F (z) = z(z sin φ + sin(a − φ))/(z 2 − 2z cos a + 1)] 5. By using the linearity property, deduce the z-transform of f (n) = cosh an directly from the z-transforms of e an , e −an. [F (z) = z(z − cosh a)/(z 2 − 2z cosh a + 1)] 6. Find the inverse z-transform of F (z) = z 2 /(z +b)2. Use the time advance property and the z-transform of nbn from the tables. [f (n) = (n + 1)(−b)n] 7. Use partial fractions, then tables, to find the inverse z-transform of F (z) = 3z 2/((z − 1)(z + 2)). [Hint: try partial fractions in the form F (z) = Az/(z − 1) + Bz/(z + 2).] [f (n) = u(n) + 2(−2)n] 8. Use partial fractions, followed by tables, to find the inverse z-transform of F (z) = z 2 /(z 2 − 4). [f (n) = (2n + (−2)n)/2] 9. Find the inverse z-transform of F (z) = (1 − z −1 )(1 − 2z −2). [f (n) = 1, −1, −2, 2 for n = 0, 1, 2, 3 and is zero otherwise] 10. Use power series to find the inverse z-transform of F (z) = 1/[z(z − 1)]. [f (0) = f (1) = 0, f (n) = 1, n ≥ 2 or, equivalently, f (n) = u(n − 2)] 11. Use power series to find the inverse z-transform of F (z) = 2z/(2z − 1). [f (n) = 2−n] 12. Use power series to find the inverse z-transform of F (z) = 1/(z + b). [f (n) = (−b)n−1 + δ(n)/b] 13. Using power series, or otherwise, find the inverse z-transform of F (z) = (z + 2)/(z + 1). [f (n) = 1, 1, −1, 1, −1, . . ., or, equivalently, f (n) = 2δ(n) + (−1)n+1] 14. Use power series to find the inverse z-transform of F (z) = 1/(z 2 − 1). [f (n) = 0, 0, 1, 0, 1, 0 . . . or f (n) = (1 + (−1)n)/2 − δ(n)] 136 Chapter 16 The Z transform: applications 16.1 Introduction Having introduced a lot of new material about z-transforms in the previous two chapters, it is now time to see why they are important, by looking at what they can enable us to do. In Electronic Engineering, you are most likely to encounter z-transforms in Control Theory and Digital Signal Processing applications. We will therefore look at some very basic filtering problems in this chapter, but before that, we will discuss the use of the z-transform to solve difference equations. 16.2 Difference equations Before we see how to solve them, here a a few examples of difference equations. 1. The present value of an annuity after n periods, x(n), obeys the difference equation x(n + 1) = (x(n) + P )/(1 + r), where r is the interest rate and P is the amount of each payment. 2. The repeated drug dose model, in which the amount of the drug still in the body at the n-th period, x(n), obeys the difference equation x(n + 1) = ax(n) + b. Here, a is the fraction of the drug which is degraded by the body during one period, and b is the dose given per period. 137 3. The cumulative average of a sampled signal. Let the sampled signal be x(n), with n = 0, 1, 2, . . ., so that the cumulative average, y(n), is defined as n 1 X y(n) = x(n). n + 1 i=0 We divide by n + 1 because there are n + 1 values on the right hand side. Suppose we want to compute y(n) for all n: this formula seems to be telling us that we need to store all n + 1 values x(0) . . . x(n) in order to do this. Eventually we will run out of memory. We can get around this problem by being clever Pn and noting that (n + 1)y(n) = i=0 x(n), and so (n + 2)y(n + 1) = x(n + 1) + n X x(n) = (n + 1)y(n) + x(n + 1). i=0 Hence, (n + 1) y(n) + x(n + 2) . 
n+2 This is a more complicated difference equation than the previous two, and its solution depends on the entire sequence x(n). y(n + 1) = 16.3 Solving difference equations Difference equations are in several ways like differential equations, the main difference being that, in a differential equation, the unknown function, x(t), say, is a function of a continuous variable, t. The continuous variable t can take on any real value. By contrast, in a difference equation, the unknown function, x(n), say, is a function of a discrete variable n, which it is assumed will only take on the values 0, 1, 2, . . ., the non-negative integers (although the solution may in fact be meaningful for all integers). 138 Solving a difference equation poses a similar sort of problem to solving a differential equation. For example, suppose that the difference equation is x(n + 1) = 2x(n). A solution, if we can find one, will be a function x(n) that satisfies this for all integers n ≥ 0. It should be clear that, for a given value of x(0), we have x(1) = 2x(0), x(2) = 2x(1) = 22x(0), x(3) = 2x(2) = 23x(0), . . . and from this you should be able to spot the general pattern, which is that x(n) = 2nx(0). Note that • This is a first order difference equation: x(n + 1) is a function of x(n) only. • Once we have specified a value of x(0), the solution is determined for all integers n ≥ 0 — this is just like a first order differential equation, where we need one initial condition to specify a particular solution. Thus, suppose that we have an initial condition, x(0) = 5 say. Then the difference equation x(n + 1) = 2x(n) with x(0) = 5 has the solution1 x(n) = 5 × 2n. You might think that was a rather easy problem with an obvious solution, so consider instead the first order difference equation Example 16.1 x(n + 1) = 2x(n) + 3n . This is harder, because we have an additional function of n on the right hand side. We use the z-transform in a way that should remind you of the use of Laplace transforms to solve differential equations, by going through the following steps: 1. Find the z transform of the difference equation. In this case, we have zX(z) − zx(0) = 2X(z) + z/(z − 3). The general solution is x(n) = x(0)2n ; the particular solution, when x(0) = 5, is x(n) = 5 × 2n . Even the terminology is the same as for differential equations. 1 139 Note that we have used the Time Advance Property (see page 126) to find the z-transform of x(n + 1). 2. Solve this equation for X(z): X(z) = z z x(0) + . z−2 (z − 3)(z − 2) 3. If necessary, manipulate this expression so that the inverse z-transform can easily be found. In this case, it is best is to use partial fractions for the second term; we then have z z z = − (z − 3)(z − 2) z − 3 z − 2 (check this) and so F (z) = z z z x(0) + − . z−2 z−3 z−2 4. Now find the inverse z-transform, which will give us x(n). In this case, x(n) = 2nx(0) − 2n + 3n = 2n (x(0) − 1) + 3n. (Look back at example 15.7 in the previous chapter if you need to remind yourself of the i.z.t. of z/(z − a).) You can easily check this: if it’s true, then, for any x(0), x(n + 1) − 2x(n) = 2n+1(x(0) − 1) + 3n+1 − 2n+1(x(0) − 1) − 2 × 3n = 3 × 3n − 2 × 3n = 3n which is what it should be, according to the difference equation. As a second example, let’s look at the repeated drug dose model. Example 16.2 Find the general solution to the difference equation x(n+1) = ax(n) + b. Under what conditions does x(n) tend to a finite limit as n → ∞, and what is this limit? 
Bear in mind that the constant b on the right hand side, as far as the ztransform is concerned, is u(n)b. Then the z-transform of this equation is zX(z) − zx(0) = aX(z) + 140 bz z−1 (again, using the time advance property) so z bz z b z z x(0) + = x(0) + − X(z) = z−a (z − a)(z − 1) z − a 1−a z−1 z−a ! where we have used partial fractions to get the last form. Hence, b (u(n) − an ) x(n) = a x(0) + . 1−a It is clear from this expression that x(n) tends to a finite limit as n → ∞ only if |a| < 1 (so an → 0), and then the limit is b/(1 − a) — remember that u(n) = 1 for n ≥ 0. You could also obtain this last result by applying the Final Value Theorem to X(z): try it and see. n As a third example, let’s try a second order difference equation. You’ll need to remember the Time Advance Property with an advance of 2 steps as well as 1 step. Example 16.3 Find the general solution to the difference equation x(n+2) = 2x(n + 1) + 3x(n), with general initial conditions (that is, x(0), x(1) can be anything). Under what conditions does this solution not blow up as n → ∞? Follow through the steps in the usual way. The z-transform of the equation is z 2 X(z) − z 2 x(0) − zx(1) = 2X(z) − 2zx(0) + 3X(z). Hence, and so z 2 − 2z − 3 X(z) = (z + 1)(z − 3)X(z) = z(z − 2)x(0) + zx(1) z(z − 2) z + x(1) . (z + 1)(z − 3) (z + 1)(z − 3) As usual, we now need partial fractions. Again, we seek fractions of the form z/(z ± a), because we know that this form is in the tables — it has an inverse (∓a)n. The partial fraction form is X(z) = x(0) x(1) z z x(0) 3z z + X(z) = + − 4 z−3 z+1 4 z−3 z+1 from which it is easy to see that ! x(n) = (−1)n x(0) + x(1) 3x(0) − x(1) + 3n . 4 4 141 ! This will blow up (i.e. x(n) will tend to ∞ as n increases), because of the 3n term, unless x(1) = −x(0). In that case only, x(n) = (−1)nx(0), which remains finite for all n. Finally, another second order difference equation. Example 16.4 Find the general solution to the difference equation x(n + 2) − 2x(n + 1) cos a + x(n) = 0, with general initial conditions. Here, a is a constant. Start in the usual way. The z-transform of the difference equation this time is z 2 X(z) − z 2 x(0) − zx(1) − 2z cos aX(z) − 2z cos ax(0) + X(z) = 0 so x(0)z 2 + z x(1) − 2z x(0) cos a . z 2 − 2z cos a + 1 Looking in the tables, you will recognise the denominator from the z-transforms of both sin an and cos an. Let us guess that the inverse z-transform of X(z) is of the form x(n) = c1 sin an + c2 cos an, where c1 , c2 are constants to be determined. Now, from the tables, Z [c1 sin an + c2 cos an] = X(z) = c1 z sin a + c2 z(z − cos a) c2 z 2 + c1 z sin a − c2 z cos a = , z 2 − 2z cos a + 1 z 2 − 2z cos a + 1 and matching the coefficients of z in the numerator, to those in the expression for X(z) (since the denominators are the same), we have c2 = x(0) and c1 sin a − c2 cos a = x(1) − 2x(0) cos a which we can solve for c1 , c2 . This gives, finally, x(n) = 16.4 x(1) − x(0) cos a sin an + x(0) cos an. sin a A FIR filter We now look very briefly at a simple digital signal processing (DSP) application of the z-transform: a Finite Impulse Response (FIR) 142 filter. The purpose of this section is to give you just a taste of why the z-transform is important in DSP applications — you will learn much more about this if you choose the relevant Year 3/MSc options. The filter we will discuss is a band-stop filter, which is one that attenuates all frequencies within a range, while letting all other frequencies through. 
We have in fact already seen, in the last example of the previous section, the basis on which a band-stop filter works. The basis is this: imagine that a signal of the form x(n) = c1 sin an + c2 cos an, for a given a, is the input to a system which computes y(n) = x(n) − 2x(n − 1) cos a + x(n − 2). Then we know from example 16.4 that the output sequence y(n) will be zero. In other words, this system filters out the particular signal x(n) defined above (and signals which are close by). Input 1 0 -1 0 1.5 200 300 400 500 300 400 500 0.28 1.0 Output 100 0.26 0.5 0.24 0.0 -0.5 0 100 200 n Figure 16.1: The filtering example. Top: input signal x(n) = x √1 (n)+x2 (n) = 0.5 sin 0.5n+cos 0.05n. Bottom: the output signal y(n) = y1 (n) + y2 (n) = x(n) − 3x(n − 1) + x(n − 2), showing that x1 (n) has been almost filtered out. To see how this works, let us set a = π/6, so that 2 cos a = 143 √ 3. Then the output y(n), for n ≥ 2, will be given by √ y(n) = x(n) − 3x(n − 1) + x(n − 2). We know, from example 16.4, that if the input is x(n) = c1 sin nπ/6+ c2 cos nπ/6, for any constants c1 and c2, then the output will be zero. This is easily checked. Let’s √ set c1 = 1, c2 = 0 to simplify things. Then y(n) = sin nπ/6 − 3 sin(n − 1)π/6 + sin(n − 2)π/6, and expanding the terms, we have √ √ y(n) = sin nπ/6 − 3 sin nπ/6 cos π/6 − 3 cos nπ/6 sin π/6+ sin nπ/6 cos π/3 − cos nπ/6 sin π/3 √ √ √ √ = 1 − 3 3/2 + 1/2 sin nπ/6 + − 3/2 + 3/2 cos nπ/6 = 0 as it should. This would work for any values of c1, c2 . What happens if the input consists of two signals, one close to sin nπ/6 and one far away? Let’s take, as an example, x(n) = x1(n) + x2(n), where x1(n) = 0.5 sin 0.5n and x2(n) = cos 0.05n — see figure 16.1, top. Then x1(n) is close to sin nπ/6 (because 0.5 is close to π/6 ≈ 0.524). On the other hand, the x2(n) is far away from cos nπ/6. In this case, the output y(n) consists of a phase- and amplitudemodified version of x2(n), with almost no trace of x1(n): x1(n) has effectively been filtered out — see figure 16.1, bottom. We can calculate the amplitudes of both components as follows. Since this is a linear system, superposition applies and we can write the output y(n) = y1(n)+y2 (n), where y1(n)√is the response to x1(n) and y2 (n), to x2(n). Then y1 (n) = x1(n) − 3x1(n − 1) + x1(n − 2) √ = 0.5 sin 0.5n − 0.5 3 sin 0.5(n − 1) + 0.5 sin 0.5(n − 2) √ = 0.5(1 − 3 cos 0.5 + cos 1) sin 0.5n + √ 0.5( 3 sin 0.5 − sin 1) cos 0.5n = 0.0102 sin 0.5n − 0.0056 cos 0.5n. 144 √ The amplitude of y1(n) is therefore 0.01022 + 0.00562 = 0.0116. You should also be able to estimate this amplitude from the magnified portion in figure 16.1. Carrying out the same calculation for x2(n) shows that the amplitude of y2 (n) is 0.26, about 20 times bigger than the amplitude of y1 (n), which again agrees with figure 16.1. Problems, chapter 16 1. Use z-transforms to solve the following difference equations: (i) x(n + 1) = 3x(n) with x(0) = 5 (ii) x(n + 1) = −2x(n) + 3u(n) with x(0) = 0 (iii) x(n + 2) = 5x(n + 1) − 6x(n) with x(0) = u(n), x(1) = −1. [(i) x(n) = 5 · 3n , (ii) x(n) = 1 − (−2)n, (iii) x(n) = 4 · 2n − 3 · 3n] 2. Solve x(n + 2) = 3x(n + 1) − 2x(n) by the z-transform method, with general initial conditions. What relation must there be among the initial conditions in order for the solution to be constant for n ≥ 0? [x(n) = (2x(0) − x(1))u(n) + (x(1) − x(0))2n; constant if x(1) = x(0).] 3. Use the z-transform to solve the present value of an annuity difference equation x(n), which is x(n + 1) = (x(n) + P )/(1 + r). 
[x(n) = (1 + r)−n (x(0) − P/r) + P u(n)/r] 4. Find the difference equation whose solution is x(n) = 5n − 3n, given that it is of the form x(n + 2) + Bx(n + 1) + Cx(n) = 0. Find the initial conditions that give rise to this solution. [x(n + 2) − 8x(n + 1) + 15x(n) = 0, x(0) = 0, x(1) = 2] 5. Solve the difference equation x(n + 1) + 3x(n) = (−1)n. [x(n) = 21 (−1)n + (−3)n(x(0) − 21 )] 6. A digital filter computes its output, y(n), from its input, x(n), according to the formula y(n) = x(n) − x(n − 1) · 2 cos a + x(n − 2). (i) Find a such that this system filters out a signal of the form x(n) = c1 sin nπ/3. 145 (ii) Let the input signal be x1(n) + x2(n) = sin n + 8 sin(n/12). The output is of the form y(n) = y1(n) + y2(n) where y1(n) = A1 sin(n + φ1) and y2(n) = A2 sin(n/12 + φ2 ). Find the constants A1, A2, φ1 and φ2. Sketch the input and output waveforms. [(i) a = π/3, (ii) A1 = 0.0806, φ1 = −1.00, A2 = 7.94, φ2 = −0.0833] 146 Chapter 17 Matrices I 17.1 The basics A matrix is an n × m array of numbers; n rows, m columns. Examples: 1. 1 0 0 1 2. 3. 4. . . . 2 × 2 unit matrix v 1 v 2 v3 . . . 3 × 1 column vector 1.4 2 4 5 1 − 3j 2 a a12 11 a21 a22 . . . 2 × 3 matrix 147 . . . general 2 × 2 matrix We use the convention upper case A, B, C etc. for matrices, underlined letters a, b, c etc. for vectors and ordinary letters, a, b, c etc. for scalars. We now go through some of the rules of matrix algebra. 17.2 Matrix equality Two matrices A and B with the same number of rows and columns are said to be equal to each other if and only if all their corresponding elements are equal. For instance, if A and B are 2 × 2 matrices, then they are equal only if a11 = b11, a12 = b12, a21 = b21, and a22 = b22. 17.3 Matrix addition If two matrices A and B have the same number of rows and columns, they can be added by adding together corresponding elements. For example, if a a A = 11 12 a21 a22 then a + b11 a12 + b12 A + B = 11 . a21 + b21 a22 + b22 17.4 Matrix multiplication 17.4.1 Scalar × matrix = matrix Given and b b B = 11 12 b21 b22 a a A = 11 12 a21 a22 and 148 c = a scalar then ca ca12 cA = 11 ca21 ca22 i.e. the result is another matrix. Just multiply each element by c. 17.4.2 Given then Matrix × vector = vector a a A = 11 12 a21 a22 and v v = 1 v2 a v + a12v2 Av = 11 1 a21v1 + a22v2 i.e. the result is a column vector. N.B. Number of columns in A must equal number of rows in v. 17.4.3 Given then Matrix × matrix = matrix a a A = 11 12 a21 a22 and b b B = 11 12 b21 b22 a b + a12b21 a11b12 + a12b22 AB = 11 11 a21b11 + a22b21 a21b12 + a22b22 i.e. the result is a 2 × 2 matrix. N.B. Number of columns in A must equal number of rows in B. Note also that AB does not equal BA in general — ‘matrices do not commute’. (See problem 1) 17.5 Determinants You may have met these before. To recap, the determinant of a 2 × 2 matrix A, written as det A or |A|, is det A = a11a22 − a12a21 149 i.e. the determinant of a matrix is a number. What about a 3 × 3 matrix? This can be calculated as three 2 × 2 determinants as follows. Given a a a 12 13 11 a a A= a21 22 23 a31 a32 a33 then a a a a a a 22 23 21 23 21 22 −a +a det A = a11 det 12 det 13 det a32 a33 a31 a33 a31 a32 This is known as the Laplace development of a determinant. Remember it as 1. Pick a row or column (used the first row in the above). 2. Taking each element in this row or column in turn, delete the row and column in which it occurs, and find the determinant of the remaining 2 × 2 matrix. 3. Multiply this determinant by the element in (2), with signs + − + . . . 
+ − . . . − + − + . . . . . . .. .. .. and add up the three resulting numbers to obtain the determinant. This works for n × n matrices, but involves a lot of work for n > 3. (See problems 2 and 3) 17.6 Solving two linear equations Matrix algebra provides a systematic way of solving a set of simultaneous linear equations. For example, given two linear equations a11x1 + a12x2 = w1 150 (17.1) and a21x1 + a22x2 = w2 put these into matrix notation by defining a a x w A = 11 12 x = 1 w = 1 a21 a22 x2 w2 so that equations 17.1 and 17.2 together become x w a a 12 1 11 = 1 x2 w2 a21 a22 or, in matrix notation Ax = w. (17.2) (17.3) (See problem 4) Now, suppose that a11 . . . a22 and w1 , w2 are given, with w1 , w2 not both 0. To solve for x2, 1. Multiply 17.1 by a21 and 17.2 by a11 to get a11a21x1 + a12a21x2 = a21w1 a11a21x1 + a11a22x2 = a11w2. 2. Subtract these to get x2(a11a22 − a12a21) = a11w2 − a21w1 . 3. Note that a11a22 − a12a21 = det A so a11w2 − a21w1 . (17.4) x2 = det A Similarly for x1 : a22w1 − a12w2 . (17.5) x1 = det A Look at 17.4 and 17.5: they are same form as 17.1 and 17.2. We can write 17.4 and 17.5 together in matrix notation: 1 a22 −a12 w1 x1 = . w2 x2 det A −a21 a11 151 This equation looks just like 17.3, but with x and w swapped and matrix A replaced by 1 a22 −a12 . det A −a21 a11 This new matrix is known as the inverse of A, written A−1 or inv A. It has the property that A−1 A = AA−1 17.6.1 1 0 = I, the unit matrix, . 0 1 Properties of the unit matrix 1. The n × n unit matrix has 1s down the leading diagonal and 0s everywhere else. 2. If A is any n × n matrix and I is the n × n unit matrix, then AI = IA = A (just like multiplying numbers by 1). 3. For any column vector v with n rows, Iv = v. (See problem 5) 17.7 Application — Z and Y parameters The 2-port Z parameters, z11 . . . z22, are defined with reference to the figure below. i1 v1 i2 Two port network v2 The Z parameters are impedances z11 . . . z22 such that v1 = z11i1 + z12i2 152 and v2 = z21i1 + z22i2 or, in matrix/vector notation v = Zi (17.6) where v v = 1, v2 i i = 1, i2 z z Z = 11 12 . z21 z22 The Y parameters, y11 . . . y22, are admittances (reciprocal impedances) and are defined by: (17.7) i = Y v. Now, pre-multiplying 17.6 by Z −1 gives Z −1 v = Z −1 Zi = Ii = i and, comparing with 17.7 Y = Z −1 Hence, the Y (admittance) matrix is the inverse of the Z (impedance) matrix (and vice versa). (See problem 6) We shall have more to say about Z and Y parameters in the next chapter. 153 Problems, chapter 17 1. If b a a12 and B = 11 A = 11 b21 a21 a22 show that AB does not equal BA in general. ! b12 b22 ! 2. Evaluate the determinant of the following matrices: (a) 2 4 3 5 ! jωL R 1/R −jωC (b) 4.3 −2.2 1.2 1.4 5.9 (e) 2.1 7.3 4.9 −4.1 a 4a 2b d e (d) c 2a 8a 4b 1 2 3 2 3 1 (c) 3 1 2 ! [(a) −2, (b) ω 2 LC − 1, (c) −18 (d) 0 (e) −262.607] 3. Evaluate the determinant of 5 3 4 1 −3 0 7 2 −8 by expanding (a) along the second row, (b) down the first column. [(a) and (b) 236] 4. If calculate A−1 a11 a12 A= a21 a22 and show that AA−1 = A−1A = I, the 2 × 2 unit matrix. ! 5. Put the following three linear equations into matrix form Ax = w 2x1 − x2 + 5 = 3x3 −x3 + 8 = x1 + x2 x2 + 3x3 = 2 − x1 6. The admittance matrix for a 2-port network is −3j 0.5 Y = 2 −0.5j ! What is its impedance matrix? 
[z11 = 0.2j, z12 = 0.2, z21 = 0.8, z22 = 1.2j] 154 Chapter 18 Matrices II 18.1 Matrix inversion: Pi to T conversion Given Ya, Yb and Yc in the following Pi configuration Yb i1 v1 Ya i2 Yc v2 the problem is to find Za , Zb and Zc such that the following T circuit is equivalent to the Pi. Za Zc Zb Matrix manipulation provides a systematic solution to this problem, which known as the Pi–T or Delta–star transformation. The first thing to remember is that it is easy to write down Y -parameters for a Pi circuit, Z-parameters for a T circuit. 155 So, what are the Y -parameters for the Pi circuit? The definition we need is i = Y v, or, in full, i1 = y11 v1 + y12 v2 i2 = y21v1 + y22v2. So, for instance i1 v1 y11 = when v2 = 0 i.e. y11 = Ya + Yb . Similarly, y12 = i1 v2 when v1 = 0 i.e. y12 = −Yb. Repeating for y21 and y22 gives Y + Yb −Yb Y = a . −Yb Yb + Yc Now, what are the Z-parameters for T circuit? By the same method, we find that Z + Z Z a b b . Z= Zb Zb + Zc Now, we know that Z = Y −1 so 1 Z + Zb Zb Z= a = Zb Zb + Zc det Y Y + Yc Yb b Yb Ya + Yb where det Y = (Ya + Yb )(Yb + Yc) − Yb2 = YaYb + YbYc + YcYa. Two matrices are equal only when all their elements are equal, so in order for the Pi and T circuits to be equivalent, Y −1 for the Pi must 156 be equal to Z for the T, and so, considering each element in turn, the following must hold Zb = Yb/ det Y Za + Zb = (Yb + Yc)/ det Y, giving Za = Yc/ det Y Zb + Zc = (Ya + Yb)/ det Y, giving Zc = Ya/ det Y. which is the required answer. 18.2 Solving n linear equations Under certain conditions, for known matrix A and known vector w, the set of n linear equations Ax = w (18.1) can be solved for unknown vector x: x = A−1 w. The condition is that the inverse of A exists. Looking back at the last section of the previous chapter, we see that calculating the inverse of A requires us to divide by det A. The condition for the inverse of A to exist is therefore that det A is not equal to 0. Provided that this condition is met, we can find the inverse of A (in principle) and hence solve the n equations 18.1, as long as w 6= 0. Note that this is true regardless of how many equations there are. 18.3 Inverting an n × n matrix We have seen how to invert a 2 × 2 matrix. How is this generalised to larger matrices? There are several ways of doing this, one of which is known as the adjoint method, which is best shown by example. 157 Example 18.1 Use the adjoint method to invert a 3 × 3 matrix a11 A = a21 a31 The inverse of A is given by a12 a22 a32 a13 a23 . a33 AdjointA det A where the adjoint of A is calculated in two steps: A−1 = 1. Replace each element of A with its cofactor. To do this, for each element of A, cross out the row and column in which it appears, and find the determinant of the remaining matrix. Multiply this by +1 or −1, according to its position. e.g. The cofactor of a11 is a22 a33 − a23 a32 . e.g. The cofactor of a23 is −(a11a32 − a12 a31). The signs we need to multiply by are + − − + + − + − + for a 3 × 3 matrix. The matrix of cofactors of A is therefore a22 a33 − a23a32 −(a a − a a ) 12 33 13 32 a12 a23 − a13a22 −(a21a33 − a23 a31) a11 a33 − a13a31 −(a11a23 − a13 a21) a21a32 − a22 a31 −(a11a32 − a12 a31 ) a11a22 − a12 a21 2. 
Transpose the matrix of cofactors — that is, reflect it about the leading diagonal, to give a22 a33 − a23 a32 AdjA = −(a21 a33 − a23 a31 ) a21 a32 − a22 a31 −(a12a33 − a13 a32) a11 a33 − a13a31 −(a11a32 − a12 a31) a12 a23 − a13a22 −(a11 a23 − a13 a21 ) a11 a22 − a12a21 Dividing this by det A gives the inverse of A, provided that det A 6= 0. This method extends to n × n matrices, but involves a lot of work for n > 3. 158 18.4 The equation matrix × vector = 0 We have solved Ax = w when w is not the zero vector, 0. What about the equation Ax = 0? (N.B. By ‘0’ I mean the column vector with zeros everywhere.) That is, for a 2 × 2 matrix, a a A = 11 12 , a21 a22 x x = 1 x2 and so Ax = 0 becomes the pair of equations 0 0= 0 a11x1 + a12x2 = 0 (18.2) a21x1 + a22x2 = 0. (18.3) and There are two possibilities. The first is (a) x1 = x2 = 0 (obviously). However, another solution may also exist: first find x1 from 18.2: a12x2 (18.4) x1 = − a11 Substitute this in 18.3 a21a12x2 a a 21 12 − + a22x2 = − + a22 x2 = 0 a11 a11 so we can see that x2 is forced to be 0 unless a21a12 + a22 = 0 − a11 or, in other words, if a11a22 − a12a21 = 0. Recognise this? It’s det A, so the second possibility is that (b) det A = 0, in which case x1 and x2 are not forced to be 0. This is a general condition and applies to n linear equations, not just two. 159 In case (b), det A = 0, only the ratio x1/x2 is defined by 18.2 and 18.3. From 18.2 this ratio is x1 a12 =− . x2 a11 18.5 Application of matrix × vector = 0 L L i1 C i2 C C Figure 18.1: An application of Ax = 0. From Kirchhoff’s voltage law, we know that the sum of the voltages around closed loops is zero, so for the circuit in figure 18.1 Loop 1: i1 − i2 1 + jωL + =0 i1 jωC jωC (18.5) Loop 2: 1 i2 − i1 + jωL + = 0. i2 jωC jωC (18.6) and In matrix form, Zi = 0 so 1 2 − ω 2 LC −1 i 0 1 = . i2 −1 2 − ω 2LC 0 jωC The solution is either (a) i1 = i2 = 0 (true, but trivial) or (b) det Z = 0, which gives (2 − ω 2 LC)2 − 1 = 0 160 (18.7) i.e. 2 − ω 2 LC = ±1 so 1 3 , . LC LC This condition gives the two resonant frequencies of the circuit. As stated above, i1 and i2 aren’t fixed, but their ratio is: from either 18.5 or 18.6, i2 = 2 − ω 2LC = ±1. i1 The interpretation of this is that current i1 can have any magnitude; then i2 is of the same magnitude, but with the same or opposite sign (i.e. circulates in the same or the opposite direction). ω2 = 18.6 Eigenvalues and eigenvectors The equation (A − λI)x = 0 (18.8) where A is an n × n matrix, I is the n × n unit matrix, λ is a number and x is a vector, arises in problems in circuit theory and other branches of electronic engineering. Note the following about the nontrivial solutions λ and x to this equation: • There are n values of λ, which are known as the eigenvalues of A. • The eigenvalues can be real or complex, depending on A. • To each eigenvalue there corresponds a vector x, known as an eigenvector of A. • If x1 is an eigenvector, then ax1, where a is any constant, is also an eigenvector. 161 Example 18.2 Find the eigenvalues and corresponding eigenvectors of the matrix ! 2 1 A= . −2 5 Answer As we saw in the previous section, the equation Ax = 0 only has nontrivial solutions if det A = 0. Hence, nontrivial solutions to equation (18.8) can only be found if det " 2 1 1 0 −λ −2 5 0 1 ! !# = 0. We can solve this for λ, the eigenvalues: det " 2 1 1 0 −λ −2 5 0 1 ! !# 2−λ = det −2 1 5−λ ! = (2 − λ)(5 − λ) + 2 = λ2 − 7λ + 12 = 0 which has solutions λ = 3, 4. 
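For anything bigger than a 2 × 2 matrix you would normally let a computer do this. As a quick cross-check of the hand calculation, here is a sketch using NumPy (not part of the notes):

```python
import numpy as np

A = np.array([[ 2.0, 1.0],
              [-2.0, 5.0]])

# The eigenvalues are the roots of det(A - lambda*I) = lambda^2 - 7*lambda + 12:
print(np.linalg.eigvals(A))           # 3. and 4. (possibly in the other order)

# The characteristic polynomial can itself be checked from the trace and
# determinant, since it is lambda^2 - trace(A)*lambda + det(A):
print(np.trace(A), np.linalg.det(A))  # 7.0 and 12 (to rounding error)
```

(np.linalg.eig(A) would also return the eigenvectors, normalised to unit length; up to the arbitrary scaling constants, they agree with the vectors found next.)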
18.6 Eigenvalues and eigenvectors

The equation

    (A − λI) x = 0                                               (18.8)

where A is an n × n matrix, I is the n × n unit matrix, λ is a number and x is a vector, arises in problems in circuit theory and other branches of electronic engineering. Note the following about the nontrivial solutions λ and x to this equation:

• There are n values of λ, which are known as the eigenvalues of A.
• The eigenvalues can be real or complex, depending on A.
• To each eigenvalue there corresponds a vector x, known as an eigenvector of A.
• If x1 is an eigenvector, then ax1, where a is any constant, is also an eigenvector.

Example 18.2

Find the eigenvalues and corresponding eigenvectors of the matrix

    A = [ 2   1 ]
        [ −2  5 ].

Answer

As we saw in the previous section, the equation Ax = 0 only has nontrivial solutions if det A = 0. Hence, nontrivial solutions to equation (18.8) can only be found if

    det( [ 2   1 ]  −  λ [ 1  0 ] )  =  0.
         [ −2  5 ]       [ 0  1 ]

We can solve this for λ, the eigenvalues:

    det [ 2 − λ    1     ]  =  (2 − λ)(5 − λ) + 2  =  λ² − 7λ + 12  =  0
        [ −2       5 − λ ]

which has solutions λ = 3, 4.

To each of these values of λ there corresponds an eigenvector x, which is defined such that (A − λI)x = 0. Taking the eigenvalue λ = 3 gives

    (A − 3I) x  =  ( [ 2   1 ]  −  [ 3  0 ] ) [ x1 ]  =  [ −1  1 ] [ x1 ]  =  0.
                   ( [ −2  5 ]     [ 0  3 ] ) [ x2 ]     [ −2  2 ] [ x2 ]

Multiplying this out gives

    −x1 + x2 = 0   and   −2x1 + 2x2 = 0

both of which tell us that x1 = x2. Hence, the eigenvector corresponding to λ = 3 is

    x = a [ 1 ]
          [ 1 ]

for an arbitrary constant a.

The eigenvector corresponding to λ = 4 is calculated in the same way. It is

    b [ 1 ]
      [ 2 ]

with b another arbitrary constant.

18.7 Applications of eigenvalues/eigenvectors

The resonance problem considered in section 18.5 can also be treated as an eigenvalue problem. Multiplying equation 18.7 by jωC gives

    [ 2 − ω²LC    −1       ] [ i1 ]  =  [ 0 ]
    [ −1          2 − ω²LC ] [ i2 ]     [ 0 ]

so

    ( [ 2   −1 ]  −  ω²LC [ 1  0 ] ) [ i1 ]  =  [ 0 ]
    ( [ −1   2 ]          [ 0  1 ] ) [ i2 ]     [ 0 ]
         ↑           ↑       ↑          ↑          ↑
         A           λ       I          x          0.

You should recognise this as the eigenvalue equation again. The eigenvalues of the matrix A are LC × (the resonant frequencies of the circuit)². The eigenvectors of A are the currents i1 and i2. As pointed out before, only the ratio i1/i2 is fixed, not their actual values, which is also true for eigenvectors. For another application, see problems.

Problems, chapter 18

1.* The T–Pi transformation. Use matrix algebra to find the values of Ya, Yb and Yc, in terms of Za, Zb and Zc, that make the following two circuits equivalent:

[Figure: a T network with series impedances Za and Zc and shunt impedance Zb, and a Pi network with series admittance Yb and shunt admittances Ya and Yc]

[Ya = Zc/D, Yb = Zb/D, Yc = Za/D, where D = ZaZb + ZbZc + ZcZa]

2. (a) Put the following equations into the form Az = b:

    2z1 − 3z2 = 4
    9z2 − 6z1 = −12

Find det A. Can you solve the equations? Why?

(b) Put b = 0. Now what can you say about z1 and z2?

[(a) det A = 0. No. The 2nd eqn. is just −3 × the first, so it gives us no new information; det A = 0 is telling us this. (b) z1/z2 = 3/2.]

3. Consider the following circuit:

[Figure: a two-loop circuit; loop 1 (current i1) contains elements 2C and 2L, loop 2 (current i2) contains two inductors L, and the loops share a capacitor C]

Find the resonant frequencies and the corresponding values of i1/i2.

[ω² = (4 ± √6)/(10LC), i1/i2 = (2 ∓ 2√6)/(4 ± √6) = −0.449, 4.449]

4. Find the eigenvalues and corresponding eigenvectors for the following matrices:

    (a) [ 4  3 ]     (b) [ 2   3 ]     (c) [ 4   0  1 ]
        [ 2  5 ]         [ −1  2 ]         [ 0   4  7 ]
                                           [ −5  1  3 ]

[(a) 2, (−3/2, 1) and 7, (1, 1); (b) 2 ± j√3, (1, ±j/√3); (c) 4, (1, 5, 0); 2, (1, 7, −2) and 5, (1, 7, 1)]

5.* (a) The two-port network below has transmission parameter matrix T, which is defined so that

    [ v2 ]  =  [ t11  t12 ] [ v1 ]
    [ i2 ]     [ t21  t22 ] [ i1 ].

[Figure: a two-port network with port 1 quantities v1, i1 and port 2 quantities v2, i2; an external impedance Z0 is connected across port 2]

An external impedance Z0 is connected to port 2. If Z0 is chosen so that the input impedance (v1/i1) is also Z0, show that the eigenvalues of T are the ratios i2/i1. What do the eigenvectors of T correspond to?

(b) The network above has parameters

    T = [ 8  3 ]
        [ 1  8 ].

Find the values of Z0 such that the input impedance is Z0, and the corresponding current ratios, i2/i1.

[Z0 = ±√3, i2/i1 = 8 ± √3]
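If you want to check Example 18.2, or your answers to problem 4, numerically, numpy's eig routine returns the eigenvalues and unit-length eigenvectors directly. A quick sketch (note that numpy may list the eigenvalues in a different order):

# Numerical check of Example 18.2.
import numpy as np

A = np.array([[2.0, 1.0],
              [-2.0, 5.0]])
vals, vecs = np.linalg.eig(A)
print(vals)                        # 3 and 4 (possibly in the other order)
# numpy returns unit-length eigenvectors as the columns of `vecs`; rescaling
# each so its top entry is 1 should give (1, 1) and (1, 2), illustrating that
# an eigenvector is only defined up to a constant factor.
print(vecs[:, 0] / vecs[0, 0])
print(vecs[:, 1] / vecs[0, 1])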
Chapter 19

The wave equation

Aims

By the end of this chapter, you should understand

• what a partial differential equation is
• how to derive the wave equation for a transmission line
• how to find a general solution to the wave equation
• why signals propagate along lines.

19.1 Partial differential equations

You have met differential equations in the first year and they appear again in this course, in the Laplace transform chapters. In this chapter, we derive and discuss a partial differential equation, known as the wave equation, that crops up frequently. The one-dimensional wave equation is

    ∂²v/∂t² = c² ∂²v/∂x².                                        (19.1)

Points to note about it are:

• The unknown function, v = v(x, t), is a function of more than one variable. In this case v(x, t) is a function of distance, x, and time, t.
• In ordinary differential equations, the unknown function depends on only one variable; in partial differential equations, the unknown function depends on two or more variables.
• The constant c² has dimensions of velocity squared.

Other examples of partial differential equations include

• Laplace's equation

    ∇²v(x, y, z) = ∂²v/∂x² + ∂²v/∂y² + ∂²v/∂z² = 0

• The three-dimensional wave equation

    ∂²v/∂t² = c² ∇²v(x, y, z, t)

both of which arise in electromagnetic problems. In this chapter, we concentrate on the one-dimensional wave equation, 19.1.

19.2 Derivation of the wave equation

[Figure 19.1: A piece of coaxial cable, showing the inner conductor and the earthed outer shield.]

In this section we derive equation 19.1 for a coaxial cable which has an inner conducting core and an earthed outer shield, as shown in figure 19.1. We assume that there is no leakage between the inner conductor and the shield, and that the conductor has zero resistance. Suppose that the inner conductor has inductance L per unit length and capacitance C between it and the shield, also per unit length. Then a section of cable of length δx has inductance Lδx and capacitance Cδx.

Consider first the inductive behaviour of a length δx of cable, illustrated below.

[Figure: a cable element of length δx between positions x and x + δx, carrying current i, with voltage v at x and v + δv at x + δx.]

The voltage of the inner conductor is v at a distance x along the line, and v + δv at a distance x + δx. If the current is i, then, from the definition of inductance, we get

    v − (v + δv) = (Lδx) ∂i/∂t

(N.B. signs). Rearranging and letting δx → 0 gives

    − ∂v/∂x = L ∂i/∂t.                                           (19.2)

Now consider the capacitive behaviour of the same piece of line.

[Figure: the same element of length δx, with current i entering at x, current i + δi leaving at x + δx, and voltage v across the capacitance.]

The current in the core is i at a distance x along the line, and i + δi at a distance x + δx. If the voltage is v, then the capacitance equation gives δQ = (Cδx) v, and using the fact that i = dQ/dt we get

    i − (i + δi) = (Cδx) ∂v/∂t.

Rearranging and letting δx → 0 gives

    − ∂i/∂x = C ∂v/∂t.                                           (19.3)

We can now derive the wave equation from 19.2 and 19.3. Differentiating 19.2 with respect to x gives

    − ∂²v/∂x² = L ∂²i/∂x∂t

and differentiating 19.3 with respect to t gives

    − ∂²i/∂t∂x = C ∂²v/∂t².

Combining these and using the fact that ∂²i/∂t∂x = ∂²i/∂x∂t gives

    ∂²v/∂t² = (1/LC) ∂²v/∂x².                                    (19.4)

This is the wave equation. Remembering that L and C are the inductance/capacitance per unit length, you should show that 1/(LC) has the dimensions of velocity squared. In fact it can be shown (see electromagnetism course notes) that LC = ε0 εr μ0 μr. In an air-filled cable the relative permittivity/permeability εr = μr = 1, so the velocity is

    c = 1/√(ε0 μ0) ≈ 3.0 × 10⁸ m/s

which is the velocity of light.
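As a quick arithmetic illustration of this result (a sketch only; the per-metre L and C used below are assumed, typical-looking coaxial-cable values, not figures from these notes):

# Velocity check for section 19.2.
import math

eps0 = 8.854e-12       # F/m
mu0 = 4e-7 * math.pi   # H/m
print(1 / math.sqrt(eps0 * mu0))    # ~3.0e8 m/s, the velocity of light

# For a dielectric-filled cable (eps_r > 1), v = 1/sqrt(LC) comes out slower:
L = 250e-9             # H per metre (assumed)
C = 100e-12            # F per metre (assumed)
print(1 / math.sqrt(L * C))         # ~2.0e8 m/s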
19.3 The d'Alembert solution of the wave equation

Look again at the wave equation in the form 19.1 in which it was first given. Suppose f is a function, which is arbitrary except that it can be differentiated twice. Now consider f(x − c t). Differentiating:

    ∂f(x − c t)/∂x = f′(x − c t)      and    ∂²f(x − c t)/∂x² = f″(x − c t)
    ∂f(x − c t)/∂t = −c f′(x − c t)   and    ∂²f(x − c t)/∂t² = c² f″(x − c t)

(Here f′(x − c t) means "the derivative of f with respect to its argument, evaluated at x − ct". For instance, if f(x − ct) = (x − ct)³, then f′(x − ct) = 3(x − ct)² and f″(x − ct) = 6(x − ct).)

Looking at the second derivatives, we can see that f(x − c t) is a solution to the wave equation. That is,

    ∂²f(x − c t)/∂t² = c² ∂²f(x − c t)/∂x².

The same is true of another arbitrary, twice-differentiable function, g(x + c t) (see problems). In fact, the most general possible solution to 19.1 is the sum of the two:

    v(x, t) = f(x − c t) + g(x + c t).                           (19.5)

This is known as the d'Alembert solution of the wave equation. Points to note:

• f(x − c t) represents a wave travelling to the right (in the direction of increasing x) with velocity c.
• g(x + c t) represents a wave travelling to the left (decreasing x) with velocity c.
• Both f and g are arbitrary functions; hence there is a very wide range of possible solutions to the wave equation.
• Boundary conditions are needed to find f and g for a given situation (see below).
• All of this theory applies equally to plane electromagnetic waves in free space, waves on stretched strings etc., as well as waves on coaxial cables.

19.4 Boundary conditions

As is the case with ordinary differential equations, some initial information is needed to solve the wave equation in a particular case. This information is contained in the boundary conditions. Two boundary conditions are needed:

1. the voltage on the line at t = 0 for all x, which we shall call V0(x); and
2. the derivative of the voltage with respect to time, also at t = 0 and again for all x, which we shall call W0(x).

Given V0(x) and W0(x) we can find the solution v(x, t) for all x and t by finding the functions f and g appearing in 19.5. To put this in the form of an equation:

    general solution (f and g arbitrary) + boundary conditions V0(x) and W0(x)  →  particular solution v(x, t).

So, how do we find the functions f and g given the functions V0(x) and W0(x)? Let us write down the two things we know:

    V0(x) = [f(x − c t) + g(x + c t)]|t=0 = f(x) + g(x)          (19.6)

for the voltage at t = 0, and

    W0(x) = [∂f(x − c t)/∂t + ∂g(x + c t)/∂t]|t=0 = −c f′(x) + c g′(x)        (19.7)

for the derivative of voltage w.r.t. t at t = 0. We have two equations here, which we hope to solve for the two unknown functions f and g. To do this, first integrate (19.7) and divide by c to get

    −f(x) + g(x) = (1/c) ∫ from x0 to x of W0(s) ds              (19.8)

where x0 is an arbitrary constant which disappears later on. Now, subtracting (19.8) from (19.6) gives

    f(x) = (1/2) V0(x) − (1/2c) ∫ from x0 to x of W0(s) ds

and adding the same pair of equations gives

    g(x) = (1/2) V0(x) + (1/2c) ∫ from x0 to x of W0(s) ds.

Remembering that the solution, v(x, t), is f(x − c t) + g(x + c t), we obtain

    v(x, t) = (1/2)[V0(x − c t) + V0(x + c t)]
              − (1/2c) ∫ from x0 to x−ct of W0(s) ds + (1/2c) ∫ from x0 to x+ct of W0(s) ds.

However

    − ∫ from x0 to x−ct of W0(s) ds = + ∫ from x−ct to x0 of W0(s) ds

(swapping the limits changes the sign), so combining the two integrals gives, finally,

    v(x, t) = (1/2)[V0(x − c t) + V0(x + c t)] + (1/2c) ∫ from x−ct to x+ct of W0(s) ds.      (19.9)

To illustrate further, let us consider an example.

Example 19.1

A coaxial cable stretching to ±∞ has on it the initial voltage

    V0(x) = 1/(1 + x²)

and initial time derivative of voltage W0(x) = 0 at t = 0. Find the function v(x, t), which describes the voltage as a function of x and t, if the wave velocity for the cable is c.

Answer

We are being asked to solve the wave equation with the boundary conditions that, at t = 0,

    V0(x) = v(x, 0) = 1/(1 + x²)   and   W0(x) = 0.

The required solution to the wave equation is, from 19.9,

    v(x, t) = (1/2) V0(x − c t) + (1/2) V0(x + c t) + (1/2c) ∫ from x−ct to x+ct of 0 ds.

Hence,

    v(x, t) = 1/(2(1 + (x − ct)²)) + 1/(2(1 + (x + ct)²))

where c is the wave velocity for the cable. You can check that at t = 0, v(x, 0) = V0(x), as it should. You can also check that the time derivative is zero at t = 0, as required.
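Both of these checks, and the wave equation itself, can be verified symbolically in a few lines of Python with sympy. A minimal sketch of the verification (not part of the course demonstration described below):

# Verify the answer to Example 19.1: it satisfies the wave equation and
# both boundary conditions.
import sympy as sp

x, t, c = sp.symbols('x t c', real=True)
v = 1/(2*(1 + (x - c*t)**2)) + 1/(2*(1 + (x + c*t)**2))

wave_eq = sp.diff(v, t, 2) - c**2 * sp.diff(v, x, 2)
print(sp.simplify(wave_eq))                       # 0, so v solves the wave equation
print(sp.simplify(v.subs(t, 0) - 1/(1 + x**2)))   # 0, so v(x, 0) = V0(x)
print(sp.simplify(sp.diff(v, t).subs(t, 0)))      # 0, so dv/dt = 0 at t = 0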
19.5 What does it all mean?

The above example shows why a signal (a time-varying voltage) fed into one end of a coaxial cable of length l comes out at the other end a time l/c later.

The condition that f and g must be twice differentiable is automatically satisfied for all real voltage waveforms on transmission lines, even supposedly rectangular pulses. This is because it is not possible to create a perfectly sharp voltage edge, as this would imply a voltage with infinite first derivative. The reason this cannot happen is that there is always some stray capacitance Cs around any conductor, and i = Cs dv/dt would be infinite for an infinitely sharp edge. No source can generate an infinite current, which would require an infinite number of electrons to flow past a point (there aren't enough electrons in the Universe) or a finite number of electrons to flow with infinite velocity (which violates relativity). By a similar argument, v″ cannot be infinite because that would imply an infinite di/dt for Cs. Such a current flowing through any stray inductance would give rise to an infinite voltage across the inductance.

You can get a good idea what the solution to the example actually looks like by using a simple demonstration I have set up. This uses the animation function in the algebraic manipulation program xmaple. It animates the solution to example 19.1. I recommend you look at this. You can also modify the program yourself to see how the pictures change.

To see the demonstration:

1. Copy /vol/examples/teaching/engmaths2/wave eqn to your home directory. Call it wave eqn.
2. Type xmaple.
3. When the xmaple window comes up, type read wave eqn;.
4. After a short while a plot of a single-humped function will be produced. Click on the plot; a box appears around it and a second row of buttons appears. Click on the play button to see the two waves move off in opposite directions.

You can also use xmaple to differentiate v(x, t) with respect to t and substitute t = 0 in the result (simplify(subs(t = 0, diff(v, t)))), showing that this is 0, as it should be.

Problems, chapter 19

1. Show that v(x, t) = A sin(kx − ωt) + B cos(kx + ωt), with A and B arbitrary constants, is a solution to the wave equation. What is the wave velocity in this case?

[Velocity = ±ω/k]

2. Show that g(x + c t), where g is an arbitrary, twice-differentiable function, is a solution of the wave equation.

3. A string is given an initial displacement

    V0(x) = sin x / x.

The initial velocity of the string is everywhere zero. Find v(x, t), the function that describes the motion of the string after it is released at t = 0. Describe in words and a sketch what the motion looks like. Show that v(x, t) has the properties (a) v(x, 0) = V0(x) and (b) ∂v(x, t)/∂t = 0 at t = 0. To see what this solution looks like, use xmaple to animate v(x, t) for you.

[v(x, t) = (1/2)[sin(x − ct)/(x − ct) + sin(x + ct)/(x + ct)]]

4.* The solution to the wave equation with initial displacement V0(x) and initial time derivative W0(x) is

    v(x, t) = [V0(x − c t) + V0(x + c t)]/2 + (1/2c) ∫ from x−ct to x+ct of W0(s) ds.

If the initial voltage on an infinite coaxial cable is e^(−x²), what must the initial rate of change of voltage, W0(s), be in order that v(x, t) consists of only a single pulse of height 1, with shape e^(−x²), moving to the right? You are encouraged to use xmaple to animate this solution too.
2 [W0(s) = 2cse −s ] 176 Area under the Gaussian error curve The table below gives the area under the Gaussian error curve between 0 and z, where . z = |x−x̄| σ Example: For z = 1.72, area = 0.4573. z 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 2.0 2.1 2.2 2.3 2.4 2.5 2.6 2.7 2.8 2.9 3.0 3.1 3.2 3.3 3.4 3.5 3.6 3.7 3.8 0 .0000 .0398 .0793 .1179 .1554 .1915 .2257 .2580 .2881 .3159 .3413 .3643 .3849 .4032 .4192 .4332 .4452 .4554 .4641 .4713 .4772 .4821 .4861 .4893 .4918 .4938 .4953 .4965 .4974 .4981 .4987 .4990 .4993 .4995 .4997 .4998 .4998 .4999 .4999 .01 .0040 .0438 .0832 .1217 .1591 .1950 .2291 .2611 .2910 .3186 .3438 .3665 .3869 .4049 .4207 .4345 .4463 .4564 .4649 .4719 .4778 .4826 .4864 .4896 .4920 .4940 .4955 .4966 .4975 .4982 .4987 .4991 .4993 .4995 .4997 .4998 .4998 .4999 .4999 .02 .0080 .0478 .0871 .1255 .1628 .1985 .2324 .2642 .2939 .3212 .3461 .3686 .3888 .4066 .4222 .4357 .4474 .4573 .4656 .4726 .4783 .4830 .4868 .4898 .4922 .4941 .4956 .4967 .4976 .4982 .4987 .4991 .4994 .4995 .4997 .4998 .4999 .4999 .4999 .03 .0120 .0517 .0910 .1293 .1664 .2019 .2357 .2673 .2967 .3238 .3485 .3708 .3907 .4082 .4236 .4370 .4484 .4582 .4664 .4732 .4788 .4834 .4871 .4901 .4925 .4943 .4957 .4968 .4977 .4983 .4988 .4991 .4994 .4996 .4997 .4998 .4999 .4999 .4999 .04 .0160 .0557 .0948 .1331 .1700 .2054 .2389 .2704 .2995 .3264 .3508 .3729 .3925 .4099 .4251 .4382 .4495 .4591 .4671 .4738 .4793 .4838 .4875 .4904 .4927 .4945 .4959 .4969 .4977 .4984 .4988 .4992 .4994 .4996 .4997 .4998 .4999 .4999 .4999 177 .05 .0199 .0596 .0987 .1368 .1736 .2088 .2422 .2734 .3023 .3289 .3531 .3749 .3944 .4115 .4265 .4394 .4505 .4599 .4678 .4744 .4798 .4842 .4878 .4906 .4929 .4946 .4960 .4970 .4978 .4984 .4989 .4992 .4994 .4996 .4997 .4998 .4999 .4999 .4999 .06 .0239 .0636 .1026 .1406 .1772 .2123 .2454 .2764 .3051 .3315 .3554 .3770 .3962 .4131 .4279 .4406 .4515 .4608 .4686 .4750 .4803 .4846 .4881 .4909 .4931 .4948 .4961 .4971 .4979 .4985 .4989 .4992 .4994 .4996 .4997 .4998 .4999 .4999 .4999 .07 .0279 .0675 .1064 .1443 .1808 .2157 .2486 .2794 .3078 .3340 .3577 .3790 .3980 .4147 .4292 .4418 .4525 .4616 .4693 .4756 .4808 .4850 .4884 .4911 .4932 .4949 .4962 .4972 .4979 .4985 .4989 .4992 .4995 .4996 .4997 .4998 .4999 .4999 .5000 .08 .0319 .0714 .1103 .1480 .1844 .2190 .2517 .2823 .3106 .3365 .3599 .3810 .3997 .4162 .4306 .4429 .4535 .4625 .4699 .4761 .4812 .4854 .4887 .4913 .4934 .4951 .4963 .4973 .4980 .4986 .4990 .4993 .4995 .4996 .4997 .4998 .4999 .4999 .5000 .09 .0359 .0753 .1141 .1517 .1879 .2224 .2549 .2852 .3133 .3389 .3621 .3830 .4015 .4177 .4319 .4441 .4545 .4633 .4706 .4767 .4817 .4857 .4890 .4916 .4936 .4952 .4964 .4974 .4981 .4986 .4990 .4993 .4995 .4997 .4998 .4998 .4999 .4999 .5000 LAPLACE TRANSFORMS F (s) = Z∞ f (t)e −st dt 0 f (t) af1 (t) + bf2 (t) d dt f (t) d2 f (t) dt2 dnn f (t) dt F (s) aF1 (s) + bF2 (s) sF (s) − f (0) s2 F (s) − sf (0) − f ′ (0) sn F (s) − sn−1 f (0) − sn−2 f ′ (0) − . . . − f n−1 (0) F (s) s Rt Ru F (s) 0 0 f (v) dv du s2 d F (s) tf (t) − ds n tn f (t) n>0 (−1)n d n F (s) ds R∞ 1 f (t) 0 F (u) du Rt t F (s) G(s) 0 f (t − u)g(u) du = f (t) ∗ g(t) e at f (t) F (s − a) f (t − a) with f (t) = 0 for t < 0 e −as F (s) a>0 1f( t ) a>0 F (as) a a Re f (t) Re F (s) Im f (t) Im F (s) Rt 0 f (u) du f (t), where f (t + a) = f (t) f (t), where f (t + a) = −f (t) 1 1 − e −as Z a 1 1 + e −as Z a 178 f (t)e −st dt 0 0 f (t)e −st dt Laplace Transforms of Simple Functions H(t) δ(t) tn−1 (n − 1)! 
e −at f (t) (Heaviside function or unit step) (Dirac δ-function) F (s) 1 s 1 1 sn (n = 2, 3, 4 . . .) 1 (s + a) 1 (s + a)2 1 (s + a)n 1 (s + a)(s + b) 1 s2 + a2 s s2 + a2 1 s2 − a2 s s2 − a2 s−1/2 s−3/2 1 s(s2 + a2 ) 1 s2 (s2 + a2 ) 1 (s2 + a2 )2 s 2 (s + a2 )2 s2 2 (s + a2 )2 s2 − a2 (s2 + a2 )2 te −at tn−1 e −at (n − 1)! e −at − e −bt b−a 1 sin at a cos at 1 a sinh at cosh at (πt)−1/2 2( πt )1/2 1 (1 − cos at) a2 1 (at − sin at) a3 1 (sin at − at cos at) 2a3 t 2a sin at 1 2a (sin at + at cos at) t cos at 179 Laplace Transforms of Simple Functions (continued) −at √ f (t) e cos ωt where ω = b − √ 1 −at 2 ω e sin ωt where ω = b − a H(t − a) Heaviside function starting at t = a H(t) − H(t − a) rectangular pulse, equal to 1 from 0 to a 1 −at a (1 − e ) a2 b −at + a e −bt 1 ab 1 − b − a e b−a b(α − a) a(α − b) −bt 1 α− −at + b−a e ab b−a e 1 −at − be −bt ) a − b (ae 1 −at −bt (α − a)e − (α − b)e b−a F (s) s+a s2 + 2as + b 1 s2 + 2as + b 1 e −as s 1 (1 − e −as ) s 1 s(s + a) 1 s(s + a)(s + b) s+α s(s + a)(s + b) s (s + a)(s + b) s+α (s + a)(s + b) e −bt e −ct e −at + + (b − a)(c − a) (c − b)(a − b) (a − c)(b − c) 1 (s + a)(s + b)(s + c) (α − a)e −at (α − b)e −bt (α − c)e −ct + + (b − a)(c − a) (c − b)(a − b) (a − c)(b − c) s+α (s + a)(s + b)(s + c) α sin ωt cos ωt + ω s+α s2 + ω 2 s sin φ + ω cos φ s2 + ω 2 s+a s(s2 + ω 2) sin(ωt + φ) √ a − a2 + ω 2 cos(ωt + φ) ω2 ω2 where φ = arctan ω a −at e 1 √ + sin(ωt − φ) a2 + ω 2 ω a2 + ω 2 where a > 0, φ = arctan ω a e −at cos ωt 1 (s + a)(s2 + ω 2 ) s+a (s + a)2 + ω 2 s+b (s + a)2 + ω 2 1 −at ω e (ω cos ωt + (b − a) sin ωt) 180 Laplace Transforms of Simple Functions (continued) f (t) 1 1 −at − √ e sin(bt + φ) a2 + b2 b a2 + b2 where a > 0, φ = arctan ab √ 1 − q1 e −ζωt sin(ωt 1 − ζ 2 + φ) 2 ω ω2 1 − ζ 2 where φ = arccos ζ F (s) 1 s (s + a)2 + b2 α + e −at [(a2 − αa + b2 ) sin bt − αb cos bt] 2 a +b b(a2 + b2 ) s+α s (s + a)2 + b2 1 s(s + 2ζωs + ω 2 ) 2 2 −at be −ct + [(c − h a) sin bt − b icos bt] e b (c − a)2 + b2 −at sin(bt + φ) e −ct 1 + √ e q 2 2 − 2 2 c(a + b ) c (c − a)2 + b2 b a + b (c − a)2 + b2 b where a > c > 0, φ = arctan ab + arctan a − c 1 (s + c) (s + a)2 + b2 1 s(s + c) (s + a)2 + b2 (c − α)e −ct α + c(a2 + b2 ) c (c − a)2 + b2 q (α − a)2 + b2 q + √ e −at sin(bt + φ) 2 2 2 2 b a + b (c − a) + b s + α s(s + c) (s + a)2 + b2 1 (at − 1 + e −at ) a2 1 (1 − e −at − ate −at ) a2 1 (α − αe −at + a(a − α)te −at ) a2 1 s2 (s + a) 1 s(s + a)2 s+α s(s + a)2 b + arctan b + arctan b where α > a > c > 0, φ = arctan α − a a a−c 181
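If you ever need to verify an entry in the tables above, both kinds can be checked programmatically. A minimal Python sketch (using math.erf for the Gaussian area and sympy for one Laplace pair; the particular entries checked are arbitrary choices):

# Spot-check one entry from each table.
import math
import sympy as sp

# Gaussian table: z = 1.72 should give an area of about 0.4573.
# The area under the curve from 0 to z is erf(z / sqrt(2)) / 2.
print(0.5 * math.erf(1.72 / math.sqrt(2)))        # ~0.4573

# Laplace table: t * exp(-a*t) should transform to 1/(s + a)^2.
t, s, a = sp.symbols('t s a', positive=True)
print(sp.laplace_transform(t * sp.exp(-a*t), t, s, noconds=True))   # 1/(a + s)**2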