Some System Identification Challenges and Approaches

“Many basic scientific problems are now routinely solved by simulation: a fancy random walk is performed on the system of interest. Averages computed from the walk give useful answers to formerly intractable problems”
Persi Diaconis, 2008

Brett Ninness
School of Electrical Engineering & Computer Science
The University of Newcastle, Australia

System Identification - a rich history
• 1700s: Bernoulli, Euler, Lagrange - probability concepts
• 1763: Bayes - conditional probability
• 1795: Gauss, Legendre - least squares
• 1800-1850: Gauss, Legendre, Cauchy - prob. distributions
• 1879: Stokes - periodogram of time series
• 1890: Galton, Pearson - regression and correlation
• 1922: Fisher - Maximum Likelihood (ML)
• 1921: Yule - AR and MA time series
• 1933: Kolmogorov - Axiomatic probability theory
• 1930s: Khinchin, Kolmogorov, Cramér - stationary processes
• 1941-1949: Wiener, Kolmogorov - prediction theory
• 1960: Kalman - Kalman Filter
• 1965: Kalman & Ho - Realisation theory
• 1965: Åström & Bohlin - ML methods for dynamic systems
• 1970: Box & Jenkins - a unified and complete presentation
• 1970s: Experiment design, PE formulation with underpinning theory, analysis of recursive methods
• 1980s: Bias & Variance quantification, tradeoff and design
• 1990s: Subspace methods, control relevant identification, robust estimation methods.

Recent & Current Activity

This talk

Acknowledgements
• Results here rest heavily on the work of colleagues:
  ‣ Dr. Adrian Wills (Newcastle University)
  ‣ Dr. Thomas Schön (Linköping University)
  ‣ Dr. Stuart Gibson (Nomura Bank)
  ‣ Soren Henriksen (Newcastle University)
• and on learning from experts:
  ‣ Håkan Hjalmarsson, Tomas McKelvey, Fredrik Gustafsson, Michel Gevers, Graham Goodwin.

Challenge 1 - General Nonlinear ID
• Effective solutions available for specific nonlinear structures
  ‣ NARX, Hammerstein-Wiener, Bilinear...
• Extension to more general forms?
• Example:

Obstacle 1: How do we compute a cost function?
• Prediction error (PE) cost:
• Maximum Likelihood (ML) cost:

Computing
• Turn to general measurement and time update equations:
  ‣ Measurement Update
  ‣ Time Update
• Problem - closed form solutions only for special cases:
  ‣ Linear, Gaussian (Kalman Filter); discrete state HMM
• More generally:
  ‣ Need to compute solution numerically
  ‣ Multi-dimensional integrals the main challenge

SEQUENTIAL IMPORTANCE RESAMPLING
• SIR - More commonly known as “particle filtering”
• Key idea - use the strong law of large numbers (SLLN)
  ‣ Suppose a vector random number generator gives realisations from a given target density
  ‣ Then by the SLLN, with probability one:
  ‣ Suggests approximate quantification
• How to build the necessary random number generator?

Recursive solution (Particle filter)
• Time Update
• Resampling
• Measurement Update

Example

History
  ‣ Handschin & Mayne, Int’l J. Control, 1969
  ‣ Resampling Approach: Gordon, Salmond & Smith, IEE Proc. Radar & Signal Processing, 1993. (1136 citations)
  ‣ Now widely used in signal processing, target tracking, computer vision, econometrics, robotics and statistics, control...
  ‣ Some applications in system identification.
    - Bulk of work has involved considering parameters as state variables.

Back to Nonlinear System Identification
• General(ish) model structure
• Prediction error cost:
• Max. Likelihood cost:

Nonlinear System Identification
Obstacle 2: How do we compute an estimate?
• Gradient based search is standard practice:
• How to compute the necessary gradients?
• Strategies:
  ‣ Differencing to compute derivatives?
  ‣ Direct search methods: Nelder-Mead, simulated annealing?

Expectation-Maximisation (EM) ALG.
• Example - linear system:
• Estimate by regression?
• Need state - use estimate? E.g. Kalman smoother
• Suggests iteration:
  ‣ Use estimates of A,B,C,D to estimate state;
  ‣ Use estimates of state to estimate A,B,C,D;
  ‣ Return and do again.

Expectation-Maximisation (EM) ALG.
• Key idea - “complete” and “incomplete” data
  ‣ Actual observations:
  ‣ “Wished for” (incomplete) observations:
  ‣ Form estimate of “wished for” likelihood:
• E Step: Calculate
• M Step: Compute

KEY EM Algorithm Property
• Bayes’ rule:
• Take conditional expectation of both sides:
• Increasing the E-step quantity implies increased likelihood:

Expectation-Maximisation (EM) ALG.
• History
  ‣ Generally attributed to Baum: Ann. Math. Stat. 1970
  ‣ Generalised by Dempster et al: JRSS B, 1977 (9858 cites)
  ‣ Widely used in image processing, statistics, radar...

Nonlinear system estimation
• Example: N=100 data points, M=100 particles, 100 experiments

Evolution of
• Look at b parameter only - others fixed at true values:

Gradient Based Search Revisited
• Fisher’s Identity

EM vs Gradient search iterates

Challenge 2: Application Relevant ID
• Quality of an estimate must be quantified for it to be useful
• “Traditional” practice - note asymptotic results
• Assume convergence has effectively occurred for finite N

Assessment & Design
• Often, a function of the parameters is of more interest
• Again - “classical” approach - use linear approximation:
• Couple with approximate Gaussianity of the parameter estimate

One perspective
• Need to combine prior knowledge, assumptions and data
• Posterior: measure of the evidence supporting an underlying system property - parameter value, frequency response, achieved gain/phase margin...

Computing Posteriors
• In principle, posterior computation is straightforward via Bayes’ rule: the posterior is proportional to the likelihood times the prior knowledge
• Example:
• Combine:

Using Posteriors
• Now the difficulty - using the posterior
• Marginal on i’th parameter:
  ‣ Evaluation on a multi-dimensional grid: many evaluations of the posterior
  ‣ Simpson’s rule - evaluation error:
• Other measures? Model order

Markov Chain Monte Carlo (MCMC)

A randomised approach
• Use the Strong Law of Large Numbers (SLLN) again.
  ‣ Build a (vector) random number generator giving realisations:
  ‣ Then by the SLLN, with probability one:
  ‣ Suggests the approximation:
• One view - numerical integration with intelligently chosen grid points.

The Metropolis Algorithm
The required vector random number generator:
1. Initialise: choose a starting parameter value and set
     Z.y=y; Z.u=u; M.A=4;
     g1=est(Z,M);
     theta=g1.theta;
2. Draw a proposal value
     xi = theta + 0.1*randn(size(theta));
     g2 = theta2m(xi,g1);
3. Compute an acceptance probability:
     cold = validate(Z,g1);
     cnew = validate(Z,g2);
     prat = exp((-0.5/var)*(cold-cnew)*N);
     alpha = min(1,prat);
4. Accept the proposal (set theta = xi) with probability alpha
     if (rand <= alpha) theta=xi; end;

“Markov Chain Monte Carlo” History
• Origins: Metropolis, Rosenbluth, Rosenbluth, Teller & Teller, Journal of Chemical Physics, 1953. (11,564 ISI citations)
• Widespread use:
  ‣ Listed #1 in “Great Algorithms of Scientific Computing”, Dongarra & Sullivan, Comp. in Sci. & Eng. 2000
  ‣ “The Markov Chain Monte Carlo Revolution”, Diaconis, Bull. American Mathematical Society, 2008.
    “Many basic scientific problems are now routinely solved by simulation: a fancy random walk is performed on the system of interest. Averages computed from the walk give useful answers to formerly intractable problems”
  ‣ Widely used in chemistry, physics, statistics... Emerging uses in biology, telecommunications.

Example
• Simple first order situation:
• N=20 data samples available:
• Metropolis Algorithm: realisations

Marginal posteriors via MCMC

Posterior of functions of the parameters
• Candidate closed loop controller:
• What are the likely achieved gain and phase margins?
• Implicit functions of the parameters - direct computation unclear

Sample Histograms of gain and phase margins
• There is strong evidence that the proposed controller will achieve a gain margin > 3.8 and phase margin > 95°

Conclusions
• Many thanks for your attention;
• Collective thanks to the SYSID2009 Organisation Team!
• Deep thanks to the Uni. Newcastle Signal Processing Micro-electronics group (sigpromu.org)
  ‣ Steve Weller, Chris Kellett, Tharaka Dissanayake, Peter Schreier, Sarah Johnson, Geoff Knagge, Björn Rüffer, Adrian Wills, Lawrence Ong, Dale Bates, Ian Griffiths, David Hayes, Soren Henriksen, Adam Mills, Alan Murray
  who endured multiple road-test versions of this talk, that were even worse than this one.
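
Appendix sketch: Metropolis steps in plain Python

The four Metropolis steps in the talk (initialise, propose, compute an acceptance probability, accept or reject) can be sketched without the MATLAB toolbox calls. The block below is only an illustration under stated assumptions: the model y = theta*u + e, the known noise variance, the flat prior, and the names sse and metropolis are all hypothetical and are not taken from the talk.

```python
import math
import random

def sse(theta, u, y):
    """Sum of squared prediction errors for the hypothetical model y = theta*u + e."""
    return sum((yi - theta * ui) ** 2 for ui, yi in zip(u, y))

def metropolis(u, y, noise_var, theta0, n_iter=5000, step=0.1, seed=0):
    """Random-walk Metropolis sampler for theta (flat prior, Gaussian noise assumed)."""
    rng = random.Random(seed)
    theta = theta0                        # 1. initialise
    cost = sse(theta, u, y)
    chain = []
    for _ in range(n_iter):
        xi = theta + step * rng.gauss(0.0, 1.0)   # 2. draw a proposal
        cost_new = sse(xi, u, y)
        # 3. acceptance probability: with a flat prior the posterior ratio
        #    equals the Gaussian likelihood ratio p(y|xi) / p(y|theta)
        log_ratio = -0.5 * (cost_new - cost) / noise_var
        alpha = 1.0 if log_ratio >= 0.0 else math.exp(log_ratio)
        if rng.random() <= alpha:                 # 4. accept with probability alpha
            theta, cost = xi, cost_new
        chain.append(theta)
    return chain

# Usage: N = 20 simulated samples, then average the chain after burn-in.
data_rng = random.Random(1)
true_theta, noise_var = 0.7, 0.04
u = [data_rng.gauss(0.0, 1.0) for _ in range(20)]
y = [true_theta * ui + data_rng.gauss(0.0, math.sqrt(noise_var)) for ui in u]
chain = metropolis(u, y, noise_var, theta0=0.0)
kept = chain[1000:]                       # discard burn-in realisations
post_mean = sum(kept) / len(kept)
```

By the SLLN argument in the talk, averages over the retained realisations approximate posterior expectations. Under these assumptions the posterior is Gaussian and centred on the least-squares estimate, so post_mean should lie close to it.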
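
Appendix sketch: a bootstrap particle filter in plain Python

The recursive particle filter outlined in the talk (measurement update, resampling, time update) can be sketched as follows. This is a minimal illustration, not the speaker's implementation: the scalar model, the Gaussian noise, the N(0,1) initial density, multinomial resampling, and the names particle_filter, f and h are all assumptions made for the example. The log-likelihood estimate it returns is the kind of quantity a Maximum Likelihood cost would need.

```python
import math
import random

def particle_filter(y, f, h, q_std, r_std, n_particles=500, seed=0):
    """Bootstrap (SIR) filter for x[t+1] = f(x[t]) + v,  y[t] = h(x[t]) + e."""
    rng = random.Random(seed)
    # Assumed initial density: standard normal
    particles = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    norm = r_std * math.sqrt(2.0 * math.pi)
    estimates, loglik = [], 0.0
    for yt in y:
        # Measurement update: weight each particle by the Gaussian likelihood p(y|x)
        w = [math.exp(-0.5 * ((yt - h(x)) / r_std) ** 2) / norm for x in particles]
        total = sum(w)
        loglik += math.log(total / n_particles)   # running log-likelihood estimate
        w = [wi / total for wi in w]
        estimates.append(sum(wi * x for wi, x in zip(w, particles)))
        # Resampling: draw particles in proportion to their weights (multinomial)
        particles = rng.choices(particles, weights=w, k=n_particles)
        # Time update: push each particle through the dynamics plus process noise
        particles = [f(x) + rng.gauss(0.0, q_std) for x in particles]
    return estimates, loglik

# Usage on a simulated, mildly nonlinear hypothetical system.
f = lambda x: 0.5 * x + math.sin(x)
h = lambda x: x
q_std, r_std = 0.5, 0.3
sim = random.Random(2)
x, xs, ys = sim.gauss(0.0, 1.0), [], []
for _ in range(100):
    xs.append(x)
    ys.append(h(x) + sim.gauss(0.0, r_std))
    x = f(x) + sim.gauss(0.0, q_std)
estimates, loglik = particle_filter(ys, f, h, q_std, r_std)
rmse = math.sqrt(sum((a - b) ** 2 for a, b in zip(estimates, xs)) / len(xs))
```

Resampling at every step keeps the sketch short. It is one common design choice, and other resampling schemes could be substituted.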