Method of Least Squares
Least Squares

Method of Least Squares:
 Deterministic approach

 The inputs u(1), u(2), ..., u(N) are applied to the system
 The outputs y(1), y(2), ..., y(N) are observed
 Find a model which fits the input-output relation to a (possibly linear)
 curve f(n, u(n))
 The 'best' fit is obtained by minimising the sum of the squares of the
 differences f - y

[Figure: scatter plot of the observed data points with the fitted curve; both axes span 0 to 50]
Least Squares

The curve fitting problem can be formulated as

 y(i) ≈ f(i, u(i)),   i = 1, ..., N
 (f: model, u: variable, y: observations)

Error:
 e(i) = y(i) - f(i, u(i))

Sum-of-error-squares:
 \mathcal{E} = \sum_{i=1}^{N} |e(i)|^2

Minimum (least squares of error) is achieved when the gradient of
\mathcal{E} with respect to the model parameters is zero.
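As a concrete illustration, here is a minimal numpy sketch of such a fit (the straight-line model and all data values are invented for illustration; np.polyfit solves the zero-gradient normal equations internally):

```python
import numpy as np

# Fit a straight line f(n, u(n)) = a*u(n) + b to noisy observations y(n)
# by minimising the sum of squared errors.
rng = np.random.default_rng(0)
u = np.linspace(0, 50, 25)                     # inputs u(1), ..., u(N)
y = 0.9 * u + 3.0 + rng.normal(0, 2, u.size)   # observed outputs

a, b = np.polyfit(u, y, deg=1)                 # zero-gradient (LS) solution
sum_sq = np.sum((a * u + b - y) ** 2)          # sum of error squares at minimum
print(a, b, sum_sq)
```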
Problem Statement

For the inputs to the system, u(i), the observed desired response is d(i)

The relation is assumed to be linear:

 d(i) = \sum_{k=0}^{M-1} w_{ok}^* u(i-k) + e_o(i)

where e_o(i) is an unobservable measurement error that is
 zero mean
 white
Problem Statement

Design a transversal filter which finds the least-squares solution

 \hat{d}(i) = \sum_{k=0}^{M-1} \hat{w}_k^* u(i-k),   e(i) = d(i) - \hat{d}(i)

Then, the sum of error squares is

 \mathcal{E} = \sum_{i=i_1}^{i_2} |e(i)|^2
Data Windowing


We will express the input in matrix form
Depending on the limits i1 and i2, this matrix changes:

 Covariance method:      i1 = M, i2 = N
 Postwindowing method:   i1 = M, i2 = N + M - 1
 Autocorrelation method: i1 = 1, i2 = N + M - 1
 Prewindowing method:    i1 = 1, i2 = N
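A minimal sketch of how such a data matrix can be built, assuming the usual convention that the row for time i holds the tap inputs u(i), u(i-1), ..., u(i-M+1), with samples outside 1..N taken as zero (the helper name data_matrix is ours, not from the source):

```python
import numpy as np

def data_matrix(u, M, i1, i2):
    """Rows [u(i), u(i-1), ..., u(i-M+1)] for i = i1..i2 (1-based);
    samples outside 1..N are replaced by zeros (pre/postwindowing)."""
    u = np.asarray(u, dtype=float)
    N = len(u)
    return np.array([[u[i - k - 1] if 1 <= i - k <= N else 0.0
                      for k in range(M)]
                     for i in range(i1, i2 + 1)])

u = np.arange(1.0, 7.0)                               # N = 6 samples u(1)..u(6)
M = 3
A_cov  = data_matrix(u, M, i1=M, i2=len(u))           # covariance: no padding
A_auto = data_matrix(u, M, i1=1, i2=len(u) + M - 1)   # autocorrelation: padded
print(A_cov.shape, A_auto.shape)                      # (4, 3) (8, 3)
```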
Principle of Orthogonality

Error signal:

 e(i) = d(i) - \sum_{k=0}^{M-1} w_k^* u(i-k)

Least squares (minimum of the sum of squares) is achieved when the
gradient of \mathcal{E} with respect to each tap weight is zero, i.e., when

 \sum_{i=i_1}^{i_2} u(i-k) e_{min}^*(i) = 0,   k = 0, 1, ..., M-1

!Time averaging!
(For Wiener filtering the corresponding condition, E[u(i-k) e_o^*(i)] = 0,
was an ensemble average.)
The minimum-error time series emin(i) is orthogonal to the time series of
the input u(i-k) applied to tap k of a transversal filter of length M for
k=0,1,...,M-1 when the filter is operating in its least-squares condition.
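The orthogonality condition can be checked numerically; a minimal sketch with invented data, using the covariance-method data matrix convention from above:

```python
import numpy as np

rng = np.random.default_rng(1)
N, M = 64, 4
u = rng.normal(size=N)
# Row for each i = M..N (1-based): tap inputs u(i), ..., u(i-M+1)
A = np.array([u[i - M + 1 : i + 1][::-1] for i in range(M - 1, N)])
d = rng.normal(size=A.shape[0])                 # desired response

w = np.linalg.lstsq(A, d, rcond=None)[0]        # least-squares tap weights
e_min = d - A @ w                               # minimum error time series

# Each tap-input time series is orthogonal to e_min (a time average):
print(np.allclose(A.T @ e_min, 0))              # True up to round-off
```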
Corollary of Principle of Orthogonality

LS estimate of the desired response is

 \hat{d}(i) = \sum_{k=0}^{M-1} \hat{w}_k^* u(i-k)

Multiply the principle of orthogonality by \hat{w}_k^* and take the
summation over k. Then

 \sum_{i=i_1}^{i_2} \hat{d}(i) e_{min}^*(i) = 0

When a transversal filter operates in its least-squares condition, the
least-squares estimate of the desired response (produced at the output
of the filter) and the minimum estimation error time series are
orthogonal to each other over time i.
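This corollary is easy to verify with the same kind of invented data; a minimal sketch:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(61, 4))                 # stand-in data matrix
d = rng.normal(size=61)
w = np.linalg.lstsq(A, d, rcond=None)[0]
d_hat = A @ w                                # LS estimate of the desired response
e_min = d - d_hat                            # minimum estimation error
print(np.isclose(d_hat @ e_min, 0.0))        # orthogonal over time i
```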
Energy of Minimum Error

Decompose d(i) = \hat{d}(i) + e_{min}(i). Due to the principle of
orthogonality, the cross terms vanish, hence

 \mathcal{E}_d = \mathcal{E}_{\hat{d}} + \mathcal{E}_{min},
 i.e.  \mathcal{E}_{min} = \mathcal{E}_d - \mathcal{E}_{\hat{d}}

where \mathcal{E}_d = \sum_i |d(i)|^2 and \mathcal{E}_{\hat{d}} = \sum_i |\hat{d}(i)|^2.

Bounds: 0 ≤ \mathcal{E}_{min} ≤ \mathcal{E}_d
 \mathcal{E}_{min} = 0 when e_o(i) = 0 for all i, which is impossible, or
 when the problem is underdetermined (fewer data points than parameters):
 infinitely many solutions (no unique soln.)!
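The energy decomposition and its bounds can be confirmed numerically; a minimal sketch with invented data:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.normal(size=(61, 4))
d = rng.normal(size=61)
w = np.linalg.lstsq(A, d, rcond=None)[0]
d_hat, e_min = A @ w, d - A @ w

E_d, E_dhat, E_min = d @ d, d_hat @ d_hat, e_min @ e_min
print(np.isclose(E_d, E_dhat + E_min))   # cross terms vanish by orthogonality
print(0.0 <= E_min <= E_d)               # bounds on the minimum error energy
```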
Normal Equations
Combining the principle of orthogonality with the minimum error
e_{min}(i) = d(i) - \sum_{t=0}^{M-1} \hat{w}_t^* u(i-t) gives

 \sum_{t=0}^{M-1} \hat{w}_t \phi(t, k) = z(-k),   k = 0, 1, ..., M-1

where

 \phi(t, k) = \sum_{i=i_1}^{i_2} u(i-t) u^*(i-k),   0 ≤ (t, k) ≤ M-1
 is the time-average autocorrelation function of the input, and

 z(-k) = \sum_{i=i_1}^{i_2} u(i-k) d^*(i),   0 ≤ k ≤ M-1
 is the time-average cross-correlation between the desired response
 and the input.

Hence, this is the expanded system of the normal equations for linear
least-squares filters.
Normal Equations (Matrix Formulation)

Matrix form of the normal equations for linear least-squares filters:

 \Phi \hat{w} = z   →   \hat{w} = \Phi^{-1} z   (if \Phi^{-1} exists!)

This is the linear least-squares counterpart of the Wiener-Hopf equations.
Here \Phi and z are time averages, whereas in the Wiener-Hopf equations
they were ensemble averages.
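A minimal numpy sketch of forming and solving the normal equations, with Φ and z computed as time averages from an invented input (the variable names are ours):

```python
import numpy as np

rng = np.random.default_rng(3)
N, M = 64, 4
u = rng.normal(size=N)
A = np.array([u[i - M + 1 : i + 1][::-1] for i in range(M - 1, N)])
d = rng.normal(size=A.shape[0])

Phi = A.conj().T @ A              # time-average correlation matrix
z = A.conj().T @ d                # time-average cross-correlation vector
w_hat = np.linalg.solve(Phi, z)   # normal equations (Phi assumed nonsingular)

# Agrees with the generic least-squares solver:
print(np.allclose(w_hat, np.linalg.lstsq(A, d, rcond=None)[0]))
```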
Minimum Sum of Error Squares

Energy contained in the time series \hat{d}(i):

 \mathcal{E}_{\hat{d}} = \hat{w}^H \Phi \hat{w}

Or, using the normal equations \Phi \hat{w} = z,

 \mathcal{E}_{\hat{d}} = z^H \Phi^{-1} z

Then the minimum sum of error squares is

 \mathcal{E}_{min} = \mathcal{E}_d - z^H \Phi^{-1} z = \mathcal{E}_d - z^H \hat{w}
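A quick numerical check of this expression against the directly computed residual energy (invented data again):

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.normal(size=(61, 4))
d = rng.normal(size=61)
Phi, z = A.T @ A, A.T @ d
w_hat = np.linalg.solve(Phi, z)

E_min_formula = d @ d - z @ w_hat             # E_d - z^H w_hat
E_min_direct = np.sum((d - A @ w_hat) ** 2)   # residual energy, directly
print(np.isclose(E_min_formula, E_min_direct))
```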
Properties of the Time-Average Correlation Matrix Φ

 Property I: The correlation matrix Φ is Hermitian, \Phi^H = \Phi

 Property II: The correlation matrix Φ is nonnegative definite,
 x^H \Phi x ≥ 0 for every M-by-1 vector x

 Property III: The correlation matrix Φ is nonsingular iff det(Φ) is
 nonzero

 Property IV: The eigenvalues of the correlation matrix Φ are real and
 non-negative.
Properties of the Time-Average Correlation Matrix Φ

 Property V: The correlation matrix Φ is the product of two rectangular
 Toeplitz matrices that are the Hermitian transpose of each other,
 \Phi = A^H A.
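Properties I-V can be verified numerically for a Φ built from an invented complex input; a minimal sketch:

```python
import numpy as np

rng = np.random.default_rng(5)
N, M = 32, 4
u = rng.normal(size=N) + 1j * rng.normal(size=N)
A = np.array([u[i - M + 1 : i + 1][::-1] for i in range(M - 1, N)])

Phi = A.conj().T @ A                      # Property V: Phi = A^H A
print(np.allclose(Phi, Phi.conj().T))     # Property I: Hermitian
lam = np.linalg.eigvalsh(Phi)             # eigvalsh returns real eigenvalues
print(np.all(lam >= -1e-10))              # Properties II & IV: non-negative
print(abs(np.linalg.det(Phi)) > 0)        # Property III: nonsingular here
```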
Normal Equations (Reformulation)

But we know that \Phi = A^H A and z = A^H d; then

 A^H A \hat{w} = A^H d

which yields

 \hat{w} = (A^H A)^{-1} A^H d = A^+ d    ! Pseudo-inverse !

Substituting into the minimum sum of error squares expression gives

 \mathcal{E}_{min} = d^H d - d^H A (A^H A)^{-1} A^H d
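A minimal check that the closed form and the pseudo-inverse agree (invented full-column-rank data):

```python
import numpy as np

rng = np.random.default_rng(6)
A = rng.normal(size=(8, 3))             # K x M, full column rank (assumed)
d = rng.normal(size=8)

w1 = np.linalg.inv(A.T @ A) @ A.T @ d   # (A^H A)^{-1} A^H d
w2 = np.linalg.pinv(A) @ d              # pseudo-inverse A^+ d
print(np.allclose(w1, w2))
```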
Projection

The LS estimate of d is given by

 \hat{d} = A \hat{w} = A (A^H A)^{-1} A^H d

The matrix

 P = A (A^H A)^{-1} A^H

is a projection operator
 onto the linear space spanned by the columns of the data matrix A,
 i.e. the space U_i.

The orthogonal complement projector is I - A (A^H A)^{-1} A^H.
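A minimal sketch verifying the defining properties of the projector (idempotent, Hermitian) and that P and I - P split d into orthogonal parts:

```python
import numpy as np

rng = np.random.default_rng(7)
A = rng.normal(size=(8, 3))
P = A @ np.linalg.inv(A.T @ A) @ A.T        # projector onto the columns of A
P_perp = np.eye(8) - P                      # orthogonal complement projector

print(np.allclose(P @ P, P))                # idempotent
print(np.allclose(P, P.T))                  # symmetric (Hermitian)
d = rng.normal(size=8)
print(np.isclose((P @ d) @ (P_perp @ d), 0.0))   # orthogonal split of d
```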
Projection - Example

M = 2 tap filter, N = 4 → N - M + 1 = 3, so A is 3×2 and d is 3×1.
Let A and d be given.

Then the LS estimate is \hat{d} = A (A^H A)^{-1} A^H d = P d,

and the minimum error is e_{min} = (I - P) d.

\hat{d} and e_{min} are orthogonal.
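The slide's original numbers are not recoverable here, so the sketch below substitutes hypothetical A and d of the stated sizes to reproduce the computation:

```python
import numpy as np

# Hypothetical stand-ins: M = 2 taps, N = 4, so A is 3x2 and d is 3x1.
A = np.array([[2.0, 1.0],
              [3.0, 2.0],
              [1.0, 3.0]])
d = np.array([1.0, 2.0, 3.0])

P = A @ np.linalg.inv(A.T @ A) @ A.T
d_hat = P @ d                    # LS estimate of d
e_min = (np.eye(3) - P) @ d      # minimum error
print(d_hat @ e_min)             # ~ 0: the two components are orthogonal
```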
Uniqueness of the LS Solution

LS always has a solution; is that solution unique?

The least-squares estimate \hat{w} is unique if and only if the nullity
(the dimension of the null space) of the data matrix A equals zero.

A is K×M, (K = N - M + 1)

Solution is unique when A is of full column rank, K ≥ M
 All columns of A are linearly independent
 Overdetermined system (more equations than variables (taps))
 A^H A is nonsingular → (A^H A)^{-1} exists and the solution is unique

Infinitely many solutions when A has linearly dependent columns, K < M
 A^H A is singular, so (A^H A)^{-1} does not exist
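The rank/nullity test is straightforward to run; a minimal sketch contrasting a full-column-rank matrix with a rank-deficient one (both invented):

```python
import numpy as np

A_full = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])   # independent columns
A_def  = np.array([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]])   # dependent columns

for A in (A_full, A_def):
    M = A.shape[1]
    nullity = M - np.linalg.matrix_rank(A)   # rank-nullity theorem
    print(nullity, "unique" if nullity == 0 else "infinitely many")
```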
Properties of the LS Estimates

Property I: The least-squares estimate \hat{w} is unbiased, provided that
the measurement error process e_o(i) has zero mean.

Property II: When the measurement error process e_o(i) is white with
zero mean and variance σ², the covariance matrix of the least-squares
estimate \hat{w} equals σ²\Phi^{-1}.

Property III: When the measurement error process e_o(i) is white with
zero mean, the least-squares estimate \hat{w} is the best linear
unbiased estimate (BLUE).

Property IV: When the measurement error process e_o(i) is white and
Gaussian with zero mean, the least-squares estimate \hat{w} achieves
the Cramér-Rao lower bound for unbiased estimates.
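Properties I and II can be checked empirically with a Monte Carlo sketch, assuming white Gaussian measurement noise (all parameter values invented):

```python
import numpy as np

rng = np.random.default_rng(8)
A = rng.normal(size=(50, 3))                 # fixed data matrix
w_true = np.array([1.0, -0.5, 2.0])
sigma = 0.3
pinv_A = np.linalg.pinv(A)

# Re-estimate w over many independent white-noise realisations
W = np.array([pinv_A @ (A @ w_true + sigma * rng.normal(size=50))
              for _ in range(10000)])

print(np.allclose(W.mean(axis=0), w_true, atol=1e-2))            # Property I
print(np.allclose(np.cov(W.T),                                   # Property II:
                  sigma**2 * np.linalg.inv(A.T @ A), atol=1e-3)) # sigma^2 Phi^-1
```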
Computation of the LS Estimates

The rank W of a K×N matrix A (K ≥ N or K < N) gives
 The number of linearly independent columns/rows
 The number of non-zero eigenvalues/singular values

The matrix is said to be of full rank (full column or row rank) if
W = min(K, N); otherwise, it is said to be rank-deficient.

Rank is an important parameter for matrix inversion:
 If K = N (square matrix) and the matrix is full rank (W = K = N,
 nonsingular), the inverse of the matrix can be calculated as
 A^{-1} = adj(A)/det(A)
 If the matrix is not square (K ≠ N), and/or it is rank-deficient
 (singular), A^{-1} does not exist; instead we can use the pseudo-inverse
 (a projection of the inverse), A^+
SVD

We can calculate the pseudo-inverse using the SVD.

Any K×N matrix A (K ≥ N or K < N) can be decomposed using the
Singular Value Decomposition (SVD) as follows:

 A = U \Sigma V^H

where U (K×K) and V (N×N) are unitary and \Sigma (K×N) carries the
singular values on its main diagonal.
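A quick numpy check of the decomposition (shape and data invented):

```python
import numpy as np

rng = np.random.default_rng(9)
A = rng.normal(size=(5, 3))                # K x N; K < N works the same way

U, s, Vh = np.linalg.svd(A, full_matrices=True)
Sigma = np.zeros(A.shape)
Sigma[: len(s), : len(s)] = np.diag(s)     # singular values on the diagonal
print(np.allclose(A, U @ Sigma @ Vh))      # A = U Sigma V^H
```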
SVD

The system of equations A w = d
 is overdetermined if K > N: more equations than unknowns
  Unique solution (if A is full rank)
  Non-unique, infinitely many solutions (if A is rank-deficient)
 is underdetermined if K < N: more unknowns than equations
  Non-unique, infinitely many solutions

In either case the solution(s) is (are)

 w = A^+ d,   where A^+ = V \Sigma^+ U^H
Computation of the LS Estimates

Find the solution of A \hat{w} = d (A: K×M)

If K > M and rank(A) = M (so that (A^H A)^{-1} exists), the unique
solution is

 \hat{w} = (A^H A)^{-1} A^H d

Otherwise there are infinitely many solutions, but the pseudo-inverse

 \hat{w} = A^+ d

gives the minimum-norm solution to the least-squares problem:
the shortest length possible in the Euclidean-norm sense.
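Finally, a minimal sketch of the underdetermined case, showing that the pseudo-inverse picks the minimum-norm solution: adding any null-space component still satisfies the equations but only increases the Euclidean norm (dimensions and data invented):

```python
import numpy as np

rng = np.random.default_rng(10)
A = rng.normal(size=(2, 4))      # K < M: underdetermined system
d = rng.normal(size=2)

w_min = np.linalg.pinv(A) @ d    # minimum-norm least-squares solution
print(np.allclose(A @ w_min, d))

null_vec = np.linalg.svd(A)[2][-1]        # a basis vector of A's null space
w_other = w_min + 0.7 * null_vec          # another exact solution
print(np.allclose(A @ w_other, d))
print(np.linalg.norm(w_min) < np.linalg.norm(w_other))   # shortest length
```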