Mathematics 241: Nonlinear models

1 Non-linear Curve Fitting

1.1 Linearization
Suppose that we wish to fit a function y = f(x) to data for which a linear function is clearly not appropriate. We generally know this because we see a definite non-linear pattern in the scatterplot (or in a residual plot) or because the science behind the relationship tells us that a non-linear relationship might be more appropriate.
Of course we cannot simply search for an arbitrary function f. We could fit the data exactly with a polynomial of sufficiently high degree, but such a polynomial is unlikely to be a useful model. Therefore, to fit a non-linear function to data, we generally constrain the function we are looking for to be in some small class of functions. Usually this class is defined by a small number of parameters. This is what we did in the linear case – we limited ourselves to a two-parameter (β0, β1) family of functions.
We use the example of the world records in track.
Example 1.1. Various biological models suggest that a possible model for the men's world records in track might have the form

y = b0 x^b1

where x is the distance in meters, y is the time in seconds, and b0 and b1 are parameters. One way to fit these data is to linearize the relationship. Notice that if this equation is correct,

ln y = ln b0 + b1 ln x

or

y' = β̂0 + β̂1 x'.

Notice here that we have rewritten this relationship so that we can see that it really is linear. The variables x' and y' are just data: x' = ln x and y' = ln y. The estimated parameters β̂0 and β̂1 are just ln b0 and b1 respectively. We can compute β̂0 and β̂1 using lm.
> ltrack=lm(log(Seconds)~log(Meters),data=mentrack)
> ltrack
Call:
lm(formula = log(Seconds) ~ log(Meters), data = mentrack)
Coefficients:
(Intercept)  log(Meters)
     -2.921        1.124
> exp(-2.921)
[1] 0.05387978
Thus −2.921 is ln b0 and 1.124 is b1. Therefore the curve that we want is the fitted relationship Seconds = 0.05 · Meters^1.12.
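The same back-transformation can be done without retyping numbers by pulling the coefficients out of the fitted object. This is only a small convenience sketch, assuming the ltrack fit from above (the names b, b0, and b1 are introduced here); the values it prints should match those just computed.

> b = coef(ltrack)   # named vector: (Intercept) and log(Meters)
> b0 = exp(b[1])     # back-transform the intercept to recover b0
> b1 = b[2]          # the slope estimates b1 directly
> c(b0, b1)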
1.2 Least-squares Solution
There is another way to estimate a non-linear relationship of the form y = f(x) that does not involve the step of linearizing the relationship. Since not all relationships can be linearized, this method should be in any scientist's toolbox.
We found the least-squares line in Example 1.1 by minimizing the sum of squares of the residuals in the transformed equation. Notice that the residuals in this case were in units of log-seconds. We next try to find b0 and b1 without transforming the equation, by minimizing the sum of squares of residuals directly. Our residuals have the form ei = yi − b0 xi^b1, and minimizing the sum of squares of these quantities amounts to minimizing a non-linear function.
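To make this criterion concrete before introducing a dedicated tool, here is a minimal sketch that minimizes this sum of squares with R's general-purpose optimizer optim. It assumes the mentrack data frame used above; ss is just a name chosen here for the objective function, and the starting guess comes from the linearized fit.

> ss = function(b) sum((mentrack$Seconds - b[1]*mentrack$Meters^b[2])^2)  # sum of squared residuals
> optim(c(0.05, 1.12), ss)$par   # numerical minimizer; returns estimates of b0 and b1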
The R function nls minimizes the sum of squares of residuals for a wide range of functions. However, a non-linear minimization routine needs a starting guess that is sufficiently close to the minimum in order to be successful. We have a good starting guess in the solution to our linearization.
> nltrack=nls(Seconds~b0*Meters^b1, start=list(b0=.05,b1=1),data=mentrack)
> nltrack
Nonlinear regression model
model: Seconds ~ b0 * Meters^b1
data: mentrack
     b0      b1
0.08395 1.06864
residual sum-of-squares: 168.1
Number of iterations to convergence: 5
Achieved convergence tolerance: 1.484e-06
The fitted function is now Seconds = 0.08 · Meters^1.07. This is not the same solution as we got in Example 1.1. This is because we are minimizing a different quantity and so we have a different criterion for the best-fitting curve. In this case, our residuals are in seconds. Roughly, this means that larger errors (in seconds) are penalized more in this method of fitting than in the transformed model (since log-seconds are much smaller).
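One quick way to see how the two fits differ is to draw both curves over the data. A minimal sketch, assuming the mentrack data frame and the ltrack and nltrack fits from above:

> plot(Seconds ~ Meters, data=mentrack)                                # scatterplot of the records
> curve(exp(coef(ltrack)[1]) * x^coef(ltrack)[2], add=TRUE)            # linearized fit, back-transformed
> curve(coef(nltrack)["b0"] * x^coef(nltrack)["b1"], add=TRUE, lty=2)  # nls fit, dashed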
1.3 Comparison of the Two Methods
There is no single right method. Both methods fit the function by some reasonable criterion of "best." Sometimes the first method suits our purposes better, and in other cases the second does. Quite often, the functions computed by the two methods will be close. Here we compare the two models for the men's track records.
In the linear model, we found that the fitted relationship ln y = −2.92 + 1.12 ln x minimized the sum of squares of residuals. To see what this means, we write the equation for the residuals:

ln yi = −2.92 + 1.12 ln xi + ei

Transforming this equation back to the original units, we have

yi = e^(−2.92) e^(ei) xi^1.12 = e^(ei) (0.05 xi^1.12)
In other words, the residuals are multiplicative in the original model and what we are minimizing is really percentage
error rather than absolute error.
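Since a residual ei on the log scale corresponds to a multiplicative factor e^(ei) on the original scale, the percentage errors of the linearized fit can be read off directly. A minimal sketch, assuming the ltrack fit from above (these factors reappear as lfactor in the comparison below):

> exp(residuals(ltrack))             # factor by which each actual time exceeds its prediction
> 100*(exp(residuals(ltrack)) - 1)   # the same factors expressed as percentage errors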
The following output compares the errors of the two models. It is easy to see the differences between the two models
by looking at the extreme points. For the 10,000 meter race, the prediction of the linearized model is more than a
minute off (104 seconds) but is only about 6% off. The maximum percentage error for the linear model is about 7%
(for the 200 meter race). On the other hand, the non-linear least squares model has a maximum error of at most 7.5
seconds but is more than 15% off in its prediction of the 100 meter race time. Clearly the question here is whether
we prefer to minimize absolute error (in seconds) or relative error (in percentages).
Historically, the first of our two methods was usually preferred. The reason is that the linear least-squares solution can be computed with a minimum of technology. The second method is now widely used – for example, it is the method implemented in Logger-Pro for all non-linear problems. However, the choice between the two methods should be made based on what one considers a good model rather than on which technology is more convenient. As we have shown above, both methods are easy to use in R.
> options(digits=3)
> lmodel = exp(predict(ltrack))
> lerror = mentrack$Seconds - lmodel
> lfactor = exp(residuals(ltrack))
> nmodel = predict(nltrack)
> nerror = residuals(nltrack)
> nfactor = mentrack$Seconds/nmodel
> results = data.frame(actual=mentrack$Seconds, lmodel, lerror, lfactor, nmodel, nerror, nfactor)
> results
    actual  lmodel   lerror lfactor nmodel nerror nfactor
1     9.69    9.52    0.173   1.018   11.5  -1.83   0.841
2    19.30   20.74   -1.436   0.931   24.2  -4.85   0.799
3    43.18   45.18   -1.999   0.956   50.7  -7.48   0.852
4   101.11   98.44    2.673   1.027  106.3  -5.15   0.952
5   131.96  126.48    5.475   1.043  134.9  -2.92   0.978
6   206.00  199.47    6.527   1.033  208.0  -2.03   0.990
7   223.13  215.88    7.254   1.034  224.3  -1.14   0.995
8   284.79  275.59    9.203   1.033  282.9   1.89   1.007
9   440.67  434.61    6.056   1.014  436.3   4.34   1.010
10  757.35  771.54  -14.193   0.982  753.2   4.19   1.006
11 1577.53 1681.05 -103.516   0.938 1579.7  -2.18   0.999
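The worst-case errors quoted above can be checked directly from this data frame. A minimal sketch, assuming the results object just built:

> max(abs(results$lerror))       # largest absolute error (seconds) of the linearized model
> max(abs(results$nerror))       # largest absolute error (seconds) of the nls model
> max(abs(results$lfactor - 1))  # largest relative error of the linearized model
> max(abs(results$nfactor - 1))  # largest relative error of the nls model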
Problem

Find functions g and h that transform the following non-linear equations y = f(x), which depend on parameters b0 and b1, into linear equations g(y) = b0' + b1' h(x).

1. y = b0/(b1 + x)

2. y = x/(b0 + b1 x)

3. y = 1/(1 + b0 e^(b1 x))