ChE 452 Lecture 07
Statistical Tests Of Rate Equations

Last Time Considered Paramecium Example

Method                     Error    r^2
Lineweaver-Burk            9454     0.910
Eadie-Hofstee              5647     0.344
Nonlinear least squares    4919     0.905

$$ r_p = \frac{k_1 K_2 [\mathrm{par}]}{1 + K_2 [\mathrm{par}]} $$

$$ \text{Total Error} = \sum_{\text{Data}} \left| r_p - \frac{k_1 K_2 [\mathrm{par}]}{1 + K_2 [\mathrm{par}]} \right|^2 $$

r^2 does not indicate goodness of fit

Today: Statistical Analysis Of Rate Data
• Can we do a calculation to tell if one model fits the data better than another model?
• Is the result statistically significant?

Method: Calculate A Variance

$$ V_i = \frac{\sum_{\text{points}} \left( \text{experimental rate} - \text{calculated rate} \right)^2}{\text{number of samples} - \text{number of independent parameters in model}} \tag{3.B.1} $$

Substituting in equation (3.A.7) yields

$$ V_i = \frac{\text{total error from Eq. 3.A.7}}{\text{number of samples} - \text{number of parameters}} \tag{3.B.2} $$

Usually the model with the lowest variance works best!
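
Equation 3.B.1 can be sketched in a few lines of Python. The function name `model_variance` and the four data points are hypothetical, chosen only so the arithmetic is easy to follow by hand; they are not the paramecium data.

```python
def model_variance(measured, calculated, n_params):
    """Variance per Eq. 3.B.1: sum of squared residuals / (points - parameters)."""
    if len(measured) != len(calculated):
        raise ValueError("measured and calculated must have the same length")
    dof = len(measured) - n_params
    if dof <= 0:
        raise ValueError("need more data points than fitted parameters")
    total_error = sum((m - c) ** 2 for m, c in zip(measured, calculated))
    return total_error / dof


# Hypothetical rates, for illustration only.
measured = [10.0, 20.0, 30.0, 40.0]
calculated = [12.0, 19.0, 33.0, 40.0]

variance = model_variance(measured, calculated, n_params=2)
print(variance)  # (4 + 1 + 9 + 0) / (4 - 2) = 7.0
```

Dividing by (points - parameters) rather than by the number of points penalizes models that use more adjustable parameters to reach the same total error.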

Limitations Of Using Variance To Assess Which Model Fits Best
• Assumes error in data
  • Follows a "χ² distribution" (i.e. error is random)
• Usually a good assumption in direct rate data
• It is not good to assume 1/rate follows a χ² distribution, so one needs to be careful about linearizing data.

For Our Example

$$ V_i = \frac{4919}{32\ \text{points} - 2\ \text{parameters}} = 164 \tag{3.B.3} $$

For Our Example Continued

Eadie-Hofstee:

$$ V_i = \frac{5647}{32 - 2} = 188 \tag{3.B.4} $$

while for the Lineweaver-Burk plot:

$$ V_i = \frac{9454}{32 - 2} = 315 \tag{3.B.5} $$

The non-linear least squares fit the data best.
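
The three variances can be checked with a short script. This is only a sketch: the dictionary name `total_error` is hypothetical, and the totals are the ones quoted in the lecture for the 32-point paramecium data with 2 fitted parameters.

```python
# Total errors quoted in the lecture for each fitting method.
total_error = {
    "Nonlinear least squares": 4919,
    "Eadie-Hofstee": 5647,
    "Lineweaver-Burk": 9454,
}
n_points, n_params = 32, 2

# Variance = total error / (number of samples - number of parameters).
variances = {name: err / (n_points - n_params) for name, err in total_error.items()}
best = min(variances, key=variances.get)

for name, v in variances.items():
    print(f"{name}: {v:.0f}")
print("Lowest variance:", best)
```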

Subtlety
• We are never sure whether the model with the lowest variance is the best one
  • Instead we can only say that it fits the data best
  • The model that fits the data best is usually the best one
  • Still there always is the possibility that a model fits better because the errors in the data line up to make it seem better

Next: Using An F-Test To Tell If the Difference Is Statistically Significant
• We want to do a statistical test to calculate the probability that one model fits better than another

Using An F-Test To Tell If the Difference Is Statistically Significant
Method: Compute Finverse, given by

$$ F_{\text{inverse}} = \frac{\text{variance in weaker model}}{\text{variance in better model}} \tag{3.B.6} $$

If Finverse is large enough, the model is statistically better.

Statistics: Gives A Value of Finverse That Is "Large Enough"

Table 3.B.2 Values of Finverse as a function of nf when both models have the same value of nf

        Significance Level
nf      90%      95%      99%      99.5%
1       39.86    161.5    4052     16212
2       9.0      19       99       199
3       5.39     9.28     29.46    47
4       4.11     6.39     15.98    23

nf = number of data points - parameters in the model   (3.B.8)

To read the table, if nf=4, you need Finverse to be at least 15.98 to be 99% sure that the better model really is better. There will still be a 1% chance that the differences are caused by random errors in the data.
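
As a sanity check on the table, the nf = 2 row can be reproduced by hand. For an F distribution with 2 degrees of freedom in both numerator and denominator, the cumulative probability has the closed form F/(1+F). This is a standard property of the F distribution rather than something stated in the lecture, and it only holds for this one row.

```latex
\begin{align*}
P(F \le F_{\text{inverse}}) &= \frac{F_{\text{inverse}}}{1 + F_{\text{inverse}}}
  \qquad (n_f = 2 \text{ for both models}) \\[4pt]
F_{\text{inverse}} = 9.0:  \quad & \frac{9}{10}    = 0.90  \\
F_{\text{inverse}} = 19:   \quad & \frac{19}{20}   = 0.95  \\
F_{\text{inverse}} = 99:   \quad & \frac{99}{100}  = 0.99  \\
F_{\text{inverse}} = 199:  \quad & \frac{199}{200} = 0.995
\end{align*}
```

Each result matches the significance level at the head of its column.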

Assumptions In Using the Values Of F In Table 3.B.2
• Models are independent (non-nested)
• χ² distribution of errors
Not mathematically rigorous in our example since the models are not independent! (Gives a small error in Finverse)

FDIST Gives The Probability That A Given Model Is Better

% confidence = 1-FDIST(Finverse, nf for better model, nf for worse model)   (3.B.9)

Not mathematically rigorous, but close.

Example: Is The Non-Linear Least Squares Better Than Lineweaver-Burk?
Variance Lineweaver-Burk = 315
Variance non-linear = 164
nf = 30

$$ F_{\text{inverse}} = \frac{315}{164} = 1.92 $$

I used Excel to calculate
1-FDIST(1.92, 30, 30) = 0.96
96% sure non-linear least squares fits better
4% chance the difference is due to noise in data.
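
The Excel calculation can also be reproduced without a spreadsheet. The sketch below assumes that Excel's FDIST(x, d1, d2) is the right-tail probability of the F distribution, so that 1-FDIST is the cumulative probability, which can be written as a regularized incomplete beta function. The helper names `_beta_cf`, `reg_inc_beta`, and `one_minus_fdist` are hypothetical, and the continued-fraction evaluation is a standard textbook method, not something taken from the lecture.

```python
import math


def _beta_cf(a, b, x, max_iter=300, eps=1e-12):
    """Continued fraction for the incomplete beta function (modified Lentz method)."""
    tiny = 1e-300
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # Even step of the recurrence.
        aa = m * (b - m) * x / ((qam + m2) * (a + m2))
        d = 1.0 + aa * d
        c = 1.0 + aa / c
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = c if abs(c) > tiny else tiny
        h *= d * c
        # Odd step of the recurrence.
        aa = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))
        d = 1.0 + aa * d
        c = 1.0 + aa / c
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = c if abs(c) > tiny else tiny
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def reg_inc_beta(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    log_front = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                 + a * math.log(x) + b * math.log(1.0 - x))
    front = math.exp(log_front)
    # Use the symmetry relation where the continued fraction converges faster.
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b


def one_minus_fdist(f_ratio, df_num, df_den):
    """Cumulative F probability, i.e. 1-FDIST(f_ratio, df_num, df_den)."""
    if f_ratio <= 0.0:
        return 0.0
    x = df_num * f_ratio / (df_num * f_ratio + df_den)
    return reg_inc_beta(df_num / 2.0, df_den / 2.0, x)


# Variance ratio from the example, with nf = 30 for both models.
confidence = one_minus_fdist(1.92, 30, 30)
print(f"{confidence:.3f}")  # compare with the 0.96 quoted in the lecture
```

If SciPy is available, `scipy.stats.f.cdf(1.92, 30, 30)` should give the same number and is the more practical choice; the pure-Python version is shown only to make the calculation explicit.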

Another Example: Comparing Two Models
Previously fit data to

$$ r_p = \frac{k_1 K_2 [\mathrm{par}]}{1 + K_2 [\mathrm{par}]} \tag{3.A.1} $$

Does the following work better?

$$ r_p = \frac{k_1 K_2 [\mathrm{par}]}{1 + K_2 [\mathrm{par}]^{1.5}} \tag{3.C.1} $$

Is the difference statistically significant?

The Spreadsheet Is The
Same As In Problem 3.A:
Table 3.C.1 Part of the spreadsheet used to calculate values of k_1 and K_2 to minimize the total error

      A       B        C                                D
01    k_1= 1940 (Calculated by solver)                  2
02    K_2= 0.00188 (Calculated by solver)               0.00188
03    Conc    rate     equation 3.C.1^1.5               error
04    0       0        =k_1*K_2*A4/(1+K_2*A4^1.5)       =ABS(C4-$B4)^$D$1
05    2       10.4     =k_1*K_2*A5/(1+K_2*A5^1.5)       =ABS(C5-$B5)^$D$1
06    3.6     12.8     =k_1*K_2*A6/(1+K_2*A6^1.5)       =ABS(C6-$B6)^$D$1
07    4       23.2     =k_1*K_2*A7/(1+K_2*A7^1.5)       =ABS(C7-$B7)^$D$1
08    5.2     17.6     =k_1*K_2*A8/(1+K_2*A8^1.5)       =ABS(C8-$B8)^$D$1
09    7.8     46.4     =k_1*K_2*A9/(1+K_2*A9^1.5)       =ABS(C9-$B9)^$D$1
10    8       23.2     =k_1*K_2*A10/(1+K_2*A10^1.5)     =ABS(C10-$B10)^$D$1
11    8       46.4     =k_1*K_2*A11/(1+K_2*A11^1.5)     =ABS(C11-$B11)^$D$1
12    11      32       =k_1*K_2*A12/(1+K_2*A12^1.5)     =ABS(C12-$B12)^$D$1
13    14.4    34.4     =k_1*K_2*A13/(1+K_2*A13^1.5)     =ABS(C13-$B13)^$D$1
14    15.6    44.8     =k_1*K_2*A14/(1+K_2*A14^1.5)     =ABS(C14-$B14)^$D$1
15    15.6    63.2     =k_1*K_2*A15/(1+K_2*A15^1.5)     =ABS(C15-$B15)^$D$1
16    16      36       =k_1*K_2*A16/(1+K_2*A16^1.5)     =ABS(C16-$B16)^$D$1
17    16.6    46.4     =k_1*K_2*A17/(1+K_2*A17^1.5)     =ABS(C17-$B17)^$D$1
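
The spreadsheet formulas translate directly into Python. This is a minimal sketch, not the solver step itself: it takes k_1 and K_2 as already found by Excel's solver and uses only the 14 rows visible in Table 3.C.1, so the sum it prints is a partial total and will not match the full 32-point total error. The names `rows`, `rate_3c1`, and `cell_error` are hypothetical.

```python
# Constants shown in the spreadsheet (found there by Excel's solver).
k_1 = 1940.0
K_2 = 0.00188
exponent = 2  # cell D1: power applied to the absolute error

# (Conc, rate) pairs from rows 4-17 of Table 3.C.1; the full data set has 32 points.
rows = [
    (0, 0), (2, 10.4), (3.6, 12.8), (4, 23.2), (5.2, 17.6), (7.8, 46.4), (8, 23.2),
    (8, 46.4), (11, 32), (14.4, 34.4), (15.6, 44.8), (15.6, 63.2), (16, 36), (16.6, 46.4),
]


def rate_3c1(conc):
    """Column C: =k_1*K_2*A/(1+K_2*A^1.5)."""
    return k_1 * K_2 * conc / (1 + K_2 * conc ** 1.5)


def cell_error(conc, measured):
    """Column D: =ABS(C-$B)^$D$1."""
    return abs(rate_3c1(conc) - measured) ** exponent


partial_total = sum(cell_error(c, r) for c, r in rows)
print(f"Error summed over the {len(rows)} rows shown: {partial_total:.1f}")
```

To redo the fit rather than just evaluate it, the same `partial_total` expression could be wrapped in a function of (k_1, K_2) and handed to a minimizer, which is the role Excel's solver plays in the lecture.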

F Test To Determine Which Model Is Better
V3.A.1, the variance of equation 3.A.1, is

$$ V_{3.A.1} = \frac{4919}{32\ \text{points} - 2\ \text{parameters}} = 164 $$

V3.C.1, the variance of equation 3.C.1, is

$$ V_{3.C.1} = \frac{4576}{32\ \text{points} - 2\ \text{parameters}} = 152 $$

The ratio of variances is

$$ F_{\text{inverse}} = \frac{164}{152} = 1.07 $$

Calculate Probability Second Model Is Better From FDIST
probability = 1-FDIST(1.07, 30, 30) = 0.58
58% chance second model is better
42% probability first model is better
Note: this is not a rigorous number.

Pitfalls Of Direct Measurements
• It is not uncommon for more than one rate equation to fit the measured kinetics within the experimental uncertainties.
• Just because the data fit does not mean the rate equation is correct.
• The quality of kinetic data varies with the equipment used and the method of temperature measurement and control.
• Data taken on one apparatus are often not directly comparable to data taken on a different apparatus.

Pitfalls Continued
• It is not uncommon to observe 10-30% variations in rate taken in the same apparatus on different days.
• Usually, these variations can be traced to variations in the temperature, pressure, or flow rate in the reactor.
• The procedure used to fit the data can have a major effect on the values of the parameters obtained in the data analysis.
• The quality of the regression coefficient (r^2) does not tell you how well a model fits your data.

Class Question
• What new things did you learn today?