ChE 452 Lecture 07
Statistical Tests Of Rate Equations

Last Time: Considered Paramecium Example

Method                     Error    r^2
Lineweaver-Burk            9454     0.910
Eadie-Hofstee              5647     0.344
Nonlinear Least Squares    4919     0.905

$$ r_p = \frac{k_1 K_2 [\mathrm{par}]}{1 + K_2 [\mathrm{par}]} $$

$$ \text{Total Error} = \sum_{\text{Data}} \left( \operatorname{abs}\left( r_p - \frac{k_1 K_2 [\mathrm{par}]}{1 + K_2 [\mathrm{par}]} \right) \right)^2 $$

r^2 does not indicate goodness of fit.

Today: Statistical Analysis Of Rate Data
• Can we do a calculation to tell if one model fits the data better than another model?
• Is the result statistically significant?

Method: Calculate A Variance

$$ V_i = \frac{\sum_{\text{points}} \left( \text{experimental rate} - \text{calculated rate} \right)^2}{\text{number of samples} - \text{number of independent parameters in model}} \tag{3.B.1} $$

Substituting in equation (3.A.7) yields

$$ V_i = \frac{\text{total error from Eq. 3.A.7}}{\text{number of samples} - \text{number of parameters}} \tag{3.B.2} $$

Usually the model with the lowest variance works best!

Limitations Of Using Variance To Assess Which Model Fits Best
• Assumes error in data follows a "χ² distribution" (i.e. error is random)
• Usually a good assumption in direct rate data
• It is not good to assume 1/rate follows a χ² distribution, so one needs to be careful about linearizing data.

For Our Example

$$ V_i = \frac{4919}{32\ \text{points} - 2\ \text{parameters}} = 164 \tag{3.B.3} $$

For Our Example Continued

Eadie-Hofstee:

$$ V_i = \frac{5647}{32 - 2} = 188 \tag{3.B.4} $$

while for the Lineweaver-Burk plot:

$$ V_i = \frac{9454}{32 - 2} = 315 \tag{3.B.5} $$

The non-linear least squares fit the data best.
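The variance comparison in equations (3.B.2) through (3.B.5) can be reproduced in a few lines. What follows is a minimal Python sketch rather than anything from the lecture itself: the function name `model_variance` and the dictionary layout are illustrative assumptions, while the total errors, the 32 data points, and the 2 parameters are the values quoted on the slides.

```python
def model_variance(total_error, n_points, n_params):
    """Variance per equation (3.B.2): total error / (samples - parameters).

    The function name and signature are hypothetical, chosen for illustration.
    """
    n_f = n_points - n_params
    if n_f <= 0:
        raise ValueError("need more data points than parameters")
    return total_error / n_f


# Total errors for the paramecium example, as quoted in the lecture.
total_errors = {
    "Lineweaver-Burk": 9454,
    "Eadie-Hofstee": 5647,
    "Nonlinear least squares": 4919,
}

# All three fits use 32 data points and 2 adjustable parameters.
variances = {
    name: model_variance(err, n_points=32, n_params=2)
    for name, err in total_errors.items()
}

# Print the models from lowest to highest variance.
for name, v in sorted(variances.items(), key=lambda kv: kv[1]):
    print(f"{name}: V = {v:.0f}")

best = min(variances, key=variances.get)
print("Lowest variance:", best)
```

Because every fit here has the same number of points and parameters, ranking by variance gives the same order as ranking by total error; the division only matters when the models differ in parameter count.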
Subtlety
• We are never sure whether the model with the lowest variance is the best one.
• Instead, we can only say that it fits the data best.
• The model that fits the data best is usually the best one.
• Still, there is always the possibility that a model fits better because the errors in the data line up to make it seem better.

Next: Using An F-Test To Tell If the Difference Is Statistically Significant
We want to do a statistical test to calculate the probability that one model fits better than another.

Using An F-Test To Tell If the Difference Is Statistically Significant
Method: Compute Finverse, given by

$$ F_{\text{inverse}} = \frac{\text{variance in weaker model}}{\text{variance in better model}} \tag{3.B.6} $$

If Finverse is large enough, the better model is statistically better.

Statistics: Gives A Value of Finverse That Is "Large Enough"

Table 3.B.2 Values of Finverse as a function of nf when both models have the same value of nf

        Significance Level
nf      90%      95%      99%      99.5%
1       39.86    161.5    4052     16212
2       9.0      19       99       199
3       5.39     9.28     29.46    47
4       4.11     6.39     15.98    23

$$ n_f = \text{number of data points} - \text{parameters in the model} \tag{3.B.8} $$

To read the table: if nf = 4, you need Finverse to be at least 15.98 to be 99% sure that the better model really is better. There will still be a 1% chance that the difference is caused by random errors in the data.

Assumptions In Using the Values Of F In Table 3.B.2
• Models are independent (non-nested)
• χ² distribution of errors
• Not mathematically rigorous in our example since the models are not independent! (Gives a small error in Finverse)

FDIST Gives The Probability That A Given Model Is Better

% confidence = 1 - FDIST(Finverse, nf for better model, nf for worse model)   (3.B.9)

Not mathematically rigorous, but close.

Example: Is The Non-Linear Least Squares Better Than Lineweaver-Burk?
Variance Lineweaver-Burk = 315 (equation 3.B.5)
Variance non-linear = 164 (equation 3.B.3)
nf = 30

$$ F_{\text{inverse}} = \frac{315}{164} = 1.92 $$

I used Excel to calculate 1 - FDIST(1.92, 30, 30) = 0.96
• 96% sure non-linear least squares fits better
• 4% chance the difference is due to noise in the data
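The FDIST calculation in the example above can be cross-checked without Excel. The sketch below is an illustration under stated assumptions, not the lecture's method: it integrates the standard F-distribution density with Simpson's rule, so that `f_cdf(F, nf, nf)` plays the role of `1 - FDIST(F, nf, nf)`. The helper names `f_pdf` and `f_cdf` are hypothetical; the variances 315 and 164 and nf = 30 come from the slides.

```python
import math


def f_pdf(x, d1, d2):
    """Density of the F distribution with (d1, d2) degrees of freedom."""
    if x <= 0.0:
        return 0.0  # correct at x = 0 only when d1 > 2 (enforced in f_cdf)
    log_beta = (math.lgamma(d1 / 2) + math.lgamma(d2 / 2)
                - math.lgamma((d1 + d2) / 2))
    log_pdf = (0.5 * d1 * math.log(d1 / d2)
               + (0.5 * d1 - 1.0) * math.log(x)
               - 0.5 * (d1 + d2) * math.log(1.0 + d1 * x / d2)
               - log_beta)
    return math.exp(log_pdf)


def f_cdf(x, d1, d2, n_steps=2000):
    """P(F <= x), by composite Simpson integration of the density on [0, x]."""
    if d1 <= 2:
        raise ValueError("this sketch assumes d1 > 2 (finite density at 0)")
    if n_steps % 2:
        raise ValueError("Simpson's rule needs an even number of steps")
    if x <= 0.0:
        return 0.0
    h = x / n_steps
    total = f_pdf(0.0, d1, d2) + f_pdf(x, d1, d2)
    for i in range(1, n_steps):
        weight = 4 if i % 2 else 2  # odd interior nodes 4, even nodes 2
        total += weight * f_pdf(i * h, d1, d2)
    return total * h / 3.0


# Lineweaver-Burk (weaker) versus nonlinear least squares (better).
n_f = 32 - 2
f_inverse = 315 / 164
confidence = f_cdf(f_inverse, n_f, n_f)
print(f"F_inverse = {f_inverse:.2f}, confidence = {confidence:.2f}")
```

A quick sanity check on the integration: when both models have the same nf, a ratio of exactly 1 should give 50% confidence, since neither model is favored.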
Another Example: Comparing Two Models
Previously fit data to

$$ r_p = \frac{k_1 K_2 [\mathrm{par}]}{1 + K_2 [\mathrm{par}]} \tag{3.A.1} $$

Does the following work better?

$$ r_p = \frac{k_1 K_2 [\mathrm{par}]}{1 + K_2 [\mathrm{par}]^{1.5}} \tag{3.C.1} $$

Is the difference statistically significant?

The Spreadsheet Is The Same As In Problem 3.A:

Table 3.C.1 Part of the spreadsheet used to calculate values of k_1 and K_2 to minimize the total error

      A       B         C                                  D
01    k_1=    1940      (Calculated by solver)             2
02    K_2=    0.00188   (Calculated by solver)             0.00188
03    Conc    rate      equation 3.C.1^1.5                 error
04    0       0         =k_1*K_2*A4/(1+K_2*A4^1.5)         =ABS(C4-$B4)^$D$1
05    2       10.4      =k_1*K_2*A5/(1+K_2*A5^1.5)         =ABS(C5-$B5)^$D$1
06    3.6     12.8      =k_1*K_2*A6/(1+K_2*A6^1.5)         =ABS(C6-$B6)^$D$1
07    4       23.2      =k_1*K_2*A7/(1+K_2*A7^1.5)         =ABS(C7-$B7)^$D$1
08    5.2     17.6      =k_1*K_2*A8/(1+K_2*A8^1.5)         =ABS(C8-$B8)^$D$1
09    7.8     46.4      =k_1*K_2*A9/(1+K_2*A9^1.5)         =ABS(C9-$B9)^$D$1
10    8       23.2      =k_1*K_2*A10/(1+K_2*A10^1.5)       =ABS(C10-$B10)^$D$1
11    8       46.4      =k_1*K_2*A11/(1+K_2*A11^1.5)       =ABS(C11-$B11)^$D$1
12    11      32        =k_1*K_2*A12/(1+K_2*A12^1.5)       =ABS(C12-$B12)^$D$1
13    14.4    34.4      =k_1*K_2*A13/(1+K_2*A13^1.5)       =ABS(C13-$B13)^$D$1
14    15.6    44.8      =k_1*K_2*A14/(1+K_2*A14^1.5)       =ABS(C14-$B14)^$D$1
15    15.6    63.2      =k_1*K_2*A15/(1+K_2*A15^1.5)       =ABS(C15-$B15)^$D$1
16    16      36        =k_1*K_2*A16/(1+K_2*A16^1.5)       =ABS(C16-$B16)^$D$1
17    16.6    46.4      =k_1*K_2*A17/(1+K_2*A17^1.5)       =ABS(C17-$B17)^$D$1

F Test To Determine Which Model Is Better

V_{3.A.1}, the variance of equation 3.A.1, is

$$ V_{3.A.1} = \frac{4919}{32\ \text{points} - 2\ \text{parameters}} = 164 $$

V_{3.C.1}, the variance of equation 3.C.1, is

$$ V_{3.C.1} = \frac{4576}{32\ \text{points} - 2\ \text{parameters}} = 152 $$

The ratio of the variances is

$$ F_{\text{inverse}} = \frac{164}{152} = 1.07 $$

Calculate Probability Second Model Is Better From FDIST
probability = 1 - FDIST(1.07, 30, 30) = 0.58
• 58% chance the second model is better
• 42% probability the first model is better
Note: this is not a rigorous number.

Pitfalls Of Direct Measurements
• It is not uncommon for more than one rate equation to fit the measured kinetics within the experimental uncertainties.
• Just because the data fit does not mean the rate equation is correct.
• The quality of kinetic data varies with the equipment used and the method of temperature measurement and control.
• Data taken on one apparatus are often not directly comparable to data taken on a different apparatus.

Pitfalls Continued
• It is not uncommon to observe 10-30% variations in rates measured in the same apparatus on different days.
• Usually, these variations can be traced to variations in the temperature, pressure, or flow rate in the reactor.
• The procedure used to fit the data can have a major effect on the values of the parameters obtained in the data analysis.
• The regression coefficient (r^2) does not tell you how well a model fits your data.

Class Question
What did you learn new today?