Quantum Topological Molecular Descriptors in QSAR Analysis of Organophosphorus Compounds

Y. Paukku1, G. Hill1*

1 Interdisciplinary Center for Nanotoxicity, Department of Chemistry, Jackson State University, 1400 J. R. Lynch Street, P. O. Box 17910, Jackson, MS 39217, USA

* To whom correspondence should be addressed. E-mail address: [email protected]. Phone: (601)-979-1699. Fax: (601)-979-7823.

Supplementary material

Figure I. Multiple leave-N-out cross-validation plot (10 times reordered data set).

Table I. Results for leave-N-out

N                 1      2      3      4      5      6      7      8      9      10
Single Q2LNO      0.81   0.71   0.69   0.70   0.67   0.69   0.71   0.65   0.67   0.67    Av: 0.69
10-times Q2LNO    0.73   0.71   0.69   0.68   0.66   0.67   0.67   0.65   0.65   0.65

Table II. Ten y-randomizations for the real model.

Randomization    Q2       R2
1                0.393    0.099
2                0.164    0.187
3                0.267    0.145
4                0.122    0.163
5                0.500    0.051
6                0.297    0.187
7                0.389    0.040
8                0.187    0.116
9                0.612    0.008
10               0.309    0.148
Maximum          0.612    0.187
Average          0.324    0.114
Real model       0.737    0.870

Table III. Correlation matrix of the variables in the model

            Pcharge    ρR1      K(r)X
Pcharge     1.000      0.055    0.045
ρR1                    1.000    0.131
K(r)X                           1.000

Analysis of descriptor-Y relations

1) Y = -3.167*Pcharge + 4.395 (n=20; R2=0.083; s=1.026; F=7.330; Q2=0.046; SPRESS=1.188)

Analysis of Variance
Source        DF    Sum of Squares    Mean Square    F-Ratio    Prob Level    Power (5%)
Intercept     1     242.4384          242.4384
Slope         1     7.714039          7.714039       7.3301     0.0144        0.7260
Error         18    18.94283          1.05238
Adj. Total    19    26.65687          1.402993
Total         20    269.0953
s = Square Root(1.05238) = 1.025856; T-Value = -2.7074

2) Y = 195.241*ρR1 - 32.094 (n=20; R2=0.579; s=0.790; F=24.707; Q2=0.463)

Analysis of Variance
Source        DF    Sum of Squares    Mean Square    F-Ratio    Prob Level    Power (5%)
Intercept     1     242.4384          242.4384
Slope         1     15.4216           15.4216        24.7069    0.0001        0.9968
Error         18    11.23528          0.6241821
Adj. Total    19    26.65687          1.402993
Total         20    269.0953
s = Square Root(0.6241821) = 0.790052; T-Value = 4.9706

3) Y = -34.712*K(r)X + 7.797 (n=20; R2=0.259; s=1.047; F=6.297; Q2=0.088)

Analysis of Variance
Source        DF    Sum of Squares    Mean Square    F-Ratio    Prob Level    Power (5%)
Intercept     1     242.4384          242.4384
Slope         1     6.908604          6.908604       6.2970     0.0219        0.6606
Error         18    19.74827          1.097126
Adj. Total    19    26.65687          1.402993
Total         20    269.0953
s = Square Root(1.097126) = 1.047438; T-Value = -2.5094

Errors and t-test statistics for regression coefficients for model with all 20 samples:

R2                          0.8697
Adj R2                      0.8452
Coefficient of Variation    0.1338
Mean Square Error           0.2171545
Square Root of MSE          0.4659984
Ave Abs Pct Error           12.744

Variable    Count    Mean         Standard Deviation    Minimum      Maximum
Pcharge     20       0.2885081    0.2012068             -0.080117    0.640682
ρR1         20       0.1822146    4.614406E-03          0.1706071    0.1882551
K(r)X       20       0.1243126    1.737173E-02          0.0959985    0.1712337
Y           20       3.481655     1.18448               1.1791       5.421

Independent    Regression          Standard       T-Value to test    Prob      Reject H0    Power of
Variable       Coefficient b(i)    Error Sb(i)    H0:B(i)=0          Level     at 5%?       Test at 5%
Intercept      -14.1527            5.3177         -2.661             0.0171    Yes          0.7052
Pcharge        -3.0611             0.5785         -5.292             0.0001    Yes          0.9986
ρR1            122.3920            26.4442        4.628              0.0003    Yes          0.9912
K(r)X          -30.4404            6.9844         -4.358             0.0005    Yes          0.9832

Estimated Model: Y = -14.153 - 3.061*Pcharge + 122.392*ρR1 - 30.440*K(r)X

Note: The T-Value used to calculate the 95% confidence limits of the regression coefficients was 2.120.
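The t-values and 95% confidence limits of the 20-sample model can be cross-checked from the printed coefficients and standard errors. The following is a minimal sketch, assuming the limits are formed in the conventional way as b(i) ± t·Sb(i) with the quoted T-value of 2.120; the names `coefficients`, `T_CRITICAL` and `coefficient_summary` are hypothetical and not part of the original analysis.

```python
# Coefficients b(i) and standard errors Sb(i) as printed for the 20-sample model.
coefficients = {
    "Intercept": (-14.1527, 5.3177),
    "Pcharge": (-3.0611, 0.5785),
    "rho_R1": (122.3920, 26.4442),
    "K(r)X": (-30.4404, 6.9844),
}

# T-value quoted in the note (two-sided 95% level, 16 error degrees of freedom).
T_CRITICAL = 2.120


def coefficient_summary(b, se, t_crit=T_CRITICAL):
    """Return (t-value, lower limit, upper limit) for one coefficient.

    Assumption: the interval is the conventional b +/- t_crit * se.
    """
    half_width = t_crit * se
    return b / se, b - half_width, b + half_width


summary = {
    name: coefficient_summary(b, se) for name, (b, se) in coefficients.items()
}

for name, (t_value, lower, upper) in summary.items():
    print(f"{name:<10} t = {t_value:7.3f}   95% C.L. = [{lower:9.4f}, {upper:9.4f}]")
```

Because the quoted T-value is rounded to three decimals, the recomputed limits should be expected to match the tabulated ones only to about the second decimal place.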
Regression Coefficients Section (model with all 20 samples)

Independent    Lower       Upper       Standardized
Variable       95% C.L.    95% C.L.    Coefficient
Intercept      -25.4257    -2.8797     0.0000
Pcharge        -4.2874     -1.8348     -0.5200
ρR1            66.3329     178.4512    0.4768
K(r)X          -45.2467    -15.6340    -0.4464

Analysis of Variance Section (model with all 20 samples)

Source             DF    R2        Sum of Squares    Mean Square    F-Ratio    Prob Level    Power (5%)
Intercept          1               242.4384          242.4384
Model              3     0.8697    23.1824           7.727467       35.585     0.0000        1.0000
Error              16    0.1303    3.474472          0.2171545
Total(Adjusted)    19    1.0000    26.65687          1.402993

Analysis of Variance Detail Section (model with all 20 samples)

Model Term         DF    R2        Sum of Squares    Mean Square    F-Ratio    Prob Level    Power (5%)
Intercept          1               242.4384          242.4384
Model              3     0.8697    23.1824           7.727467       35.585     0.0000        1.0000
Pcharge            1     0.2281    6.080916          6.080916       28.003     0.0001        0.9986
ρR1                1     0.1745    4.651731          4.651731       21.421     0.0003        0.9912
K(r)X              1     0.1547    4.124836          4.124836       18.995     0.0005        0.9832
Error              16    0.1303    3.474472          0.2171545
Total(Adjusted)    19    1.0000    26.65687          1.402993

Normality Tests Section (model with all 20 samples)

Test Name              Test Value    Prob Level    Reject H0 At Alpha = 20%?
Shapiro Wilk           0.9306        0.158365      Yes
Anderson Darling       0.5645        0.144047      Yes
D'Agostino Skewness    0.0647        0.948414      No
D'Agostino Kurtosis    -1.8500       0.064318      Yes
D'Agostino Omnibus     3.4266        0.180272      Yes

Predicted Values with Confidence Limits of Means and Residual Report (model with all 20 samples)

       Actual    Predicted    Standard Error    95% Lower Conf.    95% Upper Conf.                Absolute         Sqrt(MSE)
Row    Y         Y            Of Predicted      Limit Of Mean      Limit Of Mean      Residual    Percent Error    Without This Row
1      3.579     3.839        0.158             3.503              4.175              -0.260      7.275            0.476
2      4.533     4.402        0.249             3.873              4.930              0.131       2.900            0.480
3      4.102     4.564        0.164             4.215              4.912              -0.462      11.261           0.464
4      2.262     1.629        0.306             0.981              2.277              0.633       27.976           0.430
5      1.179     1.416        0.306             0.768              2.065              -0.237      20.115           0.474
6      1.807     2.096        0.232             1.604              2.589              -0.290      16.050           0.473
7      3.372     3.196        0.132             2.915              3.476              0.176       5.211            0.479
8      4.553     4.024        0.119             3.773              4.275              0.529       11.622           0.460
9      4.086     3.829        0.143             3.526              4.132              0.257       6.285            0.476
10     3.667     3.609        0.108             3.380              3.839              0.057       1.560            0.481
11     4.246     4.399        0.138             4.106              4.692              -0.153      3.602            0.480
12     2.056     2.488        0.144             2.182              2.794              -0.432      21.003           0.467
13     5.421     4.937        0.201             4.511              5.363              0.484       8.932            0.461
14     2.107     2.891        0.132             2.612              3.170              -0.784      37.218           0.433
15     2.782     2.170        0.349             1.430              2.910              0.612       22.001           0.418
16     2.473     2.767        0.326             2.076              3.458              -0.294      11.895           0.469
17     4.333     4.743        0.214             4.289              5.196              -0.410      9.456            0.466
18     3.774     3.336        0.129             3.062              3.610              0.438       11.613           0.467
19     5.235     4.800        0.194             4.390              5.211              0.435       8.309            0.465
20     4.067     4.498        0.154             4.171              4.825              -0.431      10.592           0.467

Regression Diagnostics Section (model with all 20 samples)

       Standardized                Hat
Row    Residual        RStudent    Diagonal    Cook's D    Dffits     CovRatio
1      -0.5940         -0.5816     0.1155      0.0115      -0.2102    1.3387
2      0.3339          0.3244      0.2860      0.0112      0.2053     1.7632
3      -1.0592         -1.0635     0.1243      0.0398      -0.4007    1.1052
4      1.7996          1.9510      0.4304      0.6118      1.6960     0.9198
5      -0.6747         -0.6628     0.4310      0.0862      -0.5768    2.0269
6      -0.7179         -0.7066     0.2489      0.0427      -0.4067    1.5119
7      0.3932          0.3826      0.0807      0.0034      0.1133     1.3545
8      1.1741          1.1892      0.0647      0.0238      0.3127     0.9652
9      0.5790          0.5666      0.0942      0.0087      0.1828     1.3131
10     0.1262          0.1222      0.0538      0.0002      0.0291     1.3627
11     -0.3436         -0.3339     0.0878      0.0028      -0.1036    1.3777
12     -0.9747         -0.9730     0.0961      0.0253      -0.3173    1.1212
13     1.1517          1.1645      0.1861      0.0758      0.5568     1.1251
14     -1.7541         -1.8898     0.0799      0.0668      -0.5567    0.5988
15     1.9826          2.2102      0.5610      1.2556      2.4984     0.9548
16     -0.8836         -0.8773     0.4897      0.1874      -0.8595    2.0769
17     -0.9898         -0.9891     0.2109      0.0654      -0.5113    1.2741
18     0.9788          0.9775      0.0767      0.0199      0.2818     1.0952
19     1.0262          1.0281      0.1726      0.0549      0.4696     1.1916
20     -0.9798         -0.9785     0.1097      0.0296      -0.3434    1.1352

Plots Section (model with all 20 samples): Histogram of Residuals of Y; Normal Probability Plot of Residuals of Y; Residuals of Y vs Pcharge; Residuals of Y vs Row; Residuals of Y vs ρR1; Residuals of Y vs K(r)X.

Errors and t-test statistics for regression coefficients for model with 16 samples:

R2                          0.9005
Adj R2                      0.8757
Coefficient of Variation    0.1175
Mean Square Error           0.1798956
Square Root of MSE          0.424141
Ave Abs Pct Error           10.580

Variable    Count    Mean         Standard Deviation    Minimum      Maximum
Pcharge     16       0.305139     0.2010285             -0.041429    0.640682
ρR1         16       0.1825122    4.987871E-03          0.1706071    0.1882551
K(r)X       16       0.1187495    8.273194E-03          0.0959985    0.1295073
Y           16       3.610556     1.202886              1.1791       5.421

Independent    Regression          Standard       T-Value to test    Prob      Reject H0    Power of
Variable       Coefficient b(i)    Error Sb(i)    H0:B(i)=0          Level     at 5%?       Test at 5%
Intercept      -5.6115             6.4000         -0.877             0.3978    No           0.1276
Pcharge        -3.2546             0.5836         -5.577             0.0001    Yes          0.9991
ρR1            89.8508             27.9870        3.210              0.0075    Yes          0.8378
K(r)X          -52.0736            15.9346        -3.268             0.0067    Yes          0.8503

Estimated Model: Y = -5.612 - 3.255*Pcharge + 89.851*ρR1 - 52.073*K(r)X

Regression Coefficients Section (model with 16 samples)

Independent    Regression     Standard    Lower       Upper       Standardized
Variable       Coefficient    Error       95% C.L.    95% C.L.    Coefficient
Intercept      -5.6115        6.4000      -19.5559    8.3329      0.0000
Pcharge        -3.2546        0.5836      -4.5262     -1.9830     -0.5439
ρR1            89.8508        27.9870     28.8722     150.8293    0.3726
K(r)X          -52.0736       15.9346     -86.7921    -17.3551    -0.3582
Note: The T-Value used to calculate these confidence limits was 2.179.

Analysis of Variance Section (model with 16 samples)

Source             DF    R2        Sum of Squares    Mean Square    F-Ratio    Prob Level    Power (5%)
Intercept          1               208.5779          208.5779
Model              3     0.9005    19.54526          6.515087       36.216     0.0000        1.0000
Error              12    0.0995    2.158747          0.1798956
Total(Adjusted)    15    1.0000    21.70401          1.446934

Analysis of Variance Detail Section (model with 16 samples)

Model Term         DF    R2        Sum of Squares    Mean Square    F-Ratio    Prob Level    Power (5%)
Intercept          1               208.5779          208.5779
Model              3     0.9005    19.54526          6.515087       36.216     0.0000        1.0000
Pcharge            1     0.2578    5.594337          5.594337       31.098     0.0001        0.9991
ρR1                1     0.0854    1.854171          1.854171       10.307     0.0075        0.8378
K(r)X              1     0.0885    1.921203          1.921203       10.680     0.0067        0.8503
Error              12    0.0995    2.158747          0.1798956
Total(Adjusted)    15    1.0000    21.70401          1.446934

Normality Tests Section (model with 16 samples)

Test Name              Test Value    Prob Level    Reject H0 At Alpha = 20%?
Shapiro Wilk           0.9354        0.296507      No
Anderson Darling       0.3552        0.460112      No
D'Agostino Skewness    -0.0485       0.961284      No
D'Agostino Kurtosis    -1.7839       0.074447      Yes
D'Agostino Omnibus     3.1845        0.203468      No

Predicted Values with Confidence Limits of Means and Residual Report (model with 16 samples)

       Actual    Predicted    Standard Error    95% Lower Conf.    95% Upper Conf.                Absolute         Sqrt(MSE)
Row    Y         Y            Of Predicted      Limit Of Mean      Limit Of Mean      Residual    Percent Error    Without This Row
1      3.579     3.690        0.161             3.339              4.042              -0.112      3.118            0.442
2      4.533     4.641        0.353             3.872              5.411              -0.108      2.383            0.439
3      4.102     4.489        0.171             4.117              4.861              -0.387      9.439            0.424
4      2.262     1.842        0.294             1.201              2.483              0.420       18.572           0.407
5      1.179     1.610        0.292             0.973              2.247              -0.431      36.520           0.405
6      1.807     1.822        0.270             1.233              2.411              -0.016      0.866            0.443
7      3.372     3.133        0.131             2.847              3.418              0.239       7.086            0.436
8      4.086     3.726        0.141             3.419              4.034              0.360       8.804            0.428
9      3.667     3.565        0.108             3.329              3.801              0.101       2.764            0.442
10     4.246     4.298        0.141             3.992              4.604              -0.052      1.226            0.443
11     5.421     5.062        0.206             4.614              5.510              0.359       6.615            0.425
12     2.107     2.643        0.201             2.205              3.081              -0.537      25.467           0.403
13     4.333     4.683        0.226             4.190              5.175              -0.350      8.075            0.425
14     3.774     3.202        0.141             2.895              3.508              0.573       15.171           0.403
15     5.235     4.739        0.199             4.306              5.171              0.497       9.487            0.409
16     4.067     4.624        0.182             4.228              5.019              -0.557      13.683           0.402

Regression Diagnostics Section (model with 16 samples)

       Standardized                Hat
Row    Residual        RStudent    Diagonal    Cook's D    Dffits     CovRatio
1      -0.2844         -0.2732     0.1447      0.0034      -0.1124    1.6117
2      -0.4600         -0.4444     0.6934      0.1196      -0.6682    4.3019
3      -0.9972         -0.9970     0.1621      0.0481      -0.4385    1.1959
4      1.3752          1.4346      0.4812      0.4385      1.3815     1.3746
5      -1.4012         -1.4670     0.4750      0.4442      -1.3955    1.3202
6      -0.0478         -0.0458     0.4060      0.0004      -0.0379    2.3823
7      0.5922          0.5754      0.0953      0.0092      0.1867     1.3903
8      0.8994          0.8916      0.1107      0.0252      0.3146     1.2047
9      0.2471          0.2372      0.0651      0.0011      0.0626     1.4844
10     -0.1301         -0.1246     0.1099      0.0005      -0.0438    1.5822
11     0.9666          0.9637      0.2350      0.0718      0.5342     1.3388
12     -1.4368         -1.5118     0.2249      0.1498      -0.8144    0.8588
13     -0.9748         -0.9726     0.2839      0.0942      -0.6124    1.4219
14     1.4312          1.5046      0.1101      0.0634      0.5293     0.7529
15     1.3253          1.3733      0.2193      0.1233      0.7278     0.9633
16     -1.4520         -1.5312     0.1834      0.1184      -0.7256    0.8008

Plots Section (model with 16 samples): Histogram of Residuals of Y; Normal Probability Plot of Residuals of Y; Residuals of Y vs Pcharge; Residuals of Y vs Row; Residuals of Y vs ρR1; Residuals of Y vs K(r)X.
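The regression diagnostics reported for the 16-sample model can be screened for influential observations. The sketch below is illustrative only: the cut-offs (hat diagonal above 2p/n, Cook's D above 1, |Dffits| above 2·sqrt(p/n)) are common rules of thumb assumed here, not criteria stated in this supplement, and the function name `flag_rows` is hypothetical. The values are transcribed from the Hat Diagonal, Cook's D and Dffits columns of that model.

```python
import math

# Diagnostics for the 16-sample model, rows 1..16, as reported in this supplement.
hat = [0.1447, 0.6934, 0.1621, 0.4812, 0.4750, 0.4060, 0.0953, 0.1107,
       0.0651, 0.1099, 0.2350, 0.2249, 0.2839, 0.1101, 0.2193, 0.1834]
cooks_d = [0.0034, 0.1196, 0.0481, 0.4385, 0.4442, 0.0004, 0.0092, 0.0252,
           0.0011, 0.0005, 0.0718, 0.1498, 0.0942, 0.0634, 0.1233, 0.1184]
dffits = [-0.1124, -0.6682, -0.4385, 1.3815, -1.3955, -0.0379, 0.1867, 0.3146,
          0.0626, -0.0438, 0.5342, -0.8144, -0.6124, 0.5293, 0.7278, -0.7256]

N_PARAMS = 4  # intercept plus the three descriptors


def flag_rows(hat, cooks_d, dffits, n_params):
    """Return {row number: [criteria exceeded]} using assumed rule-of-thumb cut-offs."""
    n = len(hat)
    hat_cut = 2 * n_params / n                # leverage cut-off 2p/n
    dffits_cut = 2 * math.sqrt(n_params / n)  # Dffits cut-off 2*sqrt(p/n)
    flags = {}
    for i in range(n):
        reasons = []
        if hat[i] > hat_cut:
            reasons.append("hat diagonal")
        if cooks_d[i] > 1.0:
            reasons.append("Cook's D")
        if abs(dffits[i]) > dffits_cut:
            reasons.append("Dffits")
        if reasons:
            flags[i + 1] = reasons  # rows are numbered from 1 in the tables
    return flags


flags = flag_rows(hat, cooks_d, dffits, N_PARAMS)
for row, reasons in flags.items():
    print(f"row {row}: {', '.join(reasons)}")

# Sanity check on the transcription: hat diagonals sum to the number of parameters.
print(f"sum of hat diagonals = {sum(hat):.4f}")
```

With these assumed cut-offs, row 2 is flagged for leverage and rows 4 and 5 for Dffits, while no row exceeds Cook's D = 1. Stricter or looser thresholds would change the flagged set, so the output is a screening aid rather than a verdict on any observation.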