Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Canadian system of soil classification wikipedia , lookup
Soil salinity control wikipedia , lookup
Crop rotation wikipedia , lookup
Soil respiration wikipedia , lookup
Terra preta wikipedia , lookup
Soil compaction (agriculture) wikipedia , lookup
Soil food web wikipedia , lookup
No-till farming wikipedia , lookup
Sustainable agriculture wikipedia , lookup
Soil microbiology wikipedia , lookup
Total organic carbon wikipedia , lookup
Comparison of Methods for the Assessment of Soil Organic Carbon Using Visible/Near-Infrared Reflectance Spectroscopy Soil and Water Science Department Gustavo M. Vasques, Sabine Grunwald, and James O. Sickman University of Florida 2169 McCarty Hall, P.O. Box 110290 Gainesville, FL 32611-0290 Soil and Water Science Department, University of Florida Phone: 352-392-1951 ext. 233 Fax: 352-392-3902 E-mail: [email protected] Introduction Study Area Results In the last decades, models to predict soil properties have become more accurate and less costly. Advances in information technology and the development of new sensors and instruments have facilitated the collection and analysis of data, making possible the formulation of more complex models. Carbon is of great importance to soils. It has a strong relationship with soil organic matter, influencing the soil physical, chemical and biological processes. In addition, soil is a potential reservoir to sequester atmospheric CO2 and mitigate global warming. Hence, the analysis of the distribution and dynamics of soil carbon is an essential requirement for sustainable land management. Visible/near-infrared spectroscopy (VNIRS) is a fast, cheap and accurate alternative for the investigation of soil properties, and is gradually becoming recognized as a powerful analytical tool in soil science. Study area: Santa Fe River watershed (3,585 km2) in northcentral Florida. Dominant soil orders: Ultisols (47%), Spodosols (27%), and Entisols (17%). Land use/land cover: Pine plantation (30%), grassland and crops (29%), upland forest (11%) and wetlands (14%). Sampling design: Composite sampling across 4 depth intervals (0-30, 30-60, 60-120, and 120-180 cm) at 141 sites distributed across different land uses and soil types in a stratified random design. Overall, PLSR gave the most accurate predictions of Log(TOC). Some advantages of PLSR are: rapidness, ease of use, and flexibility to deal with correlated and missing data. Soil Spectra 1 Reflectance (%) 0.50 0.40 0.30 0.20 0.10 0.00 359 659 959 1259 1559 1859 2159 2459 1559 1859 2159 2459 2 Log(1/R) 1.60 Objectives 1.20 0.80 0.40 359 659 959 1259 Search window 959 1259 1559 1859 2159 3, 5, 7, and 9 2nd-order polynomial 3, 5, 7, and 9 3rd-order polynomial 5, 7, and 9 nd 2 -order polynomial 3, 5, 7, and 9 3rd-order polynomial 5, 7, and 9 Standard Normal Variate (SNV) * Savitzky-Golay smoothing, and averaging, were used as a standard preparation of the soil spectral curves to reduce noise and match the resolution of the instrument. This standard curve was used as the input to all other pre-processing transformations. 1559 SMLR-SNV 1859 PCR-SG-1D 5 2159 1 2459 PLSR-SG-1D 5 RT-NGD 1 5 3 RT-NGD 2 3 4 5 5 2 3 (1) SG-1D with 3 y = 0.8933x + 0.3474 R2 = 0.8549 5 y = 0.8288x + 0.5867 R 2 = 0.7606 2 2 3 4 5 2 Observed 5 Observed 1599 1899 2199 0.02 0.01 0.00 -0.01 359 659 959 1259 1559 1859 2159 2459 -0.02 Mean Mean - 2s Std. Deviation 0.040 0.075 0.028 0.056 Number of Predictors1/PCs2/Terminal Nodes3 Minimum Maximum Mean Std. Deviation 6 35 19 7 7 17 12 2 6 13 7 2 3 22 10 5 All the methods were sensitive to the regions of absorption features of C-H, O-H and H2O. Except for RT, all methods included variables in the absorbance region of N-H. ANN: The ANN method was performed using SA pre-processing. A single-layer perceptron was used because it approximates a linear least-squares estimator. An exhaustive comparison among different transfer functions, learning rules and numbers of epochs was performed to identify the best combination of learning parameters. Comparative results: ANN outperformed all the Method Rv2 other methods when they were calibrated using SMLR 0.841 0.830 the same dataset (SA). The main advantage of PCR ANN is its flexibility to adjust to any dataset. The PLSR 0.854 0.675 main drawbacks are the non-transparency, the RT 0.878 difficulty to find the optimum set of parameters ANN and the long learning period. PCR-SA SMLR-SA 5 PLSR-SA 3 4 3 3 4 Observed 5 3 2 3 4 Observed 5 5 4 3 y = 0.8723x + 0.4188 R 2 = 0.8543 2 2 2 ANN-SA 5 4 y = 0.8436x + 0.5237 R 2 = 0.829 y = 0.8642x + 0.4325 R 2 = 0.8413 2 RT-SA 5 5 4 Predicted 1299 Predicted 999 Predicted 699 Predicted 399 Predicted st S-G 1 Derivative Norris Derivative SMLR1 PCR2 PLSR2 RT3 4 Observed ANN vs. Other Methods 0.00 Results Method 3 4 polynomial and window of size 9; (2) NGD with window of size 5. 0.01 -0.02 SMLR: The best pre-processing transformations were Log(1/R) and SNV, both with a Rv2 of 0.854. The Log(1/R) model selected 14 predictors, while the SNV model selected 23. PCR: The best transformations were SG-1D with 1st or 2nd-order polynomial and window of size 9 (Rv2 = 0.834), using 13 principal components (PCs), followed by SA (11 PCs, Rv2 = 0.830). For the SG transformations, the degree of the derivative and the size of the search window were more sensitive factors than the order of the polynomial. PLSR: Like PCR, the best transformations were SG-1D with 1st or 2nd-order polynomial and window of size 9 (7 PCs, Rv2 = 0.855), followed by SA (8 PCs, Rv2 = 0.854). RT: The best pre-processing transformation was NGD with window of size 5, followed by NGD with window of size 7 (Rc2 = 0.739). Rv2(1)/Rc2(2) Minimum Maximum Mean 0.656 0.854 0.814 0.560 0.834 0.770 0.741 0.855 0.830 0.493 0.754 0.659 1st-order 3 2 2 5 Observed 0.02 -0.01 4 4 y = 0.8825x + 0.3833 R 2 = 0.8318 2 2 4 3 y = 0.8461x + 0.5158 R 2 = 0.8533 y = 0.8021x + 0.6639 R 2 = 0.8529 3 4 Predicted 4 Predicted Predicted Predicted 3 2 (1) Savitzky-Golay smoothing, and averaging; (2) Log(1/R); (3) SNV transformation; (4) SG-1D with 1st-order polynomial and window of size 9; (5) NGD with window of size 5. SMLR1 PCR1 PLSR1 RT2 1st-order polynomial 3, 5, 7, and 9 Savitzky-Golay Second Derivative (SG-2D) SMLR-Log(1/R) 4 2459 Wavelength (nm) Method By the maximum By the mean By the range 1259 Multicollinearity and missing data are potential problems in SMLR, but they were not observed in this analysis. Linearity is assumed in SMLR, and to some extent in PCR and PLSR. Alternatively, non-parametric methods are more flexible to deal with non-linear relationships. In this study, RT was not well suited for the estimation of Log(TOC) using VNIRS, as it predicted discontinuous values and produced the worst results. 2 Mean + 2s 9, and 210 Baseline Correction Kubelka-Munk Transformation (K-M) Log(1/Reflectance) Savitzky-Golay First Derivative (SG-1D) 959 Wavelength (nm) Predicted SNV 659 5 1 Norris Gap Derivative (NGD) PLSR-SG-1D 659 Observed Lab analysis/spectroscopy: Total organic carbon (TOC) was measured with a Thermo Electron FlashEA Elemental Analyzer; VNIR spectra were derived with an ASD QualitySpec Pro spectroradiometer (350-2500 nm). Pre-treatment of TOC: Log-normalization using base-10 logarithm. TOC Log(TOC) Statistic Validation: 400 observations were (mg/kg) 169 2.2279 randomly separated for calibration, Minimum 268995 5.4297 leaving 154 observations for Maximum Median 2903 3.4628 validation. 7235 3.5072 Pre-processing transformations: 30 Mean Std. Deviation 18942 0.4939 techniques were tested for each Skewness 8.82 0.49 method, except ANN. Methods: Stepwise Multiple Linear Regression (SMLR) with a stepping probability of 0.05 (SPSS 11); Principal Components Regression (PCR) (Unscrambler 9.5); Partial Least-Squares Regression (PLSR) (Unscrambler 9.5); Regression Tree (RT) (CART 5.0); Artificial Neural Networks (ANN) using a perceptron, with hyperbolic tangent transfer function, conjugate gradient learning, and 20,000 epochs (NeuroSolutions 4.0). Comparison of methods and pre-processing transformations: The best methods and pre-processing techniques were selected based on the coefficient of determination of calibration (Rc2) for the RT and of validation (Rv2) for all other methods. Normalization PCR-SG-1D 359 5 3.00 2.00 1.00 0.00 -1.00 359 -2.00 -3.00 4 Technique details SMLR-SNV 3 Methods Pre-processing technique* SMLR-Log(1/R) 0.00 To identify, among thirty pre-processing transformations of soil VNIR reflectance spectra, and five calibration methods, the best combination to estimate soil total organic carbon using VNIRS. Savitzky-Golay Smoothing1, and Averaging2 (SA) Important Variables / Variables Selected by the Methods 3 4 5 3 y = 0.8798x + 0.3914 R2 = 0.878 y = 0.6823x + 1.0861 R 2 = 0.6774 2 2 4 2 2 Observed 3 4 Observed 5 2 3 4 5 Observed Conclusions (1) The best model to predict Log(TOC) using VNIRS was obtained by ANN. (2) Overall performance of the methods (excluding ANN): PLSR > SMLR > PCR > RT (3) Performance of the methods with SA pre-processing (including ANN): ANN > PLSR > SMLR > PCR > RT Future Research (1) To compare the 30 pre-processing transformations using ANN, and explore other configurations of ANN, including multilayer perceptron, radial basis functions, and wavelength selection with genetic algorithm. (2) To compare the 30 pre-processing transformations using Boosted Regression Trees (BRT). Acknowledgements To Carolyn Olson, Steve Bloom and Sanjay Lamsal. This work was funded by the Cooperative Ecosystem Studies Unit (CESU) – Natural Resources Conservation Service (NRCS).