TY - JOUR
T1 - Random forest–based estimation of heavy metal concentration in agricultural soils with hyperspectral sensor data
AU - Tan, Kun
AU - Ma, Weibo
AU - Wu, Fuyu
AU - Du, Qian
N1 - Publisher Copyright: © 2019, Springer Nature Switzerland AG.
PY - 2019/7/1
Y1 - 2019/7/1
AB - Heavy metals in the agricultural soils of reclaimed mining areas can contaminate food and endanger human health. This study aims to estimate the concentrations of four heavy metals (zinc, chromium, arsenic, and lead) in the study area of Xuzhou, China, using hyperspectral sensor data and the random forest (RF) algorithm. The RF’s built-in feature selection ability and its modeling capacity for soil heavy metal estimation were explored. After preprocessing the spectra acquired with an ASD (analytical spectral device) field spectrometer, RF estimation models were built on both the correlation-selected features and the full-spectrum features. The results of each processing scheme were compared with those of classical approaches, such as partial least squares (PLS) regression and support vector machine (SVM). Across all experiments, the best estimation model for Zn (R² = 0.9061; RMSE = 6.5008) was based on full-spectrum data with continuum removal (CR) pretreatment, whereas the best models for Cr (R² = 0.9110; RMSE = 4.5683), As (R² = 0.9912; RMSE = 0.5327), and Pb (R² = 0.9756; RMSE = 1.1694) were all derived from the correlation-selected features. All of these best models were built with the RF method. The experiments show that random forests can make full use of the input spectral data when estimating the four heavy metals, and that the resulting models outperform those built with traditional methods.
KW - Hyperspectral estimation
KW - Random forest
KW - Soil heavy metal concentration
UR - https://www.scopus.com/pages/publications/85067513234
U2 - 10.1007/s10661-019-7510-4
DO - 10.1007/s10661-019-7510-4
M3 - Article
C2 - 31214787
AN - SCOPUS:85067513234
SN - 0167-6369
VL - 191
JO - Environmental Monitoring and Assessment
JF - Environmental Monitoring and Assessment
IS - 7
M1 - 446
ER -