Spectroscopy and Spectral Analysis, 2018, 38(12): 3883. Published online: 2018-12-16
混合式随机森林的土壤钾含量高光谱反演
Random Forests-Based Hybrid Feature Selection Algorithm for Soil Potassium Content Inversion Using Hyperspectral Technology
Soil available potassium content; Hyperspectral; Characteristic wavelength selection; Hybrid feature selection; Random forests
Abstract
Key features are difficult to extract from the hyperspectral data of soil available potassium, which lowers the prediction accuracy of hyperspectral inversion models. To address this problem, this paper proposes a hybrid feature selection algorithm based on random forests. First, a wrapper-based feature selection method is used for pre-selection, rapidly removing redundant features while preserving relevant ones. Second, an improved random forest (Improved-RF) feature selection algorithm refines the pre-selected features: by increasing the separation between key and redundant features and by selecting features iteratively, it yields characteristic wavelengths that are more robust and more discriminative. To verify the algorithm, 124 representative soil samples were collected from the Dagu River Basin in Qingdao. The algorithm selected an optimal spectral subset of 13 sensitive bands from the 2,051 original bands, and this subset was used to build a soil available potassium inversion model. The model was then compared with models built on the full band set and with models built using existing feature selection algorithms. The results show that the regression model built with the proposed algorithm achieves a lower root mean square error of prediction (RMSEP = 9.6615), a higher correlation coefficient (r = 0.9369), and a higher residual predictive deviation (RPD = 2.14). The hybrid random forest feature selection algorithm thus achieves good prediction performance with a small number of characteristic wavelengths, and it can provide a theoretical basis for the design of real-time spectral sensors for soil nutrients.
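The three evaluation metrics reported in the abstract can be illustrated with a minimal sketch. The abstract does not give the formulas, so the definitions below are assumptions: RMSEP as the root mean square of the prediction errors, r as the Pearson correlation between measured and predicted values, and RPD as the sample standard deviation of the measured values divided by RMSEP (one common convention). The function names and the sample numbers are hypothetical and are not taken from the paper.

```python
import math


def rmsep(y_true, y_pred):
    """Root mean square error of prediction."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between measured and predicted values."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


def rpd(y_true, y_pred):
    """Assumed definition: sample SD of the measured values divided by RMSEP."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    sd = math.sqrt(sum((t - mean_t) ** 2 for t in y_true) / (n - 1))
    return sd / rmsep(y_true, y_pred)


# Made-up values for illustration only (not data from the paper).
measured = [100.0, 120.0, 140.0, 160.0, 180.0]
predicted = [105.0, 115.0, 145.0, 155.0, 185.0]

print(f"RMSEP = {rmsep(measured, predicted):.4f}")
print(f"r     = {pearson_r(measured, predicted):.4f}")
print(f"RPD   = {rpd(measured, predicted):.4f}")
```

Under these assumed definitions, a lower RMSEP and higher r and RPD all indicate better agreement between predicted and measured potassium content, which is the sense in which the abstract compares the models.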
王轩慧, 郑西来, 韩仲志, 王轩力, 王娟. 混合式随机森林的土壤钾含量高光谱反演[J]. 光谱学与光谱分析, 2018, 38(12): 3883.
WANG Xuan-hui, ZHENG Xi-lai, HAN Zhong-zhi, WANG Xuan-li, WANG Juan. Random Forests-Based Hybrid Feature Selection Algorithm for Soil Potassium Content Inversion Using Hyperspectral Technology[J]. Spectroscopy and Spectral Analysis, 2018, 38(12): 3883.