Laser & Optoelectronics Progress, 2017, 54(10): 103001, published online: 2017-10-09

Feature Selection Algorithm Application in Near-Infrared Spectroscopy Classification Based on Binary Search Combined with Random Forest Pruning
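Author affiliations
1 College of Information Science and Engineering, Ocean University of China, Qingdao, Shandong 266100, China
2 Technology Center, China Tobacco Yunnan Industrial Co., Ltd., Kunming, Yunnan 650024, China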
Abstract
To address the cumbersome computation, large memory overhead, and low classification accuracy of random forest (RF) when selecting features in high-dimensional spaces, a feature selection algorithm (BSRFP) based on binary search (BS) combined with random forest pruning (RFP) is proposed. The algorithm first computes feature importance scores from the Gini impurity index and deletes the features with low scores. It then uses the BS algorithm together with a pruning technique based on the diversity among base classifiers to obtain the optimal feature subset and the classifier with the highest classification accuracy. To verify the effectiveness of the algorithm, a cigarette quality recognition model is built and compared with other methods. The results show that the BS algorithm simplifies the feature search process and that the RFP algorithm reduces the size of the random forest. The classification accuracy of the RFP algorithm reaches 96.47%. The features selected by the BSRFP algorithm are more relevant, and the algorithm identifies cigarette quality with higher accuracy.
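The three ingredients named in the abstract (Gini-based importance, a binary search over ranked features, and diversity-based pruning of base classifiers) can be illustrated with a minimal pure-Python sketch. The abstract does not specify the search criterion or the diversity measure, so everything below is an assumption made for illustration: the function names (`gini_impurity`, `gini_decrease`, `smallest_adequate_prefix`, `prune_by_diversity`), the `accuracy_of` callback, the premise that accuracy does not decrease as more top-ranked features are added, and the use of pairwise disagreement as the diversity measure. It is not the authors' implementation.

```python
from collections import Counter
from typing import Callable, Sequence


def gini_impurity(labels: Sequence) -> float:
    """Gini impurity 1 - sum_k p_k^2 of a collection of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def gini_decrease(feature: Sequence[float], labels: Sequence, threshold: float) -> float:
    """Impurity decrease from splitting the samples at `threshold` on one feature."""
    n = len(labels)
    if n == 0:
        return 0.0
    left = [y for x, y in zip(feature, labels) if x <= threshold]
    right = [y for x, y in zip(feature, labels) if x > threshold]
    weighted = (len(left) * gini_impurity(left) + len(right) * gini_impurity(right)) / n
    return gini_impurity(labels) - weighted


def smallest_adequate_prefix(ranked: Sequence[int],
                             accuracy_of: Callable[[Sequence[int]], float],
                             tolerance: float = 0.0) -> list:
    """Binary-search the shortest prefix of the importance-ranked features whose
    accuracy is within `tolerance` of the accuracy obtained with all of them.

    Assumption (not from the paper): accuracy is non-decreasing in prefix length.
    """
    target = accuracy_of(ranked) - tolerance
    lo, hi = 1, len(ranked)
    while lo < hi:
        mid = (lo + hi) // 2
        if accuracy_of(ranked[:mid]) >= target:
            hi = mid        # mid features suffice, try fewer
        else:
            lo = mid + 1    # more features are needed
    return list(ranked[:lo])


def majority_vote(predictions: Sequence[Sequence]) -> list:
    """Per-sample majority vote over the predictions of several trees."""
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


def accuracy(predicted: Sequence, truth: Sequence) -> float:
    """Fraction of samples predicted correctly."""
    return sum(p == t for p, t in zip(predicted, truth)) / len(truth)


def disagreement(a: Sequence, b: Sequence) -> float:
    """Fraction of samples on which two trees predict different labels."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def prune_by_diversity(predictions: Sequence[Sequence], truth: Sequence) -> list:
    """Repeatedly drop the most redundant tree (lowest mean disagreement with the
    other kept trees) while ensemble accuracy does not fall. Returns kept indices."""
    kept = list(range(len(predictions)))
    best = accuracy(majority_vote([predictions[i] for i in kept]), truth)
    while len(kept) > 1:
        def mean_disagreement(i: int) -> float:
            others = [j for j in kept if j != i]
            return sum(disagreement(predictions[i], predictions[j]) for j in others) / len(others)

        candidate = min(kept, key=mean_disagreement)
        trial = [j for j in kept if j != candidate]
        score = accuracy(majority_vote([predictions[j] for j in trial]), truth)
        if score < best:
            break           # removing the candidate would hurt accuracy
        kept, best = trial, score
    return kept
```

In practice `accuracy_of` would train and validate a forest on the given feature subset, and `predictions` would hold each tree's labels on a validation set; both are left abstract here so the sketch stays independent of any particular library.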

Liu Ming, Li Zhongren, Zhang Haitao, Yu Chunxia, Tang Xinghong, Ding Xiangqian. Feature Selection Algorithm Application in Near-Infrared Spectroscopy Classification Based on Binary Search Combined with Random Forest Pruning[J]. Laser & Optoelectronics Progress, 2017, 54(10): 103001.
