Spectroscopy and Spectral Analysis, 2010, 30 (4): 915, Published online: 2011-01-26
Study on Variable Selection of NIR Spectral Information Based on GA and SCMWPLS
Keywords: NIR spectroscopy; Variable selection; Orthogonal signal correction (OSC); Searching combination moving window PLS (SCMWPLS); Genetic algorithms (GA); Apricot
Abstract
Spectral data compression and informative-variable selection are active topics in applied near-infrared (NIR) research and are important means of simplifying models and improving prediction accuracy. Taking the visible/near-infrared (Vis/NIR) spectra of apricot as an example, this study applied second-derivative, standardization and orthogonal signal correction (OSC) pretreatments to filter out spectral signals unrelated to the concentration matrix of soluble solid content (SSC). SCMWPLS selected 880, 894-910 and 932 nm as the modeling regions for a PLS prediction model, which gave a correlation coefficient (R) of 0.920, a standard error of calibration (SEC) of 0.454 and a standard error of prediction (SEP) of 0.470 for SSC. In addition, the GA program was run independently 100 times, and the two wavelengths selected most frequently, 888 and 900 nm, were used as regression variables to build a GA-MLR prediction model, with R, SEC and SEP of 0.905, 0.488 and 0.459, respectively. Both models outperformed the full-spectrum PLS model. The results show that OSC can filter out spectral signals unrelated to the concentration matrix and reduce the number of latent variables needed for modeling, and that SCMWPLS and GA can find optimal combinations of informative variables. The approach is of general reference value for building low-dimensional, high-accuracy NIR rapid-analysis models.
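The GA-MLR step described in the abstract (repeated independent GA runs, selection-frequency counting, then a two-variable MLR model) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the data are synthetic, with the response deliberately constructed to depend on the columns labelled 888 and 900 nm; the GA operators (truncation selection, one-point crossover, bit-flip mutation), the penalized fitness, and the helper names `fit_mlr`, `predict_mlr`, `rmse`, `fitness` and `run_ga` are hypothetical; SEC and SEP are approximated by root-mean-square error, and R is computed on the validation set, since the abstract does not give the exact formulas.

```python
import numpy as np
from collections import Counter

rng = np.random.default_rng(0)

# Synthetic stand-in data (hypothetical; NOT the apricot spectra from the paper).
wavelengths = np.arange(880, 940, 4)              # 15 candidate wavelengths, nm
n_cal, n_val = 60, 30
X = rng.normal(size=(n_cal + n_val, wavelengths.size))
# Response built to depend on the columns labelled 888 and 900 nm (illustration only).
y = 2.0 * X[:, 2] - 1.5 * X[:, 5] + 0.1 * rng.normal(size=n_cal + n_val)
X_cal, y_cal = X[:n_cal], y[:n_cal]
X_val, y_val = X[n_cal:], y[n_cal:]


def fit_mlr(X, y):
    """Ordinary least squares with an intercept; returns the coefficient vector."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_mlr(coef, X):
    return coef[0] + X @ coef[1:]


def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def fitness(mask):
    """Lower is better: calibration RMSE plus a small penalty per selected variable."""
    if not mask.any():
        return np.inf
    coef = fit_mlr(X_cal[:, mask], y_cal)
    return rmse(y_cal, predict_mlr(coef, X_cal[:, mask])) + 0.05 * mask.sum()


def run_ga(pop_size=20, n_gen=30, p_mut=0.05):
    """One GA run over binary wavelength masks; returns the best mask found."""
    n_var = wavelengths.size
    pop = rng.random((pop_size, n_var)) < 0.2     # sparse random initial masks
    for _ in range(n_gen):
        scores = np.array([fitness(m) for m in pop])
        parents = pop[np.argsort(scores)[: pop_size // 2]]   # truncation selection
        children = []
        while len(children) < pop_size - len(parents):
            a, b = parents[rng.integers(len(parents), size=2)]
            cut = rng.integers(1, n_var)                      # one-point crossover
            child = np.concatenate([a[:cut], b[cut:]])
            child ^= rng.random(n_var) < p_mut                # bit-flip mutation
            children.append(child)
        pop = np.vstack([parents, children])
    scores = np.array([fitness(m) for m in pop])
    return pop[np.argmin(scores)]


# The abstract reports 100 independent runs; fewer are used here to keep the sketch fast.
n_runs = 20
counts = Counter()
for _ in range(n_runs):
    counts.update(wavelengths[run_ga()].tolist())

# Keep the two most frequently selected wavelengths and fit the final MLR model.
top2 = sorted(w for w, _ in counts.most_common(2))
cols = np.isin(wavelengths, top2)
coef = fit_mlr(X_cal[:, cols], y_cal)
r = float(np.corrcoef(y_val, predict_mlr(coef, X_val[:, cols]))[0, 1])
sec = rmse(y_cal, predict_mlr(coef, X_cal[:, cols]))
sep = rmse(y_val, predict_mlr(coef, X_val[:, cols]))
print(top2, round(r, 3), round(sec, 3), round(sep, 3))
```

Counting how often each wavelength survives across independent runs makes the final choice less sensitive to the randomness of any single GA run, which is presumably why the abstract reports selection by frequency rather than the result of one run. The numbers printed by this sketch describe the synthetic data only and are not comparable to the values reported for the apricot spectra.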
CAO Nan-ning, WANG Jia-hua, LI Peng-fei, HAN Dong-hai. Study on Variable Selection of NIR Spectral Information Based on GA and SCMWPLS[J]. Spectroscopy and Spectral Analysis, 2010, 30(4): 915.