Spectroscopy and Spectral Analysis, 2019, 39(9): 2800. Published online: 2019-09-28

Study on the Selection of Spectral Preprocessing Methods
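Author affiliation
State Key Laboratory of Separation Membranes and Membrane Processes, School of Environmental and Chemical Engineering, Tianjin Polytechnic University, Tianjin 300387, China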
Abstract
Spectral signals of complex samples are often disturbed by stray light, noise, baseline drift and other interferences, which degrade the final qualitative and quantitative results. The raw spectra therefore usually need to be preprocessed before modeling. Many preprocessing methods exist, and choosing a suitable one is difficult. One strategy is visual inspection: the method is chosen by observing the characteristics of the spectral signal. It requires no modeling and is more interpretable, but it is subjective, can be difficult when interferences are subtle or multiple, and may lead to misleading results. The other is a trial-and-error strategy: the method is chosen according to modeling performance. It does not require inspecting the spectra, but many preprocessing methods have to be evaluated, which is time-consuming for large datasets. It is therefore necessary to examine which selection strategy is more scientific and reasonable. In this study, nine datasets and 120 combinations of 10 preprocessing methods were used to investigate the necessity of preprocessing and the selection of preprocessing methods. First, the number of latent variables of partial least squares (PLS), the window sizes of the first derivative (1st Der), second derivative (2nd Der) and Savitzky-Golay (SG) smoothing, and the wavelet function and decomposition scale of the continuous wavelet transform (CWT) were optimized. Then, no preprocessing and the 10 preprocessing methods (1st Der, 2nd Der, CWT, multiplicative scatter correction (MSC), standard normal variate (SNV), SG smoothing, mean centering, Pareto scaling, min-max normalization and auto scaling) were combined in the order of baseline correction, scatter correction, smoothing and scaling, giving 120 preprocessing methods and combinations in total. Finally, all 120 were applied to the different datasets and to different components of the same dataset, and the spectral characteristics and the root mean squared error of prediction (RMSEP) of the PLS models built after preprocessing were analyzed. The results show that selecting the preprocessing method according to the modeling performance between the spectra and the predicted component is more accurate than selecting it by observing the spectral characteristics. For most datasets, an appropriate preprocessing method improves the modeling performance. Different datasets have different optimal preprocessing methods because they differ in information content and complexity. Within one dataset, the optimal preprocessing method differs between components even though the spectra are the same. No universally optimal preprocessing method therefore exists: the optimal method depends on both the spectra and the component to be predicted. Classifying the existing preprocessing methods by purpose and then combining them in order is an effective way to select the optimal preprocessing method.
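The combination scheme in the abstract can be sketched in a few lines of Python. The abstract gives the four stages and the ten methods but not which method belongs to which stage, so the grouping below is an assumption; it does reproduce the stated count of 120 (4 x 3 x 2 x 5, with "none" allowed at every stage). The names `STAGES`, `enumerate_pipelines`, `select_best` and `rmsep` are illustrative and do not come from the paper, and the PLS modeling step is left to a caller-supplied scoring function.

```python
import math
from itertools import product

# Assumed assignment of the ten methods to the four stages named in the
# abstract; "none" means the stage is skipped.
STAGES = {
    "baseline correction": ["none", "1st Der", "2nd Der", "CWT"],
    "scatter correction": ["none", "MSC", "SNV"],
    "smoothing": ["none", "SG smoothing"],
    "scaling": ["none", "mean centering", "min-max normalization",
                "Pareto scaling", "auto scaling"],
}

def enumerate_pipelines(stages=STAGES):
    """Return every ordered combination with one choice per stage."""
    pipelines = []
    for choice in product(*stages.values()):
        # Drop skipped stages; the empty tuple is "no preprocessing".
        pipelines.append(tuple(step for step in choice if step != "none"))
    return pipelines

def rmsep(y_true, y_pred):
    """Root mean squared error of prediction over a prediction set."""
    squared = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))

def select_best(pipelines, score):
    """Trial-and-error selection: keep the pipeline with the lowest score.

    `score` is a hypothetical caller-supplied function that preprocesses
    the spectra with a pipeline, fits a PLS model and returns its RMSEP.
    """
    return min(pipelines, key=score)

PIPELINES = enumerate_pipelines()
print(len(PIPELINES))  # 120
```

Because the best pipeline depends on the component being predicted as well as on the spectra, `select_best` would be run once per dataset and per component, each time with its own scoring function.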

DIWU Peng-yao, BIAN Xi-hui, WANG Zi-fang, LIU Wei. Study on the Selection of Spectral Preprocessing Methods[J]. Spectroscopy and Spectral Analysis, 2019, 39(9): 2800.
