Acta Optica Sinica, 2020, 40(5): 0504001. Published online: 2020-03-10
Multi-Scale Infrared Pedestrian Detection Based on Deep Attention Mechanism
Keywords: detectors; infrared pedestrian detection; convolutional neural network; ultrawide field of view; feature pyramid network; attention mechanism
Abstract
To address the problem of multi-scale target detection, a multi-scale infrared pedestrian detection method based on a deep attention mechanism is proposed. First, the relatively lightweight Darknet53 is adopted as the backbone network for deep convolutional feature extraction, and a four-scale feature pyramid network is designed to localize and classify targets. Introducing lower-level, higher-resolution feature maps improves detection performance for small-scale pedestrian targets. Second, an attention module replaces the traditional upsampling module in the feature pyramid network. It generates a local saliency map from the convolutional features, which suppresses the feature responses of irrelevant regions and highlights the local characteristics of the image. Finally, the Caltech pedestrian dataset and the U-FOV infrared pedestrian dataset are used for two-step transfer training, which improves the generalization ability of the model and enriches the pedestrian sample features. Experimental results show that the proposed method achieves an average precision of 93.45% on the U-FOV dataset, 26.74 percentage points higher than YOLOv3, and that the smallest detectable pedestrian measures 6×13 pixels. Qualitative experiments on the LTIR dataset further indicate that the proposed model generalizes well and is suitable for multi-scale infrared pedestrian detection.
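The benefit of a fourth, higher-resolution pyramid level for small pedestrians can be illustrated with simple grid arithmetic. The sketch below is not taken from the paper: the 416×416 input size and the output strides of 32, 16, 8, and 4 are assumptions (the first three are the usual YOLOv3 strides, and stride 4 stands in for the added lower-level feature map). Only the 6×13 pixel target size comes from the abstract.

```python
# Assumed values (not stated in the abstract): a 416x416 input, as commonly
# used with YOLOv3, and output strides of 32, 16, 8 plus an added stride-4 level.
INPUT_SIZE = 416
STRIDES = (32, 16, 8, 4)

# Smallest detectable pedestrian reported in the abstract (width, height) in pixels.
MIN_PEDESTRIAN = (6, 13)


def grid_size(input_size, stride):
    """Number of cells along one side of the feature map at the given stride."""
    return input_size // stride


def cells_covered(box, stride):
    """Width and height of a pixel box, measured in feature-map cells."""
    width, height = box
    return width / stride, height / stride


def describe_pyramid(input_size=INPUT_SIZE, strides=STRIDES, box=MIN_PEDESTRIAN):
    """Return (stride, grid side, box width in cells, box height in cells) per level."""
    rows = []
    for stride in strides:
        side = grid_size(input_size, stride)
        cell_w, cell_h = cells_covered(box, stride)
        rows.append((stride, side, cell_w, cell_h))
    return rows


if __name__ == "__main__":
    for stride, side, cell_w, cell_h in describe_pyramid():
        print(f"stride {stride:2d}: {side:3d}x{side:<3d} grid, "
              f"6x13 px target spans {cell_w:.2f} x {cell_h:.2f} cells")
```

Under these assumed values, a 6×13 pixel pedestrian occupies only a fraction of a single cell at stride 32, whereas at stride 4 it spans more than one cell in each direction, which is consistent with the abstract's claim that higher-resolution feature maps help with small-scale targets.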
Bin Zhao, Chunping Wang, Qiang Fu, Yichao Chen. Multi-Scale Infrared Pedestrian Detection Based on Deep Attention Mechanism[J]. Acta Optica Sinica, 2020, 40(5): 0504001.