Acta Optica Sinica, 2020, 40(5): 0504001, published online: 2020-03-10

Multi-Scale Infrared Pedestrian Detection Based on Deep Attention Mechanism
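Author affiliation
Department of Electronic and Optical Engineering, Shijiazhuang Campus, Army Engineering University, Shijiazhuang, Hebei 050003, China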
Abstract
To address multi-scale target detection, a multi-scale infrared pedestrian detection method based on a deep attention mechanism is proposed. First, the relatively lightweight Darknet53 is adopted as the backbone network for deep convolutional feature extraction, and a four-scale feature pyramid network is designed to localize and classify targets; lower-level, higher-resolution feature maps are introduced to improve detection of small-scale pedestrians. Second, an attention module replaces the traditional upsampling module in the feature pyramid network and generates a local saliency map from the convolutional features, which effectively suppresses feature responses in irrelevant regions and highlights local image characteristics. Finally, the Caltech pedestrian dataset and the U-FOV infrared pedestrian dataset are used for two-stage transfer training to improve the generalization ability of the model and enrich the pedestrian sample features. Experimental results show that the proposed method achieves an average precision of 93.45% on the U-FOV dataset, 26.74 percentage points higher than YOLOv3, and that the smallest detectable pedestrian measures 6×13 pixels. Qualitative results on the LTIR dataset further indicate that the proposed model generalizes well and is suitable for multi-scale infrared pedestrian detection.
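The benefit of adding a lower-level, higher-resolution pyramid level can be illustrated with simple arithmetic. The sketch below is not taken from the paper: the input resolution (416) and the four strides (4, 8, 16, 32) are assumptions chosen for illustration, and the function names are hypothetical. Only the 6×13 pixel target size comes from the abstract. It computes how many grid cells such a target covers at each assumed pyramid level.

```python
# Illustrative sketch only. INPUT_SIZE and STRIDES are assumptions;
# the abstract does not state the input resolution or the strides used.
INPUT_SIZE = 416            # hypothetical square input resolution (pixels)
STRIDES = (4, 8, 16, 32)    # hypothetical strides of a four-scale pyramid


def grid_size(input_size, stride):
    """Number of grid cells along one side of the feature map."""
    return input_size // stride


def footprint_in_cells(width_px, height_px, stride):
    """Width and height of a target measured in feature-map cells."""
    return width_px / stride, height_px / stride


def summarize(width_px, height_px):
    """Return (stride, grid side, cell width, cell height) per level."""
    rows = []
    for stride in STRIDES:
        w, h = footprint_in_cells(width_px, height_px, stride)
        rows.append((stride, grid_size(INPUT_SIZE, stride), w, h))
    return rows


if __name__ == "__main__":
    # 6x13 pixels is the smallest detectable pedestrian reported in the abstract.
    for stride, grid, w, h in summarize(6, 13):
        print(f"stride {stride:2d}: grid {grid}x{grid}, "
              f"target covers {w:.2f} x {h:.2f} cells")
```

Under these assumed strides, the 6×13 pixel target spans more than one cell in each dimension only at stride 4, and less than half a cell in each dimension at stride 32. This is consistent with the abstract's motivation for the extra high-resolution level.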

Bin Zhao, Chunping Wang, Qiang Fu, Yichao Chen. Multi-Scale Infrared Pedestrian Detection Based on Deep Attention Mechanism[J]. Acta Optica Sinica, 2020, 40(5): 0504001.

