Chinese Journal of Liquid Crystals and Displays (液晶与显示), 2019, 34(8): 825. Published online: 2019-10-12
基于注意力掩模融合的目标检测算法
Object detection algorithm based on attention mask fusion
Keywords: computer vision; object detection; attention mask; feature pyramid; multiscale detection
Abstract
In computer vision, balancing the accuracy and speed of object detection is critical for downstream applications such as object tracking and recognition. To this end, an object detection algorithm based on attention mask fusion is proposed. First, a VGG network extracts features, and a preliminary binary classification and regression step yields a set of preselected boxes. Next, these preselected boxes are fed into a feature pyramid structure, where an attention mask module adaptively learns effective features; fusing the feature pyramid structure with the attention mask module produces more representative features. Finally, multi-class classification and regression yield multiscale detection results. Experiments were conducted on the PASCAL VOC 2007 and PASCAL VOC 2012 datasets. On the test sets, at an Intersection over Union (IoU) threshold of 0.5 and with 320×320 input images, the mean average precision (mAP) is 81.0% and 79.0%, respectively, and the detection speed is 60.9 fps. By incorporating attention information into object detection, the algorithm balances accuracy and speed for generic object detection.
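The IoU criterion used to report the mAP figures above can be illustrated with a short Python sketch. This is not the authors' evaluation code: the function names `iou` and `is_true_positive`, the `(x_min, y_min, x_max, y_max)` box layout, and the inclusive `>=` comparison are assumptions made for illustration, and the official PASCAL VOC evaluation may use different pixel and threshold conventions.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes.

    Boxes are (x_min, y_min, x_max, y_max); this layout is an assumption.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap region, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter

    # Guard against degenerate (zero-area) boxes.
    return inter / union if union > 0 else 0.0


def is_true_positive(pred_box, gt_box, threshold=0.5):
    """Hypothetical helper: a prediction matches if IoU reaches the threshold."""
    return iou(pred_box, gt_box) >= threshold
```

For example, `iou((0, 0, 2, 2), (1, 1, 3, 3))` gives an overlap of 1 over a union of 7, so that pair would not count as a match at the 0.5 threshold.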
董潇潇, 何小海, 吴晓红, 卿粼波, 滕奇志. 基于注意力掩模融合的目标检测算法[J]. 液晶与显示, 2019, 34(8): 825.
DONG Xiao-xiao, HE Xiao-hai, WU Xiao-hong, QING Lin-bo, TENG Qi-zhi. Object detection algorithm based on attention mask fusion[J]. Chinese Journal of Liquid Crystals and Displays, 2019, 34(8): 825.