Infrared and Laser Engineering, 2018, 47(2): 0203001, published online: 2018-04-26

Pedestrian detection algorithm based on dual-model fused fully convolutional networks (Invited)
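Luo Haibo 1,2,3,4,*, He Miao 1,2,3,4, Hui Bin 1,3,4, Chang Zheng 1,3,4
Author affiliations
1 Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, Liaoning, China
2 University of Chinese Academy of Sciences, Beijing 100049, China
3 Key Laboratory of Opto-Electronic Information Processing, Chinese Academy of Sciences, Shenyang 110016, Liaoning, China
4 Key Laboratory of Image Understanding and Computer Vision, Liaoning Province, Shenyang 110016, Liaoning, China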
Abstract
In close-range pedestrian detection, balancing detection precision against detection speed is important for the practical application of a detection algorithm. To detect close-range pedestrians quickly and accurately, a pedestrian detection algorithm based on model-fused fully convolutional networks is proposed. First, a fully convolutional detection network detects the targets in the image and produces a series of candidate bounding boxes. Second, a semantic segmentation network trained with weak supervision produces pixel-level classification results for the image. Finally, the candidate bounding boxes are fused with the pixel-level classification results to complete the detection. Experimental results show that the algorithm performs well in both detection speed and detection precision.
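The abstract does not state how the candidate boxes and the pixel-level classification are fused. As a rough illustration only, the sketch below assumes one plausible rule: keep a candidate box when a large enough fraction of its pixels is labeled as pedestrian by the segmentation output, and scale its score by that fraction. The names `mask_coverage` and `fuse`, the box layout, and the `min_coverage` threshold are all hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

# Hypothetical box layout: (x1, y1, x2, y2, score), with x2 and y2 exclusive.
Box = Tuple[int, int, int, int, float]


def mask_coverage(box: Box, mask: List[List[int]]) -> float:
    """Fraction of pixels inside the box that the mask labels as pedestrian (1)."""
    x1, y1, x2, y2, _ = box
    height, width = len(mask), len(mask[0])
    # Clip the box to the image so out-of-range coordinates are ignored.
    x1, x2 = max(0, x1), min(width, x2)
    y1, y2 = max(0, y1), min(height, y2)
    if x2 <= x1 or y2 <= y1:
        return 0.0
    hits = sum(mask[y][x] for y in range(y1, y2) for x in range(x1, x2))
    return hits / ((x2 - x1) * (y2 - y1))


def fuse(boxes: List[Box], mask: List[List[int]],
         min_coverage: float = 0.5) -> List[Box]:
    """Assumed fusion rule: drop boxes with low mask coverage, rescale the rest."""
    fused = []
    for box in boxes:
        coverage = mask_coverage(box, mask)
        if coverage >= min_coverage:
            fused.append((box[0], box[1], box[2], box[3], box[4] * coverage))
    # Highest fused score first.
    return sorted(fused, key=lambda b: b[4], reverse=True)


if __name__ == "__main__":
    # Toy 4x6 segmentation output with one 2x2 pedestrian region.
    mask = [
        [0, 0, 0, 0, 0, 0],
        [0, 1, 1, 0, 0, 0],
        [0, 1, 1, 0, 0, 0],
        [0, 0, 0, 0, 0, 0],
    ]
    candidates = [(1, 1, 3, 3, 0.9), (3, 0, 6, 4, 0.8), (0, 0, 4, 4, 0.6)]
    print(fuse(candidates, mask))
```

In this toy case only the tight box around the pedestrian region survives; the box over empty background and the loose box covering the whole left side fall below the assumed coverage threshold.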
Luo Haibo, He Miao, Hui Bin, Chang Zheng. Pedestrian detection algorithm based on dual-model fused fully convolutional networks (Invited)[J]. Infrared and Laser Engineering, 2018, 47(2): 0203001.

