Multi-Feature Fusion Human Behavior Recognition Algorithm Based on Convolutional Neural Network and Long Short Term Memory Neural Network

Abstract

A deep learning network structure based on a convolutional neural network and a long short-term memory (LSTM) neural network is proposed. A feature fusion method is adopted: shallow and deep features are extracted by the convolutional network and concatenated, the concatenated features are fused by convolution, and the resulting vector information is fed into the LSTM unit. Separate networks are trained on optical flow information and on red-green-blue (RGB) information, and the results of the networks are combined by weighted fusion. Experimental results show that the proposed model effectively improves the accuracy of behavior recognition.
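The weighted fusion of the two networks' results mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the weights or the exact fusion rule, so everything here is an assumption made for illustration: the function names (`softmax`, `fuse_scores`, `predict`), the equal default weights, and the choice to fuse softmax probabilities rather than raw scores are all hypothetical.

```python
import math


def softmax(logits):
    """Convert raw class scores to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_scores(rgb_logits, flow_logits, w_rgb=0.5, w_flow=0.5):
    """Weighted late fusion of per-class probabilities from two streams.

    The weights are illustrative defaults, not values from the paper.
    """
    if len(rgb_logits) != len(flow_logits):
        raise ValueError("both streams must score the same set of classes")
    p_rgb = softmax(rgb_logits)
    p_flow = softmax(flow_logits)
    return [w_rgb * a + w_flow * b for a, b in zip(p_rgb, p_flow)]


def predict(rgb_logits, flow_logits, w_rgb=0.5, w_flow=0.5):
    """Return the index of the class with the highest fused probability."""
    fused = fuse_scores(rgb_logits, flow_logits, w_rgb, w_flow)
    return max(range(len(fused)), key=fused.__getitem__)


if __name__ == "__main__":
    # Hypothetical scores for three action classes from each stream.
    rgb = [2.0, 1.0, 0.0]
    flow = [0.0, 3.0, 0.0]
    print(predict(rgb, flow))            # equal weights
    print(predict(rgb, flow, 1.0, 0.0))  # RGB stream only
```

In this sketch the optical flow stream can overrule the RGB stream when it is more confident, and shifting the weights toward one stream moves the decision accordingly. In practice the weights would be chosen on validation data.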

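Supplementary information

CLC number: TP183

DOI:10.3788/lop56.071505

Column: Machine Vision

Funding: Science and Technology Project of the Jiangxi Provincial Department of Education (GJJ150683); Key University-Level Project of Jiangxi University of Science and Technology (NSFJ2014-K18)

Received: 2018-09-21

Revised: 2018-10-22

Published online: 2018-10-30

Author affiliations

Huang Youwen: School of Information Engineering, Jiangxi University of Science and Technology, Ganzhou, Jiangxi 341000, China
Wan Chaolun: School of Information Engineering, Jiangxi University of Science and Technology, Ganzhou, Jiangxi 341000, China
Feng Heng: School of Information Engineering, Jiangxi University of Science and Technology, Ganzhou, Jiangxi 341000, China

Corresponding author: Wan Chaolun (353382420@qq.com)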

Cite this paper

Huang Youwen, Wan Chaolun, Feng Heng. Multi-Feature Fusion Human Behavior Recognition Algorithm Based on Convolutional Neural Network and Long Short Term Memory Neural Network[J]. Laser & Optoelectronics Progress, 2019, 56(7): 071505.
