Laser & Optoelectronics Progress, 2019, 56(12): 121004, Published online: 2019-06-13
Video Classification Based on Three-Dimensional Squeeze Excitation Module
Keywords: image processing; signal processing; video classification; squeeze excitation; three-dimensional convolution; residual network; deep learning
Abstract
To address the fusion of temporal features in video classification, this paper combines the squeeze excitation (SE) network used in two-dimensional convolutional neural networks (CNNs) with a three-dimensional (3D) convolutional residual network and proposes a new 3D SE module. Compared with a 3D SE module obtained by directly converting the two-dimensional one, the new module has an additional time-dimension coefficient, which records how the subject's motion changes along its trajectory over time. The new module therefore not only records the features at a single time point but also strengthens the correlation among multiple time points. The SE network with spatiotemporal dimensions is applied to human action recognition to verify the effectiveness of the new module. Experimental results show that the new module accelerates loss convergence and effectively improves video classification accuracy.
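As an illustration of the idea in the abstract, the following is a minimal sketch of an SE-style recalibration on a 3D feature map with an additional per-frame coefficient. The abstract does not state how the time-dimension coefficient is computed, so the temporal squeeze (averaging over channels and spatial positions) and the bottleneck gate used here are assumptions, and all function names (`gate`, `se3d_with_time`, `random_weights`) are hypothetical. The residual connection of the surrounding network is omitted.

```python
import math
import random


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def dense(vec, weights, activation):
    """Bias-free fully connected layer; weights holds one row per output unit."""
    return [activation(sum(w * v for w, v in zip(row, vec))) for row in weights]


def gate(descriptor, w_reduce, w_expand):
    """SE-style excitation: bottleneck FC + ReLU, then FC + sigmoid."""
    hidden = dense(descriptor, w_reduce, lambda v: max(0.0, v))
    return dense(hidden, w_expand, sigmoid)


def random_weights(n_out, n_in, rng):
    """Hypothetical helper: small random weights standing in for learned ones."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]


def se3d_with_time(x, channel_weights, time_weights):
    """Rescale a feature map x[c][t][h][w] by channel and per-frame gates.

    Returns (y, s_c, s_t), where y[c][t][h][w] = x[c][t][h][w] * s_c[c] * s_t[t].
    """
    C, T, H, W = len(x), len(x[0]), len(x[0][0]), len(x[0][0][0])

    # Channel squeeze: average each channel over time and space.
    z_c = [
        sum(x[c][t][h][w] for t in range(T) for h in range(H) for w in range(W))
        / (T * H * W)
        for c in range(C)
    ]
    # Assumed temporal squeeze: average each frame over channels and space.
    z_t = [
        sum(x[c][t][h][w] for c in range(C) for h in range(H) for w in range(W))
        / (C * H * W)
        for t in range(T)
    ]

    s_c = gate(z_c, *channel_weights)  # one coefficient per channel
    s_t = gate(z_t, *time_weights)     # one coefficient per frame (assumption)

    y = [
        [
            [[x[c][t][h][w] * s_c[c] * s_t[t] for w in range(W)] for h in range(H)]
            for t in range(T)
        ]
        for c in range(C)
    ]
    return y, s_c, s_t
```

In this sketch the channel gate plays the role of the directly converted 3D SE module, while the per-frame gate lets the network weight some time points more heavily than others. A trained model would learn both sets of weights instead of drawing them at random.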
Ningxiao Li, Guodong Wang, Yanjie Wang, Shiyu Hu, Liangliang Wang. Video Classification Based on Three-Dimensional Squeeze Excitation Module[J]. Laser & Optoelectronics Progress, 2019, 56(12): 121004.