Laser & Optoelectronics Progress, 2020, 57(20): 201506, published online: 2020-10-17
Action Recognition Based on Adaptive Fusion of RGB and Skeleton Features
Keywords: machine vision; action recognition; pose estimation; adaptive weight computing network; long short-term memory network; self-attention
Abstract
Conventional action recognition algorithms based on RGB and skeleton features generally fail to exploit the complementarity of the two feature types and do not focus sufficiently on the key frames of a video. To address these problems, we propose an action recognition algorithm based on the adaptive fusion of RGB and skeleton features. First, a bidirectional long short-term memory (Bi-LSTM) network combined with a self-attention mechanism extracts spatial-temporal features from the RGB and skeleton images. Next, an adaptive weight computing network (AWCN) takes the spatial features of the two image types as input and computes adaptive weights. Finally, the spatial-temporal features are fused using the adaptive weights, and the fused feature is used for action classification. Comparisons with existing methods on the Penn Action, JHMDB, and NTU RGB-D human action datasets show that the proposed algorithm effectively improves action recognition accuracy.
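The fusion idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not give the AWCN architecture or the exact attention formulation, so the code below assumes that each modality yields one scalar score (hypothetical `rgb_score` and `skel_score`), that the adaptive weights are a softmax over those two scores, and that temporal attention is a simple dot-product pooling against a hypothetical `query` vector standing in for the Bi-LSTM plus self-attention stage. All function names are illustrative.

```python
import math
from typing import List, Sequence

Vector = List[float]


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a short list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(frames: Sequence[Vector], query: Vector) -> Vector:
    """Pool per-frame features into one vector.

    Assumption: a dot product with a fixed query scores each frame;
    the paper instead uses Bi-LSTM features with self-attention.
    """
    scores = [sum(f * q for f, q in zip(frame, query)) for frame in frames]
    weights = softmax(scores)
    dim = len(frames[0])
    return [sum(w * frame[d] for w, frame in zip(weights, frames))
            for d in range(dim)]


def adaptive_fuse(rgb_feat: Vector, skel_feat: Vector,
                  rgb_score: float, skel_score: float) -> Vector:
    """Fuse two feature vectors with weights that sum to 1.

    Assumption: the scores play the role of AWCN outputs; here they
    are plain inputs rather than the output of a learned network.
    """
    w_rgb, w_skel = softmax([rgb_score, skel_score])
    return [w_rgb * r + w_skel * s for r, s in zip(rgb_feat, skel_feat)]
```

With equal scores, `adaptive_fuse` reduces to a plain average of the two feature vectors; a higher `rgb_score` shifts the fused feature toward the RGB stream. In a trained system both the scores and the attention query would be learned rather than supplied by hand.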
Fuzheng Guo, Jun Kong, Min Jiang. Action Recognition Based on Adaptive Fusion of RGB and Skeleton Features[J]. Laser & Optoelectronics Progress, 2020, 57(20): 201506.