Visual Gesture Recognition Based on Spatial-Temporal Features and Channel Attention

HE Jian; LIU Yan; ZU Tianqi

doi:10.11936/bjutxb2020120028

HE Jian, LIU Yan, ZU Tianqi. Visual Gesture Recognition Based on Spatial-Temporal Features and Channel Attention[J]. Journal of Beijing University of Technology, 2021, 47(8): 824-832. DOI: 10.11936/bjutxb2020120028

Abstract

To address the insufficient detection of dynamic gesture key frames and hand contour features in two-stream fusion networks, this paper proposes a dynamic gesture recognition method based on the fusion of spatial-temporal features and channel attention. First, efficient channel attention (ECA) was introduced into the two-stream fusion network to strengthen attention on the key frames of gestures, and the spatial and temporal convolutional networks of the two-stream architecture were used to extract the spatial and temporal features of dynamic gestures. Second, the gesture frame with the highest attention in the spatial network was selected by ECA, and a single shot multibox detector (SSD) was used to extract its hand contour features. Finally, the hand contour features were fused with the body posture features and the temporal features extracted by the two-stream network to recognize gestures. The proposed method was verified on the Chalearn 2013 multi-modal sign language recognition dataset, where it reached an accuracy of 66.23%. Compared with previous two-stream methods that use only the RGB information of this dataset, it achieves better gesture recognition performance.
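
The channel attention step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes, purely for illustration, that each gesture frame contributes one channel whose descriptor is its globally averaged feature map, and it replaces the learned 1D convolution kernel of ECA with fixed uniform weights. The function names `global_average_pool`, `eca_weights`, and `select_key_frame` are hypothetical.

```python
import math


def global_average_pool(feature_maps):
    """Reduce each 2D feature map (list of rows) to a single mean value."""
    return [
        sum(sum(row) for row in fm) / (len(fm) * len(fm[0]))
        for fm in feature_maps
    ]


def eca_weights(channel_descriptors, kernel=(1 / 3, 1 / 3, 1 / 3)):
    """Return one attention weight in (0, 1) per channel.

    ECA-style gating: a 1D convolution across neighbouring channels
    followed by a sigmoid. The kernel is learned in a real network;
    the uniform values here are a placeholder assumption. The kernel
    length must be odd so that zero padding keeps the channel count.
    """
    k = len(kernel)
    pad = k // 2
    padded = [0.0] * pad + list(channel_descriptors) + [0.0] * pad
    weights = []
    for i in range(len(channel_descriptors)):
        z = sum(kernel[j] * padded[i + j] for j in range(k))
        weights.append(1.0 / (1.0 + math.exp(-z)))  # sigmoid
    return weights


def select_key_frame(frame_feature_maps):
    """Return (index of the frame with the highest attention, all weights)."""
    weights = eca_weights(global_average_pool(frame_feature_maps))
    key_index = max(range(len(weights)), key=weights.__getitem__)
    return key_index, weights


# Four toy 2x2 "frames" with constant activations 0, 1, 5, 1 (made-up data).
frames = [[[v, v], [v, v]] for v in (0.0, 1.0, 5.0, 1.0)]
key_index, weights = select_key_frame(frames)
```

In the pipeline the abstract describes, the selected frame would then be passed to the SSD detector to extract hand contour features; that stage is outside the scope of this sketch.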
