Shuffle Graph Convolutional Network for Skeleton-Based Action Recognition

Qiwei Yu; Yaping Dai; Kaoru Hirota; Shuai Shao; Wei Dai

doi:10.20965/jaciii.2023.p0790

JACIII Vol.27 No.5 pp. 790-800 (2023)

Research Paper:

Qiwei Yu*, Yaping Dai*, Kaoru Hirota*, Shuai Shao*,†, and Wei Dai**

*School of Automation, Beijing Institute of Technology
No.5 Zhongguancun South Street, Haidian District, Beijing 100081, China

†Corresponding author

**River Security Technology Co., Ltd.
1520 Gu Mei Road, Xuhui District, Shanghai 200336, China

Received: September 15, 2022

Accepted: April 19, 2023

Published: September 20, 2023

Keywords: action recognition, convolutional network, shuffle graph convolution, skeleton data

Abstract

A shuffle graph convolutional network (Shuffle-GCN) is proposed to recognize human actions by analyzing skeleton data. It uses channel split and channel shuffle operations to process the multiple feature channels of skeleton data, which reduces the computational cost of the graph convolution operation. Compared with the classical two-stream adaptive graph convolutional network model, the proposed method achieves a higher precision with 1/3 of the floating-point operations (FLOPs). In addition, a channel-level topology modeling method is designed to extract more motion information from the human skeleton by dynamically learning the graph topology from different channels. The performance of Shuffle-GCN is evaluated on the 56,880 action clips of the NTU RGB+D dataset, where it reaches an accuracy of 96.0% at a computational complexity of 12.8 GFLOPs. The proposed method offers a feasible solution for developing practical applications of action recognition.

Structure of the shuffle graph convolution
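
The channel split and channel shuffle operations named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a ShuffleNet v2 style unit in which the channels are split into two equal halves, only one half is processed, and the concatenated result is shuffled across two groups. The function names and the `branch` callable (a stand-in for the graph convolution applied to one half) are hypothetical, and channels are represented as plain list items rather than tensors.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def channel_split(channels: Sequence[T]) -> Tuple[List[T], List[T]]:
    """Split the channel axis into two halves (equal split is an assumption)."""
    half = len(channels) // 2
    return list(channels[:half]), list(channels[half:])


def channel_shuffle(channels: Sequence[T], groups: int) -> List[T]:
    """Interleave channels across groups.

    Equivalent to viewing the channels as a (groups, per_group) grid,
    transposing it, and flattening it again.
    """
    if len(channels) % groups != 0:
        raise ValueError("channel count must be divisible by groups")
    per_group = len(channels) // groups
    return [
        channels[g * per_group + i]
        for i in range(per_group)
        for g in range(groups)
    ]


def shuffle_unit(channels: Sequence[T], branch: Callable[[T], T]) -> List[T]:
    """Hypothetical unit: split, process one half, concatenate, shuffle."""
    left, right = channel_split(channels)
    # Only half of the channels pass through the (costly) branch;
    # the other half is carried over unchanged.
    processed = [branch(c) for c in right]
    return channel_shuffle(left + processed, groups=2)


if __name__ == "__main__":
    # Eight dummy channels; the branch is a placeholder for a graph convolution.
    print(shuffle_unit(list(range(8)), lambda c: c * 10))
```

Because only one half of the channels goes through the processing branch, the cost of that branch shrinks accordingly, and the shuffle step lets information mix between the processed and untouched halves in the next unit.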

Cite this article as:

Q. Yu, Y. Dai, K. Hirota, S. Shao, and W. Dai, “Shuffle Graph Convolutional Network for Skeleton-Based Action Recognition,” J. Adv. Comput. Intell. Intell. Inform., Vol.27 No.5, pp. 790-800, 2023.

References

[1] F. Gu, M. Chung, M. Chignell, S. Valaee, B. Zhou, and X. Liu, “A Survey on Deep Learning for Human Activity Recognition,” ACM Computing Surveys (CSUR), Vol.54, No.8, Article No.177, 2021. https://doi.org/10.1145/3472290
[2] C. Bandi and U. Thomas, “Skeleton-Based Action Recognition for Human-Robot Interaction Using Self-Attention Mechanism,” Proc. of 2021 16th IEEE Int. Conf. on Automatic Face and Gesture Recognition (FG 2021), 2021. https://doi.org/10.1109/FG52635.2021.9666948
[3] J. Kim, “Efficient Human Action Recognition with Dual-Action Neural Networks for Virtual Sports Training,” Proc. of 2022 IEEE Int. Conf. on Consumer Electronics-Asia (ICCE-Asia), 2022. https://doi.org/10.1109/ICCE-Asia57006.2022.9954758
[4] D. Zhao and M. Zhi, “A review of action recognition methods based on skeleton data,” Proc. of 13th Int. Conf. on Graphics and Image Processing (ICGIP 2021), Vol.12083, 2022. https://doi.org/10.1117/12.2623195
[5] L. Feng, Y. Zhao, W. Zhao, and J. Tang, “A comparative review of graph convolutional networks for human skeleton-based action recognition,” Artificial Intelligence Review, Vol.55, No.5, pp. 4275-4305, 2022. https://doi.org/10.1007/s10462-021-10107-y
[6] X. Shen and Y. Ding, “Human skeleton representation for 3D action recognition based on complex network coding and LSTM,” J. of Visual Communication and Image Representation, Vol.82, Article No.103386, 2022. https://doi.org/10.1016/j.jvcir.2021.103386
[7] W. Ng, M. Zhang, and T. Wang, “Multi-localized sensitive autoencoder-attention-LSTM for skeleton-based action recognition,” IEEE Trans. on Multimedia, Vol.24, pp. 1678-1690, 2021. https://doi.org/10.1109/TMM.2021.3070127
[8] M. Naveenkumar and S. Domnic, “Spatio Temporal Joint Distance Maps for Skeleton-Based Action Recognition Using Convolutional Neural Networks,” Int. J. of Image and Graphics, Vol.21, No.05, Article No.2140001, 2021. https://doi.org/10.1142/S0219467821400015
[9] W. Ding, C. Ding, G. Li, and K. Liu, “Skeleton-based square grid for human action recognition with 3D convolutional neural network,” IEEE Access, Vol.9, pp. 54078-54089, 2021. https://doi.org/10.1109/ACCESS.2021.3059650
[10] S. Guan, H. Lu, L. Zhu, and G. Fang, “AFE-CNN: 3D Skeleton-based Action Recognition with Action Feature Enhancement,” Neurocomputing, Vol.514, pp. 256-267, 2022. https://doi.org/10.1016/j.neucom.2022.10.016
[11] S. Yan, Y. Xiong, and D. Lin, “Spatial temporal graph convolutional networks for skeleton-based action recognition,” Proc. of the AAAI Conf. on Artificial Intelligence, Vol.32, No.1, 2018. https://doi.org/10.1609/aaai.v32i1.12328
[12] L. Shi, Y. Zhang, J. Cheng, and H. Lu, “Two-stream adaptive graph convolutional networks for skeleton-based action recognition,” Proc. of the IEEE/CVF Conf. on Computer Vision and Pattern Recognition (CVPR), pp. 12018-12027, 2019. https://doi.org/10.1109/CVPR.2019.01230
[13] D.-T. Pham, Q.-T. Pham, T.-L. Le, and H. Vu, “An Efficient Feature Fusion of Graph Convolutional Networks and its Application for Real-Time Traffic Control Gestures Recognition,” IEEE Access, Vol.9, pp. 121930-121943, 2021. https://doi.org/10.1109/ACCESS.2021.3109255
[14] D.-T. Pham, Q.-T. Pham, T.-T. Nguyen, T.-L. Le, and H. Vu, “A lightweight graph convolutional network for skeleton-based action recognition,” Multimedia Tools and Applications, Vol.82, pp. 3055-3079, 2022. https://doi.org/10.1007/s11042-022-13298-w
[15] Y. Chen, Z. Zhang, C. Yuan, B. Li, Y. Deng, and W. Hu, “Channel-wise topology refinement graph convolution for skeleton-based action recognition,” Proc. of 2021 IEEE/CVF Int. Conf. on Computer Vision (ICCV), pp. 13339-13348, 2021. https://doi.org/10.1109/ICCV48922.2021.01311
[16] W. Zhang, L. Zhou, and X. Qian, “Skeleton-based Action Recognition with Attention and Temporal Graph Convolutional Network,” Proc. of 2021 IEEE 6th Int. Conf. on Signal and Image Processing (ICSIP), pp. 19-23, 2021. https://doi.org/10.1109/ICSIP52628.2021.9688615
[17] L. Shi, Y. Zhang, J. Cheng, and H. Lu, “Skeleton-Based Action Recognition with Directed Graph Neural Networks,” Proc. of 2019 IEEE/CVF Conf. on Computer Vision and Pattern Recognition (CVPR), pp. 7904-7913, 2019. https://doi.org/10.1109/CVPR.2019.00810
[18] W. Peng, X. Hong, H. Chen, and G. Zhao, “Learning graph convolutional network for skeleton-based human action recognition by neural searching,” Proc. of the AAAI Conf. on Artificial Intelligence, Vol.34, No.03, 2020. https://doi.org/10.1609/aaai.v34i03.5652
[19] N. Ma, X. Zhang, H. Zheng, and J. Sun, “ShuffleNet v2: Practical guidelines for efficient CNN architecture design,” Proc. of the European Conf. on Computer Vision (ECCV 2018), pp. 122-138, 2018. https://doi.org/10.1007/978-3-030-01264-9_8
[20] A. Shahroudy, J. Liu, T. Ng, and G. Wang, “NTU RGB+D: A large scale dataset for 3D human activity analysis,” Proc. of 2016 IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), pp. 1010-1019, 2016. https://doi.org/10.1109/CVPR.2016.115
[21] K. Xu, F. Ye, Q. Zhong, and D. Xie, “Topology-Aware Convolutional Neural Network for Efficient Skeleton-Based Action Recognition,” Proc. of the AAAI Conf. on Artificial Intelligence, Vol.36, No.3, pp. 2866-2874, 2022. https://doi.org/10.1609/aaai.v36i3.20191
[22] S. Jang, H. Lee, S. Cho, S. Woo, and S. Lee, “Ghost Graph Convolutional Network for Skeleton-Based Action Recognition,” Proc. of 2021 IEEE Int. Conf. on Consumer Electronics-Asia (ICCE-Asia), 2021. https://doi.org/10.1109/ICCE-Asia53811.2021.9641919
[23] C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, and Z. Wojna, “Rethinking the inception architecture for computer vision,” Proc. of 2016 IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), pp. 2818-2826, 2016. https://doi.org/10.1109/CVPR.2016.308
[24] D. Liu, H. Xu, J. Wang, Y. Lu, J. Kong, and M. Qi, “Adaptive Attention Memory Graph Convolutional Networks for Skeleton-Based Action Recognition,” Sensors, Vol.21, No.20, Article No.6761, 2021. https://doi.org/10.3390/s21206761
[25] Y.-F. Song, Z. Zhang, C. Shan, and L. Wang, “Richly activated graph convolutional network for robust skeleton-based action recognition,” IEEE Trans. on Circuits and Systems for Video Technology, Vol.31, No.5, pp. 1915-1925, 2020. https://doi.org/10.1109/TCSVT.2020.3015051

This article is published under a Creative Commons Attribution-NoDerivatives 4.0 International License.
