
JACIII Vol.28 No.2 pp. 265-272
doi: 10.20965/jaciii.2024.p0265
(2024)

Research Paper:

Applying LSTM and GRU Methods to Recognize and Interpret Hand Gestures, Poses, and Face-Based Sign Language in Real Time

Amil Ahmad Ilham*,†, Ingrid Nurtanio*, Ridwang**, and Syafaruddin***

*Department of Informatics, Universitas Hasanuddin
Jl. Poros Malino Km. 6, Bontomarannu, Gowa, Sulawesi Selatan 92171, Indonesia

†Corresponding author

**Department of Electrical Engineering, Universitas Muhammadiyah Makassar
Jl. Sultan Alauddin No.259, Makassar, Sulawesi Selatan 90221, Indonesia

***Department of Electrical Engineering, Universitas Hasanuddin
Jl. Poros Malino Km. 6, Bontomarannu, Gowa, Sulawesi Selatan 92171, Indonesia

Received: July 25, 2023
Accepted: October 12, 2023
Published: March 20, 2024
Keywords: hand gesture, sign language, long short-term memory, gated recurrent unit, real time
Abstract

This research examines sign language recognition through a real-time human-computer interaction application. It develops a rule-based hand gesture approach for Indonesian sign language that interprets some words from a combination of hand movements, facial expressions (mimics), and poses. The main objective is to recognize signs based on hand movements made in front of the body with one or two hands; these movements may involve switching between the left and right hand, or may be combined with mimics and poses. To address this problem, a research framework is developed that coordinates hand gestures with poses and mimics and creates features using MediaPipe Holistic. Long short-term memory (LSTM) and gated recurrent unit (GRU) approaches are used for training and for real-time testing. The findings show that hand gestures are reliably recognized in real-time interaction, and that some words are interpreted with accuracies of 94% and 96% for the LSTM and GRU methods, respectively.
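The abstract states only that features are created with MediaPipe Holistic from hands, pose, and face. As a minimal sketch of what such a per-frame feature could look like, the Python below concatenates the four landmark groups into one vector and pads a clip to a fixed length for a recurrent model. The landmark counts (33 pose, 468 face, 21 per hand) are those of MediaPipe Holistic; the value layout, the zero-filling of undetected parts, the 30-frame window, and the names `flatten`, `frame_features`, and `make_sequence` are assumptions for illustration, not the authors' implementation.

```python
from typing import List, Optional, Sequence

# Landmark counts produced by MediaPipe Holistic.
POSE_COUNT, FACE_COUNT, HAND_COUNT = 33, 468, 21
# Assumed values per landmark: pose carries (x, y, z, visibility), the rest (x, y, z).
POSE_DIMS, FACE_DIMS, HAND_DIMS = 4, 3, 3
FRAME_SIZE = (POSE_COUNT * POSE_DIMS + FACE_COUNT * FACE_DIMS
              + 2 * HAND_COUNT * HAND_DIMS)  # 132 + 1404 + 126 = 1662

Landmarks = Optional[Sequence[Sequence[float]]]


def flatten(landmarks: Landmarks, count: int, dims: int) -> List[float]:
    """Flatten one landmark group; return zeros if the part was not detected."""
    if landmarks is None:
        return [0.0] * (count * dims)
    if len(landmarks) != count or any(len(p) != dims for p in landmarks):
        raise ValueError(f"expected {count} landmarks with {dims} values each")
    return [float(v) for point in landmarks for v in point]


def frame_features(pose: Landmarks, face: Landmarks,
                   left_hand: Landmarks, right_hand: Landmarks) -> List[float]:
    """Concatenate pose, face, and both hands into one per-frame vector."""
    return (flatten(pose, POSE_COUNT, POSE_DIMS)
            + flatten(face, FACE_COUNT, FACE_DIMS)
            + flatten(left_hand, HAND_COUNT, HAND_DIMS)
            + flatten(right_hand, HAND_COUNT, HAND_DIMS))


def make_sequence(frames: Sequence[List[float]], length: int = 30) -> List[List[float]]:
    """Truncate or zero-pad a clip to `length` frames (window size is assumed)."""
    clip = [list(f) for f in frames[:length]]
    clip += [[0.0] * FRAME_SIZE for _ in range(length - len(clip))]
    return clip


# Example: a frame where only the pose and the right hand were detected.
pose = [[0.1, 0.2, 0.3, 0.9]] * POSE_COUNT
right = [[0.5, 0.5, 0.0]] * HAND_COUNT
vec = frame_features(pose, None, None, right)
seq = make_sequence([vec], length=30)
```

A sequence shaped like `seq` (frames by features) is the kind of input an LSTM or GRU classifier would consume. Zero-filling keeps the vector length constant when a hand leaves the frame, which suits signs that switch between the left and right hand.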

Cite this article as:
A. Ilham, I. Nurtanio, Ridwang, and Syafaruddin, “Applying LSTM and GRU Methods to Recognize and Interpret Hand Gestures, Poses, and Face-Based Sign Language in Real Time,” J. Adv. Comput. Intell. Intell. Inform., Vol.28 No.2, pp. 265-272, 2024.


Last updated on Jul. 19, 2024