Research Paper:
Applying LSTM and GRU Methods to Recognize and Interpret Hand Gestures, Poses, and Face-Based Sign Language in Real Time
Amil Ahmad Ilham*, Ingrid Nurtanio*, Ridwang**, and Syafaruddin***

*Department of Informatics, Universitas Hasanuddin
Jl. Poros Malino Km. 6, Bontomarannu, Gowa, Sulawesi Selatan 92171, Indonesia
Corresponding author
**Department of Electrical Engineering, Universitas Muhammadiyah Makassar
Jl. Sultan Alauddin No.259, Makassar, Sulawesi Selatan 90221, Indonesia
***Department of Electrical Engineering, Universitas Hasanuddin
Jl. Poros Malino Km. 6, Bontomarannu, Gowa, Sulawesi Selatan 92171, Indonesia
This study examines real-time sign language recognition in a human-computer interaction application. It develops a rule-based hand gesture approach for Indonesian sign language that interprets a set of words from combinations of hand movements, facial expressions (mimics), and poses. The main objective is to recognize signs based on hand movements made in front of the body with one or two hands; these movements may involve switching between the left and right hands or may be combined with mimics and poses. To address this problem, a research framework is developed that coordinates hand gestures with poses and mimics to create features using MediaPipe Holistic. The long short-term memory (LSTM) and gated recurrent unit (GRU) approaches are used to train and test on the data in real time. The findings presented in this paper show that hand gestures in real-time interactions are reliably recognized, and that some words are interpreted with high accuracy: 94% for the LSTM method and 96% for the GRU method.
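As an illustration of how hand, pose, and face landmarks might be combined into one feature vector per video frame, the following is a minimal sketch and not the authors' implementation. It assumes the default landmark counts of MediaPipe Holistic (33 pose, 468 face, and 21 per hand). The value layout, the ordering, the zero-filling of undetected groups, and all function names are hypothetical choices made for this example.

```python
from typing import Dict, List, Optional, Sequence

# Assumed layout: pose landmarks carry (x, y, z, visibility); face and hand
# landmarks carry (x, y, z). These sizes are illustrative assumptions and are
# not taken from the paper.
POSE_DIM = 33 * 4
FACE_DIM = 468 * 3
HAND_DIM = 21 * 3
FRAME_DIM = POSE_DIM + FACE_DIM + 2 * HAND_DIM  # 1662 values per frame

Landmarks = Optional[Sequence[Sequence[float]]]


def flatten_or_zeros(landmarks: Landmarks, dim: int) -> List[float]:
    """Flatten one landmark group, or return zeros if it was not detected."""
    if landmarks is None:
        return [0.0] * dim
    flat = [float(value) for point in landmarks for value in point]
    if len(flat) != dim:
        raise ValueError(f"expected {dim} values, got {len(flat)}")
    return flat


def frame_features(
    pose: Landmarks,
    face: Landmarks,
    left_hand: Landmarks,
    right_hand: Landmarks,
) -> List[float]:
    """Concatenate pose, face, and both hands into one fixed-length vector.

    A sign made with only one hand leaves the other hand undetected; that
    slot is zero-filled so every frame has the same length.
    """
    return (
        flatten_or_zeros(pose, POSE_DIM)
        + flatten_or_zeros(face, FACE_DIM)
        + flatten_or_zeros(left_hand, HAND_DIM)
        + flatten_or_zeros(right_hand, HAND_DIM)
    )


def sequence_features(frames: Sequence[Dict[str, Landmarks]]) -> List[List[float]]:
    """Stack per-frame vectors into a (time, features) sequence."""
    return [frame_features(**frame) for frame in frames]
```

A sequence built this way has the (time steps, features) shape that recurrent layers such as LSTM or GRU typically take as input, with one such sequence per recorded sign.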
This article is published under a Creative Commons Attribution-NoDerivatives 4.0 International License.