Research Paper:
Egocentric Behavior Analysis Based on Object Relationship Extraction with Graph Transfer Learning for Cognitive Rehabilitation Support
Adnan Rachmat Anom Besari*,**, Fernando Ardilla*,**, Azhar Aulia Saputra*, Kurnianingsih***, Takenori Obo*, and Naoyuki Kubota*

*Department of Mechanical Systems Engineering, Faculty of Systems Design, Tokyo Metropolitan University
6-6 Asahigaoka, Hino, Tokyo 191-0065, Japan
**Department of Information and Computer Engineering, Politeknik Elektronika Negeri Surabaya
PENS Campus, Raya ITS, Keputih, Sukolilo, Surabaya, East Java 60111, Indonesia
***Department of Electrical Engineering, Politeknik Negeri Semarang
Polines Campus, Prof. H. Soedarto, S.H., Tembalang, Semarang 50275, Indonesia
Corresponding author
Recognizing human behavior is essential for early intervention in cognitive rehabilitation, particularly for older adults. Traditional methods often focus on improving third-person vision but overlook the role of human visual attention during object interactions. This study introduces an egocentric behavior analysis (EBA) framework that uses transfer learning to analyze object relationships. Egocentric vision is used to extract features from hand movements, object detection, and visual attention. These features are then used to validate hand-object interactions (HOI) and describe human activities involving multiple objects. The proposed method employs graph attention networks (GATs) with transfer learning, achieving 97% accuracy in categorizing various activities while reducing computation time. These findings suggest that integrating EBA with advanced machine learning methods could revolutionize cognitive rehabilitation by offering more personalized and efficient interventions. Future research can explore real-world applications of this approach, potentially improving the quality of life for older adults through better cognitive health monitoring.
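To illustrate the kind of model the abstract describes, the following is a minimal sketch of a GAT-based activity classifier over hand-object interaction graphs, written with PyTorch Geometric. The feature dimensions, class count, two-layer GATConv design, mean-pooling readout, and frozen-layer fine-tuning are illustrative assumptions, not the authors' exact architecture or training pipeline.

```python
# Minimal sketch (assumed architecture): hand/object nodes with feature
# vectors, GAT layers for relational reasoning, and a graph-level readout
# for activity classification.
import torch
import torch.nn.functional as F
from torch_geometric.nn import GATConv, global_mean_pool


class ActivityGAT(torch.nn.Module):
    def __init__(self, in_dim=32, hidden_dim=64, num_classes=5, heads=4):
        super().__init__()
        # First GAT layer: multi-head attention over hand/object nodes.
        self.gat1 = GATConv(in_dim, hidden_dim, heads=heads)
        # Second GAT layer: single head, yields final node embeddings.
        self.gat2 = GATConv(hidden_dim * heads, hidden_dim, heads=1)
        self.classifier = torch.nn.Linear(hidden_dim, num_classes)

    def forward(self, x, edge_index, batch):
        x = F.elu(self.gat1(x, edge_index))
        x = F.elu(self.gat2(x, edge_index))
        # Pool node embeddings into one vector per interaction graph.
        x = global_mean_pool(x, batch)
        return self.classifier(x)


# Transfer learning in the spirit of the paper: reuse pretrained GAT layers
# and fine-tune only the classification head on a new activity set.
model = ActivityGAT()
# model.load_state_dict(torch.load("pretrained_gat.pt"))  # hypothetical checkpoint
for layer in (model.gat1, model.gat2):
    for p in layer.parameters():
        p.requires_grad = False
```

In such a setup, each interaction graph would carry node features for the hands and detected objects, edges for their spatial or attentional relationships, and one activity label per graph; transfer learning then amounts to reusing the pretrained GAT layers and updating only the classifier for a new set of activities.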
This article is published under a Creative Commons Attribution-NoDerivatives 4.0 International License.