
JRM Vol.38 No.1 pp. 242-251 (2026)

Paper:

AST-MLEC: AST-Enhanced Multi-Label Error Classifier for Programming Education Support

Taku Matsumoto

Hokkaido University of Science
7-Jo 15-4-1 Maeda, Teine-ku, Sapporo, Hokkaido 006-8585, Japan

Received: July 24, 2025
Accepted: October 28, 2025
Published: February 20, 2026
Keywords: programming education, logical errors, automatic feedback generation, abstract syntax tree
Abstract

Programming education has become a vital component of modern STEM and STEAM curricula. Among the challenges that learners face, logical errors, i.e., code that is syntactically correct but semantically incorrect, are particularly difficult to identify and correct without guidance. To address this issue, we propose an abstract syntax tree (AST)-enhanced multi-label error classifier (AST-MLEC). This Transformer-based model performs fine-grained detection and classification of logical errors at the AST node level. The architecture integrates syntactic structure and token-level semantics through multi-stream encoding with tree positional encoding, enabling the model to capture both structural and contextual cues. Evaluations on submissions from the Aizu Online Judge demonstrated strong performance, achieving an F1-score of 0.6621 and high precision in error localization and classification. In contrast to conventional token-based models, AST-MLEC provides interpretable feedback by identifying “where” an error occurs, “what” type it is, and “how” to fix it. By supporting automated, explainable feedback for novice programmers, our approach aligns with the goals of STEAM education, fostering deeper understanding, critical thinking, and self-directed learning in programming.
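
The following is a minimal sketch (not the authors' implementation) of the idea summarized in the abstract: AST nodes are flattened, given a simple depth-based tree positional encoding, passed through a Transformer encoder, and scored per node against a set of error labels. Python's ast module and PyTorch are used only for illustration, and the class name, label set, and dimensions are assumptions, not details taken from the paper.

import ast
import torch
import torch.nn as nn

ERROR_LABELS = ["off_by_one", "wrong_operator", "missing_condition"]  # hypothetical label set

def ast_nodes_with_depth(source: str):
    """Flatten a Python AST into (node_type, depth) pairs; depth serves as a
    crude stand-in for a tree positional encoding."""
    nodes = []
    def visit(node, depth):
        nodes.append((type(node).__name__, depth))
        for child in ast.iter_child_nodes(node):
            visit(child, depth + 1)
    visit(ast.parse(source), 0)
    return nodes

class NodeLevelMultiLabelClassifier(nn.Module):
    def __init__(self, num_node_types, num_labels, d_model=128, max_depth=64):
        super().__init__()
        self.type_emb = nn.Embedding(num_node_types, d_model)   # structural stream: AST node types
        self.depth_emb = nn.Embedding(max_depth, d_model)        # tree positional encoding (depth only)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, num_labels)                # per-node multi-label logits

    def forward(self, type_ids, depth_ids):
        x = self.type_emb(type_ids) + self.depth_emb(depth_ids)
        h = self.encoder(x)
        return torch.sigmoid(self.head(h))  # independent probability per error label and node

# Usage: score every AST node of a toy submission against the label set.
code = "total = 0\nfor i in range(1, n):\n    total += i\n"
nodes = ast_nodes_with_depth(code)
vocab = {t: i for i, t in enumerate(sorted({t for t, _ in nodes}))}
type_ids = torch.tensor([[vocab[t] for t, _ in nodes]])
depth_ids = torch.tensor([[d for _, d in nodes]])
model = NodeLevelMultiLabelClassifier(len(vocab), len(ERROR_LABELS))
probs = model(type_ids, depth_ids)   # shape: (1, num_nodes, num_labels)

Because each node receives an independent probability per label, a submission can be flagged with several error types at once, which matches the multi-label framing of the task.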

Cite this article as:
T. Matsumoto, “AST-MLEC: AST-Enhanced Multi-Label Error Classifier for Programming Education Support,” J. Robot. Mechatron., Vol.38 No.1, pp. 242-251, 2026.
