Research Paper:
Predicting Student Performance Through Data Mining: A Case Study in Sultan Ageng Tirtayasa University
Rocky Alfanz*, Raphael Kusumo Hendrianto*, and Al Hafiz Akbar Maulana Siagian**

*Department of Electrical Engineering, Faculty of Engineering, Universitas Sultan Ageng Tirtayasa
Jl. Jend. Sudirman KM 3 Kota Bumi, Cilegon, Banten 42435, Indonesia
Corresponding author
**Research Center for Data and Information Science, National Research and Innovation Agency
Jl. M.H. Thamrin No.8, Jakarta Pusat, Jakarta 10340, Indonesia
Failure in compulsory subjects such as chemistry, calculus, physics, and basic control systems can delay students' graduation, so students need to pass these courses. To address this issue, this study predicts student performance from learning outcomes using data mining techniques. In particular, we use decision tree (DT), k-nearest neighbor (KNN), support vector machine (SVM), and naive Bayes (NB) algorithms. The data were gathered from the learning outcomes of students in the basic control systems course and modeled using binary and nine-level classifications. The experimental results showed that DT performed better than KNN, SVM, and NB in both the binary and nine-level classifications. Notably, the values predicted by DT were close to the original values for the basic control systems course.
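To make the two target encodings concrete, the following is a minimal sketch of how a numeric course score might be turned into a binary label and a nine-level label before any classifier is trained. The grade letters, the cut-off values, and the pass mark are hypothetical; this section does not state the levels or thresholds used in the study.

```python
from bisect import bisect_right

# Hypothetical lower bounds on a 0-100 scale; the study's actual nine
# levels and cut-offs are not given in this section.
NINE_LEVEL_BOUNDS = [40, 50, 55, 60, 65, 70, 75, 80]
NINE_LEVEL_LABELS = ["E", "D", "C", "C+", "B-", "B", "B+", "A-", "A"]
PASS_MARK = 55  # hypothetical pass threshold for the binary target


def to_binary(score: float) -> str:
    """Map a final score to a two-class label (pass/fail)."""
    return "pass" if score >= PASS_MARK else "fail"


def to_nine_level(score: float) -> str:
    """Map a final score to one of nine ordered grade labels."""
    # bisect_right counts how many lower bounds the score has reached,
    # which is exactly the index of its grade band.
    return NINE_LEVEL_LABELS[bisect_right(NINE_LEVEL_BOUNDS, score)]


if __name__ == "__main__":
    for score in (38, 55, 62, 91):
        print(score, to_binary(score), to_nine_level(score))
```

With the same feature set, the binary target gives a classifier two classes to separate, while the nine-level target gives it nine, so accuracy on the two tasks is not directly comparable.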
This article is published under a Creative Commons Attribution-NoDerivatives 4.0 International License.