Predicting Student Performance using Linear Regression
DOI: https://doi.org/10.63017/jdsi.v3i2.104
Abstract
This study compares several machine learning algorithms for predicting student performance in order to identify the model that produces the best predictions. The data were obtained from the Kaggle data science and machine learning community website; the dataset has 6 attributes: (1) Hours Studied, (2) Previous Scores, (3) Extracurricular Activities, (4) Sleep Hours, (5) Sample Question Papers Practiced, and (6) Performance Index. The data were cleaned and explored using Microsoft Excel, Google Colab, and Tableau, and the models were developed in RapidMiner and Google Colab. The algorithms evaluated were k-NN, SVM, Linear Regression, Generalized Linear Model, and Deep Learning. Their Root Mean Squared Error (RMSE) values were 2.455 (k-NN), 2.072 (SVM), 2.013 (Linear Regression), 2.030 (Generalized Linear Model), and 2.364 (Deep Learning). Linear Regression therefore achieved the lowest RMSE. When retested, Linear Regression obtained an RMSE of 2.015 and a coefficient of determination (R²) of 0.989, meaning the model explains about 98.9% of the variance in the target variable.
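As a rough illustration of the two metrics the abstract relies on, the following minimal Python sketch computes RMSE and R² by hand and then picks the algorithm with the lowest reported RMSE. The function names and the `actual`/`predicted` values are hypothetical and made up for illustration; they are not the study's data or code. The `reported_rmse` values are the ones listed in the abstract, with the decimal comma read as a decimal point.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error: square root of the mean squared residual."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up values for illustration only; NOT taken from the study's dataset.
actual = [50.0, 60.0, 70.0, 80.0, 90.0]
predicted = [52.0, 58.0, 71.0, 79.0, 92.0]

print(f"RMSE = {rmse(actual, predicted):.3f}")
print(f"R^2  = {r_squared(actual, predicted):.3f}")

# RMSE values as reported in the abstract (decimal comma read as a point).
reported_rmse = {
    "k-NN": 2.455,
    "SVM": 2.072,
    "Linear Regression": 2.013,
    "Generalized Linear Model": 2.030,
    "Deep Learning": 2.364,
}

# Lower RMSE means smaller average prediction error, so take the minimum.
best = min(reported_rmse, key=reported_rmse.get)
print(f"Lowest reported RMSE: {best} ({reported_rmse[best]})")
```

RMSE is expressed in the units of the target variable, so it is suited to comparing models on the same dataset, while R² is a unitless share of explained variance and is not the same as classification accuracy.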
License
Copyright (c) 2025 Data Science Insights

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.