Student Attrition Prediction Using Machine Learning Techniques

Authors

  • Doris Chinedu Asogwa, Nnamdi Azikiwe University, Awka, Anambra State, 234, Nigeria
  • Emmanuel Chibuogu Asogwa, Nnamdi Azikiwe University, Awka, Anambra State, 234, Nigeria
  • Emmanuel Chinedu Mbonu, Nnamdi Azikiwe University, Awka, Anambra State, 234, Nigeria
  • Joshua Makuochukwu Nwankpa, Nnamdi Azikiwe University, Awka, Anambra State, 234, Nigeria
  • Tochukwu Sunday Belonwu, Nnamdi Azikiwe University, Awka, Anambra State, 234, Nigeria

Keywords:

Machine learning, Predictive model, Random Forest, Random Tree algorithm, Student Attrition, Feature selection method, Java Virtual Machine (JVM), NetBeans Integrated Development Environment (IDE), Weka Tool, Weka Plugin

Abstract

In educational systems, student course enrollment is a fundamental performance metric for academic and financial sustainability. In many higher institutions today, student attrition is driven by a variety of circumstances, including demographic and personal factors such as age, gender, academic background, financial ability, and choice of academic degree. In this study, machine learning approaches were used to develop models that predict attrition among students pursuing a computer science degree and identify students at high risk of dropping out before graduation. This can help higher education institutions develop appropriate intervention plans to reduce attrition rates and increase the probability of student academic success. Student data were collected from the Federal University Lokoja (FUL), Nigeria. The data were preprocessed using existing Weka machine learning libraries: they were converted into Attribute-Relation File Format (ARFF), and resampling techniques were used to partition them into a training set and a testing set. Correlation-based feature selection was applied, and the selected features were used to develop the student attrition model and to identify students at risk of dropping out. Random forest and random tree machine learning algorithms were used to predict student attrition. The results showed that random forest achieved an accuracy of 79.45%, while random tree achieved 78.09%. This is an improvement over previous results, where accuracies of 66.14% and 57.48% were recorded for random forest and random tree respectively. The improvement is attributed to the preprocessing, resampling, and feature selection techniques used. It is therefore recommended that such techniques be applied to classification models to improve their performance.
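
As a quick check on the comparison reported above, the accuracy gains can be expressed in percentage points. The following is a minimal Python sketch that uses only the four accuracy figures quoted in the abstract; the names `reported`, `previous`, and `accuracy_gains` are illustrative assumptions and are not taken from the study, which was carried out with Weka.

```python
# Accuracies (in percent) as quoted in the abstract.
reported = {"random forest": 79.45, "random tree": 78.09}
previous = {"random forest": 66.14, "random tree": 57.48}


def accuracy_gains(new, old):
    """Return the gain in percentage points per model, rounded to 2 decimals.

    Hypothetical helper for illustration only; not part of the study.
    """
    return {model: round(new[model] - old[model], 2) for model in new}


gains = accuracy_gains(reported, previous)
for model, gain in gains.items():
    print(f"{model}: +{gain:.2f} percentage points")
```

On these figures the gain is about 13.31 percentage points for random forest and 20.61 for random tree, so the weaker baseline model benefited more from the applied techniques.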

Published

2023-09-02

How to Cite

Asogwa, D. C., Asogwa, E. C., Mbonu, E. C., Nwankpa, J. M., & Belonwu, T. S. (2023). Student Attrition Prediction Using Machine Learning Techniques. International Journal of Computer (IJC), 49(1), 16–29. Retrieved from https://www.ijcjournal.org/index.php/InternationalJournalOfComputer/article/view/2110

Section

Articles