Predicting Student Academic Performance
Abstract
The growing use of electronic systems in education has produced vast volumes of data in educational databases. Educational data mining (EDM) is an emerging research field concerned with exploring data collected from educational institutions. The knowledge extracted from these data offers a clearer view of how to improve the education process. One active research area in this field is predicting student performance, which can help decision makers identify the main reasons students fail or succeed in their studies. This research uses EDM techniques to build a framework for predicting students’ academic performance in advance, using pre-admission information and first-year subject marks.
The data set used in this research comes from students enrolled in the College of Computer Science at King Khalid University in Saudi Arabia. To achieve our goal, we first apply data preprocessing and analysis techniques to select the most important attributes, and then apply machine learning algorithms to the data. The experimental results show that the Random Forest algorithm gave the best prediction results, with an accuracy of more than 78%. Moreover, the results show that high school GPA and the English, Math, and computer science subjects are the most important features affecting student performance at the university.
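As an illustration only, the workflow described above (train a Random Forest, measure accuracy, rank attributes by importance) might be sketched as follows. This is not the authors' implementation: it assumes scikit-learn and NumPy are available, the column names are hypothetical, and the records are synthetic, generated under an assumed rule rather than taken from the King Khalid University data set.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Hypothetical attribute names; the real data set's columns are not listed in the text.
# "noise" is a deliberately uninformative column added for contrast.
FEATURES = ["hs_gpa", "english", "math", "computer_science", "noise"]


def make_synthetic_students(n=600, seed=0):
    """Generate synthetic records: four informative marks plus one noise column."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(0.0, 100.0, size=(n, len(FEATURES)))
    # Assumed rule: the outcome depends on the mean of the four informative
    # marks, perturbed by Gaussian noise. The noise column plays no role.
    score = X[:, :4].mean(axis=1) + rng.normal(0.0, 5.0, size=n)
    y = (score >= 50.0).astype(int)  # 1 = succeeds, 0 = at risk
    return X, y


def train_and_rank(X, y, seed=0):
    """Fit a Random Forest; return test accuracy and attributes ranked by importance."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed, stratify=y
    )
    model = RandomForestClassifier(n_estimators=200, random_state=seed)
    model.fit(X_train, y_train)
    accuracy = accuracy_score(y_test, model.predict(X_test))
    # Impurity-based importances; they sum to 1 across all attributes.
    ranking = sorted(
        zip(FEATURES, model.feature_importances_),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return accuracy, ranking


if __name__ == "__main__":
    X, y = make_synthetic_students()
    accuracy, ranking = train_and_rank(X, y)
    print(f"held-out accuracy: {accuracy:.2f}")
    for name, importance in ranking:
        print(f"{name:>18}: {importance:.3f}")
```

On real records, the synthetic generator would be replaced by the preprocessed pre-admission and first-year attributes, and the importance ranking would serve as one possible way to see which attributes the model relies on most.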