Analysing and Predicting School Performance of Ofsted Using Machine Learning Classification Models
Abstract
Parents today pay closer attention to school performance and ratings than before,
mainly because school is a major influence on a child’s competencies, personality,
and many other core skills. Education is regarded as one of the most essential
principles, and each state devotes significant resources to advancing it and
bolstering its credibility by implementing contemporary curricula and educational
techniques that improve students’ capacity to acquire a wide range of skills. This
study focuses on the UK, which uses the OFSTED (Office for Standards in Education)
rating system to monitor education and school performance. The rating is becoming an
essential metric, as directors want their schools to receive a favourable evaluation
from the Board of Inspectors. The OFSTED rating consists of four main grades:
Outstanding, Good, Requires Improvement and Inadequate. This dissertation’s
major objective is to examine how factors such as the school’s absence rate, general
information, student level, and parent feedback relate to the OFSTED rating a school
receives and to whether that rating rises or falls. The study starts by exploring the
data sources available from the UK government database and selecting the datasets
most relevant to the study. The collected datasets were then merged with the Parent
View dataset into a single dataset using the school ID. Moving to
the data analysis and visualization part, various analyses of the dataset features
were performed. These showed how feature values are distributed within each rating
class and how variation in some features, such as the absence rate, can affect the
rating directly. In the machine learning part, several training methods were compared
to obtain the best results. The last part was an additional analysis of regions,
comparing the East of London, Norfolk, and Hampshire to see whether the region could
affect the OFSTED rating. As a result, the best model, which combined oversampled
data with the LGBM classifier,
reached 91% accuracy. Based on the feature importances of that model, we concluded
that students’ levels (A-level scores) and parent feedback (from the questionnaire)
are the most important features in deciding the school rating. In addition, the
analysis shows that some features, such as school absence rates, can have a direct
impact on the OFSTED rating. The region analysis also shows that a school’s location
and type are important.
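
The data preparation steps described above (merging datasets on the school ID, then oversampling before training) could look roughly like the following minimal sketch. It is an illustration only, not the dissertation's code: the field names (`school_id`, `absence_rate`, `parent_recommend_pct`), the helper functions, and the toy rows are all hypothetical. The abstract does not say which oversampling method was used, so plain random oversampling is assumed here. Classifier training is left out.

```python
import random
from collections import Counter, defaultdict


def merge_on_school_id(school_rows, parent_rows, key="school_id"):
    """Inner-join two lists of dicts on a shared school identifier."""
    parents_by_id = {row[key]: row for row in parent_rows}
    merged = []
    for row in school_rows:
        match = parents_by_id.get(row[key])
        if match is not None:  # keep only schools present in both sources
            merged.append({**row, **match})
    return merged


def random_oversample(rows, label="rating", seed=0):
    """Duplicate minority-class rows at random until all classes are equal in size."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row in rows:
        by_class[row[label]].append(row)
    target = max(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        balanced.extend(group)
        # top up smaller classes with randomly chosen duplicates
        balanced.extend(rng.choice(group) for _ in range(target - len(group)))
    return balanced


# Hypothetical toy data standing in for the government and parent feedback datasets.
schools = [
    {"school_id": 1, "absence_rate": 4.1, "rating": "Good"},
    {"school_id": 2, "absence_rate": 3.2, "rating": "Good"},
    {"school_id": 3, "absence_rate": 2.5, "rating": "Outstanding"},
    {"school_id": 4, "absence_rate": 7.8, "rating": "Inadequate"},
    {"school_id": 5, "absence_rate": 4.4, "rating": "Good"},
]
parent_view = [
    {"school_id": 1, "parent_recommend_pct": 88},
    {"school_id": 2, "parent_recommend_pct": 91},
    {"school_id": 3, "parent_recommend_pct": 97},
    {"school_id": 4, "parent_recommend_pct": 52},
    {"school_id": 6, "parent_recommend_pct": 75},  # no matching school row
]

merged = merge_on_school_id(schools, parent_view)
balanced = random_oversample(merged)
class_counts = Counter(row["rating"] for row in balanced)
print(class_counts)
```

In practice, oversampling is usually applied only to the training split, so that duplicated rows do not leak into the data used to measure accuracy.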
Keywords
School performance, OFSTED, machine learning