Evaluation of early student performance prediction given concept drift

Bibliographic Details
Main Authors: Benedikt Sonnleitner, Tom Madou, Matthias Deceuninck, Filotas Theodosiou, Yves R. Sagaert
Format: Article
Language: English
Published: Elsevier 2025-06-01
Series: Computers and Education: Artificial Intelligence
Online Access: http://www.sciencedirect.com/science/article/pii/S2666920X25000098
Description
Summary: Forecasting student performance can help identify students at risk and aid in recommending actions to improve their learning outcomes. This often involves elaborate machine learning pipelines, which tend to use large feature sets including behavioral data from learning management systems or demographic information. However, this complexity can lead to inaccurate predictions when concept drift occurs, or when a large number of features is used with a limited sample size. We investigate the performance of different machine learning pipelines on a data set in which study behavior changed during the Covid-19 period. We demonstrate that (i) LASSO, a shrinkage estimator that reduces complexity and overfitting, outperforms several machine learning models under these circumstances, and (ii) a linear regression relying on only two handcrafted features achieves higher accuracy and substantially less predictive bias than commonly used, more complex models with large feature sets. Due to their simplicity, these models can serve as a benchmark for future studies and as a fallback model when substantial concept or covariate drift is encountered.
ISSN: 2666-920X
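
Illustrative sketch (not taken from the article): the summary describes LASSO as a shrinkage estimator and reports results in terms of accuracy and predictive bias. The Python below is a minimal, standard-library-only sketch of how such an estimator and those two metrics could be computed. Everything in it is an assumption made for illustration: the function names, the coordinate-descent solver, the penalty value `alpha=0.1`, and the synthetic cohorts with a simple covariate shift. It does not reproduce the authors' pipeline, features, or data.

```python
import random

def soft_threshold(z, gamma):
    """Soft-thresholding operator used in the LASSO coordinate update."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0

def lasso_coordinate_descent(X, y, alpha, n_iter=200):
    """Minimise (1/2n)*sum((y - b0 - X.b)^2) + alpha*sum(|b|) by coordinate descent."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    intercept = sum(y) / n
    resid = [yi - intercept for yi in y]          # residuals for beta = 0
    for _ in range(n_iter):
        for j in range(p):
            col = [row[j] for row in X]
            norm = sum(x * x for x in col)
            if norm == 0.0:
                continue                          # constant-zero feature stays at 0
            # Correlation of feature j with the residual that excludes feature j.
            rho = sum(x * (r + x * beta[j]) for x, r in zip(col, resid))
            new_bj = soft_threshold(rho / n, alpha) / (norm / n)
            resid = [r + x * (beta[j] - new_bj) for x, r in zip(col, resid)]
            beta[j] = new_bj
        # The intercept is not penalised: re-centre the residuals.
        delta = sum(resid) / n
        intercept += delta
        resid = [r - delta for r in resid]
    return intercept, beta

def predict(X, intercept, beta):
    """Linear prediction for each row of X."""
    return [intercept + sum(b * x for b, x in zip(beta, row)) for row in X]

def bias_and_mae(y_true, y_pred):
    """Mean error (predicted minus observed) and mean absolute error."""
    errors = [p - t for t, p in zip(y_true, y_pred)]
    n = len(errors)
    return sum(errors) / n, sum(abs(e) for e in errors) / n

# Synthetic demonstration only: 2 informative features plus 6 noise features.
rng = random.Random(0)

def make_cohort(n, shift):
    """Hypothetical cohort whose feature means are moved by `shift`."""
    X, y = [], []
    for _ in range(n):
        row = [rng.gauss(shift, 1.0) for _ in range(8)]
        X.append(row)
        y.append(10.0 + 2.0 * row[0] - 1.0 * row[1] + rng.gauss(0.0, 0.5))
    return X, y

X_train, y_train = make_cohort(40, shift=0.0)     # cohort before the shift
X_test, y_test = make_cohort(40, shift=1.0)       # cohort after a covariate shift
b0, beta = lasso_coordinate_descent(X_train, y_train, alpha=0.1)
bias, mae = bias_and_mae(y_test, predict(X_test, b0, beta))
print("non-zero coefficients:", sum(1 for b in beta if b != 0.0))
print("bias:", round(bias, 3), "MAE:", round(mae, 3))
```

Reporting the signed mean error next to the absolute error keeps systematic over- or under-prediction visible, which is the sense of "predictive bias" assumed here. The synthetic shift changes only the feature distribution, so this sketch illustrates covariate drift rather than a change in the relationship between features and outcome.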