A Multi-Layer Attention Knowledge Tracking Method with Self-Supervised Noise Tolerance
| Main Authors: | , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-08-01 |
| Series: | Applied Sciences |
| Subjects: | |
| Online Access: | https://www.mdpi.com/2076-3417/15/15/8717 |
| Summary: | Deep-learning-based knowledge tracing assesses learners’ cognitive states and lays the foundation for personalized education. However, deep learning methods are inefficient when processing long time-series data and are prone to overfitting. To improve the accuracy of cognitive state prediction, we design a Multi-layer Attention Self-supervised Knowledge Tracing Method (MASKT) that combines self-supervised learning with the Transformer. In the pre-training stage, MASKT uses a random forest method to filter out positively and negatively correlated feature embeddings; it then reuses noise-processed restoration tasks to extract more learnable features and strengthen the model’s learning ability. The Transformer in MASKT uses an attention mechanism to handle long-term dependencies between input and output, and its parallel computation effectively improves the learning efficiency of the prediction model. Finally, a multidimensional attention mechanism is integrated into cross-attention to further optimize prediction performance. Experiments on multiple datasets show that MASKT’s prediction performance is consistently 2 percentage points higher than that of various knowledge tracing models. Compared with the multidimensional attention mechanism of graph neural networks, MASKT’s running time is nearly 30% shorter. Given these improvements in prediction accuracy and performance, the method has broad application prospects for cognitive diagnosis in intelligent education. |
| ISSN: | 2076-3417 |
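
The summary credits an attention mechanism with handling long-term dependencies in a learner's interaction sequence. The sketch below illustrates only the generic idea (scaled dot-product attention with a causal mask, so that each step sees earlier steps only); it is not MASKT's implementation, which this record does not describe. The function names `softmax` and `causal_attention` and the toy embeddings are hypothetical, chosen for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def causal_attention(queries, keys, values):
    """Scaled dot-product attention in which step t attends only to steps <= t.

    Generic illustration, not the MASKT architecture. Each argument is a
    list of equal-length float vectors, one vector per interaction step.
    """
    d_k = len(keys[0])
    outputs, weights = [], []
    for t, q in enumerate(queries):
        # Similarity of the current step to every step up to and including t.
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys[: t + 1]
        ]
        w = softmax(scores)
        # Weighted sum of the visible value vectors (zip stops at len(w)).
        out = [
            sum(w_j * v[c] for w_j, v in zip(w, values))
            for c in range(len(values[0]))
        ]
        outputs.append(out)
        weights.append(w)
    return outputs, weights


# Toy embeddings for three interaction steps (made-up numbers).
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = causal_attention(embeddings, embeddings, embeddings)
```

Every step's weights over its visible history are computed independently of the other steps, which is why this kind of layer can be evaluated in parallel, unlike a recurrent model that must process the sequence step by step.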