Masked autoencoder of multi-scale convolution strategy combined with knowledge distillation for facial beauty prediction
Abstract: Facial beauty prediction (FBP) is a leading area of research in artificial intelligence. FBP databases currently contain a small amount of labeled data and a large amount of unlabeled data, so the features extracted by models trained with supervision alone are limited, resulting in low prediction accuracy. The masked autoencoder (MAE) is a self-supervised learning method that outperforms supervised methods without relying on large-scale databases and can effectively improve a model's feature extraction ability. A multi-scale convolution strategy expands the receptive field and, combined with the attention mechanism of the MAE, captures dependencies between distant pixels and acquires both shallow and deep image features. Knowledge distillation transfers the rich knowledge of a teacher net to a student net, reducing the number of parameters and compressing the model. In this paper, an MAE with a multi-scale convolution strategy is combined with knowledge distillation for FBP. First, an MAE model with the multi-scale convolution strategy is constructed and pretrained as the teacher net. Second, an MAE model is constructed as the student net. Finally, the teacher net performs knowledge distillation, and the student net is optimized with the loss transmitted from the teacher net. Experimental results show that the proposed method outperforms other methods on the FBP task, improves FBP accuracy, and can be widely applied to tasks such as image classification.
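The abstract outlines a three-step pipeline: pretrain a multi-scale-convolution MAE as the teacher, build a plain MAE student, then optimize the student with a loss transmitted from the teacher. The sketch below illustrates that distillation step in PyTorch. It is a minimal illustration only: the class names, layer sizes, temperature, and loss weighting are assumptions for the example, not the authors' implementation.

```python
# Minimal sketch of teacher-student distillation as described in the abstract.
# All names, sizes, and hyperparameters here are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiScaleBlock(nn.Module):
    """Toy stand-in for the multi-scale convolution strategy: parallel
    convolutions with different kernel sizes widen the receptive field."""
    def __init__(self, channels: int):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(channels, channels, k, padding=k // 2) for k in (3, 5, 7)
        )

    def forward(self, x):
        return sum(branch(x) for branch in self.branches)  # fuse scales by summation

class Encoder(nn.Module):
    """Tiny encoder; the teacher variant adds the multi-scale block."""
    def __init__(self, multi_scale: bool, num_classes: int = 5):
        super().__init__()
        self.stem = nn.Conv2d(3, 16, 3, padding=1)
        self.scale = MultiScaleBlock(16) if multi_scale else nn.Identity()
        self.head = nn.Linear(16, num_classes)

    def forward(self, x):
        h = self.scale(self.stem(x))
        return self.head(h.mean(dim=(2, 3)))  # global average pool -> logits

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    """Soft-target distillation loss blended with hard-label cross-entropy;
    T and alpha are assumed values, not taken from the paper."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * T * T
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

teacher = Encoder(multi_scale=True).eval()   # pretrained teacher, frozen here
student = Encoder(multi_scale=False)         # smaller student to be distilled
opt = torch.optim.AdamW(student.parameters(), lr=1e-4)

images = torch.randn(8, 3, 224, 224)         # dummy batch of face images
labels = torch.randint(0, 5, (8,))           # dummy beauty-score classes
with torch.no_grad():
    t_logits = teacher(images)               # teacher provides soft targets
opt.zero_grad()
loss = distillation_loss(student(images), t_logits, labels)
loss.backward()
opt.step()
```

In this sketch the student never sees the teacher's weights, only its softened output distribution, which is the standard way a distillation loss is "transmitted" from teacher to student.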
Main Authors: Junying Gan, Junling Xiong (School of Electronics and Information Engineering, Wuyi University)
Format: Article
Language: English
Published: Nature Portfolio, 2025-01-01
Series: Scientific Reports
ISSN: 2045-2322
Online Access: https://doi.org/10.1038/s41598-025-86831-0