Masked autoencoder of multi-scale convolution strategy combined with knowledge distillation for facial beauty prediction