Generating and Improving a Dataset of Masked Faces Using Data Augmentation

Bibliographic Details
Main Authors: Waleed Ayad, Siraj Qays, Ali Al-Naji
Format: Article
Language: English
Published: Middle Technical University 2023-06-01
Series: Journal of Techniques
Subjects: COVID-19; Masked Face Dataset; Data Augmentation; Face Recognition
Online Access: https://journal.mtu.edu.iq/index.php/MTU/article/view/1140
collection DOAJ
description Before COVID-19 spread in 2020, modern face recognition systems performed excellently. Many countries then required their populations to wear masks, which noticeably reduced the discriminative ability of these systems: they had been trained on large-scale datasets of unmasked faces, and no large-scale masked-face datasets were available at the time. To help address this shortage, a method is presented that creates simulated masks and overlays them on faces in two main steps. The first step detects, aligns, and crops the faces in an unmasked-face dataset and then applies simulated masks to them using the dlib-ml library. This method was used to generate a masked-face dataset (CASIA-mask). The second step applies five data augmentation techniques to the generated dataset. To evaluate the masked dataset and the augmentation, FaceNet, one of the latest and most important face recognition systems, was trained on the masked dataset and achieved an accuracy of 96.4%. The same system achieved 97.71% when trained on CASIA-mask together with the augmented data.
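The two steps in the description can be illustrated with a minimal sketch. It is not the authors' implementation: the record gives neither the mask geometry nor the names of the five augmentation techniques, so the polygon fill and the transforms below (horizontal flip, brightness shift) are illustrative assumptions. The polygon vertices stand in for facial landmarks that a detector such as dlib would supply, and every function name is hypothetical. Images are plain nested lists so the example runs without third-party packages.

```python
from typing import List, Sequence, Tuple

Image = List[List[int]]      # grayscale image: rows of 0-255 integers
Point = Tuple[float, float]  # (x, y) in pixel coordinates


def point_in_polygon(x: float, y: float, polygon: Sequence[Point]) -> bool:
    """Even-odd ray-casting test for whether (x, y) lies inside the polygon."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def overlay_mask(image: Image, mask_polygon: Sequence[Point],
                 mask_value: int = 255) -> Image:
    """Return a copy of the image with pixels inside mask_polygon filled.

    mask_polygon is assumed to come from lower-face landmarks (jawline and
    nose bridge); here it is simply passed in by the caller.
    """
    out = [row[:] for row in image]
    for y, row in enumerate(out):
        for x in range(len(row)):
            # Test the pixel centre against the polygon.
            if point_in_polygon(x + 0.5, y + 0.5, mask_polygon):
                row[x] = mask_value
    return out


def horizontal_flip(image: Image) -> Image:
    """Mirror the image left to right."""
    return [row[::-1] for row in image]


def adjust_brightness(image: Image, delta: int) -> Image:
    """Add delta to every pixel, clamping the result to the 0-255 range."""
    return [[max(0, min(255, v + delta)) for v in row] for row in image]


def augment(image: Image) -> List[Image]:
    """Produce a few augmented variants (an assumed, illustrative set)."""
    return [
        horizontal_flip(image),
        adjust_brightness(image, 30),
        adjust_brightness(image, -30),
    ]


# Toy 8x8 "face" and a rectangular mask over its lower half.
face = [[100] * 8 for _ in range(8)]
lower_face = [(1, 4), (7, 4), (7, 8), (1, 8)]
masked = overlay_mask(face, lower_face)
variants = augment(masked)
```

In a real pipeline the toy rectangle would be replaced by a polygon built from detected landmarks, and the augmented variants would be added to the training set alongside the masked originals.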
id doaj-art-61ba49061cf5448c89555af2769181c5
institution Kabale University
issn 1818-653X
2708-8383
doi 10.51173/jt.v5i2.1140
volume 5, issue 2
affiliations Waleed Ayad: Electrical Engineering Technical College, Middle Technical University, Baghdad, Iraq
Siraj Qays: Electrical Engineering Technical College, Middle Technical University, Baghdad, Iraq
Ali Al-Naji: School of Engineering, University of South Australia, Mawson Lakes, SA 5095, Australia
topic COVID-19
Masked Face Dataset
Data Augmentation
Face Recognition