Unveiling Facial Fidelity: A Novel Approach to Synthesizing High-Quality Face Images Using Generative Adversarial Networks

Bibliographic Details
Main Authors: Modafar Ati, Muhammad Ahmed Hassan, Muhammad Usman Ghani
Format: Article
Language:English
Published: IEEE 2025-01-01
Series:IEEE Access
Online Access:https://ieeexplore.ieee.org/document/11030461/
Description
Summary:In the field of computer vision, generating realistic images from sketches has garnered considerable attention due to its wide-ranging applications in art, design, and facial recognition systems. With the increasing demand for more sophisticated image synthesis, advanced methodologies are essential to bridge the gap between sketches and high-fidelity images. This study proposes a novel methodology comprising two key modules: the Generator and the Discriminator. The Generator transforms sketches into images, while the Discriminator extracts 512-dimensional face encodings from both generated and ground-truth images. Leveraging a pre-trained ResNet-101, the Discriminator applies Mean Squared Error (MSE) loss to compare the face encodings, guiding the Generator to produce images whose face encodings closely resemble those of the ground truth. The research utilizes two datasets: Labeled Faces in the Wild (LFW) for training the ResNet-101 CNN model for face recognition, and the CelebFaces Attributes Dataset (Celeb-A) for training the sketch-to-image generator. To assess the quality of the generated images, Signal-to-Noise Ratio (SNR) and Peak Signal-to-Noise Ratio (PSNR) evaluation measures are used. The SNR increases steadily over the epochs, reaching a peak value of 0.334, which reflects the model's efficacy in reducing noise and enhancing signal strength. The highest achieved PSNR value of 0.334 signifies an enhancement in the quality of the reconstructed signal. Notably, the proposed model outperforms previous models, achieving accuracies of 99.9% on the LFW dataset and 99.4% on the Celeb-A dataset.
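The summary describes a Discriminator that compares 512-dimensional face encodings of generated and ground-truth images via MSE loss, with PSNR used to assess image quality. The following minimal NumPy sketch illustrates those two computations; the random-projection `encode` function and all array shapes are illustrative stand-ins, not the paper's actual pre-trained ResNet-101 encoder:

```python
import numpy as np

def encode(images, W):
    # Stand-in face encoder: flattens each image and projects it to a
    # 512-dimensional encoding (placeholder for the pre-trained ResNet-101).
    flat = images.reshape(images.shape[0], -1)
    return flat @ W  # shape: (batch, 512)

def encoding_mse_loss(gen_images, real_images, W):
    # MSE between the 512-d face encodings of generated and ground-truth
    # images, as described for the Discriminator module.
    e_gen = encode(gen_images, W)
    e_real = encode(real_images, W)
    return float(np.mean((e_gen - e_real) ** 2))

def psnr(gen, real, max_val=1.0):
    # Peak Signal-to-Noise Ratio (in dB) for images scaled to [0, max_val].
    mse = np.mean((gen - real) ** 2)
    if mse == 0:
        return float("inf")
    return float(10 * np.log10(max_val ** 2 / mse))

rng = np.random.default_rng(0)
real = rng.random((2, 8, 8, 3))  # tiny mock "ground truth" batch
gen = np.clip(real + 0.05 * rng.standard_normal(real.shape), 0.0, 1.0)
W = rng.standard_normal((8 * 8 * 3, 512)) / np.sqrt(8 * 8 * 3)

loss = encoding_mse_loss(gen, real, W)
print(f"encoding MSE loss: {loss:.4f}")
print(f"PSNR: {psnr(gen, real):.2f} dB")
```

In the described training loop, the encoding MSE would be backpropagated through the Generator so that generated faces converge toward the ground-truth identity, while PSNR serves only as an offline quality metric.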
ISSN:2169-3536