From KL Divergence to Wasserstein Distance: Enhancing Autoencoders with FID Analysis
| Main Authors: | , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | LibraryPress@UF, 2025-05-01 |
| Series: | Proceedings of the International Florida Artificial Intelligence Research Society Conference |
| Online Access: | https://journals.flvc.org/FLAIRS/article/view/139006 |
| Summary: | Variational Autoencoders (VAEs) are popular Bayesian inference models that excel at approximating complex data distributions in a lower-dimensional latent space. Despite their widespread use, VAEs frequently face challenges in image generation, often resulting in blurry outputs. This outcome is primarily attributed to two factors: the inherent probabilistic nature of the VAE framework and the oversmoothing effect induced by the Kullback-Leibler (KL) divergence term in the loss function. This paper explores the integration of the Wasserstein distance into the VAE framework, resulting in Wasserstein Autoencoders (WAEs) designed to mitigate the oversmoothing issue and enhance the quality of generated images. We evaluated the proposed WAEs using the Fréchet Inception Distance (FID), Inception Score (IS), and Structural Similarity Index Measure (SSIM). The experimental results on the CelebA dataset demonstrate that WAEs significantly outperform VAEs by 25% in FID, 13.6% in IS, and 15.3% in SSIM. Additionally, the evaluation considers the issue of class imbalance in the ODIR dataset, where WAEs demonstrate superior accuracy and precision in classification tasks. Our findings highlight WAEs as a practical and efficient alternative to VAEs for image generation and reconstruction, particularly in resource-limited settings. |
| ISSN: | 2334-0754, 2334-0762 |