Text to Realistic Image Generation with Attentional Concatenation Generative Adversarial Networks

In this paper, we propose an Attentional Concatenation Generative Adversarial Network (ACGAN) aiming at generating 1024 × 1024 high-resolution images. First, we propose a multilevel cascade structure for text-to-image synthesis. During training, we gradually add new layers and, at the same...

Bibliographic Details
Main Authors:	Linyan Li, Yu Sun, Fuyuan Hu, Tao Zhou, Xuefeng Xi, Jinchang Ren
Format:	Article
Language:	English
Published:	Wiley 2020-01-01
Series:	Discrete Dynamics in Nature and Society
Online Access:	http://dx.doi.org/10.1155/2020/6452536

_version_	1832547716466475008
author	Linyan Li Yu Sun Fuyuan Hu Tao Zhou Xuefeng Xi Jinchang Ren
author_facet	Linyan Li Yu Sun Fuyuan Hu Tao Zhou Xuefeng Xi Jinchang Ren
author_sort	Linyan Li
collection	DOAJ
description	In this paper, we propose an Attentional Concatenation Generative Adversarial Network (ACGAN) aiming at generating 1024 × 1024 high-resolution images. First, we propose a multilevel cascade structure for text-to-image synthesis. During training, we gradually add new layers and, at the same time, use the results and word vectors from the previous layer as inputs to the next layer to generate high-resolution images with photo-realistic details. Second, the deep attentional multimodal similarity model is introduced into the network, and we match word vectors with images in a common semantic space to compute a fine-grained matching loss for training the generator. In this way, we can pay attention to the fine-grained information of the word level in the semantics. Finally, the measure of diversity is added to the discriminator, which enables the generator to obtain more diverse gradient directions and improve the diversity of generated samples. The experimental results show that the inception scores of the proposed model on the CUB and Oxford-102 datasets have reached 4.48 and 4.16, improved by 2.75% and 6.42% compared to Attentional Generative Adversarial Networks (AttenGAN). The ACGAN model has a better effect on text-generated images, and the resulting image is closer to the real image.
format	Article
id	doaj-art-8a46cfd3ab634c6680d2d6eda5f78505
institution	Kabale University
issn	1026-0226 1607-887X
language	English
publishDate	2020-01-01
publisher	Wiley
record_format	Article
series	Discrete Dynamics in Nature and Society
spelling	doaj-art-8a46cfd3ab634c6680d2d6eda5f78505 | 2025-02-03T06:43:43Z | eng | Wiley | Discrete Dynamics in Nature and Society | 1026-0226 | 1607-887X | 2020-01-01 | 2020 | 10.1155/2020/6452536 | 6452536 | Text to Realistic Image Generation with Attentional Concatenation Generative Adversarial Networks | Linyan Li (0), Yu Sun (1), Fuyuan Hu (2), Tao Zhou (3), Xuefeng Xi (4), Jinchang Ren (5) | (0) College of Information Technology, Suzhou Institute of Trade & Commerce, Suzhou 215009, China; (1) College of Electronic & Information Engineering, Suzhou University of Science and Technology, Suzhou 215009, China; (2) College of Electronic & Information Engineering, Suzhou University of Science and Technology, Suzhou 215009, China; (3) School of Computer Science and Engineering, North Minzu University, Yinchuan 750021, China; (4) College of Electronic & Information Engineering, Suzhou University of Science and Technology, Suzhou 215009, China; (5) University of Strathclyde, Glasgow, UK | In this paper, we propose an Attentional Concatenation Generative Adversarial Network (ACGAN) aiming at generating 1024 × 1024 high-resolution images. First, we propose a multilevel cascade structure for text-to-image synthesis. During training, we gradually add new layers and, at the same time, use the results and word vectors from the previous layer as inputs to the next layer to generate high-resolution images with photo-realistic details. Second, the deep attentional multimodal similarity model is introduced into the network, and we match word vectors with images in a common semantic space to compute a fine-grained matching loss for training the generator. In this way, we can pay attention to the fine-grained information of the word level in the semantics. Finally, the measure of diversity is added to the discriminator, which enables the generator to obtain more diverse gradient directions and improve the diversity of generated samples. The experimental results show that the inception scores of the proposed model on the CUB and Oxford-102 datasets have reached 4.48 and 4.16, improved by 2.75% and 6.42% compared to Attentional Generative Adversarial Networks (AttenGAN). The ACGAN model has a better effect on text-generated images, and the resulting image is closer to the real image. | http://dx.doi.org/10.1155/2020/6452536
title	Text to Realistic Image Generation with Attentional Concatenation Generative Adversarial Networks
url	http://dx.doi.org/10.1155/2020/6452536
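
The description field in this record reports inception scores of 4.48 (CUB) and 4.16 (Oxford-102) together with relative gains of 2.75% and 6.42%, but it does not state the baseline scores themselves. The short sketch below backs out the baselines those figures imply, which come to roughly 4.36 and 3.91. It assumes that "improved by p%" means (new - old) / old × 100; the function and variable names are illustrative and do not come from the article.

```python
# Back out the baseline inception scores implied by the abstract's figures.
# Assumption (not stated in the record): "improved by p%" is the relative
# gain (new - old) / old * 100 over the baseline model.

reported = {
    # dataset: (reported ACGAN inception score, stated gain in percent)
    "CUB": (4.48, 2.75),
    "Oxford-102": (4.16, 6.42),
}


def implied_baseline(score, gain_percent):
    """Return the baseline that `score` exceeds by `gain_percent` percent."""
    return score / (1.0 + gain_percent / 100.0)


def relative_gain(score, baseline):
    """Return the relative improvement of `score` over `baseline`, in percent."""
    return (score - baseline) / baseline * 100.0


baselines = {
    name: implied_baseline(score, gain) for name, (score, gain) in reported.items()
}

for name, (score, gain) in reported.items():
    print(f"{name}: ACGAN {score:.2f}, implied baseline {baselines[name]:.2f}")
```

Because the published scores and percentages are rounded to two decimals, the implied baselines are approximate and should be checked against the article itself before being quoted.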


Similar Items