Description
Title: Generative adversarial networks for image synthesis
Date Created: 2019
Other Date: 2019-01 (degree)
Extent: 1 online resource (97 pages) : illustrations
Description: Image synthesis is an important problem in computer vision with many applications, such as computer-aided design and photo editing. There has been remarkable progress in this direction with the emergence of Generative Adversarial Networks (GANs). However, GANs still face many challenges in generating high-quality images: the difficulty of directly approximating the high-resolution image distribution, poor model generalization to datasets with multiple classes, and the frequent occurrence of mode collapse and unstable training are among the key challenges. To tackle these challenges, we conduct extensive studies on designing new network architectures, adding regularization, introducing heuristic tricks, and modifying the learning objectives and dynamics. (i) New Stacked Generative Adversarial Networks (StackGANs) are proposed for high-resolution image synthesis. StackGAN-v1 is first built to decompose the hard image-generation problem into more manageable sub-problems through a sketch-refinement process, generating unprecedented 256×256 photo-realistic images from text descriptions. Moreover, a novel Conditioning Augmentation technique, which encourages smoothness in the latent conditioning manifold, is introduced to improve the diversity of the synthesized images and stabilize the training of the conditional GAN. To further improve the quality of generated samples and stabilize GAN training, an advanced multi-stage generative adversarial network architecture, StackGAN-v2, is presented for both conditional and unconditional generative tasks. (ii) A novel Self-Attention Generative Adversarial Network (SAGAN) is introduced for multi-class image generation. Our SAGAN incorporates the self-attention mechanism into the convolutional GAN framework so that it can model long-range, multi-level dependencies for generating realistic images on challenging datasets such as ImageNet. Moreover, we show that spectral normalization applied to the generator can stabilize GAN training, and that the two time-scale update rule (TTUR) can speed up the training of regularized discriminators. (iii) We present the Optimal Transport GAN (OT-GAN), a variant of GANs that minimizes a new metric measuring the distance between the generator distribution and the data distribution. This metric, called the mini-batch energy distance, combines optimal transport in its primal form with an energy distance defined in an adversarially learned feature space, resulting in a highly discriminative distance function with unbiased mini-batch gradients. Both qualitative and quantitative validation experiments are conducted for all proposed methods.
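The Conditioning Augmentation idea described above — sampling the conditioning vector from a Gaussian around the text embedding, with a KL penalty that smooths the latent conditioning manifold — can be sketched as follows. This is a minimal NumPy illustration, not the thesis implementation; the function names and shapes are illustrative assumptions.

```python
import numpy as np

def conditioning_augmentation(mu, log_sigma, rng=None):
    """Sample c ~ N(mu, diag(sigma^2)) via the reparameterization trick,
    so the sampling step stays differentiable w.r.t. mu and log_sigma."""
    if rng is None:
        rng = np.random.default_rng(0)
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(log_sigma) * eps

def kl_regularizer(mu, log_sigma):
    """KL(N(mu, sigma^2) || N(0, I)): the penalty that encourages
    smoothness in the latent conditioning manifold."""
    return 0.5 * np.sum(mu**2 + np.exp(2 * log_sigma) - 2 * log_sigma - 1)
```

Because nearby text embeddings map to overlapping Gaussians, small perturbations of the conditioning vector still yield valid conditions, which is what improves sample diversity and stabilizes conditional-GAN training.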
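The self-attention mechanism that SAGAN adds to the convolutional GAN can be sketched on a flattened feature map: every spatial position attends to every other position, giving the long-range dependencies that plain convolutions lack. A minimal NumPy sketch, assuming single-head attention and illustrative weight names; the learnable scalar `gamma` is initialized to zero so the block starts as the identity, as in the SAGAN formulation.

```python
import numpy as np

def softmax(z):
    """Row-wise softmax, shifted for numerical stability."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def self_attention_block(x, Wq, Wk, Wv, gamma=0.0):
    """x: (N, C) flattened feature map, N = H*W spatial positions.
    Wq, Wk, Wv play the role of the 1x1-convolution projections to
    queries, keys, and values; gamma scales the attended features."""
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    attn = softmax(q @ k.T)        # (N, N): each position attends to all others
    return x + gamma * (attn @ v)  # residual: long-range context added to local features
```

With `gamma = 0` the network relies purely on local convolutional cues early in training and gradually learns to weight in global context.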
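The energy-distance component of the mini-batch energy distance used by OT-GAN can be sketched from two independent mini-batches of each distribution. In OT-GAN the pairwise term is an entropy-regularized optimal-transport (Sinkhorn) cost computed in an adversarially learned feature space; the NumPy sketch below substitutes plain Euclidean distances purely to keep the example self-contained, so it shows the estimator's structure rather than the full method.

```python
import numpy as np

def pairwise_dist(a, b):
    """Euclidean distances between all rows of a and all rows of b."""
    return np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)

def energy_distance_sq(x, xp, y, yp):
    """Squared energy distance between p and q, estimated from two
    independent mini-batches of each (x, xp ~ p; y, yp ~ q):
        D^2 = 2 E||x - y|| - E||x - x'|| - E||y - y'||.
    Using two batches per distribution is what yields unbiased
    mini-batch gradients for the resulting training objective."""
    return (2 * pairwise_dist(x, y).mean()
            - pairwise_dist(x, xp).mean()
            - pairwise_dist(y, yp).mean())
```

The distance is zero in expectation only when the generator and data distributions match, which is what makes it usable as a GAN training criterion.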
Note: Ph.D.
Note: Includes bibliographical references
Note: by Han Zhang
Genre: theses, ETD doctoral
Language: eng
Collection: School of Graduate Studies Electronic Theses and Dissertations
Organization Name: Rutgers, The State University of New Jersey
Rights: The author owns the copyright to this work.