Description
Title: Improving inference and generation process of generative adversarial networks
Date Created: 2021
Other Date: 2021-05 (degree)
Extent: 1 online resource (xiv, 91 pages)
Description: Generative Adversarial Networks (GANs) have achieved tremendous success in a broad range of Computer Vision applications, such as image synthesis, photo editing, and image-to-image translation. Closely related to GAN-based image synthesis, there are two promising directions: (i) GAN-based inference learning and (ii) GAN-based video synthesis. Both directions face many challenges in generation and inference. In GAN-based inference learning, application-driven methods usually outperform approaches with elegant theories. In GAN-based video synthesis, the generation quality lags far behind that of contemporary image generators. To tackle these challenges, we conduct extensive studies on improving the inference and generation processes.
First, we investigate a specific application: generating multi-view images from a single-view input. We identify the “incomplete” representation issue in the existing single-pathway framework and propose a two-pathway approach to address this problem. In addition to the single reconstruction path, we introduce a generation sideway to maintain the completeness of the learned embedding space. Self-supervised learning is also employed to make use of both labeled and unlabeled data. The experimental results show that the proposed method significantly outperforms state-of-the-art methods, especially when generating from “unseen” inputs in in-the-wild conditions.
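The two-pathway idea above can be illustrated with a minimal sketch. All networks, dimensions, and names here are hypothetical stand-ins (simple linear maps rather than the dissertation's actual encoder/decoder); the point is only to show the two pathways side by side: a reconstruction path over real inputs, and a generation sideway that decodes embeddings sampled from the prior so the whole embedding space is exercised.

```python
import numpy as np

rng = np.random.default_rng(0)
IN_DIM, EMB_DIM = 12, 4

# Hypothetical linear encoder/decoder standing in for the two networks.
We = rng.standard_normal((EMB_DIM, IN_DIM)) * 0.1
Wd = rng.standard_normal((IN_DIM, EMB_DIM)) * 0.1

def encode(x):
    """Map an input view to an embedding."""
    return We @ x

def decode(h):
    """Map an embedding to a generated view."""
    return Wd @ h

x = rng.standard_normal(IN_DIM)    # a training image (stand-in)

# Pathway 1 (reconstruction): encode a real input, decode it back.
recon_loss = np.mean((decode(encode(x)) - x) ** 2)

# Pathway 2 (generation sideway): decode an embedding sampled from the
# prior, so that regions of the embedding space never reached by real
# inputs are still trained -- keeping the representation "complete".
h_prior = rng.standard_normal(EMB_DIM)
x_gen = decode(h_prior)            # would be scored by a discriminator

print(recon_loss, x_gen.shape)
```

In a full training loop both pathways would contribute losses to the same encoder/decoder pair, which is what distinguishes this setup from a plain autoencoder.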
Second, as a theoretical extension of the previous work, we analyze three issues that degrade both generation and inference in GAN-based inference learning approaches: the “holes” issue and the “non-reverse encoder-generator” issue in single-pathway adversarial learning, and the “unstable inference mapping” issue in bidirectional adversarial learning. To address all three issues in a unified framework, we take the single-pathway approach as a baseline to avoid the “unstable inference mapping” issue and propose two strategies to solve the remaining two. Theoretical analysis proves that the learned encoder and decoder are mutually inverse. Results on both synthetic data and real-world applications support our theoretical analysis and demonstrate improved performance over baselines.
Third, we present a framework that leverages contemporary image generators to render high-resolution videos. We frame the video synthesis problem as discovering a trajectory in the latent space of a pre-trained and fixed image generator. We introduce a motion generator that discovers the desired trajectory, in which content and motion are disentangled. Furthermore, we introduce a new task, which we call cross-domain video synthesis, in which the image and motion generators are trained on disjoint datasets belonging to different domains. This allows for generating moving objects for which the desired video data is not available. Extensive experiments on various datasets demonstrate the advantages of our methods over existing video generation techniques.
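The latent-trajectory framing above can be sketched concretely. The generator, dimensions, and motion model below are hypothetical stand-ins (a random linear map instead of a real pre-trained GAN, a random walk instead of a learned motion generator); the sketch only shows the structure of the approach: the image generator stays frozen, and video synthesis reduces to producing a sequence of latent codes.

```python
import numpy as np

rng = np.random.default_rng(0)
LATENT_DIM, IMG_DIM = 8, 16

# Hypothetical frozen "image generator": a random linear map plus tanh
# stands in for a pre-trained GAN generator; it is never updated.
W = rng.standard_normal((IMG_DIM, LATENT_DIM))

def image_generator(z):
    """Frozen image generator G: latent code -> image (stand-in)."""
    return np.tanh(W @ z)

def motion_generator(z0, num_frames, step_scale=0.1):
    """Toy motion generator: a trajectory z_1..z_T in latent space.
    Content is fixed by z0; motion is modeled as residual steps
    (here random; learned in the actual framework)."""
    traj = [z0]
    for _ in range(num_frames - 1):
        delta = step_scale * rng.standard_normal(LATENT_DIM)
        traj.append(traj[-1] + delta)
    return traj

# Render a video by decoding each latent along the trajectory with the
# fixed image generator.
z0 = rng.standard_normal(LATENT_DIM)
video = np.stack([image_generator(z) for z in motion_generator(z0, 16)])
print(video.shape)  # (16, 16): num_frames x image dimension
```

Because only the motion generator is trained, the image generator can come from a different dataset than the motion data, which is what makes the cross-domain video synthesis task described above possible.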
Note: Ph.D.
Note: Includes bibliographical references
Genre: theses, ETD doctoral
Language: English
Collection: School of Graduate Studies Electronic Theses and Dissertations
Organization Name: Rutgers, The State University of New Jersey
Rights: The author owns the copyright to this work.