Overview

Presents a simple, efficient technique for controlling the outputs of a pretrained GAN.

Abstract

This paper describes a simple technique to analyze Generative Adversarial Networks (GANs) and create interpretable controls for image synthesis, such as change of viewpoint, aging, lighting, and time of day. We identify important latent directions based on Principal Components Analysis (PCA) applied either in latent space or feature space. Then, we show that a large number of interpretable controls can be defined by layer-wise perturbation along the principal directions. Moreover, we show that BigGAN can be controlled with layer-wise inputs in a StyleGAN-like manner. We show results on different GANs trained on various datasets, and demonstrate good qualitative matches to edit directions found through earlier supervised approaches.

Motivation

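We want to control what a pretrained GAN generates, but the latent variable z is usually entangled (its individual dimensions are not tied to specific image features). Retraining the model or building a dedicated architecture (for example, adding conditioning) is cumbersome. → We want a simple way to control generation!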

Architecture

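For StyleGAN, sample many z at random and map each one to a w vector (style vector). Apply PCA (principal component analysis) to these w vectors to obtain the principal component vectors. Moving w along those vectors then controls the important attributes of the generated image.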
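The PCA step described above can be sketched in NumPy as follows. This is a minimal illustration, not the paper's code: `toy_mapping` is a hypothetical stand-in for StyleGAN's mapping network, and the dimension, sample count, and number of components are arbitrary assumptions.

```python
import numpy as np

def toy_mapping(z, weights):
    """Hypothetical stand-in for the mapping network M (not the real StyleGAN MLP)."""
    h = z
    for W in weights:
        a = h @ W
        h = np.maximum(a, 0.2 * a)  # leaky ReLU with slope 0.2
    return h

def principal_directions(w_samples, n_components):
    """PCA via SVD. Returns (mean, V, stddev); the columns of V are unit directions."""
    mean = w_samples.mean(axis=0)
    centered = w_samples - mean
    # Rows of vt are the principal directions, sorted by decreasing singular value.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    stddev = s[:n_components] / np.sqrt(len(w_samples) - 1)
    return mean, vt[:n_components].T, stddev

rng = np.random.default_rng(0)
dim = 16  # assumed latent size, chosen small for the example
weights = [rng.standard_normal((dim, dim)) / np.sqrt(dim) for _ in range(3)]

z = rng.standard_normal((2000, dim))   # random latent samples
w = toy_mapping(z, weights)            # corresponding style vectors
mean, V, stddev = principal_directions(w, n_components=4)
```

With a real pretrained model, `toy_mapping` would be replaced by that model's own mapping network; the PCA part stays the same.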

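StyleGAN can be formulated as follows.

$\mathbf{y}_i$ is the output of the $i$-th StyleGAN layer $G_i$. $M()$ is the non-linear transformation (an MLP) that maps the latent variable to the style vector.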

$$\mathbf{y}_i=G_i\left(\mathbf{y}_{i-1}, \mathbf{w}\right) \quad \text{with } \mathbf{w}=M(\mathbf{z})$$

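$$\mathbf{w}^{\prime}=\mathbf{w}+\mathbf{V} \mathbf{x}$$

Here the columns of $\mathbf{V}$ are the principal directions found by PCA, and each entry of $\mathbf{x}$ is a separate control parameter.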
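A minimal NumPy sketch of the edit $\mathbf{w}^{\prime}=\mathbf{w}+\mathbf{V}\mathbf{x}$, including the layer-wise variant mentioned in the abstract. Everything here is an assumption for illustration: `V` is a random orthonormal basis standing in for real PCA directions, and `edit_latent`, `layerwise_edit`, the layer count, and the chosen layer range are hypothetical.

```python
import numpy as np

def edit_latent(w, V, x):
    """Return w' = w + V x."""
    return w + V @ x

def layerwise_edit(w, V, x, n_layers, layers):
    """Give each layer its own copy of w and apply the edit only to `layers`."""
    per_layer = np.tile(w, (n_layers, 1))          # shape (n_layers, dim)
    per_layer[list(layers)] = edit_latent(w, V, x)  # edited w for chosen layers
    return per_layer

rng = np.random.default_rng(0)
dim, k, n_layers = 16, 4, 8  # toy sizes, not StyleGAN's real ones

# Toy orthonormal basis standing in for the PCA directions.
V, _ = np.linalg.qr(rng.standard_normal((dim, k)))
w = rng.standard_normal(dim)

x = np.zeros(k)
x[1] = 2.0  # move 2 units along the second direction only

w_prime = edit_latent(w, V, x)
styles = layerwise_edit(w, V, x, n_layers, layers=range(2, 5))
```

Restricting the edit to a subset of layers is what lets one direction change, say, only coarse or only fine attributes; which layers to pick for a given effect has to be found by inspection.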

Results

Further Thoughts

Personal impressions after reading the paper

Links

https://arxiv.org/abs/2004.02546