class: center, middle # Generative Adversarial Networks by _Roozbeh Farhoodi_ **download tutorial:** [Github repository](https://github.com/KordingLab/lab_teaching_2016) [IPython notebook](https://github.com/KordingLab/lab_teaching_2016/blob/master/session_4/Generative%20Adversarial%20Networks.ipynb) --- ### Generative models “Generative modeling is a broad area of machine learning which deals with models of distributions `\(\mathbb{P}(X)\)`, defined over datapoints X in some potentially high-dimensional space `\(\Theta\)`." ### Motivations - New samples from a dataset: generating natural images, ... - completing the incomplete data .center[
] --- ### variational autoencoder (VAE) .center[
] [tutorial](https://arxiv.org/pdf/1606.05908v2.pdf) --- ### GAN [Goodfellow et al, 2014](https://arxiv.org/pdf/1406.2661v1.pdf) `\(D\)` and `\(G\)` play the following two-player minimax game with value function `\(V(G, D)\)`: `\(\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}({x})}[\log D({x})] + \mathbb{E}_{{z} \sim p_{{z}}({z})}[\log (1 - D(G({z})))]\)` .center[
] --- ### GAN [Goodfellow et al, 2014](https://arxiv.org/pdf/1406.2661v1.pdf) `\(D\)` and `\(G\)` play the following two-player minimax game with value function `\(V(G, D)\)`: `\(\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}({x})}[\log D({x})] + \mathbb{E}_{{z} \sim p_{{z}}({z})}[\log (1 - D(G({z})))]\)` .center[
] --- ### Example .left[
] .left[
] --- .center[**Generative and Discriminative**] .center[
] --- ## Variations .left[
] - dcGAN - InfoGAN - AC-GAN - EB-GAN - LAPGAN... --- ## DC-GAN [**Alec Radford and Luke Metz et al, 2016**](https://arxiv.org/pdf/1511.06434.pdf) .center[
] "We propose and evaluate a set of constraints on the architectural topology of Convolutional GANs that make them stable to train in most settings. We name this class of architectures Deep Convolutional GANs (DCGAN)" --- ## DC-GAN Architecture guidelines for stable Deep Convolutional GANs - Replace any pooling layers with strided convolutions (discriminator) and fractional-strided convolutions (generator). - Use batchnorm in both the generator and the discriminator. - Remove fully connected hidden layers for deeper architectures. - Use ReLU activation in generator for all layers except for the output, which uses Tanh. - Use LeakyReLU activation in the discriminator for all layers. --- ## Info-GAN [**Xi Chen et al, 2016**](https://arxiv.org/pdf/1606.03657v1.pdf) "The GAN formulation uses a simple factored continuous input noise vector z, while imposing no restrictions on the manner in which the generator may use this noise. As a result, it is possible that the noise will be used by the generator in a highly entangled way, causing the individual dimensions of z to not correspond to semantic features of the data." --- ## Info-GAN Solution: Introducing the latent variable `c` and modifying: .center[
] the second term can be approximated by: .center[
] Which can be approximated again by Monte Carlo simulations. .center[
] --- ## Info-GAN .center[
] --- ## LAPGAN [**Emily Denton et al, 2015**](https://arxiv.org/pdf/1506.05751.pdf) "Our approach uses a cascade of convolutional networks within a Laplacian pyramid framework to generate images in a coarse-to-fine fashion." --- ## LAPGAN .center[
] --- ## LAPGAN .center[
] --- ## AC-GAN [**Augustus Odena et al, 2016**](https://arxiv.org/pdf/1610.09585v1.pdf) "GANs struggle to generate globally coherent, high resolution samples - particularly from datasets with high variability." .center[
] --- ## AC-GAN .center[
] --- ## EBGAN [**Junbo Zhao et al, 2016**](https://arxiv.org/pdf/1609.03126v2.pdf) "We introduce the “Energy-based Generative Adversarial Network” model (EBGAN) which views the discriminator as an energy function that associates low energies with the regions near the data manifold and higher energies with other regions." --- ## EBGAN .center[
] --- ## CGAN [**Jon Gauthier**](http://www.foldl.me/uploads/2015/conditional-gans-face-generation/paper.pdf) "Our approach uses a cascade of convolutional networks within a Laplacian pyramid framework to generate images in a coarse-to-fine fashion." --- ### Remedies - **Freezing learning:** Frequently either the generator or discriminator will outpace the other and the GAN will get stuck in a degenerate solution. One simple remedy is to freeze learning in one model when it begins to get too strong. [See here](https://arxiv.org/pdf/1610.01945.pdf?utm_content=buffer13ebb&utm_medium=social&utm_source=facebook.com&utm_campaign=buffer) - **Label smoothing:** A simple trick to prevent gradients from vanishing when the discriminator’s predictions are very confident, label smoothing replaces `\(0 : 1\)` labels with `\(\epsilon : 1 - \epsilon\)`, which guarantees the generator will always have informative gradients. - **Minibatch discrimination:** To prevent the generator from collapsing onto a single sample, minibatch discrimination extends the role of the discriminator from classifying single images to classifying entire minibatches.[See here](https://arxiv.org/abs/1606.03498) --- ### Remedies - **Batch normalization:** Batch normalization has been critical for scaling GANs to deep convolutional networks, and virtual batch normalization has extended this by using a constant reference batch to prevent correlations in predictions due to the other elements of the minibatch.[see here](https://arxiv.org/pdf/1502.03167v3.pdf) .center[
] --- ### tutorial's talk .center[
] --- ### let's start painting! .center[
] --- class: center, middle # Questions and Discussion