GANs 101

A gentle introduction to Generative Adversarial Networks.

Sharan Babu · 4 min read · Jun 25, 2020

So, what does GAN stand for? Gaming All Night? Nah, that sounds cool, but let’s go over Generative Adversarial Networks (GANs), which I feel are way cooler than gaming.

Generative Adversarial Networks are a class of Neural Networks in which two neural networks compete against each other to generate results. Essentially, these networks learn to generate data that resembles the training set.


First, let us get a high-level overview of the two Neural Networks in action, and then we can discuss the nitty-gritty details.

  1. The first Neural Network is called the Generator. It receives random noise (typically sampled from a Gaussian distribution) and is responsible for producing output data (often an image).

  2. The second Neural Network is called the Discriminator. It takes a data set consisting of real images and fake images from the ‘Generator’. Its purpose is to classify images as real vs. fake.

Note: This classification task is always binary.
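To make this concrete, here is a minimal sketch of the two networks in Keras. It assumes 28x28 grayscale images (as in MNIST) and a 100-dimensional noise vector; the layer sizes are illustrative, not tuned.

```python
import tensorflow as tf
from tensorflow.keras import layers

latent_dim = 100  # size of the random noise vector fed to the Generator

# Generator: noise in, 28x28 image out.
generator = tf.keras.Sequential([
    layers.Dense(128, activation="relu", input_shape=(latent_dim,)),
    layers.Dense(28 * 28, activation="tanh"),  # pixel values in [-1, 1]
    layers.Reshape((28, 28)),
])

# Discriminator: image in, single real-vs-fake probability out.
discriminator = tf.keras.Sequential([
    layers.Flatten(input_shape=(28, 28)),
    layers.Dense(128, activation="relu"),
    layers.Dense(1, activation="sigmoid"),  # binary classification
])
discriminator.compile(optimizer="adam", loss="binary_crossentropy")
```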

Training Phases:

  1. Train Discriminator
  2. Train Generator

How the Discriminator is trained…

Real images, labelled 1, are combined with fake images from the Generator, labelled 0. The Discriminator then trains to distinguish real images from fake ones. The important part here is that backpropagation updates only the discriminator’s weights.
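As a sketch (continuing the Keras example above, with `real_images` being a batch of training images scaled to [-1, 1]), one discriminator update might look like this:

```python
import numpy as np

def train_discriminator_step(real_images, batch_size=32):
    # Generate a batch of fakes from random noise.
    noise = np.random.normal(size=(batch_size, latent_dim))
    fake_images = generator.predict(noise, verbose=0)

    # Stack real and fake images with their respective labels.
    x = np.concatenate([real_images, fake_images])
    y = np.concatenate([np.ones((batch_size, 1)),    # real -> 1
                        np.zeros((batch_size, 1))])  # fake -> 0

    # The standalone discriminator was compiled trainable, so this
    # update moves only the discriminator's weights.
    return discriminator.train_on_batch(x, y)
```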

How the Generator is trained…

Fake images are produced with the Generator. Next, only these fake images are fed to the Discriminator, with all labels set as real, that is, labelled 1, and this time backpropagation updates only the generator’s weights.

This causes the Generator to attempt to produce images the discriminator believes to be real.
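A sketch of that generator update, again continuing the example above: the Discriminator is frozen inside a combined model, so the misleading “real” labels move only the Generator’s weights.

```python
# Freeze the discriminator inside the combined model; the standalone
# `discriminator` keeps training as compiled earlier.
discriminator.trainable = False
gan = tf.keras.Sequential([generator, discriminator])
gan.compile(optimizer="adam", loss="binary_crossentropy")

def train_generator_step(batch_size=32):
    noise = np.random.normal(size=(batch_size, latent_dim))
    misleading_labels = np.ones((batch_size, 1))  # claim the fakes are real
    # Gradients flow back through the frozen discriminator into the Generator.
    return gan.train_on_batch(noise, misleading_labels)
```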


The Generator never actually sees any real images. It learns from the gradients flowing back through the Discriminator. The better the Discriminator gets during training, the more information its gradients carry, and the more progress the Generator can make in learning how to generate fake images.

Hence, the Generator learns to generate convincing images based only on the gradients flowing back through the Discriminator.

Keep in mind that the Discriminator is also improving as training progresses, meaning the generated images also need to get better to fool it. This is why the two Neural Networks are said to be competing against each other.

Difficulties with GANs

  1. Training GANs that perform well usually requires GPUs.
  2. Mode Collapse: Often, the Generator will figure out a few images (or a single image) that can fool the Discriminator and eventually “collapse” to producing only those images, reducing the variability of the output.
  3. Instability: It can be difficult to gauge performance and set optimal hyperparameters, since the generated images are always technically “fake”. Because the two networks are constantly at odds by design, performance tends to oscillate between them.

Overcoming Mode Collapse

  1. A variant of GANs called DCGANs (Deep Convolutional GANs) does a better job of avoiding mode collapse.
  2. Mini-batch discrimination: It stabilizes training and essentially punishes generated batches of images that are too similar to one another (a rough sketch of the idea follows this list).
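The full mini-batch discrimination layer learns a projection to compare samples within a batch; a simpler relative of the idea, the minibatch standard-deviation trick popularized by later GAN work, just appends a batch-variety statistic to each sample’s features. A rough sketch of that simpler variant, reusing the Keras imports above:

```python
class MinibatchStdDev(layers.Layer):
    """Appends one feature describing how varied the current batch is."""
    def call(self, x):
        std = tf.math.reduce_std(x, axis=0)  # per-feature spread across the batch
        mean_std = tf.reduce_mean(std)       # one scalar: batch variety
        # Broadcast the statistic as an extra feature on every sample.
        stat = tf.ones([tf.shape(x)[0], 1]) * mean_std
        return tf.concat([x, stat], axis=-1)
```

Inserting this layer just before the discriminator’s final Dense layer gives it a direct signal: a collapsed batch of near-identical fakes has a tiny variety statistic and becomes easy to reject.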

Applications/Examples of GANs:

  1. Hand-written digits: for example, the number 0 generated by a DCGAN trained on the MNIST dataset.

2. Style Transfer


3. Generate realistic human faces and perform vector arithmetic on the latent codes behind the images produced (sketched below).
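A hypothetical sketch of that vector arithmetic, reusing the `generator` above: in the DCGAN paper this is the famous “man with glasses - man + woman = woman with glasses” example, where each `z_*` below stands in for a latent code (in practice averaged over several images of that kind).

```python
z_a = np.random.normal(size=(1, latent_dim))  # stand-in: "man with glasses"
z_b = np.random.normal(size=(1, latent_dim))  # stand-in: "man"
z_c = np.random.normal(size=(1, latent_dim))  # stand-in: "woman"

z_new = z_a - z_b + z_c                       # arithmetic happens on the codes
image = generator.predict(z_new, verbose=0)   # decode the result to an image
```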


GANs are a fairly new deep learning architecture, have a large number of real-world applications, and are evolving at a rapid pace. It will be exciting to see what else we can do with GANs.

Check out the code for making a GAN that generates Hand-written numbers here.

I hope this helped you understand GANs intuitively at a high level. This was my first Medium post. Thank you for reading!
