Authors: Anders Boesen Lindbo Larsen, Søren Kaae Sønderby, Hugo Larochelle, Ole Winther
Publication date: 2015/12/31
Journal: arXiv preprint arXiv:1512.09300
By combining a variational autoencoder (VAE) with a generative adversarial network (GAN), we can use learned feature representations in the GAN discriminator as the basis for the VAE reconstruction objective.
Thereby, we replace element-wise errors with feature-wise errors to better capture the data distribution while offering invariance to, e.g., translation. We apply our method to images of faces and show that it outperforms VAEs with element-wise similarity measures in terms of visual fidelity.
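To make the contrast between element-wise and feature-wise errors concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a hand-made feature extractor (`toy_features`: a few random filters, circular correlation, ReLU, global average pooling) as a stand-in for the learned discriminator features; all function names, the filters, and the toy image are hypothetical. Because the pooled features ignore circular shifts, a translated image has a large pixel-wise error but an essentially zero feature-wise error.

```python
import numpy as np


def elementwise_error(x, x_rec):
    """Mean squared error computed pixel by pixel."""
    return float(np.mean((x - x_rec) ** 2))


def toy_features(x, filters):
    """Hypothetical stand-in for discriminator features (not a learned network).

    Each filter is circularly cross-correlated with the image via the FFT,
    passed through a ReLU, and averaged over all positions.
    """
    fx = np.fft.fft2(x)
    feats = []
    for f in filters:
        ff = np.fft.fft2(f, s=x.shape)  # zero-pad the filter to image size
        response = np.real(np.fft.ifft2(fx * np.conj(ff)))
        feats.append(np.maximum(response, 0.0).mean())
    return np.array(feats)


def featurewise_error(x, x_rec, filters):
    """Mean squared error computed in the toy feature space."""
    diff = toy_features(x, filters) - toy_features(x_rec, filters)
    return float(np.mean(diff ** 2))


rng = np.random.default_rng(0)
filters = rng.normal(size=(4, 3, 3))  # assumed random filters

# A toy "image": a bright square, and the same square translated.
x = np.zeros((16, 16))
x[4:8, 4:8] = 1.0
x_shifted = np.roll(x, shift=(2, 3), axis=(0, 1))

pixel_err = elementwise_error(x, x_shifted)
feature_err = featurewise_error(x, x_shifted, filters)
print(f"element-wise error: {pixel_err:.4f}")
print(f"feature-wise error: {feature_err:.2e}")
```

The invariance here is exact only because the shift is circular and the pooling is global; a learned discriminator layer would at best be approximately invariant.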
Moreover, we show that the method learns an embedding in which high-level abstract visual features (e.g., wearing glasses) can be modified using simple arithmetic.
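One common way to perform such arithmetic, sketched here under stated assumptions and not taken from the abstract, is to estimate an attribute direction as the difference between the mean latent code of examples with the attribute and the mean of those without, then add that direction to a single code. The latent codes below are synthetic, and `attribute_vector`, `apply_attribute`, and `strength` are hypothetical names; in practice the codes would come from a trained encoder and the edited code would be passed to the decoder.

```python
import numpy as np


def attribute_vector(z, has_attr):
    """Difference of mean latent codes: with attribute minus without."""
    z = np.asarray(z, dtype=float)
    has_attr = np.asarray(has_attr, dtype=bool)
    return z[has_attr].mean(axis=0) - z[~has_attr].mean(axis=0)


def apply_attribute(z_single, direction, strength=1.0):
    """Move one latent code along the attribute direction."""
    return z_single + strength * direction


# Synthetic latent codes (assumption): examples with the attribute are
# offset by a fixed direction that the code never sees directly.
rng = np.random.default_rng(0)
true_direction = np.array([2.0, 0.0, -1.0, 0.0])
has_attr = np.arange(200) % 2 == 0
z = rng.normal(size=(200, 4)) + has_attr[:, None] * true_direction

glasses = attribute_vector(z, has_attr)
z_edited = apply_attribute(z[1], glasses)  # z[1] lacks the attribute

cosine = glasses @ true_direction / (
    np.linalg.norm(glasses) * np.linalg.norm(true_direction)
)
print("estimated direction:", np.round(glasses, 2))
print("cosine with true direction:", round(float(cosine), 3))
```

Setting `strength` below or above 1.0 would weaken or exaggerate the edit; negative values move the code away from the attribute.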