By combining a variational autoencoder with a generative adversarial network, we can use learned feature representations in the GAN discriminator as a basis for the VAE reconstruction objective.
We replace element-wise errors with feature-wise errors to better capture the data distribution while offering invariance to, e.g., translation. We apply our method to images of faces and show that it outperforms VAEs with element-wise similarity measures in terms of visual fidelity.
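The core change can be sketched in a few lines: instead of comparing a reconstruction to the input pixel by pixel, both are passed through an intermediate layer of the discriminator and the error is measured in that feature space. The sketch below is a minimal, hypothetical illustration using numpy; the single random linear-plus-ReLU map `disc_features` merely stands in for a learned discriminator layer, and none of these names come from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for one intermediate GAN discriminator layer:
# a fixed random linear map followed by ReLU (64 "pixels" -> 16 features).
W = rng.standard_normal((16, 64)) * 0.1

def disc_features(x):
    """Map a flattened image x (64-dim) to a 16-dim feature vector."""
    return np.maximum(W @ x, 0.0)

def elementwise_loss(x, x_rec):
    """Pixel-wise squared error: the usual VAE reconstruction term."""
    return float(np.mean((x - x_rec) ** 2))

def featurewise_loss(x, x_rec):
    """Feature-wise squared error: compare discriminator features of the
    input and the reconstruction instead of raw pixels."""
    f_x = disc_features(x)
    f_rec = disc_features(x_rec)
    return float(np.mean((f_x - f_rec) ** 2))

# Example: both losses vanish for a perfect reconstruction and are
# nonnegative for an imperfect one.
x = rng.standard_normal(64)
x_noisy = x + 0.1 * rng.standard_normal(64)
```

In the full method the feature map is not fixed: the discriminator is trained adversarially at the same time, so the similarity metric itself is learned.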
Moreover, we show that the method learns an embedding in which high-level abstract visual features (e.g., wearing glasses) can be modified using simple arithmetic.
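The "simple arithmetic" on attributes typically means computing an attribute direction as the difference between mean latent codes of images with and without the attribute, then adding that direction to a new code. The numpy sketch below illustrates this with synthetic latent codes; the shapes, offsets, and variable names are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical encoder outputs (8-dim latent codes) for two groups of
# face images: with glasses and without. The synthetic offset in the
# first coordinate stands in for a real learned attribute difference.
z_glasses = rng.standard_normal((100, 8)) + np.array([2.0] + [0.0] * 7)
z_plain = rng.standard_normal((100, 8))

# Attribute vector: difference of the mean latent codes of the two groups.
v_glasses = z_glasses.mean(axis=0) - z_plain.mean(axis=0)

# "Add glasses" to a new face by simple vector arithmetic in latent space;
# decoding z_modified would then yield the modified image.
z = rng.standard_normal(8)
z_modified = z + v_glasses
```

Because the embedding organizes high-level attributes along roughly linear directions, this addition changes the targeted attribute while largely preserving the rest of the image.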