
# Variational Autoencoder for Face Generation and Morphing


*Banner image*

"What I cannot create, I do not understand." — Richard Feynman

This repository implements a Variational Autoencoder (VAE) from scratch in PyTorch and investigates the continuity of the latent space learned by the model. The code is built to work on both the CelebA and CelebAMask-HQ datasets, with a focus on the latter to isolate facial features from background noise.

📖 Read the full Blog Post

## Results

### Latent Space Interpolation (Morphing)

By linearly interpolating between two latent vectors $z_1$ and $z_2$ obtained by encoding two different faces, and passing the intermediate vectors through the decoder, we can generate smooth transitions between entirely different faces. The smoothness of the morphing animation demonstrates the continuity of the learned latent space.

*Morphing animation*
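The sketch below shows how such a morph can be produced. It assumes a trained VAE exposing `encode` (returning the posterior mean and log-variance) and `decode` methods; the actual names and signatures in this repo's model may differ.

```python
# Hypothetical morphing sketch: method names (`encode`, `decode`) are
# assumptions, not necessarily this repo's exact API.
import torch

@torch.no_grad()
def morph(model, face_a, face_b, steps=16):
    """Decode images along the straight line between two latent codes."""
    mu_a, _ = model.encode(face_a.unsqueeze(0))  # z_1: posterior mean of face A
    mu_b, _ = model.encode(face_b.unsqueeze(0))  # z_2: posterior mean of face B
    frames = []
    for alpha in torch.linspace(0.0, 1.0, steps):
        z = (1 - alpha) * mu_a + alpha * mu_b    # linear interpolation in latent space
        frames.append(model.decode(z).squeeze(0))
    return frames  # e.g. stack into a GIF to visualise the morph
```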

### Generative Sampling

The model learns to map the complex distribution of faces to a simple Standard Normal prior $p(z) = \mathcal{N}(0, I)$, which can then be sampled from to generate entirely new faces. The grid below shows 128 unique samples generated by drawing from the prior and decoding.

*Generated samples*
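Generation then reduces to drawing from the prior and decoding, roughly as in this sketch (again assuming a hypothetical `model.decode` and a latent dimensionality `latent_dim` matching the trained encoder):

```python
# Sampling sketch: `decode` and `latent_dim` are assumptions about the model's API.
import torch

@torch.no_grad()
def sample_faces(model, n=128, latent_dim=128):
    z = torch.randn(n, latent_dim)  # draw z ~ N(0, I), i.e. the prior
    return model.decode(z)          # map prior samples to face images
```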

## Distribution Choices

| Component | Distribution |
| --- | --- |
| Prior $p(z)$ | Standard Normal $\mathcal{N}(0, I)$ |
| Encoder $q_\phi(z \mid x)$ | Diagonal Gaussian |
| Decoder $p_\theta(x \mid z)$ | Bernoulli |

The model minimizes the negative of the Evidence Lower Bound (ELBO):

$$\mathcal{L} = -\mathbb{E}_{q_\phi(z|x)}\left[\log p_\theta(x|z)\right] + D_{\mathrm{KL}}\left(q_\phi(z|x) \,\|\, p(z)\right)$$

The first term is estimated using Monte Carlo sampling, while the second term is computed analytically thanks to the choice of distributions (see the Analytical Solution to KL Divergence section in the blog post for details).
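A compact sketch of this loss, assuming the decoder outputs Bernoulli logits and that `mu`/`logvar` parameterize the diagonal-Gaussian posterior (illustrative names, not necessarily the repo's):

```python
# Negative ELBO sketch: reconstruction term from one Monte Carlo sample of z,
# KL term in closed form for N(mu, diag(exp(logvar))) vs. N(0, I).
import torch
import torch.nn.functional as F

def vae_loss(x, x_logits, mu, logvar):
    # Bernoulli log-likelihood (negated), summed over pixels and batch
    recon = F.binary_cross_entropy_with_logits(x_logits, x, reduction="sum")
    # Analytical KL(q(z|x) || N(0, I)) for a diagonal Gaussian posterior
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl
```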

## Installation

This project uses uv for fast, reliable dependency management.

1. Clone the repository

   ```bash
   git clone https://github.com/ibrahimhabibeg/vae-faces.git
   cd vae-faces
   ```

2. Install dependencies

   ```bash
   uv sync
   ```

## Usage

All interaction is handled through CLI scripts in the `scripts/` directory, powered by `simple-parsing`.

Run

```bash
uv run <path_to_script> --help
```

to see all available options for each script.

### 1. Data Preparation

```bash
cd scripts
uv run celebamask_hq_download.py
uv run celebamask_hq_prep.py
```

### 2. Training

The default configuration in `train.py` logs to Weights & Biases and assumes you are logged in to your W&B account. You can disable logging by passing the `--nouse_wandb` flag.

```bash
uv run train.py
```

### 3. Inference & Morphing

Generate a GIF transitioning between two faces:

```bash
uv run morph.py
```

Generate a grid of new faces:

```bash
uv run generated_images_grid.py
```

Generate a grid of reconstructions:

```bash
uv run reconstructions_grid.py
```

Generate a grid of interpolations between pairs of faces:

```bash
uv run morph_grid.py
```
