"What I cannot create, I do not understand." — Richard Feynman
This repository implements a Variational Autoencoder (VAE) from scratch in PyTorch and investigates the continuity of the latent space learned by the model. The code is built to work on both the CelebA and CelebAMask-HQ datasets, with a focus on the latter to isolate facial features from background noise.
By linearly interpolating between two latent vectors, the model can generate smooth transitions between faces. This works because the model learns to map the complex distribution of faces onto a simple Standard Normal prior, yielding a continuous latent space.
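The interpolation itself is a one-liner. Below is a minimal sketch (the function name `interpolate_latents` and the latent dimension of 128 are illustrative assumptions, not the repository's actual API): decoding each row of the returned tensor yields one frame of the morph.

```python
import torch

def interpolate_latents(z_start, z_end, num_steps=10):
    """Linearly interpolate between two latent vectors.

    Returns a tensor of shape (num_steps, latent_dim) that slides
    from z_start to z_end in equal increments.
    """
    alphas = torch.linspace(0.0, 1.0, num_steps).unsqueeze(1)
    return (1 - alphas) * z_start + alphas * z_end

# Hypothetical latent dimension; the real value comes from the training config.
z_a = torch.randn(128)
z_b = torch.randn(128)
path = interpolate_latents(z_a, z_b, num_steps=8)
```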
| Component | Distribution |
|---|---|
| Prior | Standard Normal |
| Encoder | Diagonal Gaussian |
| Decoder | Bernoulli |
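Because the encoder outputs a diagonal Gaussian, sampling from it during training uses the standard reparameterization trick so gradients can flow through the sampling step. A minimal sketch (function name and shapes are assumptions for illustration):

```python
import torch

def reparameterize(mu, logvar):
    """Differentiable sample z ~ N(mu, diag(sigma^2)).

    Rewrites the sample as z = mu + sigma * eps with eps ~ N(0, I),
    so the randomness is isolated from the learnable parameters.
    """
    std = torch.exp(0.5 * logvar)  # logvar = log(sigma^2)
    eps = torch.randn_like(std)
    return mu + std * eps

# Hypothetical shapes: a batch of 4 latent vectors of dimension 128.
mu = torch.randn(4, 128)
logvar = torch.randn(4, 128)
z = reparameterize(mu, logvar)
```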
The model minimizes the negative of the Evidence Lower Bound (ELBO): $$\mathcal{L} = -\mathbb{E}_{q(z|x)}[\log p(x|z)] + D_{\mathrm{KL}}(q(z|x) \,\|\, p(z))$$
The first term is estimated using Monte Carlo sampling, while the second term is computed analytically due to the choice of the distributions (see the Analytical Solution to KL Divergence section in the blog post for details).
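With the distributions above, the loss can be sketched in a few lines: the Bernoulli log-likelihood is binary cross-entropy, and the KL from a diagonal Gaussian to a Standard Normal has a closed form. This is a generic sketch of the standard VAE loss, not the repository's exact implementation:

```python
import torch
import torch.nn.functional as F

def vae_loss(x, x_recon, mu, logvar):
    """Negative ELBO for a Bernoulli decoder and diagonal Gaussian encoder."""
    # Reconstruction term: Bernoulli log-likelihood == binary cross-entropy,
    # estimated with a single Monte Carlo sample (the decoded reconstruction).
    recon = F.binary_cross_entropy(x_recon, x, reduction="sum")
    # KL(q(z|x) || N(0, I)) computed analytically for a diagonal Gaussian.
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl
```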
This project uses uv for fast, reliable dependency management.
1. Clone the repository

   ```shell
   git clone https://github.com/ibrahimhabibeg/vae-faces.git
   cd vae-faces
   ```

2. Install dependencies

   ```shell
   uv sync
   ```
All interaction is handled through CLI scripts in the scripts/ directory, powered by simple-parsing.
You can run `uv run <path_to_script> --help` to see all available options for each script.
```shell
cd scripts
```

Download and prepare the dataset:

```shell
uv run celebamask_hq_download.py
uv run celebamask_hq_prep.py
```

Train the model. The default config in train.py is set to log to Weights & Biases and assumes you're logged in to your W&B account; you can disable logging by passing the `--nouse_wandb` flag:

```shell
uv run train.py
```

Generate a GIF transitioning between two faces:

```shell
uv run morph.py
```

Generate a grid of new faces:

```shell
uv run generated_images_grid.py
```

Generate a grid of reconstructions:

```shell
uv run reconstructions_grid.py
```

Generate a grid of interpolations between pairs of faces:

```shell
uv run morph_grid.py
```

