REPR: Position-Aware Reconstruction with Transformers

Official implementation of REPR (Position-Aware Reconstruction with Transformers) for self-supervised learning, from the paper "Relative Position and Scale Regression for Self-Supervised Pretraining".

Note: In this codebase, REPR is referred to as partmae_v5 (L_pose only) or partmae_v6 (all losses).

Installation

Use the uv package manager to install all dependencies:

uv sync

Data Setup

ImageNet-1K

Download ImageNet-1K and organize as:

data/
  imagenet/
    train/
      n01440764/
        ...
    val/
      n01440764/
        ...

Update paths in fabric_configs/data/in1k.yaml:

train_root: "/path/to/imagenet/train"
val_root: "/path/to/imagenet/val"

Other Datasets

  • ADE20K: For semantic segmentation evaluation
  • COCO: For object detection finetuning
  • VOC: For semantic segmentation evaluation

Configure the paths in the respective config files under fabric_configs/data/.

Pretrained Models

Download pretrained checkpoints from HuggingFace:

huggingface-cli download dgcnz/REPR --local-dir .

Available models:

  • outputs/2025-04-11/10-15-18/epoch_0199.ckpt: REPR (L_pose only, i.e. partmae_v5)
  • outputs/2025-06-22/19-16-53/epoch_0199.ckpt: REPR (all losses, i.e. partmae_v6)
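
To load a downloaded checkpoint programmatically, a minimal sketch along these lines should work. It assumes the .ckpt file is a plain torch checkpoint whose weights may sit under a "state_dict" or "model" key; inspect the file and adapt the key handling (and torch.load arguments) as needed:

import torch
from huggingface_hub import hf_hub_download

# Fetch one of the released checkpoints from the dgcnz/REPR HuggingFace repo.
ckpt_path = hf_hub_download(
    repo_id="dgcnz/REPR",
    filename="outputs/2025-06-22/19-16-53/epoch_0199.ckpt",
)
ckpt = torch.load(ckpt_path, map_location="cpu")

# Checkpoints often wrap the weights in a dict; fall back to the raw object.
if isinstance(ckpt, dict):
    state_dict = ckpt.get("state_dict", ckpt.get("model", ckpt))
else:
    state_dict = ckpt
print(type(state_dict))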

Reproduce Paper Results

Linear Classification on ImageNet-1K

uv run python -m src.experiments.linear_classification.main_linear \
    model=partmaev6_ep199_b \
    data=imagenet

Semantic Segmentation on ADE20K

uv run python -m src.experiments.linear_segmentation.eval_linear \
    model=partmaev6_b_ep199 \
    data=ade20k

Object Detection on COCO

uv run python -m src.main_finetune_det \
    model=partmaev6_b_ep199 \
    data=coco

K-NN Classification

uv run python -m src.experiments.knn.main_knn \
    model=partmaev6_b_ep199 \
    data=imagenet

Training from Scratch

Pretraining

Local (for debugging):

uv run python -m src.main_pretrain \
    experiment=pretrain/in1k/partmae_v6_vit_b_16/4060ti

SLURM cluster:

sbatch scripts/slurm/train_partmae_v6_h100.sh

Custom Training

Config files are in fabric_configs/experiment/. Override parameters on the command line:

uv run python -m src.main_pretrain \
    experiment=pretrain/in1k/partmae_v6_vit_b_16/4060ti \
    trainer.max_epochs=300 \
    data.batch_size=256

Model Architecture

The main model is implemented in src/models/components/partmae_v6.py.

Key features:

  • Off-grid position embedding for flexible patch sampling
  • Position-aware reconstruction with a pose loss (relative position and scale regression; an illustrative sketch follows this list)
  • Multi-crop training support
  • Distributed training with PyTorch Lightning Fabric
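
For intuition about the pose objective (the relative position and scale regression behind the method's name), the sketch below shows one way pairwise targets could be formed for off-grid patches. It is only illustrative, not the code in partmae_v6.py, and all tensor names are hypothetical:

import torch

N = 8
centers = torch.rand(N, 2)          # (y, x) patch centers sampled off-grid in [0, 1]
sizes = torch.rand(N) * 0.2 + 0.05  # patch side lengths, also sampled freely

# Relative position target: difference of centers for every (i, j) pair -> (N, N, 2).
rel_pos = centers.unsqueeze(1) - centers.unsqueeze(0)

# Relative scale target: log ratio of patch sizes for every pair -> (N, N).
rel_scale = torch.log(sizes.unsqueeze(1) / sizes.unsqueeze(0))

# A pose head would predict these quantities from patch embeddings and be
# trained with a regression loss (e.g. smooth L1) against the targets.
print(rel_pos.shape, rel_scale.shape)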

Repository Structure

├── src/
│   ├── main_pretrain.py          # Main pretraining script
│   ├── models/components/        # Model implementations
│   ├── experiments/             # Evaluation scripts
│   └── data/                    # Data loading and preprocessing
├── fabric_configs/              # Hydra configuration files
├── scripts/slurm/              # SLURM job scripts
└── tests/                      # Unit tests and benchmarks
