We detail basic setup instructions below; the original Pi-0 README follows this section.
```bash
uv venv
source .venv/bin/activate
GIT_LFS_SKIP_SMUDGE=1 uv sync
uv pip install "tensorflow<2.20" tensorflow_datasets shapely openai  # openai is for the PEEK evaluation
uv pip install -e ../peek_vlm  # for PEEK VLM inference code
```

Follow the instructions below to train Pi-0 on BRIDGE-v2 with PEEK. If you only want to serve the policy, skip to the next section.
```bash
uv run python -c "from lerobot.common.datasets.lerobot_dataset import LeRobotDataset; dataset = LeRobotDataset('jesbu1/bridge_v2_lerobot_pathmask')"
```

Pi-0+PEEK and Pi-0 are trained with LoRA on BRIDGE-v2. We initially trained Pi-0 with full fine-tuning, but it was not noticeably different from LoRA, so we provide LoRA training instructions, which require far less memory than full fine-tuning.
You can train with PEEK on 4 GPUs with a batch size of 256 (should work on 4x 48 GB GPUs) by running the following:
```bash
XLA_PYTHON_CLIENT_MEM_FRACTION=0.95 uv run scripts/train.py pi0_lora_bridge_1_cam_path_masked --exp-name=EXP_NAME --overwrite  # add --resume to resume training
```

You can train the original Pi-0 on BRIDGE-v2 without PEEK on 4 GPUs with a batch size of 256 (should work on 4x 48 GB GPUs) by running the following:
```bash
XLA_PYTHON_CLIENT_MEM_FRACTION=0.95 uv run scripts/train.py pi0_lora_bridge_1_cam --exp-name=EXP_NAME --overwrite  # add --resume to resume training
```

To change the number of GPUs or the batch size, modify `fsdp_devices` and `batch_size` in the config entries for `pi0_lora_bridge_1_cam_path_masked` or `pi0_lora_bridge_1_cam` in `src/openpi/training/config.py`.
You can also change `num_workers` in `src/openpi/training/config.py` to adjust the number of data-loading workers.
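When adjusting these values, keep in mind that the global batch is split across the FSDP devices. A quick sanity check (a sketch; the divisibility requirement is an assumption, though typical for sharded training):

```python
def per_device_batch(batch_size: int, fsdp_devices: int) -> int:
    # The global batch is sharded evenly across FSDP devices.
    assert batch_size % fsdp_devices == 0, "batch_size must be divisible by fsdp_devices"
    return batch_size // fsdp_devices

print(per_device_batch(256, 4))  # → 64
```

So the default config of 256 over 4 GPUs puts 64 examples on each device per step.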
Pre-trained checkpoints are available for download:
```bash
# PEEK checkpoint
uv run hf download jesbu1/pi0_lora_bridge_1_cam_path_masked --local-dir checkpoints/pi0_lora_bridge_1_cam_path_masked

# Original Pi-0 checkpoint
uv run hf download jesbu1/pi0_lora_bridge_1_cam --local-dir checkpoints/pi0_lora_bridge_1_cam
```

Once you have finished training or downloading, you can evaluate the model.
First, make sure the PEEK VLM server is running if you're using the PEEK checkpoint (not needed for the original Pi-0).
```bash
cd ../peek_vlm
conda activate peek_vlm
python scripts/vila_server.py --host localhost --port 8000 --model_path memmelma/peek_3b
```

Now, run the following command in this repo to start a policy server. This assumes the PEEK VLM server is running on localhost:8000 and that you want to serve the policy on localhost:8001.
```bash
# for PEEK
uv run scripts/serve_policy_vlm.py --port 8001 \
    --vlm-query-frequency=5 \
    --vlm-server-ip=http://localhost:8000 \
    policy:checkpoint --policy.config=pi0_lora_bridge_1_cam_path_masked \
    --policy.dir=checkpoints/pi0_lora_bridge_1_cam_path_masked/pi0_lora_bridge_1_cam_path_masked/29999/

# for original Pi-0
uv run scripts/serve_policy_vlm.py --port 8001 \
    --no-vlm-draw-mask \
    --no-vlm-draw-path \
    --vlm-server-ip=http://localhost:8000 \
    policy:checkpoint \
    --policy.config=pi0_lora_bridge_1_cam \
    --policy.dir=checkpoints/pi0_lora_bridge_1_cam/pi0_lora_bridge_1_cam/29999/
```

If you plan to serve the policy from a different machine than the one running the robot, you can use a tunneling tool such as ngrok, bore, pinggy, or localtunnel to host the policy server on a web-accessible address.
You can also do SSH-based port forwarding instead.
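The `--vlm-query-frequency=5` flag above suggests the VLM annotation is refreshed only every few policy steps, with the cached result reused in between. A minimal sketch of that caching pattern (hypothetical; this is not the actual `serve_policy_vlm.py` implementation, and `CachedVLMAnnotator` is an invented name):

```python
class CachedVLMAnnotator:
    """Re-query the VLM every `query_frequency` steps; reuse the cached result otherwise."""

    def __init__(self, query_fn, query_frequency: int):
        self.query_fn = query_fn          # e.g. a call to the PEEK VLM server
        self.query_frequency = query_frequency
        self.step = 0
        self.cached = None

    def annotate(self, observation):
        # Refresh the cached path/mask annotation every `query_frequency` steps.
        if self.step % self.query_frequency == 0:
            self.cached = self.query_fn(observation)
        self.step += 1
        return self.cached

# Usage with a stand-in query function that records how often it is called:
calls = []
annotator = CachedVLMAnnotator(lambda obs: calls.append(obs) or f"annotation@{obs}", 5)
results = [annotator.annotate(i) for i in range(10)]
print(len(calls))  # → 2 (queried at steps 0 and 5 only)
```

Querying the VLM less often trades annotation freshness for lower latency per policy step.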
Make sure you have the WidowX robot connected to the computer and the openpi server running.
Follow the instructions at the top of examples/bridge/main.py to set up the WidowX robot environment using a new conda or venv.
You might have to change some camera names manually depending on your WidowX setup.
Then:
```bash
python examples/bridge/main.py --policy-server-address <policy-server-address> --robot-ip localhost --robot-port 5556 --prompt "pick up the red block"
```

openpi holds open-source models and packages for robotics, published by the Physical Intelligence team.
Currently, this repo contains two types of models:
- the π₀ model, a flow-based diffusion vision-language-action model (VLA)
- the π₀-FAST model, an autoregressive VLA, based on the FAST action tokenizer.
For both models, we provide base model checkpoints, pre-trained on 10k+ hours of robot data, and examples for using them out of the box or fine-tuning them to your own datasets.
To run the models in this repository, you will need an NVIDIA GPU with at least the following specifications. These estimations assume a single GPU, but you can also use multiple GPUs with model parallelism to reduce per-GPU memory requirements by configuring fsdp_devices in the training config. Please also note that the current training script does not yet support multi-node training.
| Mode | Memory Required | Example GPU |
|---|---|---|
| Inference | > 8 GB | RTX 4090 |
| Fine-Tuning (LoRA) | > 22.5 GB | RTX 4090 |
| Fine-Tuning (Full) | > 70 GB | A100 (80GB) / H100 |
The repo has been tested with Ubuntu 22.04; we do not currently support other operating systems.
When cloning this repo, make sure to update submodules:
```bash
git clone --recurse-submodules git@github.com:Physical-Intelligence/openpi.git

# Or if you already cloned the repo:
git submodule update --init --recursive
```

We use uv to manage Python dependencies. See the uv installation instructions to set it up. Once uv is installed, run the following to set up the environment:

```bash
GIT_LFS_SKIP_SMUDGE=1 uv sync
```

NOTE: `GIT_LFS_SKIP_SMUDGE=1` is needed to pull LeRobot as a dependency.
Docker: As an alternative to uv installation, we provide instructions for installing openpi using Docker. If you encounter issues with your system setup, consider using Docker to simplify installation. See Docker Setup for more details.
We provide multiple base VLA model checkpoints. These checkpoints have been pre-trained on 10k+ hours of robot data, and can be used for fine-tuning.
| Model | Use Case | Description | Checkpoint Path |
|---|---|---|---|
| π₀ | Fine-Tuning | Base diffusion π₀ model for fine-tuning | s3://openpi-assets/checkpoints/pi0_base |
| π₀-FAST | Fine-Tuning | Base autoregressive π₀-FAST model for fine-tuning | s3://openpi-assets/checkpoints/pi0_fast_base |
We also provide "expert" checkpoints for various robot platforms and tasks. These models are fine-tuned from the base models above and intended to run directly on the target robot. These may or may not work on your particular robot. Since these checkpoints were fine-tuned on relatively small datasets collected with more widely available robots, such as ALOHA and the DROID Franka setup, they might not generalize to your particular setup, though we found some of these, especially the DROID checkpoint, to generalize quite broadly in practice.
| Model | Use Case | Description | Checkpoint Path |
|---|---|---|---|
| π₀-FAST-DROID | Inference | π₀-FAST model fine-tuned on the DROID dataset | s3://openpi-assets/checkpoints/pi0_fast_droid |
| π₀-DROID | Fine-Tuning | π₀ model fine-tuned on the DROID dataset | s3://openpi-assets/checkpoints/pi0_droid |
| π₀-ALOHA-towel | Inference | π₀ model fine-tuned on an ALOHA towel-folding task | s3://openpi-assets/checkpoints/pi0_aloha_towel |
| π₀-ALOHA-tupperware | Inference | π₀ model fine-tuned on an ALOHA tupperware task | s3://openpi-assets/checkpoints/pi0_aloha_tupperware |
| π₀-ALOHA-pen-uncap | Inference | π₀ model fine-tuned on an ALOHA pen-uncapping task | s3://openpi-assets/checkpoints/pi0_aloha_pen_uncap |
By default, checkpoints are automatically downloaded from s3://openpi-assets and are cached in ~/.cache/openpi when needed. You can overwrite the download path by setting the OPENPI_DATA_HOME environment variable.
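The resolution logic described above can be sketched as follows (a minimal illustration of the documented behavior; the helper name `openpi_cache_dir` is invented, not openpi's actual API):

```python
import os
from pathlib import Path

def openpi_cache_dir() -> Path:
    # OPENPI_DATA_HOME overrides the default ~/.cache/openpi cache location.
    return Path(os.environ.get("OPENPI_DATA_HOME", "~/.cache/openpi")).expanduser()

os.environ["OPENPI_DATA_HOME"] = "/tmp/openpi_cache"
print(openpi_cache_dir())  # → /tmp/openpi_cache
```

Point `OPENPI_DATA_HOME` at a large disk if your home directory has limited space, since checkpoints can be tens of gigabytes.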
Our pre-trained model checkpoints can be run with a few lines of code (here, the π₀-FAST-DROID model):

```python
from openpi.training import config
from openpi.policies import policy_config
from openpi.shared import download

config = config.get_config("pi0_fast_droid")
checkpoint_dir = download.maybe_download("s3://openpi-assets/checkpoints/pi0_fast_droid")

# Create a trained policy.
policy = policy_config.create_trained_policy(config, checkpoint_dir)

# Run inference on a dummy example.
example = {
    "observation/exterior_image_1_left": ...,
    "observation/wrist_image_left": ...,
    ...
    "prompt": "pick up the fork"
}
action_chunk = policy.infer(example)["actions"]
```

You can also test this out in the example notebook.
We provide detailed step-by-step examples for converting data, training, and running inference on various robots:
- DROID
- ALOHA
- USC WidowX
- Libero (Data conversion and training only)
Remote Inference: We provide examples and code for running inference of our models remotely: the model can run on a different server and stream actions to the robot via a websocket connection. This makes it easy to use more powerful GPUs off-robot and keep robot and policy environments separate.
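To illustrate the remote-inference pattern in miniature: the policy runs in a server process and streams action chunks to a client over the network. openpi uses a websocket connection; the sketch below uses plain TCP with newline-delimited JSON instead, and the message format is invented for illustration, not openpi's actual protocol.

```python
import json
import socket
import threading

def serve_policy(sock: socket.socket) -> None:
    # Accept one client, read an observation, reply with a dummy action chunk.
    conn, _ = sock.accept()
    with conn, conn.makefile("rwb") as f:
        obs = json.loads(f.readline())
        actions = [[0.1] * 7 for _ in range(obs["horizon"])]  # placeholder 7-DoF actions
        f.write((json.dumps({"actions": actions}) + "\n").encode())
        f.flush()

server = socket.socket()
server.bind(("localhost", 0))  # port 0: let the OS pick a free port
server.listen(1)
port = server.getsockname()[1]
threading.Thread(target=serve_policy, args=(server,), daemon=True).start()

# Client side: send an observation, receive an action chunk.
client = socket.create_connection(("localhost", port))
with client, client.makefile("rwb") as f:
    f.write((json.dumps({"horizon": 10}) + "\n").encode())
    f.flush()
    reply = json.loads(f.readline())
print(len(reply["actions"]))  # → 10
```

The real implementation handles repeated requests and image payloads, but the request/response shape is the same: one observation in, one chunk of actions out.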
Test inference without a robot: We provide a script for testing inference without a robot. This script will generate a random observation and run inference with the model. See here for more details.
We will fine-tune the base model on the Libero dataset as a running example. This involves three steps:
- Converting your data to a LeRobot dataset (which we use for training)
- Defining training configs and running training
- Spinning up a policy server and running inference
We provide a minimal example script for converting Libero data to a LeRobot dataset in examples/libero/convert_libero_data_to_lerobot.py. You can easily modify it to convert your own data! You can download the raw Libero dataset from here, and run the script with:
```bash
uv run examples/libero/convert_libero_data_to_lerobot.py --data_dir /path/to/your/libero/data
```

To fine-tune a base model on your own data, you need to define configs for data processing and training. We provide example configs with detailed comments for Libero below, which you can modify for your own dataset:
`LiberoInputs` and `LiberoOutputs`: Define the data mapping from the Libero environment to the model and vice versa. Used for both training and inference.
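To illustrate the shape of such a mapping, here is a hypothetical sketch (the class and key names are invented and do not match the real `LiberoInputs`/`LiberoOutputs` transforms):

```python
from dataclasses import dataclass

@dataclass
class ExampleInputs:
    """Maps environment observations to the model's expected input keys (sketch only)."""

    def __call__(self, env_obs: dict) -> dict:
        # Rename environment keys to whatever the model expects.
        return {
            "image": env_obs["agentview_rgb"],
            "state": env_obs["robot_state"],
            "prompt": env_obs["task_description"],
        }

@dataclass
class ExampleOutputs:
    """Maps model outputs back to environment actions (the inverse direction)."""

    def __call__(self, model_out: dict) -> dict:
        # e.g. truncate the model's action vector to the robot's 7 DoF.
        return {"actions": model_out["actions"][:7]}

inputs = ExampleInputs()
mapped = inputs({"agentview_rgb": "img", "robot_state": [0.0] * 8, "task_description": "pick"})
print(mapped["prompt"])  # → pick
```

The input transform runs in both training and inference, while the output transform is only needed at inference time to turn model predictions back into robot commands.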