We introduce a physics-informed planning framework to solve a set of challenging physical reasoning tasks.
Given the task of positioning a ball-like object in a goal region beyond direct reach, humans can often throw, slide, or rebound objects against walls to attain the goal. However, enabling robots to reason similarly is non-trivial. Existing methods for physical reasoning are data-hungry and struggle with the complexity and uncertainty inherent in the real world. This paper presents PhyPlan, a novel physics-informed planning framework that combines physics-informed neural networks (PINNs) with modified Monte Carlo Tree Search (MCTS) to enable embodied agents to perform dynamic physical tasks. PhyPlan leverages PINNs to simulate and predict outcomes of actions quickly and accurately, and uses MCTS for planning. It dynamically determines whether to consult a PINN-based simulator (coarse but fast) or engage directly with the actual environment (fine but slow) to determine the optimal policy. Evaluation with robots in simulated 3D environments demonstrates the ability of our approach to solve 3D physical reasoning tasks involving the composition of dynamic skills. Quantitatively, PhyPlan excels in several aspects: (i) it achieves lower regret when learning novel tasks compared to the state of the art, (ii) it expedites skill learning and enhances the speed of physical reasoning, and (iii) it demonstrates higher data efficiency compared to a physics-uninformed approach.
- Follow the installation process for Isaac Gym (version 1.0rc4). Further, ensure the specific versions of the following: cudatoolkit 11.1.1, tkinter 8.6.12, yaml 0.2.5.
- Create a virtual environment with Python 3.7.12 and install all the dependencies:
  ```
  conda env create -f environment.yml
  ```
- Download and set up GroundingDINO (version 0.1.0) in the Agent folder.
Execute the commands in activate_env.sh in the terminal, or run
```
source activate_env.sh
```
Execute the commands in deactivate_env.sh in the terminal, or run
```
source deactivate_env.sh
```
```
PhyPlan
├── Agent: (Baseline and our planner implementation)
│   ├── agent_models: (Baseline models for each task)
│   ├── assets: (URDF files describing objects in the environments)
│   ├── data: (Training and testing data generated by data_generation.py)
│   ├── data_generation.py: (Generate training data for baselines)
│   ├── detection.py: (Detect objects in the scene using GroundingDINO)
│   ├── environments: (Environment descriptions for each task)
│   ├── BASELINE_agent.py: (BASELINE agents)
│   ├── BASELINE_eval_agent.py: (BASELINE evaluation)
│   ├── BASELINE_train.py: (BASELINE training)
│   ├── phyplan.py: (Our planner, or reasoning agent)
│   └── pinn_models: (PINN-based skill models)
│
└── Skill_Learning:
    ├── Data_Collection: (Collecting datasets for different skills)
    │   ├── Collision
    │   ├── Sliding
    │   ├── Swinging
    │   └── Throwing
    └── Skill_Network: (Learning different skills)
        ├── main.py: (Training entry point)
        ├── data.py: (Dataset loading)
        ├── model.py: (PINN architecture)
        └── loss.py: (Physics-informed loss functions)
```
Navigate to the Agent directory and run `python data_generation.py` with the arguments listed below to generate the data (an example invocation follows the tables):
| Argument | Description |
|---|---|
| env | Name of the environment to run |
| simulate (bool) | Set True to render the simulation |
| action_params | Dimension of action vector |
| L (int) | Number of contexts generated |
| M (int) | Number of actions per context |
| mode (train / test / dryrun) | Type of experiment to run |
| fast (bool) | Invoke PINN for fast training |
| robot (bool) | Set True to use robot to perform actions |
| perception (bool) | Set True to use perception |
| Environment Name (internal) | Environment Name (external) | num_action_parameters |
|---|---|---|
| pendulum | Launch | 2 |
| sliding | Slide | 2 |
| wedge | Bounce | 2 |
| sliding_bridge | Bridge | 3 |
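For example, assuming the arguments are exposed as argparse-style flags with the names above (the flag syntax and values here are illustrative, not prescribed), generating training data for the Launch task could look like:
```
python data_generation.py --env pendulum --mode train --action_params 2 --L 100 --M 50
```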
Navigate to the Agent directory and run `python BASELINE_train.py` with the arguments listed below to train the models (an example invocation follows the table):
| Argument | Description |
|---|---|
| env | Name of the environment to run |
| action_params | Dimension of action vector |
| epochs | Number of epochs to train each intermediate model |
| contexts | Total number of contexts to train |
| contexts_step | Number of training contexts after which to save intermediate model |
| fast | Train model on PINN data |
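For example, again assuming argparse-style flags (the values below are illustrative placeholders, not the paper's hyperparameters, and BASELINE stands for a concrete baseline name as in the directory tree):
```
python BASELINE_train.py --env pendulum --action_params 2 --epochs 100 --contexts 500 --contexts_step 50
```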
Navigate to the Agent directory and run `python {phyre_}eval_agent.py` with the arguments listed below to get regret results.
Navigate to the Agent directory, run `python llm_brain.py` in one terminal, and minimize it, letting it run in the background. Then, in a new terminal, navigate to the Agent directory again and run `python llm_eval_agent.py` with the arguments listed below to get regret results:
| Argument | Description |
|---|---|
| env | Name of the environment to run |
| action_params | Dimension of action vector |
| simulate (bool) | Set True to render the simulation |
| model | Name of the agent model to evaluate |
| contexts | Total number of contexts to evaluate |
| actions | Number of actions available for each context |
| robot (bool) | Set True to use robot to perform actions |
| perception (bool) | Set True to use perception |
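For example, evaluating a trained agent could look like the following (the flag syntax, values, and the placeholder MODEL_NAME are illustrative):
```
python llm_eval_agent.py --env pendulum --action_params 2 --model MODEL_NAME --contexts 100 --actions 50
```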
This is the latest agent design, which does not require any training or data generation; it uses skill composition.
Navigate to the Agent directory and run `python phyplan.py` with the arguments listed below to run the MCTS algorithm (an example invocation follows the table):
| Argument | Description |
|---|---|
| env | Name of the environment to run (Currently works only for paddles) |
| action_params | Dimension of action vector |
| simulate (bool) | Set True to render the simulation |
| contexts | Total number of contexts to evaluate |
| actions | Number of actions available for each context |
| robot (bool) | Set True to use robot to perform actions |
| perception (bool) | Set True to use perception |
| pinn_attempts (int) | Number of attempts using PINN before each simulator attempt |
| adaptive (bool) | Set False to disable GP-UCB adaptation |
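For example (flag syntax and values are illustrative; per the table above, the planner currently works only for the paddles environment):
```
python phyplan.py --env paddles --action_params 2 --contexts 100 --actions 50 --pinn_attempts 5
```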
Navigate to the Agent directory and write the necessary scripts with the specific hyperparameters of the experiment. Stitch everything together in script.sh and run it; a sketch is given below. Running all stages takes around 4-5 hours when the robot is not employed.
Note: Employing the robot significantly slows down the simulation to avoid execution errors. Since the robotic arm is not involved in reasoning and only executes actions to demonstrate the physical feasibility and performability of our tasks, it is advisable to train, analyse, and compare the models without the robotic arm unless robotic visualisation is needed.
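A minimal sketch of what script.sh could look like, chaining the stages from the sections above (all flag syntax and argument values are illustrative placeholders, not the experiment's hyperparameters):
```bash
#!/bin/bash
# Illustrative pipeline; substitute your experiment's hyperparameters.
set -e
cd Agent

# Stage 1: generate training data for the chosen environment.
python data_generation.py --env pendulum --mode train --action_params 2 --L 100 --M 50

# Stage 2: train a baseline agent on the generated data.
python BASELINE_train.py --env pendulum --action_params 2 --epochs 100 --contexts 500 --contexts_step 50

# Stage 3: evaluate the trained agent and collect regret results.
python BASELINE_eval_agent.py --env pendulum --action_params 2 --contexts 100 --actions 50
```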
Different skills, such as sliding, swinging, throwing, and collision, are trained using the code under Skill_Learning (datasets via Data_Collection, networks via Skill_Network); a minimal sketch of a physics-informed skill model is given below.
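To illustrate what learning such a skill looks like, here is a minimal PINN sketch for the throwing skill. This is not the repository's model.py or loss.py; the architecture, variable names, and the use of ideal projectile dynamics as the governing equations are assumptions made purely for illustration.
```python
# Hypothetical PINN sketch for the throwing (projectile) skill.
# Not the repository's model.py/loss.py; for illustration only.
import math
import torch
import torch.nn as nn

G = 9.81  # gravitational acceleration (m/s^2)

class ThrowPINN(nn.Module):
    """Maps (t, v0, theta) to the projectile position (x, y) at time t."""
    def __init__(self, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3, hidden), nn.Tanh(),
            nn.Linear(hidden, hidden), nn.Tanh(),
            nn.Linear(hidden, 2),
        )

    def forward(self, t, v0, theta):
        return self.net(torch.cat([t, v0, theta], dim=-1))

def physics_residual(model, t, v0, theta):
    """Penalize violations of projectile dynamics: x'' = 0 and y'' = -g."""
    t = t.clone().requires_grad_(True)
    xy = model(t, v0, theta)
    x, y = xy[:, :1], xy[:, 1:]
    # First and second time derivatives via autograd.
    dx = torch.autograd.grad(x, t, torch.ones_like(x), create_graph=True)[0]
    dy = torch.autograd.grad(y, t, torch.ones_like(y), create_graph=True)[0]
    ddx = torch.autograd.grad(dx, t, torch.ones_like(dx), create_graph=True)[0]
    ddy = torch.autograd.grad(dy, t, torch.ones_like(dy), create_graph=True)[0]
    return (ddx ** 2).mean() + ((ddy + G) ** 2).mean()

model = ThrowPINN()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# Random launch parameters; analytic projectile positions stand in here
# for trajectories collected in Skill_Learning/Data_Collection/Throwing.
t = torch.rand(256, 1)
v0 = 5.0 * torch.rand(256, 1)
theta = (math.pi / 2) * torch.rand(256, 1)
target = torch.cat([v0 * torch.cos(theta) * t,
                    v0 * torch.sin(theta) * t - 0.5 * G * t ** 2], dim=-1)

# One training step: data loss on observed trajectories + physics residual.
loss = nn.functional.mse_loss(model(t, v0, theta), target) \
       + physics_residual(model, t, v0, theta)
opt.zero_grad()
loss.backward()
opt.step()
```
The physics residual constrains the network wherever autograd can evaluate it, not only at observed samples, which is the source of the data efficiency attributed to PINN-based skill models above.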
@inproceedings{phyplan2026,
title = {PhyPlan: Learning To Plan Tasks with Generalizable and Rapid Physical Reasoning for Embodied Manipulation},
author = {Kanwar, Ankit and Soin, Hartej and Barnawal, Abhinav and Chopra, Mudit and Vagadia, Harshil and Banerjee, Tamajit and Tuli, Shreshth and Chakraborty, Souvik and Paul, Rohan},
booktitle = {},
year = {2026}
}
