
Hello World in Imitation Learning

Imitation learning is supervised learning where the data comes from expert demonstrations. The expert can be a human or any other agent. The input data is referred to as the "state" and the output data as the "action." With a discrete action space the problem resembles classification; with a continuous action space it is regression.

A policy $\pi: S \rightarrow A$ is the function/model that takes a state as input and outputs an action. The goal of imitation learning is to learn a policy that mimics the expert's behavior.

Behavioral Cloning (BC) is offline imitation learning: it uses only the collected demonstrations and does not query a simulator during learning.
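As a rough sketch of what BC training looks like (a plain PyTorch MLP with hypothetical state/action dimensions, not the exact code used in the notebooks):

```python
import torch
import torch.nn as nn

# Hypothetical dimensions; each notebook uses its environment's own spaces.
STATE_DIM, ACTION_DIM = 3, 1  # e.g. Pendulum-v1: Continuous(3) -> Continuous(1)

# The policy pi: S -> A as a small MLP.
policy = nn.Sequential(
    nn.Linear(STATE_DIM, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, ACTION_DIM),
)

optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()  # regression loss for continuous actions;
                        # use nn.CrossEntropyLoss() for discrete ones

def bc_update(states: torch.Tensor, actions: torch.Tensor) -> float:
    """One supervised step on a batch of expert (state, action) pairs."""
    optimizer.zero_grad()
    loss = loss_fn(policy(states), actions)
    loss.backward()
    optimizer.step()
    return loss.item()

# Dummy batch standing in for expert demonstrations.
states = torch.randn(32, STATE_DIM)
actions = torch.randn(32, ACTION_DIM)
print(bc_update(states, actions))
```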

  • This tutorial is for educational purposes, so the code is optimized for readability rather than production use.
  • Each policy is trained in a single Jupyter notebook.
  • Each directory contains a readme file.

Demos

| Task | State Space | Action Space | Expert | Colab |
| --- | --- | --- | --- | --- |
| MountainCar-v0 | Continuous(2) | Discrete(3) | Human | Open In Colab |
| Pendulum-v1 | Continuous(3) | Continuous(1) | RL | Open In Colab |
| CarRacing-v2 | Image(96x96x3) | Continuous(3) | Human | Open In Colab |
| Ant-v3 | Continuous(111) | Continuous(8) | RL | todo |
| Lift | Continuous(multi-modal) | Continuous(7) | Human | Open In Colab |

Quick start

  • Use the "Open In Colab" links above to run the code in Colab.

Install locally to collect data on your own

  • Please see the readme file in each directory for installation and data collection instructions.

Data format

  • We use HDF5 files for robomimic (see the readme.md in the robomimic directory to understand the data format) and for the real robot.
  • The remaining environments store demonstrations as *.pkl files.
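A minimal sketch of inspecting both formats (the file names below are hypothetical, and the pickle layout is an assumption; each directory's readme documents the real structure):

```python
import pickle

import h5py

# Pickle-based environments. "demos.pkl" is a placeholder name, and the
# dict-of-arrays layout is assumed for illustration only.
with open("demos.pkl", "rb") as f:
    demos = pickle.load(f)
print(type(demos))

# HDF5 files (robomimic / real robot): walk the tree to discover the layout.
with h5py.File("demos.hdf5", "r") as f:
    f.visititems(lambda name, obj: print(name, getattr(obj, "shape", "")))
```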

Collecting demonstrations

  • Please see the respective folders (e.g. robomimic_tasks) for data collection instructions.
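For the Gym-based tasks, collection roughly amounts to rolling an environment forward while recording (state, action) pairs. A hypothetical sketch using the gymnasium API, with random actions standing in for the human or RL expert (the folders document the real key bindings and storage formats):

```python
import pickle

import gymnasium as gym

env = gym.make("Pendulum-v1")
states, actions = [], []

obs, _ = env.reset(seed=0)
done = False
while not done:
    action = env.action_space.sample()  # placeholder for the expert's action
    states.append(obs)
    actions.append(action)
    obs, reward, terminated, truncated, _ = env.step(action)
    done = terminated or truncated

# Save the demonstration as a *.pkl file (layout assumed for illustration).
with open("pendulum_demo.pkl", "wb") as f:
    pickle.dump({"states": states, "actions": actions}, f)
```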