google-research · ssnl · Mar 19, 2022 · Apr 14, 2022 · Mar 23, 2022 · Mar 30, 2022
diff --git a/README.md b/README.md
@@ -1,14 +1,24 @@
-# RoboDesk
+# RoboDesk with A Diverse Set of Distractors
 
-[![PyPI](https://img.shields.io/pypi/v/robodesk.svg)](https://pypi.python.org/pypi/robodesk/#history)
+A Multi-Task Reinforcement Learning Benchmark with A Diverse Set of Distractors
 
-A Multi-Task Reinforcement Learning Benchmark
+<img src="./animation.gif" width="830" />
 
-![Robodesk Banner](https://i.imgur.com/1qp1SUh.gif)
+This repository contains a version of [RoboDesk](https://github.com/google-research/robodesk) that supports a rich set of challenging distractors, including camera and lighting noise and even natural-video noise. The distractors can be individually switched on or off. When they are all off (the default), the environment behaves identically to the original RoboDesk. Support for these distractors was added by [Tongzhou Wang](https://ssnl.github.io/).
 
-If you find this open source release useful, please reference in your paper:
+This environment is designed for advancing learning with noisy observations and rewards, a step beyond controlled toy environments and towards more realistic cases. Here is a (likely incomplete) list of projects using this environment:
++ [Denoised MDPs: Learning World Models Better Than The World Itself](https://ssnl.github.io/denoised_mdp). ICML 2022.
+
+If you find this open source release useful, please reference the following entries in your paper:
 
 ```
+@misc{wang2022robodeskdistractor,
+  author = {Tongzhou Wang},
+  title = {RoboDesk with A Diverse Set of Distractors},
+  year = {2022},
+  howpublished = {\url{https://github.com/SsnL/robodesk}},
+}
+
 @misc{kannan2021robodesk,
   author = {Harini Kannan and Danijar Hafner and Chelsea Finn and Dumitru Erhan},
   title = {RoboDesk: A Multi-Task Reinforcement Learning Benchmark},
@@ -23,11 +33,14 @@ If you find this open source release useful, please reference in your paper:
 - **Complexity:** The high-dimensional image inputs contain objects of different shapes and colors, whose initial positions are randomized to avoid naive memorization and require learning algorithms to generalize.
 - **Robustness:** We carefully designed and tested RoboDesk to ensure fast and stable physics simulation. This avoids objects from intersecting, getting stuck, or quickly flying away, a common problem with some existing environments.
 - **Lightweight:** RoboDesk comes as a self-contained Python package with few dependencies. The source code is clean and pragmatic, making it a useful blueprint for creating new MuJoCo environments.
+- **Distractors:** RoboDesk provides an easy-to-use API to turn on various distractors, i.e., sources of environment noise. Available distractors include:
+  - Shaky and flickering *environment headlights*;
+  - Shaky *camera movements*;
+  - *TV* playing natural videos, with hue controlled by buttons;
+  - Noisy *button sensor*, affecting the indicator lights on the desks and the TV hue.
 
 ## Training Agents
 
-Installation: `pip3 install -U robodesk`
-
 The environment follows the [OpenAI Gym][gym] interface:
 
 ```py
@@ -48,9 +61,11 @@ while not done:
 
 ## Tasks
 
-![Robodesk Tasks](https://i.imgur.com/OwTT2pk.gif)
+Crucially, the `tv_green_hue` task gives reward based on a distractor (the greenness of the TV image), but its optimal strategy is independent of the distractor state (i.e., simply always push the button).
 
-The behaviors above were learned using the [Dreamer](https://github.com/danijar/dreamer) agent. These policies have been learned from scratch and only from pixels, not proprioceptive states.
+For an environment with **all** distractors and `tv_green_hue`, here is an example of a signal-noise factorization identified by a [Denoised MDP](https://ssnl.github.io/denoised_mdp) model:
+
+https://user-images.githubusercontent.com/5674597/172927710-84c805dd-4326-4064-9079-237c26102812.mp4
 
 | Task | Description |
 | :-------- | :---------- |
@@ -63,14 +78,29 @@ The behaviors above were learned using the [Dreamer](https://github.com/danijar/
 | `flat_block_in_shelf` | Push the green flat block into the shelf, navigating around the other blocks.  |
 | `lift_upright_block` | Grasp the blue upright block and lift it above the table.  |
 | `lift_ball` | Grasp the magenta ball and lift it above the table. |
+| `tv_green_hue` | Push the green button to shift the TV hue toward green. |
 
 
 ## Environment Details
 
 ### Constructor
 
+Two entry points are available:
++ `robodesk.RoboDesk`: A regular RoboDesk environment with a top-down camera view of the desk.
++ `robodesk.RoboDeskWithTV`: A RoboDesk environment with a TV placed in the scene and a more distant camera view that covers three desks and the TV. The leftmost desk is where the robot operates.
+
 ```py
-robodesk.RoboDesk(task='open_slide', reward='dense', action_repeat=1, episode_length=500, image_size=64)
+# Regular environment
+robodesk.RoboDesk(task='open_slide', reward='dense', action_repeat=1,
+                  episode_length=500, image_size=64)
+
+# Environment with noisy camera
+robodesk.RoboDesk(task='open_slide', reward='dense', action_repeat=1,
+                  episode_length=500, image_size=64, distractors={'camera'})
+
+# Environment with a TV in scene and all distractors turned on
+robodesk.RoboDeskWithTV(task='open_slide', reward='dense', action_repeat=1,
+                        episode_length=500, image_size=64, distractors="all")
 ```
 
 | Parameter | Description |
@@ -80,6 +110,8 @@ robodesk.RoboDesk(task='open_slide', reward='dense', action_repeat=1, episode_le
 | `action_repeat` | Reduces the control frequency by applying each action multiple times. This is faster than using an environment wrapper because only the needed images are rendered. |
 | `episode_length` | Time limit for the episode, can be `None`. |
 | `image_size` | Size of the image observations in pixels, used for both height and width. |
+| `distractors` | Configures distractors. `"all"` turns on all distractors, `"none"` turns off all of them. <br> Can also be a subset of the available distractors: `{'camera', 'env_light', 'button'}` for `RoboDesk`, and `{'camera', 'env_light', 'button', 'tv'}` for `RoboDeskWithTV`. |
+| `tv_video_file_pattern`<br>(Only for `RoboDeskWithTV`) | Glob pattern that specifies the TV video files to use. Requires [`scikit-video`](http://www.scikit-video.org/stable/) to be installed. |
 
 ### Reward
 

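Review note: the `distractors` row in the README table above accepts `"all"`, `"none"`, or a subset of names. A minimal sketch of how such a value could be normalized and validated is shown here. The helper `resolve_distractors` and the two constants are hypothetical names introduced for illustration only and are not taken from this PR; only the distractor names themselves come from the README.

```python
# Hypothetical helper, NOT the implementation in this PR: it only mirrors
# the behaviour described in the README parameter table.

# Distractor names as listed in the README (the constant names are assumptions).
ROBODESK_DISTRACTORS = frozenset({'camera', 'env_light', 'button'})
ROBODESK_WITH_TV_DISTRACTORS = ROBODESK_DISTRACTORS | {'tv'}


def resolve_distractors(spec, available):
    """Turn 'all', 'none', or an iterable of names into a validated frozenset."""
    if isinstance(spec, str):
        if spec == 'all':
            return frozenset(available)
        if spec == 'none':
            return frozenset()
        raise ValueError(f"expected 'all', 'none', or a set of names, got {spec!r}")
    chosen = frozenset(spec)
    unknown = chosen - frozenset(available)
    if unknown:
        raise ValueError(f"unknown distractors: {sorted(unknown)}")
    return chosen


# Usage: 'tv' is only valid for the TV variant of the environment.
print(sorted(resolve_distractors('all', ROBODESK_WITH_TV_DISTRACTORS)))
print(sorted(resolve_distractors({'camera'}, ROBODESK_DISTRACTORS)))
```

Rejecting unknown names early would surface a misspelled distractor at construction time rather than silently producing a noise-free environment.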
diff --git a/animation.gif b/animation.gif
diff --git a/robodesk/__init__.py b/robodesk/__init__.py
@@ -1 +1,3 @@
-from .robodesk import RoboDesk
+from .robodesk import RoboDesk, RoboDeskWithTV
+
+__all__ = ['RoboDesk', 'RoboDeskWithTV']
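Review note on the `desk.xml` hunk that follows: its NOTE [ Button to Light Mechanism ] comment describes an overlay cylinder that always sits above the blocker and whose rgba is changed programmatically from a noisy button reading. The sketch below checks the geometry using the values in that hunk (a MuJoCo cylinder's `size` is radius then half-height) and shows one possible press-to-alpha mapping. The function names, the Gaussian noise model, the [0, 1] press range, and the 0.4 maximum alpha are assumptions for illustration, not the code in this PR.

```python
import random

# Values read from the desk.xml hunk (positions and cylinder half-heights).
BLOCKER_CENTER_Z = 0.011      # black blocker cylinder, pos="0 0 .011"
BLOCKER_HALF_HEIGHT = 0.001   # size=".06 .001"
OVERLAY_CENTER_Z = 0.005      # overlay cylinder, pos="0 0 0.005"
OVERLAY_HALF_HEIGHT = 0.0075  # size=".06 .0075"


def cylinder_top(center_z, half_height):
    """Top surface height of an upright cylinder."""
    return center_z + half_height


def overlay_is_visible():
    """The overlay shows only if its top rises above the blocker's top."""
    return (cylinder_top(OVERLAY_CENTER_Z, OVERLAY_HALF_HEIGHT)
            > cylinder_top(BLOCKER_CENTER_Z, BLOCKER_HALF_HEIGHT))


def noisy_overlay_alpha(press_amount, noise_std, rng, max_alpha=0.4):
    """Assumed mapping: perturb the press reading, clip to [0, 1], scale to alpha."""
    noisy = press_amount + rng.gauss(0.0, noise_std)
    clipped = min(max(noisy, 0.0), 1.0)
    return max_alpha * clipped


rng = random.Random(0)
print(overlay_is_visible())
print(noisy_overlay_alpha(0.5, 0.1, rng))
```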
diff --git a/robodesk/assets/desk.xml b/robodesk/assets/desk.xml
@@ -172,27 +172,49 @@
         <geom type="cylinder" size=".06 .001" pos="0 0 .011" rgba="0 0 0 1.0" group="1" contype="0" conaffinity="0"/>
         <geom type="cylinder" size=".08 .02" pos="0 0 .011" class="light_collision"/>
         <body>
-          <geom type="cylinder" size=".06 .0075" rgba="1 0 0 .4" group="1" mass="0.01"/>
+          <geom name="red_light_rise_cylinder" type="cylinder" size=".06 .0075" rgba="1 0 0 .4" group="1" mass="0.01"/>
           <joint name="red_light" type="slide" axis="0 0 -1"/>
         </body>
+        <!--
+          NOTE [ Button to Light Mechanism ]
+
+          The above light is implemented via a slider that is always in sync with the button. When the button is pressed
+          by X amount, the colored cylinder is raised by X amount (up to some scaling and offset) and becomes visible once
+          it is higher than a threshold defined by the height of the desk/holder. This emulates a light of that color.
+          However, this relies on MuJoCo's joint mechanism, which doesn't easily support "noisy" lights.
+
+          To model "noisy" lights, we add an alternative implementation: an overlay cylinder that is always higher
+          than the desk/holder, whose rgba we change programmatically based on the amount the button is pressed,
+          which can be perturbed by noise.
+
+          This alternative implementation is only used when noisy lights are enabled.
+        -->
+        <!-- `light_collision` is not rendered (group 4), so exceeding the blocker's height = 0.011 + 0.001 = 0.012 suffices -->
+        <!-- 0.012 - 0.0075 = 0.0045 -->
+        <geom name="red_light_overlay" pos="0 0 0.005" type="cylinder" size=".06 .0075" rgba="1 0 0 0" group="1"/>
+        <geom name="red_light_neg_overlay" pos="0 0 0.005" type="cylinder" size=".06 .0075" rgba="0.3 0.3 0.3 0" group="1"/>
       </body>
       <body name="green_light" pos="0 .15 0" childclass="light_visual">
         <geom type="mesh" mesh="lights_holder" euler="-1.57 0 3.14"/>
         <geom type="cylinder" size=".06 .001" pos="0 0 .011" rgba="0 0 0 1.0" group="1" contype="0" conaffinity="0"/>
         <geom type="cylinder" size=".08 .02" pos="0 0 .011" class="light_collision"/>
         <body>
-          <geom type="cylinder" size=".06 .0075" rgba="0 1 0 .4" group="1" mass="0.01"/>
+          <geom name="green_light_rise_cylinder" type="cylinder" size=".06 .0075" rgba="0 1 0 .4" group="1" mass="0.01"/>
           <joint name="green_light" type="slide" axis="0 0 -1"/>
         </body>
+        <geom name="green_light_overlay" pos="0 0 0.005" type="cylinder" size=".06 .0075" rgba="0 1 0 0" group="1" mass="0.01"/>
+        <geom name="green_light_neg_overlay" pos="0 0 0.005" type="cylinder" size=".06 .0075" rgba="0.3 0.3 0.3 0" group="1" mass="0.01"/>
       </body>
       <body name="blue_light" pos=".2 .15 0" childclass="light_visual">
         <geom type="mesh" mesh="lights_holder" euler="-1.57 0 3.14"/>
         <geom type="cylinder" size=".06 .001" pos="0 0 .011" rgba="0 0 0 1.0" group="1" contype="0" conaffinity="0"/>
         <geom type="cylinder" size=".08 .02" pos="0 0 .011" class="light_collision"/>
         <body>
-          <geom type="cylinder" size=".06 .0075" rgba="0 0 1 .4" group="1" mass="0.01"/>
+          <geom name="blue_light_rise_cylinder" type="cylinder" size=".06 .0075" rgba="0 0 1 .4" group="1" mass="0.01"/>
           <joint name="blue_light" type="slide" axis="0 0 -1"/>
         </body>
+        <geom name="blue_light_overlay" pos="0 0 0.005" type="cylinder" size=".06 .0075" rgba="0 0 1 0" group="1" mass="0.01"/>
+        <geom name="blue_light_neg_overlay" pos="0 0 0.005" type="cylinder" size=".06 .0075" rgba="0.3 0.3 0.3 0" group="1" mass="0.01"/>
       </body>
     </body>