This repository serves as the blueprint for implementing MARAProtocol. MARAProtocol is an open, language-agnostic interface built on Protocol Buffers that standardizes how interactive environments and learning agents exchange observations, actions, and rewards, regardless of which programming language or domain they use. By separating runtime orchestration from domain logic and shipping ready-made stubs, evaluation tooling, and a growing benchmark suite, it lets environment builders plug in new tasks quickly while enabling agent developers to test and compare algorithms across heterogeneous, high-bandwidth domains, from robotics to Autumn video-game worlds.
The protocols are stored in protocols/*.proto files.
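As an illustration only (the actual message and service definitions live in protocols/*.proto and may differ), a MARA-style environment service defined in Protobuf might look like the following hypothetical sketch:

```protobuf
syntax = "proto3";

package mara.example;

// Hypothetical messages -- the real definitions live in
// protocols/*.proto; names and fields here are illustrative.
message Observation {
  bytes payload = 1;   // domain-specific observation data
  double reward = 2;   // scalar reward for the last action
  bool done = 3;       // whether the episode has ended
}

message Action {
  bytes payload = 1;   // domain-specific action data
}

// An environment exposes a step-based interface that any
// gRPC-capable language can implement or call.
service Environment {
  rpc Reset(Action) returns (Observation);
  rpc Step(Action) returns (Observation);
}
```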
- Installation
- Examples
- License
- Python >= 3.12 (see note on 3.13 pre-release)
- protoc >= 3.0
- macOS/Linux (Windows instructions TBD)
To install these dependencies and generate the corresponding Python stubs and library for implementing environments and agents, please execute the following scripts:
```sh
git clone https://github.com/BasisResearch/MARAProtocol.git
cd MARAProtocol
sh setup_script_mac.sh        # installs deps
sh scripts/generate_python.sh # generates gRPC stubs into ./generated
```

We have a prebuilt interpreter for Python 3.13 on macOS. More prebuilts are available in the Autumn.cpp/Autumn.wasm repository. For any other version, please follow that repository's build guide.
Following this, please create a .env file based on the provided .env_sample. This file is mainly used to provide credentials for LLM agents.
Our supported LLM agents include (and the list will grow as needed): Ollama, OpenAI, Claude, MLX, and Gemini. We also support the OpenRouter framework for easily switching between different LLM providers.
You can also set the API key directly, for example:

```sh
export OPENROUTER_API_KEY="YOUR_API_KEY_HERE"
```

We list the corresponding providers below for a quick start.
| Provider | Docs / Quick-start |
|---|---|
| Ollama | REST API reference |
| OpenAI | API key setup |
| Claude (Anthropic) | Anthropic Developer Docs |
| Apple MLX | MLX GitHub |
| Google Gemini | Gemini API docs |
| OpenRouter | OpenRouter Quickstart |
Drop these variables into your local .env; the MARAProtocol tooling will load them automatically at runtime.
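As a sketch of what loading these credentials can look like (the helper below is illustrative, not MARAProtocol's actual loading code), an agent can read keys from the process environment once the .env values are exported:

```python
import os

def get_api_key(provider_env_var: str) -> str:
    """Fetch an API key from the environment, failing loudly if missing.

    provider_env_var is e.g. "OPENROUTER_API_KEY" or "OPENAI_API_KEY".
    """
    key = os.environ.get(provider_env_var)
    if not key:
        raise RuntimeError(
            f"{provider_env_var} is not set; add it to your .env "
            "or export it in your shell."
        )
    return key
```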
We provide several examples:
- A standard environment and random agent implementing MARAProtocol as a text adventure in python_examples/text_adventures.
- AutumnBench environments: Masked Frame Prediction (MFP), Change Detection (CD), and Planning, along with three types of agents: a random agent, an LLM-based agent, and an agent built with AutumnSynth in mind.
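To give a flavor of what a random agent does in these examples, here is a simplified sketch; the class and loop below are illustrative only (the real examples implement the generated MARAProtocol stubs rather than this ad-hoc dict-based interface):

```python
import random

class RandomAgent:
    """Illustrative random agent: picks uniformly among the actions
    the environment currently allows."""

    def __init__(self, seed=None):
        self.rng = random.Random(seed)

    def act(self, available_actions):
        return self.rng.choice(available_actions)

def run_episode(env, agent, max_steps=100):
    """Hypothetical episode loop against an environment exposing
    reset()/step() in the spirit of the protocol."""
    obs = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        action = agent.act(obs["available_actions"])
        obs = env.step(action)
        total_reward += obs["reward"]
        if obs["done"]:
            break
    return total_reward
```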
To run the AutumnBench examples, you first need to download the dataset. The dataset contains the tasks, programs, and prompts needed for the benchmark.
To download the public dataset, run the following script from the root of the repository:
```sh
./scripts/download_dataset.sh
```

This script requires the following tools to be installed:

- curl: The script uses curl to download files via HTTPS. It is pre-installed on most modern operating systems (like macOS and popular Linux distributions).
- jq: A lightweight and flexible command-line JSON processor. You can install it with brew install jq on macOS or find other installation methods here.
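A quick way to verify both prerequisites before running the download script (a convenience check, not part of the repository's scripts):

```shell
# Check that curl and jq are on PATH before running the download script.
for tool in curl jq; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "$tool: found"
  else
    echo "$tool: MISSING -- install it first" >&2
  fi
done
```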
After the script finishes, the dataset will be available in the python_examples/autumnbench/example_benchmark/ directory.
We also provide an example benchmark for Autumn that consists of a single program; it is stored in Example Benchmark.
Once this is done, you can run the benchmark with any one of the following agents. Note that the protobuf definitions are originally meant for creating gRPC services for a language-agnostic interface; however, for simplicity, we use them locally.
```sh
python -m python_examples.autumnbench.run_no_server +experiment=debug data_dir=$(pwd)/python_examples/autumnbench/example_benchmark
```

More configurable parameters can be found in python_examples/autumnbench/conf/config.yaml. You can either specify a new experiment config in conf/experiments/ or specify them at runtime. For example, to change the render mode you can simply run the following command.
```sh
python -m python_examples.autumnbench.run_no_server +experiment=debug data_dir=$(pwd)/python_examples/autumnbench/example_benchmark render_mode=image
```

If you would like to run with another model (say Claude 4 Opus) on OpenRouter, you can do the following:
```sh
python -m python_examples.autumnbench.run_no_server +experiment=debug data_dir=$(pwd)/python_examples/autumnbench/example_benchmark model="anthropic/claude-opus-4"
```

You can also configure the environments you want to run on by changing the list in the envs parameter. For the task_name parameter, the supported options are (mfp, cd, planning).
The main agent is the UnifiedReactAgent defined in llm_agent.py, with some of the prompts defined in prompts.py. The task types themselves are defined in concrete_envs.py. New environments should be added to the AutumnBenchmark repo directly.
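The ReAct pattern behind the LLM agent can be sketched as follows. This is a simplified illustration, not the actual UnifiedReactAgent; call_llm stands in for whichever provider client is configured, and the "Thought:"/"Action:" line format is assumed for the sketch:

```python
def react_loop(call_llm, env_step, initial_observation, max_turns=10):
    """Minimal ReAct-style loop: the model alternates free-form
    reasoning ("Thought: ...") with environment actions ("Action: ..."),
    and each action's result is fed back as the next observation."""
    transcript = f"Observation: {initial_observation}"
    for _ in range(max_turns):
        reply = call_llm(transcript)   # model emits a thought and an action
        transcript += "\n" + reply
        action = None
        for line in reply.splitlines():
            if line.startswith("Action:"):
                action = line[len("Action:"):].strip()
        if action is None or action == "finish":
            break
        observation = env_step(action)  # execute the action in the environment
        transcript += f"\nObservation: {observation}"
    return transcript
```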
We currently provide the following three agents:
- "autumn_llm_unified_interactive_agent_v1" # LLM-based agent
- "autumn_random_interactive_agent_v1" # Random agent
- "autumn_simple_wm_agent" # Oracle autumnSynth agent
You can select the default desired agent in config.yaml.
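For instance, the selection could look like the fragment below; the key names here are assumptions, so verify them against python_examples/autumnbench/conf/config.yaml:

```yaml
# Hypothetical fragment -- check the actual keys in
# python_examples/autumnbench/conf/config.yaml
agent: "autumn_random_interactive_agent_v1"
envs: [mfp, cd, planning]
```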
This project is licensed under the MIT License - see the LICENSE file for details.
Dat Nguyen, Moksh Jain, Yichao Liang, Archana Warrier, Michelangelo Naim, Cambridge Yang, Zenna Tavares
Open source contributors: @hellohawaii