mlbio-epfl/LaMer

Meta-RL Induces Exploration in Language Agents

Yulun Jiang*, Liangze Jiang*, Damien Teney, Michael Moor**, Maria Brbić**

Project page | Paper | BibTeX


This repo contains the source code of 🌊LaMer, a Meta-RL framework for training LLM agents to actively explore and adapt to their environment at test time.



Training

To train the LLM Agent with LaMer:

bash examples/minesweeper/lamer_minesweeper_qwen3_4b.sh

To train the LLM Agent with an RL baseline such as GiGPO:

bash examples/minesweeper/gigpo_minesweeper_qwen3_4b.sh

See the examples folder for more examples.
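
To see which launch scripts are available before picking one, a small helper like the one below may be handy. This is only a sketch: `list_example_scripts` is a hypothetical name and is not part of this repository, and it assumes that launch scripts are the `*.sh` files under the `examples` folder, as the two commands above suggest.

```shell
# Hypothetical helper, not shipped with LaMer.
# Prints every *.sh file under a directory (default: examples), sorted by path.
list_example_scripts() {
    local dir="${1:-examples}"
    if [ ! -d "$dir" ]; then
        # Report the problem on stderr and signal failure to the caller.
        echo "directory not found: $dir" >&2
        return 1
    fi
    find "$dir" -type f -name '*.sh' | sort
}
```

From the repository root, `list_example_scripts examples` prints one script path per line; any of those paths can then be passed to `bash` in the same way as the commands above.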



Acknowledgements

This work builds on verl, verl-agent, reflexion, and RAGEN. We thank the authors and contributors of these projects for sharing their valuable work.

Citing

If you find our code useful, please consider citing:

@article{jiang2025metarl,
    title={Meta-RL Induces Exploration in Language Agents},
    author={Yulun Jiang and Liangze Jiang and Damien Teney and Michael Moor and Maria Brbic},
    journal={arXiv preprint arXiv:2512.16848},
    year={2025}
}