Predator-Prey-Grass

Emerging coevolution and cooperation through multi-agent deep reinforcement learning

This project studies how cooperative behavior emerges and stabilizes in a spatial, resource-limited ecosystem by combining within-lifetime Multi-Agent Deep Reinforcement Learning (MADRL) with population-level ecological and evolutionary dynamics. It explores the interplay between nature (traits inherited through reproduction and mutation) and nurture (behavior learned through reinforcement learning) in a dynamic ecosystem of Predators, Prey, and regenerating Grass. Agents differ in speed, vision, energy metabolism, and decision policies, offering ground for open-ended adaptation. At its core lies a gridworld simulation in which agents are not just trained: they are born, age, reproduce, die, and even mutate in a continuously changing environment.

Emerging human cooperative hunting of Mammoths

Environment:

  • Mammoth hunting: Mammoths are hunted and eaten by the human(s) in their Moore neighborhood only if the cumulative human energy is strictly larger than the mammoth's energy. On failure (cumulative human energy too low), each human optionally loses energy proportional to its share of the attacking group's energy (energy_percentage_loss_per_failed_attacked_prey). On success, the prey's energy is split among the attackers (proportionally by default, or equally via team_capture_equal_split). (implementation)
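The capture rule above can be sketched as follows. This is a minimal illustration, not the repository's actual API: the function name, the `equal_split` flag, and the `fail_loss_frac` parameter are illustrative stand-ins for the configuration options named in the text.

```python
def attempt_mammoth_capture(human_energies, mammoth_energy,
                            equal_split=False, fail_loss_frac=0.1):
    """Sketch of the cooperative capture rule: the attackers succeed only
    if their cumulative energy strictly exceeds the mammoth's energy."""
    total = sum(human_energies)
    if total > mammoth_energy:  # strict inequality required for success
        if equal_split:
            # optional equal split of the prey's energy among attackers
            gains = [mammoth_energy / len(human_energies)] * len(human_energies)
        else:
            # default: proportional split by each attacker's energy share
            gains = [mammoth_energy * e / total for e in human_energies]
        return True, gains
    # failure: each attacker optionally loses energy in proportion to
    # its own contribution to the group's energy (assumed form)
    losses = [-fail_loss_frac * e for e in human_energies]
    return False, losses
```

Note the strict inequality: two humans with a combined energy exactly equal to the mammoth's would still fail the hunt.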

Other environments:

  • Base environment: The two-policy base environment. (results)

  • Mutating agents: A four-policy extension of the base environment. (results)

  • Centralized training: A single-policy variant of the base environment

  • Walls occlusion: An extension with walls and occluded vision

  • Reproduction kick-back rewards: On top of direct reproduction rewards, agents receive indirect rewards when their children reproduce

  • Lineage rewards: On top of direct reproduction rewards, agents receive rewards when their offspring survives over time

  • Shared prey: Very similar in logic to mammoth hunting, but here a prey's typical energy level is smaller than a predator's, whereas in mammoth hunting prey typically possess more energy than predators.
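The kick-back and lineage reward variants above both propagate reproduction credit up the family tree. A minimal sketch of the kick-back idea follows; the class, the reward magnitudes, and the kick-back fraction are illustrative assumptions, not values from the repository:

```python
class Agent:
    """Toy agent that tracks its parent and accumulated reward."""
    def __init__(self, parent=None):
        self.parent = parent
        self.reward = 0.0

def reproduce(parent, direct_reward=10.0, kickback=0.2):
    """The reproducing agent earns the direct reproduction reward;
    its own parent (the grandparent, if present) earns an indirect
    'kick-back' reward for its child's reproduction."""
    child = Agent(parent=parent)
    parent.reward += direct_reward
    if parent.parent is not None:
        parent.parent.reward += kickback * direct_reward
    return child
```

Lineage rewards differ in the trigger: instead of paying ancestors at the moment a descendant reproduces, agents are rewarded as their offspring survive over time.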

Experiments:

  • Testing the Red Queen Hypothesis in the co-evolutionary setting of (non-mutating) predators and prey (implementation, results)

  • Testing the Red Queen Hypothesis in the co-evolutionary setting of mutating predators and prey (implementation, results)

Hyperparameter tuning

  • Hyperparameter tuning of the base environment with Population-Based Training (implementation)

Installation of the repository

Editor used: Visual Studio Code 1.107.0 on Linux Mint 22.0 Cinnamon

  1. Clone the repository:
    git clone https://github.com/doesburg11/PredPreyGrass.git
  2. Open Visual Studio Code and execute:
    • Press Ctrl+Shift+P
    • Type and choose: "Python: Create Environment..."
    • Choose environment: Conda
    • Choose interpreter: Python 3.11.13 or higher
    • Open a new terminal
    • pip install -e .
  3. Install the additional system dependency for Pygame visualization:
    • conda install -y -c conda-forge gcc=14.2.0

Quick start

Run the pre-trained policy in a Visual Studio Code terminal:

python ./src/predpreygrass/rllib/base_environment/evaluate_ppo_from_checkpoint_debug.py

Or a random policy:

python ./src/predpreygrass/rllib/base_environment/random_policy.py
