A generalized, high-performance reinforcement learning agent based on DeepMind's AlphaZero algorithm.
This repository provides a clean, modular, and highly parallelized implementation of the AlphaZero algorithm, originally introduced by DeepMind in the groundbreaking paper Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm. The agent is a generalized game player capable of mastering symmetric zero-sum games solely through self-play and reinforcement learning.
By combining Monte Carlo Tree Search (MCTS) with a deep neural network that evaluates board states and policy distributions, this agent iteratively improves without any human data or domain-specific heuristics.
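As a rough illustration of how the tree search and the network's policy priors interact, the selection step of MCTS typically uses the PUCT rule: each child is scored by its mean value plus an exploration bonus scaled by the network's prior. The function names and the constant below are illustrative, not this repository's actual API:

```python
import math

def puct_score(parent_visits, child_visits, child_value_sum, prior, c_puct=2.0):
    """PUCT score used to pick a child during MCTS tree descent.

    Combines the child's mean value (exploitation) with its network
    prior scaled by the parent's visit count (exploration).
    """
    # Mean value of the child; unvisited children default to 0.
    q = 0.0 if child_visits == 0 else child_value_sum / child_visits
    # Exploration term: favors high-prior, rarely visited children.
    u = c_puct * prior * math.sqrt(parent_visits) / (1 + child_visits)
    return q + u

def select_child(children):
    """Pick the action maximizing the PUCT score.

    `children` maps action -> (visits, value_sum, prior); the parent
    visit count is taken as the sum of child visits plus one.
    """
    parent_visits = 1 + sum(v for v, _, _ in children.values())
    return max(
        children,
        key=lambda a: puct_score(parent_visits, *children[a]),
    )
```

Descending the tree with this rule, expanding a leaf with the network, and backing up the value estimate is what lets the agent improve without hand-crafted heuristics.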
- 🧠 Generalized Self-Play: The agent plays against itself, dynamically generating high-quality training data.
- ⚡ Parallelized Execution: Leverages multiprocessing for tree search and self-play (`Alpha_MCTS_Parallel.py`, `Alpha_Zero_Parallel.py`), massively accelerating data generation.
- 🎮 Multi-Game Support: Pluggable game environment architecture, easily extensible to various board games (e.g., TicTacToe, ConnectFour).
- 📊 Intuitive Web Interface: Includes a fully-featured Streamlit Web UI for seamless interaction, configuration, and game monitoring.
- 💾 Automated Checkpointing: Built-in utilities (`save_games.py`, `train_from_saved_games.py`) for caching game histories and resuming training runs seamlessly.
- 🏟️ Competitive Arena: Pit different neural network models against each other in `Arena.py` to evaluate Elo and true model performance.
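The parallel self-play feature boils down to fanning independent games out across worker processes and pooling their move histories. A minimal sketch with the standard library follows; the stand-in "game" and function names here are illustrative, not the repository's actual classes:

```python
import random
from multiprocessing import Pool

def play_one_game(seed):
    """Stand-in self-play worker returning (state, outcome) pairs.

    A real worker would run MCTS at every move; a seeded random
    playout of a toy game keeps this example self-contained.
    """
    rng = random.Random(seed)
    states = [rng.randrange(9) for _ in range(5)]  # fake board states
    outcome = rng.choice([-1, 0, 1])               # fake game result
    return [(s, outcome) for s in states]

def generate_training_data(num_games, workers=4):
    """Run many self-play games in parallel and merge their samples."""
    with Pool(workers) as pool:
        games = pool.map(play_one_game, range(num_games))
    # Flatten per-game histories into one training dataset.
    return [sample for game in games for sample in game]
```

Because each game is independent, throughput scales roughly with the number of worker processes, which is the main win of the parallel variants.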
The implementation is heavily inspired by the following foundational research papers:
- Mastering the game of Go without human knowledge
- Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm
- Human-level control through deep reinforcement learning
- Python: Version `3.12` or higher is required.
- Clone the repository:

  ```bash
  git clone https://github.com/amanmoon/general_alpha_zero.git
  cd general_alpha_zero
  ```

- Create and activate a virtual environment:

  ```bash
  # Create a virtual environment with Python 3.12+
  python3 -m venv venv

  # Activate on macOS/Linux
  source venv/bin/activate

  # Activate on Windows
  venv\Scripts\activate
  ```

- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```
We provide an interactive web interface for playing against the models and exploring the environments.
- Ensure your virtual environment is active and all dependencies are installed.
- Launch the Streamlit server:

  ```bash
  streamlit run app.py
  ```
- Open the provided `localhost` URL in your browser to start exploring!
- Open `Train.py`.
- Import the desired game class and configure the hyperparameters dictionary (`args`).
- Execute the training script:

  ```bash
  python3 Train.py
  ```

Tip: Use `train_from_saved_games.py` to bootstrap learning from previously serialized MCTS explorations!
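To give a feel for what configuring the `args` dictionary might involve, here is an illustrative example; the key names and values are assumptions for illustration, not the exact schema `Train.py` expects:

```python
# Illustrative hyperparameters for a training run. The actual keys
# consumed by Train.py may differ; consult the script itself.
args = {
    "EXPLORATION_CONSTANT": 2.0,  # c_puct in the PUCT selection rule
    "NO_OF_SEARCHES": 100,        # MCTS simulations per move
    "NO_OF_ITERATIONS": 10,       # outer self-play / training iterations
    "SELF_PLAY_GAMES": 100,       # games generated per iteration
    "EPOCHS": 4,                  # training epochs per iteration
    "BATCH_SIZE": 64,
    "TEMPERATURE": 1.25,          # move-selection temperature in self-play
    "MODEL_PATH": "models/",      # where checkpoints are written
}
```

Raising `NO_OF_SEARCHES` generally improves move quality at the cost of slower data generation, which is the main trade-off to tune.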
Want to test your skills against the neural network?
- Open `Play.py`.
- Select your desired model checkpoint and configure the MCTS simulation count for the AI's turns.
- Run the script:

  ```bash
  python3 Play.py
  ```
To determine which iteration of your neural network is stronger, let them battle:
- Open `Arena.py` and assign the respective paths to the models you wish to evaluate.
- Run the tournament:

  ```bash
  python3 Arena.py
  ```
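At its core, a model-vs-model arena is a match loop that alternates which model moves first and tallies the results. A minimal, game-agnostic sketch follows; the agent objects and the `play_game` result convention here are assumptions, not `Arena.py`'s actual interface:

```python
def play_match(agent_a, agent_b, num_games, play_game):
    """Alternate who moves first each game and tally results for agent_a.

    `play_game(first, second)` is assumed to return +1 if `first` wins,
    -1 if `second` wins, and 0 for a draw. Alternating seats removes
    any first-move advantage from the comparison.
    """
    wins = draws = losses = 0
    for g in range(num_games):
        if g % 2 == 0:
            result = play_game(agent_a, agent_b)
        else:
            # agent_a moved second, so flip the sign of the result.
            result = -play_game(agent_b, agent_a)
        if result > 0:
            wins += 1
        elif result == 0:
            draws += 1
        else:
            losses += 1
    return wins, draws, losses
```

The resulting win/draw/loss record is what an Elo-style rating update would be computed from.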
```
├── Games/                   # Game logic implementations (TicTacToe, Connect4, etc.)
├── Alpha_MCTS.py            # Core Monte Carlo Tree Search logic
├── Alpha_MCTS_Parallel.py   # Asynchronous MCTS implementation
├── Alpha_Zero.py            # Self-play and neural network training loops
├── Alpha_Zero_Parallel.py   # High-performance parallelized self-play
├── Arena.py                 # Model vs Model evaluation environment
├── Play.py                  # Human vs AI interactive script
├── Train.py                 # Entry point for training from scratch
├── app.py                   # Streamlit frontend application
└── requirements.txt         # Python dependency specifications
```
If you are a recruiter, researcher, or just an enthusiast who wants to discuss reinforcement learning, AI architecture, or optimal search algorithms, I'd love to connect!
Email: amanmoon099@gmail.com