@aryansid

Summary

The Chess arena lets agents modify the Kojiro chess engine codebase (evaluation, search, move ordering, etc.) and then compete in head-to-head matches run with Fastchess.

Implementation

Core Methods

  • validate_code(): Compiles Kojiro engine and verifies executable exists
  • execute_round(): Compiles engines, builds pairings, runs matches in parallel
  • get_results(): Parses PGN files, aggregates match results, determines winners
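
As a rough illustration of the PGN-parsing step in get_results(), here is a minimal sketch that reads only the standard PGN header tags (White, Black, Result) with the standard library. The function name `parse_pgn_games` is hypothetical, and this is not the code in this PR, which may well use a different parser or a PGN library.

```python
import re

# Matches a PGN tag pair such as: [Result "1-0"]
TAG_RE = re.compile(r'^\[(\w+)\s+"(.*)"\]$')


def parse_pgn_games(pgn_text):
    """Return a list of (white, black, result) tuples, one per game.

    Hypothetical helper: assumes each game starts with an [Event ...] tag,
    as in standard PGN export order. A game with no Result tag is reported
    as "*" (the PGN marker for an unfinished game).
    """
    games = []
    current = None
    for raw_line in pgn_text.splitlines():
        match = TAG_RE.match(raw_line.strip())
        if not match:
            continue  # movetext or blank line
        tag, value = match.groups()
        if tag == "Event":
            # A new Event tag closes the previous game, if any.
            if current is not None:
                games.append(current)
            current = {}
        if current is not None:
            current[tag] = value
    if current is not None:
        games.append(current)
    return [
        (g.get("White"), g.get("Black"), g.get("Result", "*"))
        for g in games
    ]
```

Only the header tags are read here; the movetext is skipped, which is enough for aggregating results but not for validating the moves themselves.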

Key Design Choices

  • Each match consists of 2 games (one with each color) to eliminate first-move advantage (this is the Fastchess default)
  • The winner of a match gets 1 point. A draw counts as 0 points for both players
  • Executable name fixed as 'Kojiro'
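
The scoring rule above can be sketched as follows. This is an illustration under stated assumptions, not the PR's implementation: the function name `score_match` is hypothetical, per-game points use the usual 1 / 0.5 / 0 convention, and unfinished games (result "*") are assumed to contribute nothing.

```python
def score_match(player_a, player_b, games):
    """Return match points for a two-player match.

    games: list of (white, black, result) tuples, where result is one of
    the PGN result strings "1-0", "0-1", "1/2-1/2", or "*".
    The match winner gets 1 point; a drawn match gives 0 to both.
    """
    game_points = {player_a: 0.0, player_b: 0.0}
    for white, black, result in games:
        if result == "1-0":
            game_points[white] += 1.0
        elif result == "0-1":
            game_points[black] += 1.0
        elif result == "1/2-1/2":
            game_points[white] += 0.5
            game_points[black] += 0.5
        # Assumption: any other result (e.g. "*") adds no points.

    a, b = game_points[player_a], game_points[player_b]
    if a > b:
        return {player_a: 1, player_b: 0}
    if b > a:
        return {player_a: 0, player_b: 1}
    return {player_a: 0, player_b: 0}
```

With this rule, a player who wins one game and draws the other takes the match, while two wins with the same color (one for each player) cancel out to a drawn match.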

Error Handling
Agents that fail to compile are skipped. Incomplete or crashed games and PGN parsing errors are also handled.
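
One way to skip agents that failed to compile is to check for the fixed executable name before building pairings. The sketch below assumes each agent has its own build directory containing a 'Kojiro' binary; the directory layout and the helper name `compiled_agents` are assumptions for illustration, not taken from the PR.

```python
import os
from pathlib import Path

EXECUTABLE_NAME = "Kojiro"  # fixed executable name, per the design choices above


def compiled_agents(agent_dirs):
    """Split agents into (ready, skipped) lists.

    agent_dirs: dict mapping agent name -> build directory (hypothetical layout).
    An agent is ready only if its directory holds an executable file
    named EXECUTABLE_NAME.
    """
    ready, skipped = [], []
    for name, directory in agent_dirs.items():
        exe = Path(directory) / EXECUTABLE_NAME
        if exe.is_file() and os.access(exe, os.X_OK):
            ready.append(name)
        else:
            skipped.append(name)
    return ready, skipped
```

Only the agents in the first list would then be paired; the second list can be logged so that a compile failure is visible in the round results.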

@john-b-yang
Contributor

john-b-yang commented Jan 2, 2026

Hey @aryansid this is great progress! Main note (won't repeat the ones in #92):

Can you create a single repository that houses the chess engine + fastchess? Like you did for BattleCode 2024.

  • I see right now you're cloning two open source repos (e.g. this line).
  • Instead, can you create a https://github.com/CodeClash-ai/Chess repository that contains all the relevant code?
  • And once you do, update the Dockerfile to clone the CodeClash repo instead of the OS ones.

Can go into greater detail about why we do this, but tl;dr reasons:

  • Want to eliminate changes to 3rd party repos as a potential reason for CodeClash issues.
  • Gives us flexibility to make changes to the codebase in case we need to.
  • Want to enable a model to run their own bots for testing purposes.
    • In the current setup, models can see the Kojiro engine in /workspace within the Docker container, but it may not be obvious that fastchess is available.
    • I can see fastchess is invokable via CLI in the current Docker image, which is great, but for prior arenas, we've also kept the match-running source code around in case an LM wants to inspect it to understand game mechanics. (For chess, this arguably doesn't really matter b/c chess is so ubiquitous, but I'm in favor of following precedent 🙂)

When I run uv run python main.py configs/test/chess.yaml, things are running great. So tl;dr - the current logic is excellent, just two asks remain:

  • Put all relevant code into a github.com/CodeClash-ai/Chess repository. Update codeclash/arenas/chess.py to reflect the updated repo paths if necessary.
  • Add some tests

And it should be good to go - the implementation so far is great.

Screenshots from running python main.py configs/test/chess.yaml

[Three screenshots, taken 2026-01-01 at 10:38:48 PM, 10:39:11 PM, and 10:39:36 PM]
