This is a multi-agent reinforcement learning simulation I’ve been building as a personal project.
It’s a grid-based combat environment with a per-agent PPO training runtime.
Some of the things I’ve been experimenting with:
– Per-agent PPO (isolated optimizers per agent)
– Runtime checkpointing and resume chains
– Live CSV telemetry logging in headless mode
– Config-driven experiment control
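
To make the telemetry item above concrete, here is a minimal sketch of live CSV logging using only the Python standard library. It is not the repo's actual code: the `TelemetryLogger` class and the column names (`tick`, `agent_id`, `reward`) are assumptions picked for illustration. The idea is to write one row per event and flush immediately, so a headless run can be followed from another process while it is still training.

```python
import csv
import io
from typing import Dict, Iterable, TextIO


class TelemetryLogger:
    """Hypothetical helper: writes one CSV row per event and flushes right away."""

    def __init__(self, stream: TextIO, fields: Iterable[str]) -> None:
        self._stream = stream
        self._writer = csv.DictWriter(stream, fieldnames=list(fields))
        self._writer.writeheader()
        self._stream.flush()

    def log(self, row: Dict[str, object]) -> None:
        self._writer.writerow(row)
        # Flush after every row so readers of the file see it without waiting
        # for the run to finish.
        self._stream.flush()


def demo() -> str:
    # An in-memory buffer stands in for a real file here.
    buffer = io.StringIO()
    logger = TelemetryLogger(buffer, ["tick", "agent_id", "reward"])
    logger.log({"tick": 0, "agent_id": "a0", "reward": 0.5})
    logger.log({"tick": 0, "agent_id": "a1", "reward": -1.0})
    return buffer.getvalue()
```

For a real run the stream would be a file opened with `open(path, "w", newline="")`, which is the mode the `csv` module documentation recommends.
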
The repo includes the simulation engine, PPO runtime, and telemetry tooling.
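
As a rough illustration of what a checkpoint resume chain could look like, here is a sketch under my own assumptions rather than the repo's format. Each checkpoint is a JSON file that records the name of the checkpoint it was resumed from, so the full lineage of a run can be walked back to its start. The function names, the `ckpt_<step>.json` naming scheme, and the `state` payload are all hypothetical; a real PPO runtime would store per-agent model and optimizer state instead of a small dict.

```python
import json
import tempfile
from pathlib import Path
from typing import List, Optional


def save_checkpoint(directory: Path, step: int, state: dict, parent: Optional[str]) -> str:
    """Write one checkpoint and return its file name; `parent` links it to its predecessor."""
    name = f"ckpt_{step:08d}.json"
    payload = {"step": step, "parent": parent, "state": state}
    tmp = directory / (name + ".tmp")
    tmp.write_text(json.dumps(payload))
    # Write to a temporary file first, then rename it into place, so a crash
    # mid-write does not leave a truncated checkpoint under the final name.
    tmp.replace(directory / name)
    return name


def latest_checkpoint(directory: Path) -> Optional[str]:
    """Return the newest checkpoint name; zero-padded steps sort lexicographically."""
    names = sorted(p.name for p in directory.glob("ckpt_*.json"))
    return names[-1] if names else None


def resume_chain(directory: Path, name: str) -> List[str]:
    """Follow parent links back to the first checkpoint; return the chain oldest first."""
    chain: List[str] = []
    current: Optional[str] = name
    while current is not None:
        chain.append(current)
        current = json.loads((directory / current).read_text())["parent"]
    return chain[::-1]


def demo() -> List[str]:
    with tempfile.TemporaryDirectory() as tmp:
        directory = Path(tmp)
        save_checkpoint(directory, 100, {"agent_0": [0.1]}, parent=None)
        # Simulate a restart: find the newest checkpoint and continue from it.
        resumed_from = latest_checkpoint(directory)
        second = save_checkpoint(directory, 200, {"agent_0": [0.2]}, parent=resumed_from)
        return resume_chain(directory, second)
```

Storing the parent link inside each file keeps the chain self-describing, so no separate index is needed to reconstruct which runs fed into which.
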