I built this as an experimental platform for large-scale multi-agent combat simulation and multi-agent reinforcement learning (RL).
It uses PyTorch for the vectorized simulation, per-agent neural policies, and PPO training, and it also includes checkpoint/resume, telemetry, and a Pygame viewer.
The repo has a detailed README and technical docs.
I’d especially value feedback on the simulation design, training setup, observability, and overall architecture.
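
For anyone reviewing the training setup who wants a quick refresher on PPO, here is a minimal sketch of the standard clipped surrogate objective in plain Python. It is illustrative only and is not code from the repo: the function name `ppo_clipped_objective`, its argument names, and the default `clip_eps=0.2` are placeholders chosen for this example, and a real implementation would operate on batched PyTorch tensors instead of lists.

```python
import math


def ppo_clipped_objective(logp_new, logp_old, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective over a batch (to be maximized).

    All names here are hypothetical; inputs are plain lists of floats:
    log-probabilities of the taken actions under the new and old policies,
    and the corresponding advantage estimates.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(logp_new, logp_old, advantages):
        # Probability ratio between the new and old policy for this action.
        ratio = math.exp(lp_new - lp_old)
        # Keep the ratio inside [1 - clip_eps, 1 + clip_eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

The training loss is usually the negative of this value, often combined with a value-function loss and an entropy bonus. When the new and old policies agree, the ratio is 1 and the objective reduces to the mean advantage.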