I’ve been working on `benchmax`, an open-source framework for building, running, and parallelizing environments for fine-tuning LLMs with reinforcement learning.
What I wanted to solve for:
- Environments are tightly coupled with RL trainers, leading to fragmentation and limited compatibility.
- These coupled environments tend to be mostly competitive math and coding → for OSS RL + LLMs to scale, we need more complex, real-world environments.
- Scaling these environments in parallel is still not easy.
What I'm excited about:
- benchmax is training-framework agnostic, with adapters already built out for verl and verifiers. We’re gonna build more adapters for other frameworks (e.g. SkyRL), instead of forcing others to adopt our standard (though ofc they’re welcome to!)
- benchmax comes with a few interesting environments out of the box: spreadsheet processing, CRM, etc. → more coming soon!
- benchmax supports MCP as a first-class citizen. There has been an explosion of MCP servers/tools built for use cases ranging from browser use to Excel to game creation. `benchmax` lets folks leverage and compose these existing MCP servers to build environments integrated with real-world systems.
- Multi-node environment parallelization coming soon!
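To make the trainer-agnostic idea concrete, here’s a hypothetical sketch (not benchmax’s actual API — the names `EchoEnv`, `StepResult`, `reset`, and `step` are illustrative assumptions): if an environment exposes a small reset/step surface, any RL trainer can drive it through a thin adapter.

```python
from dataclasses import dataclass

# Hypothetical sketch only -- NOT benchmax's actual API.
# The idea: a trainer-agnostic environment exposes a minimal
# reset/step interface, so adapters for different RL trainers
# (verl, verifiers, ...) just translate to/from this surface.

@dataclass
class StepResult:
    observation: str  # next observation shown to the policy
    reward: float     # scalar reward for the action taken
    done: bool        # whether the episode has ended

class EchoEnv:
    """Toy environment: rewards the model for echoing the prompt."""

    def reset(self) -> str:
        # Start a new episode and return the initial observation.
        self.prompt = "hello"
        return self.prompt

    def step(self, action: str) -> StepResult:
        # Score the model's action; this toy task ends after one step.
        reward = 1.0 if action == self.prompt else 0.0
        return StepResult(observation="", reward=reward, done=True)

# A trainer adapter only ever needs these two calls:
env = EchoEnv()
obs = env.reset()
result = env.step(obs)
print(result.reward)
```

Because the environment knows nothing about the trainer, the same `EchoEnv` could be wrapped once per framework rather than rewritten for each.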
If you like what you see, feel free to *star* the *repo* to support the project!! Our hope is to really let anyone benchmax on their tasks, with benchmax.
https://github.com/cgftinc/benchmax
It’s still very early! And I expect to be shipping a lot more → more environments, more trainer integrations. Would love y’all’s thoughts on what environments and trainer integrations could be interesting!