Our demo runs:
- Gemini 2.5 Flash as the fast, lightweight model
- Qwen3-Coder-480B as the more powerful planner and code generator
That said, you can configure whichever combo works best for your use case. Forge aims to offer better performance, lower costs, and full control through self-hosting options.
You can find our quickstart repo here: https://github.com/TensorBlock/claude-code-forge
Would love to hear your thoughts on this!