I've been working with LLMs in production for a while both as a solo dev building apps for clients and working at an AI startup. The one thing that always was a pain was to pay OpenAI/Gemini/Anthropic a few dollars a month just for me to say "test" or have a CI runner validate some UI code. So I built this server called ChunkBack, that mocks the popular llm provider's functionality but allows you to type in a deterministic language:
`SAY "cheese"` or `TOOLCALL "tool_name" {} "tool response"`
I've had to work in some test environments and give good results for experimenting with CI, but it's still an early project so would love feedback and more testers on.