I run a few different models on my compute nodes and was constantly editing JSON config files to keep track of which model was running where. I built this to aggregate them in one place behind a public nginx reverse proxy. My goal was to hook it up to claude-code or qwen so that when I run out of tokens I can switch to minimax or glm-5. It works great for that, and also for sharing those models with other people.
MIT licensed, reasonably secure, maybe useful.
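For anyone wondering what "aggregating them into one place" amounts to, here is a rough sketch of the idea, not the project's actual code. It assumes each model is served from its own node and that the aggregator just maps the requested model name to an upstream base URL. The node names, addresses, and the `Backend`, `ROUTES`, `resolve`, and `upstream_url` names are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Backend:
    """One upstream inference server (hypothetical shape)."""
    node: str
    base_url: str


# Hypothetical routing table: model name -> node serving it.
# These addresses are placeholders, not real endpoints.
ROUTES = {
    "minimax": Backend(node="node-a", base_url="http://10.0.0.11:8000/v1"),
    "glm-5": Backend(node="node-b", base_url="http://10.0.0.12:8000/v1"),
}


def resolve(model: str) -> Backend:
    """Look up the backend for a model name, failing loudly if unknown."""
    try:
        return ROUTES[model]
    except KeyError:
        raise ValueError(f"unknown model: {model!r}") from None


def upstream_url(model: str, path: str) -> str:
    """Build the upstream URL a request for `model` would be forwarded to."""
    backend = resolve(model)
    return f"{backend.base_url}/{path.lstrip('/')}"
```

With a table like this, a client only needs the one public endpoint and a model name; which node actually serves the model lives in a single place instead of in each client's config.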
TZubiri•1h ago
So, like litellm?
yatesdr•1h ago
Pretty similar to litellm[proxy], but it supports the Responses API and also does some re-writing. It's mostly targeted at coding TUIs, but I also use it a lot for text embeddings and streaming inference in applications.