Over the last few months, we kept running into the same production issues with LLMs:
– Provider outages and partial degradation
– Silent retries multiplying cost
– Hard coupling to a single vendor
We built Perpetuo, a thin gateway that sits between your app and LLM providers.
It routes requests based on latency, cost, and availability, applies automatic failover, and keeps billing predictable — all using your own API keys (no reselling, no lock-in).
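To make the routing idea concrete, here's a minimal sketch of latency/cost-aware selection with failover. All names, weights, and provider stats below are hypothetical illustrations, not Perpetuo's actual implementation:

```python
# Hypothetical per-provider stats; a real gateway would update these
# continuously from live latency and error measurements.
PROVIDERS = [
    {"name": "openai",    "latency_ms": 420, "cost_per_1k": 0.60, "healthy": True},
    {"name": "anthropic", "latency_ms": 380, "cost_per_1k": 0.55, "healthy": True},
    {"name": "mistral",   "latency_ms": 250, "cost_per_1k": 0.20, "healthy": False},
]

def score(p):
    # Lower is better: a weighted blend of latency and cost.
    # The weights here are arbitrary placeholders.
    return p["latency_ms"] * 0.01 + p["cost_per_1k"] * 10

def route(providers):
    """Return healthy providers in preference order (best score first)."""
    candidates = [p for p in providers if p["healthy"]]
    return sorted(candidates, key=score)

def call_with_failover(providers, send):
    """Try providers in scored order; fail over to the next on any error."""
    for p in route(providers):
        try:
            return send(p)
        except Exception:
            continue  # a real gateway would also mark the provider unhealthy
    raise RuntimeError("all providers failed")
```

The interesting design questions live in `score` (how to trade latency against cost per request) and in how health flips back to healthy after an outage.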
This is early, but it's already running in real workloads.
I’d really appreciate feedback from people running LLMs in production — especially what you’d expect from this kind of infrastructure layer.
Happy to answer any technical questions.
mtmail•1m ago
"If your work isn't ready for users to try out, please don't do a Show HN. Once it's ready, come back and do it then. Don't post landing pages or fundraisers." https://news.ycombinator.com/showhn.html
As an alternative, it can be a normal submission or a question ("Ask HN").