One of my side projects is an overnight content pipeline for my business. It pulls RSS feeds, fetches source articles, generates posts with AI, scores them, and publishes them to WordPress without supervision.
The content is a bit niche: cybersecurity incidents for Japanese manufacturing companies in Aichi Prefecture — Toyota's home region — where older workflows like fax and password-protected ZIPs still haven't fully disappeared.
The physical setup is also a little ridiculous: a SwitchBot turns the PC on at 3am, Windows Task Scheduler starts the Python pipeline, and another scheduled task shuts the machine down when it's done.
Originally this was only meant to solve my own problem. But the more failure modes I found, the more features I kept adding.
What pushed me to build qzira was cost control.
The first lesson was operational: alerts don't help at 3am. What I needed wasn't another notification, but a kill switch outside the application — something that could stop requests before they reached the provider, regardless of what the agent decided to do.
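The idea is easiest to see in code. This is not qzira's actual implementation, just a minimal sketch of the principle: the budget and the kill flag live in front of the provider, so a runaway 3am pipeline gets blocked before its request ever leaves the gateway.

```python
# Minimal sketch of a gateway-side kill switch (illustrative, not qzira's code).
# The flag and budget live outside the application, so enforcement happens
# regardless of what the pipeline decides to do.

class BudgetExceeded(Exception):
    pass

class Gateway:
    def __init__(self, budget_usd: float):
        self.budget_usd = budget_usd
        self.spent_usd = 0.0
        self.killed = False  # flipped by an operator, a cron job, a dashboard...

    def forward(self, request: dict, estimated_cost_usd: float) -> dict:
        # Hard stop: checked before the upstream call is ever made.
        if self.killed or self.spent_usd + estimated_cost_usd > self.budget_usd:
            raise BudgetExceeded("request blocked at the gateway")
        self.spent_usd += estimated_cost_usd
        return {"status": "forwarded", "request": request}

gw = Gateway(budget_usd=1.00)
gw.forward({"model": "claude"}, estimated_cost_usd=0.40)      # ok
gw.forward({"model": "claude"}, estimated_cost_usd=0.40)      # ok
try:
    gw.forward({"model": "claude"}, estimated_cost_usd=0.40)  # over budget
except BudgetExceeded:
    pass
```

The point is that the check sits outside the application's control flow: an alert asks a human to intervene, while the gateway just refuses.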
The second lesson was more embarrassing: I had miscalculated Cloudflare KV write costs by 100x. Every request was triggering a KV put. Rewriting that path to batch via cron jobs reduced writes by about 99% and fixed the unit economics.
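The shape of the fix, sketched in plain Python (the actual Worker code isn't shown here): requests only bump a pending counter, and a cron-triggered flush does one aggregated write per distinct key, so billed writes scale with the number of keys per tick instead of with traffic.

```python
# Sketch of the batching fix (illustrative, not the actual Worker code).
# Before: one KV put per request -> writes scale with traffic.
# After: requests bump a pending counter; a cron job flushes one
# aggregated put per key per tick.
from collections import defaultdict

class UsageBatcher:
    def __init__(self):
        self.pending = defaultdict(int)  # key -> usage since last flush
        self.kv_writes = 0               # what the provider actually bills

    def record(self, key: str, amount: int = 1):
        self.pending[key] += amount      # no KV write on the hot path

    def flush(self):
        # Called from a scheduled (cron) handler: one write per distinct key.
        for key, amount in self.pending.items():
            self.kv_writes += 1          # stand-in for a real KV put
        self.pending.clear()

batcher = UsageBatcher()
for _ in range(1000):                    # 1000 requests against one key
    batcher.record("user:123")
batcher.flush()
# naive design: 1000 billed writes; batched: 1
```

With a few hundred active keys and a cron tick every few minutes, this is where the roughly 99% write reduction comes from.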
I also became much more conservative about model choice for production content after running a simple comparison on my own pipeline.
I ran 10 articles through Claude and didn't find any hallucinations.
Then I ran 1 article through gpt-4o-mini, and it immediately inverted the meaning of the source: it wrote "operations were suspended" where the original said "no impact on operations was confirmed."
To be fair, the pipeline was tuned around Claude, so I don't take this as a general statement about model quality. It may simply have been a prompt/model fit issue. But for me, it was enough to become much more conservative about where lower-cost models are allowed to touch production content.
Both problems pointed to the same conclusion: cost and policy enforcement belong at the infrastructure layer, not inside the application.
So I built qzira, a BYOK AI gateway that sits in front of OpenAI, Anthropic, and Google AI. It adds gateway-level budget controls, hard stops, and provider routing, and you adopt it just by changing the base_url in tools like Claude Code or Cursor.
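To make the base_url idea concrete, here's a hedged sketch of what routing behind a single endpoint can look like. These are not qzira's actual routing rules; the model-prefix mapping below is a hypothetical example of how one gateway URL can fan out to multiple upstreams.

```python
# Illustrative only: how one base_url can front multiple providers.
# The prefix -> upstream mapping is hypothetical, not qzira's real config.
UPSTREAMS = {
    "gpt": "https://api.openai.com/v1",
    "claude": "https://api.anthropic.com/v1",
    "gemini": "https://generativelanguage.googleapis.com",
}

def route(model: str) -> str:
    """Pick an upstream base URL from the requested model name."""
    for prefix, base in UPSTREAMS.items():
        if model.startswith(prefix):
            return base
    raise ValueError(f"no upstream configured for model {model!r}")

print(route("claude-sonnet-4"))  # -> https://api.anthropic.com/v1
```

On the client side nothing else changes: tools built on the official SDKs typically honor a base URL override (an environment variable or client option), so the gateway can enforce budgets and pick the upstream transparently.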
Stack: Cloudflare Workers, Hono, D1, KV, and Vectorize.
There's a free tier.
Happy to answer questions about the architecture, the cost mistake, the overnight pipeline, or the slightly absurd physical setup behind it.