It's a local HTTP proxy that sits between your app and the AI provider (Anthropic, OpenAI, Google). Every request flows through it, and it records token usage, cost, cache hit rates, latency — everything. Then there's a dashboard to visualize it all.
What makes it different from just checking your provider dashboard:
- It's real-time (WebSocket live feed of every call as it happens)
- It works across all three major providers in one view
- It runs 100% locally: your prompts never leave your machine
- It has budget caps that actually block requests before you overspend
- It identifies optimization opportunities (cache misses, model downgrades, repeated prompts)

Tech stack: Python, FastAPI, SQLite, vanilla JS. No React, no build step, no external dependencies beyond pip. The whole thing is ~3K lines of Python.
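The budget-cap idea is simple to state: estimate a request's cost before forwarding it, and refuse if it would push you over the cap. A minimal sketch of that gate, assuming a running-total guard (class and method names here are illustrative, not the project's actual API):

```python
class BudgetExceeded(Exception):
    """Raised when a request would push spend past the cap."""


class BudgetGuard:
    """Hypothetical pre-request budget gate: blocks BEFORE the
    provider is called, rather than reporting overspend after."""

    def __init__(self, cap_usd: float) -> None:
        self.cap_usd = cap_usd
        self.spent_usd = 0.0

    def check(self, estimated_cost_usd: float) -> None:
        # Called before forwarding a request to the provider.
        if self.spent_usd + estimated_cost_usd > self.cap_usd:
            raise BudgetExceeded(
                f"would spend ${self.spent_usd + estimated_cost_usd:.2f} "
                f"against a ${self.cap_usd:.2f} cap"
            )

    def record(self, actual_cost_usd: float) -> None:
        # Called after the response, with the real metered cost.
        self.spent_usd += actual_cost_usd


# Usage: allow a cheap call, then block one that would exceed the cap.
guard = BudgetGuard(cap_usd=1.00)
guard.check(0.40)   # fine
guard.record(0.40)
try:
    guard.check(0.70)   # 0.40 + 0.70 > 1.00 -> blocked
except BudgetExceeded as e:
    print("blocked:", e)
```

The key design point is that `check` runs on an estimate before the request leaves the proxy, while `record` reconciles with the actual metered cost afterward.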
Interesting technical decisions:
- The proxy captures streaming responses without buffering: it tees the byte stream, so the client sees zero added latency
- Cost calculation uses a built-in pricing table with override support (providers change rates constantly)
- There's a Prometheus /metrics endpoint, so you can plug it into existing monitoring
- Cacheability analysis uses diff-based detection across multiple API calls to identify what's actually static vs. dynamic in your prompts

Limitations I'm honest about:
- The cacheability scorer is heuristic-based: solid for multi-call traces (~85% accurate), rougher for single prompts (~65%)
- Token counting uses cl100k_base for everything, which drifts ~10% for non-OpenAI models
- Three features (smart routing, scheduled reports, multi-user auth) are on the roadmap but not shipped yet

Would love feedback, especially from anyone managing LLM costs at scale.
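For anyone curious what "tees the byte stream" means in practice: each chunk is handed to the accounting layer and yielded to the client in the same step, so nothing waits for the full response. A minimal asyncio sketch of that idea (my own illustrative code, not the project's implementation):

```python
import asyncio
from typing import AsyncIterator, Callable


async def tee_stream(
    upstream: AsyncIterator[bytes],
    on_chunk: Callable[[bytes], None],
) -> AsyncIterator[bytes]:
    """Forward each chunk to the client immediately while also
    recording it for usage accounting -- the full response is
    never buffered before forwarding."""
    async for chunk in upstream:
        on_chunk(chunk)   # record for token/cost accounting
        yield chunk       # forward to the client right away


# Usage: a fake upstream stream, teed into a recorder list.
async def fake_upstream() -> AsyncIterator[bytes]:
    for part in (b"Hel", b"lo"):
        yield part


async def main() -> None:
    recorded: list[bytes] = []
    forwarded = [c async for c in tee_stream(fake_upstream(), recorded.append)]
    print(b"".join(forwarded), recorded)


asyncio.run(main())
```

Since the recorder only appends bytes, the per-chunk overhead is a list append; heavier work (token counting, DB writes) would be deferred until the stream ends so the client path stays hot.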