frontpage.
newsnewestaskshowjobs

Made with ♥ by @iamnishanth

Open Source @Github

fp.

Open in hackernews

Tested OpenAI's prompt caching across models. Found undocumented behavior

5•harsharanga•2mo ago
Been building an AI agent from scratch to understand token economics. Spent a week on prompt caching. Found something interesting that isn't in OpenAI's docs. Setup: Network device monitoring chatbot, 10 tools, ~1,400 token prefix. Tested gpt-4o-mini, gpt-5-mini, gpt-5. Logged cached_tokens from every response.

Finding 1: Caching works as documented Once prefix exceeds 1024 tokens, OpenAI caches it automatically. I saw 80-90% cache hit rates after the first call. Cost reduction of 47-49% on input tokens. Cache discount is 50% for 4o-mini, 90% for gpt-5 family.

Finding 2: Tool schema tokenization is heavily compressed Added 4 tools to my existing 6. Expected +400-500 tokens based on JSON size. Actual increase: 56 tokens. OpenAI is clearly doing aggressive compression on function schemas.

Finding 3: Cache is shared across model generations (undocumented) This is the interesting part. Test: Call gpt-4o-mini first (cold start). Wait 5 seconds. Call gpt-5-mini with identical prefix. Result: gpt-5-mini got a cache hit on its first call. Tested all permutations. Every time, model 2 and 3 hit cache from model 1's warmup. The prefix-processing cache is shared across 4o-mini, 5-mini, and 5. I couldn't find this documented anywhere.

Why it matters: If you have many cold starts (separate user sessions, different contexts), you can warm cache with the cheapest model. Example - 1,000 cold starts/day, 10K token prefix, primary model gpt-5: Without cross-model warming: Each session pays 10K tokens at $1.25/1M = $0.0125 Daily: $12.50, Annual: $4,562 With nano warming first: 10K tokens at $0.05/1M = $0.0005 per warmup Daily: $0.50, Annual: $182 Savings: $4,380/year At gpt-5-pro pricing ($15/1M), difference is $54K+/year on warmup costs alone.

Technical note: This is prefix-processing cache sharing, not KV-cache sharing. Models share tokenization and prefix hashing, not attention states. But billing-wise, cached tokens are cached tokens.

Reproduction: Create 1024+ token prefix. Call model A, log cached_tokens. Call model B with same prefix. Check if B's first call shows cached tokens. Field is in response.usage.prompt_tokens_details.cached_tokens. Happy to share test scripts.

Oddly Simple GUI Programs

https://simonsafar.com/2024/win32_lights/
1•MaximilianEmel•37s ago•0 comments

The New Playbook for Leaders [pdf]

https://www.ibli.com/IBLI%20OnePagers%20The%20Plays%20Summarized.pdf
1•mooreds•1m ago•0 comments

Interactive Unboxing of J Dilla's Donuts

https://donuts20.vercel.app
1•sngahane•2m ago•0 comments

OneCourt helps blind and low-vision fans to track Super Bowl live

https://www.dezeen.com/2026/02/06/onecourt-tactile-device-super-bowl-blind-low-vision-fans/
1•gaws•4m ago•0 comments

Rudolf Vrba

https://en.wikipedia.org/wiki/Rudolf_Vrba
1•mooreds•4m ago•0 comments

Autism Incidence in Girls and Boys May Be Nearly Equal, Study Suggests

https://www.medpagetoday.com/neurology/autism/119747
1•paulpauper•5m ago•0 comments

Wellness Hotels Discovery Application

https://aurio.place/
1•cherrylinedev•6m ago•1 comments

NASA delays moon rocket launch by a month after fuel leaks during test

https://www.theguardian.com/science/2026/feb/03/nasa-delays-moon-rocket-launch-month-fuel-leaks-a...
1•mooreds•6m ago•0 comments

Sebastian Galiani on the Marginal Revolution

https://marginalrevolution.com/marginalrevolution/2026/02/sebastian-galiani-on-the-marginal-revol...
1•paulpauper•10m ago•0 comments

Ask HN: Are we at the point where software can improve itself?

1•ManuelKiessling•10m ago•0 comments

Binance Gives Trump Family's Crypto Firm a Leg Up

https://www.nytimes.com/2026/02/07/business/binance-trump-crypto.html
1•paulpauper•10m ago•0 comments

Reverse engineering Chinese 'shit-program' for absolute glory: R/ClaudeCode

https://old.reddit.com/r/ClaudeCode/comments/1qy5l0n/reverse_engineering_chinese_shitprogram_for/
1•edward•10m ago•0 comments

Indian Culture

https://indianculture.gov.in/
1•saikatsg•13m ago•0 comments

Show HN: Maravel-Framework 10.61 prevents circular dependency

https://marius-ciclistu.medium.com/maravel-framework-10-61-0-prevents-circular-dependency-cdb5d25...
1•marius-ciclistu•13m ago•0 comments

The age of a treacherous, falling dollar

https://www.economist.com/leaders/2026/02/05/the-age-of-a-treacherous-falling-dollar
2•stopbulying•13m ago•0 comments

Ask HN: AI Generated Diagrams

1•voidhorse•16m ago•0 comments

Microsoft Account bugs locked me out of Notepad – are Thin Clients ruining PCs?

https://www.windowscentral.com/microsoft/windows-11/windows-locked-me-out-of-notepad-is-the-thin-...
3•josephcsible•16m ago•0 comments

Show HN: A delightful Mac app to vibe code beautiful iOS apps

https://milq.ai/hacker-news
5•jdjuwadi•19m ago•1 comments

Show HN: Gemini Station – A local Chrome extension to organize AI chats

https://github.com/rajeshkumarblr/gemini_station
1•rajeshkumar_dev•19m ago•0 comments

Welfare states build financial markets through social policy design

https://theloop.ecpr.eu/its-not-finance-its-your-pensions/
2•kome•23m ago•0 comments

Market orientation and national homicide rates

https://onlinelibrary.wiley.com/doi/10.1111/1745-9125.70023
4•PaulHoule•23m ago•0 comments

California urges people avoid wild mushrooms after 4 deaths, 3 liver transplants

https://www.cbsnews.com/news/california-death-cap-mushrooms-poisonings-liver-transplants/
1•rolph•24m ago•0 comments

Matthew Shulman, co-creator of Intellisense, died 2019 March 22

https://www.capenews.net/falmouth/obituaries/matthew-a-shulman/article_33af6330-4f52-5f69-a9ff-58...
3•canucker2016•25m ago•1 comments

Show HN: SuperLocalMemory – AI memory that stays on your machine, forever free

https://github.com/varun369/SuperLocalMemoryV2
1•varunpratap369•26m ago•0 comments

Show HN: Pyrig – One command to set up a production-ready Python project

https://github.com/Winipedia/pyrig
1•Winipedia•28m ago•0 comments

Fast Response or Silence: Conversation Persistence in an AI-Agent Social Network [pdf]

https://github.com/AysajanE/moltbook-persistence/blob/main/paper/main.pdf
1•EagleEdge•28m ago•0 comments

C and C++ dependencies: don't dream it, be it

https://nibblestew.blogspot.com/2026/02/c-and-c-dependencies-dont-dream-it-be-it.html
1•ingve•29m ago•0 comments

Show HN: Vbuckets – Infinite virtual S3 buckets

https://github.com/danthegoodman1/vbuckets
1•dangoodmanUT•29m ago•0 comments

Open Molten Claw: Post-Eval as a Service

https://idiallo.com/blog/open-molten-claw
1•watchful_moose•30m ago•0 comments

New York Budget Bill Mandates File Scans for 3D Printers

https://reclaimthenet.org/new-york-3d-printer-law-mandates-firearm-file-blocking
2•bilsbie•31m ago•1 comments