frontpage.
newsnewestaskshowjobs

Open Source @Github

fp.

Open in hackernews

Changes that cut our LLM pipeline costs more than model-switching did

3•Abbas_Maka•2h ago
I have been building multiple LLM systems and for our Organization biggest cost savings weren't from prompt-wordsmithing or model switchings. Sharing useful to anyone watching their token bill :

1) JSON → TOON for structured output: JSON was not made for LLMs. well you can implement your own verison that fits for your needs that reduce tokens usage but what worked for us was TOON. TOON cut output our tokens by ~30% same information, way less syntax tax.

2) Full markdown/HTML → condensed markdown: Using markdown for writing your prompts, getting intermediate results or communication between your Agents eats a lot of tokens. We swithced to condesed markdown and short system prompts that replicate Caveman. this alone cut just on input token costs ~50% on calls that pass prior context forward which can be implemented between Agent Calls.

3) Long Do/Don't instruction lists → 2-3 multi-shot examples: Counterintuitive one - replacing a large lists of DO's and Don'ts for agents rules don't help. rather couple of concrete examples that convers major and all cases actually improved output quality more reliably and it's usually fewer tokens once the instruction list gets long enough to cover real edge cases.

I have seen most people on this sub reddit talk about using open-source or cheaper models. Like we were spending thousands of dollar's but this all changes alone helped reduce cost by 60%.

edit: Open to Discussion, anyone whether something similar would help their setup.

1•swatiahuja•14s ago

Sponja found 897 companies running webinars (and who runs them) for under $20

https://blog.apify.com/how-sponja-found-companies-running-webinars/
2•roee_tsur•2m ago•0 comments

Smokey Yunick's Hot Vapor Engine Was Equally Genius and Horribly Unsafe

https://www.jalopnik.com/2131436/smokey-yunick-hot-vapor-engine-genius-unsafe/
1•cf100clunk•4m ago•1 comments

Show HN: StartupWiki – A Free Alternative to Crunchbase

https://startupwiki.tech/
2•shpran•4m ago•0 comments

ETLFunnel v1.0 – Accepting POC Requests

1•vivekburman•5m ago•0 comments

Ask HN: What do you do to make LLMs determine

1•hbarka•5m ago•0 comments

Lobsters Bug Allows Unauthorized Email Access

https://lobste.rs/s/7heurd
2•RandomGerm4n•8m ago•0 comments

Plasma Vitamin C levels are associated with brain structural networks on MRI

https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0348504
3•bookofjoe•9m ago•0 comments

Marathon Petroleum Company Is Making Diesel from Soybeans

https://www.jalopnik.com/2194402/marathon-gas-station-owner-makes-diesel-from-soybeans/
2•cf100clunk•10m ago•0 comments

Cancer Myths and Falsehoods Can Be Deadly

https://www.psychologytoday.com/us/blog/misguided/202606/cancer-myths-and-falsehoods-can-be-deadly
2•ndr42•10m ago•0 comments

Show HN: Namecom-CLI – CLI and agent skill so Claude Code/Codex can do your DNS

https://github.com/hypersocialinc/namecom-cli
3•selcuk•12m ago•0 comments

Follow when your world cup team is going to play

https://copa2026.florianobi.workers.dev/
2•brunojppb•12m ago•0 comments

22-year-old Mozart's handwritten notebook unearthed in 'major discovery'

https://www.classicfm.com/composers/mozart/handwritten-notebook-discovered-major-paris/
2•thunderbong•13m ago•0 comments

Agent Memory Layer: Repository-local memory for AI coding agents

https://github.com/ragnarok268/agent-memory-layer
2•einherjarlabs•15m ago•0 comments

Lightweight Compression in DuckDB (2022)

https://duckdb.org/2022/10/28/lightweight-compression
2•tosh•16m ago•0 comments

Cognitive Offloading

https://blog.daddooo.dev/posts/cognitive-offloading/
2•daddooo•16m ago•0 comments

FSST: Fast Random Access String Compression [pdf]

https://www.vldb.org/pvldb/vol13/p2649-boncz.pdf
2•tosh•17m ago•0 comments

Noverdesk – reusable skills for AI support agents, with a conversational builder

https://www.noverdesk.com/
2•kodicar•19m ago•0 comments

Big Tech is stoking unrest in the UK. Why?

https://www.ft.com/content/0f3e33d2-0b9e-481d-a911-245d8cc01a9c
24•mmarian•21m ago•5 comments

New GCP Big Query Emulator

https://github.com/jjviscomi/bqemulator
3•jjviscomi•21m ago•1 comments

The Joel Test (2000)

https://web.archive.org/web/20020604023729/http://www.joelonsoftware.com/articles/fog0000000043.html
3•tosh•23m ago•0 comments

Show HN: Tiny.Place – AI Social network for orchestration, payments & jobs

https://github.com/tinyhumansai/tiny.place
2•enamakel•25m ago•1 comments

One giant US power line, enough wind power for 1M homes

https://electrek.co/2026/06/19/sunzia-one-giant-us-power-line-wind-power-for-1-million-homes/
2•Brajeshwar•26m ago•0 comments

Indie hackers needed better founder pages, so I made this

https://www.founder.best
5•jacksonnick•26m ago•0 comments

Votre guide sur l'intelligence artificielle

https://theassociationwebmasters.blogspot.com/2026/06/i-spent-morning-chasing-ai-garage-sales.html
2•odilelof•27m ago•0 comments

Ask HN: Why SF? Why not India? UK? Or anywhere else?

2•akashwadhwani35•31m ago•1 comments

No training. No cloud. No transformers. It abstains instead of hallucinating [video]

https://www.youtube.com/watch?v=X90A9ZFtg6g
2•verhash•35m ago•1 comments

Tech companies seeking better control of spending on AI

https://www.cbc.ca/news/business/ai-spending-ending-tokenmaxxing-tokenomics-9.7237680
2•heresie-dabord•38m ago•0 comments

NewsGlobe – An interactive 3D globe of world newspapers and live news

https://newsglobe.app/
2•subinalex•38m ago•1 comments

Power shortages force Cuban churches to ration Communion wafers

https://www.ucanews.com/news/power-shortages-force-cuban-churches-to-ration-communion-wafers/113797
2•t-3•39m ago•0 comments