Show HN: Librarian – Cut token costs by up to 85% for LangGraph and OpenClaw

https://uselibrarian.dev/
5•Pinkert•1h ago
Hi HN,

I'm building Librarian (https://uselibrarian.dev/), an open-source (MIT) context management tool that stops AI agents from burning tokens by blindly re-reading their entire conversation history on every turn.

The Problem: If you're building agentic loops in frameworks like LangGraph or OpenClaw, you hit two walls fast:

Financial Cost: Because every turn resends the whole history, cumulative token usage grows quadratically with the number of turns. Long conversations get incredibly expensive.

Context Rot: As the context window fills up, the LLM suffers from the "Lost in the Middle" effect. Response latency spikes, and reasoning accuracy drops.
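To make the quadratic growth under "Financial Cost" concrete, here is a small arithmetic sketch. The per-message size (100 tokens) and the 800-token cap are illustrative assumptions, not figures from the Librarian benchmarks, and the function names are hypothetical.

```python
def cumulative_tokens_full_history(turns: int, tokens_per_message: int) -> int:
    """Total prompt tokens sent when every turn resends all prior messages."""
    # Turn k sends k messages, so the total is
    # tokens_per_message * (1 + 2 + ... + turns).
    return tokens_per_message * turns * (turns + 1) // 2


def cumulative_tokens_capped_context(
    turns: int, tokens_per_message: int, cap: int
) -> int:
    """Total prompt tokens sent when each turn's context is capped at `cap`."""
    return sum(min(k * tokens_per_message, cap) for k in range(1, turns + 1))


if __name__ == "__main__":
    # Assumed values for illustration only.
    print(cumulative_tokens_full_history(50, 100))         # 127500
    print(cumulative_tokens_capped_context(50, 100, 800))  # 37200
```

Under these assumptions, doubling the conversation from 50 to 100 turns roughly quadruples the full-history total (127,500 to 505,000 tokens), while the capped total grows only linearly once the cap is reached.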

The standard workaround is vector search (RAG) over past messages, but that completely loses temporal logic and conversational dependencies.

How Librarian Fixes This: We replaced brute-force context windowing with a lightweight reasoning pipeline:

Index: After a message, a smaller model asynchronously creates a compressed summary (~100 tokens), building an index of the conversation.

Select: When a new prompt arrives, Librarian reads the summary index and reasons about which specific historical messages are actually relevant to the current turn.

Hydrate: It fetches only those selected messages and passes them to the responder.
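The three steps above can be sketched roughly as follows. This is not Librarian's actual API: every class and function name here is hypothetical, the summarizer and selector are trivial stand-ins for the model calls described above (truncation and keyword overlap instead of an LLM), and indexing runs synchronously for simplicity rather than asynchronously.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class IndexEntry:
    message_id: int
    summary: str


# Stand-in signatures for the two model calls (assumptions, not a real API).
Summarizer = Callable[[str], str]
Selector = Callable[[str, list[IndexEntry]], list[int]]


@dataclass
class ConversationStore:
    summarize: Summarizer
    select: Selector
    messages: dict[int, str] = field(default_factory=dict)
    index: list[IndexEntry] = field(default_factory=list)

    def add_message(self, text: str) -> int:
        """Index: keep the full message and add a short summary to the index."""
        message_id = len(self.messages)
        self.messages[message_id] = text
        self.index.append(IndexEntry(message_id, self.summarize(text)))
        return message_id

    def build_context(self, prompt: str) -> list[str]:
        """Select ids from the summaries, then hydrate them into full messages."""
        selected_ids = self.select(prompt, self.index)
        return [self.messages[i] for i in sorted(selected_ids) if i in self.messages]


def _words(text: str) -> set[str]:
    return {w.lower().strip(".,?!") for w in text.split()}


def truncate_summarizer(text: str, max_words: int = 12) -> str:
    """Toy summarizer: keep the first few words (a real one would call a model)."""
    return " ".join(text.split()[:max_words])


def keyword_selector(prompt: str, index: list[IndexEntry]) -> list[int]:
    """Toy selector: pick entries sharing a longer word with the prompt."""
    prompt_words = {w for w in _words(prompt) if len(w) > 3}
    return [e.message_id for e in index if prompt_words & _words(e.summary)]


store = ConversationStore(truncate_summarizer, keyword_selector)
store.add_message("The deploy failed because the database migration timed out.")
store.add_message("Lunch is at noon on Friday.")
store.add_message("We fixed the migration by raising the lock timeout.")
context = store.build_context("What went wrong with the migration?")
```

Here `context` holds only the two migration messages, in their original order, which is what would be handed to the responder. Keeping the original order is one simple way to preserve the temporal structure that plain vector search discards.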

The Results: Instead of passing 2,000+ tokens of noise, you pass a highly curated context of ~800 tokens. In our 50-turn benchmarks, this reduces token costs by up to 85% while actually increasing answer accuracy (82% vs 78% for brute-force) because the distracting noise is removed. It currently works as a drop-in integration for LangGraph and OpenClaw.

I'd love for you to check out the benchmark suite, try the integrations, and tear the methodology apart. I'll be hanging out in the comments to answer questions, debug, or hear why this approach is terrible. Thanks!

Comments

Pinkert•56m ago
One architectural tradeoff we are actively working on right now is the latency of the "Select" step for shorter conversations.

Currently, the open-source version of Librarian uses a general-purpose model to read the summary index and route the relevant messages. It works great for accuracy and drastically cuts token costs, but it does introduce a latency penalty for shorter conversations because it requires an initial LLM inference step before your actual agent can respond.

To solve this, we are training a heavily quantized, fine-tuned model optimized solely for this context-selection task. The goal is to push selection latency below 1 second so the pipeline feels transparent. (We have a waitlist up on the site for this hosted version.)

If anyone here has experience fine-tuning smaller models (like Llama 3 or Mistral) strictly for high-speed classification/routing over context indexes, I'd love to hear what pitfalls we should watch out for.
