Ask HN: How are you controlling costs and enforcing limits for LLM calls?

3•8dazo•2d ago
I’ve been running into an issue with LLM/agent systems where unexpected loops or repeated calls can quickly drive up costs.

Most tools I’ve seen focus on observability (logs, traces, dashboards), but not actual enforcement at runtime.

Curious how people here are handling this in production:

- Are you enforcing hard limits (budget, rate, etc.) or just monitoring?

- Do you handle this at the app level or via some middleware/proxy?

- Have you built something in-house for this?

Feels like an unsolved problem, especially with agents.

Would love to hear how others are dealing with it.

Comments

jackycufe•2d ago
Certainly. I use LiteLLM for its caching, which saves more money.
brandonharwood•1d ago
It’s a bit of a chicken-and-egg thing, and it depends a lot on how the LLM is applied within an app. I always start at the core design of the integration and focus hard on the problem it solves. Why are you using an LLM in the first place? What functions does it need to perform in the context of the user interaction? These are the kinds of questions that help you understand the constraints you need to implement.

For example, a project I’m working on is a diagramming tool, and I’m implementing an AI layer on top of it so users can refine, edit, and generate diagrams. The tool creates maps structured into a JSON schema, but these can get really long, sometimes thousands of lines depending on the complexity of the diagram. Feeding an entire diagram to the AI, or having it generate an entire diagram, is obviously expensive, so the fix was building a deterministic translation layer that compresses the diagram into a compact semantic model for the LLM: stripping visual noise (x/y coordinates), deduplicating relationships, resolving references, etc.
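A minimal sketch of what such a translation layer might look like, in Python. The diagram shape (`nodes` with `id`/`label` plus layout fields, `edges` with `from`/`to`), the `VISUAL_KEYS` set, and both function names are assumptions made for illustration; the comment does not show the actual schema or code.

```python
import json

# Assumed layout-only fields; the real schema is not shown in the comment.
VISUAL_KEYS = {"x", "y", "width", "height", "color"}


def compact_diagram(diagram: dict) -> dict:
    """Build a smaller, semantics-only view of a diagram for the prompt."""
    raw_nodes = diagram.get("nodes", [])
    # Strip visual noise: drop layout fields, keep everything else.
    nodes = [{k: v for k, v in n.items() if k not in VISUAL_KEYS} for n in raw_nodes]
    # Resolve references: map node ids to human-readable labels.
    labels = {n["id"]: n.get("label", n["id"]) for n in raw_nodes}

    seen = set()
    edges = []
    for e in diagram.get("edges", []):
        key = (e["from"], e["to"])
        if key in seen:  # deduplicate repeated relationships
            continue
        seen.add(key)
        src = labels.get(e["from"], e["from"])
        dst = labels.get(e["to"], e["to"])
        edges.append(f"{src} -> {dst}")
    return {"nodes": nodes, "edges": edges}


def to_prompt(diagram: dict) -> str:
    """Serialize the compact model without extra whitespace."""
    return json.dumps(compact_diagram(diagram), separators=(",", ":"))
```

Because the transformation is deterministic, the same diagram always yields the same prompt text, which should also make prompt caching more effective.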

With this and keeping the interact, we cut token usage by ~75% across the app. On the output side, the LLM only produces the changes needed, not the full diagram. Layout, validation, and rendering are computed client-side for free, so costs only scale with what the user asks for. With good UX, we can also pay attention to what users ask for and create “quick actions” that use the LLM within closed-loop subsystems. Since we use a credit system for AI tool usage, we can assign credit costs to quick actions more accurately because each action has a defined scope.
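The "only produce the changes" idea can be sketched roughly as follows. The operation names (`add_node`, `rename_node`, `remove_node`, `add_edge`) and the diagram shape are hypothetical; the comment does not describe the actual change format. The point is that the model emits a short list of operations and the application applies and validates them locally.

```python
def apply_changes(diagram: dict, changes: list[dict]) -> dict:
    """Apply a list of LLM-proposed operations to a copy of the diagram."""
    nodes = {n["id"]: dict(n) for n in diagram.get("nodes", [])}
    edges = [dict(e) for e in diagram.get("edges", [])]

    for ch in changes:
        op = ch.get("op")
        if op == "add_node":
            nodes[ch["node"]["id"]] = dict(ch["node"])
        elif op == "rename_node":
            if ch["id"] not in nodes:
                raise ValueError(f"unknown node: {ch['id']!r}")
            nodes[ch["id"]]["label"] = ch["label"]
        elif op == "remove_node":
            nodes.pop(ch["id"], None)
            # Drop edges that touched the removed node.
            edges = [e for e in edges if ch["id"] not in (e["from"], e["to"])]
        elif op == "add_edge":
            if ch["from"] not in nodes or ch["to"] not in nodes:
                raise ValueError("edge references an unknown node")
            edges.append({"from": ch["from"], "to": ch["to"]})
        else:
            # Reject anything outside the allowed set instead of guessing.
            raise ValueError(f"unsupported op: {op!r}")

    return {"nodes": list(nodes.values()), "edges": edges}
```

Rejecting unknown operations keeps the output side bounded: a malformed or over-eager response fails validation locally rather than triggering another full-diagram generation.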

TLDR: make the LLM do less, then put hard limits around the smaller set of things it’s allowed to do
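As a rough illustration of "hard limits around a smaller set of things", here is a sketch of a per-user credit budget that is checked before any model call is made. The action names, their costs, and the `CreditBudget` and `run_action` helpers are all made up for this example; it is not the commenter's implementation.

```python
class BudgetExceeded(Exception):
    """Raised when an action would push spending past the hard limit."""


# Hypothetical quick actions with fixed credit costs.
ACTION_COSTS = {"rename_nodes": 1, "add_branch": 2, "summarize": 3}


class CreditBudget:
    def __init__(self, limit: int):
        self.limit = limit
        self.spent = 0

    def charge(self, action: str) -> int:
        """Reserve credits for an action; return the credits remaining."""
        if action not in ACTION_COSTS:
            raise ValueError(f"unknown action: {action!r}")
        cost = ACTION_COSTS[action]
        if self.spent + cost > self.limit:
            raise BudgetExceeded(f"{action} needs {cost}, {self.limit - self.spent} left")
        self.spent += cost
        return self.limit - self.spent


def run_action(budget: CreditBudget, action: str, call_llm):
    """Enforce the limit first, so a refused action costs no tokens."""
    budget.charge(action)
    return call_llm(action)
```

Since every action has a known cost up front, a runaway loop stops at the limit instead of showing up on a dashboard afterwards, which is the enforcement-versus-monitoring distinction raised in the original question.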