
Agents need control flow, not more prompts

https://bsuh.bearblog.dev/agents-need-control-flow/
63•bsuh•1h ago

Comments

Neywiny•1h ago
If you're trying to get reliability and determinism out of the LLM, you've already lost
tekne•41m ago
Wait... why?

Making an unreliable, nondeterministic system give reliable results for a bounded task with well-understood parameters is... like half of engineering, no?

There's a huge difference between "generate this code, here's a vague feature description" and "here's a list of criteria, assign this input to one of these buckets" -- the latter is obviously subject to prompt engineering, hallucination, etc -- but so is a human pipeline!

aleksiy123•13m ago
There’s a whole range between completely random and completely rule based deterministic.

Somewhere in between, I guess, are the varying levels of intelligence that are more likely to make the “right” decision for anything you throw at them.

pydry•12m ago
This is something I think some people are fundamentally not capable of understanding.
bwestergard•1h ago
I agree with the sentiment, but I think the conclusion should be altered. When you hit the limit of prompting, you need to move from using LLMs at run time to accomplish a task to using LLMs to write software to accomplish the task. The role of LLMs at run time will generally shrink to helping users choose compliant inputs to a software system that embodies hard business rules.
edgarvaldes•32m ago
Some have expressed the opinion in this forum that the future of software lies in programs that are created and adapted at runtime, using genAI. I don't know how far we are from that.
mjr00•29m ago
> Some have expressed the opinion in this forum that the future of software lies in programs that are created and adapted at runtime, using genAI.

Good luck with that. Users will flood you with complaints if a button moves 5px to the left after a design update. A program that is generated at runtime, with not just a variable UI but also UX and workflows, would get you death threats.

hilariously•20m ago
I think many software adjacent folks are super excited because they can now have the personalized toothbrush they keep asking people to make for them.

The problem is that outside of that, most people want boring and regular interfaces so they can get in, solve the problem, and get out - they don't want to "love" it or care if it's "sexy"; they want it to work and get out of the way.

LLMs transmogrifying your software at every request assumes people are software architects and creators who love the computer interface, and that just doesn't describe the bulk of the population.

Most people using computers use them to consume things or to access things, not for their own sake, and they certainly don't think "what if I just had code to do x..." unless x is making them a lot of money.

scrappyjoe•29m ago
I’ve had a couple of weeks of downtime at work, so I decided to incorporate agents into my work processes - things like note taking, task tracking, document management.

Your comment EXACTLY mirrors my experience. Week 1 was ever expanding prompts, and degrading performance. Week 2 has been all about actually defining the objects precisely (notes, tasks, projects, people etc) and defining methods for performing well defined operations against these objects. The agent surface has, as you rightly point out, shrunk to a translation layer that converts natural language to commands and args that pass the input validator.
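
A minimal sketch of the "translation layer plus input validator" shape described above. The command names, argument names, and the `validate` helper are all hypothetical illustrations, not anything from the commenter's setup; the assumption is that the model emits a JSON-like dict and nothing runs unless it matches a fixed schema.

```python
from dataclasses import dataclass

# Hypothetical schema: command name -> required argument names and their types.
COMMANDS = {
    "add_task": {"title": str, "project": str},
    "close_task": {"task_id": int},
}


@dataclass(frozen=True)
class Command:
    name: str
    args: dict


def validate(raw: dict) -> Command:
    """Turn the model's proposed dict into a Command, or raise ValueError."""
    name = raw.get("command")
    if not isinstance(name, str) or name not in COMMANDS:
        raise ValueError(f"unknown command: {name!r}")
    schema = COMMANDS[name]
    args = raw.get("args")
    # Require exactly the declared arguments: no extras, none missing.
    if not isinstance(args, dict) or set(args) != set(schema):
        raise ValueError(f"{name} expects exactly the args {sorted(schema)}")
    for key, expected in schema.items():
        if not isinstance(args[key], expected):
            raise ValueError(f"{name}.{key} must be of type {expected.__name__}")
    return Command(name, dict(args))
```

With this in place the model only has to produce something like `{"command": "close_task", "args": {"task_id": 7}}`; anything else is rejected before it reaches the objects and methods that do the real work.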

AIorNot•1h ago
I mean we have Langgraph, BAML etc
apalmer•1h ago
Generally agree with this stance. Case in point: the breakthrough in AI coding was not so much that AI intelligence increased as that a lot of the core process execution moved out of the LLM prompt and into the harness.
eth415•1h ago
agreed - this is what we’ve been trying to build at scale.

https://github.com/salesforce/agentscript

droolingretard•1h ago
Are you the guy who used to write MapleStory hacks?
astrobiased•1h ago
It's the right direction, but control flow introduces limitations within a system that is quite adaptable to dynamic situations. The more control flow you add, the more buggy edge cases pop up if it is done poorly.

Still have yet to see a universal treatment that tackles this well.

jerf•53m ago
This is why I frequently refer to "next generation AIs" that aren't just LLMs. LLMs are pretty cool and I expect that even if we see no further foundational advancement in AIs that we're going to continue to see them exploited in more interesting ways and optimized better. Even if the models froze as they are today, there's a lot more value to be squeezed out of them as we figure out how to do that.

However, there are some things that I think need a foundational next-generation improvement of some sort. The way that LLMs sort of smudge away "NEVER DO X" and can even after a lot of work end up seeing that as a bit of a "PLEASE DO X" seems fundamental to how they work. It can be easy to lose track of as we are still in the initial flush of figuring out what they can do (despite all we've already found), but LLMs are not everything we're looking for out of AI.

There should be some sort of architecture that can take a "NEVER DO X" and treat it as a human would. There should be some sort of architecture that instead of having a "context window" has memory hierarchies something like we do, where if two people have sufficiently extended conversations with what was initially the same AI, the resulting two AIs are different not just in their context windows but have actually become two individuals.

I of course have no more idea what this looks like than anyone else. But I don't see any reason to think LLMs are the last word in AI.

solomonb•45m ago
I agree and I think a really wonderful way to encode agentic control flow would be with Polynomial Functors.

https://arxiv.org/abs/2312.00990

59nadir•33m ago
This was one of the key insights in Stripe's explanations about Minions[0], their autonomous agent system; in-between non-deterministic LLM work they had deterministic nodes that handled quality assurance and so on in order to not leave those types of things to the LLMs.

0 - https://stripe.dev/blog/minions-stripes-one-shot-end-to-end-...

ModernMech•33m ago
Slowly and surely we are replacing AI with programming languages.
taherchhabra•33m ago
I wrote something recently on how agent development differs from traditional software development

https://x.com/i/status/2051706304859881495

onion2k•26m ago
Agents are probabilistic systems. A common mechanism to get a reliable answer from systems that can have variable output is to run them several times (ideally in separate, isolated instances) and then have something vote on the best result or use the most common result. This happens in things like rockets and aviation where you have multiple systems giving an answer and an orchestrator picking the result.

I've tried doing something similar with AI by running a prompt several times and then having an agent pick the best response. It works fairly well but it burns a lot of tokens.
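
A rough sketch of the "use the most common result" variant mentioned above, where the picking is done by a deterministic tally rather than by another agent. The `run_once` callable is a hypothetical stand-in for one isolated model call, and the `quorum` threshold is an assumption added for illustration.

```python
from collections import Counter
from typing import Callable, Optional


def majority_vote(
    run_once: Callable[[], str], runs: int = 5, quorum: float = 0.5
) -> Optional[str]:
    """Call run_once `runs` times and return the most common answer.

    Returns None when the top answer does not exceed the quorum share,
    so the caller can escalate instead of trusting a weak plurality.
    """
    if runs < 1:
        raise ValueError("runs must be at least 1")
    answers = [run_once() for _ in range(runs)]
    answer, count = Counter(answers).most_common(1)[0]
    return answer if count / runs > quorum else None
```

This only works when answers can be compared for equality (a bucket label, a number, a normalized string); free-form text would need to be canonicalized first. The token cost still scales linearly with `runs`.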

suprfnk•14m ago
But then, if an agent picks the best response, how would you know that that is reliable?
encoderer•26m ago
You can get a lot done with agentic programming without going "all in" on a gastown-like system, but I think there is a minimum viable setup:

1. an adversarial agent harness that uses one agent to create a plan and implement it, and another to review the plan and code-review each step.

2. an agentic validation suite -- a more flexible take on e2e testing.

3. some custom skills that explain how to use both of those flows.

With this in place you can formulate ideas in a chat session, produce planning artifacts, then use the adversarial system to implement the plans and the validation layer to get everything working e2e for human review.

There are a lot of tools you can use for these things but I chose to just build the tooling in the repo as I go.

rnxrx•22m ago
I wonder if a part of the problem isn't just the misapplication of LLMs in the first place. As has been mentioned elsewhere, perhaps the agent's prompt should be to write code to accomplish as much of the task in as repeatable/verifiable/deterministic a way as possible. This would hopefully include validation of the agent's output as well. The overall goal would be to keep the LLM out of doing processing that could be more efficiently (and often correctly) handled programmatically.
foolserrandboy•11m ago
yup, the standard way of thinking about agents seems backwards and probably costly. Use LLMs to write scripts, then stick all your scripts in your own looping harness and call out to LLMs only for the parts that are too hard to automate, with some deterministic validation at the end.
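
One way to picture that harness, as a sketch under stated assumptions: try the script first, fall back to the model only when the script cannot produce a valid result, and validate deterministically either way. The date-extraction task, `run_step`, `extract_date`, `is_valid_date`, and the `llm_fallback` callable are all hypothetical names chosen for illustration.

```python
import re
from datetime import date
from typing import Callable, Optional


def extract_date(text: str) -> Optional[str]:
    """Deterministic script: pull out the first YYYY-MM-DD shaped token, if any."""
    match = re.search(r"\d{4}-\d{2}-\d{2}", text)
    return match.group(0) if match else None


def is_valid_date(value: str) -> bool:
    """Deterministic validation: the value must parse as a real ISO date."""
    try:
        date.fromisoformat(value)
    except ValueError:
        return False
    return True


def run_step(
    script: Callable[[str], Optional[str]],
    llm_fallback: Callable[[str], str],
    validate: Callable[[str], bool],
    item: str,
    max_attempts: int = 3,
) -> str:
    """Prefer the script; call the model only when the script cannot answer."""
    result = script(item)
    if result is not None and validate(result):
        return result
    for _ in range(max_attempts):
        candidate = llm_fallback(item)  # hypothetical model call
        if validate(candidate):
            return candidate
    raise RuntimeError(f"no valid result for {item!r} after {max_attempts} attempts")
```

The model is never called for inputs the script already handles, and nothing it returns is accepted without passing the same validator as the script's output.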
briga•18m ago
Sometimes it feels like agents are just reinventing microservices, except they are doing it in the most inefficient way possible. It is certainly a good way for the LLM companies to sell more tokens.
oinoom•17m ago
this is just advocating for a harness, which has been the focus (along with evals) for at least the last three months by pretty much anyone working with agents professionally or seriously
yogthos•17m ago
This was basically my realization as well. We are trying to get LLMs to write software the way humans do it, but they have a different set of strengths and weaknesses. Structuring tooling around what LLMs actually do well seems like an obvious thing to do. I wrote about this in some detail here:

https://yogthos.net/posts/2026-02-25-ai-at-scale.html

gardnr•15m ago
This is straight outta 2023:

Agents aren't reliable; use workflows instead.

tim-projects•7m ago
This is exactly the problem I've been working on and I see others are too. When you implement quality control gates, everything works better. It solves so many of the basic problems LLMs create: saying code is finished when it isn't, skipping tests, introducing regressions, failing basic code validation, etc.

I am finding that the better the quality gates are, the lower-quality LLM you can use for the same result (at a cost of time).
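
A small sketch of what such gates might look like, assuming the agent's output is Python source held in a string. The two gates here (`syntax_gate`, `no_skipped_tests_gate`) and the `accept` helper are hypothetical examples, not the commenter's tooling; a real setup would likely also run the test suite and a linter.

```python
from typing import Callable, Dict, List, Tuple

Gate = Callable[[str], Tuple[bool, str]]


def syntax_gate(source: str) -> Tuple[bool, str]:
    """Pass only if the source compiles as Python."""
    try:
        compile(source, "<agent-output>", "exec")
    except SyntaxError as exc:
        return False, f"syntax error on line {exc.lineno}"
    return True, ""


def no_skipped_tests_gate(source: str) -> Tuple[bool, str]:
    """Fail if the source carries an obvious test-skip marker."""
    markers = ("pytest.mark.skip", "unittest.skip")
    found = [m for m in markers if m in source]
    return (not found, f"skip markers found: {found}" if found else "")


GATES: Dict[str, Gate] = {
    "syntax": syntax_gate,
    "no_skipped_tests": no_skipped_tests_gate,
}


def failed_gates(source: str, gates: Dict[str, Gate] = GATES) -> List[str]:
    """Run every gate and return the names of those that failed."""
    return [name for name, gate in gates.items() if not gate(source)[0]]


def accept(source: str) -> bool:
    """The agent's 'done' claim counts only when no gate fails."""
    return not failed_gates(source)
```

The harness, not the model, decides when work is finished: a failing gate sends the output back for another attempt, which is where the extra time goes.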
