Thought experiment: as you write code, an LLM generates tests for it & the IDE runs those tests as you type, showing which ones are passing & failing, updating in real time. Imagine 10-100 tests that take <1ms to run, being rerun with every keystroke, and the result being shown in a non-intrusive way.
The tests could appear in a separate panel next to your code, with pass/fail status in the gutter of that panel: as simple as red and green dots for the tests that passed or failed in the last run.
The presence, absence, and content of certain tests, plus their pass/fail state, tell you what the code you're writing does from an outside perspective. Not seeing the LLM write a test you think you'll need? Either your test-generator prompt is wrong, or the code you're writing doesn't do what you think it does!
Making it realtime helps you shape the code.
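A minimal sketch of what that forward loop might look like, assuming a hypothetical `generate_tests` call to whatever LLM is available (the hard part, left out here, is making this fast enough to rerun on every keystroke):

```python
import subprocess
import time
from pathlib import Path

SOURCE = Path("module_under_edit.py")  # hypothetical file being edited

def generate_tests(source_code: str) -> str:
    """Placeholder: ask an LLM for fast, deterministic unit tests."""
    raise NotImplementedError

last_mtime = 0.0
while True:
    mtime = SOURCE.stat().st_mtime
    if mtime != last_mtime:
        last_mtime = mtime
        Path("test_generated.py").write_text(generate_tests(SOURCE.read_text()))
        # -q keeps output terse enough to render as gutter dots
        result = subprocess.run(["pytest", "test_generated.py", "-q"],
                                capture_output=True, text=True)
        print(result.stdout.splitlines()[-1])  # e.g. "3 passed, 1 failed"
    time.sleep(0.1)  # a real IDE would hook editor events, not poll
```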
Or if you want to do traditional TDD, the tooling could be reversed: you write the tests, and as soon as you stop typing the LLM writes the code to make them pass.
When you give up the work of deciding what the expected inputs and outputs of the code/program are, you are no longer in the driver's seat.
You don't need to write tests for that; you need to write acceptance criteria.
That would be interesting. Of course, Gherkin tends to just be transpiled into generated code that is customized for the particular test, so I'm not sure how much AI can really abstract it away.
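To make that concrete, here's the sort of glue code Gherkin typically bottoms out in, sketched with Python's behave library and a made-up cart example. Each step still needs code behind it, whether a human or an LLM writes it:

```python
# Steps for a feature like:
#   Given a cart with 2 items at $5 each
#   When I check out
#   Then the total is $10
from behave import given, when, then

@given("a cart with {count:d} items at ${price:d} each")
def step_cart(context, count, price):
    context.items = [price] * count

@when("I check out")
def step_checkout(context):
    context.total = sum(context.items)

@then("the total is ${expected:d}")
def step_total(context, expected):
    assert context.total == expected
```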
You need some way of precisely telling the AI what to do. As it turns out, there is only so much you can do with text. Come to think of it, you can write a whole book about a scene, and yet 100 people will imagine it quite differently. And the actual photograph would still be totally different from what all 100 of them imagined.
As it turns out, if you wish to describe something accurately enough, you have to write mathematical statements: in other words, statements that reduce to true/false answers. We could skip to the end of the discussion here and say you are better off either writing the code directly or writing test cases.
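Property-based tests are one existing form of this. A small Hypothesis sketch: the assertions below are precise true/false statements over all inputs, not prose descriptions.

```python
from hypothesis import given
from hypothesis import strategies as st

# Precise, checkable claims: sorting is idempotent and length-preserving.
@given(st.lists(st.integers()))
def test_sort_properties(xs):
    assert sorted(sorted(xs)) == sorted(xs)
    assert len(sorted(xs)) == len(xs)
```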
This is just people revisiting logic programming all over again.
> When you give up the work of deciding what the expected inputs and outputs of the code/program are, you are no longer in the driver's seat.
You don’t need to personally write code that mechanically iterates over every possible state to remain in the driver’s seat. You need to describe the acceptance criteria.
Isn't that logic programming/Prolog?
You basically write the sequence of conditions (i.e., tests in our lingo) that have to be true, and the compiler (now the AI) generates the code for you.
Perhaps logic programming deserves a fresh look in the modern era to make this more seamless.
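A toy of that shape using the z3 solver in Python (z3 is just a stand-in here; the point is that an AI would play the solver's role at a much higher level):

```python
from z3 import Int, Solver, sat

x, y = Int("x"), Int("y")
s = Solver()
# The "tests": conditions any acceptable answer must satisfy.
s.add(x + y == 10, x > y, y > 0)
if s.check() == sat:
    print(s.model())  # the solver, not the programmer, produces the values
```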
There probably is a setup where this works well, but the LLM and humans need to be able to move across the respective boundaries fluidly...
Writing clear requirements and letting the AI take care of the bulk of both sides seems more streamlined and productive.
But why should it be only the human developer who benefits? What if that debugger program becomes a tool that AI agents can use to more accurately resolve bugs?
Indeed, why can't any programming HUD be used by AI tools? If they benefit humans, wouldn't they benefit AI as well?
I think we'll pretty quickly be at the point where AI agents are more often than not autonomously taking care of business, and humans only need to know about that work at critical points (like when approvals are needed). Once we're there, the idea that this HUD concept should be only human-oriented breaks down.
We're getting more and more information thrown at us each day, and the AIs are adding to that, not reducing it. The ability to summarise dense, specialist information (I'm thinking error logs, but it could be anything really) just means more ways for people who previously couldn't access that information to view it.
How do we, as individuals, best deal with all this information efficiently? Currently we have a variety of interfaces: websites, dashboards, emails, chat. Are all of these still necessary? They might be now, but what about the next 10 years? Do I even need to visit a company's website if I can get the same information from a single chat interface?
The fact that we have AIs building us websites, apps, and web UIs just seems so... redundant.
I'm not really sure what trust means in a world where everyone relies uncritically on LLM output. Even if the information from the LLM is usually accurate, can I rely on that in some particularly important instance?
I still believe it fundamentally comes down to an interface issue, but how trust gets decoupled from the interface (as you said, the padlock shown in the browser and the certs that validate a website's source), that's an interesting one to think about :-)
Planes do actually have this now. It seems to work okay:
https://en.m.wikipedia.org/wiki/Traffic_collision_avoidance_...
You’re right that there’s a voice alert. But TCAS also has a map of nearby planes which is much more “HUD”! So it’s a combo of both approaches.
(Interestingly it seems that TCAS may predate Weiser’s 1992 talk)
[0]: https://www.geoffreylitt.com/2024/12/22/making-programming-m...
Compare another sci-fi depiction taken to the opposite extreme: Sirius Cybernetics products in the Hitchhiker's Guide books. "Thank you for making a simple door very happy!"
It can detect situations intelligently, do the filtering and summarisation of what's happening, and possibly make a recommendation.
This feels a lot more natural to me, especially in a business context when you want to monitor for 100 situations about thousands of customers.
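Roughly, with every name below hypothetical: deterministic rules do the cheap detection, and the model is only invoked to summarise and recommend on whatever survives the filter.

```python
def monitor(events, detect, summarize):
    """Hypothetical HUD backend: rule-based detection first,
    LLM summarisation only for the events that pass the filter."""
    flagged = [e for e in events if detect(e)]  # cheap, deterministic
    if not flagged:
        return None                             # nothing worth surfacing
    return summarize(flagged)                   # expensive LLM step, batched
```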
it kind of worked. the magic was the smallest UI around it:
- timeline of dials + retries
- "call me back" flags
- when it tried, who picked up
- short summaries with links to the raw transcript
once i could see the behavior, it stopped feeling spooky and started feeling useful.
so yeah, copilots are cool, but i want HUDs: quiet most of the time, glanceable, easy to interrupt, receipts for every action.
As the cost of tokens goes down, or commodity hardware can handle running models capable of driving these interactions, we may start to see these UIs emerge.
For example, if you are debugging memory leaks in a specific code path, you could get AI to write a visualisation of all the memory allocations and frees under that code path to help you identify the problem. This opens up an interesting new direction where building visualisations to debug specific problems is probably becoming viable.
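Python's stdlib already exposes the raw material such a visualisation would draw on; a sketch with tracemalloc, where `run_suspect_code_path` is hypothetical:

```python
import tracemalloc

tracemalloc.start()
before = tracemalloc.take_snapshot()
run_suspect_code_path()  # hypothetical: the path being investigated
after = tracemalloc.take_snapshot()

# Top ten lines by net new allocations; an AI-built HUD could render
# this diff as a live visualisation instead of a text dump.
for stat in after.compare_to(before, "lineno")[:10]:
    print(stat)
```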
This idea reminds me of Jonathan Blow's recent talk at LambdaConf. In it, he shows a tool he made to visualise his programs in different ways to help with identifying potential problems. I could imagine AI being good at building these. The talk: https://youtu.be/IdpD5QIVOKQ?si=roTcCcHHMqCPzqSh&t=1108
Like, we have HUDs - that's what a HUD is - it's a computer program.
What comes immediately to mind for me is using embeddings to show closest matches to current cursor position on the right tab for fast jumping to related files.
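A sketch of that lookup, assuming you've already embedded each file and the text around the cursor (numpy only; the embedding model itself is left abstract):

```python
import numpy as np

def related_files(cursor_vec, file_vecs, paths, k=5):
    # Cosine similarity between the cursor-context embedding and each file.
    sims = (file_vecs @ cursor_vec) / (
        np.linalg.norm(file_vecs, axis=1) * np.linalg.norm(cursor_vec)
    )
    return [paths[i] for i in np.argsort(-sims)[:k]]
```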
Orchestration platforms - Evolution of tools like n8n/Make into cybernetic process design systems where each node is an intelligent agent with its own optimization criteria. The key insight: treat processes as processes; don't anthropomorphize LLMs as humans. Build walls around probabilistic systems to ensure deterministic outcomes where needed (see the sketch after this list). This solves massive "communication problems".
Oracle systems - AI that holds entire organizations in working memory, understanding temporal context and extracting implicit knowledge from all communications. Not just storage but active synthesis. Imagine AI digesting every email/doc/meeting to build a living organizational consciousness that identifies patterns humans miss and generates strategic insights.
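One concrete reading of "walls around probabilistic systems", with all names hypothetical: an orchestration node's output isn't accepted until it clears a deterministic validator.

```python
import json

def guarded_step(llm_call, validate, retries=3):
    """Retry a probabilistic node until its output clears a deterministic wall."""
    for _ in range(retries):
        raw = llm_call()
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # not even well-formed; try again
        if validate(data):  # e.g. schema check, range check, referential check
            return data
    raise ValueError("no valid output within retry budget")
```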
I explored this more on my personal blog: https://henriquegodoy.com/blog/stream-of-consciousness
I see the value in HUDs, but only when you can be sure the output is correct. If accuracy is only 80% or so, copilots work better, since humans in the loop can review and course-correct: the pair programmer/worker. This is not to say AI inherently needs to reach higher levels of correctness, just that deployed systems need to do so before they display information on a HUD.
Just because most people are fond of it doesn't actually mean it improves their lives, goals, and productivity.
Aren't auto-completes doing exactly this? It's not a co-pilot in the sense of a virtual human, but it's already more in the direction of a HUD.
Sure you can converse with LLMs but you can also clearly just send orders and they eagerly follow and auto-complete.
I think what the author might be trying to express, in a quirky fashion, is that AI should work alongside us, looking in the same direction we are, not sitting opposite us at the table, staring at each other and arguing. We'll have true AI when it does our bidding without any interaction from us.
Recent coding interfaces are all trending towards chat agents though.
It’s interesting to consider what a “tab autocomplete” UI for coding might look like at a higher level of abstraction, letting you mold code in a direct-feeling way without being bogged down in details.