- When one of the agents does something wrong, a human operator needs to be able to intervene quickly and give the agent expert instructions. However, since the experts no longer execute the underlying tasks themselves, they quickly forget parts of their expertise. This means the experts need constant retraining, which leaves them little time to oversee the agent's work.
- Experts must become managers of agentic systems, a role they are not familiar with, so they no longer feel at home in their jobs. This problem is harder for the experts' own managers to recognize, since they rarely experience it first hand.
Indeed, the irony is that AI provides efficiency gains which, as they become more widely adopted, become more problematic because they sideline the necessary human in the loop.
I think this all means that automation is not taking away everyone's job: it makes things more complicated, so humans can still compete.
I was the only person in the factory who was a qualified welder.
In the same way, when everything just works, there will be no difference, but when something goes wrong, the person who learned the skills before will have a distinct advantage.
The question is whether AI gets good enough that slowing down occasionally to find a specialist is tenable. It doesn't need to be perfect; it just needs to be predictably not perfect.
Experts will always be needed, but they may be more like car mechanics: there to fix hopefully rare issues and provide the occasional tune-up, rather than building the cars themselves.
But the report was very wrong for months. Maybe longer. And since it was automated, the instinct to check and validate was gone. And tracking down the problem required extra work that hadn't been part of the Excel flow.
I use this example in all of my automation conversations to remind people to be thoughtful about where and when they automate.
If an AI generates a process more quickly than a human, and the process can be run deterministically, and the outputs are testable, then the process can run without direct human supervision after initial testing - which is how most automated processes work.
The testing should happen anyway, so any speed increase in process generation is a productivity gain.
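A minimal sketch of what I mean, assuming the generated process is just an ordinary pure function (normalize_phone() is a made-up stand-in for whatever the AI produced); the asserts are the one-time human review, and once they pass, the model's non-determinism no longer matters:

```python
# Hypothetical example: an AI-generated, deterministic process plus the
# human-written tests that pin down its behaviour before unsupervised use.
def normalize_phone(raw: str) -> str:
    """Normalize a US phone number to +1XXXXXXXXXX form."""
    digits = "".join(ch for ch in raw if ch.isdigit())
    if len(digits) == 10:        # add the country code if it's missing
        digits = "1" + digits
    return "+" + digits

def test_normalize_phone():
    # One-time review: after these pass, every future run behaves identically.
    assert normalize_phone("(555) 123-4567") == "+15551234567"
    assert normalize_phone("1-555-123-4567") == "+15551234567"
```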
Human monitoring only matters if the AI is continually improvising new solutions to dynamic problems and the solutions are significantly wrong/unreliable.
Which is a management/analysis problem, and no different in principle to managing a team.
The key difference in practice is that you can hire and fire people on a team, you can intervene to change goals and culture, and you can rearrange roles.
With an agentic workflow you can change the prompts, use different models, and redesign the flow. But your choices are more constrained.
That means that, with the current technology, there can never be a deterministic agent.
Now obviously, humans aren't deterministic either, but the error bars are a lot closer together than they are with LLMs these days.
An easy example to point at is the coding agent that removed someone's home directory, which was circulating around recently. I'm not saying a human has never done that, but it's far less likely because it's so far out of the realm of normal operations.
So as of today, we need humans in the loop. And this is understood by the people making these products. That's why they have all these permissions and prompts for you to accept/run commands and all of that.
And it would be far less likely that a human deleted someone else's home directory, and even if he did, there would at least be someone to be angry at.
What's the base rate of humans rm -rf'ing their own work?
[0] https://blog.toolprint.ai/p/i-asked-claude-to-wipe-my-laptop
That's literally exactly the kind of non-determinism I'm talking about. If he'd just left the agent to its own devices, the exact same thing would have happened.
Now you may argue this highlights that people make catastrophic mistakes too, but I'm not sure I agree.
Or at least, they don't often make that kind of mistake. Not saying that they don't make any catastrophic mistakes (they obviously do...).
We know people tend to click "accept" on these kinds of permission prompts with only a cursory read of what they're approving. And the more of these prompts you get, the more likely you are to just click "yes" or whatever to get through them.
If anything this kind of perfectly highlights some of the ironies referenced in the post itself.
So this is true on paper, but I can tell you that companies don't broadly do a very good job of being efficient. What they do a good job of is doing the bare minimum in a number of situations, generating fragile, messy, annoying, or tech-debt-ridden systems / processes / etc.
Companies regularly claim to make objective and efficient decisions, but often those decisions amount to little more than doing a half-assed job because it will save money and will probably be good enough. The "probably" does a lot of work here, and when "probably" turns out not to be good enough, there's a lot of blame shifting / politics / bullshitting.
The idea that companies are efficient is generally not very realistic except when it comes to things with real, measurable costs, such as manufacturing.
Is that not efficiency? ~ some managers I know
Also, by and large, the current AI tools are not in the critical path yet - well, except for those drones that lock onto targets to eliminate them in case of interference, and even then it is ML. Agents cannot be in that path yet due to predictability challenges.
I ask myself if I need to understand the code, and if the answer is yes I don’t use an LLM. It’s not a matter of discipline, it’s a sober view of what the minimal amount of work for me is.
Bainbridge's paper is tough to read on its own because it's so dense. It's just four pages long and worth following along:
https://ckrybus.com/static/papers/Bainbridge_1983_Automatica...
For example, see this statement in the paper: "the present generation of automated systems, which are monitored by former manual operators, are riding on their skills, which later generations of operators cannot be expected to have."
This summarizes the first irony of automation, which is now familiar to everyone on HN: using AI agents effectively requires an expert programmer, but to build the skills of an expert programmer, you have to do the programming yourself.
It's full of insights like that. Highly recommended!
And yet here we are, able to talk to a computer that writes PyTorch code orchestrating the complexity below it. And it even talks back coherently sometimes.
There's no need for ongoing, consistent human verification at runtime. Any problems with the implementation can wait for a skilled human to do whatever research is necessary to develop the specific system understanding needed to fix it. This is really not a valid comparison.
While one can in principle learn C as well as you say, in practice there are loads of cases of people being surprised by undefined behaviour and all the famous classes of bugs that C has.
I think it likely that the C programmer would even write the code faster and better because of the useful abstractions. An LLM will certainly write the code faster, but it will contain more bugs (IME).
It writes something that's almost, but not quite, entirely unlike PyTorch. You're putting a little too much value on a simulacrum of a programmer.
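For a hypothetical illustration of the kind of thing I mean (my own made-up snippet): the code below runs without any error, but nn.CrossEntropyLoss already applies log-softmax internally, so feeding it softmax outputs quietly dampens the gradients and training just gets worse, with nothing to tell you why.

```python
import torch
import torch.nn as nn

model = nn.Linear(16, 4)
criterion = nn.CrossEntropyLoss()
x, y = torch.randn(8, 16), torch.randint(0, 4, (8,))

probs = torch.softmax(model(x), dim=1)  # the subtle mistake: softmax here...
loss = criterion(probs, y)              # ...but CrossEntropyLoss expects raw logits
loss.backward()                         # runs fine, trains badly
```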
All of these AI outputs are both polluting the commons where they pulled all their training data AND are alienating the creators of these cultural outputs via displacement of labor and payment, which means that general purpose models are starting to run out of contemporary, low-cost training data.
So either training data is going to get more expensive because you're going to have to pay creators, or these models will slowly drift away from the contemporary cultural reality.
We'll see where it all lands, but it seems clear that this is a circular problem with a time delay, and we're just waiting to see what the downstream effect will be.
No dispute on the first part, but I really wish there were numbers available somehow to address the second. Maybe it's my cultural bubble, but it sure feels like the "AI Artpocalypse" isn't coming, in part because of AI backlash in general, but more specifically because people who are willing to pay money for art seem to strongly prefer that their money goes to an artist, not a GPU cluster operator.
I think a similar idea might be persisting in AI programming as well, even though it seems like such a perfect use case. Anthropic released an internal survey a few weeks ago saying that the vast majority, something like 90%, of their own workers' AI usage was spent explaining and learning about things that already exist, or doing little one-off side projects that otherwise wouldn't have happened at all because of the overhead - like building little dashboards for a single dataset, stuff where the outcome isn't worth the effort of doing it yourself. For everything that actually matters and would be paid for, the premier AI coding company is using people to do it.
When AI tops the charts (in country music) and digital visual artists have to basically film themselves working to prove that they're actually creating their art, it's already gone pretty far. It feels like even when people care (and the great mass do not), it creates problems for real artists. Maybe they will shift to some other forms of art that aren't so easily generated, or maybe they'll all just do "clean up" on generated pieces and fake brush sequences. I'd hate for art to become just tracing the outlines of something made by something else.
Of course, one could say the same about photography where the art is entirely in choosing the place, time, and exposure. Even that has taken a hit with believable photorealistic generators. Even if you can detect a generator, it spoils the field and creates suspicion rather than wonder.
Businesses which don't want to pay money strongly prefer AI.
Does it then really speed us up and generally make things better?
But we are in the later generation now. All the 1983 operators are now retired, and today's factory operators have never had the experience of 'doing it by hand'.
Operators still have skills, but it's 'what to do when the machine fails' rather than 'how to operate fully manually'. Many systems cannot be operated fully manually under any conditions.
And yet they're still doing great. Factory automation has been wildly successful and is responsible for why manufactured goods are so plentiful and inexpensive today.
But operator inexperience didn't turn out to be a substantial barrier to automation, and they were still able to achieve the end goal of producing more things at lower cost.
This couldn't ring more true. For decades now.
For a couple of years there I was able to get some ML together and it helped me get my job done. It never came close to AI; I only had kilobytes of memory anyway.
By the time 1983 rolled around I could see the writing on the wall: AI was going to take over a good share of automation tasks in a more intelligent way by bumping expert systems up a notch. Sometimes this is going to be a quantum notch, and at the rarefied upper bound it could end up like "expertise squared" or "productivity squared" [0]: using programmable electronics to multiply the abilities of the true expert, while the expert simultaneously uses their abilities to multiply the effectiveness of the electronics. Maybe the apex is only reached when the most experienced domain expert does the programming, or at least runs the show.
Never did see that paper, but it was obvious to many.
I probably mentioned this before, but that's when I really buckled down for a lifetime of experimental natural science across a very broad range of areas which would be more & more suitable for automation, while operating professionally within a very narrow niche where personal participation would remain the source of truth long enough for compounding to occur. I had already been a strong automation pioneer in my own environment.
So I was always fine regardless of the overall automation landscape, and spent the necessary decades across thousands of surprising edge cases getting an idea of how I would make it possible for someone else to even accomplish some of these difficult objectives, or perhaps one day fully automate them - if the machine intelligence ever got good enough, along with the other electronics, which is one of the areas I was concentrating on.
One of the key strategies did turn out to be outliving those who had extensive troves of their own findings, but I really have not automated that much. As my experience level becomes less common, people seem to want me to perform in person with greater desire every decade :\
There's related concepts for that too, some more intelligent than others ;)
[0] With a timely nod to a college room mate who coined the term "bullshit squared"
But what most of them do is not become more efficient, but be seen to be more efficient. The main reason they are so obsessed with AI is that they want to send the signal that they are pursuing efficiency, whether they succeed or not.
Being credibly efficient at doing the wrong things turns out to be a massive issue inside most companies. What's interesting is that I do think AI gives an opportunity to be massively more effective, because with the right LLM, trained right, you can explore a variety of scenarios much faster than you can by yourself. However, we hear very little about this as a central thrust of how to bring AI into the workplace.
Yet, pilots are constantly trained on actual scenarios and are expected to land airplanes manually every month (and handle take-offs too).
This ensures pilots maintain their skills, while the auto pilot helps most of the time.
On top of that, plane commands often are half automatic already, aka they are assisted (but not by LLMs!), so it’s a complex comparison.
Don’t get me wrong - manual practice is in some sense the correct solution, and I plan to try and do it myself in the next decade to make sure my skills stay sharp. But I don’t see the industry broadly encouraging it, still less making it mandatory as aviation does.
Addendum: as you probably know, even in aviation, this is hard to get right. (This is sometimes called the "children of the magenta" problem, but it's really Bainbridge again.) The most famous example is perhaps Air France Flight 447[0], where the pilots put the plane into a stall at 35,000 ft when they reacted poorly after the autopilot disconnected, and did not even realize they had stalled the plane. Of course, that crash itself led to more regulation around training for manual scenarios too.
[0] https://admiralcloudberg.medium.com/the-long-way-down-the-cr...
I work at a firm that has given AI tooling to non-developer, data-analyst type people who otherwise live & die in Excel. Much of their day job involves reading PDFs. I occasionally use some of the firm's AI tooling for PDF summarizing/parsing/interrogation-type tasks and remain consistently underwhelmed.
Stuff like taking 10 PDFs, each with a simple 30-row table under the same title, and it ends up puking on 3-4 out of 10 with silent failures: row drops, duplicated data, etc. When you point out it has missed rows, it goes back and duplicates rows to get to the correct row count.
Using it to interrogate standard company-filings PDFs that it has been specially trained on, it gave very convincing answers which were wrong because it had silently truncated its search context to only recent years' financial filings. Nowhere did it show this limitation to the user. It only became apparent after researching the 4th or 5th company, when it decided to caveat its answer with its knowledge window. That invalidated the previous answers, as questions such as "when was the first X" or "have they ever reported Y" were operating on incomplete information.
Most users of these tools are not that technical, and are going to be much more naive in taking the answers as fact without considering the context.
For example, imagine describing what files you want to find, and getting back a command-line string of find/grep piping. It doesn't execute anything without confirmation, it doesn't "summarize" the results, it's just a narrow tutor to help people in a translation step. A tool for learning that, ideally, eventually puts itself out of a job.
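A rough sketch of that shape of tool (llm_translate() below is a placeholder for whichever model or client you'd actually call); the one property that matters is that nothing executes without an explicit confirmation:

```python
import subprocess

def llm_translate(request: str) -> str:
    """Placeholder: ask a model to turn a plain-English request into a
    find/grep pipeline. Plug in whatever LLM client you actually use."""
    # e.g. "markdown files mentioning invoices"
    #   -> "find . -name '*.md' -exec grep -l -i invoice {} +"
    raise NotImplementedError

def propose_and_confirm(request: str) -> None:
    command = llm_translate(request)
    print("Proposed command:\n  " + command)
    # Nothing runs unless the user explicitly opts in.
    if input("Run it? [y/N] ").strip().lower() == "y":
        subprocess.run(command, shell=True, check=False)
    else:
        print("Not executed.")
```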
Returning to your PDF scenario: The LLM could help people weave together regular tools of "find regions with keywords" and "extract table as spreadsheet" and "cross-reference two spreadsheets using column values", etc.
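As a sketch of that weaving (pdfplumber and pandas are just stand-ins for "regular tools", and the account_id column is invented for the example), an explicit row-count check is what turns the silent failures described above into loud ones:

```python
import pdfplumber
import pandas as pd

def extract_table(path: str, expected_rows: int) -> pd.DataFrame:
    """Pull the single table out of a one-table PDF and verify its size."""
    with pdfplumber.open(path) as pdf:
        table = pdf.pages[0].extract_table()  # first table on the first page
    header, body = table[0], table[1:]
    df = pd.DataFrame(body, columns=header)
    if len(df) != expected_rows:
        # Fail loudly instead of silently dropping or duplicating rows.
        raise ValueError(f"{path}: expected {expected_rows} rows, got {len(df)}")
    return df

# Cross-reference two extracted tables on a shared (invented) column:
# left = extract_table("filing_2023.pdf", expected_rows=30)
# right = extract_table("filing_2024.pdf", expected_rows=30)
# merged = left.merge(right, on="account_id", how="outer", indicator=True)
```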
If you're bad at your job, you're automating it at lightning speed.
You need to have good business processes and be good at your job without AI in order to have any chance in hell of being successful with it. The idea that you can just outsource your thinking to the AI and don't need to actually understand or learn anything new anymore is complete delusion.
However, this took 40 years and actual fatalities. We should keep that in mind when we're pushing the AI acceleration pedal down ever harder.
I question this.
“But at what cost?”
We've all accepted calculators into our lives as being faster and correct when used correctly (minus the Intel tomfoolery), but we emphasize the need to know how to do the math in educational settings.
Any adult out of education will confirm, when confronted with an out-of-practice math problem (or any lapsed skill), that there is a wait time to revive the ability.
Programming automation having that potential for skill decay AND being on the critical path is … worth thinking about.
Using a slide rule meant inherently knowing order of magnitude, rounding, and precision. Once calculators made it easy, they enabled both new kinds of solutions and new kinds of errors (which you then have to separately teach people to avoid).
At the same time, I basically agree. Humans are very bad calculators and we've needed tools (abacus) for millennia.