I can't imagine we'll really be able to trust AI without its use in open source software where we can see how reliable it is.
AI bug reports went from junk to legit overnight, says Linux kernel czar (theregister.com)
58 points by amarant 4 days ago
Odd sentiment. It's pretty clear the tools crossed a threshold last year (in April as I recall) where they became good enough to actually write entire applications, and just accelerated from there. Today they're amazing and no-one I know is writing artisanal code anymore (at least, not at work).
supernes•1h ago
georgemcbay•1h ago
A lot of people seem stuck with their older (correct at the time) views of them still always producing slop.
FWIW I am more of an AI doomer (in the sense that I think the economic results from them will be disastrous for knowledge workers given our political realities) than booster, but in terms of utility to get work done they did pass a clear inflection point quite recently.
bluefirebrand•1h ago
So, still pretty likely to produce slop in a large majority of cases.
If the most useful place for them is where you've already specced things out to that degree of precision then they aren't that useful?
Speccing things to that precision is the time-consuming and difficult work anyways, after all.
georgemcbay•1h ago
I wish this wasn't true because I think it will economically upend the industry in which I have a career, but sadly the universe doesn't care what I wish.
mjr00•44m ago
IMO this vastly overestimates how good the "untrained masses" are at thinking in a logical, mathematical way. Apparently something as basic as Calculus II has a fail rate of ~50% in most universities.
xyzelement•22m ago
embedding-shape•4m ago
If they truly did, there wouldn't be a huge number of humans whose role is basically "Take what users/executives say they want, and figure out what they REALLY want, then write that down for others".
Maybe I've worked for too many startups, and only consulted for larger companies, but everywhere in businesses I see so many problems that are basically "Others misunderstood what that person meant" and/or "Someone thought they wanted X, they actually wanted Y".
PhilipRoman•3m ago
ramesh31•1h ago
Opus 4.6 has been a step change. It's simply never wrong anymore. You may need to continue giving it further clarification as to what you want, but it never makes mistakes with what it intends to do now.
binarymax•56m ago
brcmthrowaway•45m ago
binarymax•20m ago
Balinares•12m ago
I do agree that the Q1 2026 models in general have passed a threshold, but goodness almighty Opus 4.6 still screws up a lot.
throwaway2027•33m ago