What a horrible world we live in where the author of great writing like this has to sit and be accused of "being AI slop" simply because they use grammar and rhetoric well.
If an LLM wrote that, then I no longer oppose LLM art.
One of AI’s strengths is definitely exploration, e.g. in finding bugs, but it still has a high false positive rate. Depending on context, that may or may not matter.
Also, one has to be aware that there are a lot of bugs that AI won’t find but humans would.
I don’t have the expertise to verify this bug actually happened, but I’m curious.
But, I do think their explanation of the lock acquisition and the failure scenario is quite clear and compelling.
The intro says “We used Claude and Allium”. Allium looks like a tool they’ve built for Claude.
So the article is about how they used their AI tooling and workflow to find the bug.
If anything, if you try to cram a ton of complexity into a few kb of memory, the likelihood of introducing bugs becomes very high.
One of the more interesting things they have been working on is a potential re-interpretation of the infamous 1202 alarm. As of this writing, it is popularly described as something related to nonsensical readings from a sensor, which could be (and were) safely ignored during the actual moon landing. However, if I remember correctly, some of their investigation revealed that there were actually many conditions under which that error would have been extremely critical and would likely have doomed the astronauts. It is super fascinating.
josephg•2h ago
ModernMech•1h ago
voodooEntity•1h ago
therefore decided not to use any LLM for blogging again, and even though it takes a lot more time without one (I'm not a very motivated writer), I prefer to release something I wrote myself rather than some LLM output that I wouldn't read myself.
embedding-shape•1h ago
croemer•1h ago
Another one: "Two instructions are missing: [...] Four bytes."
One more: "The defensive coding hid the problem, but it didn’t eliminate it."
monooso•1h ago
This insistence that certain stylistic patterns are "tell-tale" signs that an article was written by AI makes no sense, particularly when you consider that whatever stylistic tics an LLM may possess are a result of it being trained on human writing.
gcr•1h ago
For what it’s worth, Pangram reports that Marcus’ article is 100% LLM-written: https://www.pangram.com/history/640288b9-e16b-4f76-a730-8000...
croemer•1h ago
embedding-shape•50m ago
360MustangScope•1h ago
Even though they are perfectly good for writing down thoughts and notes.
croemer•1h ago
brookst•5m ago
“An em dash… they’re a witch!”… “it’s not just X, it’s Y… they’re a witch!”
croemer•1h ago
My hunch that this is substantially LLM-generated is based on more than that.
In my head it's like a Bayesian classifier, you look at all the sentences and judge whether each is more or less likely to be LLM vs human generated. Then you add prior information like that the author did the research using Claude - which increases the likelihood that they also use Claude for writing.
Maybe your detector just isn't so sensitive (yet), or maybe I'm wrong, but I have pretty high confidence that at least 10% of the sentences were LLM-generated.
Yes, the stylistic patterns exist in human speech, but RLHF has increased their frequency. Also, LLM writing has a certain monotony that human writing often lacks. Which is not surprising: the machine generates more or less the most likely text in an algorithmic manner. Humans don't. They write a few sentences, then get a coffee, sleep, write a few more. That creates more variety than an LLM can.
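The mental model above can be sketched as a naive-Bayes-style log-odds sum: a prior, multiplied by a likelihood ratio per sentence. All of the numbers here are invented for illustration; no real detector is calibrated this way.

```python
import math

# Hypothetical per-sentence likelihood ratios P(sentence | LLM) / P(sentence | human).
# Values > 1 favor LLM authorship; these numbers are made up for illustration.
sentence_odds = [1.8, 0.9, 2.5, 1.1, 0.7, 3.0]

# Assumed prior odds that the article is LLM-written (above even odds here,
# since the author already used Claude for the research).
prior_odds = 2.0

# Naive-Bayes aggregation: sum the log-odds, then convert back to a probability.
log_odds = math.log(prior_odds) + sum(math.log(r) for r in sentence_odds)
posterior = 1 / (1 + math.exp(-log_odds))
print(f"posterior P(LLM-written) = {posterior:.2f}")
```

The point of the sketch is only that many individually weak per-sentence signals, combined with a prior, can add up to high confidence, which is how this kind of gut judgment works.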
Fun exercise: https://en.wikipedia.org/wiki/Wikipedia:AI_or_not_quiz
monooso•1h ago
Someone probably spent a lot of time and effort planning, thinking about, and writing an interesting article, and then you stroll by and casually accuse them of being a bone-idle cheat, with no supporting evidence other than your "sensitive detector" and a bunch of hand-wavy nonsense that adds up to naught.
kenjackson•14m ago
bookofjoe•7m ago
If there is constant vigilance on the part of the reader as to how it was created, meaning and value become secondary, a sure path to the death of reading as a joy.
oscaracso•55m ago
brookst•9m ago
tapoxi•1h ago
croemer•1h ago
TruffleLabs•1h ago
croemer•1h ago
In fact, the latter is the opposite of terseness. LLMs love telling you what things are not, far more than people do.
See https://www.blakestockton.com/dont-write-like-ai-1-101-negat...
(The irony that I started with "it's not just" isn't lost on me)
gcr•1h ago
xmcqdpt2•1h ago
DiffTheEnder•1h ago
I don't understand how these tools exist.
gcr•50m ago
They found that Pangram suffers from false positives in non-prose contexts like bibliographies, outlines, formatting, etc. The article does not touch on Pangram’s false negatives.
I personally think it’s an intractable problem, but I do feel pangram gives some useful signal, albeit not reliably.
cameronh90•1h ago
What's making it even more difficult to tell now is people who use AI a lot seem to be actively picking up some of its vocab and writing style quirks.
embedding-shape•56m ago
Not sure how I feel yet about the whole "LLMs learned from human texts, so now the people who helped write those human texts are accused of plagiarizing LLMs" thing, but so far it seems backwards, and like a low-quality criticism.
snapcaster•30m ago
jnwatson•22m ago
croemer•45m ago
It seems to look at sections of ~300 words. For at least one section, it has low confidence.
I tested it by getting ChatGPT to add a paragraph to one of my sister comments. Result is "100% human" when in fact it's only 75% human.
Pangram test result: https://www.pangram.com/history/1ee3ce96-6ae5-4de7-9d91-5846...
ChatGPT session where it added a paragraph that Pangram misses: https://chatgpt.com/share/69d4faff-1e18-8329-84fa-6c86fc8258...
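A toy sketch of how fixed-window scoring can miss a mixed document: if the LLM-written passage straddles window boundaries, no single window crosses the decision threshold. The 300-word window matches the comment above, but the 60% per-window threshold and word counts are assumptions for illustration, not Pangram's actual parameters.

```python
# Each element stands in for one word; 25% of the text is LLM-written,
# placed so it straddles two 300-word windows.
window = 300
text = ["human"] * 450 + ["llm"] * 300 + ["human"] * 450

# Score fixed, non-overlapping windows; flag any window that is
# majority-LLM past an assumed 60% threshold.
windows = [text[i:i + window] for i in range(0, len(text), window)]
flagged = sum(1 for w in windows if w.count("llm") / len(w) > 0.6)

llm_share = text.count("llm") / len(text)
print(f"LLM share: {llm_share:.0%}, windows flagged: {flagged}/{len(windows)}")
# No window crosses the threshold, so the whole text passes as "human".
```

Each straddling window is only 50% LLM, so a per-window detector reports nothing even though a quarter of the document is generated, consistent with the "100% human / actually 75% human" result above.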
gcr•28m ago
timdiggerm•30m ago
Aurornis•8m ago
It’s becoming a problem in schools as teachers start accusing students of cheating based on these detectors, or ignoring obvious signs of AI use because the detectors don’t trigger on it.
ChrisRR•1h ago
monooso•1h ago
NiloCK•1h ago
It is:
- sneering
- a shallow dismissal (please address the content)
- curmudgeonly
- a tangential annoyance
All things explicitly discouraged in the site guidelines. [1]
Downvoting is the tool for items that you think don't belong on the front page. We don't need the same comment on every single article.
[1] - https://news.ycombinator.com/newsguidelines.html
monooso•1h ago
masklinn•51m ago
You can’t downvote submissions. That’s literally not a feature of the site. You can only flag submissions, and only if you have more than 31 karma.
NiloCK•38m ago
Optimistically, I guess I can call myself some sort of live-and-let-live person.
timdiggerm•28m ago
TruffleLabs•1h ago
mpalmer•57m ago
The short sentence construction is the most suspicious, but I actually don't see anything glaring. It normally jumps out and hits me in the face.
rudhdb773b•49m ago
It seems like almost every discussion has at least someone complaining about "AI slop" in either the original post or the comments.
Gigachad•8m ago
There is some real content in the haystack, but we almost need some kind of curator to find and display it rather than a vote system where most people vote on the title alone.