It says they recruited participants from the US through Prolific and paid them £10.12 per hour, so probably more like the latter.
On the other hand, if you're content with your pre-existing predictions about what would happen, which I think is actually a reasonable position, there's no reason to read the paper.
Social media has turned into cancer. It'd be riveting to watch it turn into bots talking to other bots. Social media wouldn't go away, but I get the feeling people will engage more with real life again.
As the platforms see less growth and fewer real users, we might even see a return to protocols and open standards instead of monolithic walled gardens.
Oh well not being a plumber, electrician, or farmer... but our society's current productivity, technology, and automation have reduced the share of the population that needs to be farmers from 80% to 1.3% in the US. Can you imagine what the equivalent of 1 billion digital engineers unlocks in understanding and implementing robotics?
For the record, I always thought Kurzweil and that crowd were clowns; now I think I was the one who was wrong.
Give it a decade though and people won't think twice about it, though I do hope we'd still do that kind of thing ourselves
Is GPT-4xyz better than the last one? I'm sure some benchmark numbers say that. But the number of applications where occasional hallucinations don't matter is very small, and where it matters nothing really changed. Companies are trying to use it for customer support but that predictably turned out to be a legal risk. Klarna went all-in on AI and regrets it now.
Some media are talking about Microsoft writing 30% of their new code with AI, but what Nadella actually said is less impressive: "maybe 20-30% of the code that is inside of our repos today in some of our projects are probably all written by software". Which, coincidentally, is the ratio of code that can be autocompleted by an IDE without an LLM, according to JetBrains.
I have yet to see any evidence that anything will change way faster than it ever has, aside from the readiness of many younger people to use it in everyday life for things it really shouldn't be used for.
I tend to think that it’ll have an optimistic ending. The key to solving most political problems is eliminating scarcity.
A simple case I have found is looking for existing terms or creating new ones. Say I have a series of concepts whose names follow a nice linguistic pattern that emphasizes their close relationship, except for one. I can describe the regularly named concepts, then ask for suggestions for the remaining concept.
The LLM pulls from virtually every topic with domain terminology, repurposable languages (Greek, Latin), words from fiction, all the way to creative construction of new words, tenses, etc., to come up with great proposals in seconds.
I could imagine that crafting persuasive wording would be a similar challenge. Choosing the right words, right phrasing, etc. to carry as much positive connotation, implication of solidity, avoiding anything sounding challenging or controlling, etc. from all of human language and its huge space of emotional constraints and composites.
Very shallow but very wide reasoning/searching/balancing done in very little time.
And with an ability to avoid giving any unnecessary purchase for disagreement, being informed of the myriad typical and idiosyncratic ways people get hung up in failed persuasions, whether in general or related to a specific topic.
LLM generated writing can be stereotypical.
But the more constraints put on the requested material, the more their ability to construct really original, high-quality, or even cleverly unique prose in real time shines.
I called the first one “pre” since it “precedes” the others.
I called the last one “pro” since it “proceeds” from the others.
These are somewhat poetic/whimsical terms, for an abstract arrangement, but it’s nice to have good terms even for non-serious stuff.
I couldn’t come up with an equally concise term for the middle.
Claude came up with “per” as in “through” for the middle thing.
Couldn’t be more fitting.
I think it would be difficult to truly convince me to answer differently in a test with 14 words where 30 would have enough space to actually convey an argument.
I would be very interested to see the test rerun while limiting LLM response length or encouraging long responses from humans.
The test already incentivises being persuasive! If writing more words would do that, and the incentivised human persuaders don't write more words and the LLMs do, then I think it's fair to say that LLMs are more persuasive than incentivised human persuaders.
I always try to pick out as many tidbits as possible from papers that might be applicable in other situations. I think the main difference, word count, may be overshadowing other insights that may be more relevant to longer form argumentation.
I don’t know if that would have the effect you want. And if you’re more likely to get hallucinations at lower word counts, that matters for those who are scrupulous, but many people trying to convince you of something believe the ends justify the means, and that honesty or correspondence to reality are not necessary, just nice to have.
Asking chatbots for short answers can increase hallucinations, study finds - https://news.ycombinator.com/item?id=43950684 - May 2025 (1 comment)
which is reporting on this post:
Good answers not necessarily factual answers: analysis of hallucination in LLMs - https://news.ycombinator.com/item?id=43950678 - May 2025 (1 comment)
I do think it's distinctly possible that LLMs will be much less convincing due to increased hallucinations at a low word count. I also think that may have less of an effect for dishonest suggestions. Simply stating a lie confidently is relatively effective.
I would prefer advising humans to increase length rather than restricting LLMs because of the cited effects.
I would advise the opposite to humans, as your advice is playing to the strengths of AI/LLMs and away from the strengths of humans versus AI/LLMs.
The given study does not show any strength of humans over LLMs. Both goal metrics (truthful and deceptive) are better for LLMs than humans. If you are misinterpreting my advice as general advice for people not under the study's conditions, I would want to see the results of the proposed rerun before suggesting that.
However, if length of text is legitimately convincing regardless of content, I don't know why humans should avoid using that. If LLMs end up more convincing to humans than other humans simply because humans are too prideful to make their arguments longer, that seems like the worst possible future.
People aren’t too proud to make long arguments; long arguments just take humans more time and effort to make. Historically, humans have therefore subconsciously considered longer arguments more intellectually rigorous, whether they are or not, so the length of a written piece is used as a kind of lazy heuristic for quality. When we're comparing the output of humans to that of other humans, this kind of approach may work to a certain extent, but AI/LLMs seem to be better than humans at writing long pieces of text on demand. That humans find the LLM output more convincing if it is longer is not surprising to me, but I’ll agree with you that it isn’t a good sign either. The metric has become a target.
> ChatGPT, I choose YOU!
ChatGPT uses GISH GALLOP.
Let the real scientists settle things for us?
Alternatively, many people today let opinion media and/or their associated group allegiances settle what they believe. Much worse.
This creates a very poorly designed tool! A good tool should fail as loudly as possible, in that it alerts the user of the failure and does its best to specify the conditions that led to this. This isn't always possible, but if you look at physical engineers you'll see that this is where they spend a significant portion of their time. Even in software I'd argue we do a lot here, but also that it is easy to brush off (we all love those compiler messages... right?). Clearly right now LLMs are in a state where we don't know how to make their failures more visible, and honestly, that is okay. But what is not okay is to pretend that this is not current reality and pretend that there are no dangers or consequences that this presents. We dismiss this because we catch some obvious errors and over-generalize the error quality, but that just means we suffer from Murray Gell-Mann Amnesia. It's REALLY hard to measure what you don't know. Importantly, we can't even begin to resolve these issues and build the tools we want (the ones we pretend these are!) if we ignore the reality of what we have. You cannot make things better if you are unwilling to recognize their limitations.
Everyone here is an engineer, researcher, or builder. This framework of thinking should be natural to us! We should also be able to understand that there's a huge difference between offering critiques and naming limitations on the one hand and dismissing things on the other. I'm an AI critic, but also very optimistic. I'm a researcher spending my life working on this topic. It'd be insane to do such a thing if I thought it was a fruitless or evil effort. But it would be equally insane to pursue a topic with pure optimism. If I were to blind myself to limits and paint everything as a trivial-to-solve problem, I'd never be able to solve any of those problems. Ignoring or dismissing technical issues and limitations is the domain of the MBA managers, not engineers.
There's so much dirty subliminal or informal advertising that you can do with these things.
metalcrow•4h ago
CJefferson•4h ago
Most people can’t lie that smoothly, and most readers don’t check carefully, unless they are already an expert in the area.
Any kind of maths proof is particularly bad: they will look convincing and clear until you read them very carefully and see all the holes.
armchairhacker•4h ago
EDIT: LLMs also aren't egocentric; they'll respond in the other person's style (grammar, tone, and perhaps maintain their "subtext" like assumptions), and they're less likely to omit important information that would be implicit to them but not the other person.
louthy•3h ago
That’s a queue, not a stack. The LLM response was correct.
abtinf•3h ago
The structure of the LLM answer is:
A is B; B exhibits property C.
The correct answer is:
A exhibits property C; B is the class of things with property C; therefore A is B.
There is a crucial difference between these two.
justonceokay•2h ago
https://youtu.be/LMO27PAHjrY
cwmoore•2h ago
OJFord•1h ago
What is the point of that? They're incomprehensible. (For those who haven't watched it: the video just shows people talking very fast, it doesn't explain why, kind of implies it's somehow good or impressive.)
nimih•42m ago
OJFord•36m ago
Again/stepping back: what is the point of winning a debate tournament like this, or that values this 'debate'?
1oooqooq•1h ago
upghost•2h ago
api•1h ago
Are there no rules in debates? There should be. You’re not allowed to punch someone in basketball so why should you be allowed to DOS people with bullshit in a debate?
Der_Einzige•29m ago
They’ve been quietly open sourcing all of their arguments for like 20+ years.
This dataset is so large and good entirely because of speed reading and the current state of debate tournament competitive dynamics. Spreading might be objectively absurd to listeners but the effects of it are literally good for society.
https://arxiv.org/abs/2406.14657
https://huggingface.co/datasets/Yusuf5/OpenCaselist