LLMs are transformative, but a lot of the tools around them already treat them as opaque function calls. Instead of piping text to sed, awk, or xargs, we’re just piping streams to these functions. The analogy stretches to cover audio and video too. But that’s boring, and it doesn’t explain why you suddenly have to pay more for Google Workspace just to get bombarded by AI slop.
This isn’t to undermine the absolutely incredible achievements of the people building this tech. It’s to point out the absurdity of the sales pitch from investors and benefactors.
But we’ve always been like this. Each new technology promises the world, and many even manage to deliver. Or is it that they succeed only because they overpromise and draw massive attention and investment in the first place? IDK, we’ll see.
if you only aim for the moon, you’ll never break orbit.
You launch for standard equatorial low orbit, then coast until the Moon is about 90 degrees to the right / behind you, and burn ahead until you run out of fuel in the second stage. This turns your orbit into an elongated ellipse, and as you get close to its far end, the Moon will catch up and capture you in its gravity well.
When playing one of the non-sandbox aka. "progression" modes, i.e. "career" or "science", the trajectory planning tools are locked behind an expensive upgrade, effectively forcing you to do your first Moon fly-by this way.
ty i shall pass that rite of passage some day and eyeball my way to the moon
That's the real conclusion of the Apple paper. It's correct. LLMs are terrible at arithmetic, or even counting. We knew that. So, now what?
It would be interesting to ask the "AI system" to write a program to solve such puzzle problems. Most of the puzzles given have an algorithmic solution.
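For example, the Tower of Hanoi puzzle from the paper has a textbook recursive solution; here is a minimal Python sketch of the kind of program one would want the model to write, rather than enumerating moves token by token:

```python
# Minimal sketch: Tower of Hanoi, one of the puzzles used in the paper.
# The optimal move sequence falls out of a short recursion; a model that
# writes this program sidesteps "reasoning" through all 2^n - 1 moves.
def hanoi(n, src="A", dst="C", aux="B"):
    if n == 0:
        return []
    return (hanoi(n - 1, src, aux, dst)
            + [(src, dst)]                 # move the largest disk
            + hanoi(n - 1, aux, dst, src))

moves = hanoi(4)
print(len(moves), moves)   # 15 moves for 4 disks (2^4 - 1)
```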
This may be a strategy problem. LLMs may need to internalize Pólya's How to Solve It.[2] Read the linked Wikipedia article. Most of those are steps an LLM can do, but a strategy controller is needed to apply them in a useful order and back off when stuck.
The "Illusion of thinking" article is far less useful than the the Apple paper.
(Did anybody proofread the Apple paper? [1] There's a misplaced partial sentence in the middle of page 2. Or a botched TeX macro.)
[1] https://ml-site.cdn-apple.com/papers/the-illusion-of-thinkin...
But testing via coding algos for known puzzles is problematic, as the code may be in the training set. Hence you need new puzzles, which is kinda what ARC was meant to do, right? Too bad OpenAI lost credibility for that set by having access to it while "verbally promising" (lol) not to train on it, etc.
Which version of `bc` was the LLM trained on? Perhaps this is not so important for a stable tool that doesn't change much, but it's critical for many programs and programming libraries. I've lost count of the number of times an LLM generated code that doesn't work with the current version of the library I'm using. In some cases you can tell it to use a specific version, or even feed it the documentation for that version, but that often fails to correct the issue.
And this is without considering the elephant in the room: hallucination. LLMs will often mix up or invent APIs that don't exist. Apologists will say that agents are the solution, and that feeding the error back to the LLM will fix it, but that is often not the case either. That might even make things worse by breaking something else, especially in large-context sessions.
Nothing has convinced me that LLMs will ever be good at math on their own.
- Comment on The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity https://arxiv.org/abs/2506.09250
I don't care about benchmarks, nor LLMs' capability to solve puzzles. This is the first thing AI companies optimize their training for, which is misleading and practically false advertising.
I care about how good LLMs are at helping me with specific tasks. Do they generate code that appears correct on the surface, yet on closer inspection has security and performance issues, is unnecessarily complex, or often doesn't even compile, and so takes me more time to troubleshoot and fix than writing it myself would? Do they explain concepts confidently while being wrong, in ways I have no way of catching unless I'm a domain expert? Do they repeat all these issues even after careful re-prompting and with all the contextual information they would need? Does all this waste more of my time than it saves?
The answer is yes to all of the above.
So while we can argue whether LLMs can think and reason, my personal experience tells me that they absolutely cannot, and that any similarity to what humans can do is nothing but an illusion.
this is literally what they are
Can we consider AI conscious in a way similar to how Hardy recognised Ramanujan's genius? That is, if the AI weren't conscious, it wouldn't have the imagination to write what it wrote.
https://www.radiolab.org/podcast/161744-loops/transcript
*The Prisoner of Memory*
In August 2010, Mary Sue Campbell became trapped in the most peculiar prison—a 90-second loop of her own consciousness. After experiencing transient global amnesia, she lost the ability to form new memories, resetting every minute and a half like a human record player stuck on repeat.
What made her condition haunting wasn't just the forgetting, but the eerie predictability. Every cycle, she asked the same questions in identical phrasing: "What's the date?" followed by "My birthday's already passed? Darn." She'd listen to her daughter's explanation, widen her eyes at the mention of paramedics, and declare "This is creepy!" The nurses began mouthing her words along with her.
Dr. Jonathan Vlahos, who treated similar cases, found this mechanical repetition deeply unsettling. It suggested that beneath our illusion of free will lies something more algorithmic—that given identical inputs, the brain produces identical outputs. Mary Sue's loop revealed the uncomfortable truth that our sense of spontaneous choice might be far more fragile than we imagine.
This part doesn't seem correct. Every 90 seconds she has completely different inputs: sights, sounds, the people she's around, etc. It seems to have more to do with her unique condition than with any sort of revelation about free will.
There are complexities beyond that, of course. It can compare stuff, and it can iterate through multiple steps ("reason", though when you look at what it's doing, you definitely see how that term is a bit of a stretch). Lots of emergent applications are yet to be discovered.
But yeah, it's math (probabilities to generate the next token) and code (go run stuff). The best applications will likely not be the ones that anthropomorphize it, but take advantage of what it really is.
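To make "probabilities to generate the next token" concrete, here's a toy sketch (made-up vocabulary and logits, not any real model's API): softmax turns the model's scores into a distribution, and the next token is sampled from it.

```python
import numpy as np

# Toy illustration of next-token sampling: the model emits a score (logit)
# per vocabulary item; softmax turns scores into probabilities; we sample.
vocab = ["the", "cat", "sat", "on", "mat"]
logits = np.array([2.0, 1.1, 0.3, -0.5, 0.7])   # made-up scores
temperature = 0.8                                # lower = more deterministic

probs = np.exp(logits / temperature)
probs /= probs.sum()

next_token = np.random.choice(vocab, p=probs)
print(dict(zip(vocab, probs.round(3))), "->", next_token)
```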
The models make many mistakes and are not great software architects, but watch one or even two models work together to solve a bug and you'll quickly rethink "text transformers".
If we ever develop a LLM that's able to apply symbolic logic to text, for instance to assess an argument's validity, develop an accurate proof step by step, and do this at least as well as many human beings, then I'll concede that we've invented a reasoning machine. Such a machine might very well be a miraculous invention in this age of misinformation, but I know of no work in that direction and I'm not at all convinced it's a natural outgrowth of LLMs which are so bad at math (perhaps they'd be of use in the implementation).
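To be concrete about what "assess an argument's validity" means mechanically, here's a toy brute-force checker for the propositional case (the `valid` helper and the lambda encoding are my own illustration, not an existing tool): an argument is valid iff no truth assignment makes all premises true and the conclusion false.

```python
from itertools import product

# Brute-force truth tables: an argument is valid when the conclusion holds
# in every assignment where all premises hold.
def valid(premises, conclusion, variables):
    for values in product([False, True], repeat=len(variables)):
        env = dict(zip(variables, values))
        if all(p(env) for p in premises) and not conclusion(env):
            return False    # found a countermodel
    return True

# Modus ponens: P, P -> Q  |-  Q   (valid)
print(valid([lambda e: e["P"], lambda e: (not e["P"]) or e["Q"]],
            lambda e: e["Q"], ["P", "Q"]))          # True

# Affirming the consequent: Q, P -> Q  |-  P   (invalid)
print(valid([lambda e: e["Q"], lambda e: (not e["P"]) or e["Q"]],
            lambda e: e["P"], ["P", "Q"]))          # False
```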
LLMs very obviously are reducible to "just code and maths". We know that, because that is how they are made.
These were not costs, but massive benefits for hyped AI startups. They attracted the attention of a wide audience of investors, including clueless ones, they brought in politicians and hence media attention, and they created a FOMO race to plant flags before each other.
Benchmarks and puzzles don't matter. AI companies will only use them to optimize their training, so that they can promote their results and boost their valuation on hype alone. They're more useful for marketing than as a measurement of real-world capabilities.
Judging by the increase of these negative discussions, we seem to be near or at the Peak of Inflated Expectations w.r.t. AI, and we'll be better off once we surpass it.
tough•16h ago
Lately ChatGPT is like "let me show you an example I used before", like it's a real professional.
it's all in the context, it's our duty to remember these are LARP machines
I copy-pasted something a parent was asking online about their kid. ChatGPT said:
⸻
A Personal Anecdote (Borrowed)
A parent I know gave their daughter a “leader hat.” [..]
How in the hell would a large language model have past personal anecdotes and know other parents, idk
elif•15h ago
And knowing is something even dumb programs do with data. It doesn't imply human cognition
tough•15h ago
But that was just the latest one. I just don't love how it talks as if there were a living person at the other end; maybe I should add something to my personal prompt to avoid it.
TeMPOraL•15h ago
Unfortunately, it's also the least wrong approach. People who refuse to entertain thinking about LLMs as quasi-humans are the ones perpetually confused about what LLMs can or cannot do and why. They're always up in arms about prompt injections and hallucinations, and keep arguing those are bugs that need to be fixed, unable to recognize them as fundamental to the model's generality and its handling of natural language. They keep harping on about "stochastic parrots" and Naur's "program theory", claiming LLMs are unable to reason or model in the abstract, despite plenty of published research that lobotomizes models live and pins down concepts as they form and activate.
If you squint and imagine an LLM to be a person, suddenly all of these things become apparent.
So I can't really blame people - especially non-experts - for sticking to an approach that's actually yielding much better intuition than the alternatives.
TeMPOraL•10h ago
Neither is it for humans; the whole "they haven't lied to me so far, so it's unlikely they'll suddenly start lying to me now" thing is strongly conditioned on "low frequency" assumptions from shared biology, history and culture, as well as "high frequency" assumptions about the person, e.g. that they're actually friendly, that the context didn't change, that your goals are still aligned[0]... or that they aren't drunk, confused, or a kid.
Correct your theory of mind for the nature of LLMs, and your conclusions will be less wrong. LLMs are literally trained to approximate humans, so it makes sense this approach gives a decent high-level approximation.
(What LLMs don't have is the "sticky" part of the social sphere: humans, generally, behave as if they expect to interact again in the future, while individual or public opinion doesn't weigh on an LLM much. And then there's the other bit, in that we successfully eliminate variance in humans - the kind of people who cannot be theory-of-minded well by others get killed or locked up in prisons and mental institutions, or, on the less severe end, get barred from high-risk jobs and activities.)
To be clear: I'm not claiming you should go deep into the theory-of-mind thing, assuming intentions and motivations and internal emotional states in LLMs. But that's not necessary either. I'd say the right amount of anthropomorphism is just cognitive and focused on the immediate term. "What would a human do in an equivalent situation?" gets you 90% there.
--
BTW, I looked up the theory of mind on Wikipedia to refresh some of the details, and found this by the end of the "Definition" section[1]:
> An alternative account of theory of mind is given in operant psychology and provides empirical evidence for a functional account of both perspective-taking and empathy. The most developed operant approach is founded on research on derived relational responding and is subsumed within relational frame theory. Derived relational responding relies on the ability to identify derived relations, or relationships between stimuli that are not directly learned or reinforced; for example, if "snake" is related to "danger" and "danger" is related to "fear", people may know to fear snakes even without learning an explicit connection between snakes and fear.[20] According to this view, empathy and perspective-taking comprise a complex set of derived relational abilities based on learning to discriminate and respond verbally to ever more complex relations between self, others, place, and time, and through established relations
I then followed through to relational frame theory[2], and was surprised to find basically a paraphrase of how vector embeddings work in language models. Curious.
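To illustrate the parallel with a toy (the 3-d vectors below are made up, not real embeddings): relate "snake" to "danger" and "danger" to "fear", and the snake-fear relation comes out of the geometry without ever being stated directly.

```python
import numpy as np

# Made-up 3-d vectors standing in for embeddings: "snake" sits near "danger",
# "danger" sits near "fear", so "snake" and "fear" end up close too, without
# that pair ever being paired explicitly. "kitten" is a control word.
emb = {
    "snake":  np.array([0.9, 0.1, 0.0]),
    "danger": np.array([0.8, 0.3, 0.1]),
    "fear":   np.array([0.7, 0.4, 0.2]),
    "kitten": np.array([0.0, 0.9, 0.1]),
}

def cos(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

for pair in [("snake", "danger"), ("danger", "fear"),
             ("snake", "fear"), ("snake", "kitten")]:
    print(pair, round(cos(emb[pair[0]], emb[pair[1]]), 3))
```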
--
[0] - Yes, I used that word. Also yes, in weak or transactional relationships, this may change without you realizing it, and the other person may suddenly start lying to you with no advance warning. I'm pretty sure everyone experienced this at some point.
[1] - https://en.wikipedia.org/wiki/Theory_of_mind#Definition
[2] - https://en.wikipedia.org/wiki/Relational_frame_theory
tough•4h ago
Trusting OpenAI not to put sycophant 4o on, or not to do funny things in their opaque CoT, is a whole other thing.
The problem is that 1% of us are programmers and can reason about these things as the technology they are; for the other 99% who don't understand it and just use it, it feels like magic, and magic is both good and bad from this POV.
me thinks anyways
southernplaces7•15h ago
I guess me feeling conscious, and thus directing much of my life and activity around that notion, is just a big trick, and I'm no different from ChatGPT, only inferior and with lower electrical needs.
IshKebab•14h ago
1. As far as we know there is nothing non-physical in the brain (a soul or magical quantum microtubules or whatever). Everything that happens in your head obeys the laws of physics.
2. We definitely are conscious.
3. Therefore consciousness can arise from purely physical processes.
4. Any physical process can be computed. And biological thought appears as far as we can see to simply be an extraordinarily complex computation.
5. Therefore it is possible for a sufficiently fast & appropriately programmed computer to be conscious.
6. LLMs bear many resemblances to human thinking (even though they're obviously different in many ways too), and we've established that computers can be conscious, so you can't trivially rule out LLMs being conscious.
At least for now, I don't think anyone sane says today's LLMs are actually conscious. (Though consciousness is clearly a continuum... so maybe they're as conscious as a worm or whatever.)
The main point is that people trivially dismissing AI consciousness or saying "it can't think" because it's only matrix multiplications are definitely wrong.
It might not be conscious because consciousness requires more complexity, or maybe it requires some algorithmic differences (e.g. on-line learning). But it definitely isn't not-conscious just because it is maths or running on a computer or deterministic or ...
Veen•14h ago
Even if consciousness arises from a physical system, it doesn't follow that computation on a physical substrate can produce consciousness, even if it mimics some of the qualities we associate with consciousness.
stevenhuang•13h ago
> you can't trivially rule out LLMs being conscious.
Which is the actual conclusion. That's all.
The one trying to read more into it by claiming a stronger conclusion than warranted (claiming this is all "doubtful") is you.
> Even if consciousness arises from a physical system, it doesn't follow that computation on a physical substrate can produce consciousness
This is incoherent.
Veen•13h ago
The foregoing argument does not establish that computers can be conscious. It assumes consciousness arises from purely physical interactions AND that any physical process can be computed AND that computation can produce consciousness. It provides no reason to believe any of those things, and therefore no reason to believe computers or LLMs can be conscious.
Once you remove the unjustified assumptions, it boils down to this: LLMs can do something that looks a bit like what some conscious beings do, so they might be conscious.
The final paragraph is not incoherent.
stevenhuang•11h ago
Yes, and with these assumptions, then "We've established that computers can be conscious, so LLMs might be".
Again, that is all that is being claimed. The one insisting the argument is anything more persuasive than that is you. You seem hung up on this claim, but need I remind you it's only an existence proof; whether it's actually possible/probable/likely is an entirely different claim, and not the one being made.
> Even if consciousness arises from a physical system, it doesn't follow that computation on a physical substrate can produce consciousness
> If consciousness arises from a physical system, a physical system cannot produce consciousness
> If A, then not A
This argument is a contradiction, it's incoherent.
If you're trying to draw a distinction between physical systems and computation on physical systems, well that's a moot point as every tested theory in physics uses mathematics that is computable.
Perhaps reality is not computable. If so then yes, that would mean computation alone will not produce the same kind of consciousness we have. That leaves open other kinds of consciousness, but that's another discussion.
stevenhuang•9h ago
I'm not sure you're following what's being discussed in this thread.
danaris•11h ago
For all consciousnesses that exist, they arose from a (complex) physical system.
What you are positing is, effectively:
Any physical system (of sufficient complexity) could potentially become conscious.
This is a classic logical error. It's basically the same error as "Socrates is a man. All men are mortal. Therefore, all men are Socrates."
stevenhuang•10h ago
Am I saying LLMs are sufficiently complex? No.
But can they be in principle? I mean, maybe? We'll have to see right?
Saying something is possible is definitely an easier epistemic position to hold than saying something is impossible, I'll tell you that.
IshKebab•9h ago
You might think this is obvious and I agree. Yet there are many people that argue that computers can never be conscious.
lyu07282•12h ago
> even if it mimics
Crucially even if it appears indistinguishable. It's obvious where that could lead, isn't it?
rsanheim•13h ago
Um, really? How so? Show me the computations that model the gut biome. Or pain, or the immune system, or how plants communicate, or how the Big Bang happened (or didn’t).
And now define “physical” wrt things like pain, or sensation, or the body, or consciousness.
We know so little about so much! It’s amazing when people speak with such certainty about things like consciousness and computation.
I don’t disagree with your main point. I would just say we don’t know very much about our own consciousness and mindbody. And humans have been studying that at least a few thousand years.
Not everyone on HN, or who works with or studies AI, believes AGI is just the next step of some algorithm or scaling level we haven’t unlocked yet.
jstanley•13h ago
Just because computations exist that model a physical process to an arbitrary degree of accuracy doesn't mean we know what those computations are.
tsimionescu•13h ago
But no one can use a physical model or computer to precisely compute, say, the fall of sand (where the exact position and momentum of every grain is precisely known) - so we have no actual proof that the physical process itself is computable.
And, in fact, our best fundamental physical theory is not fully computable, at least in some sense - quantum measurement is not a deterministic process, so it's not exactly computable.
IshKebab•9h ago
You can simulate the wave function of particles, e.g. https://github.com/marl0ny/QM-Simulator-2D
From there you can build up simulations of atoms, molecules, proteins, cells, and the gut biome.
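To ground "simulate the wave function" in something concrete, here's a minimal sketch of the simplest possible case: a single particle in 1-D, evolved with the standard split-step Fourier method (toy parameters, unrelated to the linked project).

```python
import numpy as np

# Minimal 1-D time-dependent Schrodinger solver (split-step Fourier method).
# Toy only: a Gaussian wave packet in a harmonic potential, hbar = m = 1.
N = 1024                                  # grid points
L = 40.0                                  # spatial extent
x = np.linspace(-L / 2, L / 2, N, endpoint=False)
dx = x[1] - x[0]
k = 2 * np.pi * np.fft.fftfreq(N, d=dx)   # momentum-space grid

V = 0.5 * x**2                            # harmonic potential
dt = 0.005

# Initial state: Gaussian packet displaced from the well's centre.
psi = np.exp(-(x - 5.0)**2).astype(complex)
psi /= np.sqrt(np.sum(np.abs(psi)**2) * dx)

half_V = np.exp(-0.5j * V * dt)           # half potential step (position space)
full_T = np.exp(-0.5j * k**2 * dt)        # full kinetic step (momentum space)

for _ in range(2000):
    psi = half_V * psi
    psi = np.fft.ifft(full_T * np.fft.fft(psi))
    psi = half_V * psi

print("norm:", np.sum(np.abs(psi)**2) * dx)        # stays ~1
print("<x>: ", np.sum(x * np.abs(psi)**2) * dx)    # oscillates in the well
```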
Obviously I'm not saying that it is computationally feasible to do an atom-level simulation of the gut biome, or pain receptors, or the immune system.
But there's nothing fundamental that precludes it. It's just scale.
Also it's highly unlikely that biological intelligence depends on atom-level modelling. We can very likely use a simpler (but maybe still quite complex) model of a neurone and still be able to achieve the same result as biology does.
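For a flavour of what "a simpler model of a neurone" can look like, here's a sketch of the textbook leaky integrate-and-fire model (the constants are arbitrary illustration values):

```python
# Leaky integrate-and-fire: one of the standard simplified neuron models.
# Membrane voltage leaks toward rest, integrates input current, and emits
# a spike (then resets) when it crosses a threshold.
dt, tau = 0.1, 10.0                       # ms time step, membrane time constant
v_rest, v_thresh, v_reset = -65.0, -50.0, -70.0
v = v_rest
spikes = []

for step in range(2000):
    current = 2.0 if 500 <= step < 1500 else 0.0   # square input pulse
    v += dt * (-(v - v_rest) + current * 10.0) / tau
    if v >= v_thresh:
        spikes.append(step * dt)
        v = v_reset

print(f"{len(spikes)} spikes, first at t={spikes[0]:.1f} ms" if spikes else "no spikes")
```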
yusina•13h ago
Religious friend: Well, there must be a god. I can feel him, and how else would all this marvellous reality around us have come into existence?
Atheist friend: How can you claim that? Where is the evidence? There could be many other explanations, we just don't know, and because of that it doesn't make sense to postulate the existence of a god. And besides, it's just a marketing stunt of the powerful to stay in power.
Agnostic friend: Why does any of this matter? Nature is beautiful, sometimes cruel, I'm glad I'm alive, and whether there is a god or not, I don't know or really care, it's insubstantial to my life.
That's exactly how the discussions around LLM consciousness or intelligence or sentience always go if you put enough tech folks into the conversation.
imiric•12h ago
Why does this surprise you? This is a community hosted by a Silicon Valley venture capital firm, and many of its members are either employed by or own companies in the AI industry. They are personally invested in pushing the narrative being criticized here, so the counter arguments are expected.