Dive into the Mindscape podcast and investigate complex systems. Go into information theory. Look at evolution from an information-theory perspective. Look at how intelligence enables (collective) modeling of likely local future states of the universe, and how that helps us thrive.
Don’t get caught in what at least I consider to be a trap: “to use your consciousness to explain your consciousness”. I think the jump is, for now, too large.
Just my 2 ct. FWIW I consider myself a cocktail philosopher. I do have a PhD in Biophysics; it means something to some, although I myself consider it of limited value.
The linked multimedia article gives a narrative of intelligent systems, but Hutter and AIXI give a (noncomputable) definition of an ideal intelligent agent. The book situates the definitions in a reinforcement learning setting, but the core idea is succinctly expressed in a supervised learning setting.
The idea is this: given a dataset with yes/no labels (and no repeats in the features), and a commonsense encoding of Turing machines as binary strings, the ideal map from an input to a predicted label distribution is defined by
1. taking all Turing machines that decide the input space and agree with the labels of the training set, and
2. on a new input, outputting the distribution obtained by counting how many of those machines accept versus reject the input, weighting each machine by 2 to the power of minus its description length, and normalizing the weighted counts. This is of course a noncomputable algorithm.
The intuition is that if a simple pattern maps inputs to outputs in the training set, then there is a shortly described Turing machine that captures that function, and so that machine's opinion on the new input is given a lot of weight. But more complex patterns are also plausible, and we consider them too.
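To make the weighting concrete, here is a small, computable toy sketch (my own illustration, not Solomonoff induction or AIXI itself): the "programs" are conjunctions of input-bit literals rather than arbitrary Turing machines, and their assigned lengths are crude stand-ins for description length, but hypotheses consistent with the training set still vote with mass 2^(-length):

```python
# Toy stand-in for the ideal predictor: enumerate a small, decidable hypothesis
# class (conjunctions of input-bit literals), keep only hypotheses consistent
# with the training set, and let them vote with weight 2^(-description length).
# The lengths here are crude assumptions, not a real universal encoding.
from itertools import combinations, product

def hypotheses(n_bits):
    """Yield (predict_fn, description_length) pairs."""
    yield (lambda x: True, 1)    # the two "shortest programs": constant outputs
    yield (lambda x: False, 1)
    for k in range(1, n_bits + 1):
        for idxs in combinations(range(n_bits), k):
            for signs in product([True, False], repeat=k):
                def f(x, idxs=idxs, signs=signs):
                    return all(bool(x[i]) == s for i, s in zip(idxs, signs))
                yield (f, 1 + 2 * k)  # a couple of "bits" per literal

def ideal_predict(train, x_new, n_bits):
    """Return P(label of x_new is True) under the weighted vote."""
    yes = no = 0.0
    for f, length in hypotheses(n_bits):
        if all(f(x) == y for x, y in train):   # agrees with all training labels
            weight = 2.0 ** -length            # shorter program => more mass
            if f(x_new):
                yes += weight
            else:
                no += weight
    return yes / (yes + no)

# Labels in this tiny training set happen to follow bit 0.
train = [((1, 0, 1), True), ((0, 0, 1), False)]
print(ideal_predict(train, (1, 1, 0), n_bits=3))  # ~0.64: the shortest
# consistent rule ("label = bit 0") votes True; longer consistent rules dissent.
```

The hypothesis class is tiny, but the qualitative behavior matches the prose: the shortest consistent hypothesis dominates, while more complex consistent hypotheses still get a say.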
What I like about this abstract definition is that it is not in reference to "human intelligence" or "animal intelligence" or some other anthropic or biological notion. Rather, you can use these ideas anytime you isolate a notion of agent from an environment/data, and want to evaluate how the agent interacts/predicts intelligently against novel input from the environment/data, under the limited input that it has. It is a precise formalization of inductive thinking / Occam's razor.
Another thing I like about this is that it gives theoretical justification for the double-descent phenomenon. It is a (noncomputable) algorithm for the best predictor, but it is defined in reference to the largest hypothesis space (all Turing machines that decide the input space). It suggests that whereas earlier ML methods got better results from architectures carefully designed to make bad predictors unrepresentable, it is also not idle, if you have a lot of computational resources, to use an architecture that defines an expressive hypothesis space and instead softly prioritize simpler hypotheses through the learning algorithm (regularization is one approximation of this). That lets your model learn complex patterns in the data that you did not anticipate, if the evidence in the data justifies it, whereas a small, biased hypothesis space could not represent such a pattern, however significant, if it was not anticipated.
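A rough, conventional illustration of that "expressive space plus soft simplicity preference" (my own sketch, not from AIXI or the article): fit an over-parameterized polynomial basis and let a ridge penalty softly down-weight complex solutions instead of excluding them from the hypothesis space.

```python
# Sketch: the same over-parameterized hypothesis space, with and without a soft
# penalty nudging the fit toward "simpler" (smaller-coefficient) solutions.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 20)
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(x.size)

degree = 12                        # far more capacity than the trend "needs"
X = np.vander(x, degree + 1)       # monomial design matrix

def ridge_fit(X, y, lam):
    """Closed-form ridge regression; lam * ||w||^2 is the soft simplicity bias."""
    n = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ y)

w_loose = ridge_fit(X, y, lam=1e-8)   # barely penalized
w_soft  = ridge_fit(X, y, lam=1e-1)   # softly prefers smaller coefficients

# The hypothesis space is identical; only the preference changes, and the
# coefficient norm (a crude proxy for hypothesis complexity) shrinks.
print("coef norm, barely penalized:", np.linalg.norm(w_loose))
print("coef norm, soft preference: ", np.linalg.norm(w_soft))
```

The point is only that the complexity preference lives in the training procedure rather than in what the architecture can represent.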
Note that under this definition, you might want to handle observations that are noisy, where you want to learn the underlying trend rather than the noise. You can adapt the definition to noisy data by, for example, accompanying each input with a distinct sequence number or random salt, and then considering the marginal distribution over numbers/salts not in the training set (there are some technical issues of convergence, but the general approach is feasible); this models the noise distribution as well.
Why not answer the question?
And looking at your paragraphs I'm still not sure I see a definition of intelligence. Unless you just mean that intelligence is something that can approximate this algorithm?
But there are real splits on substrate dependence and what actually drives the system. Can you get intelligence from pure prediction, or does it need the pressure of real consequences? And deeper: can it emerge from computational principles alone, or does it require specific environmental embeddedness?
My sense is that execution cost drives everything. You have to pay back what you spend, which forces learning and competent action. In biological or social systems you're also supporting the next generation of agents, so intelligence becomes efficient search because there's economic pressure all the way down. The social bootstrapping isn't decorative, it's structural.
Yesterday I also posted a related link on HN:
> What the Dumpster Teaches: https://news.ycombinator.com/item?id=45698854
It really is a stupid system. No one rational wants to hear that, just like no one religious wants to hear contradictions in their stories, or no one who plays chess wants to hear it's a stupid game. The only thing that can be said about chimp intelligence is that it has developed a hatred of contradictions/unpredictability and lack of control unseen in trees, frogs, ants and microbes.
Stories become central to surviving such underlying machinery. Part of the story we tell is: no, no, we don't all have to be Kant or Einstein, because we just absorb what they uncovered. So apparently the group or social structures matter. Which is another layer of pure hallucination. All social structures, if they increase the prediction horizon, also generate/expose themselves to more prediction errors and contradictions, not less.
So again, coherence at the group level is produced through story: religion will save us, the law will save us, Trump will save us, the Jedi will save us, AI will save us, etc. We then build walls and armies to protect ourselves from each other's stories. Microbes don't do this. They do the opposite, and have produced the Krebs cycle, photosynthesis, CRISPR, etc. No intelligence. No organization.
Our intelligences are just bubbling cauldrons at the individual and social level through which info passes and mutates. Info that survives is info that can survive that machinery. And as info explodes, the coherence-stabilization process is overrun. Stories have to be written faster than stories can be written.
So Donald Trump is president. A product of "intelligence" and social "intelligence". Meanwhile more microbes exist than stars in the universe. No Trump or ICE or Church or data center is required to keep them alive.
If we are going to tell a story about Intelligence look to Pixar or WWE. Don't ask anyone in MIT what they think about it.
Also, evolution is the original information-processing engine, and humans still run on it just like microbes. The difference is just the clock speed. Our intelligence, though chaotic and unstable, operates on radically faster time and complexity scales. It's an accelerator that runs in days and months instead of generations. The instability isn't a flaw: it's the turbulence of much faster adaptation.
I think we need a meta layer: the ability to reason over one's own goals (this does not contradict the environment creating hard constraints). Man has it. The machine may have it (notably, a paperclip maximizer would not count under this criterion). The crow does not.
Similarly, a machine could emulate meta-cognition, but it would in effect only be a reflection and embodiment of certain meta-cognitive processes originally instantiated in the mind which created that machine.
I'll also add that a lot of people really binarize things. Although there is no precise and formal definition, that does not mean there aren't useful ones, and ones that are being refined. Progress has been made not only in the last millennium, but in the last hundred years, and even in the last decade. I'm not sure why so many are quick to be dismissive. The definition of life has issues, and people are not so passionate about calling it just a stab in the dark. Let your passion to criticize something be proportional to your passion to learn about that subject. Complaints are easy, but complaints aren't critiques.
That said, there's a lot of work in animal intelligence and neuroscience that sheds a lot of light on the subject, especially in primate intelligence. There are so many mysteries here and subtle things that have surprising amounts of depth. It really is worth exploring. Frans de Waal has some fascinating books on chimps. And hey, part of what is so interesting is that you have to take a deep look at yourself and how others view you. Take, for example, you reading this text. Break it down into atomic units. You'll probably be surprised at how complicated it is. Do you have a parallel process vocalizing my words? Do you have a parallel process spawning responses or quips? What is generating those? What are the biases? Such a simple, everyday thing requires some pretty sophisticated software. If you really think you could write that program, I think you're probably fooling yourself. But hey, maybe you're just more intelligent than me (or maybe less, since that too is another way to achieve the same outcome lol).
When I write prompts, I've stopped thinking of LLMs as just predicting the next word, and instead think of them as a logical model built up by combining the logic of all the text they've seen. I think of the LLM as knowing that cats don't lay eggs, and when I ask it to finish the sentence "cats lay ..." it won't generate the word eggs even though eggs probably comes after lay frequently.
What you are seeing is a semi-randomized prediction engine. It does not "know" things, it only shows you an approximation of what a completion of its system prompt and your prompt combined would look like, when extrapolated from its training corpus.
What you've mistaken for a "logical model" is simply a large amount of repeated information. To show the difference between this and logic, you need only look at something like the "seahorse emoji" case.
https://www.analyticsvidhya.com/blog/2021/07/word2vec-for-wo...
Surely trained neural networks could never develop circuits that implement actual logic via computational graphs...
https://transformer-circuits.pub/2025/attribution-graphs/met...
So what is it repeating?
It's not enough to just point to an instance of LLMs producing weird or dumb output. You need to show how it fits with your theory that they are "just repeating information". This is like pointing out one of the millions of times a person has said something weird, dumb, or nonsensical and claiming it proves humans can't think and can only repeat information.
> It won't generate the word eggs even though eggs probably comes after lay frequently
Even a simple N-gram model won't predict "eggs". You're misunderstanding by oversimplifying. Next-token prediction is still context based: it does not depend only on the previous token, but on the previous (N-1) tokens. With "cats" in the context, even a 3-gram (trigram) model should give you words like "down" instead of "eggs".
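A minimal trigram sketch (toy corpus invented for illustration) shows the point: conditioning on the previous two words already separates "hens lay ..." from "cats lay ...":

```python
# Toy trigram model: next-word probabilities are conditioned on the previous
# two words, so different contexts give different predictions.
from collections import defaultdict, Counter

corpus = (
    "hens lay eggs . cats lay down in the sun . "
    "cats lay around all day . hens lay eggs daily ."
).split()

trigrams = defaultdict(Counter)
for a, b, c in zip(corpus, corpus[1:], corpus[2:]):
    trigrams[(a, b)][c] += 1

def predict(w1, w2):
    """Distribution over the next word given the two-word context."""
    counts = trigrams[(w1, w2)]
    total = sum(counts.values())
    return {w: n / total for w, n in counts.items()}

print(predict("hens", "lay"))  # {'eggs': 1.0}
print(predict("cats", "lay"))  # {'down': 0.5, 'around': 0.5}
```

An LLM conditions on a far longer context and on learned representations, but the basic observation holds even here: prediction is conditional on context, not on the single previous word.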
The other thing is their inability to intelligently forget, and their inability to correctly manage their own context by building their own tools (some of which is labs intentionally crippling how they build AI to avoid an AI escape).
I don't think there's anything novel in human intelligence, as a good chunk of it appears in more primitive forms in other animals (primates, elephants, dolphins, cephalopods). But our intelligence is generally on hyperdrive because we also have the added physical ability of written language and the capability for tool building.
Then, what is what we are incapable of? Magic? ;-)
> Maybe the only thing we can do is advanced pattern matching
Pattern matching as a way to support the excellent heuristic "correlation is likely causation", yes. This is what allows us to analyze systems, what brings us from "something thrown away will eventually fall to the ground" to the theory of relativity.
Intelligence is understanding, and understanding comes from hacking systems in order to use them to our advantage, or from just observing systems being broken or being built.
By doing that, we acquire more knowledge about the relationships and entities within the system, which in turn allows more advanced hacking. We probably started with fire, wolves, wheat, flint; and now we are considering going to Mars.
One example of this I often ponder is the boxing style of Muhammad Ali, specifically punching while moving backwards. Before Ali, no one punched while moving away from their opponent. All boxing data said this was a weak position, time for defense, not for punching (offense). Ali flipped it. He used to do miles of roadwork, throwing punches while running backwards to train himself on this style. People thought he was crazy, but it worked, and, imho, it was extremely creative (in the context of boxing), and therefore intelligent.
Did data exist that could've been analyzed (by an AI system) to come up with this boxing style? Perhaps. Kung Fu fighting styles have long known about using your opponent's momentum against them. However, I think that data (Kung Fu fighting styles) would've been diluted and ignored in the face of the mountains of traditional boxing-style data, which all said not to punch while moving backwards.
I would argue that the only truly new things generative AI has introduced are mostly just byproducts of how the systems are built. The "AI style" of visual models, the ChatGPT authorial voice, etc., are all "new", but they are still just the result of regurgitating human created data and the novelty is an artifact of the model's competence or lack thereof.
There has not been, at least to my knowledge, a truly novel style of art, music, poetry, etc. created by an AI. All human advancements in those areas build mostly off of previous peoples' work, but there's enough of a spark of human intellect that they can still make unique advancements. All of these advancements are contingent rather than inevitable, so I'm not asking that an LLM trained on nothing but visual art from medieval times and before could recreate Impressionism. But I don't think it would make anything that progresses past or diverges from medieval and pre-medieval art styles. I don't think an LLM with no examples of or references to anything written before 1700 would ever produce poetry that looked like Ezra Pound's writing, though it just might make its own Finnegans Wake if the temperature parameter were turned up high enough.
And how could it? It works because there's enough written data that questions and context around the questions are generally close enough to previously seen data that the minor change in the question will be matched by a commensurate change in the correct response from the ones in the data. That's all a posteriori!
I would have agreed with you at the dawn of LLM emergence, but not anymore. Not because the models have improved, but because I have a better understanding and more experience now. Token prediction is what everyone cites, and it still holds true. This mechanism is usually illustrated with an observable pattern, like the question "Are antibiotics bad for your gut?", which is the predictability you mentioned. But LLM creativity begins to emerge when we apply what I'd call "constraining creativity." You still use token prediction, but the preceding tokens introduce an unusual or unexpected context, such as subjects that don't usually appear together or a new paradoxical observation. (It's interesting that for fact-based queries, rare constraints lead to hallucinations, but here they're welcome.)
I often use the latter for fun by asking an LLM to create a stand-up sketch based on an interesting observation I noticed. The results aren't perfect, but they combine the unpredictability of token generation under constraints (funny details, in the case of the sketch) with the cultural constraints learned during training. For example, a sketch imagining doves and balconies as if they were people and real estate. The quote below from that sketch shows that there are intersecting patterns between the world of human real estate and the world of birds, mixed in a humorous way.
"You want to buy this balcony? That’ll be 500 sunflower seeds down, and 5 seeds a day interest. Late payments? We send the hawk after you."In this book, I see Hume cited in a misunderstanding of his thought, and Kant is only briefly mentioned for his metaphysical idealism rather than his epistemology, which is a legitimately puzzling to me. Furthermore, to refer to Kant's transcendental idealism as "solipsism" is so mistaken that it's actually shocking. Transcendental idealism has nothing whatsoever to do with "solipsism" and is really just saying that we (like LLMs!) don't truly understand objects as "things in themselves" but rather form understanding of them via perceptions of them within time and space that we schematize and categorize into rational understandings of those objects.
Regarding Hume, the author brings up his famous is/ought dichotomy and misrepresents it as Hume neatly separating statements and "preferring" descriptive ones. (We now talk more about the fact-value distinction, since this concerns descriptive vs. prescriptive statements rather than moral judgments, but I'll set that aside because the two are so often combined.) The author then comes to Hume's exact conclusion, but thinks he is refuting Hume when he says:
>While intuitive, the is/ought dichotomy falls apart when we realize that models are not just inert matrices of numbers or Platonic ideas floating around in a sterile universe. Models are functions computed by living beings; they arguably define living beings. As such, they are always purposive, inherent to an active observer. Observers are not disinterested parties. Every “is” has an ineradicable “oughtness” about it.
The author has also just restated a form of transcendental idealism right before dismissing Kant's (and the very rigorously articulated "more recent postmodern philosophers and critical theorists") transcendental idealism! He is able to deftly, if unconvincingly, hand wave it with:
>We can mostly agree on a shared or “objective” reality because we all live in the same universe. Within-species, our umwelten, and thus our models—especially of the more physical aspects of the world around us—are all virtually identical, statistically speaking. Merely by being alive and interacting with one another, we (mostly) agree to agree.
I think this bit of structuralism is where the actual solipsism is happening. Humanity's rational comprehension of the world is actually very contingent. An example of this is the studies that were done by Alexander Luria on remote peasant cultures and their capacity for hypothetical reasoning and logic in general. They turned out to be very different from "our models" [1]. But, even closer to home, I share the same town as people who believe in reiki healing to the extent that they are willing to pay for it.
But, more to the point, he has also simply rediscovered Hume's idea, which I will quote:
>In every system of morality, which I have hitherto met with, I have always remarked, that the author proceeds for some time in the ordinary way of reasoning, and establishes the being of a God, or makes observations concerning human affairs; when of a sudden I am surprised to find, that instead of the usual copulations of propositions, is, and is not, I meet with no proposition that is not connected with an ought, or an ought not.
Emphasis mine. Hume's point was that descriptive statements always carry a prescriptive one hidden in their premise, so that, in practice, "is" statements are always just "ought" statements.
Had the author engaged more actively with Hume's writing, he would have come across Hume's fork, related to this is-ought problem, and eventually settled on (what I believe to be) a much more important epistemological problem with regards to generative AI: the possibility of synthetic a priori knowledge. Kant provided a compelling argument in favor of the possibility of synthetic a priori knowledge, but I would argue that it does not apply to machines, as machines can "know" things only by reproducing the data they are trained with and lack the various methods of apperception needed to schematize knowledge due to a variety of reasons, but "time" being the foremost. LLMs don't have a concept of "time"; every inference they make is independent, and transformers are just a great way to link them together into sequences.
I should point out that I'm not a complete AI skeptic. I think that it could be possible to have some hypothetical model that would simply use gen AI as its sensory layer and combine that with a reasoning component that makes logical inferences that more resemble the categories that Kant described being used to generate synthetic a priori knowledge. Such a machine would be capable of producing true new information rather than simply sampling an admittedly massive approximation of the joint probability of semiotics (be it tokens or images) and hoping that the approximation is well constructed enough to interpolate the right answer out. I would personally argue that the latter "knowledge", when correct, is nothing more than persuasive Gettier cases.
Overall, I'm not very impressed with the author's treatment of these thinkers. Some of the other stuff looks interesting, but I worry that being too credulous about it would be a Gell-Mann amnesia effect, given that I have done quite a bit of primary-source study of 19th-century epistemology as a basis for my other reading in that area. The author's background is in physics and engineering, so I have a slight suspicion (since he used Hume's thought in relation to moral judgments rather than knowledge) that these are hazily remembered subjects from a rigorous ethics course he took at Princeton, but that is purely speculative on my part. I think he has reached a bit too far here.
1: https://languagelog.ldc.upenn.edu/nll/?p=481 (I am referring mostly to the section in blue here)
Intelligence is the ability of the human body to organize its own contours in a way that corresponds to the objective contours of the world.
Evald Ilyenkov.
And yes, the mind is part of the body; thus thinking consists of an act of organizing to the contours of the world.
What about animals?
To me the best definition of intelligence is:
It's the ability to:
- Solve problems
- Develop novel, insightful ideas, patterns and conclusions. I have to add that since these might not immediately solve a problem, although they might help solve one down the line. An example could be a comedian coming up with a clever, original story. It doesn't really "solve a problem" directly, but it's intelligent.
The more capable you are of either of the two above, the more intelligent you are. Anything that is able to do the above is intelligent at least to some extent, but how intelligent depends on how much it's able to do.
What he would like people to believe is that AI is real intelligence, for some value of real.
Even without AI, computers can be programmed for a purpose, and appear to exhibit intelligence. And mechanical systems, such as the governor of a lawnmower engine, seem able to seek a goal they are set for.
What AI models have in common with human and animal learning is having a history which forms the basis for a response. For humans, our sensory motor history, with its emotional associations, is an embodied context out of which creative responses derive.
There is no attempt to recreate such learning in AI. And by missing out on embodied existence, AI can hardly be claimed to be on the same order as human or animal intelligence.
To understand the origin of human intelligence, a good starting point would be Esther Thelen's book[0], "A Dynamic Systems Approach to the Development of Cognition and Action" (also MIT Press, btw).
According to Thelen, there is no privileged component with prior knowledge of the end state of an infant's development, no genetic program that their life is executing. Instead, there is a process of trial and error that develops the associations between senses and muscular manipulation that organize complex actions like reaching.
If anything, it is in the caregivers in the family system that knowledge of an end result resides: if something isn't going right with the baby, if she is not able to breastfeed within a few days of birth (a learned behavior) or not able to roll over by herself at 9 months, they will be the ones to seek help.
In my opinion, it is the caring arts, investing time in our children's development and education, that advance us as a civilization, although there is now a separate track, the advances in computers and technology, that often serves as a proxy for improving our culture and humanity: easier to measure, easier to allocate funds to, than the squishy human culture of attentive parenting, teaching and caregiving.
[0] https://www.amazon.com/Approach-Development-Cognition-Cognit...
The effect is that it's unclear at first glance what the argument even might be, or which sections might be interesting to a reader who is not planning to read it front-to-back. And since it's apparently six hundred pages in printed form, I don't know that many will read it front-to-back either.
* Verifiable Fact
* Obvious Truth
* Widely Held Opinion
* Your Nonsense Here
* Tautological Platitude
This gets your audience nodding along in "Yes" mode and makes you seem credible, so they tend to give you the benefit of the doubt when they hit something they aren't so sure about. Then, before they have time to really process their objection, you move on to and finish with something they can't help but agree with.
The stuff on the history of computation and cybernetics is well researched with a flashy presentation, but it's not original nor, as you pointed out, does it form a single coherent thesis. Mixing in all the biology and movie stuff just dilutes it further. It's just a grab bag of interesting things added to build credibility. Which is a shame, because it's exactly the kind of stuff that's relevant to my interests[3][4].
> "Your manuscript is both good and original; but the part that is good is not original, and the part that is original is not good." - Samuel Johnson
The author clearly has an Opinion™ about AI, but instead of supporting it they're trying to smuggle it through in a sandwich, which I think is why you have that intuitive allergic reaction to it.
[1]: https://changingminds.org/disciplines/sales/closing/yes-set_...
[2]: https://en.wikipedia.org/wiki/Compliment_sandwich