The average chess players of Bletchley Park and AI research in Britain

https://blogs.bl.uk/science/2025/06/the-average-chess-players-of-bletchley-park-and-ai-research-in-britain.html

42•salonium_•8h ago

Comments

PaulRobinson•6h ago

It's strange today to remember that playing chess well was seen as a great marker of AI, but today we consider it much less so.

I thought Turing's Test would be a good barometer of AI, but in today's World of mountains of AI slop fooling more and more people, and ironically there being software that is better at solving CAPTCHAs than humans, I'm not so sure.

Add into the mix that there are reports of people developing psychological disorders when exposed deeply to LLMs, I'm not sure they are good replacements for therapists (ELIZA, ah, what a thought), and they seem - even with a lot of investment in agentic workflows and getting a lot of context into GraphRAG or wiring up MCP - to be good at helping experts get a bit faster, not replace experts. And that's not software development specific - it seems to be the case across all domains of expertise.

So what are we now chasing for? What's the test for AGI?

It's definitely not playing games well, like we thought, or pretending to be human, or even being useful to a human. What is it, then?

pvg•6h ago

playing chess well was seen as a great marker of AI

Was it? Alpha-beta pruning is from 1957 they had a decent idea chess of what human-beating computer chess would be like and that it probably wasn't some pathway to Turing-test-beating AI.

dandellion•6h ago

How about: the ability to independently implement ways to manipulate the local environment for their own benefit or self-preservation?

zmgsabst•6h ago

I’d argue we have AGI, at the level of a child; now we’re debating further steps, such as adult AGI and super intelligence.

But because AI is not like us, we have different results at different stages — eg, they’ve been better at arithmetic for a hundred years, games for twenty, and slowly are climbing up other domains.

nyrikki•4h ago

Any discussion about AGI requires a written definition of the term to have a reasonable discussion.

What we have now matches what many of the popular texts would call "Narrow AI", which is limited to specific tasks like speech recognition or playing chess, or mixtures of those.

Traditionally AGI represents a more aspirational goal, machines that could theoretically perform any intellectual task a human can do.

Under that definition we aren't close, and we will actually need new math to even hope to reach that goal.

Obviously individuals concepts of what 'AGI' differ, as well as their motivations for choosing one.

But the traditional hopeful mnomics concept of AGI is known to be unreachable without discoveries that upend what we think are hard limits today.

Machines being better at arithmetic, the ties from to the limits of algorithms is actually the source of the limits.

The work of Turing, Gödel, Tarski, Markov, Rice etc... is where that claim is coming from IMHO

Fortunately there is a lot of practical utility without AGI, but our industries use of aspirational mnomics is almost guaranteed to disappoint the rest of the world.

NitpickLawyer•3h ago

> Any discussion about AGI requires a written definition of the term to have a reasonable discussion.

I agree, this is the hardest thing to pin in any discussion. What version of AGI are we even talking about?

Here's a definition / explanation of AGI from the early days of DeepMind (from the movie "The thinking game"):

Quote from Shane Legg: "Our mission was to build an AGI - an artificial general intelligence, and so that means that we need a system which is general - it doesn't learn to do one specific thing. That's really key part of human intelligence, learn to do many many things".

Quote from Hassabis: "So, what is our mission? We summarise it as <Build the world's first general learning machine>. So we always stress the word general and learning here the key things."

And the key slide (that I think cements the difference between what AGI stood for then, vs. now):

AI - one task vs. AGI - many tasks

at human level intelligence.

----

Now, if we go by this definition, which is pretty specific and clear, I think we've already achieved this. We already have systems that have "generally" learned stuff. And can do "many tasks" at "human level intelligence". Again, notice the emphasis on "general" and "learning". We have a learning machine, that takes in vast amounts of tokens (text, multimodal, even bytes at the end of the day) and "learns" to "do" many things. And notice it's many tasks, not all tasks. I think this is QED at this point.

But, due to the old problem of "AI is everything that hasn't been done yet", and the constant goalpost moving, together with lots and lots of writing on this topic, the waters are muddier today, and lots of people argue and emphasise different things in the AGI field.

> Fortunately there is a lot of practical utility without AGI

Yeah, completely agree. I'm with Simon's recent article on this one. It doesn't even matter at this point if we reach AGI or not, or who's definition we use. I get a lot of value today from these systems. The debates are moot from my point.

AnimalMuppet•2h ago

> Now, if we go by this definition, which is pretty specific and clear...

I was going to say, no, you've defined "general" pretty well, but "intelligence" you didn't define at all. But on second thought, I guess you did - learning.

I might amend that slightly. It might be learning to do. I don't care if it can learn the words about, say, chemistry. Can it learn to solve chemistry problems?

The remaining area of fuzziness is hidden in "at human level". At what human level? I took a year of college chemistry. Can it do chemistry at that level? How about at the level of someone with a BS in chemistry? A PhD? Those are all "human" levels, but they are very different.

If it can do, say, all college subjects at undergrad level... I guess that's the benchmark for "a well rounded human".

> I think we've already achieved this.

I want to think about it some more before I definitely agree, but you've made the best case that I have seen.

The flaw I think I see is that, from a well-rounded education, we expect a human to be able to specialize, to become an expert in something. I'm not sure LLMs are quite there yet. They're closer than I was thinking 10 minutes ago, though.

hyghjiyhu•2h ago

The true definition of AGI is unfortunately "I know it when I see it". Any technical definition is provisional and achieves precision at the expense of the high risk of being simply wrong. That's not to say it lacks value. Finding how and when the provisional is wrong is itself a valuable insight bringing us closer to AGI.

Scarblac•6h ago

That can only be decided in hindsight. By the time everybody agrees that the system is clearly generally intelligent, it will have been for ages already. It will already be far more intelligent than even very smart humans.

But I think general problem solving is a part of it. Coming up with its own ideas for possible solutions rather than what it generalized from a training set, and being able to try them out and iterate. In an environment it wasn't specifically designed for by humans.

(not claiming most humans can do that)

zmgsabst•6h ago

Are you saying most humans aren’t generally intelligent, by your definition?

Scarblac•5h ago

Humans are very different from computers. In particular there are some things that computers are vastly better at (computation, memory, etc), and humans are optimized for surviving in their biological environment, not necessarily for general intelligence.

I think asking of an AGI to do what humans do is asking a submarine to swim. It's not very useful.

So I think that when we have useful computer AGI, it will be much better at it than humans.

You already see that even with say ChatGPT -- it's not expert level, but the knowledge it does have is way way wider than any human's. If we get something that's as smart as humans, it will probably still be as widely applicable.

And why even try, otherwise? We already have human intelligence.

pyman•6h ago

I have a similar philosophical question:

My dog doesn't know what I do for a living, and he has no concept of how intelligent I am. So if we're limited by our own intelligence, how would we ever recognise or measure the intelligence of an AI that's more advanced than us?

If an AI surpasses us, not just in memory or calculation but in reasoning, self-reflection, and abstraction, how would we even know?

officehero•5h ago

Wittgenstein's lion

dale_glass•4h ago

We could test it. We know with certainty that computers play far better chess than we do.

How do we know? Play a game with the computer, and see who wins.

There's no reason why we can't apply the same logic elsewhere. Set up a testable scenario, see who wins.

card_zero•4h ago

Either the alleged super-intelligence affects us in some way, directly or indirectly by altering things we can detect about the world/universe, in which case we can ultimately detect it, or else it doesn't, in which case it might as well belong to a separate universe, not only in terms of our perception but objectively too.

The error here is thinking that dogs understand anything.

Retric•4h ago

Some dogs can respond to “bring me my slippers” and go get them in another room, a concrete task that’s still difficult for robots today.

With dogs it’s less a question of intelligence but communication something more intelligent AI is unlikely to have a problem with.

card_zero•4h ago

OK, it might be a cultural thing. Do dogs probe the secrets of the world around them, with all that barking, even a little? Is it that they're in an early phase and will eventually advance to do more with stones than lick them sometimes?

What would our being baffled by a super-intelligence look like? Maybe some effect like dark matter. It would make less sense the more we found out about it, and because it's on a level beyond our comprehension, it would never add up. And the lack of apparent relevance to a super-intelligence's doings would be expected, because it's beyond our comprehension.

But this is silly and resembles apologies for God based on his being ineffable. So there's a way to avoid difficult questions like "what is his motivation" and "does he feel like he needs praise" because you can't eff him, not even a little. Then anything incomprehensible becomes evidence for God, or super-intelligence. We'd have to be pretty damn frustrated with things we don't understand before this looks true.

But that still doesn't work, because we're not supposed to be able to even suspect it exists. So even that much interaction with us is too much. In fact this "what if" question undermines itself from the start, because it represents the start of comprehension of the incomprehensible thing it posits.

Retric•3h ago

> Do dogs probe the secrets of the world around them, with all that barking, even a little?

It’s a form of communication. You can learn to distinguish different kinds of barking a healthy dog is making, but that doesn’t mean you’re going to care nearly as much about a large animal showing up.

gowld•3h ago

Robotic legs are hard, but also pretty wells solved. See Boston Dynamics.

Retric•3h ago

Legs are a small subset of the problem. “Where did I see that last?” involves mapping out and classifying the environment. There’s some really impressive demos of manipulating objects on a table etc, but random clutter throughout a house is still a problem.

TheOtherHobbes•4h ago

Dogs certainly do understand things. Dogs and cats have a theory of mind and can successfully manipulate and trick their owners - and each other.

Our perceptions are shaped by our cognitive limitations. A dog doesn't know what the Internet is, and completely lacks the cognitive capacity to understand it.

An ASI would almost certainly develop some analogous technology or ability, and it would be completely beyond us.

That does NOT mean we would notice we were being affected by that technology.

Advertising and manufactured addictions make people believe external manipulations are personal choices. An ASI would probably find similar manipulations trivial.

But it might well be capable of more complex covert manipulations we literally can't imagine.

card_zero•3h ago

Dogs certainly do not understand things. Do they enquire? What are some good dog theories? They have genetic theories. We breed theories into them.

pyman•3h ago

Yes, dogs can think and make choices, learn from experience, solve problems (like opening doors or finding hidden treats), and adapt to new situations.

The reason I mentioned my dog is because, even though dogs have limited intelligence compared to humans, my dog thinks he's better at playing ball than me. What he doesn't know is that I let him win because it makes him feel in control.

gadders•4h ago

https://fluent.pet/pages/getting-started-with-talking-button...

gadders•4h ago

It can have a fight with Nagel's Bat.

barrenko•3h ago

Whatever the hell observes the mind is not the same thing as it.

ben_w•3h ago

Depends by how much they're smarter, and in which ways.

As a trivial example, a century ago "can do arithmetic" was a signifier of being smart, yet if the entire human population today were all as fast as the current world record holder and on the same team, we would be defeated by one Raspberry Pi.

Easy to measure, but also very limited sense of "smart".

A Pi can also run Stockfish, so in that also-limited sense of "smart", it still beats humans. And chess inspires the wider use of Elo ratings in AI arenas, which means we can usefully assign scores to different AI that all beat the best humans.

For now, it's possible to point to things humans are (collectively) able to do better than AI — I originally wrote "very, very easy" rather than "possible", but then remembered noticing that whenever anyone actually tries to do so here on Hacker News, they're out of date already and there's is an AI which can do that thing superhumanly well (either that or they overstate what humans can do, e.g. claiming we can beat the halting problem); actual research papers with experiments generally do better when it comes to listing AI failure modes, including when the research comes from an AI lab showing off their new AI.

captainbland•6h ago

I think with the Turing test, it's turned out to be a fuzzier line than expected. People are to various degrees learning LLM tells even as they improve. So what might have passed the Turing test in 2020 might not today. Similarly it seems to be a case that conversations with LLMs often start better than they end, even today - so an LLM might pass a short turing test but fail a very long one that goes into hundreds of rounds.

kranke155•5h ago

We’ve clearly passed the Turing test I think. I can’t think of many ways I’d be able to detect an LLM reliably, if it was coded to just act as a person talking to me on discord.

uonr•20m ago

The Turing test isn't dead. The true Turing test is a thought experiment, and it's not something that can be replicated in the real world.

Given enough time and interaction, you can still spot a person on Discord being faked by an LLM—at the very least, something will feel off. This is even more true in a formal, knowing, adversarial setting.

iamflimflam1•5h ago

> It's strange today to remember that playing chess well was seen as a great marker of AI, but today we consider it much less so.

It was seen as so difficult to do that research should be abandoned.

Projects in category B were held to be failures. One important project, that of "programming and building a robot that would mimic human ability in a combination of eye-hand co-ordination and common-sense problem solving", was considered entirely disappointing. Similarly, chess playing programs were no better than human amateurs. Due to the combinatorial explosion, the run-time of general algorithms quickly grew impractical, requiring detailed problem-specific heuristics.

The report stated that it was expected that within the next 25 years, category A would simply become applied technologies engineering, C would integrate with psychology and neurobiology, while category B would be abandoned.

https://en.wikipedia.org/wiki/Lighthill_report

rjsw•5h ago

The linked post points out that it is a low-cost area of research and you don't need to explain the context to a reviewer.

jltsiren•5h ago

Tests can only show that something is not AGI. If you want to show that a system is AGI, you must wait for expert consensus. That means adding new tests and dropping old ones, as our understanding of intelligence improves. If something is truly AGI, people will eventually run out of plausible objections.

nemomarx•5h ago

I suppose doing useful research becomes the next target?

that's what the exponential lift off people want right

ben_w•3h ago

> and they seem - even with a lot of investment in agentic workflows and getting a lot of context into GraphRAG or wiring up MCP - to be good at helping experts get a bit faster, not replace experts. And that's not software development specific - it seems to be the case across all domains of expertise.

For now, this is a good thing: Given how generally LLMs are displacing juniors, if this was a situation where doing the same thing but harder can replace experts, it replaces approximately all of them.

But: in limited domains, not the "G" of "AGI" but just specific places here and there, AI does beat human experts. Those domains are often sufficiently narrow that they don't even encompass the entire role — think "can analyse an X-ray for cancer, can't write up its findings" kind of specificity. Indeed, I can only think of two careers where even the broadest definition of AI (some kind of programmable system) has been able to essentially fully replace that occupation:

1. https://en.wikipedia.org/wiki/Jacquard_machine

2. https://en.wikipedia.org/wiki/Computer_(occupation)

GuB-42•3h ago

> I thought Turing's Test would be a good barometer of AI

Depends on what you consider a "Turing's Test".

Fooling unsuspecting humans is relatively easy, it has been done with relatively simple software and some trickery. LLMs can do that too of course.

A more convincing "Turing's Test" would be:

- You have one interrogator, and two players, one human and one computer

- The interrogator, after chatting with both players has to find which is which

- The interrogator is an expert in the field, he knows everything there is to know when it comes to finding the computer

- The human player is also an expert, he knows how to solve problems that are hard for computers to solve, he also knows what to expect from the interrogator

- The interrogator and human player collaborate to find the computer

- The interrogator and human player are not allowed to have shared information that the computer doesn't have (and ideally, they shouldn't know each other personally), but everything else is fair game

jeremyjh•2h ago

I'm not sure that there is more to it than continuous learning. If an LLM of top-tier strength could learn from its experiences even the way a junior developer could, I'm not sure I can place an upper bound on how capable it would be at software engineering. But from what I understand this will require a completely different architecture.

codeulike•5h ago

If you look at the stuff Turing was writing in the 1950s its fascinating because he really saw the potential of what computation was going to be able to do. There was a paradigm shift in thinking about possibilities here that he grasped in the very early days.

https://www.cs.ox.ac.uk/activities/ieg/e-library/sources/t_a...

It would be amazing to go and fetch Turing with a time machine and bring him to our time. Show him an iPhone, his face on the UK £50 note, and Wikipedia's list of https://en.wikipedia.org/wiki/List_of_openly_LGBTQ_heads_of_...

barrenko•3h ago

The Christ of computation...

marcodiego•2h ago

I remember an interview with Kasparov. He said something, I don't remember exactly... It was something like "The skills chess develops are very important for... playing chess"; as a way to say "if you're good in chess, that doesn't mean you're particularly smart or good in other areas too".

As someone who played chess competitively in my childhood and teens, chess helped me a lot about concentration, problem solving and decision taking. I also learned to win and lose and to have respect for other people due to the competition.

As a teacher in my adulthood, I was extremely impressed by knowing a high rated player that was very weak student, especially in logic.

I now agree deeply with Kasparov about the importance of the skills chess develops.

The Fed says this is a cube of $1M. They're off by half a million

The Hoyle State (2021)

Feasibility study of a mission to Sedna - Nuclear propulsion and solar sailing

What should a native DOM templating API look like?

Ask HN: Who is hiring? (July 2025)

Show HN: Spegel, a Terminal Browser That Uses LLMs to Rewrite Webpages

1KB JavaScript Demoscene Challenge Just Launched

Experience converting a mathematical software package to C++20 modules [PDF]

I built something that changed my friend group's social fabric

Ask HN: Who wants to be hired? (July 2025)

Cua (YC X25) Is Hiring a Founding Engineer

Show HN: HackerNewt - Breadth-first exploring HN client for iOS

MicroPython on M68k Mac

When Did Nature Burst into Vivid Color?

All Good Editors Are Pirates: In Memory of Lewis H. Lapham

Code⇄GUI bidirectional editing via LSP

OpenFLOW – Quickly make beautiful infrastructure diagrams local to your machine

Graph Theory Applications in Video Games

Show HN: I built the tool I wished existed for moving Stripe between countries

Swearing as a Response to Pain: Assessing Effects of Novel Swear Words

Show HN: Jobs by Referral: Find jobs in your LinkedIn network

Aesop in Words of One Syllable

Exploring Trichromacy through Maxwell's Color Experiment (2023)

Amazon Is on the Cusp of Using More Robots Than Humans in Its Warehouses

Sam Altman Slams Meta's AI Talent Poaching: 'Missionaries Will Beat Mercenaries'

I write type-safe generic data structures in C

Embabel Agent Framework for the JVM

PlanetScale for Postgres

The new skill in AI is not prompting, it's context engineering

The hidden JTAG in a Qualcomm/Snapdragon device’s USB port