> Related to this, I will not be commenting on any self-reported AI competition performance results for which the methodology was not disclosed in advance of the competition. (3/3)
(This wasn't there when I first read the thread yesterday, 18 hours ago; it was edited in 15 hours ago, i.e. 3 hours later.)
It's one of the things to admire about Terence Tao: he's always insightful, even when commenting on matters outside mathematics, while maintaining the mathematician's discipline of not drawing confident conclusions when data is missing.
I was reminded of this by a recent thread where an HN commenter expected him to make predictions about the future (https://news.ycombinator.com/item?id=44356367). It also reminded me of Sherlock Holmes (from A Scandal in Bohemia):
> “This is indeed a mystery,” I remarked. “What do you imagine that it means?”
> “I have no data yet. It is a capital mistake to theorize before one has data. Insensibly one begins to twist facts to suit theories, instead of theories to suit facts.”
Edit: BTW, some of the other commentary (here and elsewhere) about these posts is very disappointing. Even when Tao explicitly says he's not commenting on any specific claim (like OpenAI's), many people seem eager to interpret his comments as being about that claim. People's tendency toward tribalism / taking "sides" is so strong that they want to read this as Tao caring about the same things they care about, rather than as him using the just-concluded IMO as an illustration of the point he's actually making (that results are sensitive to details).

In fact, his previous post (https://mathstodon.xyz/@tao/114877789298562646) said: "There was not an official controlled competition set up for AI models for this year's IMO […] Hopefully by next year we will have a controlled environment to get some scientific comparisons and evaluations". He's specifically saying we cannot compare across different AI models, so it's hard to say anything specific, yet people think he's saying something specific!
Multiple days' worth of processing, cross-communication, and picking only the best result? That's just the power of parallel processing and how well they reason. Rewording to a more standard prompt? Communicating in stricter natural language helps reduce confusion. Calculator access and the vast knowledge of humanity built in? That's the whole point.
I tend to side with Tao on this one, but the point is less about who's right and more about why there's so much talking past each other: the basic fundamentals of how to judge these tools aren't agreed upon.
That recent announcement might just be fluff, or it might be real news; we just don't know.

I can't even read anything into their silence: this is exactly how much OpenAI would share in both the totally-grifting scenario and the massive-breakthrough scenario.
I'm glad that Tao has caught on. As an academic, it's easy to assume integrity in others, but there's no such thing in the big software business.
I'm not an academic, but from the outside looking in, I don't think academics should be so quick to assume integrity either. There seem to be plenty of perverse incentives in academia to cheat, cut corners, publish at all costs, etc.
I think Tao's point was that a more appropriate comparison between AI and humans would be to compare it with humans who have calculator/internet access.

I agree with your overall point, though: it's not straightforward to specify exactly what would be an appropriate comparison.
AI in general has given humans great leverage in processing information, more than we have ever had before. Do we need AGI to start applying this wonderful leverage toward our problems as a species?
what a badass