This is a fascinating experiment! I've just been reading the first few paragraphs of the paper ... easily readable, and intended to be accessible to anyone.
ColinWright•1h ago
In Gauss's time mathematicians would solve problems, publish the solutions in an encrypted form, and then challenge their contemporaries to solve the problems.
Here the authors of a paper on the arXiv say:
"To assess the ability of current AI systems to correctly answer research-level mathematics questions, we share a set of ten math questions which have arisen naturally in the research process of the authors. The questions had not been shared publicly until now; the answers are known to the authors of the questions but will remain encrypted for a short time."
Tao says:
"... the challenge is to see whether 10 research-level problems (that arose in the course of the authors research) are amenable to modern AI tools within a fixed time period (until Feb 13).
"The problems appear to be out of reach of current "one-shot" AI prompts, but were solved by human domain experts, and would presumably a fair fraction would also be solvable by other domain experts equipped with AI tools. They are technical enough that a non-domain-expert would struggle to verify any AI-generated output on these problems, so it seems quite challenging to me to have such a non-expert solve any of these problems, but one could always be surprised."