You don't need Mac mini to run OpenClaw

https://runclaw.sh
1•rutagandasalim•49s ago•0 comments

Learning to Reason in 13 Parameters

https://arxiv.org/abs/2602.04118
1•nicholascarolan•2m ago•0 comments

Convergent Discovery of Critical Phenomena Mathematics Across Disciplines

https://arxiv.org/abs/2601.22389
1•energyscholar•3m ago•1 comments

Ask HN: Will GPU and RAM prices ever go down?

1•alentred•3m ago•0 comments

From hunger to luxury: The story behind the most expensive rice (2025)

https://www.cnn.com/travel/japan-expensive-rice-kinmemai-premium-intl-hnk-dst
1•mooreds•4m ago•0 comments

Substack makes money from hosting Nazi newsletters

https://www.theguardian.com/media/2026/feb/07/revealed-how-substack-makes-money-from-hosting-nazi...
4•mindracer•5m ago•1 comments

A New Crypto Winter Is Here and Even the Biggest Bulls Aren't Certain Why

https://www.wsj.com/finance/currencies/a-new-crypto-winter-is-here-and-even-the-biggest-bulls-are...
1•thm•5m ago•0 comments

Moltbook was peak AI theater

https://www.technologyreview.com/2026/02/06/1132448/moltbook-was-peak-ai-theater/
1•Brajeshwar•6m ago•0 comments

Why Claude Cowork is a math problem Indian IT can't solve

https://restofworld.org/2026/indian-it-ai-stock-crash-claude-cowork/
1•Brajeshwar•6m ago•0 comments

Show HN: Built an space travel calculator with vanilla JavaScript v2

https://www.cosmicodometer.space/
2•captainnemo729•6m ago•0 comments

Why a 175-Year-Old Glassmaker Is Suddenly an AI Superstar

https://www.wsj.com/tech/corning-fiber-optics-ai-e045ba3b
1•Brajeshwar•6m ago•0 comments

Micro-Front Ends in 2026: Architecture Win or Enterprise Tax?

https://iocombats.com/blogs/micro-frontends-in-2026
1•ghazikhan205•8m ago•0 comments

These White-Collar Workers Actually Made the Switch to a Trade

https://www.wsj.com/lifestyle/careers/white-collar-mid-career-trades-caca4b5f
1•impish9208•9m ago•1 comments

The Wonder Drug That's Plaguing Sports

https://www.nytimes.com/2026/02/02/us/ostarine-olympics-doping.html
1•mooreds•9m ago•0 comments

Show HN: Which chef knife steels are good? Data from 540 Reddit tread

https://new.knife.day/blog/reddit-steel-sentiment-analysis
1•p-s-v•9m ago•0 comments

Federated Credential Management (FedCM)

https://ciamweekly.substack.com/p/federated-credential-management-fedcm
1•mooreds•10m ago•0 comments

Token-to-Credit Conversion: Avoiding Floating-Point Errors in AI Billing Systems

https://app.writtte.com/read/kZ8Kj6R
1•lasgawe•10m ago•1 comments

The Story of Heroku (2022)

https://leerob.com/heroku
1•tosh•10m ago•0 comments

Obey the Testing Goat

https://www.obeythetestinggoat.com/
1•mkl95•11m ago•0 comments

Claude Opus 4.6 extends LLM pareto frontier

https://michaelshi.me/pareto/
1•mikeshi42•11m ago•0 comments

Brute Force Colors (2022)

https://arnaud-carre.github.io/2022-12-30-amiga-ham/
1•erickhill•14m ago•0 comments

Google Translate apparently vulnerable to prompt injection

https://www.lesswrong.com/posts/tAh2keDNEEHMXvLvz/prompt-injection-in-google-translate-reveals-ba...
1•julkali•14m ago•0 comments

(Bsky thread) "This turns the maintainer into an unwitting vibe coder"

https://bsky.app/profile/fullmoon.id/post/3meadfaulhk2s
1•todsacerdoti•15m ago•0 comments

Software development is undergoing a Renaissance in front of our eyes

https://twitter.com/gdb/status/2019566641491963946
1•tosh•16m ago•0 comments

Can you beat ensloppification? I made a quiz for Wikipedia's Signs of AI Writing

https://tryward.app/aiquiz
1•bennydog224•17m ago•1 comments

Spec-Driven Design with Kiro: Lessons from Seddle

https://medium.com/@dustin_44710/spec-driven-design-with-kiro-lessons-from-seddle-9320ef18a61f
1•nslog•17m ago•0 comments

Agents need good developer experience too

https://modal.com/blog/agents-devex
1•birdculture•18m ago•0 comments

The Dark Factory

https://twitter.com/i/status/2020161285376082326
1•Ozzie_osman•18m ago•0 comments

Free data transfer out to internet when moving out of AWS (2024)

https://aws.amazon.com/blogs/aws/free-data-transfer-out-to-internet-when-moving-out-of-aws/
1•tosh•19m ago•0 comments

Interop 2025: A Year of Convergence

https://webkit.org/blog/17808/interop-2025-review/
1•alwillis•21m ago•0 comments

First Proof

https://arxiv.org/abs/2602.05192
33•samasblack•1h ago

Comments

samasblack•1h ago
https://1stproof.org/#about
happa•1h ago
February 13th is a pretty close deadline. They should at least have given a month.
blenderob•1h ago
February 13 seems right to me. I mean, it's not like LLMs need to manually write out a 10-page proof. But a longer deadline can give human mathematicians time to solve the problem and write out a proof. A close deadline advantages the LLM and disadvantages humans, which should be the goal if we want to see if LLMs are able to solve these.
baal80spam•1h ago
I'll patiently wait for the "goalpost moving olympics" after this is published.
blenderob•1h ago
The goalposts have been on wheels basically since the field was born. Look up "AI effect". I've stopped caring what HN comments have to say about whether something is or isn't AI. If it's useful to me, I'm gonna use it.
blenderob•1h ago
Can someone explain how this would work?

> the answers are known to the authors of the questions but will remain encrypted for a short time.

Ok. But humans may be able to solve the problems too. What prevents Anthropic or OpenAI from hiring mathematicians, having them write the proofs, and passing them off as LLM-written? I'm not saying that's what they'll do. But shouldn't the paper say something about how they're going to validate that this doesn't happen?

Honest question here. Not trying to start a flame war. Honestly confused about how this is going to test what it wants to test. Or maybe I'm just plain confused. Can someone help me understand this?

yorwba•1h ago
This is not a benchmark. They just want to give people the opportunity to try their hand at solving novel questions with AI and see what happens. If an AI company pulls a solution out of their hat that cannot be replicated with the products they make available to ordinary people, that's hardly worth bragging about and in any case it's not the point of the exercise.
cocoto•25m ago
They could solve the problems and train the next models on the answers, so that future models could “solve” these.
fph•16m ago
The authors mention that before publication they tested these questions on Gemini and GPT, so the questions have already been available to the two biggest players; they have a head start.
conformist•1h ago
It's possible but unlikely given the short timeline, the diverse questions that would require multiple mathematicians, and the low stakes. Also, they've already run preliminary tests.
blenderob•50m ago
> It's possible but unlikely given the short timeline

Yep. "possible but unlikely" was my take too. As another person commented, this isn't really a benchmark, and as long as that's clear, it seems fair. My only fear is that some submissions may be AI-assisted rather than fully AI-generated, with crucial insights coming from experienced mathematicians. That's still a real achievement even if it's human + AI collaboration. But I fear that the nuance would be lost on news media and they'll publish news about the dawn of fully autonomous math reasoning.

falloutx•1h ago
Anything special about these questions? Are they unsolved by humans? I am not working in mathematics research, so it's hard to tell the importance.
jsnell•1h ago
The abstract of the article is very short, and seems pretty clear on both of your questions.

This is what is special about them:

> a set of ten math questions which have arisen naturally in the research process of the authors. The questions had not been shared publicly until now;

I.e. these are problems of some practical interest, not just performative/competitive maths.

And this is what is known about the solutions:

> the answers are known to the authors of the questions but will remain encrypted for a short time.

I.e. a solution is known, but is guaranteed to not be in the training set for any AI.

blenderob•1h ago
> I.e. a solution is known, but is guaranteed to not be in the training set for any AI.

Not a mathematician and obviously you guys understand this better than I do. One thing I can't understand is how they're going to judge if a solution was AI written or human written. I mean, a human could also potentially solve the problem and pass it off as AI? You might say why would a human want to do that? Normal mathematicians might not want to do that. But mathematicians hired by Anthropic or OpenAI might want to do that to pass it off as AI achievements?

teraflop•32m ago
Well, I think the paper answers that too. These problems are intended as a tool for honest researchers to use for exploring the capabilities of current AI models, in a reasonably fair way. They're specifically not intended as a rigorous benchmark to be treated adversarially.

Of course a math expert could solve the problems themselves and lie by saying that an AI model did it. In the same way, somebody with enough money could secretly film a movie and then claim that it was made by AI. That's outside the scope of what this paper is trying to address.

The point is not to score models based on how many of the problems they can solve. The point is to look at the models' responses and see how good they are at tackling the problem. And that's why the authors say that ideally, people solving these problems with AI would post complete chat transcripts (or the equivalent) so that readers can assess how much of the intellectual contribution actually came from AI.

_alternator_•50m ago
These are very serious research level math questions. They are not “Erdős style” questions; they look more like problems or lemmas that I encountered while doing my PhD. Things that don’t make it into the papers but were part of an interesting diversion along the way.

It seems likely that PhD students in the subfields of the authors are capable of solving these problems. What makes them interesting is that they seem to require fairly high research level context to really make progress.

It’s a test of whether the LLMs can really synthesize results from knowledge that takes a human several years of postgraduate preparation in a specific research area to acquire.

clickety_clack•27m ago
So these are like those problems that are “left for the reader”?
Jaxan•6m ago
Not necessarily. Even the statements may not appear in the final paper. The questions arose during research, and understanding them was needed for the authors to progress, but maybe not needed for the goal in mind.
richard_chase•47m ago
Interesting questions. I think I'll attempt #7.