Erdős Problem #1026

https://terrytao.wordpress.com/2025/12/08/the-story-of-erdos-problem-126/
86•tzury•6h ago

Comments

tzury•4h ago
This case study reveals the future of AI-assisted[1] work, far beyond mathematics.

It relies on a combination of Humans, LLMs ('General Tools'), Domain-Specific Tools, and Deep Research.

It is apparent that the static data encoded within an LLM is not enough; one must re-fetch sources and digest them fresh for the context of the conversation.

In this workflow, AlphaEvolve, Aristotle, and Lean are the 'PhDs' on the team, while the LLM is the Full Stack Developer that glues them all together.

[1] If one likes pompous terms, this is what 'AGI' will actually look like.

9u127v89yik2j3•1h ago
The author is the PhD on the team.

Literally not AGI.

baq•4h ago
I have no comments about the result itself, but the process and the AI policy that facilitated it are inspiring and easily transferable to any moderately complicated software engineering problem. Much to learn regardless of the maths.

nsoonhui•3h ago
But software engineering problems are more fuzzy and less amenable to mathematical analysis, so exactly how can those AI policies developed for math be applied to software engineering problems?

boerseth•3h ago
Not sure which way the difference puts the pressure. Does the fuzziness require more prudent policies, or allow us to get away with less?

robrenaud•2h ago
I think you underestimate how powerful Lean is, and how close it is to the tedious part of formal math. A theorem prover needs to consult no outside resource. A formal math LLM-like generator need only consult the theorem prover to get rid of hallucinations. This is why it's actually much easier than SWE to optimize/hill climb on.
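The generate-then-check loop described above can be sketched in a few lines. This is only a toy under stated assumptions: integer factorization stands in for theorem proving, `verify` plays the role of the proof checker, and `propose` plays the role of an unreliable generator. All three function names are hypothetical and have nothing to do with Lean's actual API.

```python
import random
from typing import Optional, Tuple

Candidate = Tuple[int, int]


def verify(n: int, candidate: Candidate) -> bool:
    """Stand-in for the proof checker: a cheap, exact, self-contained test."""
    a, b = candidate
    return a > 1 and b > 1 and a * b == n


def propose(n: int, rng: random.Random) -> Candidate:
    """Stand-in for the generator: plausible-looking but usually wrong guesses."""
    a = rng.randint(2, n - 1)
    # Floor division yields a wrong partner whenever a does not divide n.
    return a, n // a


def search(n: int, attempts: int, seed: int = 0) -> Optional[Candidate]:
    """Return the first proposal the checker accepts, or None if none is found."""
    rng = random.Random(seed)
    for _ in range(attempts):
        candidate = propose(n, rng)
        if verify(n, candidate):
            return candidate
    return None


if __name__ == "__main__":
    print(search(91, attempts=10_000))
```

The point of the design is that wrong proposals cost only a rejected check: nothing unverified ever leaves `search`, so the generator can be as unreliable as it likes.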

Low level, automated theorem proving is going to fall way quicker than most expected, like AlphaGo, precisely because an MCTS++ search over Lean proofs is scalable, amenable to self-play, and relevant to a significant chunk of professional math.

Legit, I almost wish the US and China would sign a Formal Mathematics Proliferation Treaty, as a sign of goodwill between very powerful parties who have much to gain from each other. When your theorem prover is sufficiently better than most Fields medalists alive, you share your arch/algorithms/process with the world. So Mathematics stays in the shared realm of human culture, and it doesn't just happen to belong to DeepMind, OpenAI, or Deepseek.

baq•1h ago
On the contrary I think we're low key on the verge of model checkers being widely deployed in the industry. I've been experimenting with Opus 4.5 + Alloy and the preliminary results I'm getting are crossing usability thresholds in a step-function pattern (not surprising IMHO), I just haven't seen anyone pick up on it publicly yet.

The workflow I'm envisioning is that the plan document we're all making nowadays isn't translated directly into code, but first into a TLA+/Alloy/... model as executable docs, and only then lowered into the code space while conformance is continuously monitored (which is where the toil makes it not worth it most of the time without LLMs). The AI literature search for similar problems and solutions is also obviously helpful during all phases of the sweng process.
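A minimal sketch of the "conformance is continuously monitored" step, using a plain Python transition table as a stand-in for a TLA+/Alloy model. The order workflow, the `MODEL` table, and `check_trace` are hypothetical names chosen for illustration. A real setup would presumably derive the allowed transitions from the model itself rather than write them by hand.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical model: the transitions an order is allowed to make.
MODEL: Dict[str, Set[str]] = {
    "created": {"paid", "cancelled"},
    "paid": {"shipped", "refunded"},
    "shipped": set(),
    "cancelled": set(),
    "refunded": set(),
}

Violation = Tuple[int, str, str]  # (index in trace, previous state, offending state)


def check_trace(
    trace: List[str],
    model: Dict[str, Set[str]] = MODEL,
    initial: str = "created",
) -> List[Violation]:
    """Compare a recorded sequence of states against the model.

    Returns every step the model does not allow; an empty list means the
    observed behaviour conforms.
    """
    violations: List[Violation] = []
    if trace and trace[0] != initial:
        violations.append((0, "<start>", trace[0]))
    for i in range(1, len(trace)):
        prev, cur = trace[i - 1], trace[i]
        if cur not in model.get(prev, set()):
            violations.append((i, prev, cur))
    return violations


if __name__ == "__main__":
    print(check_trace(["created", "paid", "shipped"]))  # conforming trace
    print(check_trace(["created", "shipped"]))          # skips payment
```

Run against traces logged from the real implementation, a check like this flags the first point where code and model disagree, which is the toil an LLM could take over.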

gaigalas•40m ago
Is it trivial for any mathematician to understand Lean code?

I'm curious if there is a scenario in which a large automated proof is achieved but there would be no practical means of getting any understanding of what it means.

I'm an engineer. Think like this: a large complex program that compiles but you don't understand what it does or how to use it. Is such a thing possible?
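A tiny illustration of the gap being asked about, assuming a recent Lean 4 toolchain where the `omega` tactic is available. The theorem name is hypothetical. The statement can be read on its own, while the proof the checker accepts is produced by a decision procedure and is not written for human reading.

```lean
-- The statement is short and readable: addition on natural numbers commutes.
theorem add_comm_sketch (a b : Nat) : a + b = b + a := by
  -- The proof is delegated to a decision procedure; the checker accepts it,
  -- but the generated proof term is not meant for human reading.
  omega
```

At this scale nothing is lost, but the same split between a readable statement and an unread proof is what a very large automated proof would have.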
