Erdős Problem #1026

https://terrytao.wordpress.com/2025/12/08/the-story-of-erdos-problem-126/
85•tzury•5h ago

Comments

tzury•3h ago
This case study reveals the future of AI-assisted[1] work, far beyond mathematics.

It relies on a combination of Humans, LLMs ('General Tools'), Domain-Specific Tools, and Deep Research.

It is apparent that the static data encoded within an LLM is not enough; one must re-fetch sources and digest them fresh for the context of the conversation.

In this workflow, AlphaEvolve, Aristotle, and Lean are the 'PhDs' on the team, while the LLM is the Full Stack Developer that glues them all together.

[1] If one likes pompous terms, this is what 'AGI' will actually look like.

9u127v89yik2j3•1h ago
The author is the PhD on the team.

Literally not AGI.

baq•3h ago
I have no comments about the result itself, but the process and the AI policy which facilitated it are inspiring and easily transferable to any moderately complicated software engineering problem. Much to learn regardless of the maths.
nsoonhui•3h ago
But software engineering problems are fuzzier and less amenable to mathematical analysis, so exactly how can those AI policies developed for math be applied to software engineering problems?
boerseth•3h ago
Not sure which way the difference puts the pressure. Does the fuzziness require more prudent policies, or allow us to get away with less?
robrenaud•2h ago
I think you underestimate how powerful Lean is, and how close it is to the tedious part of formal math. A theorem prover needs to consult no outside resource. A formal-math, LLM-like generator need only consult the theorem prover to get rid of hallucinations. This is why it's actually much easier than SWE to optimize/hill climb on.

Low-level automated theorem proving is going to fall way quicker than most expected, like AlphaGo, precisely because an MCTS++ search over Lean proofs is scalable, amenable to self-play, and relevant to a significant chunk of professional math.

Legit, I almost wish the US and China would sign a Formal Mathematics Proliferation Treaty, as a sign of goodwill between very powerful parties who have much to gain from each other. When your theorem prover is sufficiently better than most Fields medalists alive, you share your arch/algorithms/process with the world. So Mathematics stays in the shared realm of human culture, and it doesn't just happen to belong to DeepMind, OpenAI, or Deepseek.
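
The generate-and-check loop described in this comment can be sketched in a few lines. This is only a toy analogy, not how Lean or any real prover pipeline works: the "theorem" is a claimed factorization of an integer, `noisy_generator` is a hypothetical stand-in for an unreliable LLM-like proposer, and `check` stands in for the sound verifier. The point it illustrates is that wrong proposals cost search budget but can never be accepted.

```python
import random
from typing import Callable, Iterator, Optional, Tuple

Candidate = Tuple[int, int]  # a claimed factorization (a, b) of n


def check(n: int, candidate: Candidate) -> bool:
    """Sound checker (assumed stand-in for a proof kernel):
    accept only a valid, nontrivial factorization of n."""
    a, b = candidate
    return a > 1 and b > 1 and a * b == n


def noisy_generator(n: int, rng: random.Random) -> Iterator[Candidate]:
    """Hypothetical unreliable proposer: most guesses are wrong."""
    while True:
        a = rng.randint(2, n - 1)
        # The pair is wrong whenever a does not divide n.
        yield (a, n // a)


def search(
    n: int,
    generator: Iterator[Candidate],
    checker: Callable[[int, Candidate], bool],
    budget: int,
) -> Optional[Candidate]:
    """Return the first candidate the checker accepts, or None
    if the budget runs out. Rejected guesses are simply dropped."""
    for _, candidate in zip(range(budget), generator):
        if checker(n, candidate):
            return candidate
    return None


if __name__ == "__main__":
    found = search(91, noisy_generator(91, random.Random(0)), check, 10_000)
    print("accepted candidate for 91:", found)
    none_found = search(97, noisy_generator(97, random.Random(0)), check, 1_000)
    print("accepted candidate for 97 (prime):", none_found)
```

Because acceptance depends only on `check`, the quality of the generator affects how fast a result is found, not whether an accepted result is correct. That is the property that makes the setup easy to hill climb on.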

baq•1h ago
On the contrary I think we're low key on the verge of model checkers being widely deployed in the industry. I've been experimenting with Opus 4.5 + Alloy and the preliminary results I'm getting are crossing usability thresholds in a step-function pattern (not surprising IMHO), I just haven't seen anyone pick up on it publicly yet.

The workflow I'm envisioning here is that the plan document we're all making nowadays isn't translated directly into code, but into a TLA+/Alloy/... model that serves as executable docs, and only then lowered into the code space while conformance is continuously monitored (which is where the toil makes it not worth it most of the time without LLMs). The AI literature search for similar problems and solutions is also obviously helpful during all phases of the sweng process.
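
A minimal sketch of what "conformance is continuously monitored" could mean in practice, assuming the model has already been reduced to a table of allowed state transitions. It does not use TLA+ or Alloy; the order lifecycle in `SPEC`, the `INITIAL` state, and the function name are all hypothetical, chosen only to show a recorded trace being checked against a model.

```python
from typing import Dict, Iterable, Optional, Set, Tuple

# Hypothetical model: allowed transitions of an order lifecycle.
SPEC: Dict[str, Set[str]] = {
    "created": {"paid", "cancelled"},
    "paid": {"shipped", "refunded"},
    "shipped": set(),
    "cancelled": set(),
    "refunded": set(),
}
INITIAL = "created"


def first_violation(trace: Iterable[str]) -> Optional[Tuple[int, str, str]]:
    """Return (index, from_state, to_state) for the first step the
    model forbids, or None if the whole trace conforms."""
    states = list(trace)
    if not states:
        return (0, "<start>", "<empty>")
    if states[0] != INITIAL:
        return (0, "<start>", states[0])
    for i in range(1, len(states)):
        prev, cur = states[i - 1], states[i]
        if cur not in SPEC.get(prev, set()):
            return (i, prev, cur)
    return None


if __name__ == "__main__":
    print(first_violation(["created", "paid", "shipped"]))
    print(first_violation(["created", "shipped"]))
```

In a real setup the trace would come from logs or instrumentation of the running system, and the transition table would be derived from the formal model rather than written by hand.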

gaigalas•15m ago
Is it trivial for any mathematician to understand Lean code?

I'm curious if there is a scenario in which a large automated proof is achieved but there would be no practical means of getting any understanding of what it means.

I'm an engineer. Think like this: a large complex program that compiles but you don't understand what it does or how to use it. Is such a thing possible?
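
A tiny Lean 4 sketch of the distinction raised here, assuming a recent toolchain in which the `omega` tactic is available; the theorem name is made up for illustration. The statement on the first line is readable on its own, while the proof is delegated to automation and gives the reader no insight into why the claim holds.

```lean
-- The statement is what a human reads and trusts the checker on.
theorem sum_swap (a b c : Nat) : a + b + c = c + b + a := by
  -- The proof is found by a decision procedure; it explains nothing.
  omega
```

The same split scales up: the checker guarantees that the stated theorem follows from the definitions, but understanding what a long automated proof means still depends on reading the statement and the definitions it rests on.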
