frontpage.
newsnewestaskshowjobs

Made with ♥ by @iamnishanth

Open Source @Github

fp.

minikeyvalue

https://github.com/commaai/minikeyvalue/tree/prod
2•tosh•3m ago•0 comments

Neomacs: GPU-accelerated Emacs with inline video, WebKit, and terminal via wgpu

https://github.com/eval-exec/neomacs
1•evalexec•7m ago•0 comments

Show HN: Moli P2P – An ephemeral, serverless image gallery (Rust and WebRTC)

https://moli-green.is/
2•ShinyaKoyano•11m ago•1 comments

How I grow my X presence?

https://www.reddit.com/r/GrowthHacking/s/UEc8pAl61b
2•m00dy•13m ago•0 comments

What's the cost of the most expensive Super Bowl ad slot?

https://ballparkguess.com/?id=5b98b1d3-5887-47b9-8a92-43be2ced674b
1•bkls•14m ago•0 comments

What if you just did a startup instead?

https://alexaraki.substack.com/p/what-if-you-just-did-a-startup
3•okaywriting•20m ago•0 comments

Hacking up your own shell completion (2020)

https://www.feltrac.co/environment/2020/01/18/build-your-own-shell-completion.html
2•todsacerdoti•23m ago•0 comments

Show HN: Gorse 0.5 – Open-source recommender system with visual workflow editor

https://github.com/gorse-io/gorse
1•zhenghaoz•24m ago•0 comments

GLM-OCR: Accurate × Fast × Comprehensive

https://github.com/zai-org/GLM-OCR
1•ms7892•25m ago•0 comments

Local Agent Bench: Test 11 small LLMs on tool-calling judgment, on CPU, no GPU

https://github.com/MikeVeerman/tool-calling-benchmark
1•MikeVeerman•26m ago•0 comments

Show HN: AboutMyProject – A public log for developer proof-of-work

https://aboutmyproject.com/
1•Raiplus•26m ago•0 comments

Expertise, AI and Work of Future [video]

https://www.youtube.com/watch?v=wsxWl9iT1XU
1•indiantinker•26m ago•0 comments

So Long to Cheap Books You Could Fit in Your Pocket

https://www.nytimes.com/2026/02/06/books/mass-market-paperback-books.html
3•pseudolus•27m ago•1 comments

PID Controller

https://en.wikipedia.org/wiki/Proportional%E2%80%93integral%E2%80%93derivative_controller
1•tosh•31m ago•0 comments

SpaceX Rocket Generates 100GW of Power, or 20% of US Electricity

https://twitter.com/AlecStapp/status/2019932764515234159
2•bkls•31m ago•0 comments

Kubernetes MCP Server

https://github.com/yindia/rootcause
1•yindia•32m ago•0 comments

I Built a Movie Recommendation Agent to Solve Movie Nights with My Wife

https://rokn.io/posts/building-movie-recommendation-agent
4•roknovosel•32m ago•0 comments

What were the first animals? The fierce sponge–jelly battle that just won't end

https://www.nature.com/articles/d41586-026-00238-z
2•beardyw•41m ago•0 comments

Sidestepping Evaluation Awareness and Anticipating Misalignment

https://alignment.openai.com/prod-evals/
1•taubek•41m ago•0 comments

OldMapsOnline

https://www.oldmapsonline.org/en
1•surprisetalk•43m ago•0 comments

What It's Like to Be a Worm

https://www.asimov.press/p/sentience
2•surprisetalk•43m ago•0 comments

Don't go to physics grad school and other cautionary tales

https://scottlocklin.wordpress.com/2025/12/19/dont-go-to-physics-grad-school-and-other-cautionary...
2•surprisetalk•43m ago•0 comments

Lawyer sets new standard for abuse of AI; judge tosses case

https://arstechnica.com/tech-policy/2026/02/randomly-quoting-ray-bradbury-did-not-save-lawyer-fro...
5•pseudolus•44m ago•0 comments

AI anxiety batters software execs, costing them combined $62B: report

https://nypost.com/2026/02/04/business/ai-anxiety-batters-software-execs-costing-them-62b-report/
1•1vuio0pswjnm7•44m ago•0 comments

Bogus Pipeline

https://en.wikipedia.org/wiki/Bogus_pipeline
1•doener•45m ago•0 comments

Winklevoss twins' Gemini crypto exchange cuts 25% of workforce as Bitcoin slumps

https://nypost.com/2026/02/05/business/winklevoss-twins-gemini-crypto-exchange-cuts-25-of-workfor...
2•1vuio0pswjnm7•46m ago•0 comments

How AI Is Reshaping Human Reasoning and the Rise of Cognitive Surrender

https://papers.ssrn.com/sol3/papers.cfm?abstract_id=6097646
3•obscurette•46m ago•0 comments

Cycling in France

https://www.sheldonbrown.com/org/france-sheldon.html
2•jackhalford•47m ago•0 comments

Ask HN: What breaks in cross-border healthcare coordination?

1•abhay1633•48m ago•0 comments

Show HN: Simple – a bytecode VM and language stack I built with AI

https://github.com/JJLDonley/Simple
2•tangjiehao•50m ago•0 comments
Open in hackernews

The Secret Meeting Where Mathematicians Struggled to Outsmart AI

https://www.scientificamerican.com/article/inside-the-secret-meeting-where-mathematicians-struggled-to-outsmart-ai/
47•fmihaila•8mo ago

Comments

xeonmc•8mo ago

    > While sparring with o4-mini was thrilling, its progress was also alarming. Ono and He express concern that the o4-mini’s results might be trusted too much. “There’s proof by induction, proof by contradiction, and then proof by intimidation,” He says. “If you say something with enough authority, people just get scared. I think o4-mini has mastered proof by intimidation; it says everything with so much confidence.”

I think there is a pitfall of designating a uniform categorization of “reasoning” like in this article; it is not surprising to hear that models are good at casting a wide net in fitting many different ideas together by association, however the subtle pitfalls in assuming that pieces fit together without unexpected interactions is something which require formal reasoning through instead of just correlating literature.
Reubend•8mo ago
> “There’s proof by induction, proof by contradiction, and then proof by intimidation,” He says. “If you say something with enough authority, people just get scared. I think o4-mini has mastered proof by intimidation; it says everything with so much confidence.”

Proof validation is the perfect solution to this, and indeed I would love to see future improvements to LLMs which allow them to formalize their proofs with a feedback loop from something like Lean or Coq so that they can ensure that hallucinations haven't occurred.

alimw•8mo ago
You can already try this in Cursor. It doesn't work too well right now but perhaps that's just because noone has tuned the loop.
AlexErrant•8mo ago
> “I came up with a problem which experts in my field would recognize as an open question in number theory—a good Ph.D.-level problem,” he says. He asked o4-mini to solve the question... o4-mini presented a correct but sassy solution

I wonder who gets first author credits on that paper.

ColinWright•8mo ago
It would be nice to know what the problem actually was.