
Think of a Number

https://xenaproject.wordpress.com/2025/01/20/think-of-a-number/
40•IdealeZahlen•7mo ago

Comments

AnotherGoodName•7mo ago
A great example of this is asking AI to ingest advanced maths papers and restate them with detailed annotations. This should be simple, but the AI fails at it.

A lot of maths is terse. It can take years to grok a very advanced topic. E.g. the ABC conjecture is supposed to be solved by https://en.wikipedia.org/wiki/Inter-universal_Teichm%C3%BCll... but that theory is tough even for the smartest minds, so it's still considered up in the air whether it's solved or not; not enough mathematicians grok it yet to have a consensus. It's not disproven as nonsense, the paper appears to make sense. It's just that it's a very advanced topic that takes years to understand.

So as someone wanting to understand such topics you may be tempted to have AI read the paper and give annotations and summaries. You might be tempted to have AI give some numeric examples of formulas.

Guess what happens? COMPLETE AND TOTAL FAILURE. The AI can't do it. Because the paper has no online examples where people have written numeric examples and given annotations, there's nothing for the AI to go off. It gives numeric examples with mistakes that don't even match the statement it's meant to be giving an example of. Often it gives up with statements like, "At this point the numeric example fails to solve the solution but you can imagine if it did". You can ask it to try and try again but it just keeps failing. Even simple and well-known papers generally don't work unless there's already a simple explanation someone's already posted online that it can regurgitate.

Which is pretty damning, right? Reading a paper, giving numeric examples of what the paper states, and giving plain English summaries of the most dense portions should be what a language processing system does best. We're not even asking it to come up with original ideas here. We're asking it to summarise well-known mathematical papers. The only time I've seen it succeed is when someone has already written such an explanation on MathOverflow.

jordigh•7mo ago
> It's not disproven as nonsense, the paper appears to make sense

Not obviously utter nonsense, but a couple of mathematicians who have studied it have claimed to have found gaps and were unsatisfied with the resolution to those gaps that Mochizuki offered.

It's kind of like, well, LLM output. Has the right shape but upon scrutiny it seems to fall apart. Plausible-looking but probably nonsense.

BlackFingolfin•7mo ago
A follow-up post is at https://xenaproject.wordpress.com/2025/03/16/think-of-a-numb...

jenny91•7mo ago
Mathematics is such a wide field, and the questions asked here are ill-defined.

If the comment is "the AI founder bros are hyping it up and it's not as good as they claim", I think we all agree that's true. LLMs are good, but exactly how good depends on many subjective points.

If the question is: "can we come up with questions that are easy for some tiny niche set of experts, but basically impossible for an LLM", I think the answer will always be "yes", especially if you can make "niche set of experts" more and more niche every time.

If the question is "will mathematicians be unemployed in a few years", obviously the answer is also "no".

If the question is "can LLMs be used to speed up mathematics research", the answer is "yes and no, depending on what you're doing".

prats226•7mo ago
One issue: as soon as you make the questions public, even just by letting hosted LLMs predict on them, they are tainted. You can't use them anymore. So would it be a one-time test dataset?

npodbielski•7mo ago
It would; that was explained in the article. He never actually did it, though. That is covered in the follow-up article in the top comment.

jbs789•7mo ago
Interesting idea. Once you have the questions and get some buy-in… have you considered how you’d deal with an employee solving the problem and modifying the model before you get your results back? It would be a sleazy thing to do but I can imagine sneakiness around how folks interpret versions or modifications etc. Wonder if you or some third party just runs the question over the model.

npodbielski•7mo ago
It was half a year ago. He did not get enough and gave up. The top comment is the follow-up.