Designing Predictable LLM-Verifier Systems for Formal Method Guarantee

https://arxiv.org/abs/2512.02080
59•PaulHoule•1mo ago

Comments

brantmv•1mo ago
Maybe I'm wrong, but it looks like the authors did not actually have any LLMs write or verify any code for their experiments. Instead, their experiments consist of simulating the simplified Markov chain model itself. They simulated their simple Markov chain and checked if the theorem's predictions matched empirical statistics. This amounts to a test not of their model, but of basic Markov chain theory.

Did I misread or miss something?

brantmv•1mo ago
Also, the mathematical content here is pretty thin. Their main theorem has nothing to do with LLMs directly. It's a theorem about a five-state Markov chain, and the proof follows from standard Markov chain theory.

For those reasons, the grandiose name "LLM-Verifier Convergence Theorem" does not sit well with me.
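To make the point concrete, the kind of experiment described above can be reproduced in a few lines. This is only a toy sketch, not the paper's code. It assumes a chain of four sequential stages plus one absorbing "verified" state, a single success probability `p` shared by every stage, and a failed stage that simply retries. The function names are made up for illustration.

```python
import random

def steps_to_verified(p, n_stages=4, rng=random):
    """Count steps until the toy chain reaches the absorbing 'verified' state.

    Assumed structure (not taken from the paper): states 0..n_stages,
    each step advances one state with probability p, otherwise stays put.
    """
    state, steps = 0, 0
    while state < n_stages:
        steps += 1
        if rng.random() < p:
            state += 1
    return steps

def mean_steps(p, trials=20_000, seed=0):
    """Empirical mean absorption time over many simulated runs."""
    rng = random.Random(seed)
    return sum(steps_to_verified(p, rng=rng) for _ in range(trials)) / trials

def expected_steps(p, n_stages=4):
    """Textbook value: a sum of n_stages geometric waiting times, each with mean 1/p."""
    return n_stages / p
```

Comparing `mean_steps(p)` with `expected_steps(p)` only checks that a simulated Markov chain behaves like a Markov chain. No LLM or verifier is involved anywhere.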

mapontosevenths•1mo ago
This line made me pause:

"We prove that for any non-zero stage success probability, the system reaches the verified state almost surely"

What's the point if it's still stochastic?

IanCal•1mo ago
Hash collisions are possible but can be provably so rare that they’re not a relevant concern.
jaggederest•1mo ago
"almost surely" means "happens with probability 1", which in infinite set contexts doesn't mean that there aren't other outcomes, but that they have probability 0.

So like, imagine that you had some finite list of numbers, and you were picking a random real number uniformly from a continuous range, say 0 to 1 - because there are infinitely many possible values, any finite set has 0 probability, but that doesn't mean those outcomes don't exist.

https://en.wikipedia.org/wiki/Almost_surely
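
A small worked example may help (a plain retry illustration, not taken from the paper). Assume a step succeeds with probability p > 0 on each attempt, independently of earlier attempts. Then:

```latex
\Pr(\text{no success in the first } n \text{ attempts}) = (1-p)^n
\;\xrightarrow{\;n\to\infty\;}\; 0,
\qquad\text{so}\qquad
\Pr(\text{success eventually}) = 1 .
```

The sequence where every attempt fails is still a possible outcome. It just carries probability 0, and for any finite n the failure probability (1-p)^n is strictly positive whenever p < 1.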

mapontosevenths•1mo ago
Thank you. That makes this a pretty big deal doesn't it?

The ability to deterministically identify that code eventually reaches a halting state implies that we can use these stochastic tools to generate deterministic outcomes reliably in the future, doesn't it?

jaggederest•1mo ago
Well, reliably but still with a chance of failure - in the same way that you can have a program which is provably correct but can still run into real world issues like being killed, but yes I would say that "almost surely" is a pretty large jump from "more than likely" (50%+1) where I'd say LLM output generally lives these days.
MiniMax42•1mo ago
> a chance of failure

Well, technically, no chance of failure. The chance of failure is absolute zero. Not close to zero, absolute zero. There will be no failure if the assumptions of the model are correct.

The real catch here is in the assumptions.

How long do you have before you need to have a solution? An hour, a year, a century? Too bad, almost sure convergence only provides a guarantee if you wait an infinite amount of time.

And then there's the question of the probability space you assume. (The sigma algebra.) Which things do you assume to have probability zero from the start and is that realistic?
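
The finite-time caveat can be put in numbers with a rough sketch. It assumes a toy model that is not taken from the paper: `n_stages` sequential stages, each passed independently with the same probability `p` per step, so the run is still unverified after `n` steps exactly when fewer than `n_stages` successes occurred in those `n` steps. The function name is hypothetical.

```python
from math import comb

def prob_not_verified_by(n, p, n_stages=4):
    """P(toy chain is still unverified after n steps).

    Assumption: each step is an independent success with probability p,
    and verification needs n_stages successes in total, so this is the
    binomial probability of seeing fewer than n_stages successes in n steps.
    """
    max_successes = min(n_stages, n + 1)  # cannot have more successes than steps
    return sum(
        comb(n, j) * p**j * (1 - p) ** (n - j)
        for j in range(max_successes)
    )
```

For any `p` strictly between 0 and 1 this value shrinks toward 0 as `n` grows but never reaches it at a finite `n`, which is the gap between "almost surely" and a deadline.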

mapontosevenths•1mo ago
> How long do you have before you need to have a solution? An hour, a year, a century? Too bad, almost sure convergence only provides a guarantee if you wait an infinite amount of time.

Thanks for this. I was actually just thinking "this can't actually work, it would mean P vs NP is solved." Of course, this explains why it doesn't mean that.

werf456•1mo ago
You can check out this recent paper on scalable formal verification of LLMs, "BEAVER: An Efficient Deterministic LLM Verifier": https://arxiv.org/abs/2512.05439
lebron72•1mo ago
This paper looks pretty groundbreaking. The ability to verify LLMs at scale (e.g., 70B) on real-world tasks like math reasoning and code security is extremely impressive and impactful.