'Probably' doesn't mean the same thing to your AI as it does to you

https://theconversation.com/probably-doesnt-mean-the-same-thing-to-your-ai-as-it-does-to-you-275626
7•colinprince•2h ago

Comments

OkayPhysicist•1h ago
I wonder if the 70% vs 80% "Probably" problem comes from cultural differences between anglophone countries. The human datasets that were available were mostly American, with some Western Europe/NATO. Notably missing would be India, which simply by population I'd expect to represent a significant chunk of English-language writing available on the open internet (and thus fed into LLM training sets).

The other phenomenon I would love to test is whether the act of surveying people affected their declared odds. Not sure how to get good numbers out of that, but I could see the LLM vs surveyed human discrepancy arising from people using "probably" differently in their everyday writing, as opposed to when asked point-blank what "probably" means.

5o1ecist•1h ago
> The research focused on words of estimative probability, which include terms like “maybe,” “probably” and “almost certain.”

Interesting. Perplexity did that as well, but I've made sure it stops doing that.

Might be relevant for others: https://www.perplexity.ai/search/hey-hey-do-you-remember-whe...

selridge•1h ago
Alignment is impossible here. “Nearly certain” odds for success for a sports team might be 20:1, but that’s a little worse (not much!) than for a launch vehicle and not at all good for a web server. No one would say “it is nearly certain that I’ll serve a web request” based on two 9’s, but they would say “it is nearly certain the team will win today” given the same odds. That’s just between humans.
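
To put rough numbers on the comparison above, here is a minimal sketch that converts both phrasings into probabilities. It assumes "20:1" means 20 to 1 in favour and that "two 9's" means 99% availability; the function names are made up for illustration.

```python
from fractions import Fraction


def odds_to_probability(in_favour: int, against: int = 1) -> Fraction:
    """Convert odds of in_favour:against into a probability of success."""
    return Fraction(in_favour, in_favour + against)


def nines_to_probability(nines: int) -> Fraction:
    """Convert 'n nines' of availability into a probability (2 nines = 99%)."""
    return 1 - Fraction(1, 10 ** nines)


team = odds_to_probability(20, 1)   # 20/21, a bit over 95%
server = nines_to_probability(2)    # 99/100

# Expected failures per 1000 attempts under each figure.
team_failures = float((1 - team) * 1000)
server_failures = float((1 - server) * 1000)

print(f"team:   p={float(team):.4f}, ~{team_failures:.1f} failures per 1000")
print(f"server: p={float(server):.4f}, ~{server_failures:.1f} failures per 1000")
```

Under these assumptions the two figures are close but not identical (20/21 versus 99/100). The point stands either way: a success rate that reads as "nearly certain" for a match is unacceptable for a web server.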

rcarr•1h ago
Something I noticed recently is that Claude Code interprets "or" as inclusive or (or at least it does when writing function names). I suspect that this must be due to its code-specific nature, considering I would expect the majority of uses of "or" in written language to be exclusive or.
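
For anyone unsure of the distinction, a small sketch of the two readings of "or" in Python. The helper names are made up for illustration and say nothing about how Claude Code actually names things.

```python
def inclusive_or(a: bool, b: bool) -> bool:
    """True when at least one operand is true (the usual programming 'or')."""
    return a or b


def exclusive_or(a: bool, b: bool) -> bool:
    """True when exactly one operand is true (the 'soup or salad' reading)."""
    return a != b


# Print the full truth table for both readings.
for a in (False, True):
    for b in (False, True):
        print(a, b, inclusive_or(a, b), exclusive_or(a, b))
```

The two readings only disagree when both operands are true, which is exactly the case an ambiguous function name leaves open.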

jadenPete•30m ago
It seems like this problem (differences in how humans and LLMs use probabilistic language) and hallucination are one and the same. LLMs don’t have access to information about how confident they are, so they always choose the most likely response, even if the most likely response isn’t actually that likely. Whereas if a human is unconfident, they’ll express that instead of choosing the most likely response.
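
A toy sketch of the "always choose the most likely response" scenario, assuming plain greedy selection over a softmax. The candidate answers and logits are invented, and real systems typically sample rather than always taking the argmax, so treat this as an illustration of the idea rather than a description of any particular model.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def greedy_pick(candidates, logits):
    """Return the highest-probability candidate and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return candidates[best], probs[best]


# Hypothetical candidates and scores, for illustration only.
candidates = ["Paris", "Lyon", "Marseille", "Nice"]
confident = greedy_pick(candidates, [9.0, 1.0, 0.5, 0.2])
unsure = greedy_pick(candidates, [1.1, 1.0, 0.9, 0.8])

print(f"peaked distribution: {confident[0]} (p={confident[1]:.3f})")
print(f"flat distribution:   {unsure[0]} (p={unsure[1]:.3f})")
```

Both calls return the same answer, but in the second case the winner holds well under a third of the probability mass. The emitted text alone gives no hint of that difference, which is the gap the comment is pointing at.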

Of course, LLMs can still speak about probabilities and mimic uncertainty, but that’s likely (heh) coming from their training data on the subject matter, not their actual confidence.

Humans are interesting because they employ a two-phased approach: when we’re learning, we fake confidence (you’d never write “I don’t know” on a test unless you truly had nothing of value to say), but during inference, we communicate our confidence. Some humans suffer from underconfidence or overconfidence, but most just seem to know innately how to do this.

Can anyone who works on LLMs clarify whether my understanding is correct?
