Ask HN: Why LLMs confidently hallucinate instead of admitting knowledge cutoff?

2•cryptography•4mo ago
I asked Claude about a library released in March 2025 (after its January cutoff). Instead of saying something like "I don't know, that's after my cutoff," it fabricated a detailed technical explanation: architecture, API design, use cases. Completely made up, but internally consistent and plausible.

What's confusing: the model clearly "knows" its cutoff date when asked directly, and can express uncertainty in other contexts. Yet it chooses to hallucinate instead of admitting ignorance.

Is this a fundamental architecture limitation, or just a training objective problem? Generating a coherent fake explanation seems more expensive than "I don't have that information."

Why haven't labs prioritized fixing this? Adding web search mostly solves it, which suggests it's not architecturally impossible to know when to defer.

Has anyone seen research or experiments that improve this behavior? Curious if this is a known hard problem or more about deployment priorities.

Comments

bigyabai•4mo ago
> Yet it chooses to hallucinate instead of admitting ignorance.

LLMs don't "choose" to do anything. They run inference over their weights. Text is an extremely limiting medium, and doesn't afford LLMs the distinction between fiction and reality.

barrister•4mo ago
If I ask Grok about anything that occurred this morning, he immediately starts reading and researching in real time. "Summarize what Leavitt said this morning." "Tell me what's new in python 3.14." Etc. What do you mean by "cutoff"? It seems unlikely that Claude is that limited.

perrygeo•4mo ago
Because LLMs do not have a model of the world. They can only make more words. They can't compare their own output to any objective reality.

NoFunPedant•4mo ago
Claude doesn't know when it's lying or when it's telling the truth. It doesn't know anything. It's a computer program. It manipulates symbols according to mathematical rules. It doesn't know what the symbols mean. It doesn't know how the symbols relate to real-world facts.

gus_massa•4mo ago
Bad training weights. They gave 1 point for each correct answer and 0 for each incorrect one, so the model learned to bullshit and complete with random nonsense.

Next time, they will use 1 point for each correct answer, -0.1 for each incorrect one, and 0 for "I don't know", and the model will behave. (And perhaps add some intermediate value for "I guess that [something]".)

We do that at the university. If the exam has 0 points for bad answers, I encourage my students to answer all of them.
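
To make the arithmetic behind that scoring scheme concrete, here is a minimal sketch of the expected score for guessing versus saying "I don't know". It only illustrates the exam-style rule described above; the function names and the `p_correct` parameter are made up for this example, and it is not a description of how any lab actually trains or evaluates a model.

```python
def expected_guess_score(p_correct, reward=1.0, penalty=0.0):
    """Expected score of answering when the answer is right with probability p_correct.

    reward  -- points for a correct answer (1 in the comment above)
    penalty -- points subtracted for a wrong answer (0 or 0.1 in the comment above)
    """
    return p_correct * reward - (1.0 - p_correct) * penalty


def break_even_confidence(reward=1.0, penalty=0.0):
    """Confidence above which guessing beats "I don't know" (which scores 0)."""
    return penalty / (reward + penalty)


def best_action(p_correct, reward=1.0, penalty=0.0):
    """Pick whichever option has the higher expected score."""
    if expected_guess_score(p_correct, reward, penalty) > 0.0:
        return "guess"
    return "abstain"


if __name__ == "__main__":
    for penalty in (0.0, 0.1):
        threshold = break_even_confidence(penalty=penalty)
        print(f"penalty={penalty}: guess when confidence > {threshold:.3f}")
        for p in (0.05, 0.5):
            print(f"  p_correct={p}: {best_action(p, penalty=penalty)}")
```

With no penalty, any nonzero chance of being right makes guessing the better option, which is the "answer all of them" advice. With a penalty of 0.1, the break-even confidence under these assumptions is 0.1 / 1.1, or about 9%, so only very low-confidence guesses are discouraged; a larger penalty would push that threshold higher.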