What's confusing: the model clearly "knows" its cutoff date when asked directly, and can express uncertainty in other contexts. Yet it chooses to hallucinate instead of admitting ignorance.
Is this a fundamental architectural limitation, or just a training-objective problem? Generating a coherent fake explanation seems more expensive than saying "I don't have that information."
Why haven't labs prioritized fixing this? Adding web search mostly solves it, which suggests it's not architecturally impossible to know when to defer.
Has anyone seen research or experiments that improve this behavior? Curious if this is a known hard problem or more about deployment priorities.
bigyabai•1h ago
LLMs don't "choose" to do anything. They run inference over their weights. Text is an extremely limited medium, and it doesn't give an LLM any built-in distinction between fiction and reality.
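A minimal sketch of what that means at decode time (plain NumPy, not any particular model's or library's actual code): the "answer" is just repeated sampling from a next-token distribution, and "I don't know" is only another token sequence that the trained weights may or may not make probable in a given context.

    # Toy illustration: an LLM's output is produced by sampling tokens from
    # a softmax over logits. There is no separate flag marking a continuation
    # as fact vs. fiction -- only whatever the weights assign probability to.
    import numpy as np

    def sample_next_token(logits: np.ndarray, temperature: float = 1.0) -> int:
        """Pick the next token id from the model's output logits."""
        scaled = logits / temperature
        probs = np.exp(scaled - scaled.max())   # numerically stable softmax
        probs /= probs.sum()
        return int(np.random.choice(len(probs), p=probs))

    # A confident-sounding hallucination and a correct answer come out of the
    # exact same mechanism; abstaining is just one more possible continuation.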