Why does ChatGPT think mammoths were alive in December?

https://ramblingafter.substack.com/p/why-does-chatgpt-think-mammoths-were

4•wasabi991011•2mo ago

Comments

jqpabc123•2mo ago

Why does anyone still find this surprising?

It's disappointing that we (as a society) are investing such huge sums of money in this "amazing" new technology.

allears•2mo ago

I totally agree. Even though the author states over and over that an LLM is just a pattern-matching machine, he still anthropomorphizes the responses. To say that an LLM "makes mistakes" is granting it a consciousness it doesn't possess.

breathemein•2mo ago

I actually think the surprise is warranted. The model didn't just "get something wrong." It produced specific, confidently stated errors that look like factual knowledge but are actually side effects of pattern recognition. The whole point of the article is that this type of error is subtle enough that a non-expert might take it as truth (and, for a laugh, to point out just how silly some of those confidently stated answers are). Understanding the patterns and why the model behaves like this is useful so it can be improved. I don't think you can discredit the entire technology because it's not 100% perfect.

jqpabc123•2mo ago

> Understanding the patterns and why the model behaves like this is useful so it can be improved.

The answer to why is simple and obvious: the model is probabilistic. Improving it substantially would necessitate a total redesign.

> I don't think you can discredit the entire technology because it's not 100% perfect.

So dice rolling as a decision-making tool shouldn't be discredited, in your opinion?

Would you fly on a plane designed using similar "technology"? Anything better than double 3s says it's safe enough.

I expect the legal system to ultimately have a lot of input on the use of this. I don't expect "But AI said it was OK" is going to fly in court once people have been harmed. And once this starts happening, the enthusiasm for this "new technology" will start to wane, after $trillions have been wasted.
