LLMs Don't Hallucinate – They Drift

https://figshare.com/articles/conference_contribution/Measuring_Fidelity_Decay_A_Framework_for_Semantic_Drift_and_Collapse/30422107?file=58969378
16•knowledgeinfra•1h ago

Comments

knowledgeinfra•1h ago
This paper argues that the dominant metaphor for LLM failure, hallucinations, misdiagnoses the real problem. Language models do not primarily fail by inventing false facts, but by undergoing fidelity decay, the gradual erosion of meaning across recursive transformations. Even when outputs remain accurate and coherent, nuance, metaphor, intent, and contextual ground steadily degrade. The paper proposes a unified framework for measuring this collapse through four interrelated dynamics, lexical decay, semantic drift, ground erosion, and semantic noise, and sketches how each can be operationalized into concrete benchmarks. The central claim is that accuracy alone is an insufficient evaluation target. Without explicit fidelity metrics, AI systems risk becoming fluent yet hollow, technically correct while culturally and semantically impoverished.
petesergeant•1h ago
Please don’t post AI summaries here.
chrisjj•1h ago
> Language models do not primarily fail by inventing false facts, but by undergoing fidelity decay

This premise is unsound. We don't expect LLMs to deliver with fidelity, just as we don't expect parrots to speak with their owners' accents. So infidelity is by no means a failure.

zahrevsky•1h ago
> The contribution of this work lies in its move from critique to measurement. It proposes concrete methods: recursive summarization chains, metaphor stress-tests, resonance surveys, and noise-infused retrieval experiments. These allow researchers to track how meaning erodes over time. By integrating these methods, it outlines a pathway toward fidelity-centered benchmarks that complement existing accuracy metrics.

To me, starting to solve a problem by meticulously measuring it is a sign of a good solution.
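For what it's worth, the "recursive summarization chain" idea quoted above can be sketched in a few lines. This is only an illustration under loud assumptions, not the paper's method: `toy_summarize` is a hypothetical stand-in for an LLM call (it just truncates), and `lexical_retention` is a crude word-overlap score I made up to track how much of the original vocabulary survives each round.

```python
import re


def tokens(text):
    """Lowercased set of words; a crude stand-in for a real tokenizer."""
    return set(re.findall(r"[a-z']+", text.lower()))


def lexical_retention(original, derived):
    """Fraction of the original's distinct words that still appear in `derived`."""
    base = tokens(original)
    if not base:
        return 1.0
    return len(base & tokens(derived)) / len(base)


def toy_summarize(text):
    """Hypothetical placeholder for an LLM summarizer: keep the first 2/3 of the words."""
    words = text.split()
    keep = max(1, (len(words) * 2) // 3)
    return " ".join(words[:keep])


def decay_curve(original, summarize, rounds):
    """Apply `summarize` to its own output `rounds` times, scoring each round against the original."""
    curve = []
    current = original
    for _ in range(rounds):
        current = summarize(current)
        curve.append(lexical_retention(original, current))
    return curve


if __name__ == "__main__":
    source = "the quick brown fox jumps over the lazy dog near the quiet river bank"
    for i, score in enumerate(decay_curve(source, toy_summarize, 4), start=1):
        print(f"round {i}: retention {score:.2f}")
```

Swapping `toy_summarize` for a real model call and `lexical_retention` for an embedding-based similarity would be the obvious next step, but both of those are my assumptions rather than anything specified in the abstract.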

Retr0id•1h ago
What the heck is a resonance survey?
chrisjj•1h ago
An LLM fabrication.
chrisjj•1h ago
True title: Measuring Fidelity Decay: A Framework for Semantic Drift and Collapse
botacode•1h ago
Getting a 403 when I try to read. Anyone have a backup link?
Retr0id•1h ago
This is slop
sylware•1h ago
ofc not, they "bungee jump"

:p

m0llusk•47m ago
Hallucinations that have certain characteristics and boundaries are still hallucinations. This happens because learning models do pattern matching, so, to put it briefly, anything that fits may work and end up in the output.

Being able to admit the flaws and limitations of a technology is often critical to advancing adoption. Unfortunately, producers of currently popular learning-model-based technologies are more interested in speculation and growth and speculative growth than in genuinely robust operation. This paper is a symptom of a larger problem that is contributing to the bubble pop, downturn, or "AI winter" that we are collectively heading toward.

polotics•24m ago
This is so short and empty, sorry. The author would do well to ground their work in a modicum of empiricism; the puffed-up style makes it a bit hard to read. I do not know if this is slop. It's getting harder to guess, and some actual humans were writing like this long before LLMs. Still, what is the actual finding being presented here?

This paper has been cited more than 6k times. It's fatally flawed.

https://statmodeling.stat.columbia.edu/2026/01/22/aking/
240•timr•5h ago•100 comments

Deutsche Telekom is violating Net Neutrality

https://netzbremse.de/en/
366•tietjens•5h ago•189 comments

Show HN: Bonsplit – Tabs and splits for native macOS apps

https://bonsplit.alasdairmonk.com
48•sgottit•2h ago•6 comments

ANN v3: 200ms p99 query latency over 100B vectors

https://turbopuffer.com/blog/ann-v3
17•_peregrine_•3d ago•4 comments

Introduction to PostgreSQL Indexes

https://dlt.github.io/blog/posts/introduction-to-postgresql-indexes/
113•dlt•6h ago•4 comments

Show HN: TUI for managing XDG default applications

https://github.com/mitjafelicijan/xdgctl
30•mitjafelicijan•2h ago•11 comments

Bridging the Gap Between PLECS and SPICE

https://erickschulz.dev/posts/plecs-spice/
4•eschu•3h ago•0 comments

Jurassic Park - Tablet device on Nedry's desk? (2012)

https://www.therpf.com/forums/threads/jurassic-park-tablet-device-on-nedrys-desk.169883/
66•exvi•4h ago•23 comments

BirdyChat becomes first European chat app that is interoperable with WhatsApp

https://www.birdy.chat/blog/first-to-interoperate-with-whatsapp
662•joooscha•19h ago•408 comments

Adoption of EVs tied to real-world reductions in air pollution: study

https://keck.usc.edu/news/adoption-of-electric-vehicles-tied-to-real-world-reductions-in-air-poll...
454•hhs•14h ago•391 comments

Nango (YC W23, Dev Infrastructure) Is Hiring Remotely

https://jobs.ashbyhq.com/Nango
1•bastienbeurier•2h ago

Alarm overload is undermining safety at sea as crews face thousands of alerts

https://www.lr.org/en/knowledge/press-room/press-listing/press-release/2026/alarm-overload-is-und...
32•geox•1h ago•11 comments

A Lament for Aperture

https://ikennd.ac/blog/2026/01/old-man-yells-at-modern-software-design/
137•firloop•4d ago•29 comments

BU-808: How to Prolong Lithium-based Batteries (2023)

https://www.batteryuniversity.com/article/bu-808-how-to-prolong-lithium-based-batteries/
28•eswat•2d ago•5 comments

Show HN: AutoShorts – Local, GPU-accelerated AI video pipeline for creators

https://github.com/divyaprakash0426/autoshorts
48•divyaprakash•6h ago•17 comments

The Rebirth of Pennsylvania's Infamous Burning Town

https://www.atlasobscura.com/articles/centralia-pennsylvania-rebirth
19•pbshgthm•5d ago•1 comment

Hands-On with Two Apple Network Server Prototype ROMs

http://oldvcr.blogspot.com/2026/01/hands-on-with-two-apple-network-server.html
18•todsacerdoti•6h ago•0 comments

I built a 2x faster lexer, then discovered I/O was the real bottleneck

https://modulovalue.com/blog/syscall-overhead-tar-gz-io-performance/
59•modulovalue•4d ago•32 comments

David Patterson: Challenges and Research Directions for LLM Inference Hardware

https://arxiv.org/abs/2601.05047
86•transpute•11h ago•9 comments

Doom has been ported to an earbud

https://doombuds.com
5•arin-s•1h ago•2 comments

Intrinsically stretchable 2D MoS2 transistors

https://www.nature.com/articles/s41467-026-68504-2
16•bookofjoe•4d ago•0 comments

Two Weeks Until Tapeout

https://essenceia.github.io/projects/two_weeks_until_tapeout/
152•client4•12h ago•10 comments

Claude Code's new hidden feature: Swarms

https://twitter.com/NicerInPerson/status/2014989679796347375
453•AffableSpatula•23h ago•301 comments

Accept_language 2.2 – RFC 7231/4647 compliant Accept-Language parsing for Ruby

https://github.com/cyril/accept_language.rb
12•cyrilllllll•4h ago•0 comments

Article on the History of Spot Instances: Analyzing Spot Instance Pricing Change

https://spot.rackspace.com/blogs/history-of-spot-instances
5•aleroawani•4d ago•0 comments

Postmortem: Our first VLEO satellite mission (with imagery and flight data)

https://albedo.com/post/clarity-1-what-worked-and-where-we-go-next
191•topherhaddad•18h ago•60 comments

Google confirms 'high-friction' sideloading flow is coming to Android

https://www.androidauthority.com/google-sideloading-android-high-friction-process-3633468/
310•_____k•5d ago•285 comments

Putting Rocks on the Moon

https://ahwoo.com/posts/019bd882-d104-7347-be7b-8e0a5ce13cb5
14•epaga•4d ago•0 comments

Raspberry Pi Drag Race: Pi 1 to Pi 5 – Performance Comparison

https://the-diy-life.com/raspberry-pi-drag-race-pi-1-to-pi-5-performance-comparison/
197•verginer•20h ago•84 comments

Typography on Pencils (2023)

https://www.presentandcorrect.com/blogs/blog/typography-on-pencils-1-5
89•NaOH•4d ago•9 comments