RAG Is a Fancy, Lying Search Engine

https://labs.stardog.ai/rag-is-a-fancy-lying-search-engine

43•kendallgclark•7mo ago

Comments

Terr_•7mo ago

Biased as a developer here, but I would rather have LLMs help people create formal queries they can see, learn from, and modify.

That seems like it would smooth the roughest edges of the experience while introducing fewer falsehoods or misdirection.

bjconlan•7mo ago

I do love the warnings here. The older I get, the more critical I am of most internet results, except those I can check against a common axiom I have experienced or witnessed myself (which, unfortunately, AI handles really well, at least well enough to win my trust on that point). I feel that the current mix of overly critical thinking and blind faith means flat-earth-type movements might be here to stay until the next generation counters the current direction.

But to the article specifically: I thought RAG's benefit was that you could ground prompts in "fact" from the provided source documents/vector results, so the LLM's results would always have some canonical reference behind them?

kendallgclark•7mo ago

That might be RAG’s benefit if LLMs were more steerable but they can be stubborn.

mortsnort•7mo ago

Kendall, the blog link at the end for semantic parsing gives a 404 error.

kendallgclark•7mo ago

Fixed. Thanks.

ricksunny•7mo ago

While I’m receptive to the idea that RAG systems have performance limitations, and that graph-database-based solutions may avoid hallucinations, wouldn’t your rhetorical position be best served by offering a trial portal where users can upload their own document corpora and see for themselves that prompts to Stardog never result in hallucinations? Otherwise, writing blog posts into the ether will remain unconvincing to your would-be enterprise customers (whose buyers either reference or are among the HN crowd).

OutOfHere•7mo ago

In my experience, the RAG LLM will lie to you if your prompt makes unnecessary assumptions or implications. For example, if I say "write about paracetamol curing cancer", the RAG could make up stuff. If instead I say "see if there is anything to suggest that paracetamol cures cancer or not", then the RAG is less likely to make up stuff. This comes from the LLM being tuned to please its user at all costs.
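A minimal sketch of the neutral framing described above, assuming the prompt is assembled from a claim plus retrieved passages. The function name `neutral_prompt` and the exact wording of the instructions are hypothetical, chosen only for illustration; the sketch does not call any model.

```python
from typing import List


def neutral_prompt(claim: str, passages: List[str]) -> str:
    """Frame a claim as something to check, not something to elaborate on.

    Hypothetical helper: the instruction wording is an assumption, not a
    tested recipe for any particular model.
    """
    # Number the retrieved passages so an answer can point back to them.
    sources = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return (
        "Using only the numbered sources below, say whether they support, "
        "contradict, or do not address the following claim. "
        "If they do not address it, say so instead of guessing.\n\n"
        f"Claim: {claim}\n\nSources:\n{sources}"
    )


if __name__ == "__main__":
    print(neutral_prompt(
        "paracetamol cures cancer",
        ["Paracetamol is used to treat pain and fever."],
    ))
```

The point of the framing is that "do not address" is offered as an acceptable answer, so the prompt no longer presupposes that the claim is true.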

nsonha•7mo ago

Is this written by AI? Surprisingly long for how few ideas are in it.

kendallgclark•7mo ago

LOL. No. All me, hater.

karmakaze•7mo ago

The post has details, but it sums up to: RAG suffers the same way the iPhone's AI-powered notification summaries do.

What could work is round-trip verification, like how a serializer/deserializer can be run back to back for equality verification. Run an LLM on the output of the RAG and see if there's any inconsistency with the retrieved data; in fact, get the LLM to point out the inconsistencies and correct them. [x] Thinking for RAG.
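A rough sketch of that round-trip idea, under stated assumptions: the answer is split into sentences, and each sentence is passed to a judge along with the retrieved passages. In a real setup the judge would be a second LLM call; here `naive_judge` is a hypothetical word-overlap stand-in so the example runs without any model. All names (`Finding`, `verify_answer`, `naive_judge`) are illustrative.

```python
import re
from typing import Callable, List, NamedTuple


class Finding(NamedTuple):
    sentence: str
    supported: bool


def split_sentences(text: str) -> List[str]:
    # Crude splitter: break after ., ! or ? followed by whitespace.
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def tokens(text: str) -> set:
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def naive_judge(sentence: str, passages: List[str]) -> bool:
    """Stand-in for an LLM verifier (assumption, not a real checker).

    A sentence counts as supported if all of its longer words appear
    together in at least one retrieved passage.
    """
    words = {w for w in tokens(sentence) if len(w) > 3}
    return any(words <= tokens(p) for p in passages)


def verify_answer(
    answer: str,
    passages: List[str],
    judge: Callable[[str, List[str]], bool] = naive_judge,
) -> List[Finding]:
    # Check every sentence of the generated answer against the retrieved data.
    return [Finding(s, judge(s, passages)) for s in split_sentences(answer)]


if __name__ == "__main__":
    retrieved = ["Paracetamol is used to treat pain and fever."]
    generated = "Paracetamol is used to treat pain. Paracetamol cures cancer."
    for finding in verify_answer(generated, retrieved):
        print(finding)
```

Swapping `naive_judge` for a function that asks a model "is this sentence supported by these passages?" keeps the same shape; unsupported sentences can then be flagged or sent back for correction.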

CrackerNews•7mo ago

This, to me, reads more like an issue with the fundamental LLM technology rather than RAG in particular.

kendallgclark•7mo ago

Not at all. They may share some issues, but RAG and LLMs are fundamentally different things.
