From zero to a RAG system: successes and failures

https://en.andros.dev/blog/aa31d744/from-zero-to-a-rag-system-successes-and-failures/

52•andros•2d ago

Comments

Horatius77•2d ago

Great writeup but ... pretty sure ChromaDB is open source and not "Google's database"?

threatofrain•1h ago

I'm afraid this hits the credibility of the article for me, that's a pretty weird mistake to make. It's like paying for a Model 3 while thinking it comes from Ford.

andros•43m ago

Thank you for your feedback!

nalinidash•1h ago

ChromaDB is open source with Apache-2.0 license.

https://github.com/chroma-core/chroma

z02d•1h ago

Maybe a bit off-topic: For my PhD, I wanted to leverage LLMs and AI to speed up the literature review process*. Due to time constraints, this never really took off for me. When I last checked (about 6 months ago), several tools that support literature review were already available (NotebookLM, Anara, Connected Papers, ZotAI, Litmaps, Consensus, Research Rabbit). They all have pros and cons (and different scopes), but my biggest requirement is that the tool works on my Zotero bibliographic collection (available offline as PDF/ePub).

ZotAI can use LMStudio (for embeddings and LLM models), but at that time, ZotAI was super slow and buggy.

Instead of going through the valley of sorrows (as andros shared in the blog post - thanks for that), is there a more or less out-of-the-box solution (paid or free) for this need (RAG for local literature review support)?

*If I am honest, it was rather a procrastination exercise, but this is for sure relatable for readers of HN :-D

bee_rider•45m ago

I tried to do RAG on my laptop just by setting it all up myself, but the actual LLM gave poor results (I have a small thin-and-light fwiw, so I could only run weak models). The vector search itself, actually, ended up being a little more useful.

mettamage•1h ago

51 visitors in real-time.

I love those site features!

In a submission of a few days ago there was something similar.

I love it when a website gives a hint to the old web :)

aledevv•52m ago

I made something similar in my project. The hardest part was choosing the right approach to chunking long documents. I used both structural and semantic chunking. The semantic approach helped produce better vectors to store in the vector DB. I used Qdrant and an OpenAI embedding model.
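For readers unfamiliar with the two chunking styles mentioned above, here is a minimal sketch of each. It is not the commenter's implementation: the function names, the `max_chars` and `threshold` parameters, and the bag-of-words `toy_embed` stand-in (used instead of a real embedding model so the example runs with only the standard library) are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def structural_chunks(text: str, max_chars: int = 200) -> list[str]:
    """Split on blank lines, then pack whole paragraphs into chunks.

    A paragraph longer than max_chars is kept whole as its own chunk.
    """
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    chunks: list[str] = []
    current = ""
    for para in paragraphs:
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            chunks.append(current)
            current = para
    if current:
        chunks.append(current)
    return chunks


def toy_embed(text: str) -> Counter:
    """Hypothetical stand-in for an embedding model: word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def semantic_chunks(sentences: list[str], threshold: float = 0.2) -> list[str]:
    """Start a new chunk whenever a sentence is dissimilar to the previous one."""
    groups: list[list[str]] = []
    for sentence in sentences:
        if groups and cosine(toy_embed(groups[-1][-1]), toy_embed(sentence)) >= threshold:
            groups[-1].append(sentence)
        else:
            groups.append([sentence])
    return [" ".join(group) for group in groups]
```

Structural chunking respects the document's own boundaries (here, paragraphs), while semantic chunking places boundaries where the topic appears to shift. With a real embedding model, the similarity threshold would need tuning against the actual corpus.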

JKCalhoun•32m ago

And some have been saying that RAGs are obsolete—that the context window of a modern LLM is adequate (preferable?). The example I recently read was that the contexts are large enough for the entire "The Lord of the Rings" books.

That may be, but then there's an entire law library, the entirety of Wikipedia (and the example in this article of 451 GB). Surely those are at least an order of magnitude larger than Tolkien's prose and might still benefit from a RAG.
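A rough back-of-the-envelope calculation illustrates the scale gap. The figures here are assumptions, not measurements: about 4 bytes of plain text per token, a 1,000,000-token context window, and the simplification that all 451 GB is plain text (the real corpus may contain other file types).

```python
def context_windows_needed(
    corpus_bytes: int,
    bytes_per_token: int = 4,          # assumption: rough average for English text
    context_tokens: int = 1_000_000,   # assumption: a large modern context window
) -> int:
    """Return how many full context windows the corpus would fill (rounded up)."""
    corpus_tokens = corpus_bytes // bytes_per_token
    return -(-corpus_tokens // context_tokens)  # ceiling division


if __name__ == "__main__":
    print(context_windows_needed(451 * 10**9))
```

Under these assumptions the corpus is several orders of magnitude larger than a single context window, so some form of retrieval is still needed to select what goes into the prompt.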

mentos•27m ago

I assume it’s not possible to get the same results by fine-tuning a model on the documents instead?

notglossy•3m ago

You will still get hallucinations. With RAG you use the vectors to aid in finding things that are relevant, and then you typically also have the raw text data stored as well. This allows you to theoretically have LLM outputs grounded in the truth of the documents. Depending on implementation, you can also make the LLM cite the sources (filename, chunk, etc).
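The retrieve-then-cite flow described above can be sketched as follows. This is a toy, in-memory illustration rather than any particular library's API: `Chunk`, `retrieve`, `build_prompt`, and the word-count `toy_embed` (a stand-in for a real embedding model) are hypothetical names, and the call to the LLM itself is left out.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Chunk:
    filename: str  # source document, kept so the answer can cite it
    index: int     # position of the chunk within that document
    text: str      # raw text stored alongside the vector


def toy_embed(text: str) -> Counter:
    """Hypothetical stand-in for an embedding model: word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: list[Chunk], k: int = 2) -> list[Chunk]:
    """Rank chunks by similarity to the query and return the top k."""
    q = toy_embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, toy_embed(c.text)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, hits: list[Chunk]) -> str:
    """Put the raw text of each hit in the prompt, tagged with its source."""
    sources = "\n".join(f"[{c.filename}#{c.index}] {c.text}" for c in hits)
    return (
        "Answer using only the sources below and cite them by their bracketed tag.\n\n"
        f"{sources}\n\nQuestion: {query}"
    )


if __name__ == "__main__":
    corpus = [
        Chunk("chroma.txt", 0, "Chroma is an open source vector database."),
        Chunk("music.txt", 0, "The band sold more T-shirts than records."),
    ]
    question = "Is Chroma open source?"
    print(build_prompt(question, retrieve(question, corpus, k=1)))
```

Because each chunk carries its filename and index, the model can be instructed to cite the tag next to each claim, and a reader can check the answer against the stored raw text.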
