
Ask HN: How would you architect a RAG system for 10M+ documents today?

23•Ftrea•2mo ago
I'm tasked with building a private AI assistant for a corpus of 10 million text documents (living in PostgreSQL). The goal is semantic search and chat, with a requirement for regular incremental updates.

I'm trying to decide between:

Bleeding edge: Implementing something like LightRAG or GraphRAG.

Proven stack: Standard Hybrid Search (Weaviate/Elastic + Reranking) orchestrated by tools like Dify.

For those who have built RAG at this scale:

What is your preferred stack for 2025?

Is the complexity of Graph/LightRAG worth it over standard chunking/retrieval for this volume?

How do you handle maintenance and updates efficiently?

Looking for architectural advice and war stories.
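
For what it's worth, a minimal sketch of what the "proven stack" option above usually looks like in code: dense and keyword retrieval fused with reciprocal rank fusion (RRF), then a reranker. The function names (vector_search, keyword_search, rerank) are placeholders for whichever engines end up being chosen (pgvector/Weaviate/Elastic, a cross-encoder, etc.), not anything prescribed in the question.

  # Minimal hybrid-retrieval skeleton: merge dense and keyword results with
  # reciprocal rank fusion (RRF), then rerank the fused list.
  def rrf_merge(ranked_lists, k=60):
      scores = {}
      for ranked in ranked_lists:
          for rank, doc_id in enumerate(ranked):
              scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
      return sorted(scores, key=scores.get, reverse=True)

  def retrieve(query, vector_search, keyword_search, rerank, top_k=20):
      dense = vector_search(query, limit=100)    # ranked doc ids from the vector index
      sparse = keyword_search(query, limit=100)  # ranked doc ids from BM25/tsvector
      fused = rrf_merge([dense, sparse])
      return rerank(query, fused[:100])[:top_k]  # cross-encoder or LLM reranker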

Comments

parentheses•2mo ago
If it's < 100M documents, with 1024-dimensional vectors, you could fit all of that in ~100G of memory. So maybe storing it in memory is an easy way to go about it. This ignores a lot of "database problems": if the docs are changing constantly, or you have other scalability concerns, you may be better off using a "proper" vector DB. There have been HN postings which indicate that vector DB choice matters. Do your research there.
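
Rough arithmetic behind that estimate, under my own assumptions (one 1024-dim vector per document): at float32 the 10M-doc corpus is about 41 GB, and the ~100G figure for 100M docs lines up with roughly one byte per dimension, i.e. int8-quantized vectors.

  # Back-of-the-envelope RAM estimate for keeping every embedding in memory.
  def embedding_ram_gb(num_docs, dims=1024, bytes_per_value=4):
      return num_docs * dims * bytes_per_value / 1e9

  print(embedding_ram_gb(10_000_000))                      # ~41 GB: 10M docs, fp32
  print(embedding_ram_gb(100_000_000))                     # ~410 GB: 100M docs, fp32
  print(embedding_ram_gb(100_000_000, bytes_per_value=1))  # ~102 GB: 100M docs, int8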
Ftrea•2mo ago
Agreed. Pure in-memory is too risky for us given the persistence requirements and monthly updates. We are definitely going with a 'proper' DB (likely Postgres+pgvector or Weaviate) to handle the state and updates reliably.
walpurginacht•2mo ago
Do you have an evaluation in place that actually necessitates the complex stuff? If not, I'd start simple with proven stuff and collect usage data to determine what's next.
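
One cheap way to get that evaluation in place before committing to anything: a small hand-labeled set of (query, relevant doc ids) pairs and a hit-rate@k score, run against each candidate retriever. The search() callable below is a stand-in with a made-up signature, not a real API.

  # Counts a query as a "hit" if any labeled-relevant doc shows up in the top k.
  def recall_at_k(labeled_queries, search, k=10):
      hits = 0
      for query, relevant_ids in labeled_queries:
          retrieved = {doc_id for doc_id, _ in search(query, k=k)}
          if retrieved & set(relevant_ids):
              hits += 1
      return hits / len(labeled_queries)

  # Run the same labeled set against the plain baseline and any fancier pipeline;
  # only add complexity if the numbers actually move.
  # baseline = recall_at_k(labeled, vector_search)
  # hybrid   = recall_at_k(labeled, hybrid_search)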
Ftrea•2mo ago
This is the sanity check we needed. We don't have a benchmark yet that necessitates complex graph architectures. We will stick to the proven stuff first: a solid Hybrid Search (Vector + Keyword) baseline. We'll collect usage data and only complicate the stack if the baseline fails on specific queries.
journal•2mo ago
Ranked hierarchical pagination and intermediate context control. Also: are these text documents already stored in the database, or raw text worth 10 million documents? If you OCR, why not cache the result? Lucene-style whitespace tokenization is pretty good for a dumb exact (or close-enough) match that gets you a filtered result set which fits the context window better. Imagine having to OCR and run the LLM on the fly, instantly; I would do everything to avoid architecting a system like that. Not sure you're pointing the right end of the stick at the right problem.

Are you intending to max out your allowed context? You can usually extract a rough set before you hit the LLM, so ideally you'd never exceed 50% of the context. How big do you expect responses to be? You have a lot of options; throw whatever is easy to implement at the problem first and see what sticks. Make sure you have terminal access wherever you run this, for maximum flexibility. I obviously prefer ASP.NET with psql.

What kind of data do you need indexed? Say you have something awkward like origin and destination locations: now you need a geo index, maybe a zipcode database, and an intermediate step to find assets within a radius, calculate some distances, and make a decision. Adding geo to any problem is a nightmare, but fun, but only the first time, because now you know how to do it and it takes so long you don't want to again.

If you have terminal access and the source, you have enough room to maneuver updates: it usually ends up being one command that rebuilds the solution and then slides it under the running app, and I've never had problems with that. As for database schema changes, push your production release out until schema changes have slowed to less than 5% of what they were, or something similarly extreme, but be aware that some schema changes are hard to implement even later, and once you're in production everything gets much harder.
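
A sketch of that "filter cheaply before the LLM" idea with plain Postgres full-text search. The table and column names (documents, body, body_tsv) and the character budget are my own placeholders, not anything from the thread.

  import psycopg2

  # Shrink 10M rows to a few hundred candidates with cheap full-text search
  # before any embedding or LLM call ever runs.
  user_query = "quarterly revenue recognition policy"  # example query
  conn = psycopg2.connect("dbname=corpus")
  cur = conn.cursor()
  cur.execute(
      """
      SELECT id, body
      FROM documents
      WHERE body_tsv @@ plainto_tsquery('english', %s)
      ORDER BY ts_rank(body_tsv, plainto_tsquery('english', %s)) DESC
      LIMIT 200
      """,
      (user_query, user_query),
  )
  candidates = cur.fetchall()

  # Stay well under the model's context window (the comment above suggests never
  # exceeding ~50% of it); stop adding documents once the budget is hit.
  MAX_CONTEXT_CHARS = 60_000  # rough budget; tune to the model actually used
  context, used = [], 0
  for doc_id, body in candidates:
      if used + len(body) > MAX_CONTEXT_CHARS:
          break
      context.append(body)
      used += len(body)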
Ftrea•2mo ago
Thanks for the tips. We are strictly doing offline processing (docs are already converted to Markdown stored in DB) to avoid any live OCR latency. Also 100% agreed on filtering—we plan to use metadata/keyword filters (Lucene style) to narrow down the search space before hitting the LLM context window. No intention to verify zipcodes though! :)
osigurdson•2mo ago
Are the documents individually large or fairly small, like a page or two each? If they are small docs, then since you already have Postgres you can just add the pgvector extension, determine which embeddings you want to use, and try it out without committing too much. Maybe add a hash column first so that you can avoid paying to compute the embeddings again if you decide to switch approaches. All of these systems are basically doing the same math to find things, so you aren't going to get magically better results from another one. If the docs are larger, then you have to do chunking anyway.

Would the 10M documents be searched with a single vector search, or would they be pre-filtered by other columns in your table first? If some prefiltering is happening, it naturally makes things faster. You will likely want regular text / tsvector-based search as well, and potentially feed the LLM with those results too, since vector search isn't perfect.

You would then decide whether to do re-ranking before handing the results to the final LLM context window. These days models are pretty good, so they will do their own re-ranking to some extent, but it depends a bit on the cost, latency and quality of result that you are looking for.
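
A rough sketch of the hash-column idea: only re-embed rows whose content has changed since the last run. Column names (content_md, content_hash, embedding) and embed() are assumptions for illustration, not the commenter's schema.

  import hashlib
  import psycopg2

  def refresh_embeddings(conn, embed, batch=1000):
      cur = conn.cursor(name="doc_scan")  # server-side cursor: don't pull 10M rows at once
      cur.execute("SELECT id, content_md, content_hash FROM documents")
      write = conn.cursor()
      while True:
          rows = cur.fetchmany(batch)
          if not rows:
              break
          for doc_id, text, stored_hash in rows:
              h = hashlib.sha256(text.encode("utf-8")).hexdigest()
              if h == stored_hash:
                  continue  # content unchanged since last run; skip the embedding cost
              vec = embed(text)  # any embedding model; returns a list of floats
              literal = "[" + ",".join(str(x) for x in vec) + "]"  # pgvector text format
              write.execute(
                  "UPDATE documents SET embedding = %s::vector, content_hash = %s WHERE id = %s",
                  (literal, h, doc_id),
              )
      conn.commit()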

Ftrea•2mo ago
This is extremely helpful. Our docs are indeed small (1-2 pages mostly), so distinct chunking might not even be needed—maybe one vector per doc or page. Since we are already on Postgres, pgvector + tsvector (for hybrid search) seems like the most logical MVP. Question: In your experience, does pgvector with HNSW indexes handle the 10M row scale with low latency (<200ms) for real-time chat? Or does a dedicated DB like Weaviate still offer a significant edge there?
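
For reference, the pgvector HNSW setup that question is about looks roughly like this. The index parameters shown are pgvector's documented defaults, written out explicitly; they are illustrative, not a tuned recommendation for 10M rows.

  import psycopg2

  conn = psycopg2.connect("dbname=corpus")
  cur = conn.cursor()
  cur.execute("CREATE EXTENSION IF NOT EXISTS vector")
  cur.execute(
      "CREATE INDEX IF NOT EXISTS documents_embedding_hnsw "
      "ON documents USING hnsw (embedding vector_cosine_ops) "
      "WITH (m = 16, ef_construction = 64)"
  )
  conn.commit()

  # Per-session recall/latency knob: higher ef_search = better recall, slower queries.
  cur.execute("SET hnsw.ef_search = 100")
  cur.execute(
      "SELECT id FROM documents ORDER BY embedding <=> %s::vector LIMIT 20",
      ("[0.1,0.2,0.3]",),  # placeholder; a real query embedding would be 1024-dim
  )
  top_ids = [row[0] for row in cur.fetchall()]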
mikert89•2mo ago
chunk the documents, use contextual embeddings, put into the vectordb in postgres
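
To spell out what that pipeline could look like: prepend document-level context to each chunk before embedding so chunks stay meaningful in isolation. The chunk sizes and the summarize()/embed() callables below are placeholders, not anything the commenter specified.

  # "Contextual embeddings" sketch: one vector per contextualized chunk.
  def chunk(text, size=1500, overlap=200):
      step = size - overlap
      return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

  def contextual_embeddings(doc_title, doc_text, summarize, embed):
      doc_context = summarize(doc_text)  # e.g. an LLM-written one-paragraph summary
      vectors = []
      for piece in chunk(doc_text):
          contextualized = f"Document: {doc_title}\nContext: {doc_context}\n\n{piece}"
          vectors.append(embed(contextualized))
      return vectors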