Show HN: RAG-powered search tool for 20k+ Epstein files

https://epfiles.ai
1•benbaessler•10h ago
I built epfiles.ai to make the U.S. House Oversight Epstein document release actually searchable.

The files exist publicly, but they're scattered across nested Google Drive folders in mixed formats: PDFs, images, scanned documents. Manually searching through 20,000+ files is impractical for most people.

This tool lets you query the corpus in natural language. Every answer includes clickable citations to the exact source page, so you can verify against the original. The goal is document discovery, not replacing human verification.

Technical approach:
- OCR'd the entire corpus
- Chunked and embedded for semantic search
- RAG pipeline returns relevant passages with source links
- Citations point directly to the House Oversight Committee's Google Drive
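For readers who want a feel for the chunk, embed, and retrieve steps listed above, here is a minimal sketch. It is not the code behind epfiles.ai: the `Chunk` type, the function names, the chunk sizes, and the example.org URLs are all hypothetical, and a simple term-frequency vector stands in for a real embedding model so the example runs with only the standard library.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    source_url: str  # hypothetical link back to the original file
    page: int        # page number the passage came from


def chunk_page(text, source_url, page, size=40, overlap=10):
    """Split one OCR'd page into overlapping windows of `size` words."""
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, max(len(words), 1), step):
        window = words[start:start + size]
        if window:
            chunks.append(Chunk(" ".join(window), source_url, page))
        if start + size >= len(words):
            break  # the last window already reached the end of the page
    return chunks


def embed(text):
    """Toy stand-in for an embedding model: a term-frequency vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query, chunks, top_k=3):
    """Return up to `top_k` (score, chunk) pairs, best first, dropping zero scores."""
    q = embed(query)
    scored = [(cosine(q, embed(c.text)), c) for c in chunks]
    scored = [(s, c) for s, c in scored if s > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    # Made-up placeholder pages, not text from the actual release.
    pages = [
        ("https://example.org/doc-a", 1, "flight log with passenger names and schedule"),
        ("https://example.org/doc-b", 4, "invoice for office furniture and supplies"),
    ]
    chunks = [c for url, page, text in pages for c in chunk_page(text, url, page)]
    for score, chunk in search("flight schedule", chunks):
        print(f"{score:.2f}  {chunk.source_url} (page {chunk.page}): {chunk.text}")
```

Each hit carries its `source_url` and `page`, which is the part that makes citations possible: the answer can always point back to the passage it came from. A production pipeline would presumably swap `embed` for a learned embedding model and store the vectors in an index instead of re-embedding on every query.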

I built this because I think public document releases should be usable, not just technically available. Happy to answer questions about the approach.

Demo: https://youtube.com/watch?v=7sQgRvwK3LE

Comments

N_Lens•9h ago
I'd love to pose some quick questions (if I had more time) that collate the relevant data from the files, such as content linking DJT or any other high-profile individual.
benbaessler•9h ago
Yes, that should definitely be doable. Let me know if you need help with anything.
benbaessler•9h ago
Here's what it does:
- Natural-language search over the full corpus
- Results are document discovery, not “trust me” summaries
- Every result includes clickable citations that jump to the exact source page in the committee’s Google Drive so you can verify context quickly

Some useful test queries:
- “Find documents mentioning [person/org] in connection with flights / schedules / contacts”
- “Show mentions of ‘massage’, ‘modeling’, ‘recruiting’, ‘Palm Beach’, ‘New York’, ‘Little St. James’”
- “What documents reference [date range] and [location]”

Known limitations:
- OCR noise on low-quality scans
- Names/aliases can be inconsistent; citations are the ground truth
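To illustrate why OCR noise and inconsistent spellings matter for search, here is a small sketch of tolerant matching using Python's standard `difflib`. It is only an illustration, not how the tool handles the problem; the function names, the 0.8 threshold, and the sample sentence are assumptions made up for this example.

```python
import difflib
import re


def normalize(name):
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", name.lower())
    return " ".join(cleaned.split())


def fuzzy_mentions(query_name, text, threshold=0.8):
    """Return (window, similarity) pairs for word windows that roughly match."""
    target = normalize(query_name)
    n = len(target.split())
    words = normalize(text).split()
    hits = []
    # Slide a window with as many words as the query over the text.
    for i in range(len(words) - n + 1):
        window = " ".join(words[i:i + n])
        ratio = difflib.SequenceMatcher(None, target, window).ratio()
        if ratio >= threshold:
            hits.append((window, round(ratio, 2)))
    return hits


if __name__ == "__main__":
    # "Pa1m" imitates a common OCR confusion between the letter l and the digit 1.
    print(fuzzy_mentions("Palm Beach", "Arrived at Pa1m Beach on Tuesday"))
```

A lower threshold catches more damaged spellings but also more false positives, which is one more reason to treat the cited source page as the ground truth.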
