Show HN: A CLI to query the unsealed court files with local LLMs

https://github.com/simulationship/epstein-search

2•simulationship•1h ago

To succeed on Hacker News (HN), you have to completely drop the "marketing" and "YouTube hook" tone. The HN community heavily downvotes clickbait, sensationalism, and marketing fluff. They love "Show HN" posts, open-source projects, CLI tools, local LLMs, and clever technical solutions to messy data problems (like parsing poorly scanned government PDFs). Here are the best titles and the exact description (to use either as a text post or your first comment) tailored specifically for the Hacker News audience.

The Hacker News Titles

Choose one of these. On HN, titles should be strictly factual, descriptive, and avoid emojis.

Option 1 (The Classic HN Format - Recommended): Show HN: epstein-search – A CLI to query the unsealed court files with local LLMs

Option 2 (Focus on the tech pipeline): Show HN: I built a local RAG CLI to make the Epstein PDFs searchable

Option 3 (Straight to the point): Show HN: epstein-search – Query the Epstein document dumps offline via CLI

The Hacker News Description (First Comment or Text Body)

If you submit the GitHub URL directly, immediately post this as the first comment. If you submit a text post, put this in the body. Keep the tone humble, technical, and open to feedback.

Hi HN,

When the Epstein court documents and flight logs were unsealed, they were released the way most legal drops are: thousands of pages of messy, poorly scanned, unsearchable PDFs. Standard Ctrl+F doesn't work well due to OCR errors, and the sheer volume makes manual parsing a nightmare.

To solve this, I built epstein-search, an open-source Python CLI tool that lets you search and synthesize the documents using a Retrieval-Augmented Generation (RAG) pipeline directly in your terminal.

How it works:

It parses and chunks the original unsealed PDF files.

You can run queries against the dataset using API-based models (OpenAI/Anthropic) if you want speed.

Privacy-first: If you don't want your queries logged by a third-party API, you can point it directly to a local model (via Ollama or Llama.cpp) to run the entire search and retrieval process 100% offline.

The goal was to make this data accessible to researchers and OSINT investigators without requiring them to manually read thousands of pages of court dockets or hand over their search queries to OpenAI.

Repo is here: https://github.com/simulationship/epstein-search
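
For readers unfamiliar with the "chunk, retrieve, then prompt" flow described under "How it works", here is a minimal sketch of the general idea. It is not taken from the epstein-search repo: the function names (chunk_text, rank_chunks, build_prompt) are hypothetical, and it assumes the PDFs have already been converted to plain text. It scores chunks with a simple TF-IDF-style keyword weight from the standard library rather than the embeddings a real RAG pipeline would likely use. The final prompt would then be sent to whichever model you chose (an API or a local one); that call is left out.

```python
import math
import re
from collections import Counter


def chunk_text(text, size=40, overlap=10):
    """Split text into windows of `size` words that overlap by `overlap` words."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def tokenize(text):
    """Lowercase the text and keep alphanumeric runs only."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank_chunks(query, chunks, top_k=3):
    """Return up to top_k chunks, ordered by a TF-IDF-style match with the query."""
    counts_per_chunk = [Counter(tokenize(c)) for c in chunks]
    n = len(chunks)
    doc_freq = Counter()
    for counts in counts_per_chunk:
        doc_freq.update(counts.keys())

    query_terms = set(tokenize(query))
    scored = []
    for idx, counts in enumerate(counts_per_chunk):
        score = 0.0
        for term in query_terms:
            if term in counts:
                # Rare terms get a higher weight than common ones.
                idf = math.log((1 + n) / (1 + doc_freq[term])) + 1.0
                score += counts[term] * idf
        scored.append((score, idx))

    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [chunks[idx] for score, idx in scored[:top_k] if score > 0]


def build_prompt(query, context_chunks):
    """Assemble the text that would be handed to a model (API or local)."""
    context = "\n---\n".join(context_chunks)
    return (
        "Answer using only the excerpts below.\n\n"
        f"Excerpts:\n{context}\n\n"
        f"Question: {query}\n"
    )


if __name__ == "__main__":
    # Placeholder text standing in for OCR output; not real document content.
    sample = "page one of a scanned filing " * 20 + "the flight log lists dates"
    pieces = chunk_text(sample, size=30, overlap=5)
    hits = rank_chunks("flight log", pieces, top_k=2)
    print(build_prompt("What does the flight log list?", hits))
```

The overlap between chunks is there so that a sentence split across a boundary still appears whole in at least one chunk. Exact keyword matching like this will still miss OCR-mangled words, which is one reason a real pipeline would probably use embeddings or fuzzy matching instead.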
