The files exist publicly, but they're scattered across nested Google Drive folders in mixed formats: PDFs, images, scanned documents. Manually searching through 20,000+ files is impractical for most people.
This tool lets you query the corpus in natural language. Every answer includes clickable citations to the exact source page, so you can verify against the original. The goal is document discovery, not replacing human verification.
Technical approach: - OCR'd the entire corpus - Chunked and embedded for semantic search - RAG pipeline returns relevant passages with source links - Citations point directly to the House Oversight Committee's Google Drive
I built this because I think public document releases should be usable, not just technically available. Happy to answer questions about the approach.
N_Lens•9h ago
benbaessler•9h ago