This repo includes ground-truth datasets, queries, and relevance judgments for 3 hard domains:
• Financial documents (SEC filings with tables, charts, footnotes) • Medical device IFUs (diagrams, nested sections, regulatory language) • Educational videos (temporal alignment, code + lecture context)
Runs in ~1 second with demo data. Leaderboards + evaluator included. Contributions welcome.