I was a bit overwhelmed by the different ways you can process documents to create embeddings for RAG, so I built a tool to experiment with different OCR models, ways of refining the OCR results, chunking methods, and embedding models.
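To give a concrete idea of what one of those knobs looks like: one of the simplest chunking baselines is fixed-size chunks with overlap. A minimal sketch (illustrative only, not the actual implementation):

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character chunks, with each chunk
    overlapping the previous one so context isn't cut mid-thought."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
    return chunks
```

Even at this level there are trade-offs to experiment with (chunk size, overlap, character vs. token counting, sentence-aware splitting), which is exactly the kind of comparison the tool is for.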
You can:

- search processed documents in the playground
- evaluate retrieval results using an LLM-as-judge (not perfect, but it can be a useful signal)
- compare different datasets (using aggregate metrics or side-by-side comparison in the playground)
You can also manually inspect the results of each query and of each intermediate document-processing step.
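For anyone unfamiliar with LLM-as-judge: the idea is to hand the query and a retrieved chunk to a grading model with a rubric and use its score as a (noisy) relevance signal. A simplified sketch of the prompt side (the rubric and names here are my own illustration, not necessarily what the tool uses):

```python
def build_judge_prompt(query: str, retrieved_chunk: str) -> str:
    """Build a grading prompt asking an LLM to rate chunk relevance."""
    return (
        "You are grading a retrieval result for a RAG system.\n"
        f"Query: {query}\n"
        f"Retrieved chunk: {retrieved_chunk}\n"
        "On a scale of 1-5, how relevant is this chunk to the query? "
        "Answer with a single digit."
    )
```

The returned string would be sent to whatever chat model you use as the judge; constraining the answer to a single digit makes the score easy to parse, at the cost of losing the judge's reasoning.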
To get a better idea, check out one of the use cases: https://ragbandit.com/use-cases/optimizing-insurance-documen...
To be completely fair, I haven't added that many options for the different stages of the document processing pipeline yet! There are tons of features I'd like to add, but I've already spent quite a bit of time on this, so I'd really appreciate hearing whether this could be useful or interesting to you. Would you use something like this?
Tech stack: Postgres (with pgvector), FastAPI, [ragbandit-core](https://github.com/MartimChaves/ragbandit-core) (the document processing core is open source), TypeScript with React, and Celery for background tasks (with Redis as the broker).
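Since pgvector does the heavy lifting for retrieval, here's roughly what a top-k similarity query looks like. This is a sketch assuming a hypothetical `chunks` table with an `embedding` vector column; `<=>` is pgvector's cosine-distance operator:

```python
# Hypothetical schema: chunks(id, content, embedding vector(1536)).
# The query embedding and k are passed as parameters at execution time.
TOP_K_QUERY = """
SELECT id, content, embedding <=> %(query_embedding)s AS distance
FROM chunks
ORDER BY embedding <=> %(query_embedding)s
LIMIT %(k)s;
"""
# With a driver like psycopg, this would run as something like:
#   cur.execute(TOP_K_QUERY, {"query_embedding": vec, "k": 5})
```

Keeping chunks and embeddings in Postgres means the playground queries, the evaluation runs, and the dataset comparisons can all hit the same store.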
It's currently a credits-based subscription with optional top-ups. You can get 1000 credits to try it out (I ask for card info as a spam filter).
Thanks, Martim