Every provider has its own UI (or none), debugging embeddings is guesswork, and migrating data between systems is painful. I wanted a single tool where I could browse, search, visualize, and compare vector data across providers.
So I built Vector Inspector.
It currently supports:
- Chroma
- Qdrant (local + server)
- Postgres/pgvector
- Pinecone (partial support)
You can browse collections, inspect metadata, run searches, compare distances, visualize embeddings, and debug cases where a vector “should” match but doesn’t. The goal is to make it feel like a forensic tool for vector data — something that helps you understand what your embeddings are actually doing.
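To make the "should match but doesn't" case concrete, here is a small standalone sketch (plain numpy, not Vector Inspector code) of one common culprit: the query and the stored vectors disagree on normalization, so dot-product and Euclidean rankings diverge from cosine rankings.

```python
# Standalone illustration (not Vector Inspector code): how normalization
# and metric choice can make an "obvious" match rank poorly.
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

query = np.array([0.6, 0.8])           # unit-length query vector
doc_a = np.array([0.6, 0.8]) * 0.1     # same direction, but never normalized
doc_b = np.array([0.8, 0.6])           # different direction, unit length

# Under cosine, doc_a is the perfect match (1.0 vs 0.96)...
print("cosine:    a =", cosine(query, doc_a), " b =", cosine(query, doc_b))
# ...but under dot product or Euclidean distance, doc_b wins.
print("dot:       a =", float(query @ doc_a), " b =", float(query @ doc_b))
print("euclidean: a =", float(np.linalg.norm(query - doc_a)),
      " b =", float(np.linalg.norm(query - doc_b)))
```

That is exactly the kind of mismatch the tool is meant to make visible rather than leave you guessing about.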
There’s an OSS tier (Vector Inspector) and a more advanced version (Vector Studio) with upcoming features like clustering overlays, model-to-model comparison, and provenance coloring.
One of the biggest problems I kept hitting was missing provenance. You load a collection and you have no idea:
- what model produced these vectors
- whether they were normalized
- whether some vectors came from a different model entirely
- whether the source text was cleaned or chunked differently
Without that context, debugging is almost impossible. Vector Inspector tries to make provenance a first-class concept: if the metadata exists, it shows it; if it’s missing, it makes that visible too, so you can actually debug your embeddings instead of guessing.
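To show what I mean by first-class provenance, here is a minimal sketch of attaching it at write time, using Chroma's Python client. The metadata keys (embedding_model, normalized, chunk_size, source_hash) are an illustrative convention of mine, not a schema Vector Inspector requires.

```python
# Sketch: record provenance alongside each vector so it can be inspected later.
# The key names below are an illustrative convention, not a required schema.
import chromadb

client = chromadb.Client()  # in-memory Chroma, just for the example
collection = client.create_collection(name="docs")

embedding = [0.12, -0.03, 0.87]  # pretend this came from your embedding model

collection.add(
    ids=["doc-1#chunk-0"],
    embeddings=[embedding],
    documents=["First chunk of the source document..."],
    metadatas=[{
        "embedding_model": "text-embedding-3-small",  # which model produced this vector
        "embedding_dim": len(embedding),
        "normalized": True,           # was the vector L2-normalized before storage?
        "chunk_size": 512,            # how the source text was chunked
        "source_hash": "sha256:...",  # ties the vector back to the exact input text
    }],
)
```

With metadata like that in place, a collection where embedding_model differs between rows, or where normalized is simply absent, stops being a guessing game: the inspector can show you the gap directly.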
I’d love feedback from the HN crowd — especially around:
- workflows you’d want for multi-provider setups
- what’s missing for real debugging
- how you’d expect migrations to work
- any pain points you’ve hit with embeddings or vector DBs
- how you'd like creation workflows (embedding and re-embedding data) to be handled
Repo: https://github.com/anthonypdawson/vector-inspector
Landing page: https://vector-inspector.divinedevops.com
- Provider support is solid for Chroma, Qdrant, and Postgres/pgvector. Pinecone works for most read workflows but isn't at full parity yet.
- The tool is designed to be “forensic first”: surfacing metadata, provenance, and mismatches rather than hiding them behind abstractions.
- Visualization is intentionally minimal right now; clustering overlays and model-to-model comparison are in progress.
- I’m especially interested in how people think about creation workflows (re-embedding, mixed-model collections, reproducibility, etc.) since teams handle this very differently.
Just to set expectations: it’s basically been me running it so far. PyPI has been getting a lot of traffic, but real-world usage is still very small. I’m really curious how it behaves with other people’s data and workflows — that feedback is incredibly helpful at this stage.
If you hit anything confusing, missing, or surprising, I’d love to hear it. Real-world debugging stories are gold for shaping the next set of features.