Show HN: Antfly: Distributed, Multimodal Search and Memory and Graphs in Go

51•kingcauchy•3h ago

Hey HN, I’m excited to share Antfly: a distributed document database and search engine written in Go that combines full-text, vector, and graph search. Use it for distributed multimodal search and memory, or for local dev and small deployments.

I built this to give developers a single-binary deployment with native ML inference (via a built-in service called Termite), meaning you don't need external API calls for vector search unless you want to use them.

Some things that might interest this crowd:

Capabilities: Multimodal indexing (images, audio, video), MongoDB-style in-place updates, and streaming RAG.

Distributed Systems: Multi-Raft setup built on etcd's library, backed by Pebble (CockroachDB's storage engine). Metadata and data shards get their own Raft groups.

Single Binary: antfly swarm gives you a single-process deployment with everything running. Good for local dev and small deployments. Scale out by adding nodes when you need to.

Ecosystem: Ships with a Kubernetes operator and an MCP server for LLM tool use.

Native ML inference: Antfly ships with Termite. Think of it like a built-in Ollama for non-generative models too (embeddings, reranking, chunking, text generation). No external API calls needed, but also supports them (OpenAI, Ollama, Bedrock, Gemini, etc.)

License: I went with Elastic License v2, not an OSI-approved license. I know that's a topic with strong feelings here. The practical upshot: you can use it, modify it, self-host it, build products on top of it, you just can't offer Antfly itself as a managed service. Felt like the right tradeoff for sustainability while still making the source available.

Happy to answer questions about the architecture, the Raft implementation, or anything else. Feedback welcome!

Comments

thefogman•2h ago

Interesting project.

I’ve got a project right now, separate vector DB, Elasticsearch, graph store, all for an agent system.

When you say Antfly combines all three, what does that actually look like at query time? Can I write one query that does semantic similarity + full-text + graph traversal together, or is it more like three separate indexes that happen to live in the same binary?

Does it ship with a CLI that's actually good? I’m pivoting away from MCP. Like can I pipe stuff in, run queries, manage indexes from the terminal without needing to write a client? That matters more to me than the MCP server honestly.

And re: Termite + single binary, is the idea that I can just run `antfly swarm`, throw docs and images at it, and have a working local RAG setup with no API keys? If so, that might save me a lot of docker-compose work.

Who's actually running this distributed vs. single-node? Curious what the typical user experience looks like.

kingcauchy•2h ago

Thanks for the awesome questions!!

Exactly the use case I built it for! I wanted a world where you could build your indexes and the query planner could just be smart enough to use them in a single query. I've not quite nailed down the agentic query planner side 100% (it's getting there), but the JSON query DSL allows you to pipeline, join, fuse all the full-text, semantic, graph, reranking, pruning (score/token pruning) all in one query.

The CLI is my primary development tool with antfly, and I am definitely looking for feedback on what people would like to see there. It's a little chonky with the flags: --pruner, e.g., requires writing the JSON for the config because I didn't want users to have to memorize 1000 subflags. It's definitely a first class citizen.

With respect to "Termite + single binary", that's exactly right. Termite handles chunking, multimodal chunking, embeddings (sparse + dense), reranking, and fused chunking/embedding models, and we're excitedly getting more support for a variety of ONNX-based LLM/NER models to help with data extraction use cases (functiongemma/gliner2/etc.) so you don't have to set up 10 different services for testing vs deployment.

We run Antfly ourselves for our https://platform.searchaf.com (cheeky search AntFly) Algolia style search product in a distributed setup, and some users run Antfly in single node with large instances (more at the Postgres size datasets with millions of documents vs. large multitenant deploys). But we really wanted to build something with a more seamless experience of going back and forth between a distributed vs single node instance than Elasticsearch or Postgres can offer.

Hope that helps! Let me know if I can help you with anything!

wiresurfer•1h ago

A quick note on platform.searchaf.com: the account creation process hits a snag, with the verify-email links received by email giving a 404. Hope it helps.

On a parallel note, It would be nice to put an architecture diagram in the github repo. Are there particular aspects of the current implementation which you want to actively improve/rearchitect/change?

I agree with the goals set out for the project and can testify that Elasticsearch's DX is pretty annoying. Having said that, distributed indexing with pluggable ingestion/query custom indexes may be a good goal to aim for:
- Finite State Transducer (FST) or finite state automata based memory-efficient indexes for specific data mimetypes
- adding hashing-based semantic search indexes.

And even changing the indexer/reranker implementation would help make things super hackable.

kingcauchy•1h ago

Oh thanks for the 404 on the verify link (I abstracted out the auth OIDC for cross domain login and must have missed a path).

Yes good call, I tried to start that on the website with a react-flows based architectural flow chart a little bit but it's a bit high level, and not consumable directly in github markdown files but I'll work on that!

That's exactly the direction I've been working on: the reranking, embedders and chunkers are all pluggable, and the schema design (using jsonschema for our "schema-ish" approach) allows for fine-grained index backend hints for individual data types etc. I'll work on getting a good architecture doc up today and tomorrow!

jnstrdm05•2h ago

This looks sick!

Did you build this for yourself?

kingcauchy•2h ago

I built this for myself because I hated running a large ElasticSearch instance at work and wanted something that would autoscale and something that allowed for reindexing data. I also had a lot of experience running a large BigTable/Elasticsearch custom graph database I thought could be unified into a single database to cut costs. Started adding an embedding index for fun based on some Google papers and now here we are!

perfmode•2h ago

what google papers?

kingcauchy•2h ago

Not strictly google but microsoft/bing too, here's the top ones from my notes:

https://arxiv.org/abs/2410.14452 spfresh, https://arxiv.org/abs/2111.08566 spann, https://arxiv.org/abs/2405.12497 rabitq, https://arxiv.org/abs/2509.06046 diskann

I have a variety of blogs that I used too and reference implementations!

It's a Rabit[Q]uantized Hierarchical Balanced Clustering algorithm we use for the vector index, and we use a chunked segment index for the sparse index if you're curious! Happy to discuss more!

perfmode•2h ago

Curious if you’re using any SIMD optimizations for numerical calculations.

kingcauchy•1h ago

Yes we do use SIMD heavily! https://github.com/ajroetker/go-highway I also added SME support for Darwin for most algorithms. We use it in the full-text index, all over the vector indexes and heavily for the ml inference we do in go especially.

rigorclaw•2h ago

what's the typical migration path look like for teams coming off elasticsearch? full reindex or can you do it incrementally?

kingcauchy•1h ago

Definitely open to working with you on supporting even better tooling for this as I imagine many different "styles" of migration will be necessary.

The number 1 supported migration path for users, though, is one of my personal favorite features of antfly: the linear merge API, which allows you to incrementally reconcile an external pageable datasource with antfly at the pace you want while also getting the benefit of batching! We support index templates just like ES, plus the ability to change your schema while antfly manages the full-text reindex for you. If you're looking at migrating your embeddings in Elastic or another vector DB we can also support that! Let us know :)

epsniff•54m ago

Yeah, that is a pretty sweet feature. So you can keep two databases in-sync while you're doing your migration until you finish the cut over.
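To make the "incrementally reconcile a pageable datasource" idea above concrete, here is a minimal sketch of one page of such a reconciliation. It is not Antfly's linear merge API: the `Doc`, `Plan`, and `MergePage` names are hypothetical, and it assumes both the source page and the target's documents for the same key range arrive sorted by key, with a content hash available for change detection.

```go
package main

import "fmt"

// Doc is a hypothetical record: a key plus a fingerprint of its content.
type Doc struct {
	Key  string
	Hash string
}

// Plan lists the keys to write to and remove from the target.
type Plan struct {
	Upserts []string
	Deletes []string
}

// MergePage walks one sorted source page and the sorted target documents
// for the same key range in a single pass (a merge join).
func MergePage(source, target []Doc) Plan {
	var p Plan
	i, j := 0, 0
	for i < len(source) && j < len(target) {
		s, t := source[i], target[j]
		switch {
		case s.Key < t.Key: // only in source: new document
			p.Upserts = append(p.Upserts, s.Key)
			i++
		case s.Key > t.Key: // only in target: stale document
			p.Deletes = append(p.Deletes, t.Key)
			j++
		default: // in both: rewrite only if the content changed
			if s.Hash != t.Hash {
				p.Upserts = append(p.Upserts, s.Key)
			}
			i++
			j++
		}
	}
	for ; i < len(source); i++ {
		p.Upserts = append(p.Upserts, source[i].Key)
	}
	for ; j < len(target); j++ {
		p.Deletes = append(p.Deletes, target[j].Key)
	}
	return p
}

func main() {
	source := []Doc{{"a", "h1"}, {"b", "h2-new"}, {"d", "h4"}}
	target := []Doc{{"a", "h1"}, {"b", "h2"}, {"c", "h3"}}
	plan := MergePage(source, target)
	fmt.Println("upserts:", plan.Upserts)
	fmt.Println("deletes:", plan.Deletes)
}
```

Because each page is handled independently, a caller could run this at whatever pace suits the migration and batch the resulting writes, which is the property the comment above highlights.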

didip•1h ago

in the query_test.go, I don’t see how the hybrid search is being exercised.

For fun I am making hybrid search too and would love to see how you merge the two lists (semantic and keyword) and rerank the importance score.

kingcauchy•1h ago

There are some examples in the quickstart on the website, but I'll add an explicit e2e example case for that too. Otherwise the tests for that are a little lower level in the code! I'll add the RSF (merging of the two lists) example for that too!! Thanks for the feedback.

kingcauchy•55m ago

I've added a specific example for that using the go-sdk https://github.com/antflydb/antfly/pull/5 here!
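For readers following the list-merging question, here is a minimal sketch of one widely used fusion method, reciprocal rank fusion (RRF). This is a generic illustration, not Antfly's implementation, and RRF may differ from the "RSF" mentioned above. The `Fused` type and `FuseRRF` function are hypothetical names, the constant 60 is a commonly used default rather than anything from this thread, and each input list is assumed to contain unique IDs.

```go
package main

import (
	"fmt"
	"sort"
)

// Fused is one document ID with its combined score.
type Fused struct {
	ID    string
	Score float64
}

// FuseRRF merges ranked ID lists with reciprocal rank fusion:
// score(d) = sum over lists of 1 / (k + rank), with rank starting at 1.
// Only ranks are used, so the raw keyword and vector scores never need
// to be put on a common scale.
func FuseRRF(k float64, lists ...[]string) []Fused {
	scores := make(map[string]float64)
	for _, list := range lists {
		for i, id := range list {
			scores[id] += 1.0 / (k + float64(i+1))
		}
	}
	out := make([]Fused, 0, len(scores))
	for id, s := range scores {
		out = append(out, Fused{ID: id, Score: s})
	}
	// Highest score first; break ties by ID so the output is deterministic.
	sort.Slice(out, func(i, j int) bool {
		if out[i].Score != out[j].Score {
			return out[i].Score > out[j].Score
		}
		return out[i].ID < out[j].ID
	})
	return out
}

func main() {
	keyword := []string{"doc1", "doc2", "doc3"}
	semantic := []string{"doc2", "doc4", "doc1"}
	for _, f := range FuseRRF(60, keyword, semantic) {
		fmt.Printf("%s %.4f\n", f.ID, f.Score)
	}
}
```

With these inputs, documents that appear in both lists (doc2, doc1) rank ahead of those that appear in only one (doc4, doc3). A reranker could then be applied to the top of the fused list.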

Linell•1h ago

This is very interesting! I noticed that your TypeScript SDK link results in a 404: https://antfly.io/docs/sdks -> https://github.com/antflydb/antfly-ts

kingcauchy•1h ago

Thanks! Fixed that up!

SkyPuncher•1h ago

Can you help me understand what type of practical features Graph Traversal unlocks?

I've seen it on a few products and it doesn't click with me how people are using it.

kingcauchy•1h ago

I can't speak for everyone; knowledge graphs are the "new hotness" of the AI space (RAG and MCP are seeing a lull in their hype cycles I guess). But I've used graphs professionally for a long time to connect relationships that SQL normal forms have trouble expressing non-recursively. E.g. I used graphs to define identity relationships between data sources hierarchically, and then had another graph relationship on top of that to define connections between those identities, users at one level and organizations at the next. Graphs as indexes allow you to express arbitrary relationships between data to allow for more efficient lookups by a database. Some folks use them to express conceptual relationships between data for AI now. So if I have a bunch of images stored in Google Drive, I might want to abstract the concept of pets, and pets have a relationship with a human, etc. Then my database query for looking up all pictures related to the dog-pets owned by some human becomes a tractable search instead of a scan of the corpus!
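The pets example above can be sketched as a tiny in-memory graph index. This is only an illustration of why following edges beats scanning the corpus; it says nothing about how Antfly stores or queries graphs. The `Graph` type, the relation names `owns` and `appears_in`, and all node IDs are hypothetical.

```go
package main

import "fmt"

// Graph is a toy adjacency index: node -> relation -> neighbor IDs.
type Graph struct {
	edges map[string]map[string][]string
	kind  map[string]string // node ID -> label such as "dog" or "image"
}

func NewGraph() *Graph {
	return &Graph{
		edges: make(map[string]map[string][]string),
		kind:  make(map[string]string),
	}
}

func (g *Graph) AddNode(id, kind string) { g.kind[id] = kind }

func (g *Graph) AddEdge(from, rel, to string) {
	if g.edges[from] == nil {
		g.edges[from] = make(map[string][]string)
	}
	g.edges[from][rel] = append(g.edges[from][rel], to)
}

// Out returns the neighbors of a node along one relation (nil if none).
func (g *Graph) Out(from, rel string) []string { return g.edges[from][rel] }

// PicturesOfDogs follows human -owns-> pet -appears_in-> image and keeps
// only pets labeled "dog". It touches only edges reachable from the
// human, never the whole image collection.
func PicturesOfDogs(g *Graph, human string) []string {
	var pics []string
	for _, pet := range g.Out(human, "owns") {
		if g.kind[pet] != "dog" {
			continue
		}
		pics = append(pics, g.Out(pet, "appears_in")...)
	}
	return pics
}

func main() {
	g := NewGraph()
	g.AddNode("rex", "dog")
	g.AddNode("tom", "cat")
	g.AddEdge("alice", "owns", "rex")
	g.AddEdge("alice", "owns", "tom")
	g.AddEdge("rex", "appears_in", "img1")
	g.AddEdge("tom", "appears_in", "img2")
	g.AddEdge("rex", "appears_in", "img3")
	fmt.Println(PicturesOfDogs(g, "alice"))
}
```

The cost of the lookup depends on how many pets and pictures hang off one person, not on the size of the image corpus, which is the "tractable search instead of a scan" point made above.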

epsniff•48m ago

The one area where I keep seeing knowledge graphs come up is Product Knowledge Graphs (PKGs), which are a centralized, semantic, and highly interconnected data structure that brings together information about products, customers, and their interactions into a single, comprehensive "360-degree" view. Basically, it's the idea of combing through all the data (CRMs, codebases, ticketing system, churn management system, sales calls, ...) that the company has digitally about their customers, and building one giant knowledge graph that they can use for a bunch of business intelligence use cases, or to decide which new features to build. Then you slap an answer bar or semantic search on top of it, and you have a powerful way of getting insights or doing gap analysis on your product versus your customer needs.

Anyway, that's just one example of why you might want to use a knowledge graph. I'm sure there are literally hundreds more examples.

mrprincerawat•53m ago

Was thinking to create something similar, well done!
