It sounds like they're really targeting the logging search store part of ELK, which can be a perfectly fine objective, but there's no need to mislead audiences, since they will find out and then you've made an enemy.
Very misleading title
A few points that came up in the thread and are worth clarifying:
- We do get compared to Elasticsearch a lot. While we support some of its APIs, Manticore isn't a drop-in replacement. We've focused on performance, simplicity, and keeping things open-source without vendor lock-in. Our own SQL-based query language and REST endpoints are part of that philosophy.
- @mdaniel was right to question the "drop-in" wording; that's not our goal.
- As @sandstrom pointed out, tools like Typesense and Meilisearch are part of this evolving search space. We see Manticore fitting in where users want powerful full-text and vector search capabilities with lower resource overhead and SQL capabilities (we support JSON too, though).
We'd love to hear from you:
- What are your main use cases for search or log indexing?
- Which Elasticsearch features (if any) are absolutely essential for you?
- Are there performance comparisons or scaling challenges you'd like to see addressed?
Happy to answer any questions or dive deeper.
Could you ELI5 the query language and TF-IDF?
(Being lazy, I am happy to look into this myself lol.)
Manticore Search's query language is more expressive than Lucene's. While Lucene supports basic boolean logic, field search, phrase queries, wildcards, and proximity, Manticore adds many powerful operators. These include quorum matching (/N), strict word order (<<), NEAR/NOTNEAR proximity, MAYBE (soft OR), exact form (=word), position anchors (^word, word$), full regular expressions, and more. Manticore uses SQL-style syntax with a MATCH() clause for full-text search, making it easier to combine text search with filters and ranking.
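To make the operators above a bit more concrete, here is a rough Python sketch that only builds query strings; it does not talk to a server. The table name `products`, the sample words, and the helper `build_select` are made up for illustration, and the backslash escaping of single quotes is an assumption that should be checked against the documentation before real use.

```python
# One illustrative full-text expression per operator mentioned above.
# The words are arbitrary examples, not taken from any real dataset.
OPERATOR_EXAMPLES = {
    "quorum": '"lumix camera lens kit"/2',   # at least 2 of the 4 words
    "strict_order": "sony << lens",          # "sony" must come before "lens"
    "near": "camera NEAR/3 lens",            # within 3 words, any order
    "notnear": "camera NOTNEAR/3 broken",    # "broken" not within 3 words
    "maybe": "camera MAYBE tripod",          # "tripod" is optional
    "exact_form": "=lenses",                 # no morphology applied
    "field_start": "^lumix",                 # anchored at the field start
    "field_end": "kit$",                     # anchored at the field end
    "regex": "REGEX(/rx[0-9]+/)",            # regular expression term
}


def build_select(table, expression, extra_filter=None, limit=10):
    """Wrap a full-text expression in a SQL SELECT with a MATCH() clause.

    Hypothetical helper: escapes backslashes and single quotes (assumed
    escaping rules) and optionally appends an ordinary attribute filter.
    """
    escaped = expression.replace("\\", "\\\\").replace("'", "\\'")
    query = f"SELECT * FROM {table} WHERE MATCH('{escaped}')"
    if extra_filter:
        query += f" AND {extra_filter}"
    return f"{query} LIMIT {limit}"


if __name__ == "__main__":
    for name, expression in OPERATOR_EXAMPLES.items():
        print(name, "->", build_select("products", expression))
```

The optional `extra_filter` argument is there to show the point about combining text search with filters, for example `build_select("products", "camera NEAR/3 lens", "price < 2000")`.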
Camera and camera lens names. I tried Meilisearch (1-2 years ago), and while I loved the simplicity (I barely understand what I threw together with many, many lines of C# code for Elasticsearch; this is partially on ES, but clearly on me as well), it was not good at getting results.
Names like "Lumix DC-S1IIE", "DSC-RX100 VIIA", or "FE 50-150 mm F2 GM (SEL50150GM)" do not quite work with default tokenizers and analyzers. Of course, that is for product names; full-text queries still need to use normal language rules… except for product names showing up in the text, so now I need multiple indexes for every field, and I search different sub-fields, sometimes with different query analyzers.
It was a lot of trial and error getting ES to both find what was searched for and be typo tolerant. It’s very easy to get far too many results and bad scoring for fuzzy queries.
So a bit of a special case, but something the customization capabilities of ES support pretty well.
Luckily, our dataset is rather small, maybe 100k documents, so scalability is not a problem.
- choosing which characters should be treated as token characters, and using the rest as token separators
- defining "blend chars" — for example, the hyphen (-) could make sense as both a separator and a non-separator in your case
- or optionally adding it to the ignore_chars list
- there's also regexp_filter to process tokens when indexing and searching
That said, setting things like this up perfectly is always tricky with any search engine, because the words and punctuation in real data often don't follow regular patterns. It's especially difficult when you want to find "abc def" by "ab cd ef" which may be a common situation in your case.
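As a rough illustration of the difference between token characters, separators, and blend characters described in the list above, here is a small Python sketch. It is not how Manticore tokenizes internally (the real settings have more modes and also track positions); the functions `tokenize` and `expand_chunk` are hypothetical and only show why a hyphen treated as a blend character yields both the joined and the split forms.

```python
import string

# Assumed token characters for the example: lowercase letters and digits.
ALNUM = string.ascii_lowercase + string.digits


def expand_chunk(chunk, blend_chars):
    """Return the tokens for one chunk that may contain blend characters."""
    parts, part = [], ""
    for ch in chunk:
        if ch in blend_chars:
            # A blend character also acts as a separator between parts.
            if part:
                parts.append(part)
            part = ""
        else:
            part += ch
    if part:
        parts.append(part)
    # The blended form keeps inner blend characters, trimmed at the edges.
    blended = chunk.strip(blend_chars) if blend_chars else chunk
    if blended and blended not in parts:
        return [blended] + parts
    return parts


def tokenize(text, token_chars=ALNUM, blend_chars=""):
    """Lowercase the text and split it on anything that is not allowed."""
    allowed = set(token_chars) | set(blend_chars)
    tokens, chunk = [], ""
    for ch in text.lower():
        if ch in allowed:
            chunk += ch
        elif chunk:
            tokens.extend(expand_chunk(chunk, blend_chars))
            chunk = ""
    if chunk:
        tokens.extend(expand_chunk(chunk, blend_chars))
    return tokens


if __name__ == "__main__":
    print(tokenize("DSC-RX100 VIIA"))                   # hyphen separates
    print(tokenize("DSC-RX100 VIIA", blend_chars="-"))  # hyphen blends
```

With the hyphen as a plain separator, a query for "dsc-rx100" can only match the two parts; with it as a blend character, the joined form is indexed as well, so both spellings have a chance to match.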
Does this mean you've at least implemented every API that Kibana requires?
To me, storing and searching logs is quite different from most other search use-cases, and it's not obvious that they should be handled by the same piece of software.
For example, tokenization, stemming, language support, and many other things are basically useless in log search. Also, log search often means storing a lot of data and rarely retrieving it (a different usage pattern from many search use-cases, which tend to be less write-heavy and more about reads).
I know ElasticSearch has had success in both, but if I were Manticore/Typesense/Meilisearch I'd probably just skip logs altogether.
Loki, QuickWit and other such tools are likely better suited for logs.
- https://github.com/quickwit-oss/quickwit
- https://github.com/grafana/loki
Recently had a look at Tantivy as well, although compared to raw Lucene, their perf is actually inferior. Wonder if there are specific benchmarks here which measure performance and if they compared tail latencies as opposed to averages.
Manticore has a modern multithreading architecture with efficient query parallelization that fully utilizes all CPU cores. It supports real-time indexing - documents are searchable immediately after insertion, with no need to wait for flushes or refreshes.
It uses row-wise storage optimized for small to large datasets, and for even larger datasets that don’t fit into memory, there's support for columnar storage through the Manticore Columnar Library.
Secondary indexes are built automatically using the PGM-index (Piecewise Geometric Model index), which enables efficient filtering and sorting by mapping keys to their memory locations. The cost-based query optimizer uses statistics about the data to choose the most efficient execution plan for each query.
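For readers unfamiliar with the idea of mapping keys to positions with a model, here is a deliberately simplified Python sketch of a piecewise-linear "learned" index. It is not the PGM-index algorithm and not Manticore's code: a real PGM-index builds its segments to guarantee a fixed error bound, whereas this toy (the class `TinyLearnedIndex` and its `segment_size` parameter are made up) cuts fixed-size segments and measures each segment's error afterwards.

```python
import bisect
import math


class TinyLearnedIndex:
    """Toy piecewise-linear index over a sorted list of keys."""

    def __init__(self, keys, segment_size=64):
        self.keys = sorted(keys)
        self.segments = []  # (first_key, slope, start_pos, max_error)
        for start in range(0, len(self.keys), segment_size):
            end = min(start + segment_size, len(self.keys)) - 1
            k0, k1 = self.keys[start], self.keys[end]
            # Straight line through the first and last point of the segment.
            slope = (end - start) / (k1 - k0) if k1 != k0 else 0.0
            # Largest gap between predicted and true position in the segment.
            err = max(
                abs(start + slope * (self.keys[i] - k0) - i)
                for i in range(start, end + 1)
            )
            self.segments.append((k0, slope, start, math.ceil(err)))
        self.first_keys = [seg[0] for seg in self.segments]

    def find(self, key):
        """Return a position of key in the sorted keys, or -1 if absent."""
        seg = bisect.bisect_right(self.first_keys, key) - 1
        if seg < 0:
            return -1
        k0, slope, start, err = self.segments[seg]
        guess = int(start + slope * (key - k0))
        # Only search the small window around the model's prediction.
        lo = max(0, guess - err - 1)
        hi = min(len(self.keys), guess + err + 2)
        pos = bisect.bisect_left(self.keys, key, lo, hi)
        if pos < len(self.keys) and self.keys[pos] == key:
            return pos
        return -1


if __name__ == "__main__":
    index = TinyLearnedIndex([i * i for i in range(500)])
    print(index.find(49), index.find(50))
```

The point of the design is that the index stores a handful of numbers per segment instead of one entry per key, and a lookup costs one prediction plus a short bounded search.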
Manticore is SQL-first: SQL is its native syntax, and it speaks the MySQL protocol, so it works out of the box with MySQL clients.
It's written in C++, starts quickly, uses minimal RAM, and avoids garbage collection — which helps keep latencies low and stable even under load.
As for benchmarks, there's a growing collection of them at https://db-benchmarks.com, where Manticore is compared to Elasticsearch, MySQL, PostgreSQL, Meilisearch, Typesense, and others. The results are open and reproducible.
We built a custom search engine on top of Elasticsearch. Our query builder regularly constructs optimised queries that would be impossible to implement in any of the touted alternatives or replacements, which almost always focus on simple full-text search, because that’s everything the developers ever used ES for. There’s a mind-bogglingly huge number of additional features that you need for serious search engines though, and any contender will have to support at least a subset of these to deserve that title in the first place.
I’m keeping an eye on the space, but so far, I’m less than impressed with everything I’ve seen.
What was the reason for the fork, and in what ways does Manticore Search differ from Sphinx today?
The auto-bolding of query terms in responses is quite convenient and has allowed me to skip annoying little regexes many times. Maybe other engines have it too and I never noticed?
sandstrom•7h ago
(both are also trying to replace Algolia, because both have cloud offerings)
robertlagrant•7h ago
[0] https://opensearch.org
smarx007•6h ago
mdaniel•6h ago
Plus, given that AWS is currently hosting OpenSearch, they are not incentivized to rest on their laurels when it comes to modern features or stability.
Keyframe•5h ago
sontek•5h ago
Edit: Never mind, in another part of this thread the maintainer said:
Which conflicts with the README: "Drop-in replacement for E in the ELK stack"
snikolaev•5h ago
Sorry for the confusion :)
merb•2h ago
(Logstash can basically ingest and output to everything…)
entropyie•5h ago
I have no affiliation.