Show HN: ShapedQL – A SQL engine for multi-stage ranking and RAG

https://playground.shaped.ai

80•tullie•1w ago

Hi HN,

I’m Tullie, founder of Shaped. Previously, I was a researcher at Meta AI, worked on ranking for Instagram Reels, and was a contributor to PyTorch Lightning.

We built ShapedQL because we noticed that while retrieval (finding 1,000 items) has been commoditized by vector DBs, ranking (finding the best 10 items) is still an infrastructure problem.

To build a decent "For You" feed or a RAG system with long-term memory, you usually have to stitch together a vector DB (Pinecone/Milvus), a feature store (Redis), an inference service, and thousands of lines of Python to handle business logic and reranking.

We built an engine that consolidates this into a single SQL dialect. It compiles declarative queries into high-performance, multi-stage ranking pipelines.

HOW IT WORKS:

Instead of just SELECT, ShapedQL operates in four stages native to recommendation systems:

RETRIEVE: Fetch candidates via Hybrid Search (Keywords + Vectors) or Collaborative Filtering.
FILTER: Apply hard constraints (e.g., "inventory > 0").
SCORE: Rank results using real-time models (e.g., p(click) or p(relevance)).
REORDER: Apply diversity logic so your Agent/User doesn’t see 10 nearly identical results.
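To make the four stages concrete, here is a minimal in-memory sketch in plain Python. It is not ShapedQL or the Shaped SDK: the `Item` fields, the keyword match standing in for hybrid search, the precomputed `p_click` value, and the per-category cap used as the diversity rule are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    item_id: str
    text: str
    category: str
    inventory: int
    p_click: float  # hypothetical stand-in for a real-time model score


def retrieve(catalog, query):
    """RETRIEVE: naive keyword match standing in for hybrid search."""
    words = query.lower().split()
    return [it for it in catalog if any(w in it.text.lower() for w in words)]


def filter_stage(candidates):
    """FILTER: hard constraint, here inventory > 0."""
    return [it for it in candidates if it.inventory > 0]


def score(candidates):
    """SCORE: order by the model score, best first."""
    return sorted(candidates, key=lambda it: it.p_click, reverse=True)


def reorder(ranked, max_per_category=1):
    """REORDER: cap items per category; overflow is pushed to the end."""
    seen, head, tail = {}, [], []
    for it in ranked:
        seen[it.category] = seen.get(it.category, 0) + 1
        (head if seen[it.category] <= max_per_category else tail).append(it)
    return head + tail


def run_pipeline(catalog, query, limit=3):
    return reorder(score(filter_stage(retrieve(catalog, query))))[:limit]


CATALOG = [
    Item("a", "red running shoes", "shoes", 5, 0.90),
    Item("b", "blue running shoes", "shoes", 2, 0.85),
    Item("c", "running socks", "socks", 9, 0.60),
    Item("d", "running shorts", "shorts", 0, 0.95),
    Item("e", "leather wallet", "accessories", 4, 0.70),
]

if __name__ == "__main__":
    print([it.item_id for it in run_pipeline(CATALOG, "running")])
```

Item "d" has the highest score but is dropped by the filter, and the second pair of shoes is moved behind the socks by the reorder step.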

THE SYNTAX: Here is what a RAG query looks like. This replaces about 500 lines of standard Python/LangChain code:

SELECT item_id, description, price
FROM
  -- Retrieval: Hybrid search across multiple indexes
  search_flights("$param.user_prompt", "$param.context"),
  search_hotels("$param.user_prompt", "$param.context")
WHERE
  -- Filtering: Hard business constraints
  price <= "$param.budget" AND is_available("$param.dates")
ORDER BY
  -- Scoring: Real-time reranking (Personalization + Relevance)
  0.5 * preference_score(user, item) + 0.3 * relevance_score(item, "$param.user_prompt")
LIMIT 20

If you don’t like SQL, you can also use our Python and TypeScript SDKs. I’d love to know what you think of the syntax and the abstraction layer!
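For readers who want to see what the ORDER BY expression in the query computes, here is a rough plain-Python equivalent of the weighted blend. It does not use the Shaped SDKs; the function names and the candidate scores are made up for illustration, and only the 0.5 and 0.3 weights come from the query.

```python
def blended_score(preference, relevance, w_pref=0.5, w_rel=0.3):
    """Weighted sum mirroring 0.5 * preference_score + 0.3 * relevance_score."""
    return w_pref * preference + w_rel * relevance


# item_id -> (preference_score, relevance_score); values are invented.
CANDIDATES = {
    "flight_1": (0.2, 0.9),
    "hotel_7": (0.8, 0.4),
    "hotel_3": (0.5, 0.5),
}


def rank(candidates, limit=20):
    """Sort item ids by blended score, best first, and apply the LIMIT."""
    return sorted(
        candidates,
        key=lambda item_id: blended_score(*candidates[item_id]),
        reverse=True,
    )[:limit]


if __name__ == "__main__":
    print(rank(CANDIDATES))
```

With these numbers the personalized hotel outranks the more query-relevant flight, because the preference term carries the larger weight.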

Comments

thorax•1w ago

RE: syntax

For casual use, I kinda always liked the whole MATCH/AGAINST syntax from old-school InnoDB, though obviously things have changed a lot since those days. But it felt less like calling embedded functions and more like extending SQL’s grammar.

Regarding the rest, it seems like a reasonable approach at first tinker.

tullie•1w ago

Makes sense. Implementation simplicity was part of the reason we didn't go for this. Currently the language is a transpiler that maps SELECT and FROM to the retrieval stages, WHERE to the post-filter stage, ORDER BY to the score stage, and REORDER BY to the final reranking stage. Because the text lookup is pushed through to the retriever, we put it there. In the future we'll have more logic in the language compilation process, which means we can move things around a bit; under the hood everything gets pushed to where it needs to go anyway.
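The clause-to-stage mapping described here can be pictured with a toy planner. This is only a sketch of the idea, not Shaped's transpiler: the stage names, the `plan` function, and the dictionary-based query representation are all hypothetical.

```python
# Clause-to-stage table following the description above (names are hypothetical).
CLAUSE_TO_STAGE = {
    "SELECT": "retrieve",
    "FROM": "retrieve",
    "WHERE": "filter",
    "ORDER BY": "score",
    "REORDER BY": "reorder",
}
STAGE_ORDER = ["retrieve", "filter", "score", "reorder"]


def plan(clauses):
    """Turn {clause: expression} into an ordered list of (stage, expressions)."""
    unknown = [c for c in clauses if c not in CLAUSE_TO_STAGE]
    if unknown:
        raise ValueError(f"unsupported clauses: {unknown}")
    stages = {name: [] for name in STAGE_ORDER}
    for clause, expr in clauses.items():
        stages[CLAUSE_TO_STAGE[clause]].append(expr)
    # Stages always run in pipeline order, whatever order the clauses arrived in.
    return [(name, exprs) for name, exprs in stages.items() if exprs]


QUERY = {
    "ORDER BY": "relevance_score(item)",
    "SELECT": "item_id",
    "FROM": "search_hotels(...)",
    "WHERE": "price <= 200",
}

if __name__ == "__main__":
    for stage, exprs in plan(QUERY):
        print(stage, exprs)
```

A query without REORDER BY simply produces a three-stage plan, which matches the idea that each clause only configures its own stage.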

refset•1w ago

Neat examples, and I agree that extending SQL like this has real potential. Another project along very similar lines is https://github.com/ryrobes/larsql

alexpadula•1w ago

Fairly easy to extend SQLite, Postgres and MariaDB/MySQL!

Curious, @refset: which relational database do you use? Is the code open source? Is the engine built from scratch? What general dialect does it support?

Cheers!

refset•1w ago

I work on https://github.com/xtdb/xtdb which is broadly Postgres-compatible with a few key SQL extensions (SQL:2011 bitemporal tables + immutability, first-class nested data, pipeline syntax, etc). Built on Arrow and the JVM but is otherwise mostly from scratch.

XTDB is perhaps not directly relevant to the topic at hand, but I am a firm believer that ML workflows can benefit from robust temporal modelling.

tullie•1w ago

I've been loving all these projects that are integrating LLMs/encoding directly into the language. There's so much power there.

Someone shared these with me the other day, and we're now inspired to add more remote LLM calls directly into ShapedQL: https://github.com/asg017/sqlite-rembed https://github.com/asg017/sqlite-lembed

jiwidi•1w ago

Great potential! Love the idea

hrimfaxi•1w ago

If I upload my own data, who exactly is it shared with? I can't find a list of subprocessors and this line in the privacy policy is alarming:

> We’ll whenever feasible ask for your consent before using your Personal information for a purpose that isn’t covered in this Privacy Policy.

tullie•1w ago

Subprocessors are here: https://docs.shaped.ai/docs/v2/support/security

Thanks for the feedback on the privacy policy, let me see if we can get that changed. For what it's worth we don't share personal information with anyone, this is likely just overly defensive legal writing on our part.

mritchie712•1w ago

this is cool, but:

> This replaces about 500 lines of standard Python

isn't really a selling point when an LLM can do it in a few seconds. I think you'd be better off pitching simpler infra and better performance (if that's true).

i.e. why should I use this instead of turbopuffer? The answer of "write a little less code" is not compelling.

airstrike•1w ago

> > This replaces about 500 lines of standard Python

> isn't really a selling point when an LLM can do it in a few seconds.

this is not my area of expertise, but doesn't that still assume the LLM will get it done right?

verdverm•1w ago

Shorter code is easier to understand and maintain, for both man and machine

This idea that it no longer matters because AI can spam out code is a concerning trend.

tullie•1w ago

This line comes from a specific customer we migrated from Elasticsearch. They had 3k lines of query logic, and it was completely unmaintainable. When they moved to Shaped, we were able to express all of their queries in a 30-line ShapedQL file. For them, reducing lines of code basically meant reducing tech debt and being able to keep improving their search, because they could actually understand what was happening in a declarative way.

To put it in the perspective of LLMs: LLMs perform much better when you can paste the full context into a short context window. I've personally found they just don't miss things as much, so the number of tokens does matter, even if it's less important than for a human.

As for the turbopuffer comment, just btw: we're not necessarily a vector store, we're more like a vector store + feature store + machine learning inference service. So we do the encoding on our side and bundle the model fine-tuning, etc.

pickleballcourt•1w ago

Is there a major difference between pgvector and shapedql?

tullie•1w ago

You can think of Shaped more like a vector store + feature store + ML inference combined into one service. This bundling is what makes it so easy to get state-of-the-art real-time recommendations and search performance.

E.g., imagine trying to build a feed with pgvector. You need to build all of the vector encoding logic for your catalog, then build user embeddings and the models to represent them, and then run a service that, at query time, encodes user embeddings from interactions, does a lookup on pgvector, and returns nearest-neighbor items. You also need to think about fine-tuning reranking models, diversity algorithms, and the cold-start problem of serving new items to users. Shaped and ShapedQL bundle all of that logic into one service that does it all in a low-latency and fault-tolerant way.
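As a rough picture of the do-it-yourself flow described above, the sketch below builds a user embedding as the mean of the item vectors a user interacted with and then does a brute-force cosine nearest-neighbor lookup. The in-memory dictionary stands in for a vector store such as pgvector, and the two-dimensional vectors and item names are invented for illustration; a real system would use learned embeddings and an indexed lookup.

```python
import math

# Hypothetical item embeddings; in practice these come from an encoder model.
ITEM_VECTORS = {
    "hiking_boots": [1.0, 0.0],
    "trail_shoes": [0.9, 0.1],
    "tent": [0.7, 0.3],
    "blender": [0.0, 1.0],
}


def mean_vector(vectors):
    """Average a list of equal-length vectors component by component."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recommend(interacted_ids, k=2):
    """Encode the user from interactions, then return the k nearest unseen items."""
    user_vec = mean_vector([ITEM_VECTORS[i] for i in interacted_ids])
    candidates = [i for i in ITEM_VECTORS if i not in interacted_ids]
    return sorted(
        candidates,
        key=lambda i: cosine(user_vec, ITEM_VECTORS[i]),
        reverse=True,
    )[:k]


if __name__ == "__main__":
    print(recommend(["hiking_boots"]))
```

Reranking, diversity, and cold-start handling are deliberately left out; they are the extra pieces the comment says you would still have to build yourself.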

pickleballcourt•1w ago

Thanks!

JacobiX•1w ago

>> Apply diversity logic so your Agent/User doesn’t see 10 nearly identical results

On Instagram this is a good thing, but here the example is hotel and flight search, where a more deterministic result is preferable.

In the retrieve → filter stage, using predicate pushdown may be more performant: first filter using hard constraints, then apply hybrid search?

tullie•1w ago

Makes sense! Agreed on the diversity for agents being a bit contrived here.

All of the retrievers do support pre-filtering, you just add the where clause within the retriever function. We're working on more query optimization to make this automatic also.
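The difference between pre-filtering and post-filtering is easy to see on a toy example. The sketch below is plain Python with invented hotel data, not ShapedQL: it retrieves the top-k items by similarity either before or after applying a price constraint.

```python
# (item_id, similarity to the query, price); all values are made up.
ITEMS = [
    ("h1", 0.95, 400),
    ("h2", 0.90, 350),
    ("h3", 0.85, 120),
    ("h4", 0.80, 300),
    ("h5", 0.75, 90),
    ("h6", 0.70, 110),
]


def top_k(items, k):
    """Return the k most similar items, best first."""
    return sorted(items, key=lambda it: it[1], reverse=True)[:k]


def post_filter(items, k, max_price):
    """Retrieve top-k first, then drop items that violate the constraint."""
    return [it[0] for it in top_k(items, k) if it[2] <= max_price]


def pre_filter(items, k, max_price):
    """Apply the constraint first, then retrieve top-k among eligible items."""
    eligible = [it for it in items if it[2] <= max_price]
    return [it[0] for it in top_k(eligible, k)]


if __name__ == "__main__":
    print("post-filter:", post_filter(ITEMS, k=3, max_price=150))
    print("pre-filter: ", pre_filter(ITEMS, k=3, max_price=150))
```

Post-filtering returns a single hotel here because two of the three retrieved candidates are over budget, while pre-filtering fills all three slots. That shortfall is one reason to push hard constraints into the retriever.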

data_ders•1w ago

I'm a big SQL stan here and I love the concept and if you ever wanna chat about how it might integrate with dbt let me know :)

conceptual questions:

1) why did you pick SQL? to increase the Total Addressable Userbase with the thinking that a SQL API means more people can use it than those who know Python or Typescript?

2) What isn't or will never be supported by this relational model? what are the constraints? Clickhouse comes to mind w/ its intentionally imposed limitations on JOINs

3) databases are historically the stickiest products, but even today SQL dialects are sticky because of how closely tied they are to the query engine. why do you think users will adopt not only a new dialect but a new engine? Especially given that the major DWH vendors have been relentlessly competing to add AI search vector functionality into their products?

4) mindsdb comes to mind as something similar that's been in the market for a while but I don't hear it come up often. what makes you different?

playground feedback:

1) why are there no examples that:
a) use `JOIN` (that `,` is unhinged syntax imho for an implicit join)
b) don't use `*` (it's cool that there's actual numbers!)

2) i kinda get why the search results default to a UI, but as a SQL person I first wanted to know what columns exist. I was happy to see "raw table" was available but it took me a while to find it. might be better to have raw table and UI output visible at the same time, with clear instructions on what columns the query requires to populate the UI

tullie•1w ago

Would love to chat about it, and talk about dbt integration. There's a few use cases that have come up where this would be really helpful. I'll PM you.

1) So we do actually have a Python and TypeScript API; it's just that the console web experience is SQL-only, as it feels the best for that kind of experience. The most important thing though is that it's declarative. This helps keep things relatively simple despite all the configuration complexity, and is also the best for LLMs/agents as they can iterate on the syntax without doc context.

2) Yeah exactly, joins are something we can't do at the moment, and I'm not sure of the exact solution there, honestly. Under the hood most of Shaped's offline data is built around Clickhouse, and we do want to build a more standard SQL interface just so you can do ad-hoc, analytical queries. We're currently trying to work out whether we should integrate it more directly with ShapedQL or just keep it as a separate interface (e.g. a ShapedQL tab vs a Clickhouse SQL tab).

3) We didn't really want to create a new SQL dialect, or really a new database. The problem is none of the current databases are well suited for search and recommendations, where you need extremely low latency, scalability, and fault tolerance, but also the ability to query based on a user or session context. One of the big things here is that because Shaped stores the user interactions alongside the item catalog, we can encode real-time vectors based on those interactions all in an embedding query service. I don't think that's possible with any other database.

4) I haven't looked into mindsdb too much, but this is a good reminder for me to deep dive into it later today. From taking a quick pass on it, my guess is the biggest difference is that we're built specifically for real-time search, recommendations and RAG, which means latency and the ability to integrate click-through-rate models and the like become vital.

Thanks so much for the playground feedback. I have some follow-up questions but I'm going to PM you if that's okay. Agreed on being able to see which columns exist.

froh42•1w ago

I had a look, so how would I bring my data into it?

By exposing my database to services somewhere else in the network. Oh and somewhere else is the US.

Fat chance in hell I can get anyone in my company to look at that or even think about legally applying it to serious data. (I'm in the EU. Yes, a lot of people and companies use US services. Currently it looks like NONE of these can legally be used.)

It looks interesting, but it needs an on-premise solution.

cyanydeez•1w ago

I always assume cloud based services are a moat against the simplicity of the code bases used for these types of demos.

tullie•1w ago

Fair enough. We're releasing our first bring-your-own-cloud (BYOC) offering in April. We're working with a big e-commerce platform in Germany that has data sovereignty requirements, so we totally get the constraint and are excited to offer something like this. We're starting with AWS, then will do GCP at the end of the year. Full on-premise will still be a while though, to be honest.

For the cloud platform though (console.shaped.ai), I'd recommend just testing with some synthetic or anonymized data, or our demos, and then if you're interested in BYOC, reach out after April!
