I imagine we're not alone in this type of abstraction layer, and some type safety would be very welcome there. I tried to build our system on top of Kysely (https://kysely.dev/), but the ClickHouse extension wasn't far enough along to make sense for our use case. As such, we basically had to build our own parser that compiles down to SQL, but there are many type-error edge cases, especially when we're joining against data from S3 that could be CSV, Parquet, etc.
Side note: one of the things I love most about ClickHouse is how easy it is to combine data from sources other than the source database at query time. I imagine this makes the problem of building an ORM much harder as well, since you may need to type-check SQL queries against external databases rather than against the source table itself.
No one needs an ORM: https://dev.to/cies/the-case-against-orms-5bh4
The article opens with "ORMs have proven to be useful for many developers" -- I believe the opposite is true.
The devil is of course in the details, but it's a nice dream.
ActiveRecord, RoR's ORM, is the complete opposite.
PL/pgSQL (or PL/SQL in Oracle) and variants.
ORMs, like all abstractions, are leaky. But I would argue that, because of the ubiquity and utility of SQL itself, they are a very leaky one: eventually you are going to need to work around them.
After switching to just using SQL in all situations, I found my life got a lot simpler. Performance also improved, as most ORMs (Rails's in particular) are not very well implemented from a performance standpoint, even for very simple use cases.
I cannot recommend enough that people skip the ORM entirely.
What is your concern re: random types popping up? SQLite springs to mind as a prime offender due to not enforcing column types OOTB, but most dialects have rather strong typing.
If we’re talking about mapping UUIDs and datetimes from their DB representations to types defined by the language stdlib, that’s usually the responsibility of the DB driver, no?
Pydantic knows nothing of your database. It’s schema-on-read (a great pattern that pydantic is well suited for), or serialization, or validation, but not an ORM.
Basically, we don't always know what future data source we may be reading from, and the schema of the source might change; but we can define what we expect on the receiving end (in pydantic) and have it fail loudly when our assumptions change.
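That fail-loudly, schema-on-read pattern can be sketched with stdlib dataclasses (pydantic does the same with far less ceremony; the `Event` schema and its fields here are hypothetical, purely for illustration):

```python
from dataclasses import dataclass

@dataclass
class Event:
    user_id: int
    amount: float

    def __post_init__(self):
        # Fail loudly when an upstream source changes its schema or types.
        if not isinstance(self.user_id, int):
            raise TypeError(f"user_id: expected int, got {type(self.user_id).__name__}")
        if not isinstance(self.amount, (int, float)):
            raise TypeError(f"amount: expected float, got {type(self.amount).__name__}")

row = {"user_id": 42, "amount": 9.99}   # a row from some unknown upstream source
event = Event(**row)                     # validates on read

try:
    Event(user_id="42", amount=9.99)     # upstream silently switched to strings
except TypeError as e:
    print("schema drift caught:", e)
```

The point is that the contract lives at the receiving end, so a silent upstream change becomes a hard error instead of corrupt analytics.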
Perhaps saying "ORM" is a bit of a misnomer, but they're discussing the DX ergonomics of an ORM and acknowledging the exact challenges you describe.
And for most of the code, the performance overhead was negligible. C# with LINQ is even better: it provides strong type checking for queries and often has almost zero overhead.
I'm using Go now, and I don't even want to touch any of the available ORMs, because they all suck compared to the state of the art in Java circa 20 years ago.
This can only be achieved by using some sort of type system, whether that means reflecting on the tables, generating code on the fly, or writing custom adapters for each structure. All of these can be greatly simplified with an ORM.
It's not going to help much with bespoke report asks from the business though.
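The "reflect on the result set" option above is cheap to sketch with the stdlib: build a record type on the fly from cursor metadata instead of hand-writing an adapter per query (table and column names here are illustrative):

```python
import sqlite3
from collections import namedtuple

def fetch_records(conn, sql, params=()):
    """Run a query and return rows as named tuples derived from the cursor metadata."""
    cur = conn.execute(sql, params)
    Row = namedtuple("Row", [col[0] for col in cur.description])
    return [Row(*r) for r in cur.fetchall()]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'ada')")

rows = fetch_records(conn, "SELECT id, name FROM users")
print(rows[0].name)  # prints "ada" -- attribute access without a hand-written mapping class
```

This is roughly the floor of what an ORM automates; real ORMs add static types and relationships on top.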
Schemas as application code mean you get version control, PR review, and type-safe changes. A query builder that feels like SQL lets you write "real" ClickHouse queries with IDE autocompletion and compile-time checking. Local development and CI should mirror production so you can preview schema changes before they apply to prod.
I believe this is what dbt set out to accomplish. They came at the problem from the point of view of a data transformation language (essentially pseudo-type-checked SQL for analytical engines, with some batteries included, i.e., macros), but the motivation was similar. I've always felt that what has held dbt back from more mainstream adoption by the dev community is that they've prioritized data transformation over data access from the application layer: business intelligence tools over a web app.
Moosestack looks interesting; will definitely check it out.
And I wanted it to emit the raw SQL, because that's generally what I want for OLAP.
So I had a go at building it. If anyone's interested, a very rough demo/prototype is here: https://www.robinlinacre.com/vite_live_pg_orm/
Load in the demo Northwind schema and click some tables/columns to see the generated joins.
The sane middle ground is libraries that give you nicer ergonomics around SQL without hiding it (like Go's sqlx, https://github.com/jmoiron/sqlx). Engineers should be writing SQL, period.
The blog suggests that an ORM for OLAP would do exactly that
I’ve written a lot about this particular topic: https://www.jacobelder.com/2025/01/31/where-shift-left-fails...
There's no rule saying you can't integrate your own manually written SQL with an ORM, and in fact, any production-ready, feature-complete ORM will allow you to do it, because it's effectively a requirement for any non-trivial use case.
Is it a variation of: "I suffered when I was young, so everyone must suffer as I did?"
SQL is terrible, however you slice it. It's a verbose bass-ackwards language. ORMs that remove the need to deal with SQL for 99% of trivial cases are great.
The remaining 1% can stay painful.
SQL remains the only way to efficiently perform MANY computations in the database, precisely because it's the lingua franca of the database planner. If you're not writing SQL, it doesn't mean that you're unable to cover 1% of the queries, it only means that you're leaving 99% of performance on the table. You can tell a bad programmer by their refusal to use views and materialized views. Not to mention normalisation! I've yet to see a coder using an ORM produce a sane schema. And don't get me started on aggregates. Relational databases represent relations, not objects, period.
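For what it's worth, the views point is cheap to demonstrate (a stdlib sketch; SQLite has no materialized views, so a plain view stands in, and the `sales` schema is illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (region TEXT, amount REAL);
    INSERT INTO sales VALUES ('eu', 10), ('eu', 5), ('us', 7);
    -- The view keeps the aggregation in the database, where the planner can see it,
    -- instead of re-deriving it in application code.
    CREATE VIEW revenue_by_region AS
        SELECT region, SUM(amount) AS revenue FROM sales GROUP BY region;
""")

print(conn.execute("SELECT * FROM revenue_by_region ORDER BY region").fetchall())
# [('eu', 15.0), ('us', 7.0)]
```

In Postgres the view could be `MATERIALIZED` and refreshed on a schedule, which is the part most ORMs have no first-class story for.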
> If you're not writing SQL, it doesn't mean that you're unable to cover 1% of the queries, it only means that you're leaving 99% of performance on the table.
Honestly? You're spewing bullshit. In most apps, most SQL is trivial: typically simple joins (with indexes) and simple filters. Anything more complicated is usually not suitable for OLTP anyway. Heck, all our SQL is linted to disallow full-table scans.
This kind of SQL is perfectly auto-generated by ORMs.
Those multi-page queries that required mystic DB knowledge for placing hints, burning incense, and paying $1000 per hour to Oracle consultants? They're entirely useless in modern software stacks: either you can keep the entire working set in RAM so these queries can be trivially rewritten, or you just won't use regular SQL for them at all.
> You can tell a bad programmer by their refusing to use views and materialized views.
You can tell a bad programmer by their using the DB in a way that requires materialized views. It typically ends with moving app logic into SQL, and may even lead to "SELECT '<td>' + row.cust_name + '</td>'".
As another commenter wrote:
”If you're doing OLAP…SQL isn't wholly adequate for this, it's hard work to get the SQL right and if there's joins involved it's not hard to accidentally fan out and start double counting.”
I feel this in my bones. Anytime someone in the business changes something in a source system, the joins create extra rows. Trying to address this with SQL is like plowing a field with a spoon.
And I don’t think ORMs are the answer. Just imperative code. The ability to throw in a few lines of sanity checking outside of SQL is such a massive boost to reliability and data quality, when the source systems are moving underneath your feet.
Double agree on the sanity checks.
SQL isn't wholly adequate for this, it's hard work to get the SQL right and if there's joins involved it's not hard to accidentally fan out and start double counting.
If you ask me, you want an analytic model of the data designed around measures and dimensions, with an anointed time dimension, and a way of expressing higher-level queries such that it automatically aggregates depending on which dimensions you leave out, and gives you options to sort, pivot, filter, etc. dynamically.
This doesn't look like entities, really, but it is a model between you and the SQL.
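The "aggregate automatically depending on which dimensions you leave out" idea can be sketched in a few lines (a toy, stdlib-only illustration with hypothetical table and column names, not a real semantic layer):

```python
def build_query(table, measures, dimensions):
    """Build a GROUP BY query: any dimension left out is aggregated away."""
    select_cols = list(dimensions) + [f"SUM({m}) AS {m}" for m in measures]
    sql = f"SELECT {', '.join(select_cols)} FROM {table}"
    if dimensions:
        sql += f" GROUP BY {', '.join(dimensions)}"
    return sql

print(build_query("sales", ["revenue"], ["region", "month"]))
# SELECT region, month, SUM(revenue) AS revenue FROM sales GROUP BY region, month

# Dropping "month" rolls revenue up to the region level automatically:
print(build_query("sales", ["revenue"], ["region"]))
# SELECT region, SUM(revenue) AS revenue FROM sales GROUP BY region
```

A real version would also know each measure's aggregation function, guard against fan-out across joined fact tables, and treat the time dimension specially; this is exactly the model that sits between you and the SQL.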
From my quick (not detailed) scan of the article, Moose looks too low-level, and not a useful abstraction to sit in the same logical place that ORMs do for OLTP databases.
While the domain modeling introduces some (or a lot of) friction, an OLAP ORM could emerge as an adaptation forked from their concept.
bob1029•5mo ago
I guess I have a wildly different interpretation of typical OLAP scenarios. To me this acronym mostly means "reporting". And in 99% of cases where the business desires a new report, the ideal views or type systems have not been anticipated. In these cases (most of them), I can't imagine a faster way to give the business an answer than just writing some sql.
ElatedOwl•5mo ago
In my experience these one-off reports are very brittle. The app ends up making breaking schema changes for these one-off reports, and you usually don't find out until it goes to production.
I’ve dealt with the maintenance nightmare before. At current gig we’re exploring solutions, curious what a robust pipeline looks like in 2025.
The ORM piece is interesting — we use ActiveRecord and Ruby, and accidentally breaking schema changes within the app will get caught by the unit test suite. I would love a way to bring OLAP reports in similarly, to test at CI time.
lpapez•5mo ago
Surely there is a way to run a raw query in Rails/ActiveRecord and use it in a smoke test?