1. Better query language: QUEL, LINQ, etc.
2. Better data model or performance: CouchDB, Mongo, Redis, etc.
3. Better abstraction: Zope, Active Record, et al.
SQL vendors keep absorbing the differentiating features from other approaches with enough success that they capture most business use cases. That's not to say there aren't successes outside of SQL! But I've seen it claimed several times that SQL will be dead thanks to some new tech, and most of the time SQL instead kills that tech or makes it niche.
I would classify Text-to-SQL as a Schedule I rabbit hole. It has some serious effects, but it otherwise has no acceptable level of value-add in most reasonable enterprises.
This is a major theme with LLMs. When they first came out you'd see them randomly returning garbage in the middle of an otherwise good output maybe 30% of the time. You knew you had to go through it with a fine-tooth comb.
Now it's more like 3%. And you just gloss over it.
I think you are right about Text-to-SQL being a trap. In this case the deficiencies are unacceptable.
But elsewhere? I think we are going to see the "customer service effect" applied all over the place. I'm referring to the downward trend in customer service, where quality was eschewed for scale. We went from highly competent agents providing individual feedback to hard-to-reach agents with no agency. It scales, but the bar has been lowered significantly. You get less customer service.
I think AI begs for the same tradeoff: settle for less because it makes some things easier, which of course makes other things much more complex or even undermines the entire premise.
I'd also love to understand better why you think that there is no "acceptable level of value-add in most reasonable enterprises".
The experience of writing SQL is a product of SQL's syntax, the structure of the database you're querying, and the complexity of your query.
When things get hairy, and you have a good number of representative queries already written that you can use as context, LLMs can be a really nice tool.
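Concretely, "representative queries as context" can just mean pasting a couple of known-good queries above your request. A hypothetical example (schema invented for illustration):

-- Example 1: monthly active users
SELECT DATE_TRUNC('month', s.started_at) AS month,
       COUNT(DISTINCT s.user_id) AS monthly_active_users
FROM sessions s
GROUP BY 1;

-- Example 2: revenue by plan
SELECT p.plan_name, SUM(i.amount) AS revenue
FROM invoices i
JOIN plans p ON p.plan_id = i.plan_id
GROUP BY p.plan_name;

-- ...then ask: "following the style and schema above, revenue per plan
-- for monthly active users only"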
Syntactically, PRQL is much simpler, cleaner, and more flexible: you chain pipeline stages instead of writing a single monolithic query statement.
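For a flavor of the pipeline style, a minimal sketch (PRQL shown in comments, syntax approximate; the invoices table is invented for illustration):

-- PRQL pipeline, one stage per line:
--   from invoices
--   filter status == "paid"
--   derive total = amount + tax
--   sort {-total}
--   take 10
-- ...which compiles to roughly this SQL:
SELECT *, amount + tax AS total
FROM invoices
WHERE status = 'paid'
ORDER BY total DESC
LIMIT 10;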
Data-model-wise, EdgeQL is close to what I want (links instead of joins, optionality instead of null, nesting support), but its syntax is almost as bad as SQL's.
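To illustrate the links-instead-of-joins point, a rough sketch (EdgeQL in comments, syntax approximate; the users/addresses schema is invented):

-- EdgeQL-style: follow the link, no explicit join, nested result
--   select User { name, addresses: { city } }
--   filter .active;
-- ...versus the flat join you'd write in SQL:
SELECT u.name, a.city
FROM users u
LEFT JOIN addresses a ON a.user_id = u.id
WHERE u.active = TRUE;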
I just wish PRQL had mutation in there too. I don't like the idea of swapping between PRQL and SQL, let alone writing some complex UPDATE statements where I'd rather write the query part in PRQL... Yeah, you could argue queries for updates shouldn't be that complex, though, heh.
Sure, you have to phrase your question in a way that's a bit like trying to ask a very specific question of an annoying "Self-Diagnosed Internet Autistic" co-worker who can't tell the difference between being "precise" and being "a pedantic pain in the arse", but it is just text.
Oh you're upset because SQL isn't in German? Well there's no reason why you can't stick German into the lexer, set your columns up with German names, and get a query like
WAHLEN_SIE zeilen_id, benutzer_namen, eingetragen
AUS benutzern WO aktiviert = WAHR
SORTIEREN NACH registrierungs_datum;
(To people who can really speak German: yes, I know, my German is so bad, go easy ;-) ) But really, why would you bother?
SQL is a formal language, not a natural one. It's precise, rigid, and requires a specialized understanding of schema, joins, and logic. Text-to-SQL systems don't exist because people are too lazy to type; they exist because most people can't fluently express analytical intent in SQL syntax. They can describe what they want in natural language ("show me all active users who registered this year"), but translating that into correct, optimized SQL requires at least familiarity, and sometimes expertise.
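Even that simple request hides schema decisions. One plausible translation (Postgres-flavored; table and column names assumed):

SELECT user_id, user_name
FROM users
WHERE active = TRUE
  AND registered_at >= DATE_TRUNC('year', CURRENT_DATE);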
So the governance challenges discussed in the article aren't about "oh, SQL is too hard to type"; they're about trust, validation, and control when you introduce an AI intermediary that converts natural language into a query that might affect sensitive data.
And you want everyone to learn this? People who don't even have the time or ability to master Excel?
Really?
The problem is that, in my experience, AI is just as shit at that as most humans are. AI can find and connect primary keys and columns with the same name. It doesn't understand the cardinality of entities or how to look for data that breaks the application's data model (which is a critical element of data validation).
None of the actual hard parts of writing SQL go away with AI.
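For example, here's the kind of validation query an experienced analyst writes unprompted, but a schema-only tool rarely thinks to: check for rows that break an implied "one primary address per user" rule (schema invented for illustration):

SELECT user_id, COUNT(*) AS primary_addresses
FROM addresses
WHERE is_primary = TRUE
GROUP BY user_id
HAVING COUNT(*) > 1;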
LLMs are getting pretty good at writing SQL. There is so much training material out there, and it is not that hard to validate the results. The really interesting question is whether they will be better at leveraging all the database-specific dialects than tools like PowerBI. High-performance databases like Exasol often have a lot of specific features in their SQL dialects that generic tools and ORMs are not able to use; it will be interesting to see if LLMs can make that more accessible.
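As a small example of the kind of dialect feature generic tools rarely emit: several dialects (e.g. Snowflake, BigQuery, DuckDB) support QUALIFY to filter on a window function directly. The sales_summary table here is invented for illustration:

-- Dialect shortcut:
SELECT region, user_id, total_sales
FROM sales_summary
QUALIFY ROW_NUMBER() OVER (PARTITION BY region ORDER BY total_sales DESC) <= 3;

-- ...versus the portable subquery form most generic tools generate:
SELECT region, user_id, total_sales
FROM (
  SELECT region, user_id, total_sales,
         ROW_NUMBER() OVER (PARTITION BY region ORDER BY total_sales DESC) AS rn
  FROM sales_summary
) t
WHERE rn <= 3;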
Anyone selling you 99% accuracy can prove it there first.
The promise of the technology is not that it can deal with any arbitrarily complex enterprise setup, but that you expose it, with enough guidance, to a controlled and sufficiently good data model.
Depending on your use case this can be super valuable as it enables a lot more people to use data and get relevant recommendations.
But yeah it's work to make nice roads and put up signs everywhere.
I have hundreds of tables designed by several different teams. I do have decent documentation on the tables, but if I had a nice, organized data model I wouldn't need an AI assistant. If I had a perfect data model, my team could write simple SQL queries, or give ChatGPT a schema dump plus a natural-language query and it would get the answer most of the time.
IMHO, the big value in this space will be when these tools can wrangle realistic databases.
In Dot, it's divide and conquer. If you have several different teams each of them has to maintain their knowledge base.
A bunch of our customers have fewer than 10 tables hooked up to Dot, but this data is core to their business, and so the analytics agent is really useful. Our most complex setup is on more than 5000 tables, but that was a lot more work to lay out the structure and guidelines.
Also, I don't think all organizations are ready for AI. If the data model is a huge mess, data quality is poor, and analytics use cases are not mature, it's better to focus on the fundamentals first.
But… someone who knows approximately what to do and sort of how to do it could work wonders, if we had LLMs trained on a corpus with specific rules.
I don't know how to LEFT JOIN, or which tables I need, to get the aggregate of sales in each region by date and price range, but I can describe it halfway and I know how to check whether each step is valid.
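For concreteness, a sketch of the query being described, with every table, column, and price band invented for illustration:

SELECT r.region_name,
       s.sale_date,
       CASE WHEN s.price < 10   THEN 'low'
            WHEN s.price < 100  THEN 'mid'
            WHEN s.price >= 100 THEN 'high' END AS price_band,
       SUM(s.amount) AS total_sales
FROM regions r
LEFT JOIN sales s ON s.region_id = r.region_id
GROUP BY 1, 2, 3;

Each step (the join key, the bands, the grouping) is checkable on its own, which is the point.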
LLMs can do this. They're trained on English, and they are able to weight definitive rules. But instead we throw random text at a general-purpose transformer.
Parsing a response of tokens into grammatical English is the most expensive computation (after the initial scraping and cataloging). Instead of wasting all those cycles against the sum total of GitHub, StackOverflow, Reddit, and Wikipedia, create a fuzzy match on a simplification of a rigorous specification and train it on your data (just a few million tokens), to teach it that users have primary addresses and are associated with accounts that have regions, and that region X has roughly 10 times the sales volume of region Y.
So someone knowledgeable in the domain, with an understanding of logical rigor and a general idea of the data shape, could actually become 10x more efficient. Instead of trying to lift vibe coders to the level of Shakespearean monkeys, you could be turning mid-level devs into super analysts.
For most businesses, it really really is. In general, people (businesses) are incredibly sensitive about any possibility of data leakage, even just the metadata. There are lots of companies who would pay for this, and they tend to have a lot of money.