“It passed all the unit tests; the shape of the code looks right,” he said. “It's 3.7x more lines of code that performs 2,000 times worse than the actual SQLite. Two thousand times worse for a database is a non-viable product. It's a dumpster fire. Throw it away. All that money you spent on it is worthless.”
“This magic that literally didn't exist two years ago in more than a toy state is moving at such a rapid rate that it couldn't even reproduce SQLite three months ago, and only got good enough in those weeks to produce a bad version of SQLite! Clearly useless! It has no value, no one is using it to do any work, and it won't get better over the next three months or three years!”
An amazing take.
As a mechanical engineer, that is exactly how you solve that problem.
> point is that it isnt exponential/fundamental progress
You just stuck the goalpost on a rocket and shot it into space. You'd be hard-pressed to show evidence that progress in this field was ever exponential; in most fields it never was. Logarithmic progress is typical: you make a lot of progress early on, picking the low-hanging fruit and figuring out the basics, and as the problems get harder and the theory better understood it takes more effort to make improvements, but fundamentally improvements continue.
Incremental progress from increasing scale is, again, perfectly cromulent. It's how we've made advanced computers that can fit in your pocket, it's how clothing became so cheap it's practically disposable, it's how you can fly across the country for less than the price of a nice dinner. Imagine looking at photolithography, textile manufacturing, or aircraft 5 years after they reached their modern forms and saying "this has plateaued".
In a sense, looking at photolithography, textile manufacturing, or aircraft as you suggest, does show they plateaued, at least to me.
Are we sure we want to be making things so cheap they become discardable in the ever-growing landfills of the world?
Literally the introduction of transformers was absolutely exponential; in fact, exponential progress is pretty much the defining characteristic of the first chunk of a new technology's development. In CS specifically, there are dozens and dozens of instances of exponential improvements. Like... obviously lol. Also, the plateau that folks are mentioning is about a lack of fundamental improvements. Perhaps MEs don't experience exponential improvements, but we do all the time in CS and SWE lol.
Yes, context is the plateau. But I don't think the bottleneck is RAM. The mechanism described in "Attention Is All You Need" is O(N^2), where N is the size of the context window. I can "feel" this in everyday usage: as the context window grows, the model's responses slow down, a lot. That's due to compute being serialised because there aren't enough resources to do it in parallel. The limiting resources are more likely compute and memory bandwidth than RAM.
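A minimal pure-Python sketch of the quadratic behaviour described above (a toy illustration, not any production implementation): every query vector is scored against every key vector, so an N-token context produces an N x N score matrix, and both compute and memory traffic grow with N^2.

```python
import math

def naive_attention(q, k, v):
    """Single-head scaled dot-product attention, written out naively.

    q, k, v: lists of N vectors, each of dimension d. The per-query
    score list has N entries, so the full score matrix is N x N --
    the O(N^2) cost that makes long contexts slow.
    """
    d = len(q[0])
    out = []
    for qi in q:
        # N dot products per query -> N^2 total across all queries.
        scores = [sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d) for kj in k]
        m = max(scores)  # subtract max for numerical stability
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        weights = [e / z for e in exps]  # softmax over the N keys
        out.append([sum(w * vj[t] for w, vj in zip(weights, v)) for t in range(d)])
    return out

# With identical q/k/v, attention is a uniform average over the values:
n, d = 4, 2
vecs = [[1.0] * d for _ in range(n)]
print(naive_attention(vecs, vecs, vecs))  # four [1.0, 1.0] vectors
```

Doubling the context length doubles the work per token and quadruples the total work, which matches the slowdown the comment describes feeling in everyday use.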
If there is a breakthrough, I suspect it will be models turning the O(N^2) into O(N * ln(N)), which is generally how we speed things up in computer science. That in turn implies abstracting the knowledge in the context window into a hierarchical tree, so the attention mechanism only has to look across a single level in the tree. That in turn requires it to learn and memorise all these abstract concepts.
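Rough arithmetic on what that change in complexity would buy, purely to illustrate the hypothetical O(N^2) to O(N log N) gain (the hierarchical scheme itself is speculation, not an existing architecture):

```python
import math

# Hypothetical work reduction from replacing O(N^2) attention with an
# O(N * log N) scheme, at a few plausible context-window sizes.
for n in (4_000, 32_000, 128_000):
    quadratic = n * n
    n_log_n = n * math.log2(n)
    print(f"N={n:>7,}: N^2={quadratic:.1e}  N*log2(N)={n_log_n:.1e}  "
          f"~{quadratic / n_log_n:,.0f}x less work")
```

At a 128k-token context that's roughly a three-orders-of-magnitude reduction, which is why complexity-class improvements, when they exist, dwarf constant-factor hardware gains.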
When models are trained they learn abstract concepts, which they retrieve near effortlessly, but they don't do that same type of learning when in use. I presume that's because it requires a huge amount of compute, repetition, and time. If only they could do what I do: go to sleep for 8 hours a day, dream about the same events using local compute, and learn them. :D Maybe, one day, that will happen, but not any time soon.
> "The other way to look at this is like there's no free lunch here," said Smiley. "We know what the limitations of the model are. It's hard to teach them new facts. It's hard to reliably retrieve facts. The forward pass through the neural nets is non-deterministic, especially when you have reasoning models that engage an internal monologue to increase the efficiency of next token prediction, meaning you're going to get a different answer every time, right? That monologue is going to be different.
> "And they have no inductive reasoning capabilities. A model cannot check its own work. It doesn't know if the answer it gave you is right. Those are foundational problems no one has solved in LLM technology. And you want to tell me that's not going to manifest in code quality problems? Of course it's going to manifest."
You can argue with specifics in there, but they made their case.
"Insurance underwriters are seriously trying now to remove coverage in policies where AI is applied and there's no clear chain of responsibility"
I see a future coming, where everyone uses AI but nobody admits it.