The maximum database size would be 4294967294 pages at 65536 bytes per page, or 281,474,976,579,584 bytes (about 281 terabytes).
Usually SQLite will hit the maximum file size limit of the underlying filesystem or disk hardware long before it hits its own internal size limit.
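For anyone who wants to sanity-check the arithmetic (2^32 - 2 usable pages at the 64 KiB maximum page size), a quick snippet:

```python
# 2**32 - 2 usable pages at the maximum 64 KiB page size:
max_pages = 4294967294
page_size = 65536
print(max_pages * page_size)         # 281474976579584
print(max_pages * page_size / 1e12)  # ~281.5 TB
```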
Did a full-day deep dive into SQLite a while back; funny how one tiny database runs the whole world—phones, AI, your fridge, your face... and like, five people keep it alive.
Blows my mind.
Though to be fair to those organizations, it's amazing the performance someone can get out of a quarter million dollars of off-the-shelf server gear. Just imagine how much RAM and enterprise-grade flash that budget buys, on top of AMD's or Intel's highest-bin CPUs!
The largest filesystems I could find are ~1EB and 700PB at Oak Ridge.
FWIW, I took the ‘usually’ to mean usually the theoretical file size limit on a machine is smaller than theoretical SQLite limit. It doesn’t necessarily imply that anyone’s hit the limit.
The limit is more about how much data you want to keep in sqlite before switching to a "proper" DBMS.
Also the limit above is for someone with the foresight that their database will be huge. In practice most sqlite files use the default page size of 4096, or 1024 if you created the file before the 2016 version. That limits your file to 17.6TB or 4.4TB respectively.
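If you do have that foresight, the page size has to be declared before the first table is created. A minimal sqlite3 sketch (the file name is made up):

```python
import sqlite3

# Has to happen before the first table is created; on an existing database the
# new size only takes effect after a VACUUM. "big.db" is a made-up file name.
con = sqlite3.connect("big.db")
con.execute("PRAGMA page_size = 65536")
con.execute("CREATE TABLE samples(t INTEGER, v REAL)")
con.commit()
print(con.execute("PRAGMA page_size").fetchone())  # (65536,)
```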
I don't use it for sqlite, but having multi-petabyte filesystems, in 2025, is not rare.
Discussions: https://news.ycombinator.com/item?id=43682006 | 5 months ago | 41 comments
And in these read only cases I'd use Parquet files queried with Duckdb Wasm.
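Something like this, though shown here with the Python API rather than Wasm; the file and column names are made up:

```python
import duckdb

# DuckDB reads the Parquet file in place, with column pruning and predicate
# pushdown; no load step required.
rows = duckdb.sql(
    "SELECT user_id, count(*) AS n "
    "FROM 'events.parquet' "
    "GROUP BY user_id ORDER BY n DESC LIMIT 10"
).fetchall()
print(rows)
```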
You'll probably benefit from using the largest possible page size, HTTP keep-alive connections, etc.
But even then, you'll pull at most 64 KiB per request. If you managed to have response times of 10 ms, you'd be pulling at most 52 Mbps.
So yeah, if your queries end up reading just a couple of pages, it's great. If they require a full table scan, you need some smart prefetching+caching to hide the latency.
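The arithmetic behind that 52 Mbps ceiling, for reference:

```python
# One 64 KiB page per round trip at 10 ms per request:
page_bits = 64 * 1024 * 8
requests_per_second = 1 / 0.010               # 100
print(page_bits * requests_per_second / 1e6)  # ~52.4 Mbit/s
```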
If craftsmanship is measured by the long tail of good choices that give something a polished and pristine feel, then SQLite was built with none of it. And yet, it's by far the best initial choice for every project that needs a database. Most projects will never need to switch to anything more.
It apparently has an extensive and thorough test suite. That's an excellent design choice that tons of other projects could learn from, and is probably a key element of its success.
Sometimes a poorly-designed thing that is excellently-documented and thoroughly-tested is better than a brilliantly-designed thing that is lacking in those. In fact, unless the number of users of the thing is 1 (the creator), the former is likely a better option across all possible use-cases.
Perhaps we could generalize this by stating that determinism > pareto-optimality.
It was the first database to introduce a cost-based optimizer, and ran under both VMS and Digital UNIX.
Oracle bought it, and VMS versions are still supported.
https://www.oracle.com/database/technologies/related/rdb.htm...
https://en.m.wikipedia.org/wiki/Oracle_Rdb
(My employer is still using the VMS version.)
Is this opinion shared by others?
At the same time, strict tables address some of the concern of those coming from conventional databases.
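For those who haven't seen them, a minimal STRICT-table sketch (needs an SQLite library of version 3.37 or later; table and column names are just for illustration):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t(id INTEGER, name TEXT) STRICT")
con.execute("INSERT INTO t VALUES (1, 'ok')")            # fine
try:
    con.execute("INSERT INTO t VALUES ('oops', 'bad')")  # TEXT into an INTEGER column
except sqlite3.DatabaseError as e:
    print(e)  # constraint error: STRICT tables reject values of the wrong type
```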
Dates and times are a core problem for SQLite that I haven't seen elsewhere, but the approach does sidestep UTC conversion and constantly shifting regional time zones. My OS gets timezone updates every few months, and avoiding that dependency showed foresight.
Default conformance with Postel's Law is SQLite's stance, and it does seem to work with the ANSI standard.
Typically, the lowest common denominator wins mass appeal/usage.
By not enforcing safety checks or even typing, SQLite actually caters to more use cases, not fewer.
I don't remember ever saying that. Rather, see https://sqlite.org/flextypegood.html for a detailed explanation of why I think flexible typing ("weak typing" is a pejorative and inaccurate label) is a useful and innovative feature, not a limitation or a bug. I am surprised at how successful SQLite has become, but if anything, the flexible typing system is a partial explanation for that success, not a cause of puzzlement.
"I had this crazy idea that I’m going to build a database engine that does not have a server, that talks directly to disk, and ignores the data types, and if you asked any of the experts of the day, they would say, “That’s impossible. That will never work. That’s a stupid idea.” Fortunately, I didn’t know any experts and so I did it anyway, so this sort of thing happens. I think, maybe, just don’t listen to the experts too much and do what makes sense. Solve your problem."
Misstated, I'd say. You said "nobody" but the actual quote is about the assumed conventional wisdom of the time, which is quite different. And while this was probably inadvertent, you phrased it in a way that almost made it sound like that was Dr. Hipp's original opinion, which, of course, is the opposite of true.
Postel's Law, also known as the Robustness Principle, is a guideline in software design that states: "be conservative in what you send, be liberal in what you accept."
Nothing to do with the poster's badly formatted complaint about SQLite. By that metric, DuckDB has a ton of issues that outscale SQLite's.
> It actually does have typed values
Now. As the article points out, they were not part of the initial design, because of the Tcl heritage.
Tcl has value types. Tcl 7.x and earlier only had one data type, the string, so adding two integers required two string-to-int conversions followed by an int-to-string conversion. In 1997, Tcl 8.x was released, which internally has distinct value types (int, string, etc.), although it retains the outward appearance of "everything is a string" for backward compatibility. So SQLite's Tcl heritage included distinguishing different types of values, as is done in post-1997 Tcl.
Long term compatibility (i.e. prioritizing the needs of users vs chasing inevitably changing ideas about what feels polished or pristine), near fanatical dedication to testing and quality, and sustained improvement over decades - these are the actual signs of true craftsmanship in an engineering project.
(plus, I don't agree with you that the storage layer, column format, or SQL implementation are bad).
I agree. I am not suggesting that the SQLite team doesn't know how to make the technology better. Just that they aren't/haven't. Backwards compatibility is a good reason not to.
My original comment was contrasting craftsmanship and utility, since both are somewhat prized on HN, but they aren't the same thing at all. Look at a system like Wireguard. A huge amount of small decisions went into making that as simple and secure as it is. When most developers are confronted with similar decisions, they perform almost randomly and accumulate complexity over the long tail of decisions (it doesn't matter just pick a way). With Wireguard, every design decision reliably drove toward simplicity (it does matter, choose carefully).
I'd say they're prized everywhere, though "craftsmanship" is really subjective, and the HN I usually see seems to have more of a meta of "criticize anything someone tries to build, and rave about IQ" tbh ;)
SQLite works and I don't have to think about why it works (too much). That is IMO a true hallmark of solid engineering.
You're going to have to expand on that, because I have no idea what you're talking about, nor does anyone else here seem to.
It's a relational database meant primarily for a single user. It's SQL. It works as expected. It's performant. It's astonishingly reliable.
The only obviously questionable design decision I'm aware of is for columns to be able to mix types, but that's more "differently designed" rather than "poorly designed", and it's actually fantastic for automatically saving space on small integers. And maybe the fact ALTER TABLE is limited, but there are workarounds and it's not like you'll be doing that much in production anyways.
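A quick illustration of the type mixing, with a made-up table:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t(v)")  # no declared type at all
con.executemany("INSERT INTO t VALUES (?)",
                [(1,), (1.5,), ("one",), (b"\x01",), (None,)])
# Every value keeps its own storage class; small integers are stored compactly.
print(con.execute("SELECT v, typeof(v) FROM t").fetchall())
# [(1, 'integer'), (1.5, 'real'), ('one', 'text'), (b'\x01', 'blob'), (None, 'null')]
```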
What are your specific problems with it?
What an odd design choice. Why not just have the value be the base 2 logarithm of the page size, i.e. a value between 9 and 16?
Yes, that would have been a better choice. Originally, the file format only supported page sizes between 512 and 32768, though, and so it just seemed natural to stuff the actual number into a 2-byte integer. The 65536 page size capability was added years later (at the request of a client) and so I had to implement the 65536 page size in a backwards compatible way. The design is not ideal for human readability, but there are no performance issues nor unreasonable code complications.
The page size value is not the only oddity. There other details in the file format that could have been done better. But with trillions of databases in circulation, it seems best to leave these minor quirks as they are rather than to try to create a new, more perfect, but also incompatible format.
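For the curious, the quirk looks like this if you read the header yourself (offsets per the file-format doc; a rough sketch, not production code):

```python
import struct

def sqlite_page_size(path):
    """Read the page size from an SQLite file header (a rough sketch)."""
    with open(path, "rb") as f:
        header = f.read(100)                       # the 100-byte database header
    (raw,) = struct.unpack_from(">H", header, 16)  # 2-byte big-endian int at offset 16
    return 65536 if raw == 1 else raw              # the value 1 is the 64 KiB special case
```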
sqlite docs recommend avoiding using it on network storage, though from what I can gather, it's less of an issue if you're truly only doing reads (meaning I could create it locally and then copy it to network storage). Apache Parquet seems promising, and it seems to support indexing now which is an important requirement.
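If you go the copy-it-to-read-only-storage route, opening the file read-only and immutable sidesteps the locking that the network-storage warning is mostly about. A sketch with a made-up path:

```python
import sqlite3

# mode=ro opens read-only; immutable=1 additionally promises the file can never
# change, so SQLite skips locking and change detection entirely. Only safe if
# the copy is genuinely never written to.
con = sqlite3.connect("file:/mnt/share/data.db?mode=ro&immutable=1", uri=True)
print(con.execute("SELECT count(*) FROM sqlite_master").fetchone())
```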
https://www.kernel.org/doc/html/latest/filesystems/squashfs....
chasil•22h ago
https://www.sqlite.org/locrsf.html
chasil•21h ago
However, a significant fraction of the current installed base would not upgrade, requiring new feature development for both versions.
The test harness would also need implementations for both versions.
Then the DO-178B status would need maintenance for both.
That introduces significant complexity.