LumoSQL

https://lumosql.org/src/lumosql/doc/trunk/README.md

253•smartmic•1mo ago

Comments

Lio•1mo ago

I'd say the most interesting thing on that site is the Not-Forking idea[1].

1. https://lumosql.org/src/not-forking/doc/trunk/README.md

actionfromafar•1mo ago

I have never seen PikChr diagrams before, very interesting.

sksrbWgbfK•1mo ago

https://pikchr.org/home/doc/trunk/homepage.md

Made by the same people who brought us SQLite.

stonemetal12•1mo ago

The US Navy?

aredox•1mo ago

I was going to say that.

We really need a way to customise software at the source code level without forking.

Arwill•1mo ago

You mean like #define + #ifdef?

LoganDark•1mo ago

Very niche but: https://github.com/SpongePowered/Mixin

It's not really possible to implement in the same way for many other languages, but something like this but for source code transformations (rather than bytecode, or machine code for compiled languages) is probably the kind of thing they're thinking of.

Mixin allows you to insert code into methods, modify calls to functions, read/write to local variables, modify constants, and a lot more in that type of vein. It is the way mods are made in the Fabric mod loader for Minecraft. I believe Forge also reluctantly added support for Mixin back in 1.16 or so.

SOLAR_FIELDS•1mo ago

Back in the day I used AspectJ to do something like this. The interface was decently friendly and IMO decorator style patterns are going to be the most user friendly approach to something like this. You won’t ever find a universal solution to this though without doing transpilation to some common format that gets morphed (ASM in your example, but I could see also a theoretical world where WASM would offer this)

eddd-ddde•1mo ago

I got to the end of that without really understanding what this is solving, what it does, or how.

How do you handle changing upstream files locally without forking? Do you just keep changes in a separate configuration format that is applied lazily at build time?

I've never had issues with maintaining a fork anyways.

theamk•1mo ago

it's a replacement for checkout + patch process.

The main advantage over plain "patch" is that it is more powerful in the face of upstream changes. For example, if you rename the upstream file, you have no good way to represent this in .patch, but that project allows it. There is also a way to specify a patch using a function name, which should make it more robust in the face of upstream changes.

As for my opinion, this seems like an incremental improvement over existing tools. I'd prefer a simple shell script that does "git checkout ... && mv ... && cp .. && patch ..." over something fancy like this.

90s_dev•1mo ago

I'm confused, is this just patch and apply patch?

alexjurkiewicz•1mo ago

Yes. It's a small DSL to fork a repository and apply a series of textfile transformations (replace file, replace partial file).

But if you give this a cool name, it's a New Idea.

worldsayshi•1mo ago

So it's a fork but with somewhat semantic diff specifications?

SOLAR_FIELDS•1mo ago

Yes, specifications which could almost certainly be applied with custom merge drivers or strategies, rather than trying to come up with a clever name and hand-baking a custom toolchain to do what you can already do with Git

SoftTalker•1mo ago

Not all projects use git.

90s_dev•1mo ago

Weird.

SOLAR_FIELDS•1mo ago

How many of those projects are attempts at maintaining patches across multiple upstreams, where none of those upstream projects use Git? The author freely admits in their post that all upstream projects in their case are using git, like most of the FOSS world. They just chose fossil for their downstream project because Reasons.

knowitnone•1mo ago

did you say "fork"?

dunham•1mo ago

The TeX source works like this, too. There is the original tex.web and some change files which get applied when converting it to either a TeX document or Pascal source. (These days the Pascal is further translated to C.)

aidenn0•1mo ago

Article doesn't tell me why it's better than git's subtree merge.

theamk•1mo ago

let's say you need to copy config.h.in to config.h and patch a few lines. In case of "subtree merge", you will have to do it manually, every time you upgrade upstream.

e63f67dd-065b•1mo ago

Don't we just call these out-of-tree patches? Lots of those for Linux lying about, it's not a new idea. I guess the difference is that they have multiple upstreams, so really they're more of a new project that consists of:

- A set of sqlite patches,

- Other upstreams and patches?

- A custom toolchain to build all the above together into one artefact

SOLAR_FIELDS•1mo ago

If a trunk is a series of patches then isn’t maintaining a series of patches against multiple upstreams just maintaining multiple forks? Feels like mostly semantics to try to differentiate at this point. I mean what is the thing that you would get from doing it this other way? Isn’t it fundamentally still going to require pulling upstream changes regularly and resolving conflicts when they occur? Reading the treatise they say “this allows us to address conflicts that normally you would have to resolve manually”. So Git has tools to pick and choose and auto resolve conflicts, you just have to customize the behavior of your tooling a bit.

Seems like they’re just ditching the inbuilt tools git/github offers to achieve this and doing the exact same thing with custom bespoke tooling. Instead of doing that, I’d be more interested in leveraging what Git has baked in with some sort of wrapper tool to perform my automation with some bespoke algorithm. There are merge drivers and strategies end users can implement themselves for this kind of power user behavior that don’t require some weird reinvention of concepts already built into Git

riffraff•1mo ago

AFAIU, it's a build strategy.

Instead of having a fork, you have a configuration that says "checkout x then apply changes a,b,c".
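To make "checkout x then apply changes a,b,c" concrete, here is a minimal sketch in Python. It is not the Not-Forking file format or tooling: `Change`, `build_tree`, the in-memory `upstream_v1` snapshot and the `#define` names are all hypothetical, chosen only to show that the upstream tree is never edited in place and the changes live as a separate, replayable list.

```python
from dataclasses import dataclass
from typing import Callable


# Hypothetical names throughout; this is an illustration, not Not-Forking's API.
@dataclass
class Change:
    path: str                     # file inside the upstream tree
    apply: Callable[[str], str]   # text transformation for that file


def build_tree(upstream: dict[str, str], changes: list[Change]) -> dict[str, str]:
    """Return a patched copy of the upstream tree; the input is left untouched."""
    tree = dict(upstream)
    for change in changes:
        if change.path not in tree:
            # Upstream moved or deleted the file: fail loudly at build time.
            raise KeyError(f"upstream no longer has {change.path}")
        tree[change.path] = change.apply(tree[change.path])
    return tree


# Stand-in for "checkout x": a pinned upstream snapshot held in memory.
upstream_v1 = {"config.h": "#define BACKEND btree\n#define PAGE_SIZE 4096\n"}

# "apply changes a, b": kept separately from upstream and replayed on every build.
changes = [
    Change("config.h", lambda text: text.replace("BACKEND btree", "BACKEND lmdb")),
    Change("config.h", lambda text: text + "#define ROW_CHECKSUMS 1\n"),
]

patched = build_tree(upstream_v1, changes)
print(patched["config.h"])
```

When a new upstream snapshot arrives, the same `changes` list is replayed against it, which is roughly what distinguishes this from keeping a long-lived fork.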

herodoturtle•1mo ago

This looks really good. The per-row checksums are particularly neat. Good luck to these folks.
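For anyone wondering what a per-row checksum buys you, here is a rough application-level sketch using Python's built-in sqlite3 and hashlib. This is not how LumoSQL implements the feature (the thread does not say how it does); the `items` table, the `digest` column and the helper names are assumptions made up for the illustration.

```python
import hashlib
import sqlite3


def row_digest(name: str, qty: int) -> str:
    # Hash a canonical encoding of the row's values (unit separator between fields).
    return hashlib.sha256(f"{name}\x1f{qty}".encode("utf-8")).hexdigest()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, qty INTEGER, digest TEXT)"
)
for name, qty in [("bolt", 10), ("nut", 25)]:
    conn.execute(
        "INSERT INTO items (name, qty, digest) VALUES (?, ?, ?)",
        (name, qty, row_digest(name, qty)),
    )


def corrupted_ids(conn: sqlite3.Connection) -> list[int]:
    """Return ids of rows whose stored digest no longer matches their contents."""
    rows = conn.execute("SELECT id, name, qty, digest FROM items ORDER BY id").fetchall()
    return [row_id for row_id, name, qty, digest in rows if row_digest(name, qty) != digest]


before = corrupted_ids(conn)
# Simulate corruption: the value changes but the stored digest does not.
conn.execute("UPDATE items SET qty = 99 WHERE name = 'nut'")
after = corrupted_ids(conn)
print(before, after)
conn.close()
```

The point of doing it per row rather than per file is that a check can name the damaged rows instead of only reporting that the database as a whole is bad.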

noduerme•1mo ago

Possibly the wrong place to ask this, but:

I've played with SQLite when it was still available in-browser, and I felt that was on the brink of being a game-changer. If it was still supported in-browser and we had replication from the browser, peer-to-peer, I think we'd be living in a much more useful world. It's a lovely tech, but I never built anything serious around it. At this point, as a front-end web technology that seems to be gone. I know I could conceivably use it to back a NodeJS server, keep all the data in memory and local files, but I don't see a great use case for that. I do lots of small projects that could use SQLite, but I usually scaffold them around a single shot Mysql DB for testing, which is easy to spin up and easy to detach from any given back-end instance. So I'm not sure what I'd gain by trying to make a tiny database on the back-end live in Sqlite files. I'm totally enchanted by stuff like Litestream, and I'm actually dying to find a reason to try it. But every good use case for Sqlite that I could think of sort of died when it stopped being a viable client-side store.

TL;DR, what are people using SQLite for? What's the advantage over spinning up a tiny MySQL instance in a cloud somewhere, where you don't have to deal with managing replication and failover by yourself?

nicoburns•1mo ago

Most uses of SQLite are client-side apps. Basically everything that's not web: from mobile apps and desktop apps to embedded software in things like cars, TVs, kiosks, etc. There are probably more apps using SQLite than not using SQLite for these kinds of apps.

noduerme•1mo ago

Are these things just running sql.js? Doesn't that use a kind of unstable webstorage instead of writing client-side files? I don't have a good handle on the state of SQLite these days as a way to store semi-permanent data on the client. In a locked-in environment or a backend I feel like it might make sense, but... isn't there like a 50Mb limit on localStorage, and how does that play nicely with a potentially larger DB...?

nicoburns•1mo ago

"client" doesn't mean web. Mobile apps, desktop apps, etc are all client-side apps that can run regular SQLite.

Think of apps like Spotify, WhatsApp, AirBnB, Uber, etc. Not to mention mail clients, web browsers, etc. Probably 90% of non-web clients are using SQLite.

noduerme•1mo ago

I'm not sure about this, I may be exaggerating, but aren't all four apps you mentioned (Spotify, WhatsApp, AirBnB, Uber) built on Electron? So they'd be using SQLite in the Node portion as their storage. That's their "server side", not client side.

For that portion (the locally-run mobile backend - the middleware) I guess it would make more sense... so I see what you're saying.

[Edit: Of all 4 things - Maybe only Spotify is actually an Electron app...? Although I'm confused as to how the rest could leverage NodeJS locally]

lights0123•1mo ago

I would consider the entirety of one of those Electron apps to be a client since their main purpose is to interface with an external server—even if a small part of them internally is itself a server.

beejiu•1mo ago

Servers are things on other sides of networks. An electron app running locally is all client, whether it contains a database or not.

jitl•1mo ago

Spotify used Chrome Embedded Framework (CEF) not Electron, but it’s similar in that it bundles Chrome and uses webviews to draw UI

_joel•1mo ago

There are bindings to sqlite for pretty much most languages out there. Not just webapps

https://docs.python.org/3/library/sqlite3.html

https://www.sqlite.org/cintro.html

https://docs.rs/sqlite/latest/sqlite/

etc :)
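As a quick taste of what a binding looks like, here is a minimal sketch with Python's standard-library sqlite3 module (the first link above). The `notes` table and its contents are made up for the example; nothing has to be installed or started separately, because the SQLite engine runs inside the Python process.

```python
import sqlite3

# ":memory:" keeps the database in RAM; pass a file path to persist it instead.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT NOT NULL)")

# Parameter placeholders (?) keep values out of the SQL text.
conn.executemany("INSERT INTO notes (body) VALUES (?)", [("first",), ("second",)])
conn.commit()

rows = conn.execute("SELECT id, body FROM notes ORDER BY id").fetchall()
print(rows)
conn.close()
```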

noduerme•1mo ago

huh. Sorry, but do some languages have SQLite bindings to some other executable? I thought that sql.js and sqlite3 in JS actually were SQLite in its entirety, running in script. You don't need to run anything else to make them work.

_joel•1mo ago

They have an interface to work with SQLite proper, which will be shipped alongside the application in question. sql.js is using SQLite in WASM/Emscripten, so it's pretty much analogous.

striking•1mo ago

Definitely can still be used on a client, you just have to be creative with running it. https://github.com/orbitinghail/sqlsync uses rusqlite compiled to WASM within a Web Worker, for instance.

_joel•1mo ago

Chances are you have an installation or several running in your pocket right now. It's one of the most widely deployed pieces of software in existence. It's not supposed to be a 'traditional' DB in the sense of running a webapp for many users (although it can do that), but to back client-based software that needs a data store/query tool and doesn't want to implement its own.

softfalcon•1mo ago

Agree with all of this. Also want to add that SQLite is a perfect jumping off point for full stack web-devs doing react-native (or similar) and want a familiar data query pattern they were already used to from MySQL, Postgres, etc.

Having a consistency of SQL everywhere is really appealing for data management.

incomingpain•1mo ago

Most of my Python projects use SQLite, with one exception where I need multiprocess access to the database without locking problems, plus speed, so I run the entire DB in memory.

https://docs.python.org/3/library/sqlite3.html

The built-in library makes it really quick and easy to use. With MySQL, or in my case PostgreSQL if I needed a full DB, you're looking for a third-party library. I have used Psycopg before, but it's just not needed.

Yes, I've come up against the SQLite locked-database performance troubles, and failed to actually get the multi-user thing working properly. But mostly I just needed to reapproach the issue.

My new startup http://mapleintel.ca is db.sqlite3 based. thousands of lines in it so far and growing every day.

noduerme•1mo ago

I've used sqlite3 in node, and it's nice and performant for small cases, yes. Mostly I've used it for things small enough where a user could download an entire sqlite file of their data, and then re-upload it in case their data got lost. But ultimately this data gets stored in a true MySQL DB. I don't think I'd trust it to run a whole system with thousands of users and millions of entries... honestly, maybe my issue is that I don't trust NodeJS enough...

hruk•1mo ago

FWIW, I've been running a system with roughly 100K users, about 25 qps on average, with a single SQLite file for several years. No issues with data.

noduerme•1mo ago

That's... pretty amazing. It sounds crazy to me, I'm obsessive about hourly backups, but do you use something like Litestream to keep copies?

arkh•1mo ago

From some months ago: https://news.ycombinator.com/item?id=43076785

> searchcode.com’s SQLite database is probably one of the largest in the world, at least for a public facing website. It’s actual size is 6.4 TB.

hruk•1mo ago

Yep, we use Litestream. It's been very reliable.

hiAndrewQuinn•1mo ago

SQLite is almost certainly more battle tested by this point than even MySQL for things like this. Alongside having somewhere around the ballpark of 10¹² current deployments (yes, 1 trillion) it has about 600 lines of testing code per line of its actual source code.

https://www.sqlite.org/testing.html

To give you an idea of just how hardcore this is, they stress test something as fundamental as malloc() independently:

>SQLite, like all SQL database engines, makes extensive use of malloc() [...] On servers and workstations, malloc() never fails in practice and so correct handling of out-of-memory (OOM) errors is not particularly important. But on embedded devices, OOM errors are frighteningly common and since SQLite is frequently used on embedded devices, it is important that SQLite be able to gracefully handle OOM errors.

>OOM testing is accomplished by simulating OOM errors. SQLite allows an application to substitute an alternative malloc() implementation using the sqlite3_config(SQLITE_CONFIG_MALLOC,...) interface. The TCL and TH3 test harnesses are both capable of inserting a modified version of malloc() that can be rigged to fail after a certain number of allocations. These instrumented mallocs can be set to fail only once and then start working again, or to continue failing after the first failure. OOM tests are done in a loop. On the first iteration of the loop, the instrumented malloc is rigged to fail on the first allocation. Then some SQLite operation is carried out and checks are done to make sure SQLite handled the OOM error correctly. Then the time-to-failure counter on the instrumented malloc is increased by one and the test is repeated. The loop continues until the entire operation runs to completion without ever encountering a simulated OOM failure. Tests like this are run twice, once with the instrumented malloc set to fail only once, and again with the instrumented malloc set to fail continuously after the first failure.

I don't say this as a hater of MySQL! SQLite is built with very different constraints in mind. But data consistency is something it really shines at.
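The OOM loop described in the quoted testing docs can be sketched in a few lines of Python. This is only a toy model of the idea, not SQLite's TCL or TH3 harness: `FailingAllocator`, `operation` and `oom_sweep` are hypothetical names, and the "operation" is a stand-in that needs three allocations and must leave no partial state behind when one of them fails.

```python
class SimulatedOOM(MemoryError):
    """Raised by the instrumented allocator in place of a real out-of-memory error."""


class FailingAllocator:
    """Hypothetical stand-in for an instrumented malloc()."""

    def __init__(self, fail_at: int, persistent: bool):
        self.fail_at = fail_at        # 1-based index of the first failing allocation
        self.persistent = persistent  # keep failing after the first failure?
        self.calls = 0

    def alloc(self, size: int) -> bytearray:
        self.calls += 1
        if self.calls == self.fail_at or (self.persistent and self.calls > self.fail_at):
            raise SimulatedOOM(f"allocation #{self.calls} failed")
        return bytearray(size)


def operation(allocator: FailingAllocator, state: list) -> None:
    """Toy operation under test: three allocations, committed all or nothing."""
    staged = [allocator.alloc(16) for _ in range(3)]
    state.extend(staged)  # only reached when every allocation succeeded


def oom_sweep(persistent: bool) -> int:
    """Move the failure point forward until the operation completes cleanly.

    Returns how many simulated failures were injected along the way.
    """
    fail_at = 1
    while True:
        state: list = []
        allocator = FailingAllocator(fail_at, persistent)
        try:
            operation(allocator, state)
        except SimulatedOOM:
            # The check after each injected failure: no partial state may remain.
            assert state == [], "a failed operation must leave no partial state"
            fail_at += 1
            continue
        return fail_at - 1


# Run the sweep twice, as in the quote: fail-once mode, then fail-continuously mode.
injected_once = oom_sweep(persistent=False)
injected_persistent = oom_sweep(persistent=True)
print(injected_once, injected_persistent)
```

The sweep ends on the first iteration where the failure point lies beyond the last allocation the operation makes, so every allocation site gets exercised exactly once per mode.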

chris_st•1mo ago

Two things:

1. There's almost certainly a port of Sqlite3 to WASM that would be more than glad to run in your browser.

2. I'd really love to know what applications fit in the "we had replication from the browser, peer-to-peer, I think we'd be living in a much more useful world" situation. We've had GunDB, IPFS, etc. that live in the browser for decades (and projects like Urbit), and the killer app just... doesn't seem to exist? Let alone anything useful as just a basic demo? Anyone have anything to point to? I just don't see it, personally.

noduerme•1mo ago

Heh. Well, #2, brilliant question. But no, I'm not thinking of anything as sexy as totally distributed filesystems. 15 years ago when I was into crypto and ran a bitcoin casino I would have had much bigger ideas for fully distributed DBs (which surely would have tanked and caused me ruin). Currently, I deal with a lot of site-specific software installations that run their own MySQL servers, some of which have unexpected downtime or go offline. I have a lot of custom code to align them with a master source of truth when they come back up. At least a few times a year, one of them gets so corrupted that I have to just login remotely and rebuild its database by hand. If I could have designed them to share a single database peer to peer, it would have saved me a lot of personal time.

There are probably a lot of hub-and-spoke systems like this flying way under the radar that would be a lot better if there were a reliable technology to keep them synchronized. I keep looking at Litestream and thinking about it.

RUnconcerned•1mo ago

Your best answer for "a much more useful world" is... easier development of crypto gambling? That sounds like an actively worse world to live in to me, to be honest.

chris_st•1mo ago

Wow, that sounds like a really difficult situation! I think your idea of using Litestream is definitely worth a try. Good luck (but not with crypto gambling :-).

ncruces•1mo ago

https://sqlite.org/wasm/doc/trunk/index.md

jurip•1mo ago

One thing that's always brought up in these discussions, because it's worth bringing up, is the file format of the macOS image editor Acorn: https://flyingmeat.com/acorn/docs/technotes/ACTN002.html

Personally I use it a bunch in mobile and desktop apps.

noduerme•1mo ago

That's quite fun to know! Although, it's also funny that I downloaded the demo .acorn files on a Mac and the OS has no idea how to open them without searching the App Store.

I feel like a JSON file would be more compact and easier to read, but wtf do I know. Harder to query, I guess?

gbalduzzi•1mo ago

JSON is just terrible size-wise, it can't efficiently store binary data
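A rough way to see the size point: JSON has no binary type, so raw bytes have to be wrapped in something like base64, which costs about a third extra, while SQLite stores a BLOB at its original length. The sketch below uses random bytes as a stand-in for image data; the `layers` table and `pixels` field are made up and are not Acorn's actual schema.

```python
import base64
import json
import os
import sqlite3

payload = os.urandom(30_000)  # stand-in for binary image data

# JSON route: bytes must be text-encoded first (base64 here).
as_json = json.dumps({"pixels": base64.b64encode(payload).decode("ascii")})

# SQLite route: the bytes go into a BLOB column unchanged.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE layers (id INTEGER PRIMARY KEY, pixels BLOB)")
conn.execute("INSERT INTO layers (pixels) VALUES (?)", (payload,))
stored_len = conn.execute("SELECT length(pixels) FROM layers").fetchone()[0]
conn.close()

print(f"raw bytes:   {len(payload)}")
print(f"JSON text:   {len(as_json)}")
print(f"SQLite BLOB: {stored_len}")
```

This only compares the encoded payload sizes; the SQLite file itself adds page and header overhead on top, which matters for tiny documents but not for image-sized ones.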

criddell•1mo ago

SQLite is bundled with Windows as well.

> All supported versions of Windows support SQLite, so your app does not have to package SQLite libraries. Instead, your app can use the version of SQLite that comes installed with Windows.

https://learn.microsoft.com/en-us/windows/apps/develop/data-...

arkh•1mo ago

> what are people using SQLite for?

Managing profiles and inventory in a solo game where crafting results are random and I don't like limited inventories.

dist-epoch•1mo ago

> single shot Mysql DB for testing, which is easy to spin up and easy to detach from any given back-end instance

you're doing something wrong if that is easier than using sqlite

> What's the advantage over spinning up a tiny MySQL instance in a cloud somewhere

one advantage is your thing will work without needing network access

pmbanugo•1mo ago

I'd try it just for the LMDB mixing

tehlike•1mo ago

I'd try it to see if it can be made to work against rocksdb or foundationdb - mvsqlite kind of does that but foundationdb is missing things like compression that rocksdb provides.

maxloh•1mo ago

SQLite itself is open source, but the last time I checked, its test suite remains proprietary.

How can a "fork" like this be tested to confirm that everything stays working and compatible with upstream after the change?

rogerbinns•1mo ago

The general test suite is not proprietary, and is a standard part of the code. You can run make test. It uses TCL to run the testing, and covers virtually everything.

There is a separate TH3 test suite which is proprietary. It generates C code of the tests so you can run the testing in embedded and similar environments, as well as coverage of more obscure test cases.

https://sqlite.org/th3.html

OJFord•1mo ago

Why is that? Surely that leads to conversations with open source contributors like 'this fails the test suite, but I can't show you, please fix it'?

jitl•1mo ago

SQLite doesn’t accept contributions

LiamPowell•1mo ago

This isn't an issue as SQLite doesn't accept contributions because they don't want to risk someone submitting proprietary code and lying about its origin.

I've never understood why other large open-source projects are just willing to accept contributions from anyone. What's the plan when someone copy-pastes code from some proprietary codebase and the rights holder finds it?

Vvector•1mo ago

The "plan" is to take out the contaminated code and rewrite it.

LiamPowell•1mo ago

If the rights holder is particularly litigious then I could see them suing even if you agreed to take out their code under the argument that you've distributed it and profited from it. I don't know if there's been any cases of this historically but I'd be surprised if there hasn't been.

Vvector•1mo ago

Every open source project has the possibility of litigation. Can't always live in fear of the bogeyman

timewizard•1mo ago

The same issue is present with the use of LLMs. Are you absolutely sure it didn't just repeat some copyrighted code at you?

OJFord•1mo ago

Partly why they have CLAs I suppose?

If someone sells me something they stole, I'm not on the hook for the theft.

mmooss•1mo ago

> LumoSQL exists to demonstrate changes to SQLite that might be useful, but which SQLite probably cannot consider for many years because of SQLite's unique position of being used by a majority of the world's population.

> SQLite is used by thousands of software projects, just three being Google's Android, Mozilla's Firefox and Apple's iOS which between them have billions of users. That is a main reason why SQLite is so careful and conservative with all changes.

That's a great perspective. How well does the SQLite team work with them? How well does it work in production, especially if you need SQLite compatibility? And

Lord_Zero•1mo ago

The home page doesn't say what it actually is or does.

philjohn•1mo ago

Main thing seems to be pluggable backends.

advisedwang•1mo ago

right, but what do the alternative backends do that the OG doesn't

kragen•1mo ago

The home page says, "LumoSQL can swap back end key-value store engines in and out of SQLite. LMDB is the most famous (but not the only) example of an alternative key-value store, and LumoSQL can combine dozes of versions of LMDB and SQLite source code(...)" and also, "Encryption and corruption detection, optionally per-row".

systems•1mo ago

"LumoSQL can swap back end key-value store engines in and out of SQLite."

"LumoSQL can swap SQLite backend, with Key-value store engines"

===

"LMDB is the most famous (but not the only) example of an alternative key-value store"

"We currently only support LMDB as an alternative KV store"

===

"and LumoSQL can combine dozes of versions of LMDB and SQLite source code like this:"

"LumoSQL will allow you to use different versions of SQLite and LMDB in parallel as different backends"

kragen•1mo ago

Why are you reposting these sentences with additional errors inserted into them?

systems•1mo ago

i was rephrasing , saying what i understood (thinking i was making it clearer)

you are suggesting, i misunderstood the original text , if that is true i blame the original of being obfuscated

kragen•1mo ago

It definitely is no paragon of clarity or careful editing.

diggan•1mo ago

I feel like this best explains what added benefits LumoSQL tries to add:

> LumoSQL is a derivative of SQLite, the most-used software in the world. Our focus is on privacy, at-rest encryption, reproducibility and the things needed to support these goals. [...]

https://lumosql.org/src/lumosql/file?name=doc/project-announ...

I'm unsure what Phase 1 was about, or if there is a planned Phase 3, but seems to outline what they're currently aiming for at least.

blacklion•1mo ago

AFAIR, LMDB is very buggy. There was one person who showed that and maintained a fork of LMDB with many bugs fixed, but he is very opinionated, thinks that the world outside Russia is "evil", and forbids using his fork for evil...

Oh, the license was changed to Apache 2.0! But the GitHub account still has a note which equates Hitler with Soros...

Why should SQLite backend be replaced with LMDB?

UPD: Ooops, LMDB was forked a long time ago, so, maybe, LMDB can be fixed already!

jasonwatkinspdx•1mo ago

AFAIK the Symas version is the official now. It's part of the OpenLDAP repo but they have a read only mirror on github for people who want to use it stand alone.

OpenLDAP is very heavily used by a lot of companies, in particular mobile operators that put it under very heavy load, so I'd be reasonably confident of its reliability.

canadiantim•1mo ago

Can LumoSQL be used with Litestream for replication?

canadiantim•1mo ago

How does LumoSQL compare to LibSQL/Turso?

cluckindan•1mo ago

KV backend suggestion: BadgerDB

https://github.com/hypermodeinc/badger

alexpadula•1mo ago

How does LumoSQL deal with locking? What if I proposed Wildcat, which is lockless, as a storage engine? Can LumoSQL be optimized for atomicity and multiple writers? There's a C shared library too, if you're curious: https://wildcatdb.com

