We were adding MotherDuck as a destination and the first version just used DuckDB’s Go driver directly. It worked great on my machine… until we wired it into our Transfer service (https://github.com/artie-labs/transfer).
Because the driver requires CGO, cross-compilation for amd64 and arm64 broke, we lost our easy static binaries, and our Docker images had to pull in C toolchains and system libraries just to support one dependency. We tried isolating the CGO bits in a separate module, but that still caused CI failures and forced us to rewrite chunks of our build pipeline. At that point it was clear we didn't want CGO anywhere near our main service.
So I built ducktape: a tiny standalone microservice that wraps DuckDB’s Appender API behind HTTP/2 streams. Clients stream NDJSON over HTTP/2, and ducktape appends directly into DuckDB on the other side. No CGO in the main codebase, and we keep our cross-platform, pure-Go build story.
The overhead was surprisingly low in benchmarks: ~757 MiB/sec over HTTP/2 vs ~848 MiB/sec in-process, roughly 89% of native throughput, but over the network.
ducktape is open source and MIT licensed: https://github.com/artie-labs/ducktape
I’d love feedback, especially if you’ve tackled CGO isolation differently or have ideas to squeeze out more performance!