frontpage.
newsnewestaskshowjobs

Made with ♥ by @iamnishanth

Open Source @Github

fp.

Show HN: Look Ma, No Linux: Shell, App Installer, Vi, Cc on ESP32-S3 / BreezyBox

https://github.com/valdanylchuk/breezydemo
236•isitcontent•15h ago•26 comments

Show HN: I spent 4 years building a UI design tool with only the features I use

https://vecti.com
336•vecti•17h ago•147 comments

Show HN: If you lose your memory, how to regain access to your computer?

https://eljojo.github.io/rememory/
300•eljojo•18h ago•186 comments

Show HN: I'm 75, building an OSS Virtual Protest Protocol for digital activism

https://github.com/voice-of-japan/Virtual-Protest-Protocol/blob/main/README.md
5•sakanakana00•50m ago•1 comments

Show HN: I built Divvy to split restaurant bills from a photo

https://divvyai.app/
3•pieterdy•52m ago•0 comments

Show HN: R3forth, a ColorForth-inspired language with a tiny VM

https://github.com/phreda4/r3
76•phreda4•15h ago•14 comments

Show HN: Smooth CLI – Token-efficient browser for AI agents

https://docs.smooth.sh/cli/overview
92•antves•1d ago•66 comments

Show HN: I Hacked My Family's Meal Planning with an App

https://mealjar.app
2•melvinzammit•3h ago•0 comments

Show HN: ARM64 Android Dev Kit

https://github.com/denuoweb/ARM64-ADK
17•denuoweb•1d ago•2 comments

Show HN: BioTradingArena – Benchmark for LLMs to predict biotech stock movements

https://www.biotradingarena.com/hn
25•dchu17•20h ago•12 comments

Show HN: I built a free UCP checker – see if AI agents can find your store

https://ucphub.ai/ucp-store-check/
2•vladeta•3h ago•2 comments

Show HN: Slack CLI for Agents

https://github.com/stablyai/agent-slack
47•nwparker•1d ago•11 comments

Show HN: Artifact Keeper – Open-Source Artifactory/Nexus Alternative in Rust

https://github.com/artifact-keeper
152•bsgeraci•1d ago•64 comments

Show HN: Gigacode – Use OpenCode's UI with Claude Code/Codex/Amp

https://github.com/rivet-dev/sandbox-agent/tree/main/gigacode
18•NathanFlurry•23h ago•9 comments

Show HN: Compile-Time Vibe Coding

https://github.com/Michael-JB/vibecode
10•michaelchicory•5h ago•1 comments

Show HN: Slop News – HN front page now, but it's all slop

https://dosaygo-studio.github.io/hn-front-page-2035/slop-news
14•keepamovin•6h ago•5 comments

Show HN: Daily-updated database of malicious browser extensions

https://github.com/toborrm9/malicious_extension_sentry
14•toborrm9•20h ago•7 comments

Show HN: Horizons – OSS agent execution engine

https://github.com/synth-laboratories/Horizons
23•JoshPurtell•1d ago•5 comments

Show HN: Micropolis/SimCity Clone in Emacs Lisp

https://github.com/vkazanov/elcity
172•vkazanov•2d ago•49 comments

Show HN: Falcon's Eye (isometric NetHack) running in the browser via WebAssembly

https://rahuljaguste.github.io/Nethack_Falcons_Eye/
5•rahuljaguste•15h ago•1 comments

Show HN: Fitspire – a simple 5-minute workout app for busy people (iOS)

https://apps.apple.com/us/app/fitspire-5-minute-workout/id6758784938
2•devavinoth12•8h ago•0 comments

Show HN: I built a RAG engine to search Singaporean laws

https://github.com/adityaprasad-sudo/Explore-Singapore
4•ambitious_potat•9h ago•4 comments

Show HN: Sem – Semantic diffs and patches for Git

https://ataraxy-labs.github.io/sem/
2•rs545837•10h ago•1 comments

Show HN: Local task classifier and dispatcher on RTX 3080

https://github.com/resilientworkflowsentinel/resilient-workflow-sentinel
25•Shubham_Amb•1d ago•2 comments

Show HN: FastLog: 1.4 GB/s text file analyzer with AVX2 SIMD

https://github.com/AGDNoob/FastLog
5•AGDNoob•11h ago•1 comments

Show HN: A password system with no database, no sync, and nothing to breach

https://bastion-enclave.vercel.app
12•KevinChasse•20h ago•16 comments

Show HN: GitClaw – An AI assistant that runs in GitHub Actions

https://github.com/SawyerHood/gitclaw
9•sawyerjhood•21h ago•0 comments

Show HN: Gohpts tproxy with arp spoofing and sniffing got a new update

https://github.com/shadowy-pycoder/go-http-proxy-to-socks
2•shadowy-pycoder•12h ago•0 comments

Show HN: I built a directory of $1M+ in free credits for startups

https://startupperks.directory
4•osmansiddique•12h ago•0 comments

Show HN: A Kubernetes Operator to Validate Jupyter Notebooks in MLOps

https://github.com/tosin2013/jupyter-notebook-validator-operator
2•takinosh•13h ago•0 comments
Open in hackernews

Show HN: DuckDB for Kafka Stream Processing

https://sql-flow.com/docs/tutorials/intro/
77•dm03514•2mo ago
Hello Everyone! We built SQLFlow as a lightweight stream processing engine.

We leverage DuckDB as the stream processing engine, which gives SQLFlow the ability to process 10's of thousands of messages a second using ~250MiB of memory!

DuckDB also supports a rich ecosystem of sinks and connectors!

https://sql-flow.com/docs/category/tutorials/

https://github.com/turbolytics/sql-flow

We were tired of running JVM's for simple stream processing, and also of bespoke one off stream processors

I would love your feedback, criticisms and/or experiences!

Thank you

Comments

srameshc•2mo ago
This looks brilliant, thank you. I love DuckDB and use it for lot of local data processing jobs. We have a data stream, not to the size where we need to push to BigQuery or elsewhere. I was thinking of trying something like sql-flow but I am glad now it makes the job very easy.
mbay•2mo ago
I see an example with what looks like a lookup-type join against a Postgres DB. Are stream/stream joins supported, though?

The DLQ and Prometheus integration out of the box are nice.

dm03514•2mo ago
Stream to stream joins are NOT currently supported. This is a regularly requested feature, and I'll look at prioritizing it.

SQLFlow uses duckdb internally for windowing and stream state storage :), and I'll look at extending it to support stream / stream joins.

Could you describe a bit more about your use case? I'd really appreciate it if you could create an issue in the repo describing your use case and desired functionality a bit!

https://github.com/turbolytics/sql-flow/issues

We were looking at solving some of the simplier use cases first before branching out into these more complicated ones :)

mbay•2mo ago
I worked on stream processing at my previous gig but don't have a need for it currently. Just curious.
mihevc•2mo ago
How does this compare to https://github.com/Query-farm/tributary ?
dm03514•2mo ago
Oh yes!! I've seen this a couple times. I am far from an expert in tributary so please take with a grain of salt.

Based on the tributary documentation, I understand that tributary embeds kafka consumers into duckdb. This makes duckdb the main process that you run to perform consumption. I think that this makes creating stream processing POCs very accessible. It looks like it is quite easy to start streaming data into duckdb. What I don't see is a full story around Devops, operations, testing, configuration as code etc.

SQLFlow is a service that embeds DuckDB as the storage and processing brains. Because of this, we're able to offer metrics, testing utilities, pipelines as code, and all the other DevOps utilities that are necessary to run a huge number of streaming instances 24x7. SQLFlow was created as a tool that I wish I had to for simple stream processing in production in high availability contexts :)

mihevc•2mo ago
Nice! Thanks for the context, it's great to know!
rustyconover•2mo ago
The next major release of Tributary will support Avro, Protobuf and JSON along with the Schema Registry it will also bring the ability to write to Kafka with transactions.

But really you should get excited for DuckDB Labs to build out materialized views. Materialized views where you can ingest more streaming data to update aggregates. This way you could just keep pushing rows through aggregates from Kafka.

It is going to be a POWER HOUSE for streaming analytics.

Contact DuckDB Labs if you want to sponsor the work on materialized views: https://duckdb.org/roadmap

buremba•2mo ago
Exactly. I have also been playing with DuckDB for streaming use cases, but it feels hacky to issue micro-batching queries on streaming data in short intervals.

DuckDB has everything that streaming engines such as Flink have; it just needs to support managing intermediate aggregate states and scheduling the materialized views itself.

trueno•2mo ago
Is this to be used in an analytics application backend sort of scenario?

I am familiar with materialized views / dynamic tables from enterprise-grade cloud lake type offerings, but I've never quite understood where duckdb, though impressive, fits into everyones use case. I've toyed with it for personal things, it's very cool having a local instance of something akin to snowflake when it comes to processing and aggregating on Big Data™ but generally I don't see it used in operational settings. For application development people are generally tied to sqlite and postgres.

It all does seem really cool though, I guess I'm just not feeling creative enough to conjure up a stream-to-duckdb use case. Feel free to bombard me with cool ideas.

itsfseven•2mo ago
It would be great if this supported Pulsar too!
pulkitsh1234•2mo ago
(not an expert in stream processing).. from the docs here https://sql-flow.com/docs/introduction/basics#output-sink it seems like this works on "batches" of data, how is this different from batch processing ? Where is the "stream" here ?
dm03514•2mo ago
Ha Yes! A pipeline assumes a "batch" of data, which is backed by an ephemeral duckdb in memory table. The goal is to provide SQL table semantics and implement pipelines in a way where the batch size can be toggled without a change to the pipeline logic.

The stream is achieved by the continuous flow of data from Kafka.

SQLFlow exposes a variable for batch size. Setting the batch size to 1 will make it so SQLFlow reads a kafka message, applies the processor SQL logic and then ensures it successfully commits the SQL results to the sink, one after another.

SQLFlow provides at least once delivery guarantees. It will only commit the source message once it successfully writes to the pipeline output (sink).

https://sql-flow.com/docs/operations/handling-errors

The batch table is just a convention which allows for seamless batch size configuration. If your throughput is low, or if you require message by message processing, SQLFlow can be toggled to a batch of 1. If you need higher throughput and can tolerate the latency, then the batch can be toggled higher.