Show HN: Rocky – Rust SQL engine with branches, replay, column lineage

https://github.com/rocky-data/rocky
54•hugocorreia90•18h ago
Hi HN, I'm Hugo. I've been building Rocky over the past month, shipping fast in the open. The binary is on GitHub Releases, `dagster-rocky` on PyPI, and the VS Code extension on the Marketplace. I held off on a broader announcement until the trust-system surface was coherent enough to talk about as one thing. The governance waveplan — column classification, per-env masking, 8-field audit trail on every run, `rocky compliance` rollup, role-graph reconciliation, retention policies — landed end-to-end last week in engine-v1.16.0 and rounded out in v1.17.4 (tagged 2026-04-26). That's the milestone I'd been waiting for.

The pitch: keep Databricks or Snowflake. Bring Rocky for the DAG. Rocky is a Rust-based control plane for warehouse pipelines. Storage and compute stay with your warehouse. Rocky owns the graph — dependencies, compile-time types, drift, incremental logic, cost, lineage, governance. The things your current stack can't give you because it doesn't own the DAG.

A few things I think are interesting:

- Branches + replay. `rocky branch create stg` gives you a logical copy of a pipeline's tables (schema-prefix today; native Delta SHALLOW CLONE and Snowflake zero-copy are next). `rocky replay <run_id>` reconstructs which SQL ran against which inputs. Git-grade workflow on a warehouse.

- Column-level lineage from the compiler, not a post-hoc graph crawl. The type checker traces columns through joins, CTEs, and windows. VS Code surfaces it inline via LSP.

- Governance as a first-class surface. Column classification tags plus per-env masking policies, applied to the warehouse via Unity Catalog (Databricks) or masking policies (Snowflake). 8-field audit trail on every run. `rocky compliance` rollup that CI can gate on. Role-graph reconciliation via SCIM + per-catalog GRANT. Retention policies with a warehouse-side drift probe.

- Cost attribution. Every run produces per-model cost (bytes, duration). `[budget]` blocks in `rocky.toml`; breaches fire a `budget_breach` hook event.

- Compile-time portability + blast radius. Dialect-divergence lint across Databricks / Snowflake / BigQuery / DuckDB (12 constructs). `SELECT *` downstream-impact lint.

- Schema-grounded AI. Generated SQL goes through the compiler — AI suggestions type-check before they can land.
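
To make the cost-attribution bullet concrete, here is a minimal sketch of how per-model cost could be checked against a budget to produce `budget_breach` events. The class names, field names, and event payload are assumptions for illustration; they are not Rocky's actual `[budget]` schema or hook format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelCost:
    """Per-model cost for one run (hypothetical shape)."""
    model: str
    bytes_scanned: int
    duration_s: float


@dataclass(frozen=True)
class Budget:
    """Run limits (hypothetical; not Rocky's real `[budget]` keys)."""
    max_bytes: int
    max_duration_s: float


def budget_breaches(costs, budget):
    """Return one event dict per limit that a model exceeds."""
    events = []
    for cost in costs:
        checks = [
            ("bytes", cost.bytes_scanned, budget.max_bytes),
            ("duration_s", cost.duration_s, budget.max_duration_s),
        ]
        for metric, observed, limit in checks:
            if observed > limit:
                events.append({
                    "event": "budget_breach",  # event name taken from the post
                    "model": cost.model,
                    "metric": metric,
                    "observed": observed,
                    "limit": limit,
                })
    return events


# Made-up numbers, purely illustrative.
run_costs = [
    ModelCost("stg_orders", 2_000_000, 4.0),
    ModelCost("fct_revenue", 9_000_000, 12.5),
]
breaches = budget_breaches(run_costs, Budget(max_bytes=5_000_000, max_duration_s=10.0))
for b in breaches:
    print(b["model"], b["metric"], b["observed"], ">", b["limit"])
```

In this sketch a model that exceeds both limits yields two events, one per metric, so a hook consumer could alert on bytes and duration separately.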

What Rocky isn't:

- Not a warehouse — it's the control plane on top.

- Not a Fivetran replacement. `rocky load` handles files (CSV/Parquet/JSONL); for SaaS sources use Fivetran, Airbyte, or warehouse-native CDC.

- Not dbt Cloud — no hosted UI, no managed scheduler. First-class Dagster integration if you need orchestration.

Adapters: Databricks (GA), Snowflake (Beta), BigQuery (Beta), DuckDB (local dev / playground). Apache 2.0.

I'd love feedback on the trust-system framing, the governance surface (particularly classification-to-masking resolution in `rocky compile` and the `rocky compliance` CI gate), the branches/replay design, the cost-attribution primitives, or anything else that catches your eye. Happy to go deep in the thread.
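
For readers unsure what classification-to-masking resolution might involve, here is a minimal sketch under stated assumptions: columns carry classification tags, each environment maps tags to a masking strategy, and when a column has several tags the strictest strategy wins. The tag names, strategy names, and the strictest-wins rule are all hypothetical and are not taken from `rocky compile`.

```python
# Hypothetical per-environment policies: classification tag -> masking strategy.
POLICIES = {
    "prod": {"pii": "hash", "secret": "redact"},
    "dev": {"pii": "redact", "secret": "redact"},
}

# Strictest first; used to break ties when a column carries several tags.
STRICTNESS = ["redact", "hash", "none"]


def resolve_masking(column_tags, env, policies=POLICIES):
    """Map each column to the strictest strategy implied by its tags in `env`."""
    env_policy = policies[env]
    resolved = {}
    for column, tags in column_tags.items():
        # Untagged columns, and tags with no policy in this env, stay unmasked.
        strategies = [env_policy.get(tag, "none") for tag in tags] or ["none"]
        resolved[column] = min(strategies, key=STRICTNESS.index)
    return resolved


columns = {
    "email": ["pii"],
    "api_key": ["secret", "pii"],
    "order_total": [],
}
prod = resolve_masking(columns, "prod")
dev = resolve_masking(columns, "dev")
print(prod)
print(dev)
```

Resolving at compile time, as the post describes, would let a check like this fail a build before any policy reaches the warehouse; the sketch only shows the lookup step.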

Comments

mergisi•1h ago
* * *
hasyimibhar•1h ago
Looks cool. I've been waiting for someone to build this since the dbt and SQLMesh acquisitions. It would be great to have model versioning and support for ClickHouse SQL.
mollerhoj•4m ago
It's a bit confusing to claim "The things your current stack can't give you because it doesn't own the DAG" and use Databricks as your example: Databricks includes jobs and pipelines, so it very much owns the DAG, no?
