frontpage.
newsnewestaskshowjobs

Made with ♥ by @iamnishanth

Open Source @Github

fp.

Open in hackernews

Any pipeline tool for ClickHouse, similar to Snowflake's Dynamic Tables

https://www.snowflake.com/en/blog/reimagine-batch-streaming-data-pipelines/
2•tingfirst•4mo ago

Comments

tingfirst•4mo ago
Is there a native SQL pipeline tool for ClickHouse that processes real-time data incrementally, with low latency, large throughput and high efficiency, similar to Snowflake’s Dynamic Tables?

[1] Dynamic Tables: One of Snowflake’s Fastest-Adopted Features: https://www.snowflake.com/en/blog/reimagine-batch-streaming-...

Sep142324•4mo ago
Dynamic Tables are interesting for declarative streaming. In the ClickHouse ecosystem, you might want to look at materialized views combined with streaming engines.

For real-time transformations, there are a few approaches: - Native ClickHouse MaterializedViews with AggregatingMergeTree - Stream processors that write to ClickHouse (Flink, Spark Streaming) - Streaming SQL engines that can read/write ClickHouse

We've been working on streaming SQL at Proton (github.com/timeplus-io/proton) which handles similar use cases - continuous queries that maintain state and can write results back to ClickHouse. The key difference from Dynamic Tables is handling unbounded streams vs micro-batches.

What's your specific use case? Happy to discuss the tradeoffs.

tingfirst•4mo ago
Data sources are usually in Kafka, or other operational databases like Postgres or MySQL

1. Table A : fact events, high-throughput (10k~1M eps), high-cardinality

2. Table B, C, D : couple of dimension tables (fast or slow changing).

The use case is straightforward : join/enrich/lookup everything into one big flattened, analytics-friendly table into ClickHouse.

What’s the best pipeline approach to achieve this in real-time and efficiently?

tingfirst•4mo ago
Consistently we heard about ClickHouse has very limited materialized views that can't handle real-time pipeline fast efficiently enough. would love to see more comments here.
gangtao•4mo ago
there are some limitations as I know:

1. Insert Performance Degradation

Users frequently complain that materialized views significantly slow down insert performance, especially when having multiple MVs on a single table.

2. Streaming Data Patterns

This is critical for ClickHouse materialized views. Streaming data often arrives in frequent, small batches, but ClickHouse performs best when ingesting data in larger batches. The materialized views' transformation query runs synchronously within the INSERT transaction for every single batch, making the fixed overhead disproportionately large for small batches

3. Block-Level Processing Limitations

MVs in ClickHouse operate only on the data blocks being inserted at that moment. When performing aggregation, a single group from the original dataset may have multiple entries in the target table since grouping is applied only to the current insert block.

4. JOIN Limitations and Memory Issues

Materialized views with JOINs are problematic because MVs only trigger on the left-most table. It's also inefficient to update the view upon the right join table since it needs to recreate a hash table each time, or else keeping a large hash table and consuming a lot of memory.

5. Reprocessing historical data requires manual ALTER TABLE operations.

6. Each materialized view will create a new part from the block over which it runs - potentially causing the "Too Many Parts" issue

gangtao•4mo ago
you can check https://github.com/timeplus-io/proton which provides streaming processing pipeline.

AppSecMaster – Learn Application Security with hands on challenges

https://www.appsecmaster.net/en
1•aqeisi•8s ago•1 comments

Fibonacci Number Certificates

https://www.johndcook.com/blog/2026/02/05/fibonacci-certificate/
1•y1n0•1m ago•0 comments

AI Overviews are killing the web search, and there's nothing we can do about it

https://www.neowin.net/editorials/ai-overviews-are-killing-the-web-search-and-theres-nothing-we-c...
2•bundie•6m ago•0 comments

City skylines need an upgrade in the face of climate stress

https://theconversation.com/city-skylines-need-an-upgrade-in-the-face-of-climate-stress-267763
3•gnabgib•7m ago•0 comments

1979: The Model World of Robert Symes [video]

https://www.youtube.com/watch?v=HmDxmxhrGDc
1•xqcgrek2•11m ago•0 comments

Satellites Have a Lot of Room

https://www.johndcook.com/blog/2026/02/02/satellites-have-a-lot-of-room/
2•y1n0•12m ago•0 comments

1980s Farm Crisis

https://en.wikipedia.org/wiki/1980s_farm_crisis
3•calebhwin•13m ago•1 comments

Show HN: FSID - Identifier for files and directories (like ISBN for Books)

https://github.com/skorotkiewicz/fsid
1•modinfo•18m ago•0 comments

Show HN: Holy Grail: Open-Source Autonomous Development Agent

https://github.com/dakotalock/holygrailopensource
1•Moriarty2026•25m ago•1 comments

Show HN: Minecraft Creeper meets 90s Tamagotchi

https://github.com/danielbrendel/krepagotchi-game
1•foxiel•32m ago•1 comments

Show HN: Termiteam – Control center for multiple AI agent terminals

https://github.com/NetanelBaruch/termiteam
1•Netanelbaruch•32m ago•0 comments

The only U.S. particle collider shuts down

https://www.sciencenews.org/article/particle-collider-shuts-down-brookhaven
2•rolph•35m ago•1 comments

Ask HN: Why do purchased B2B email lists still have such poor deliverability?

1•solarisos•35m ago•2 comments

Show HN: Remotion directory (videos and prompts)

https://www.remotion.directory/
1•rokbenko•37m ago•0 comments

Portable C Compiler

https://en.wikipedia.org/wiki/Portable_C_Compiler
2•guerrilla•39m ago•0 comments

Show HN: Kokki – A "Dual-Core" System Prompt to Reduce LLM Hallucinations

1•Ginsabo•40m ago•0 comments

Software Engineering Transformation 2026

https://mfranc.com/blog/ai-2026/
1•michal-franc•41m ago•0 comments

Microsoft purges Win11 printer drivers, devices on borrowed time

https://www.tomshardware.com/peripherals/printers/microsoft-stops-distrubitng-legacy-v3-and-v4-pr...
3•rolph•42m ago•1 comments

Lunch with the FT: Tarek Mansour

https://www.ft.com/content/a4cebf4c-c26c-48bb-82c8-5701d8256282
2•hhs•45m ago•0 comments

Old Mexico and her lost provinces (1883)

https://www.gutenberg.org/cache/epub/77881/pg77881-images.html
1•petethomas•48m ago•0 comments

'AI' is a dick move, redux

https://www.baldurbjarnason.com/notes/2026/note-on-debating-llm-fans/
5•cratermoon•49m ago•0 comments

The source code was the moat. But not anymore

https://philipotoole.com/the-source-code-was-the-moat-no-longer/
1•otoolep•50m ago•0 comments

Does anyone else feel like their inbox has become their job?

1•cfata•50m ago•1 comments

An AI model that can read and diagnose a brain MRI in seconds

https://www.michiganmedicine.org/health-lab/ai-model-can-read-and-diagnose-brain-mri-seconds
2•hhs•53m ago•0 comments

Dev with 5 of experience switched to Rails, what should I be careful about?

2•vampiregrey•55m ago•0 comments

AlphaFace: High Fidelity and Real-Time Face Swapper Robust to Facial Pose

https://arxiv.org/abs/2601.16429
1•PaulHoule•56m ago•0 comments

Scientists discover “levitating” time crystals that you can hold in your hand

https://www.nyu.edu/about/news-publications/news/2026/february/scientists-discover--levitating--t...
3•hhs•58m ago•0 comments

Rammstein – Deutschland (C64 Cover, Real SID, 8-bit – 2019) [video]

https://www.youtube.com/watch?v=3VReIuv1GFo
1•erickhill•59m ago•0 comments

Tell HN: Yet Another Round of Zendesk Spam

5•Philpax•59m ago•1 comments

Postgres Message Queue (PGMQ)

https://github.com/pgmq/pgmq
1•Lwrless•1h ago•0 comments