OLake just added Kafka as a source, allowing data from Kafka topics to be written directly into Apache Iceberg tables (open source, no proprietary format).
Why this was added:
Many teams today land Kafka data in a warehouse or a custom storage layer first, then rewrite it into Iceberg for analytics or AI workloads. That extra hop adds latency, cost, and operational complexity.
With OLake:
- Kafka -> Iceberg is a single step
- Tables are standard Iceberg (queryable by Spark, Trino, Presto, Athena, etc.)
- Supports schema evolution and high-throughput ingestion
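To make the "standard Iceberg" point concrete: any engine with an Iceberg catalog pointed at the same warehouse can read the tables, with no OLake-specific reader. A minimal Spark configuration sketch — the catalog name (`lake`), warehouse path, and table name below are placeholders, not OLake settings:

```properties
# spark-defaults.conf (sketch): register an Iceberg catalog named "lake"
spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions
spark.sql.catalog.lake=org.apache.iceberg.spark.SparkCatalog
spark.sql.catalog.lake.type=hadoop
spark.sql.catalog.lake.warehouse=s3://my-bucket/warehouse
```

With that in place, a hypothetical table written from a Kafka topic is just `SELECT * FROM lake.db.events` in spark-sql; Trino, Presto, or Athena would see the same table through their own Iceberg catalog configuration.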
This is early and we’re actively looking for feedback from teams running Kafka at scale or experimenting with Iceberg-based lakehouses.
Happy to answer questions or discuss trade-offs.