Show HN: Public Apache Iceberg datasets via a REST catalog

https://opensource.googleblog.com/2026/01/explore-public-datasets-with-apache-iceberg-and-biglake.html

13•talatuyarer•3w ago

Hi HN,

I’m one of the creators of this project.

We noticed that while many developers want to experiment with Apache Iceberg, the "entry cost" is often high. You usually have to set up your own storage buckets, configure a catalog (like Hive), and ingest data before you can run a single SELECT statement.

We wanted to lower that barrier. We’ve hosted a production-grade Iceberg REST Catalog on BigLake with public datasets (starting with the NYC Taxi data) that anyone can query.

You can point Spark, Trino, or Flink directly at the REST endpoint and start querying immediately.

You do need a Google Cloud Project ID for authentication/quota, but the data access itself is free and public.
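
As a rough sketch of what pointing Spark at the REST endpoint could look like, the helper below only assembles the catalog properties that Iceberg's Spark integration reads. The helper name, the catalog name `public_iceberg`, and the placeholder URI, warehouse, and project values are assumptions for illustration; the real values are in the linked blog post. Passing the project ID through an `x-goog-user-project` header is also an assumption about how this catalog handles quota, and credential settings are left out entirely.

```python
def spark_rest_catalog_conf(catalog_name, uri, warehouse, project_id):
    """Build Spark properties for an Iceberg REST catalog (illustrative helper)."""
    prefix = f"spark.sql.catalog.{catalog_name}"
    return {
        # Register the catalog with Iceberg's Spark catalog implementation.
        prefix: "org.apache.iceberg.spark.SparkCatalog",
        # Use the REST catalog client rather than Hive, Hadoop, etc.
        f"{prefix}.type": "rest",
        f"{prefix}.uri": uri,
        f"{prefix}.warehouse": warehouse,
        # Assumption: the project ID is sent as an HTTP header on each request.
        f"{prefix}.header.x-goog-user-project": project_id,
    }


# Placeholders only: substitute the values from the blog post.
conf = spark_rest_catalog_conf(
    catalog_name="public_iceberg",
    uri="<REST_CATALOG_URI>",
    warehouse="<WAREHOUSE>",
    project_id="<YOUR_PROJECT_ID>",
)

# Print the properties as spark-sql / spark-submit flags.
for key, value in conf.items():
    print(f"--conf {key}={value}")
```

The same dictionary can be fed to `SparkSession.builder.config(key, value)` in a loop if you prefer to configure the session from code.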

I’d love to hear your thoughts. Are there specific datasets or Iceberg features you’d like to see added?

Comments

kamaci•3w ago

Nice way to lower the barrier to trying Iceberg!

Do you have any plans or timeline for supporting the Iceberg v3 spec?

talatuyarer•3w ago

Thank you.

Yes, we plan to publish datasets that use Iceberg v3 spec features such as Variant and Deletion Vectors. I can update this comment when we have a release date.

mustafaulu•3w ago

This looks really useful. Is it possible to access the REST catalog and query the datasets directly from Python?

If so, do you have a minimal Python example?

talatuyarer•3w ago

I hope this helps:

https://gist.github.com/talatuyarer/02568a38a7630434556e7dc1...
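
For readers who can't open the gist, here is a minimal sketch of what a Python client could look like with PyIceberg. This is not the gist's content. The function names, the catalog name `public`, and every placeholder value are assumptions, as is the idea that the project ID travels in an `x-goog-user-project` header and that a bearer token is accepted through PyIceberg's `token` property.

```python
def rest_catalog_properties(uri, warehouse, project_id, token=None):
    """Assemble PyIceberg properties for a REST catalog (illustrative helper)."""
    props = {
        "type": "rest",
        "uri": uri,
        "warehouse": warehouse,
        # Assumption: the quota/billing project is passed as an HTTP header.
        "header.x-goog-user-project": project_id,
    }
    if token:
        # Assumption: an OAuth access token is sent as a bearer token.
        props["token"] = token
    return props


def preview_table(properties, table_name, limit=10):
    """Load a table from the catalog and return its first rows as a PyArrow table."""
    # Imported lazily so the helper above works without PyIceberg installed.
    from pyiceberg.catalog import load_catalog  # pip install "pyiceberg[pyarrow]"

    catalog = load_catalog("public", **properties)
    table = catalog.load_table(table_name)
    return table.scan(limit=limit).to_arrow()


# Placeholders only: substitute the values from the blog post.
props = rest_catalog_properties(
    uri="<REST_CATALOG_URI>",
    warehouse="<WAREHOUSE>",
    project_id="<YOUR_PROJECT_ID>",
)

# With real values filled in, something like this would fetch a few rows:
# rows = preview_table(props, "<namespace>.<table>", limit=5)
# print(rows)
```

The properties are built separately from the catalog call so you can inspect or log them before making any network request.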

hakantaymaz•3w ago

Nice work, this really does lower the barrier to actually trying Iceberg instead of just reading about it. Looking forward to poking at it with Trino/Spark.

sakalsiz•3w ago

You should combine this with a Trino Docker config.
