A Python-first data lakehouse

https://www.bauplanlabs.com/blog/everything-as-python
65•akshayka•2d ago

Comments

flakiness•3h ago
There have been so many "better notebook" implementations over the years that I cannot keep up. Which are the promising ones? Is this "marimo" one of them, or rather a newcomer?
simonw•3h ago
Marimo is very impressive. It's effectively a cross between Jupyter and https://observablehq.com/: it adds "reactivity", which solves the problem that Jupyter cells can be run in any order, making the behavior of a notebook unpredictable. In Marimo (and Observable), updating a cell automatically triggers the dependent cells to re-execute, similar to a spreadsheet.
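
To make the spreadsheet analogy concrete, here is a minimal sketch of reactive re-execution in plain Python. It is a toy, not marimo's implementation or API: `ToyNotebook`, `define`, and the cell names are hypothetical, each "cell" defines exactly one variable, and the dependency graph is assumed to be acyclic with inputs defined before the cells that use them.

```python
class ToyNotebook:
    """Toy reactive notebook: each cell computes one named value from named inputs."""

    def __init__(self):
        self.cells = {}    # cell name -> (function, list of input names)
        self.values = {}   # cell name -> last computed value
        self.run_log = []  # order in which cells were executed

    def define(self, name, fn, deps=()):
        """Create or edit a cell, run it, then re-run every cell downstream of it."""
        self.cells[name] = (fn, list(deps))
        self._run(name)
        for dependent in self._descendants(name):
            self._run(dependent)

    def _run(self, name):
        fn, deps = self.cells[name]
        self.values[name] = fn(*(self.values[d] for d in deps))
        self.run_log.append(name)

    def _descendants(self, name):
        """Direct and indirect dependents of `name`, in a valid execution order."""
        order, seen = [], set()

        def visit(current):
            for other, (_, deps) in self.cells.items():
                if current in deps and other not in seen:
                    seen.add(other)
                    visit(other)
                    order.append(other)

        visit(name)
        # Reversed DFS post-order is a topological order for an acyclic graph.
        return order[::-1]


nb = ToyNotebook()
nb.define("price", lambda: 10)
nb.define("qty", lambda: 3)
nb.define("total", lambda price, qty: price * qty, deps=["price", "qty"])

nb.define("price", lambda: 20)  # editing one cell...
print(nb.values["total"])       # ...refreshes the dependent cell: prints 60
```

The point of the sketch is that `total` is never re-run by hand: editing `price` is enough, so the values on screen cannot drift out of sync with the code that produced them.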

Marimo is pretty new (first release January 2025) but has a high rate of improvement. It's particularly good for WebAssembly stuff - that's been one of their key features almost from the start.

My notes on it so far are here: https://simonwillison.net/tags/marimo/

lvl155•2h ago
I think it’s safe to say Observable’s inability to properly price their services made people look elsewhere. Their new offering is interesting but also ridiculously priced.
ayhanfuat•22m ago
I was also wondering about their pricing because Canvas seemed so cool at first. Now that I've seen your comment I checked, and $900/month (includes 10 users) is indeed very high. I guess they are primarily targeting big enterprises.
akshayka•2h ago
Thanks Simon for the kind words!

For those new to marimo, we have affordances for working with expensive (ML/AI/pyspark) notebooks too, including lazy execution that gives you guarantees on state without running automatically.
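
As a rough illustration of the lazy idea described above, the toy below marks downstream cells as stale when an input changes and only recomputes them when their value is requested. It is a sketch under stated assumptions, not marimo's implementation or API: `LazyNotebook`, `define`, `get`, and the `train` stand-in are all hypothetical names, and the dependency graph is assumed to be acyclic.

```python
class LazyNotebook:
    """Toy lazy notebook: edits mark dependents stale instead of running them."""

    def __init__(self):
        self.cells = {}     # cell name -> (function, list of input names)
        self.values = {}    # cell name -> last computed value
        self.stale = set()  # cells whose stored value may be out of date

    def define(self, name, fn, deps=()):
        """Create or edit a cell without running anything."""
        self.cells[name] = (fn, list(deps))
        self._mark_stale(name)

    def _mark_stale(self, name):
        self.stale.add(name)
        for other, (_, deps) in self.cells.items():
            if name in deps and other not in self.stale:
                self._mark_stale(other)

    def get(self, name):
        """Return an up-to-date value, recomputing stale cells on demand."""
        if name in self.stale:
            fn, deps = self.cells[name]
            self.values[name] = fn(*(self.get(d) for d in deps))
            self.stale.discard(name)
        return self.values[name]


calls = []  # records every execution of the expensive step


def train(rows):
    calls.append(rows)  # stand-in for an expensive ML/Spark job
    return rows * 2


nb = LazyNotebook()
nb.define("rows", lambda: 1000)
nb.define("model", train, deps=["rows"])
first = nb.get("model")           # runs train once

nb.define("rows", lambda: 5000)   # edit an input: nothing is re-run yet
runs_after_edit = len(calls)      # still 1
is_stale = "model" in nb.stale    # True: the notebook knows the value is outdated

second = nb.get("model")          # the expensive step runs only when asked for
```

The trade-off in this sketch is that nothing expensive runs by surprise, yet an outdated result is never silently presented as current, because the stale set records exactly which cells need to be re-run.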

One small note: marimo was actually first launched publicly (on HN) in January 2024 [1]. Our first open-source release was in 2023 (a quiet soft launch). And we've been in development since 2022, in close consultation with Stanford scientists. We're used pretty broadly today :)

[1] https://news.ycombinator.com/item?id=38971966

theLiminator•3h ago
I personally really like marimo. It's very easy to use and for data analysis type tasks it seems to work a lot better than jupyter in most cases.
cantdutchthis•2h ago
marimo is open source and uses a reactive model which makes it fun to mix/match widgets with Python code. It even supports gamepads if you wanted to go nuts!

https://youtu.be/4fXLB5_F2rg?si=jeUj77Cte3TkQ1j-

disclaimer: I work for marimo and I made that video, but the gamepad support is awesome and really shows the flexibility

Snakes3727•2h ago
One of the most critical aspects of a lakehouse is protecting data for security and compliance reasons, and this article completely glosses over it, which makes me really uncomfortable.
jtagliabuetooso•2h ago
Thanks for the feedback. Bauplan actually features a few innovative points in this area, and fully Pythonic at that: Git for Data (https://docs.bauplanlabs.com/en/latest/concepts/git_for_data...) to sandbox any data change, tag it for compliance and make it queryable; full code and data auditability in one command (AFAIK, the only platform offering this), as every change is automatically versioned and tagged with the exact run and code that produced it (https://docs.bauplanlabs.com/en/latest/concepts/git_for_data...).

Our sandbox with public data is free for you to try, or just reach out and ask any question!

jtagliabuetooso•2h ago
Hey, founder of Bauplan here. Happy to field any questions or thoughts. Yes, marimo is great, and it's the only way to work within a real Python ecosystem for production use cases shipping proper code.
waffletower•1h ago
Rolling a notebook out to a service rapidly is an attractive idea -- but, as mentioned, it has security implications -- and I can add that there are also a host of monitoring implications: service quality & continuity, model quality, etc.
jtagliabuetooso•1h ago
You mean on the data side? Data access in the example (and in the real world) is mediated by a production-grade, Iceberg-compatible catalog, sandboxed changes, and a full audit trail (https://docs.bauplanlabs.com/en/latest/concepts/git_for_data...). Or do you mean something else?
waffletower•1h ago
I don't think Python is always the best-suited language for managing models and agents, but it certainly is the most popular and has the largest choice of related libraries. "Python first" or "pythonic" invites skepticism from me.
davistreybig•1h ago
Huge fan of Marimo - fixes so many of the annoying problems w/ notebooks
blooalien•34m ago
I find Marimo best for when you're trying to build something "app-like": an interactive tool to perform a specific task. I find JupyterLab more appropriate for random experimentation and exploration, and for documenting your learnings. Each absolutely has its place in the toolbox, and does its thing well, but for me at least, there's not much overlap between the two other than the cell-based, notebook-like similarity. That similarity works well for me when migrating from exploration mode to app design mode. The familiar interface makes it easy for me to take ideas from Jupyter into Marimo to build out a proper application.
