> No additional lock or requirements files are needed
Additional to what?
> Guaranteed reproducibility
Of what?
I probably need your project, but from the description alone I can't tell what it's for.
Having manually sifted through hundreds of randomly sampled notebooks, I feel I can confidently speak on the distribution of characteristics in them, at least up through a couple years ago.
1. Notebooks on GitHub are not necessarily a representative sample of notebooks at large. If the author is putting a notebook on GitHub, there's already a significant selection bias towards certain topics, despite notebooks being used, at least to some extent, in practically any discipline you can think of.
2. Notebooks in repositories that contain requirements.txt files are a minority and tend to be intended/cleaned up for sharing (itself not the norm).
3. What's more common is just a handful of !pip install lines at the top of the notebook.
4. Even more common is just some details on dependencies in an adjoining README.
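For contrast with the !pip install pattern above: tools in this space appear to build on PEP 723-style inline metadata, where the dependencies live in a comment block in the file itself. A sketch, assuming the standard PEP 723 format (package names are just examples):

```python
# /// script
# requires-python = ">=3.11"
# dependencies = [
#     "pandas",
#     "matplotlib",
# ]
# ///
```

The appeal is that this travels with the notebook/script itself, unlike an adjoining README or requirements.txt.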
A very non-trivial chunk of notebooks on GitHub are just copies of the "Hands On ML 3" textbook/exercise set. If my memory serves, there are tens of thousands of copies of that one repository, and the fork count shown by GitHub doesn't account for the plethora of copies that weren't made by forking.
I'm not sure which model fits best; I'll have to see how your juvio handles kernels in Jupyter. Does the kernel name change, is it all the default kernel, and what changes when an install happens?
I'm not quite sure what you mean by cleaner git diffs, but hopefully that will become clear with experimentation.
For my particular method of working, I've mostly switched to having each small project (roughly a JIRA ticket) be a separate uv-managed project in a git repo, and I create a kernel for each of the uv projects. This allows me to examine multiple different tickets and kernels without having to launch multiple jupyter labs.
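The per-ticket setup described above can be sketched with standard uv and ipykernel commands (the project name here is hypothetical):

```shell
uv init ticket-1234 && cd ticket-1234
uv add pandas ipykernel        # project deps, plus ipykernel so it can back a kernel
uv run python -m ipykernel install --user --name ticket-1234
```

After the install step, the new kernel shows up in the kernel picker of an already-running JupyterLab, which is what makes the one-lab-many-projects workflow possible.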
The whole kernel<->venv mapping is another layer of massive complexity on top of the already huge complexity of Python packaging. uv makes it fast, but it provides neither the "correct" route nor even a single route to managing venvs.
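For anyone unfamiliar with that mapping: a registered kernel is ultimately just a kernel.json file whose argv points at some interpreter, often one inside a venv. Roughly (paths hypothetical):

```json
{
  "argv": ["/home/me/ticket-1234/.venv/bin/python",
           "-m", "ipykernel_launcher",
           "-f", "{connection_file}"],
  "display_name": "ticket-1234",
  "language": "python"
}
```

Nothing enforces that the venv still exists or still has the packages the notebook needs, which is exactly where the complexity leaks through.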
This should be your primary selling point!
Could be fantastic for my use-case. We have a large repo of notebooks that are generally write-once, sometimes-re-run. Having a separate repo/venv/kernel per notebook would be a big overhead, so I currently just try to do a kind of semantic versioning where I make a new kernel on something like a 6-month cadence and try to keep breaking dependency changes to a minimum in that window. I can then keep around the old kernels for running old notebooks that depend on them. This is not at all an ideal state.
Thanks for sharing!