Show HN: I built a visual, MLOps tool (Skyulf)

https://www.skyulf.com/
2•flyingriverhrse•1d ago
Hi HN,

I built Skyulf because I kept running into two specific problems that existing tools (like MLflow or standard scikit-learn pipelines) didn't quite solve for me: silent data leakage and monolithic pickles.

## The Problems

1. Data Leakage is Silent: You compute mean imputation on the full dataset, then split, so test-set statistics leak into the training data. Your model looks great in dev but fails in production. It happens to the best of us.

2. Deployment Hell (The Pickle Problem): Standard pipelines pickle everything (data schema, logic, and third-party library versions) into one opaque blob. To run a simple inference, you need the same heavy environment used for training.
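To make the first problem concrete, here is a minimal sketch in plain Python (standard library only, not Skyulf code). The toy `ages` column and the fixed train/test cut are assumptions chosen for illustration; the point is only that the fill value changes depending on whether it is computed before or after the split.

```python
from statistics import mean

# Toy "age" column; None marks a missing value. The last two rows play the
# role of the held-out test split and are deliberately skewed high.
ages = [22, 25, None, 31, 28, None, 70, 74]
train, test = ages[:6], ages[6:]


def observed(values):
    """Return only the non-missing values."""
    return [v for v in values if v is not None]


# Leaky: the fill value is computed before splitting, so it sees test rows.
leaky_fill = mean(observed(ages))

# Safe: the fill value is computed from the training split only.
safe_fill = mean(observed(train))

# Impute the training split both ways to compare.
train_leaky = [leaky_fill if v is None else v for v in train]
train_safe = [safe_fill if v is None else v for v in train]

print(f"leaky fill: {leaky_fill:.2f}, safe fill: {safe_fill:.2f}")
# prints "leaky fill: 41.67, safe fill: 26.50"
```

Nothing raises an error in the leaky version, which is why this kind of bug tends to go unnoticed until production.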

## The Solution: Distinct Calculator & Applier

Skyulf enforces a strict separation of concerns using a Calculator / Applier pattern (inspired by modern engine design).

1. Calculator (Fit): Consumes data (`X`, `y`), learns the state (means, vocabularies, coefficients), and outputs a lightweight, JSON-serializable Artifact.

2. Applier (Predict): A pure function. Consumes the Artifact + New Data -> Output.

Why this matters: You can train on a massive GPU cluster, save just the lightweight JSON artifacts (state), and run the Applier on a tiny CPU instance. The Applier is stateless.

3. Structural Leakage Prevention: We use a `SplitDataset` abstraction. Transformers receive train/test/val as a single object but are mathematically forced to compute statistics on `.train` only.

```python
from skyulf import SkyulfPipeline

config = {
    "preprocessing": [
        # Split happens FIRST. Leakage is structurally impossible.
        {"name": "split", "transformer": "TrainTestSplitter", "params": {"test_size": 0.2}},
        {"name": "impute_age", "transformer": "SimpleImputer", "params": {"columns": ["age"], "strategy": "mean"}},
        {"name": "scale_income", "transformer": "StandardScaler", "params": {"columns": ["income"]}},
    ],
    "modeling": {"type": "random_forest_classifier", "params": {"n_estimators": 100}},
}

pipeline = SkyulfPipeline(config)
pipeline.fit(df, target_column="target")
pipeline.save("model.pkl")
```
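The Calculator / Applier split described above can be sketched without any library. This is a rough illustration under stated assumptions, not Skyulf's actual API: the `SplitDataset` dataclass here is a stand-in, and `calculate_mean_imputer` and `apply_mean_imputer` are hypothetical names invented for the example.

```python
import json
from dataclasses import dataclass
from statistics import mean


@dataclass
class SplitDataset:
    """Hypothetical stand-in for a train/test container (not Skyulf's class)."""
    train: list
    test: list


def calculate_mean_imputer(data: SplitDataset) -> dict:
    """Calculator: learn state from the training split only."""
    observed = [v for v in data.train if v is not None]
    return {"strategy": "mean", "fill_value": mean(observed)}


def apply_mean_imputer(artifact: dict, values: list) -> list:
    """Applier: a pure function of (artifact, new data)."""
    fill = artifact["fill_value"]
    return [fill if v is None else v for v in values]


data = SplitDataset(train=[20, None, 40], test=[None, 90])

# The artifact is plain JSON, so it can be stored and shipped on its own.
artifact_json = json.dumps(calculate_mean_imputer(data))

# Later, possibly in another process: reload the state and apply it.
restored = json.loads(artifact_json)
test_filled = apply_mean_imputer(restored, data.test)
```

Because the calculator only ever reads `data.train`, the test value 90 cannot influence the fill value, and the applier needs nothing beyond the small JSON artifact to run.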

## Features

1. Polars-First (~3.5x Faster): We migrated the core engine from Pandas to Polars. Lazy evaluation means we can scan generic CSV/Parquet files instantly for EDA.

2. One-Liner EDA: Generates a comprehensive profile (quality, outliers, VIF, causal graphs) in seconds.

```python
from skyulf.profiling.analyzer import EDAAnalyzer
from skyulf.profiling.visualizer import EDAVisualizer
import polars as pl

df = pl.read_csv("data.csv")
profile = EDAAnalyzer(df).analyze(target_col="churn")

viz = EDAVisualizer(profile, df)
viz.summary()  # Terminal dashboard
viz.plot()     # Matplotlib distributions & correlations
```

3. Visual ML Canvas (Local-First): A React-based drag-and-drop UI (running locally via FastAPI) that lets you visually debug pipelines. You can click any node to see data stats at that exact point in the pipeline.

## Why Another Tool?

- vs MLflow: We focus on the construction and execution of the pipeline, not just tracking the metrics.

- vs Scikit-learn Pipelines: We separate state (Artifacts) from logic (Appliers) and enforce leakage checks.

- vs Cloud Platforms: Skyulf is self-hosted. Your data never leaves your machine.

## Current Status

The library skyulf-core is stable on PyPI. The visual platform is functional but still being polished. I'm a solo dev building this in public.

I would love your feedback. If you find this interesting, a star on GitHub would mean a lot! I'm also looking for contributors: if you're into Python, React, or MLOps, check out the issues.

---

*Links*:

- Repo: https://github.com/flyingriverhorse/Skyulf
- PyPI: https://pypi.org/project/skyulf-core
- Docs: https://www.skyulf.com
