Show HN: Neurox – GPU Observability for AI Infra

https://github.com/neuroxhq/helm-chart-neurox-control
25•leeab•9mo ago

Comments

leeab•9mo ago
GPU observability is broken, so we built Neurox.

When I co-founded Mezmo (a Series D observability platform), we obsessed over logs, metrics, and traces. I learned firsthand how critical app-level observability is for DevOps: cutting through logging noise and finding the needle in the haystack is everything.

But after diving into AI infra, I noticed a huge gap: GPU monitoring in multi-cloud environments is woefully insufficient.

Despite companies throwing billions at GPUs, there's no easy way to answer basic questions:

- What's happening with my GPUs?

- Who's using them?

- How much is this project costing me?

What's happening: Metrics (like DCGM_FI_DEV_GPU_UTIL) told us what was happening, but not why. Underutilized GPUs? Maybe the pod is crashlooping, stuck pulling an image, or misconfigured, or the application is simply not using the GPU.

Who's using the compute: Kubernetes metadata such as namespace or pod name gave us the missing link. We traced issues like failed pod states, incorrect scheduling, and even PyTorch jobs silently falling back to CPU.
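To make the "what" vs. "why" point concrete, here is a minimal sketch of joining a utilization reading with pod state. It is not how Neurox does it; the 5% threshold, the record layout (`gpu_id`, `status`), and the function name are all hypothetical, and a real setup would fill these in from DCGM metrics and the Kubernetes API.

```python
IDLE_THRESHOLD_PCT = 5  # hypothetical cutoff for calling a GPU "idle"


def explain_idle_gpus(gpu_util, pods):
    """Return {gpu_id: reason} for every GPU below the idle threshold.

    gpu_util: {gpu_id: utilization percent}, e.g. DCGM_FI_DEV_GPU_UTIL values
    pods: list of dicts with hypothetical keys
          'namespace', 'name', 'gpu_id', 'status'
    """
    pods_by_gpu = {pod["gpu_id"]: pod for pod in pods}
    reasons = {}
    for gpu_id, util in gpu_util.items():
        if util >= IDLE_THRESHOLD_PCT:
            continue  # busy GPU, nothing to explain
        pod = pods_by_gpu.get(gpu_id)
        if pod is None:
            reasons[gpu_id] = "no pod scheduled"
        elif pod["status"] != "Running":
            # e.g. CrashLoopBackOff or ImagePullBackOff
            reasons[gpu_id] = f"{pod['namespace']}/{pod['name']} is {pod['status']}"
        else:
            # Pod is healthy but the GPU is idle, e.g. a silent CPU fallback
            reasons[gpu_id] = f"{pod['namespace']}/{pod['name']} is running but not using the GPU"
    return reasons


util = {"node1/gpu0": 92, "node1/gpu1": 0, "node2/gpu0": 1, "node2/gpu1": 0}
pods = [
    {"namespace": "ml", "name": "train-0", "gpu_id": "node1/gpu0", "status": "Running"},
    {"namespace": "ml", "name": "train-1", "gpu_id": "node1/gpu1", "status": "CrashLoopBackOff"},
    {"namespace": "nlp", "name": "eval-0", "gpu_id": "node2/gpu0", "status": "Running"},
]
for gpu_id, reason in explain_idle_gpus(util, pods).items():
    print(gpu_id, "->", reason)
```

The metric alone would only show three GPUs near 0%; the pod state is what separates a crashlooping pod from an unallocated card or a job that never touched the GPU.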

How much is this gonna cost: Calculating cost isn't easy either. If you're renting, you need GPU-time per pod and cloud billing data. If you're on-prem, you'll want power usage + rate cards. Neither comes from a metrics dashboard.
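A rough sketch of the two cost cases, just to show the arithmetic. The function names, the $2.50/hour rate, and the power and electricity figures are made-up placeholders, not real billing data or anything from Neurox.

```python
def rented_cost(gpu_seconds_by_pod, hourly_rate_usd):
    """Renting: cost per pod = GPU-hours used * hourly GPU rate."""
    return {
        pod: round(seconds / 3600 * hourly_rate_usd, 2)
        for pod, seconds in gpu_seconds_by_pod.items()
    }


def on_prem_energy_cost(avg_power_watts, hours, usd_per_kwh):
    """On-prem: energy cost = kWh consumed * electricity rate."""
    return round(avg_power_watts / 1000 * hours * usd_per_kwh, 2)


# Hypothetical inputs: GPU-seconds per pod, and one card's average power draw
usage = {"team-a/train-0": 7200, "team-b/infer-1": 1800}
print(rented_cost(usage, hourly_rate_usd=2.50))
print(on_prem_energy_cost(avg_power_watts=400, hours=24, usd_per_kwh=0.12))
```

The hard part in practice is getting the inputs (per-pod GPU time, billing data, rate cards), which is why a metrics dashboard alone doesn't answer the question.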

---

Most teams are duct-taping scripts to Prometheus, Grafana, and kubectl.

So we built Neurox, a purpose-built GPU observability platform for Kubernetes-native, multi-cloud AI infrastructure. Think:

1. Real-time GPU utilization and alerts for idle GPUs

2. Cost breakdowns per app/team/project and finops integration

3. Unified view across AWS, GCP, Azure, and on-prem

4. Kubernetes-aware: connect node metrics to running pods, jobs, and owners

5. GPU health checks

Everyone we talked to runs their compute in multi-cloud and uses Kubes as the unifier across all environments. Metrics alone aren't good enough. You gotta combine metrics with Kube state and financial data to see the whole picture.

Check us out and let us know what we're missing. Curious to hear from folks who've rolled their own: what did you do?

Lee @ Neurox

nickysielicki•9mo ago
> Everyone we talked to runs their compute in multi-cloud and uses Kubes as the unifier across all environments.

I categorically support any company willing to take a strong stance on the total irrelevance of slurm.

dharmab•9mo ago
Is your comment pro-SLURM or anti-SLURM?

I took a serious look at SLURM for my problem space and among my conclusions were:

- Hiring people who know Kubernetes is going to be far cheaper

- Kubernetes is gonna be way more compatible with popular o11y tooling

- SLURM's accounting is great if your billing model includes multiple government departments and universities each with their own grants and strict budgets, but is far more complex than needed by the typical tech company

- Writing a custom scheduler that outperforms kube-scheduler is far easier than dealing with SLURM in general

leeab•9mo ago
We're neither for nor against Slurm. I do believe it has use cases in HPC, scientific and academic settings. We think our web UI is a bit easier to use and we do offer a competing scheduler.

Our focus is definitely more on container-first, cloud-native Kubernetes environments like EKS, GKE, AKS. Also, we're much more focused on health monitoring of the actual GPU hardware than on just scheduling jobs.

firgrove•9mo ago
this feels like grafana with extra steps
leeab•9mo ago
Haha...there is some truth to that. We do use Prometheus under the hood to collect metrics. However, our thesis is that metrics alone aren't enough. We marry metrics + kube state + cost data to get the whole picture.

Also we're purpose built to monitor GPUs, so we have things like drilling down from a Kube cluster, down to GPU nodes, down to a GPU card.

freeatnet•9mo ago
Interesting! A friend recently asked me if I knew of any tools to improve GPU observability across their deployments (primarily for cost tracking purposes, I think), but he was looking for an OSS solution. Do you plan to open source this?
leeab•9mo ago
We have considered this and may go down this route in the future. One thing we asked ourselves was what open sourcing provides. Usually it's a desire for privacy or cost in the form of self-hosting, among other reasons.

Currently, our free version is self-hosted and monitors clusters with up to 64 GPUs. We feel this will work for many use cases, especially just to try it out. Monitoring GPUs typically requires you to deploy something where your GPUs live. Since you’re already installing software on your cluster, you might as well keep your data there too.

fustercluck•9mo ago
Your Github repo says you need 120 GB of persistent storage, but our bare metal GPU clusters only have local storage. Would like to try your thing, but hosting the data with the GPUs is a pretty big blocker for us.
leeab•9mo ago
Ahh yes...here's how you solve that. Just install the Neurox Control plane onto any regular Kubes cluster (it doesn't need GPUs, just persistent storage, e.g. EKS, AKS, GKE, etc.) without that last flag in the instructions: `--set workload.enabled=true` (<-- leave this out). More info: https://docs.neurox.com/installation/alternative-install-met...

Then on your GPU cluster w/o disk, you just need to install the Neurox Workload agent. In the Web Portal UI, click on Clusters > New Cluster and copy/paste the snippet there.

fustercluck•9mo ago
Oh sweet, I'll take a look. Thanks!
nicoslepicos•9mo ago
I've heard a few folks at events mention curiosity about stuff like this.

Given you decided to start self-hosted, are you planning on a cloud version in the next while too?

I'm curious also who you think is the right fit for this right now in terms of initial users

leeab•9mo ago
One of the reasons we went down the self-hosted route is to ensure that your data remains on your servers. Since our architecture allows for separation between where our control plane lives vs where GPU workloads run, we can definitely host the control plane portion for you. Then you just need to run our agent only on your GPU cluster. Shoot me an email: lee at neurox.com and we can discuss!
zekrioca•9mo ago
* Not open-source.
leeab•9mo ago
Someone asked about this earlier. We have considered this and may go down this route in the future. Was there something specific that you were looking for with open source? (e.g. privacy, cost, etc.)

Our solution is self-hosted and your data remains on your servers. And I think we do provide a fairly generous free limit of 64 GPUs.

28374654•9mo ago
Arjikh
badmonster•9mo ago
What metrics and Kubernetes runtime data does Neurox collect to provide its AI workload monitoring dashboards, and how customizable are these dashboards for different user roles like developers or finance auditors?
leeab•9mo ago
We collect a handful of metrics, but coming from our previous lives in DevOps, we collect only what's needed to avoid unnecessary metrics bloat.

The main 3 are:

- GPU runtime stats from NVIDIA smi

- Running pods from Kube state

- Node data & events from Kube state
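For anyone curious what the first item can look like in its simplest form, here is a sketch that parses the CSV output of `nvidia-smi`. This is only an illustration, not Neurox's collector; the field selection, the `parse_gpu_stats` name, and the sample text are assumptions, and it assumes every field is numeric (some GPUs report `[N/A]` for power draw, which this does not handle).

```python
import csv
import io

# Command you would run on a GPU node to get one CSV row per GPU:
QUERY = ("nvidia-smi --query-gpu=index,utilization.gpu,power.draw "
         "--format=csv,noheader,nounits")


def parse_gpu_stats(text):
    """Parse the CSV rows produced by QUERY into a list of dicts."""
    stats = []
    for row in csv.reader(io.StringIO(text), skipinitialspace=True):
        if not row:
            continue  # skip blank lines
        index, util, power = row
        stats.append({
            "index": int(index),
            "util_pct": float(util),
            "power_w": float(power),
        })
    return stats


# Illustrative sample text, not captured from a real machine
sample = "0, 35, 120.50\n1, 0, 45.10\n"
for gpu in parse_gpu_stats(sample):
    print(gpu)
```

Joining rows like these with running pods and node events from Kube state is what turns raw numbers into something a researcher or finance person can read.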

We have several screens with similar information intended for different roles. For example, the Workloads screen is mainly for researchers to monitor their workloads from creation to completion. The Reports screen shows mainly cost data grouped by team/project, etc.

mountainriver•9mo ago
Kubernetes is now kinda a bad abstraction for accelerated compute with the GPU shortages
leeab•9mo ago
Well, it depends on how many GPU clouds you're managing. We've talked to a bunch of companies, some startups, some enterprises and the main trend we found was the sheer number of companies with GPU clusters from multiple clouds...likely due to GPU shortage.

And yes, it's nowhere near 100%, but an overwhelming majority were running Kubes for GPU workloads...mainly so they'd have a unifying layer instead of managing each cloud separately and having to be proficient with each one's tooling.

Are you using something else? Slurm, docker, etc?

nickysielicki•9mo ago
Are you concerned at all that your name is one letter away from neuronx?

https://awsdocs-neuron.readthedocs-hosted.com/en/latest/neur...

leeab•9mo ago
I've heard of AWS Neuron...didn't realize they had a package called NeuronX. Tbh, I feel like many AI companies have similar names. Maybe the guys who own neuronx.io might be more concerned...