
Anomaly detection: business metrics vs. system metrics?

3•chipfixer•9mo ago
Recently, I ran into someone at a conference who had a major incident: a config change caused a revenue drop. Their RED/system metrics didn’t catch it because they were all static-threshold alerts and siloed from the business signals, so engineers didn't discover the actual revenue impact until much later.

What might have helped is anomaly detection directly on their business metrics, with system metrics used to explain the root cause, but only once real customer or business impact is detected.
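
As a rough illustration of that idea, here is a minimal Python sketch: it scores the business metric first and only looks at system metrics when the business score crosses a threshold. Everything in it is an assumption for illustration, not what that team ran: the function names, the metric names, the 3.5 cutoff, and the choice of a median/MAD robust z-score as the detector.

```python
import math
from statistics import median


def robust_z(history, current):
    """Robust z-score of `current` against `history`, using median and MAD."""
    med = median(history)
    mad = median(abs(x - med) for x in history)
    if mad == 0:
        # History is flat: any deviation at all is infinitely surprising.
        return 0.0 if current == med else math.copysign(math.inf, current - med)
    # 0.6745 rescales MAD so the score is comparable to a normal z-score.
    return 0.6745 * (current - med) / mad


def explain_business_anomaly(revenue_history, revenue_now, system_metrics,
                             threshold=3.5):
    """Return ranked system-metric suspects, but only if revenue is anomalous.

    `system_metrics` maps a (hypothetical) metric name to (history, current).
    """
    business_score = robust_z(revenue_history, revenue_now)
    if abs(business_score) < threshold:
        return None  # no business impact detected, so stay quiet
    suspects = sorted(
        ((name, robust_z(hist, cur)) for name, (hist, cur) in system_metrics.items()),
        key=lambda item: abs(item[1]),
        reverse=True,
    )
    return {"business_score": business_score, "suspects": suspects}


if __name__ == "__main__":
    revenue = [100, 102, 98, 101, 99, 103, 97]
    system = {
        "error_rate": ([0.010, 0.012, 0.011, 0.009, 0.010], 0.011),
        "cache_hit_ratio": ([0.95, 0.94, 0.96, 0.95, 0.93], 0.40),
    }
    print(explain_business_anomaly(revenue, 60, system))
```

The point of the ordering is that system metrics never page anyone on their own here; they are only ranked as possible explanations after the business signal moves.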

Curious to hear: how much does your org prioritize monitoring business metrics (not just system metrics)? If you do, what tools do you use?

Comments

poobear22•9mo ago
I managed the system administrators for a high-performance computing center. We took a lot of blame for the applications when, in reality, the cause was often poor programming on the developers' part. I got really tired of taking the blame, so I implemented statistical process control to track the mean time between failures of the jobs. I was really just shining a flashlight on production jobs and hoping it would change the culture. It was not my job to fix their code, and the applications were developed by a different group of people with a very different culture. I thought the process control worked really well: it took the heat off my team for random blame, because I could respond with "your job is failing XX times per year" and, from there, push for a root cause analysis. But pushing against that culture was really hard, and there was a lot of "set the job to complete and I will look at it on Monday". If they do not want to conduct a root cause analysis on the failure modes of their code, I can't do much. So even implementing some type of monitoring can have little effect if the people who need to fix something do not support the culture. And, as I read your post, I'd think people would be looking at these business metrics a little more closely, or developing more sensitive metrics to catch these issues.
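
For readers unfamiliar with statistical process control, the kind of tracking described in the comment above might look something like the following sketch. It assumes an individuals (XmR) control chart over hours between failures; the comment does not say which chart type was actually used, and the function names and sample numbers are hypothetical. Time-between-failure data is often skewed, so treat the limits as a rough screen rather than a precise test.

```python
from statistics import mean


def xmr_limits(values):
    """Return (lower, center, upper) limits for an individuals (XmR) chart."""
    if len(values) < 2:
        raise ValueError("need at least two observations")
    center = mean(values)
    # Average absolute difference between consecutive observations.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    mr_bar = mean(moving_ranges)
    # 2.66 is the standard XmR constant (3 / 1.128).
    return center - 2.66 * mr_bar, center, center + 2.66 * mr_bar


def out_of_control(baseline, new_values):
    """Return (index, value) pairs from `new_values` outside the baseline limits."""
    lower, _, upper = xmr_limits(baseline)
    return [(i, v) for i, v in enumerate(new_values) if v < lower or v > upper]


if __name__ == "__main__":
    # Hypothetical hours between failures for one production job.
    baseline_hours = [100, 110, 90, 105, 95, 100]
    print(xmr_limits(baseline_hours))
    print(out_of_control(baseline_hours, [98, 40, 105]))
```

A point below the lower limit means the job failed much sooner than its own history predicts, which is the kind of evidence that can start a root cause conversation.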
chipfixer•9mo ago
Yup, no amount or type of anomaly detection can fix the culture. That said, in this case, maybe one reason it was hard is that the devs weren't the ones who owned what the job did in production?
nchinmay•9mo ago
We had this very issue: a bad configuration change (human error) caused a large and sudden revenue drop, along with a drop in our streaming ad event metrics. This is a real-time adtech system, where a delay in detecting sudden changes in business metrics can have monetary impact and cause visible drops in customer experience. In this case, the major drop in revenue was immediately found and addressed, but not all of our expected alerts went off. The static threshold on our streaming ad events metric was set too low. It was appropriate at earlier stages of our business, but as the business has grown it is now too low to set off the alert I would have expected to fire first. We do have sophisticated metrics instrumentation and alerting, but effective anomaly detection for sudden upticks and downticks in business metrics, one that stays aware of underlying trends evolving organically with the business, would be a game changer.

Larger, incident-worthy changes in metrics are also easier to set static thresholds around, and they ring more than one bell when they occur. I'd be more concerned about small to mid-sized deviations from the trend, say a sudden ±10% change in my business metrics over X minutes. Can I reliably set a static threshold that will be universally appropriate here? A good anomaly detector would ideally bring something like this to attention without hard-coded alert configs.
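
One simple way to approximate what this comment asks for is to compare a short recent window against a rolling baseline instead of a fixed number. The Python sketch below is only an illustration under stated assumptions: the class name, the window sizes, and the use of medians are all hypothetical choices, and 0.10 stands in for the ±10% figure mentioned above.

```python
from collections import deque
from statistics import median


class TrendAwareDetector:
    """Flag relative changes between a recent window and a rolling baseline."""

    def __init__(self, baseline_size=60, window=5, max_rel_change=0.10):
        self.window = window
        self.max_rel_change = max_rel_change
        # Oldest `baseline_size` points form the baseline, newest `window`
        # points form the recent window.
        self.values = deque(maxlen=baseline_size + window)

    def observe(self, value):
        """Add one sample; return the relative change if it is anomalous, else None."""
        self.values.append(value)
        if len(self.values) < self.values.maxlen:
            return None  # still warming up
        history = list(self.values)
        base = median(history[:-self.window])
        recent = median(history[-self.window:])
        if base == 0:
            return None  # relative change is undefined against a zero baseline
        change = (recent - base) / base
        return change if abs(change) > self.max_rel_change else None


if __name__ == "__main__":
    detector = TrendAwareDetector(baseline_size=10, window=3)
    samples = [1000] * 13 + [850, 850]
    for minute, events in enumerate(samples):
        alert = detector.observe(events)
        if alert is not None:
            print(f"minute {minute}: {alert:+.0%} vs. rolling baseline")
```

Because the baseline rolls forward, slow organic growth moves the reference point along with the business, so no threshold needs retuning as volume grows. The trade-off is that a sustained shift is eventually absorbed into the baseline and stops alerting, and seasonality (daily or weekly cycles) would need a baseline taken from comparable periods rather than the immediately preceding minutes.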