I'm Too Lazy to Check Datadog Every Morning, So I Made AI Do It

https://quickchat.ai/post/automate-bug-triage-with-claude-code-and-datadog
21•piotrgrudzien•6h ago

Comments

danpalmer•2h ago
Why would one need to check Datadog every morning? Wouldn't alerts fire if there was something to do?
bak3y•2h ago
Exactly what I came to say, alerts need tuning if you're having to check your monitoring tools by hand.
dathinab•1h ago
I read the article as a way for AI to check, classify and potentially partially fix the alerts you see when logging in in the morning.

And for many alerts you need to look at other events around it to properly classify and partially solve them. Due to that you need to give the AI more than just the alerts.

Though I do see a risk similar to wrongly tuned alerts:

Not everything which resolves by itself and can be ignored _in this moment_ is a non-issue. It's e.g. pretty common that a system with some rare, ignorable warnings/errors falls completely flat when onboarding a lot of users, introducing a new high-load feature, etc., due to exactly the things which you could fully ignore beforehand.

seneca•1h ago
I'm not sure if this is what the writer was getting at, but I tend to check telemetry for my production applications regularly not because I'm looking for things that would fire alerts, but to keep a sense of what production looks like. Things like request rate, average latency, top request paths etc. It's not about knowing something is broken, it's about knowing what healthy looks like.

Understanding what your code looks like in production gives you a lot better sense of how to update it, and how to fix it when it does inevitably break. I think having AI checking for you will make this basically impossible, and that probably makes it a pretty bad idea.

vrosas•1h ago
Almost no one actually knows how to set up their monitoring. Like, they know the words but not the full picture or how the pieces should actually fit together. Then they do shit like this to try and make up for that fact.
bdangubic•1h ago
the ones that know do not check anything every morning
import•1h ago
Well, the industry standard solution is correct monitoring and alerting. This doesn’t sound like “the right way”.
sgarman•1h ago
I don't understand the workflow of having multiple new bugs every day that need fixed. Is there bad code being shipped? Are there 1000 devs and it's just this person's job to fix everyone's bugs? Is this an extremely old and complicated codebase they are improving? Not trying to be snarky - I just don't understand how every day there are new bugs that are just error messages.

If there are new bugs every day that need fixed is the AI really good enough to know the fix from just an error?

sothatsit•1h ago
Generally I think this happens when people don’t monitor for errors on a regular basis. People only notice if things are actively broken for customers, and tons of small non-fatal bugs slip through and build up over time.
Xeoncross•1h ago
> Total alerts/errors found: 7

Apps written in an exception-based language (Java, JavaScript, PHP, etc.) are really annoying to monitor as everything that isn't the happy path triggers an 'error'/'fatal' log/metric.

Yes, you can technically work around it with (near) Go-level error verbosity (try/catches everywhere on every call) but I've never seen a team actually do that.

Modern languages that don't throw exceptions for every error like Rust, Go, and Zig make much more sane telemetry reports in my experience.

On this note, a login failure is not an error, it's a warning because there is no action to take. It's an expected outcome. Errors should be actionable. WARN should be for things that in aggregate (like login failures) point to an issue.
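
A minimal sketch of the distinction drawn above, assuming a sliding time window and a fixed threshold (the class name `LoginFailureTracker`, the logger name and both parameters are hypothetical, not taken from the article or any library): each failed login is logged at WARNING, and an ERROR is emitted only when the aggregate count in the window reaches the threshold.

```python
import logging
from collections import deque

logger = logging.getLogger("auth")  # hypothetical logger name


class LoginFailureTracker:
    """Logs each failed login at WARNING; escalates to ERROR on an aggregate spike."""

    def __init__(self, threshold: int, window_seconds: float):
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._failures = deque()  # timestamps of recent failures

    def record_failure(self, user: str, now: float) -> bool:
        """Record one failure; return True if the window count reaches the threshold."""
        # A single failure is an expected outcome, so it is only a warning.
        logger.warning("login failed for %s", user)
        self._failures.append(now)
        # Drop timestamps that have fallen out of the sliding window.
        while self._failures and now - self._failures[0] > self.window_seconds:
            self._failures.popleft()
        spiking = len(self._failures) >= self.threshold
        if spiking:
            # The aggregate is what is actionable, so this is the error.
            logger.error(
                "%d login failures in the last %.0fs",
                len(self._failures),
                self.window_seconds,
            )
        return spiking
```

The timestamp is passed in rather than read from the clock so the behaviour is easy to check by hand; in a real service it would likely come from `time.monotonic()`.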

Spivak•1h ago
> On this note, a login failure is not an error

Login failure is like the most important error you'll track. A login failure isn't necessarily actionable but a spike of thousands of them for sure is. No single system has been more responsible for causing outages in my career than auth. And I get that it's annoying when they appear in your Rollbar but sometimes Login Failed is the only signal you get that something is wrong.

Some 3rd party IdP saying "nope" can be innocuous when it's a few people but a huge problem when it's because they let their cert/application token expire.

And I can already hear the "it should be a metric with an alert" and you're absolutely right. Except that it requires that devs take the positive action of updating the metric on login failures vs doing nothing and letting the exception propagate up. And you just said login failures aren't errors and "bad password" obviously isn't an error so no need to update the metric on that and cause chatty alerts. Except of course that one time a dev accidentally changed the hashing algorithm. Everyone was really bad at typing their password that day for some reason.

SkiFire13•29m ago
Rather than login failures I would monitor login successes. A sharp decrease of successes likely points to some issue, but an increase in login failures might easily be someone trying tons of random credentials on your website (still not ideal, but much harder to act on)
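
One way to sketch the success-based check described above, assuming per-interval success counts are already being collected (the function name `successes_dropped` and the 50% default are illustrative assumptions, not a recommendation from the thread): compare the current interval against the mean of recent intervals and flag a sharp decrease.

```python
from statistics import mean


def successes_dropped(baseline: list[int], current: int, max_drop: float = 0.5) -> bool:
    """Return True if `current` is more than `max_drop` below the baseline mean.

    `baseline` holds login-success counts for recent comparable intervals;
    `current` is the count for the interval being checked.
    """
    if not baseline:
        return False  # nothing to compare against yet
    expected = mean(baseline)
    if expected == 0:
        return False  # no logins expected, so no drop to detect
    return current < expected * (1 - max_drop)
```

A burst of failed attempts from credential stuffing leaves the success count roughly unchanged, so a check like this stays quiet in that case, while an expired IdP certificate or a changed hashing algorithm would pull successes toward zero and trip it.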
