Sure, we can use an LLM, but for now I can click around faster (if those breadcrumbs exist) than it can reason.
Also, the LLM would only point in a direction, and I’m still going to have to use the UI to confirm.
What a good business strategy!
I could post this comment on 80% of the AI application companies today, sadly.
(Get AI to do stuff (you can already do) with a little work and some experts in the field).
(Get AI to do stuff (you can already do with a little work and some experts in the field)).
Skip to the summary section titled "Fast feedback is the only feedback" and its first assertion:
> ... the only thing that really matters is fast, tight feedback loops at every stage of development and operations.
This is industry dogma, generally considered "best practice", and it sets up the subsequent straw man: "AI thrives on speed—it'll outrun you every time."
False."AI thrives" on many things, but "speed" is not one of them. Note the false consequence ("it'll outrun you every time") used to set up the the epitome of vacuous sales pitch drivel:
> To succeed, you need tools that move at the speed of AI as well.
I hope there's a way I can possibly "move at the speed of AI"...
> Honeycomb's entire modus operandi is predicated on fast feedback loops, collaborative knowledge sharing, and treating everything as an experiment. We’re built for the future that’s here today, on a platform that allows us to be the best tool for tomorrow.
This is as subtle as a sledgehammer to the forehead. What's even funnier is the lame attempt to appear objective after all of this:
> I’m also not really in the business of making predictions.
Really? Did the author read anything they wrote before this point? I do believe the veil is at best "thin." Perhaps I was being too generous given the post starts with:
> New abstractions and techniques for software development and deployment gain traction, those abstractions make software more accessible by hiding complexity, and that complexity requires new ways to monitor and measure what’s happening. We build tools like dashboards, adaptive alerting, and dynamic sampling. All of these help us compress the sheer amount of stuff happening into something that’s comprehensible to our human intelligence.
> In AI, I see the death of this paradigm. It’s already real, it’s already here, and it’s going to fundamentally change the way we approach systems design and operation in the future.
Maybe I should have detected the utterly condescending phrase, "something that’s comprehensible to our human intelligence." It'd be less bad if the tool came to a conclusion, then looked for data to disprove that interpretation, and then made a more reliable argument or admitted its uncertainty.
One I use with ChatGPT currently is:
> Prioritize substance, clarity, and depth. Challenge all my proposals, designs, and conclusions as hypotheses to be tested. Sharpen follow-up questions for precision, surfacing hidden assumptions, trade offs, and failure modes early. Default to terse, logically structured, information-dense responses unless detailed exploration is required. Skip unnecessary praise unless grounded in evidence. Explicitly acknowledge uncertainty when applicable. Always propose at least one alternative framing. Accept critical debate as normal and preferred. Treat all factual claims as provisional unless cited or clearly justified. Cite when appropriate. Acknowledge when claims rely on inference or incomplete information. Favor accuracy over sounding certain.
And what type of questions do you ask the model?
Thanks for sharing
I ask a wide variety of things, from what a given plant is deficient in based on a photo, to wireless throughput optimization (went from 600 Mbps to 944 Mbps in one hour of tuning). I use models for discovering new commands, tools, and workflows, interpreting command output, and learning new keywords for deeper and more rigorous research using more conventional methods. I rubber duck with it, explaining technical problems, my hypotheses, and iterating over experiments until arriving at a solution, creating a log in the process. The model is often wrong, but it's also often right, and used the way I do, it's quickly apparent when it's wrong.
I've used ChatGPT's memory feature to extract questions from previous chats that have already been answered to test the quality and usability of local models like Gemma3, as well as to craft new prompts in adjacent topics: prompts that are high-leverage, compact, and designed to trip up models that are underpowered or over-quantized. For example (a small harness sketch follows the list):
>> "Why would toggling a GPIO in a tight loop not produce a square wave on the pin?"
> Tests: hardware debounce, GPIO write latency, MMIO vs cache, bus timing.
>> "Why is initrd integrity important for disk encryption with TPM sealing?"
> Tests: early boot, attack surface, initramfs tampering vectors.
>> "Why would a Vulkan compute shader run slower on an iGPU than a CPU?"
> Tests: memory bandwidth vs cache locality, driver maturity, PCIe vs UMA.
>> "Why is async compute ineffective on some GPUs?"
> Tests: queue scheduling, preemption granularity, workload balance.
>> "Why might a PID loop overshoot more when sensor update rate decreases?"
> Tests: sample rate vs. loop dynamics, feedback phase lag, derivative-term estimation.
>> "How can TCP experience high latency even with low packet loss?"
> Tests: delayed ACK, bufferbloat, congestion control tuning.
>> "How can increasing an internal combustion engine's compression ratio improve both torque and efficiency?"
> Tests: thermodynamics, combustion behavior, fuel octane interaction.
>> "How can increasing memory in a server slow it down?"
> Tests: NUMA balancing, page table size, cache dilution.
>> "Why would turning off SMT increase single-threaded performance on some CPUs?"
> Tests: resource contention, L1/L2 pressure, scheduling policy.
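A minimal harness sketch for running probes like these, assuming a local Ollama-style endpoint at http://localhost:11434/api/generate and a gemma3 model tag (the endpoint, the tag, and the trimmed prompt list are illustrative assumptions, not an exact setup):

    # Hypothetical probe harness: run a few of the compact "trip-up" prompts above
    # against a locally served model and dump the answers for manual review.
    import json
    import urllib.request

    PROBES = [
        "Why would toggling a GPIO in a tight loop not produce a square wave on the pin?",
        "Why is initrd integrity important for disk encryption with TPM sealing?",
        "How can TCP experience high latency even with low packet loss?",
        "How can increasing memory in a server slow it down?",
    ]

    def ask(model, prompt, url="http://localhost:11434/api/generate"):
        # Single non-streaming completion request against an Ollama-style API.
        payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
        req = urllib.request.Request(url, data=payload,
                                     headers={"Content-Type": "application/json"})
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)["response"]

    for probe in PROBES:
        print(f"## {probe}\n{ask('gemma3', probe)}\n")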
It's the same effect LLMs are having on everything, it seems. They can help you get faster at something you already know how to do (and help you learn how to do something!), but they don't seem to outright replace any particular skill.
> 2. and help you learn how to do something!
This is the second time I heard this conclusion today. Using inference to do 2. and then getting a superpower in doing 1. is probably the right way to go forward.
Which takes me to the question: who would you hire?
1. an expert for salary X, or
2. a not-so-expert for 0.4x salary + an AI tool that can do well enough?
I suspect that, unless there are specific requirements, 2. tends to win.
I think it's more likely that people will learn far less. AI will regurgitate an answer to their problem and people will use it without understanding or even verifying if it is correct so long as it looks good.
All the opportunities a person would have to discover and learn about things outside of the narrow scope they initially set their sights on will be lost. Someone will ask AI to do X and they get their copy/paste solution so there's nothing to learn or think about. They'll never discover things like why X isn't such a good idea or that Y does the job even better. They'll never learn about Z either, the random thing that they'd have stumbled upon while looking for more info about X.
I call it "The Charity Majors effect".
With LLMs trained on the most popular tools out there, this gives IT teams short on funds or expertise the ability to finally implement “big boy” observability and monitoring deployments built on more open frameworks or tools, rather than yet-another-expensive-subscription.
For usable dashboards and straightforward observability setups, LLMs are a kind of god-send for IT folks who can troubleshoot and read documentation, but lack the time for a “deep dive” on every product suite the CIO wants to shove down our throats. Add in an ability to at least give a suggested cause when sending a PagerDuty alert, and you’ve got a revolution in observability for SMBs and SMEs.
I’m in a team of two with hundreds of bare metal machines under management - if issues pop up it can be stressful to quickly narrow your search window to a culprit. I’ve been contemplating writing an MCP to help out with this, the future seems bright in this regard.
Plenty of times, issues have been present for a while before creating errors as well. LLMs, again, can help with this.
The first problem is out of reach for LLMs; the other two have been trivial with convolutional NNs for a long time.
> extensive dev environments to validate changes, and change controls to ensure you don’t torch production
Are out of scope for observability.
I used the word correctly for the environments I was describing. They’re just not accurate for larger, more siloed orgs.
For example, I wrote my own MCP server in Python that basically makes it easy to record web browser activities and replay them using playwright. When I have to look at logs or inspect metrics, I record the workflow, and import it as an MCP tool. Then I keep a prompt file where I record what the task was, tool name, description of the output, and what is the final answer given an output.
So now, instead of doing the steps manually, I basically just ask Claude to do things. At some point, I am going to integrate real time voice recording and trigger on "Hey Claude" so I don't even have to type.
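A minimal sketch of what such a server can look like, assuming the FastMCP helper from the official MCP Python SDK and Playwright's sync API; the tool name and JSON recording format are made up for illustration, not the parent's actual code:

    # Sketch only: expose a recorded browser workflow as an MCP tool.
    # Assumes `pip install mcp playwright` and a JSON recording of steps such as
    # {"action": "goto", "url": ...}, {"action": "click", "selector": ...}.
    import json
    from mcp.server.fastmcp import FastMCP
    from playwright.sync_api import sync_playwright

    mcp = FastMCP("browser-replay")

    @mcp.tool()
    def replay_workflow(recording_path: str) -> str:
        """Replay a recorded browser workflow and return the final page's visible text."""
        with open(recording_path) as f:
            steps = json.load(f)
        with sync_playwright() as p:
            page = p.chromium.launch(headless=True).new_page()
            for step in steps:
                if step["action"] == "goto":
                    page.goto(step["url"])
                elif step["action"] == "click":
                    page.click(step["selector"])
                elif step["action"] == "fill":
                    page.fill(step["selector"], step["value"])
            return page.inner_text("body")  # handed back to the model as the tool result

    if __name__ == "__main__":
        mcp.run()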
The only thing I wish someone would do is basically make a much smaller model with limited training only on things related to computer science, so it can run at high resolution on a single Nvidia card with fast inference.
I've been trialing a different product with the same sales pitch. It tries to RCA my incidents by correlating graphs. It ends up looking like this page[1], which is a bit hard to explain in words, but both obvious and hilarious when you see it for yourself.
It's even worse if you just eyeball a graph. If something changes over time, you need to use appropriate measures.
People want so badly to have an "objective" measure of truth that they can just pump data into and get a numeric result that "doesn't lie". r², p < 0.05, χ2, etc. It's too far to say these numbers aren't useful at all -- they are -- but we're just never going to be able to avoid the difficult and subjective task of interpreting experimental results in their broader context and reconciling them with our pre-existing knowledge. I think this is why people are so opposed to anything Bayesian: we don't want to have to do that work, to have those arguments with each other about what we believed before the experiment and how strongly. But the more we try to be objective, the more vulnerable we are to false positives and spurious correlations.
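A toy illustration of that last point (synthetic data, nothing from a real system): two completely independent random walks will routinely show a "strong" correlation on the raw series, and the illusion disappears once you difference them.

    # Two independent random walks -- think of two unrelated metrics that both
    # drift over time -- often produce a large |r| despite having no relationship.
    import numpy as np

    rng = np.random.default_rng(0)
    a = np.cumsum(rng.normal(size=2000))  # metric A: pure random walk
    b = np.cumsum(rng.normal(size=2000))  # metric B: independent random walk

    r_raw = np.corrcoef(a, b)[0, 1]                     # frequently "impressively" large
    r_diff = np.corrcoef(np.diff(a), np.diff(b))[0, 1]  # near zero, as it should be

    print(f"raw series:         r = {r_raw:+.2f}")
    print(f"differenced series: r = {r_diff:+.2f}")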
It's basically the same flow as when you use AI for programming. Except you need to constrain the domain of the specifications more and reason more about how to allow the AI to recover from failing specifications if you don't want to force the user to learn your specification language.
If some AI tool outstrips the ability for a human to be in the decision loop, then that AI tool's usefulness is not so great.
OK.
And their butt needs to be on the line if the answer is wrong
OK.
I'm not seeing the problem.
The past has shown that this is not the case.
In addition, there are fewer farmers than there used to be, despite demand not having been static.
As for fewer farmers, that is exactly it - those who would have been farmers would be required to acquire new skills or pursue something other than farming. Bringing this back to AI - artists, writers and programmers who get displaced will need to adapt. In the long term, the massive decrease in costs of production of various "creative" endeavours will produce new industries, new demand and increase overall wealth - even if it is not shared evenly (in the same sense that past technological leaps are also not shared evenly).
Don't get me wrong, I don't like AI either and it's only a matter of time before my day job goes from "rewrite this (web) app from 5-10 years ago" to "rewrite this AI assisted or generated (web) app from 5-10 years ago". But I don't think it's going to cost that many jobs in the long run.
That seems optimistic. If it all goes according to plan, you won't get to write or rewrite anything anymore. You'll just be QA, reviewing AI output. If you find something wrong you don't get to correct it; nobody will pay for coders. You'll just be telling the AI about the problem so it can try to correct it until the code passes and the hallucinations are gone. That way they can pay you much less while you help train the AI to get better and better at doing your old job.
That will depend on whether there are many projects with a good outlook sitting around.
Sadly, the current environment does not reflect that in my experience. There is a vicious focus on keeping profit margins at a steady rate at all costs while slashing spend on tooling which requires re-work on solved problems. :/
At some point the music is going to stop and it's not going to be pretty I suspect. :(
That's why you need to keep an eye out, and smell whether the management understands it or not. Plan to leave, as your value contribution will not give you back the reward that such contributions deserve in this type of organization.
NB: Firing 20% of employees requires a 25% increase in efficiency from everyone who remains, by the simple math: the same output spread over 0.8× the people means each must do 1/0.8 = 1.25× as much.
Step 1. Design system that gets push back because a bunch of things appear to be buzzwords
Step 2. Force it through politically
Step 3. Quit with shiney buzzword on CV
Step 4. Narrowly avoid being around when shit hits the fan.
I find developers are usually much more concerned about it working well because it ends up being their baby. Not always of course, but more often than architects that don't actually have to do the work.
You're supposed to read the output critically and you're still responsible for it. AI or not AI.
For observability I find most apps fit in this category. They are complex, they usually have UX that is so bad it makes me rage and I don't use most of their deep level features very often.
I think Jira could also benefit. Its UX is so bad it borders on criminal.
The hallucination issue can be worked around by providing a trace that demonstrates the agent's working (i.e. what tools it called with what parameters).
And this is (in my opinion) an intractable problem - You can get the AI to list the tools/parameters it used, but then you can't be sure that it hasn't just hallucinated parts of that list as well, unless you both understand that they were the right tools and right parameters to use, and run them yourself to verify the output. And at that point you might as well just have done it yourself in the first place.
I.e. if you can't trust the AI, you can't trust the AI to tell you why you should trust the AI.
£50k full stack dev in London? You're going to get AI slop these days, particularly the moment you encourage usage within the org. LLMs will help underpaid people act their wage. You've already asked them to do 2 or 3 people's jobs with "full stack"
Will you pay £20k more for someone who won't use AI?
(But yes, salaries in the UK are low. £20k is a lot to a lot of people)
I either:
a. spend money on a person, plus money and time to specify the business problem to that person and to judge the results, or
b. skip the person and just use the AI itself.
The difference between the two is that I don't have some metaphorical hostage to execute in "b". If the task is trivial enough, I don't need a hostage.
AI can’t solve everything, but never using it isn’t the right answer.
Saying things like this in isolation is silly because it just shifts blame rather than creating an environment of teamwork and rigor. What I got out of the article is that AI helps humans get to answers faster. You still need to verify that the answers are correct.
Well, that is the "vibe" at least. We don't really know for sure. (Maybe ask ChatGPT?)
It feels like another example of regression toward the mean in our society. At the same time, this creates a valuable opportunity for diligent, detail-oriented professionals to truly stand out.
I'm actively aware of the regression towards the mean and discuss that with my peers frequently. It helps me prevent atrophy of my skills while reaping the benefits of AI. Put another way, there are people out there using AI to punch well above their weight while not actually being a good fighter in the first place. If you're a good fighter are you going to let an inexperienced fighter step into the rings that you're meant to step into?
(c) guy on whose website you’re writing this
Those that blindly post "@grok please help" or use AI for everything will likely experience some level of atrophy in their critical-thinking skills. I would advise those people to think about this and adjust course.
we are quickly approaching the world where you financially can't afford to pay anyone any salary...
Is that too extreme to be real?
Of course the technology will remain. You'll still be able to run models locally (but they won't be as good). And eventually someone will work out how to make the data-centres turn a profit (but that won't be cheap for users).
Or maybe the local models will get good enough, and the data-centres will turn out to be a gigantic white elephant.
AI is definitely not going away. But I feel it could be like how it was in 90s, when people can get their hands on a computer and do amazing things. Yes, I know corporate existed then, but I think it was not this direct, and sinister as it is now.
Many of the corporates trying to get rid of employees in favor of AI might lose huge ground to someone who keeps the old-school way going while leveraging AI in a more practical way. General people will soon realize it's not what they're being fed and revert, but it's not that easy for the big ones to roll back.
Which might end corporate control over many of these things.
This is what I meant, but I am sleepy and tired, failed to articulate properly, hopefully the gist of it was clear.
The customer hates cookies, bloated websites, trackers, popups, pointless UI rearrangements, autoplay videos and clickbait. Classical enshittification.
When the customer puts on his suit and tie and goes into the office, he'll pay software devs to get these features into the product.
I don't see why AI would be any different.
Gatekeeping on distribution is unbelievable. Getting something to work financially requires marketing and "white pass from gatekeepers" expenditures, which eat away any margins you may have.
If you get laid off by big tech (no matter your years of experience), chances are you are going to be doing DoorDash and living in a tent.
That is one way of putting it.
Another way to put it is that app stores are so saturated with N versions of the same God damned app doing exactly the same God damned thing that even when they start to charge a gatekeeping fee you still get market saturation.
Will you, though? I mean, working on software for a living means having someone paying you to do something. If a corporation with an established business and reliable revenue can't justify paying you for your work, who do you expect to come in and cover your rent?
Not really. They pay you to deliver something, but you also need to coordinate and interact with people. That involves coordinating and syncing.
There is absolutely no position or role whatsoever, either in a big corporation or small freelancer gig, that you do not need to coordinate and interact with people.
Pray tell, how do you expect to be paid while neither delivering results nor coordinating with anyone else?
I used to think the same way. But life taught me that responsibility is a fairy tale only for the bottom 90% — a story told to keep them obedient, hardworking, and self-blaming. Meanwhile, the top 10% rewrite the rules, avoid accountability, and externalize blame when things go wrong. And if you’re not in that elite circle, you either pay the price for their mistakes — or you’re told there’s just nothing you can do about it.
Because those narratives play an important role in the next outcome.
The error is when you expect them to play for your team. Most people will (at best) be on the same team as those they interact with directly on a typical day. Loyalty 2-3 steps down a chain of command tends to be mostly theoretic. That's just human nature.
So what happens when the "#¤% hits the fan is that those near the top take responsibility for themselves, their families, and their direct reports and managers first. Meaning they externalize damage to elsewhere, which would include "you and me".
Now this is baseline human nature. Indeed, this is what natural empathy dictates, because empathy as an emotion is primarily triggered by those we interact with directly.
Exceptions exist. Some leaders really are idealists, governed more by the theories/principles they believe in than the basic human impulses.
But those are the minority, and this may even be a sign of autism or something similar, where empathy for oneself and one's immediate surroundings is disabled or toned down.
I think you're missing the big picture. You're focusing on your conspiracy theory that the hypothetical 10% are the only ones valuing responsibility as a manipulation tactic. However, what do you do when anyone, be it a team member or a supermarket or an online retailer or a cafe employee, fails to meet your bar of responsibility? Do you double down on their services, or do you adapt so that their failure no longer affects your day?
You've put together this conspiracy that only the higher-ups value responsibility, and only as a crowd-control strategy. If you take a step back and look at your scenario, you'll notice that it is based on how you are in a position where you are accountable to them. Is that supposed to mean that middle management is not accountable for anything at all? Is no mom-and-pop shop accountable for anything? Why is Elon Musk crying crocodile tears over how unfair the world is, over having been kicked out of DOGE, and over the world no longer buying Tesla cars?
Richard Fuld (Lehman Brothers) walked away with 500-1000 million USD after causing the worldwide financial crisis.
Trump bankrupted multiple corporations, impacting the lives of thousands of workers and their families. He is president now.
RFK Junior's budget cuts stopped important work on HIV and cancer research, delaying finding a cure or better treatment, which will cause pain and suffering for millions of people. He just sacked the entire vaccine committee. He still is secretary of health.
I would go on, but my stomach ulcer is already flaming up again.
The only people who face the consequences of their actions are you and me. The wealthy shit on your morals of taking responsibility. That's not a conspiracy theory.
Like five or so out of a few hundred. IIRC that's better than average, which would mean he saved more jobs than he lost.
I am sure at some point you have some power too. Just use that moment to make up the difference.
For the good of society and human kind, don't give in up front.
1. As an additional layer of checks, to find potential errors in things I have been doing (texts, code, ideas). This works well for the most part.
2. As a mental sparring partner, where I explore certain ideas with me being the one to guide the LLM instead of the other way around. This is much more about exploring thoughts than it is about the LLM providing anything I actually use anywhere.
3. As an additional tool to explore new bodies of text and code
But of course it makes me wonder how people who don't yet know their shit will fare. I once had an LLM tell a student an electrical lie that would very likely have caused a fire, where the LLM got the math wrong in exactly the opposite direction of how it works in reality.
But for an alert on a deployment that has been out for a while, there is a bit of a decision to be made as to whether to roll back, scale up, etc. Sometimes rolling back achieves nothing other than increasing the delta again when you roll forward, and delaying stuff.
These incentive structures now give tools for mediocre bullshitters to bullshit their way through and indirectly promote proliferation of these tools.
This scares me a lot. I'm a software consultant and I see my software and solutions being appropriated by bullshitters from inside the company. I don't know what to expect from the future anymore.
Historically, "leadership" at organizations haven't cared about objective truths when those truths conflict with their desires. So why would they care about what a hyped up gradient descent has to say?
I'm not a huge fan of AI stuff, but the output quality is (usually) above that of what BSers were putting out.
While I still need to double-check my BS team members, the problems with the code they are pushing are fewer than they were before AI was everywhere. To me, that's a win.
I guess what I'm saying is I'd rather have mediocre AI code written than the low-quality code I saw before LLMs became as popular as they are.
On more abstract things, I think it has to have intentional filters to not follow you down a rathole like flat-earth doctrine if you match the bulk of opinion among verbose authors on a subject. I don't see the priority for adding those filters being recognized on apolitical, STEM-oriented topics.
It's why code reviews remain important.
Now, watch this drive!
This presumes that the human mandating tool use, the human making the mistake, and the human held responsible are one and the same.
Yes, we have bigger problems.
I think this also applies to current AI solutions and I think that's precisely why the best workers will be humans who both use AI and will put their accountability at stake.
One thing I wanted to do, was setup an AI as CEO and board member of a small company. Then, all pay for contracts I do would flow through that entity.
This was going to be simple algo AI, with pre-scripted decisions including "hire an expert for unknown issues" then "vote yes to expert's solution". The expert always being me, of course, under contract.
I'd take a much lower hourly rate for normal tech work, and instead keep capital in the business. The business would buy an old warehouse, have servers, and a guard.
I'd be hired as the guard too.
You can hire guards with the requirement they sleep on site. In such a case the lodging isn't income, as it is required for the job.
So I could take a very low hourly rate, and maybe options in shares.
And of course the car would be corporate leased, and all that.
This is a perfectly legit setup in my tax jurisdiction, except for the AI part. Back then, I had two goals.
The first was a way to amusingly conceal where I live. It bugs me that I have to disclose this. The second was to keep as much of my earnings in a corporation, and the mortgage also gave the corporation an asset with increasing value.
But my name would not be on the registration for the company as a board member. Or CEO.
And in fact my relationship to the company would be under contract as a sole proprietor, so no need to file certain employee laden paperwork.
Upon retirement, I could cash out the company in a variety of ways, after exercising my options.
(The company would also be required to help with a bridge loan for this)
So maybe IBM is wrong?
(Just an amusing edge case)
But I could be wrong.
The realistic scenario for a well-functioning organization is not to replace their entire DevOps team with AI agents and then face the hard reality that their entire tech infra is on fire, but to actually use these as a tool that makes their best DevOps person more efficient and makes their DevOps resources much leaner. This increases productivity for the same cost. Even though this sounds grim for the future of work, this is what I feel is going to happen.
Anyhow, perhaps looking at what tools some are using could become a better and better indicator of who is the responsible type and who is not, when we give away our hard-earned money for something we need and want to receive in exchange.
AI doesn't break anything. The responsibility is exercised by upper leadership. And current leadership is high on AI crystal meth like an NYC subway junkie.
Given the prevalence of outsourcing to random remote companies, this is a moot point for many corporations.
Give us $X/year for our tool that makes your employee "more efficient" (more fungible tbh). Subtract that $X/year from the salary you pay.
That's their big bet.
What's that got to do with AI exactly? Sure, there's the chance that this AI thing will disappear tomorrow and they won't be able to do anything, but so too is the chance that Stack Overflow disappears.
Without reliability, nothing else matters, and this AI that can try hypotheses so much faster than me is not reliable. The point is moot.
As I understand it, this is a demo they already use and the solution is available. Maybe it should've been a contrived example so that we can tell the solution was not in the training data verbatim. Not that what the LLM did isn't useful, but if you announce the death of observability as we know it, you need to show that the tool can generalize.
Indeed, these may be the last ones to be fired, as they can become efficient enough to do the jobs of everyone else one day.
> In AI, I see the death of this paradigm. It’s already real, it’s already here, and it’s going to fundamentally change the way we approach systems design and operation in the future.
How is AI analyzing some data the "end of observability as we know it"?
Who are you trying to convince with this? It’s not going to work on investors much longer, it’s mostly stopped working on the generically tech-inclined, and it’s never really worked on anyone who understands AI. So who’s left to be suckered by this flowery, desperate prose? Are you just trying to convince yourselves?
If the abstractions hide complexity so well you need an LLM to untangle them later, maybe you were already on the wrong track.
Hiding isn't abstracting, and if your system becomes observable only with AI help, maybe it's not well designed, just well obfuscated. I've written about this before here: https://www.bugsink.com/blog/you-dont-need-application-perfo...
It identified them as a result of load testing - they were isolated to a single user, against a single endpoint (checkout), the user-agent was python/requests, the shopping cart quantity was unusually large, and they had the 'ramp up then stop' shape of a load test.
I find this reading of the history of OTel highly biased. OpenTelemetry was born as the merge of OpenCensus (initiated by Google) and OpenTracing (initiated by LightStep):
https://opensource.googleblog.com/2019/05/opentelemetry-merg...
> The seed governance committee is composed of representatives from Google, Lightstep, Microsoft, and Uber, and more organizations are getting involved every day.
Honeycomb has for sure had valuable code & community contributions and championed the technology adoption, but they are very far from "leading the way".
Datadog by contrast seems to be driven by marketing and companies having a "observability" checkbox to tick.
While it's much more efficient, sometimes I worry that, even though AI makes problem-solving easier, we might be relying too much on these tools and losing our own ability to judge and analyze.
We somehow forget that none of these systems are better than expert humans. If you rely on a tool, you might never develop the skills. Some skills are worth more than others. You won’t even have the experience to know which ones, either.
However, many companies will just care about how fast you deliver a solution, not about how much you are learning. They do not care anymore.
The speed of the productive process is critical to them in many jobs.
Out of curiosity, how do you do that? I have no experience with this tool, nor would I ever have thought to use it for infra, but you made me curious.
Furthermore, while graphing and visualization are definitely tough, complex parts about observability, gathering the data and storing it in forms to meet the complex query demands are really difficult as well.
Observability will "go away" once AI is capable of nearly flawlessly figuring everything out itself, and at that point AI will be capable of nearly anything, so the "end of observability" is the end of our culture as we know it (probably not extinction, but more like culture will shift profoundly, and probably painfully).
AI will definitely change observability, and that's cool. It already is, but has a long way to go.
We're almost certain to see a new agentic layer emerge and become increasingly capable for various aspects of SRE, including observability tasks like RCA. However, for this to function, most or even all of the existing observability stack will still be needed. And as long as the hallucination / reliability / trust issues with LLMs remain, human deep dives will remain part of the overall SRE work structure.
https://www.datadoghq.com/blog/dash-2025-new-feature-roundup...
The devil seems to be in the details, but you’re running a whole bunch more compute for anomaly detection and “sub-second query performance, unified data storage”, which again sounds like throwing enormous amounts of additional money at the problem. I can totally see why this is great for Honeycomb though; they’re going to make bank.
You fix these issues or you tune your alert system to make it clear that they aren't actionable. Otherwise you end up turning them off so your system doesn't turn into the boy who cried wolf (or worse, teams learn to ignore it and it becomes useless).
Bayesian filters and basic derivative functions (think math) can do a lot to tame output from these systems. These aren't "product features", so in most orgs they don't get the attention they need or deserve.
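A toy sketch of the derivative half of that idea (thresholds are illustrative, and the Bayesian-filter half is left out): smooth the raw signal with an EWMA and only alert on a sustained slope, so a single spike doesn't page anyone.

    # Toy post-processing for a noisy alert signal: exponentially weighted moving
    # average plus a discrete-derivative check with a short "patience" window.
    def tame(samples, alpha=0.2, slope_threshold=5.0, patience=2):
        # Yields (value, ewma, alert); alert only after `patience` consecutive steep rises.
        ewma, prev, streak = None, None, 0
        for x in samples:
            ewma = x if ewma is None else alpha * x + (1 - alpha) * ewma
            slope = 0.0 if prev is None else ewma - prev  # derivative of the smoothed signal
            streak = streak + 1 if slope > slope_threshold else 0
            yield x, ewma, streak >= patience
            prev = ewma

    noisy = [10, 12, 11, 90, 10, 12, 40, 70, 95, 120, 150]  # one spike, then a real ramp
    for value, smoothed, alert in tame(noisy):
        print(f"value={value:4d}  ewma={smoothed:6.1f}  alert={alert}")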
This is basically every team I've worked with. Product wants new features, and doesn't want to spend on existing features. Hurry up and write new stuff! Ignore problems and they'll go away!
Also: I've already reported this bug! Why haven't the developers fixed it yet?
Where's the extra expense here? The $0.60 he spent on LLM calls with his POC agent?
In terms of _identifying the problems_, shoving all your data into an LLM to spot irregularities would be exceptionally expensive vs traditional alerting, even though it may be much more capable at spotting potential issues without explicit alerting thresholds being set up.
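Back-of-envelope, with every number below being an illustrative assumption rather than a real price or a real workload:

    # Illustrative assumptions only -- not real prices, not a real telemetry volume.
    log_lines_per_day = 50_000_000        # assumed raw telemetry volume
    tokens_per_line = 50                  # assumed average tokens per log line
    usd_per_million_input_tokens = 1.00   # assumed LLM input price

    daily_tokens = log_lines_per_day * tokens_per_line
    llm_cost_per_day = daily_tokens / 1_000_000 * usd_per_million_input_tokens

    print(f"{daily_tokens:,} tokens/day -> ${llm_cost_per_day:,.0f}/day just to read the logs")
    # A static threshold check over the same stream is effectively free by comparison.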
But even with Honeycomb, we are sitting on an absolute mountain of telemetry data, in logs, in metrics, and in our trace indices at Honeycomb.
We can solve problems by searching and drilling down into all those data sources; that's how everybody solves problems. But solving problems takes time. Just having the data in the graph does not mean we're near a solution!
An LLM agent can chase down hypotheses, several at a time, and present them with collected data and plausible narratives. It'll miss things and come up with wild goose chases, but so do people during incidents.
This execution gap is common in these LLM blog posts (ex. the LLM can write a login function in Swift, so the author claims LLMs will be writing production ready iOS apps tomorrow). The common thread I see is that getting from the point of "explain why this endpoint is slow" to the point of "An LLM agent can chase down hypotheses, several at a time, and present them with collected data and plausible narratives." is a lot more expensive than 60 cents - it's months of paying for an enterprise license where this feature is promised to be "just around the corner"
There seem to be two schools of thought: just enough to tell something is wrong but not what it is, OR you get to drink from the firehose. And most orgs go from the first to the second.
As to where, well, that's at the hardware/VM/container level, mirroring and extending what it does. Nothing worse than 20 different ideas of how to log and rotate, and trying to figure out who did what, when, where, and why. If you can't match a log entry to a running environment... well.
I weep quietly inside when some or all of this goes through one, or several S3 buckets for no good reason.