Sure, we can use an LLM, but for now I can click around faster (if those breadcrumbs exist) than it can reason.
Also, the LLM would only point in a direction, and I’m still going to have to use the UI to confirm.
What a good business strategy!
I could post this comment on 80% of the AI application companies today, sadly.
(Get AI to do stuff (you can already do) with a little work and some experts in the field).
(Get AI to do stuff (you can already do with a little work and some experts in the field)).
Skip to the summary section titled "Fast feedback is the only feedback" and its first assertion:
> ... the only thing that really matters is fast, tight feedback loops at every stage of development and operations.
This is industry dogma, generally considered "best practice", and it sets up the subsequent straw man: "AI thrives on speed—it'll outrun you every time."
False."AI thrives" on many things, but "speed" is not one of them. Note the false consequence ("it'll outrun you every time") used to set up the the epitome of vacuous sales pitch drivel:
> To succeed, you need tools that move at the speed of AI as well.
I hope there's a way I can possibly "move at the speed of AI"...
> Honeycomb's entire modus operandi is predicated on fast feedback loops, collaborative knowledge sharing, and treating everything as an experiment. We’re built for the future that’s here today, on a platform that allows us to be the best tool for tomorrow.
This is as subtle as a sledgehammer to the forehead. What's even funnier is the lame attempt to appear objective after all of this:
> I’m also not really in the business of making predictions.
Really? Did the author read anything they wrote before this point? I do believe the veil is at best "thin." Perhaps I was being too generous given the post starts with:
> New abstractions and techniques for software development and deployment gain traction, those abstractions make software more accessible by hiding complexity, and that complexity requires new ways to monitor and measure what’s happening. We build tools like dashboards, adaptive alerting, and dynamic sampling. All of these help us compress the sheer amount of stuff happening into something that’s comprehensible to our human intelligence.
> In AI, I see the death of this paradigm. It’s already real, it’s already here, and it’s going to fundamentally change the way we approach systems design and operation in the future.
Maybe I should have detected the utterly condescending phrase, "something that’s comprehensible to our human intelligence." It'd be less bad if the tool came to a conclusion, then looked for data to disprove that interpretation, and then made a more reliable argument or admitted its uncertainty.
One I use with ChatGPT currently is:
> Prioritize substance, clarity, and depth. Challenge all my proposals, designs, and conclusions as hypotheses to be tested. Sharpen follow-up questions for precision, surfacing hidden assumptions, trade offs, and failure modes early. Default to terse, logically structured, information-dense responses unless detailed exploration is required. Skip unnecessary praise unless grounded in evidence. Explicitly acknowledge uncertainty when applicable. Always propose at least one alternative framing. Accept critical debate as normal and preferred. Treat all factual claims as provisional unless cited or clearly justified. Cite when appropriate. Acknowledge when claims rely on inference or incomplete information. Favor accuracy over sounding certain.
And what type of questions do you ask the model?
Thanks for sharing
I ask a wide variety of things, from what a given plant is deficient in based on a photo, to wireless throughput optimization (went from 600 Mbps to 944 Mbps in one hour of tuning). I use models for discovering new commands, tools, and workflows, interpreting command output, and learning new keywords for deeper and more rigorous research using more conventional methods. I rubber duck with it, explaining technical problems, my hypotheses, and iterating over experiments until arriving at a solution, creating a log in the process. The model is often wrong, but it's also often right, and used the way I do, it's quickly apparent when it's wrong.
I've used ChatGPT's memory feature to extract questions from previous chats that have already been answered to test the quality and usability of local models like Gemma3, as well as to craft new prompts in adjacent topics: prompts that are high-leverage, compact, and designed to trip up models that are underpowered or over-quantized. For example (a small harness sketch follows the list):
>> "Why would toggling a GPIO in a tight loop not produce a square wave on the pin?"
> Tests: hardware debounce, GPIO write latency, MMIO vs cache, bus timing.
>> "Why is initrd integrity important for disk encryption with TPM sealing?"
> Tests: early boot, attack surface, initramfs tampering vectors.
>> "Why would a Vulkan compute shader run slower on an iGPU than a CPU?"
> Tests: memory bandwidth vs cache locality, driver maturity, PCIe vs UMA.
>> "Why is async compute ineffective on some GPUs?"
> Tests: queue scheduling, preemption granularity, workload balance.
>> "Why might a PID loop overshoot more when sensor update rate decreases?"
> Tests: sample rate vs. loop dynamics, feedback phase lag, derivative-term estimation.
>> "How can TCP experience high latency even with low packet loss?"
> Tests: delayed ACK, bufferbloat, congestion control tuning.
>> "How can increasing an internal combustion engine's compression ratio improve both torque and efficiency?"
> Tests: thermodynamics, combustion behavior, fuel octane interaction.
>> "How can increasing memory in a server slow it down?"
> Tests: NUMA balancing, page table size, cache dilution.
>> "Why would turning off SMT increase single-threaded performance on some CPUs?"
> Tests: resource contention, L1/L2 pressure, scheduling policy.
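A minimal harness sketch for running probes like these, assuming a local Ollama-style endpoint at http://localhost:11434/api/generate and a gemma3 model tag (the endpoint, the tag, and the trimmed prompt list are illustrative assumptions, not an exact setup):

    # Hypothetical probe harness: run a few of the compact "trip-up" prompts above
    # against a locally served model and dump the answers for manual review.
    import json
    import urllib.request

    PROBES = [
        "Why would toggling a GPIO in a tight loop not produce a square wave on the pin?",
        "Why is initrd integrity important for disk encryption with TPM sealing?",
        "How can TCP experience high latency even with low packet loss?",
        "How can increasing memory in a server slow it down?",
    ]

    def ask(model, prompt, url="http://localhost:11434/api/generate"):
        # Single non-streaming completion request against an Ollama-style API.
        payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
        req = urllib.request.Request(url, data=payload,
                                     headers={"Content-Type": "application/json"})
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)["response"]

    for probe in PROBES:
        print(f"## {probe}\n{ask('gemma3', probe)}\n")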
It's the same effect LLMs are having on everything, it seems. They can help you get faster at something you already know how to do (and help you learn how to do something!), but they don't seem to outright replace any particular skill.
> 2. and help you learn how to do something!
This is the second time I heard this conclusion today. Using inference to do 2. and then getting a superpower in doing 1. is probably the right way to go forward.
Which takes me to the question: who would you hire?
1. an expert for salary X, or
2. a not-so-expert for 0.4x salary + an AI tool that can do well enough?
I suspect that, unless there are specific requirements, 2. tends to win.
I think it's more likely that people will learn far less. AI will regurgitate an answer to their problem and people will use it without understanding or even verifying if it is correct so long as it looks good.
All the opportunities a person would have to discover and learn about things outside of the narrow scope they initially set their sights on will be lost. Someone will ask AI to do X and they get their copy/paste solution so there's nothing to learn or think about. They'll never discover things like why X isn't such a good idea or that Y does the job even better. They'll never learn about Z either, the random thing that they'd have stumbled upon while looking for more info about X.
I call it "The Charity Majors effect".
With LLMs trained on the most popular tools out there, this gives IT teams short on funds or expertise the ability to finally implement “big boy” observability and monitoring deployments built on more open frameworks or tools, rather than yet-another-expensive-subscription.
For usable dashboards and straightforward observability setups, LLMs are a kind of god-send for IT folks who can troubleshoot and read documentation, but lack the time for a “deep dive” on every product suite the CIO wants to shove down our throats. Add in an ability to at least give a suggested cause when sending a PagerDuty alert, and you’ve got a revolution in observability for SMBs and SMEs.
I’m in a team of two with hundreds of bare metal machines under management - if issues pop up it can be stressful to quickly narrow your search window to a culprit. I’ve been contemplating writing an MCP to help out with this, the future seems bright in this regard.
Plenty of times, issues have been present for a while before creating errors as well. LLMs, again, can help with this.
The first problem is out of reach for LLMs; the other two have been trivial with convolutional NNs for a long time.
> extensive dev environments to validate changes, and change controls to ensure you don’t torch production
Are out of scope for observability.
I used the word correctly for the environments I was describing. They’re just not accurate for larger, more siloed orgs.
For example, I wrote my own MCP server in Python that basically makes it easy to record web browser activities and replay them using playwright. When I have to look at logs or inspect metrics, I record the workflow, and import it as an MCP tool. Then I keep a prompt file where I record what the task was, tool name, description of the output, and what is the final answer given an output.
So now, instead of doing the steps manually, I basically just ask Claude to do things. At some point, I am going to integrate real time voice recording and trigger on "Hey Claude" so I don't even have to type.
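A minimal sketch of what such a server can look like, assuming the FastMCP helper from the official MCP Python SDK and Playwright's sync API; the tool name and JSON recording format are made up for illustration, not the parent's actual code:

    # Sketch only: expose a recorded browser workflow as an MCP tool.
    # Assumes `pip install mcp playwright` and a JSON recording of steps such as
    # {"action": "goto", "url": ...}, {"action": "click", "selector": ...}.
    import json
    from mcp.server.fastmcp import FastMCP
    from playwright.sync_api import sync_playwright

    mcp = FastMCP("browser-replay")

    @mcp.tool()
    def replay_workflow(recording_path: str) -> str:
        """Replay a recorded browser workflow and return the final page's visible text."""
        with open(recording_path) as f:
            steps = json.load(f)
        with sync_playwright() as p:
            page = p.chromium.launch(headless=True).new_page()
            for step in steps:
                if step["action"] == "goto":
                    page.goto(step["url"])
                elif step["action"] == "click":
                    page.click(step["selector"])
                elif step["action"] == "fill":
                    page.fill(step["selector"], step["value"])
            return page.inner_text("body")  # handed back to the model as the tool result

    if __name__ == "__main__":
        mcp.run()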
The only thing I wish someone would do is basically make a much smaller model with limited training only on things related to computer science, so it can run at high resolution on a single Nvidia card with fast inference.
I've been trialing a different product with the same sales pitch. It tries to RCA my incidents by correlating graphs. It ends up looking like this page[1], which is a bit hard to explain in words, but both obvious and hilarious when you see it for yourself.
It's even worse if you just eyeball a graph. If something changes over time, you need to use appropriate measures.
People want so badly to have an "objective" measure of truth that they can just pump data into and get a numeric result that "doesn't lie". r², p < 0.05, χ2, etc. It's too far to say these numbers aren't useful at all -- they are -- but we're just never going to be able to avoid the difficult and subjective task of interpreting experimental results in their broader context and reconciling them with our pre-existing knowledge. I think this is why people are so opposed to anything Bayesian: we don't want to have to do that work, to have those arguments with each other about what we believed before the experiment and how strongly. But the more we try to be objective, the more vulnerable we are to false positives and spurious correlations.
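A toy illustration of that last point (synthetic data, nothing from a real system): two completely independent random walks will routinely show a "strong" correlation on the raw series, and the illusion disappears once you difference them.

    # Two independent random walks -- think of two unrelated metrics that both
    # drift over time -- often produce a large |r| despite having no relationship.
    import numpy as np

    rng = np.random.default_rng(0)
    a = np.cumsum(rng.normal(size=2000))  # metric A: pure random walk
    b = np.cumsum(rng.normal(size=2000))  # metric B: independent random walk

    r_raw = np.corrcoef(a, b)[0, 1]                     # frequently "impressively" large
    r_diff = np.corrcoef(np.diff(a), np.diff(b))[0, 1]  # near zero, as it should be

    print(f"raw series:         r = {r_raw:+.2f}")
    print(f"differenced series: r = {r_diff:+.2f}")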
It's basically the same flow as when you use AI for programming. Except you need to constrain the domain of the specifications more and reason more about how to allow the AI to recover from failing specifications if you don't want to force the user to learn your specification language.
If some AI tool outstrips the ability for a human to be in the decision loop, then that AI tool's usefulness is not so great.
OK.
And their butt needs to be on the line if the answer is wrong
OK.
I'm not seeing the problem.
The past has shown that this is not the case.
In addition, there are fewer farmers than there used to be, despite demand not having been static.
As for fewer farmers, that is exactly it - those who would have been farmers would be required to acquire new skills or pursue something other than farming. Bringing this back to AI - artists, writers and programmers who get displaced will need to adapt. In the long term, the massive decrease in costs of production of various "creative" endeavours will produce new industries, new demand and increase overall wealth - even if it is not shared evenly (in the same sense that past technological leaps are also not shared evenly).
Don't get me wrong, I don't like AI either and it's only a matter of time before my day job goes from "rewrite this (web) app from 5-10 years ago" to "rewrite this AI assisted or generated (web) app from 5-10 years ago". But I don't think it's going to cost that many jobs in the long run.
That seems optimistic. If it all goes according to plan, you won't get to write or rewrite anything anymore. You'll just be QA, reviewing AI output. If you find something wrong you don't get to correct it; nobody will pay for coders. You'll just be telling the AI about the problem so it can try to correct it until the code passes and the hallucinations are gone. That way they can pay you much less while you help train the AI to get better and better at doing your old job.
That will depend on whether there are many projects with a good outlook sitting around.
Sadly, the current environment does not reflect that in my experience. There is a vicious focus on keeping profit margins at a steady rate at all costs while slashing spend on tooling which requires re-work on solved problems. :/
At some point the music is going to stop and it's not going to be pretty I suspect. :(
That's why you need to keep an eye out, and smell whether the management understands it or not. Plan to leave, as your value contribution will not give you back the reward that such contributions deserve in this type of organization.
NB: Firing 20% of employees requires a 25% increase in efficiency from everyone who remains, by the simple math: the same output spread over 0.8× the people means each must do 1/0.8 = 1.25× as much.
Step 1. Design system that gets push back because a bunch of things appear to be buzzwords
Step 2. Force it through politically
Step 3. Quit with shiney buzzword on CV
Step 4. Narrowly avoid being around when shit hits the fan.
I find developers are usually much more concerned about it working well because it ends up being their baby. Not always of course, but more often than architects that don't actually have to do the work.
You're supposed to read the output critically and you're still responsible for it. AI or not AI.
For observability I find most apps fit in this category. They are complex, they usually have UX that is so bad it makes me rage and I don't use most of their deep level features very often.
I think Jira could also benefit. Its UX is so bad it borders on criminal.
The hallucination issue can be worked around by providing a trace that demonstrates the agent's working (i.e. what tools it called with what parameters).
And this is (in my opinion) an intractable problem - You can get the AI to list the tools/parameters it used, but then you can't be sure that it hasn't just hallucinated parts of that list as well, unless you both understand that they were the right tools and right parameters to use, and run them yourself to verify the output. And at that point you might as well just have done it yourself in the first place.
I.e. if you can't trust the AI, you can't trust the AI to tell you why you should trust the AI.
£50k full stack dev in London? You're going to get AI slop these days, particularly the moment you encourage usage within the org. LLMs will help underpaid people act their wage. You've already asked them to do 2 or 3 people's jobs with "full stack"
Will you pay £20k more for someone who won't use AI?
(But yes, salaries in the UK are low. £20k is a lot to a lot of people)
I either:
a. spend money on a person, plus money and time to specify the business problem to that person and to judge the results, or
b. skip the person and just use the AI itself.
The difference between the two is that I don't have some metaphorical hostage to execute in "b". If the task is trivial enough, I don't need a hostage.
AI can’t solve everything, but never using it isn’t the right answer.
Saying things like this in isolation is silly because it just shifts blame rather than creating an environment of teamwork and rigor. What I got out of the article is that AI helps humans get to answers faster. You still need to verify that the answers are correct.
Well, that is the "vibe" at least. We don't really know for sure. (Maybe ask ChatGPT?)
It feels like another example of regression toward the mean in our society. At the same time, this creates a valuable opportunity for diligent, detail-oriented professionals to truly stand out.
I'm actively aware of the regression towards the mean and discuss that with my peers frequently. It helps me prevent atrophy of my skills while reaping the benefits of AI. Put another way, there are people out there using AI to punch well above their weight while not actually being a good fighter in the first place. If you're a good fighter are you going to let an inexperienced fighter step into the rings that you're meant to step into?
(c) guy on whose website you’re writing this
Those that blindly post "@grok please help" or use AI for everything will likely experience some level of atrophy in their critical-thinking skills. I would advise those people to think about this and adjust course.
we are quickly approaching the world where you financially can't afford to pay anyone any salary...
Is that too extreme to be real?
Of course the technology will remain. You'll still be able to run models locally (but they won't be as good). And eventually someone will work out how to make the data-centres turn a profit (but that won't be cheap for users).
Or maybe the local models will get good enough, and the data-centres will turn out to be a gigantic white elephant.
AI is definitely not going away. But I feel it could be like how it was in 90s, when people can get their hands on a computer and do amazing things. Yes, I know corporate existed then, but I think it was not this direct, and sinister as it is now.
Many of the corporates trying to get rid of employees in favor of AI might lose huge ground to someone who keeps the old-school way going while leveraging AI in a more practical way. General people will soon realize it's not what they're being fed and revert, but it's not that easy for the big ones to roll back.
Which might end corporate control over many of these things.
This is what I meant, but I am sleepy and tired, failed to articulate properly, hopefully the gist of it was clear.
The customer hates cookies, bloated websites, trackers, popups, pointless UI rearrangements, autoplay videos and clickbait. Classical enshittification.
When the customer puts on his suit and tie and goes into the office, he'll pay software devs to get these features into the product.
I don't see why AI would be any different.
Gatekeeping on distribution is unbelievable. Getting something to work financially requires marketing and "white pass from gatekeepers" expenditures, which eat away any margins you may have.
If you get laid off by big tech (no matter your years of experience), chances are you are going to be doing DoorDash and living in a tent.
That is one way of putting it.
Another way to put it is that app stores are so saturated with N versions of the same God damned app doing exactly the same God damned thing that even when they start to charge a gatekeeping fee you still get market saturation.
Will you, though? I mean, working on software for a living means having someone paying you to do something. If a corporation with an established business and reliable revenue can't justify paying you for your work, who do you expect to come in and cover your rent?
Not really. They pay you to deliver something, but you also need to coordinate and interact with people. That involves coordinating and syncing.
There is absolutely no position or role whatsoever, either in a big corporation or small freelancer gig, that you do not need to coordinate and interact with people.
Pray tell, how do you expect to be paid while neither delivering results nor coordinating with anyone else?
I used to think the same way. But life taught me that responsibility is a fairy tale only for the bottom 90% — a story told to keep them obedient, hardworking, and self-blaming. Meanwhile, the top 10% rewrite the rules, avoid accountability, and externalize blame when things go wrong. And if you’re not in that elite circle, you either pay the price for their mistakes — or you’re told there’s just nothing you can do about it.
Because those narratives play an important role in the next outcome.
The error is when you expect them to play for your team. Most people will (at best) be on the same team as those they interact with directly on a typical day. Loyalty 2-3 steps down a chain of command tends to be mostly theoretic. That's just human nature.
So what happens when the "#¤% hits the fan is that those near the top take responsibility for themselves, their families, and their direct reports and managers first. Meaning they externalize damage to elsewhere, which would include "you and me".
Now this is baseline human nature. Indeed, this is what natural empathy dictates, because empathy as an emotion is primarily triggered by those we interact with directly.
Exceptions exist. Some leaders really are idealists, governed more by the theories/principles they believe in than the basic human impulses.
But those are the minority, and this may even be a sign of autism or something similar, where empathy for oneself and one's immediate surroundings is disabled or toned down.
I think you're missing the big picture. You're focusing on your conspiracy theory that the hypothetical 10% are the only ones valuing responsibility as a manipulation tactic. However, what do you do when anyone, be it a team member or a supermarket or an online retailer or a cafe employee, fails to meet your bar of responsibility? Do you double down on their services, or do you adapt so that their failure no longer affects your day?
You've put together this conspiracy that only the higher-ups value responsibility, and only as a crowd-control strategy. If you take a step back and look at your scenario, you'll notice that it is based on how you are in a position where you are accountable to them. Is that supposed to mean that middle management is not accountable for anything at all? Is no mom-and-pop shop accountable for anything? Why is Elon Musk crying crocodile tears over how unfair the world is, over having been kicked out of DOGE, and over the world no longer buying Tesla cars?
Richard Fuld (Lehman Brothers) walked away with 500-1000 million USD after causing the worldwide financial crisis.
Trump bankrupted multiple corporations, impacting the lives of thousands of workers and their families. He is president now.
RFK Junior's budget cuts stopped important work on HIV and cancer research, delaying finding a cure or better treatment, which will cause pain and suffering for millions of people. He just sacked the entire vaccine committee. He still is secretary of health.
I would go on, but my stomach ulcer is already flaming up again.
The only people who face the consequences of their actions are you and me. The wealthy shit on your morals of taking responsibility. That's not a conspiracy theory.
Like five or so out of a few hundred. IIRC that's better than average, which would mean he saved more jobs than he lost.
I am sure at some point you have some power too. Just use that moment to make up the difference.
For the good of society and human kind, don't give in up front.
1. As an additional layer of checks, to find potential errors in things I have been doing (texts, code, ideas). This works well for the most part.
2. As a mental sparring partner, where I explore certain ideas with me being the one to guide the LLM instead of the other way around. This is much more about exploring thoughts than it is about the LLM providing anything I actually use anywhere.
3. As an additional tool to explore new bodies of text and code
But of course it makes me wonder how people who don't yet know their shit will fare. I once had an LLM tell a student an electrical lie that would very likely have caused a fire, where the LLM got the math wrong in exactly the opposite direction of how it works in reality.
But for an alert on a deployment that has been out for a while, there is a bit of a decision to be made as to whether to roll back, scale up, etc. Sometimes rolling back achieves nothing other than increasing the delta again when you roll forward, and delaying stuff.
These incentive structures now give tools for mediocre bullshitters to bullshit their way through and indirectly promote proliferation of these tools.
This scares me a lot. I'm a software consultant and I see my software and solutions being appropriated by bullshitters from inside the company. I don't know what to expect from the future anymore.
Historically, "leadership" at organizations haven't cared about objective truths when those truths conflict with their desires. So why would they care about what a hyped up gradient descent has to say?
I'm not a huge fan of AI stuff, but the output quality is (usually) above that of what BSers were putting out.
While I still need to double-check my BS team members, the problems with the code they are pushing are fewer than they were before AI was everywhere. To me, that's a win.
I guess what I'm saying is I'd rather have mediocre AI code written than the low-quality code I saw before LLMs became as popular as they are.
On more abstract things, I think it has to have intentional filters to not follow you down a rathole like flat-earth doctrine if you match the bulk of opinion among verbose authors on a subject. I don't see the priority for adding those filters being recognized on apolitical, STEM-oriented topics.
It's why code reviews remain important.
Now, watch this drive!
This presumes that the human mandating tool use, the human making the mistake, and the human held responsible are one and the same.
Yes, we have bigger problems.
I think this also applies to current AI solutions and I think that's precisely why the best workers will be humans who both use AI and will put their accountability at stake.
One thing I wanted to do, was setup an AI as CEO and board member of a small company. Then, all pay for contracts I do would flow through that entity.
This was going to be simple algo AI, with pre-scripted decisions including "hire an expert for unknown issues" then "vote yes to expert's solution". The expert always being me, of course, under contract.
I'd take a much lower hourly rate for normal tech work, and instead keep capital in the business. The business would buy an old warehouse, have servers, and a guard.
I'd be hired as the guard too.
You can hire guards with the requirement they sleep on site. In such a case the lodging isn't income, as it is required for the job.
So I could take a very low hourly rate, and maybe options in shares.
And of course the car would be corporate leased, and all that.
This is a perfectly legit setup in my tax jurisdiction, except for the AI part. Back then, I had two goals.
The first was a way to amusingly conceal where I live. It bugs me that I have to disclose this. The second was to keep as much of my earnings in a corporation, and the mortgage also gave the corporation an asset with increasing value.
But my name would not be on the registration for the company as a board member. Or CEO.
And in fact my relationship to the company would be under contract as a sole proprietor, so no need to file certain employee laden paperwork.
Upon retirement, I could cash out the company in a variety of ways, after exercising my options.
(The company would also be required to help with a bridge loan for this)
So maybe IBM is wrong?
(Just an amusing edge case)
But I could be wrong.
The realistic scenario for a well-functioning organization is not to replace their entire DevOps team with AI agents and then face the hard reality that their entire tech infra is on fire, but to actually use these as a tool that makes their best DevOps person more efficient and makes their DevOps resources much leaner. This increases productivity for the same cost. Even though this sounds grim for the future of work, this is what I feel is going to happen.
Anyhow, perhaps looking at what tools some are using could become a better and better indicator of who is the responsible type and who is not, when we give away our hard-earned money for something we need and want to receive in exchange.
AI doesn't break anything. The responsibility is exercised by upper leadership. And current leadership is high on AI crystal meth like an NYC subway junkie.
Given the prevalence of outsourcing to random remote companies, this is a moot point for many corporations.
Give us $X/year for our tool that makes your employee "more efficient" (more fungible tbh). Subtract that $X/year from the salary you pay.
That's their big bet.
What's that got to do with AI exactly? Sure, there's the chance that this AI thing will disappear tomorrow and they won't be able to do anything, but so too is the chance that Stack Overflow disappears.
Without reliability, nothing else matters, and this AI that can try hypotheses so much faster than me is not reliable. The point is moot.
As I understand it, this is a demo they already use and the solution is available. Maybe it should've been a contrived example so that we can tell the solution was not in the training data verbatim. Not that what the LLM did isn't useful, but if you announce the death of observability as we know it, you need to show that the tool can generalize.
Indeed, these may be the last ones to be fired, as they can become efficient enough to do the jobs of everyone else one day.
> In AI, I see the death of this paradigm. It’s already real, it’s already here, and it’s going to fundamentally change the way we approach systems design and operation in the future.
How is AI analyzing some data the "end of observability as we know it"?
Who are you trying to convince with this? It’s not going to work on investors much longer, it’s mostly stopped working on the generically tech-inclined, and it’s never really worked on anyone who understands AI. So who’s left to be suckered by this flowery, desperate prose? Are you just trying to convince yourselves?
If the abstractions hide complexity so well you need an LLM to untangle them later, maybe you were already on the wrong track.
Hiding isn't abstracting, and if your system becomes observable only with AI help, maybe it's not well designed, just well obfuscated. I've written about this before here: https://www.bugsink.com/blog/you-dont-need-application-perfo...
It identified them as a result of load testing - they were isolated to a single user, against a single endpoint (checkout), the user-agent was python/requests, the shopping cart quantity was unusually large, and they had the 'ramp up then stop' shape of a load test.
I find this reading of the history of OTel highly biased. OpenTelemetry was born as the merge of OpenCensus (initiated by Google) and OpenTracing (initiated by LightStep):
https://opensource.googleblog.com/2019/05/opentelemetry-merg...
> The seed governance committee is composed of representatives from Google, Lightstep, Microsoft, and Uber, and more organizations are getting involved every day.
Honeycomb has for sure had valuable code & community contributions and championed the technology adoption, but they are very far from "leading the way".
Datadog by contrast seems to be driven by marketing and companies having a "observability" checkbox to tick.
While it's much more efficient, sometimes I worry that, even though AI makes problem-solving easier, we might be relying too much on these tools and losing our own ability to judge and analyze.
We somehow forget that none of these systems are better than expert humans. If you rely on a tool, you might never develop the skills. Some skills are worth more than others. You won’t even have the experience to know which ones, either.
However, many companies will just care about how fast you deliver a solution, not about how much you are learning. They do not care anymore.
The speed of the productive process is critical to them in many jobs.
Out of curiosity, how do you do that? I have no experience with this tool, nor would I ever have thought to use it for infra, but you made me curious.
Furthermore, while graphing and visualization are definitely tough, complex parts about observability, gathering the data and storing it in forms to meet the complex query demands are really difficult as well.
Observability will "go away" once AI is capable of nearly flawlessly figuring everything out itself, and at that point AI will be capable of nearly anything, so the "end of observability" is the end of our culture as we know it (probably not extinction, but more like culture will shift profoundly, and probably painfully).
AI will definitely change observability, and that's cool. It already is, but has a long way to go.
We're almost certain to see a new agentic layer emerge and become increasingly capable for various aspects of SRE, including observability tasks like RCA. However, for this to function, most or even all of the existing observability stack will still be needed. And as long as the hallucination / reliability / trust issues with LLMs remain, human deep dives will remain part of the overall SRE work structure.
https://www.datadoghq.com/blog/dash-2025-new-feature-roundup...
The devil seems to be in the details, but you’re running a whole bunch more compute for anomaly detection and “sub-second query performance, unified data storage”, which again sounds like throwing enormous amounts of additional money at the problem. I can totally see why this is great for Honeycomb though; they’re going to make bank.
You fix these issues or you tune your alert system to make it clear that they aren't actionable. Otherwise you end up turning them off so your system doesn't turn into the boy who cried wolf (or worse, teams learn to ignore it and it becomes useless).
Bayesian filters and basic derivative functions (think math) can do a lot to tame output from these systems. These aren't "product features", so in most orgs they don't get the attention they need or deserve.
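A toy sketch of the derivative half of that idea (thresholds are illustrative, and the Bayesian-filter half is left out): smooth the raw signal with an EWMA and only alert on a sustained slope, so a single spike doesn't page anyone.

    # Toy post-processing for a noisy alert signal: exponentially weighted moving
    # average plus a discrete-derivative check with a short "patience" window.
    def tame(samples, alpha=0.2, slope_threshold=5.0, patience=2):
        # Yields (value, ewma, alert); alert only after `patience` consecutive steep rises.
        ewma, prev, streak = None, None, 0
        for x in samples:
            ewma = x if ewma is None else alpha * x + (1 - alpha) * ewma
            slope = 0.0 if prev is None else ewma - prev  # derivative of the smoothed signal
            streak = streak + 1 if slope > slope_threshold else 0
            yield x, ewma, streak >= patience
            prev = ewma

    noisy = [10, 12, 11, 90, 10, 12, 40, 70, 95, 120, 150]  # one spike, then a real ramp
    for value, smoothed, alert in tame(noisy):
        print(f"value={value:4d}  ewma={smoothed:6.1f}  alert={alert}")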
This is basically every team I've worked with. Product wants new features, and doesn't want to spend on existing features. Hurry up and write new stuff! Ignore problems and they'll go away!
Also: I've already reported this bug! Why haven't the developers fixed it yet?
Where's the extra expense here? The $0.60 he spent on LLM calls with his POC agent?
In terms of _identifying the problems_, shoving all your data into an LLM to spot irregularities would be exceptionally expensive vs traditional alerting, even though it may be much more capable at spotting potential issues without explicit alerting thresholds being set up.
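Back-of-envelope, with every number below being an illustrative assumption rather than a real price or a real workload:

    # Illustrative assumptions only -- not real prices, not a real telemetry volume.
    log_lines_per_day = 50_000_000        # assumed raw telemetry volume
    tokens_per_line = 50                  # assumed average tokens per log line
    usd_per_million_input_tokens = 1.00   # assumed LLM input price

    daily_tokens = log_lines_per_day * tokens_per_line
    llm_cost_per_day = daily_tokens / 1_000_000 * usd_per_million_input_tokens

    print(f"{daily_tokens:,} tokens/day -> ${llm_cost_per_day:,.0f}/day just to read the logs")
    # A static threshold check over the same stream is effectively free by comparison.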
But even with Honeycomb, we are sitting on an absolute mountain of telemetry data, in logs, in metrics, and in our trace indices at Honeycomb.
We can solve problems by searching and drilling down into all those data sources; that's how everybody solves problems. But solving problems takes time. Just having the data in the graph does not mean we're near a solution!
An LLM agent can chase down hypotheses, several at a time, and present them with collected data and plausible narratives. It'll miss things and come up with wild goose chases, but so do people during incidents.
This execution gap is common in these LLM blog posts (ex. the LLM can write a login function in Swift, so the author claims LLMs will be writing production ready iOS apps tomorrow). The common thread I see is that getting from the point of "explain why this endpoint is slow" to the point of "An LLM agent can chase down hypotheses, several at a time, and present them with collected data and plausible narratives." is a lot more expensive than 60 cents - it's months of paying for an enterprise license where this feature is promised to be "just around the corner"
There seem to be two schools of thought: just enough to tell something is wrong but not what it is, OR you get to drink from the firehose. And most orgs go from the first to the second.
As to where, well, that's at the hardware/VM/container level, mirroring and extending what it does. Nothing worse than 20 different ideas of how to log and rotate, and trying to figure out who did what, when, where, and why. If you can't match a log entry to a running environment... well.
I weep quietly inside when some or all of this goes through one, or several S3 buckets for no good reason.