For example, I heard that SAP had an 80-90% deployment failure rate back in the day, but I don't have a citable source for it.
Something to keep in mind is that ERP "failure" is frequently defined as going over budget or over schedule, even if the project ultimately completed and delivered the desired functionality.
A much smaller percentage of projects are either cancelled outright or go live and significantly fail to function as the business needs.
ERP rollouts can "fail" for lots of reasons that have nothing to do with the software. They are usually business failures. Mostly, companies end up spending so much on endlessly customizing the system to their idiosyncratic workflows that they exceed their project budgets and abandon the effort. In really bad cases like Birmingham, they go live before actually finishing setup, then lose control of their books and have to resort to hiring people to do the admin manually.
There's a saying about SAP: at some point gaining competitive advantage in manufacturing/retail became all about who could make SAP deployment a success.
This is no different from many other IT projects; most of them fail too. I think people who have never worked in an enterprise context don't realize that; it's not like working in the tech sector. In the tech industry, if a project fails, it's probably because it was too ambitious and the tech itself just didn't work well. Or it was a startup whose tech worked but couldn't find product-market fit. But in normal, mature, profitable non-tech businesses, a staggering number of business automation projects just fail for social or business reasons.
AI deployments inside companies are going to be like that. The tech works. The business side problems are where the failures are going to happen. Reasons will include:
• Not really knowing what they want the AI to do.
• No way to measure improved productivity, so no way to decide if the API spend is worth it.
• Concluding the only way to get a return is to entirely replace people with AI, and then having to re-hire them because the AI can't handle the last 5% of the work.
• Non-tech executives doing deals to use models or tech stacks that aren't the right fit or aren't good enough.
etc
This summer, I built two very sophisticated pieces of software: a financial ledger to power accrual accounting operations, and a code-generation framework that scaffolds, from a defined data model, the database, the frontend components, and everything in between.
I used ChatGPT substantially. I'm not sure how long it would have taken without generative AI; in reality, I would have just given up out of frustration or exhaustion. From the outside, any domain expert would assume at least three other people worked on these, given the pace at which they got completed.
The completion of those two was a seminal moment for me. I can't imagine how anyone, in any field of information systems, is not multiples more effective than they were five years ago. That directly affects a P&L, and I can't think of anything in my career that even comes close to that magnitude of impact.
I don't know what an AI pilot encapsulates in these orgs, and I'm sure they are massively more complex than anything I've done. But to hear that 95% of these efforts don't have a demonstrable effect is just wild.
Maybe I misunderstood this, but I took it to mean that people inside enterprises are struggling to use tools like ChatGPT. They do point out that perhaps the tools are being deployed in the wrong areas:
> The data also reveals a misalignment in resource allocation. More than half of generative AI budgets are devoted to sales and marketing tools, yet MIT found the biggest ROI in back-office automation—eliminating business process outsourcing, cutting external agency costs, and streamlining operations.
But I've seen some amazing automation done in sales and marketing that directly affected sales efficiency and reduced sales and marketing expenses.
Why tho? You used AI to make some software, but did you use AI to achieve rapid revenue acceleration?
That you used AI to build software seems tangential to whether it can increase revenues. Over the years, we've seen many technologies that didn't deliver on promises of rapidly increasing revenues despite being useful for creating software (cough OOP cough), so this new one failing to live up to expectations isn't surprising. Actually, given the history of technologies that overpromise and underdeliver on massive hype, disappointment should be the null hypothesis.
Did several domain experts tell you this or are you making it up?
> I can't imagine how anyone, in any field of information systems, is not multiples more effective than they were five years ago.
Perhaps "they are massively more complex than anything I've done"
It's an assertion shared among the eight other engineers on the project, each with ~15 years of experience in the domain. They are domain experts. This part isn't up for debate.
Regarding the use of AI in software development (which is not what the article is about), the proof of the pudding isn't in greenfield projects; it's in longer-term software evolution and legacy code. Few disagree that AI saves time for prototyping or creating a first MVP.
(How should I invest if I have this thesis?)
If you cannot tolerate false negatives, I don't see how you get around the inaccuracy of LLMs. As long as you can spot false positives and their rate is sufficiently low, they are merely an annoyance.
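To make that asymmetry concrete, here is a minimal Python sketch (all names and the stubbed extraction output are made up for illustration): a cheap downstream check can throw out false positives, but nothing downstream can recover an item the model never produced.

```python
# Hypothetical sketch: why false positives are merely an annoyance while
# false negatives are fatal if the workflow can't tolerate misses.

def llm_extract_amounts(document: str) -> list[dict]:
    """Stand-in for an LLM extraction call (stubbed for illustration)."""
    return [
        {"vendor": "Acme", "amount": 120.0, "currency": "USD"},
        {"vendor": "???", "amount": -1.0, "currency": "GBP"},  # hallucinated item: a false positive
        # A real third item in the document was never emitted: a false negative.
    ]

def looks_valid(item: dict) -> bool:
    """Cheap deterministic sanity check that filters obvious false positives."""
    return item["amount"] > 0 and item["currency"] in {"USD", "EUR"}

def process(document: str) -> list[dict]:
    candidates = llm_extract_amounts(document)
    # False positives get dropped here, so a low rate of them is tolerable.
    accepted = [c for c in candidates if looks_valid(c)]
    # False negatives are invisible: no validator can restore an item the
    # model never produced, so a workflow that cannot tolerate misses needs
    # a human or exhaustive fallback pass over the source document.
    return accepted

if __name__ == "__main__":
    print(process("...invoice text..."))
```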
I think this is a good consideration before starting a project leveraging LLMs
Genuinely curious what others have experienced, specifically those using LLMs for business workflows. That's not to say any system is perfect, but for purpose-driven data pipelines LLMs can be pretty great.
I've had pretty good success using LLMs for coding and in some ways they are perfect for that. False positives are usually obvious and false negatives don't matter because as long as the LLM finds a solution, it's not a huge deal if there was a better way to do it. Even when the LLM cannot solve the problem at all, it usually produces some useful artifacts for the human to build on.
If I as a human deploy code, it is not certain that it works either - just like with LLMs. The extent is different, however.
Makes sense. The people in charge of setting AI initiatives and policies are office people and managers who could be easily replaced by AI, but the people in charge are not going to let themselves be replaced. Salesmen and engineers are the hardest to replace, yet they aren't in charge, so they get replaced the fastest.
I don't think Joe and Jane Worker are purposely avoiding these tools to protect their jobs; everyone wants ease at work. It's just that these LLM-based AIs don't offer much outside of some use cases. AI is vastly over-hyped, and now we're in the part of the hype cycle where people are more comfortable saying to power, "This thing you love and think will raise your stock price is actually pretty terrible for almost all the things you said it would help with."
AI has its place, but it's not some kind of universal mind that will change everything and be applicable in significant, fundamentally transformative ways outside of some narrow use cases.
I'm on week 3 of making a video game (something I've never done before) with Claude/ChatGPT, and once I got past the 'tutorial level' design, these tools really struggle. Even where an LLM would naturally be successful (structured, logical languages), it's still very underwhelming. I think we're just seeing people push back on hype and feeling empowered to say, "This weird text autogenerator isn't helping me."
It's kinda like what I realized with the meta Ray-Bans: I can have these things on my face, they can tell me the answer to virtually any question in 10 seconds or less.
But I, as a human, rarely have questions to ask. When you walk into your local grocery store, you generally know what you want and where to find it. A ton of companies are just gluing LLM text boxes into apps and then scratching their heads when people don't use them.
Why?
Because the customer wasn't the user - it was their boss and shareholders. It was all done to make someone else think 'woah, they are following the trend!'.
The core issue with generative AI is that it works best when focused narrowly. There are maybe one or two really clever uses I've seen - disappointingly, one of them was in Jira. The internal jargon dictionary tool was legitimately impressive. Will it make any more money? Probably not.
Sounds like Microsoft 365 Copilot at my org. It sucks at nearly everything, but it actually makes a fantastic search engine for emails, Teams convos, SharePoint docs, etc. Much better than Microsoft's own global search. Outside of coding, that's the only other real-world use case I've found for LLMs - "get me all the emails, chats, and documents related to this upcoming meeting" - and it's pretty good at that.
Though I'm not sure we should be killing the earth for better search; there are probably other, better ways to do it.
Then the other 5% is the 'extra' it does for me, getting me details I wouldn't have even known where to find.
But it is just fancy search for me so far - though fancy search I see as valuable.
Are we, though? What I have read so far suggests the carbon footprint of training models like gpt4 was "a couple weeks of flights from SFO to NYC" https://andymasley.substack.com/p/individual-ai-use-is-not-b...
They also seem to be coming down in power usage substantially, at least for inference. There are pretty good models that can run on laptops now, and I still very much think we're in the Model T phase of this technology, so I expect further efficiency refinements. It also seems like they have recently hit a "cap" on the increase in intelligence models get for more raw power.
The trendline right now makes me wonder if we'll be talking about "dark datacenters" in the future the same way we talked about dark fiber after the dot com bubble.
It's kinda funny that some online shops are now bragging how great their customer support is because they DON'T use LLM bots xD
Wow. This just does not match my personal experience. I do an hour or so walk around the reservoir near my house 4-5 times a week, letting my mind wander freely -- and I find that I stop on average at least five or ten times to take notes about questions to learn the answers to later, and occasionally decide that it's worth it to break pace to start learning the answer right then and there.
There's a difference between asking out loud of another being vs asking yourself internally.
There's only so many questions I have the ability to answer myself. Of those, there's only so many that I have the lifespan to answer myself. We stand on the shoulders of giants, and even on the shoulders of average people -- really it's shoulders all the way down. Unless the questioning itself is the source of joy (which it certainly sometimes is), I prefer to find out what others have learned when they asked the same questions. It's vanishingly rare that I believe I'm the first to think through something.
I already have my phone, I could look up the answers immediately. The reason I don't isn't that I can't. It's that asking the question is the point, not answering it.
1. Calf (young cow, young of certain other mammals)
Old English: cealf (plural calfru or later calves)
Proto-Germanic: *kalbaz (also given as *kalbaz/*kalbazō)
Cognates: Old Norse kálfr, Old High German kalb, German Kalb, Dutch kalf.
Proto-Indo-European root: often linked to *gel- ("to swell, be rounded"), possibly referring to the rounded shape of a young animal. Some etymologists, however, leave it as "origin uncertain" beyond Proto-Germanic.
2. Calf (back of the lower leg)
Old English: caf, cealf (“calf of the leg”) — likely related to the animal term, but the link is uncertain.
Possible origin: Could be from the same *gel- ("swell") root, referring to the bulging muscle at the back of the leg, or an independent development within Germanic.
Cognates: Old Norse kálfi (“calf of the leg”), Swedish kalv (leg calf), Icelandic kálfi.
This year I snapped a pic and sent it to ChatGPT. Normal end-of-year die-off: cut the brown branches away, and here is a fertilizer schedule for the end of the year to support new growth next year.
ChatGPT makes gardening so much easier, and that is just one of many areas. Recipes are another: don't trust the math, but ChatGPT can remix and elevate recipes so much better than Google's recipe blog spam posts can.
Are you the guy that walks the poodle?
I think this highlights an interesting point: Sensible use cases are unsexy. But the pushers want stuff, however unrealistic, that lends itself to breathless hype that can be blown out of proportion.
This is an eye-opening sentence. It's quite hard to imagine how to live one's daily life with "few questions to ask." Perhaps this is a neurodivergent thing?
I would also argue that ND people seem to be the heavier AI users, at least in my experience. It's a bit like the stereotypical 'Wikipedia deep dive' but 10x.
I'll just be over here, floating (often treading water) in a raging river of "what ifs...", "I wonder ifs...", and "hmmms?"
I recall the glasses also can write on the screen inside the lens, which makes me think they may be good for deaf people as well.
It's just that these use-cases seem uncool, and big companies seem to have to be cool in order to keep either their status or their profits. But I have a feeling the technology may be really useful for some really vulnerable people.
Nobody seems to have been successful yet, and I think the focus on applying LLMs, instead of dumb UI plus a mix of dumb and ML image processing, is a large reason why.
That said, the scenarios they are good at, they are really good at. I was traveling in Europe and the glasses were translating engravings on castle walls, translating and summarizing historical plaques, and just generally letting me know what was going on around me.
I'm seeing this again and again. Customers as users seems like the last concern, if it is a concern at all. Adherence to the narrative du jour, fundraising from investors and hyping the useless product up to dump on retail are the primary concerns.
Vaporware or a useless, unlaunched product are advantageous here. Actual users might report how underwhelming or useless it is. Sky high development costs are touted as wins.
5% are succeeding. People are trying AI for just about everything right now. 5% is pretty damn good, when AI clearly has a lot of room to get better.
The good models are quite expensive and slow. The fast & cheap models aren't that great - unless very specifically fine-tuned.
Will it get good enough that the success rate of pilots grows from 5% to 25% in 5 years, or 20? Who knows, but it almost certainly will grow.
It's hard to tell how much better the top foundation models will get over the next 5-10 years, but one thing that's certain is that the cost will go down substantially for the same quality over that time frame.
Not to mention all the new use cases people will keep trying over that timeline.
If in 10 years' time AI is succeeding in 2x as many use cases, that might not justify current valuations, but it will be a much better future - and a necessary one if we're planning on having ~25% of the population retired / not working by then.
Without AI replacing a lot of jobs, we're gonna have a tough time retiring all the people we promised retirements to.
That depends on whether the AI successes relied much on the leading edge of LLM development, or whether most of the value was just "low-hanging fruit."
If the latter, that would imply the utility curve is levelling out, because new developments are not proving instrumental enough.
I'm thinking of an S curve: slow improvements through the 2010s, then a burst of activity as the tech became good enough to do something "real", followed by more gradual wins in efficiency and accuracy.
And regardless, I still see this as very positive for society - and don't care as much about whether or not this is an AI bubble.
About a mil? Maybe two? Seems realistic…
People have to invent whatever seems reasonable if you squint, given how much accumulated capital there is.
The guys with money are easy to fool. Just lie to them about your "product", get the cash, get out of the rat race, smooth sailing.
Of course easier said than done. I can’t lie this convincingly, I don’t have the con man skillset or connections.
So I’m stuck in a 9 to 5. Zzz…
We use generative imagery/video at my job and it's adding value. I see value being added for coders.
There's real innovation happening, but I find it's mostly companies cutting corners making customer service even shittier than it already was.
There's a meme that I think fits: https://i.redd.it/20rpdamxef0f1.jpeg
I think for a long time, cutting corners so that the number can go up next quarter has worked surprisingly well. Genuinely, I don't think a lot of corporations view offering a better product as a viable means of competing in the 2025 marketplace.
For them, AI is not the next industrial revolution; it's the next overseas outsourcing. AI isn't a way to bring new value to customers, it's a way to bring roughly the same value (read: worse) at a much lower cost to them. If they get their way, everything will get worse while they make more money. That's the value proposition at play here.
A lot of this must come down to execution. And there's a lot of snake oil out there at the execution layer.
https://www.youtube.com/watch?v=KX5jNnDMfxA
5% is not unexpected, as startup success rates are normally about 1 in 22 over 3 years. lol =3
> "Vaughan saw that his team was not fully on board. His ultimate response? He replaced nearly 80% of the staff within a year"
Being that this is Fortune magazine, it makes sense that they're portraying it this way, but reading between the lines a little, it seems like the staff knew what would happen and weren't keen on replacing themselves.
The article mentions that 19-20 year old founders, focused on solving single-user problems, were the successes.
The sample size is 300 public AI deployments and an undisclosed number of private in-house AI projects. And the survey seems to only consider business applications, as compared with end-user applications like media and software. That's significant but not definitive.
Isn't it more likely that these were existing problems with low-hanging fruit, perhaps with unpopular answers, that just happened to be solved by leaning on "AI"? And perhaps "AI" wasn't the key to success?
As a technically minded person but not a comp-sci guy, refining document search feels like staring into a void: every option uses different (confusing) terminology. This makes it extra difficult for me to both do my regular job AND learn the multiple names/ways to do the exact same thing across platforms.
The only solution that has had any reliability for me so far is Gemini instances where I upload only the files I wish to search and keep each instance locked to a few questions before it starts to hallucinate.
My attempt at RAG search implementation was a disaster that left me more confused than anything.
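For what it's worth, under all the conflicting jargon, every option I've looked at seems to boil down to roughly the same loop: chunk the documents, turn the chunks into vectors, pull back the chunks closest to the question, and hand only those to the model. A deliberately tiny, hypothetical sketch (the toy embed function just stands in for a real embedding model or API):

```python
import math

def embed(text: str) -> list[float]:
    """Stand-in for an embedding model; real systems call an embedding API here."""
    # Toy character-frequency vector, just to make the sketch runnable.
    return [text.lower().count(c) / max(len(text), 1) for c in "abcdefghijklmnopqrstuvwxyz"]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_chunks(question: str, chunks: list[str], k: int = 3) -> list[str]:
    """Return the k chunks most similar to the question; only these go into the prompt."""
    q = embed(question)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

if __name__ == "__main__":
    docs = ["Q3 revenue grew 12%", "Office dog walking schedule", "Vendor contract renewal terms"]
    print(top_chunks("When does the vendor contract renew?", docs, k=1))
```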
The fact that we live in an era where tech people have become so investor-pilled that overstating the capabilities of technology is basically second nature does not help.
There are very few use cases at companies where you need to generate something. You want to work with the company's often very private, disparate data (with access controls, etc.). You wouldn't even have enough data to train a custom LLM, much less use a generic one.