For example, everyone now writes emails with perfect grammar in a fraction of the time. So now the expectation for emails is that they will have perfect grammar.
Or one can build an interactive dashboard to visualize their spreadsheet and make it pleasing. Again the expectation just changed. The bar is higher.
So far I have not seen productivity increase in dimensions with a direct line of sight to revenue. (Of course there are niches such as customer service and translation services, which were already in the process of being automated.)
You do not need to build a spreadsheet visualiser tool; plenty of options already exist that are free and open source.
I'm not against advances, I'm just really failing to see what problem was in need of solving here.
The only use I can get behind is the translation, which admittedly works relatively well with LLMs in general due to the nature of the work.
I had a conversation with my manager about the implications of everyone using AI to write/summarise everything. The end result will most likely be staff getting Copilot to generate a report, then their manager using Copilot to summarise the report and generate a new report for their own manager, ad infinitum.
Eventually all context is lost, busywork is amplified, and nobody gains anything.
https://www.fool.com/investing/2024/11/29/this-magnificent-s...
Think like a forestry investor, not like someone planting a cash crop for next season.
majormajor•1h ago
Some nits I'd pick along those lines:
> For instance, according to the most recent AI Index Report, AI systems could solve just 4.4% of coding problems on SWE-Bench, a widely used benchmark for software engineering, in 2023, but performance increased to 71.7% in 2024 (Maslej et al., 2025).
Something like this should have the context of SWE-Bench not existing before November, 2023.
Pre-2023 systems were flying blind with regard to what they were going to be tested with. Post-2023 systems have been created in a world where this test exists. Hard to generalize from before/after performance.
> The patterns we observe in the data appear most acutely starting in late 2022, around the time of rapid proliferation of generative AI tools.
This is quite early for "replacement" of software development jobs since, by their own prior statement/citation, the tools even a year later, when SWE-Bench was introduced, were only hitting that 4.4% task success rate.
Its timing lines up more neatly with the post-COVID-bubble tech industry slowdown. Or with the start of hype about AI productivity vs actual replaced employee productivity.
eru•1h ago
But with progress continuing in the models, too, it's an even more complicated affair.
trhway•19m ago
ath3nd•57m ago
That's an opinion many disagree with. As a matter of fact, the only study to date, limited as it is, showed that LLM usage decreased productivity for experienced developers by roughly 19%. Let's reserve opinions and link studies.
https://metr.org/blog/2025-07-10-early-2025-ai-experienced-o...
My anecdotal experience, for example, is that LLMs are such a negative drain on both time and quality that one has to be really early in their career to benefit from their usage.
yakshaving_jgt•47m ago
black_knight•17m ago
yakshaving_jgt•4m ago
Maybe one of the reasons I'm getting good results is because the LLM effectively has to argue with GHC, and GHC always wins here.
I've found that it's a superpower also for finding logic bugs that I've missed, and for writing SQL queries (which I was never that good at).
manmademagic•43m ago
1. Converting exported data into a suitable import format based on a known schema
2. Creating syntax highlighting rules for a language not natively supported in a Typst report
Neither situation had an existing solution, and while the outputs were not exactly correct, they only needed minor adjustments.
In any other situation, I'd generally prefer to learn how to do the thing, since understanding how to do something can sometimes be as important as the result.
wahnfrieden•36m ago
It's no surprise to me that devs who are accustomed to working on one thing at a time due to fast feedback loops have not learned to adapt to parallelizing their work (something that has been demonized at agile-style organizations). Instead they sit and wait on agents and start watching YouTube, as the study found (productivity hits were due to the participants looking at fun non-work stuff instead of attempting to parallelize any work).
The study reflects usage of emergent tools without training, and with regressive training on previous generation sequential processes, so I would expect these results. If there is any merit in coordinating multiple agents on slower feedback work, this study would not find it.
ardit33•30m ago
They are not great if your tasks are not well defined. Sometimes they surprise you with great solutions; sometimes they produce a mess that just wastes your time and deviates from your mission.
To me, LLMs have been great accelerants when you know what you want and can define it well. Otherwise, they can waste your time by creating a lot of code slop that you will have to rewrite anyway.
One huge positive side effect: when you create a component (e.g. UI, feature, etc.), you often need a setup to test it (view controllers, data), which is very boring, annoying, and time-wasting to deal with. An LLM can do that for you within seconds (even creating mock data), and since this is mostly test code, it doesn't matter if the code quality is not great; it just matters to get something on the screen to test the real functionality. AI/LLMs have been a huge time saver for this part.
hochstenbach•56m ago
moi2388•18m ago
trhway•17m ago
yurishimo•6m ago
NitpickLawyer•37m ago
While this is true, there are ways to test (open models) on tasks created after the model was released. We see good numbers there as well, so something is generalising there.