No surprises here.
It always struggles on non-web projects, or on software where correctness matters above everything else, such as the dotnet runtime.
Either way, a completely disastrous start, and what a mess Copilot has caused.
I have so far only found LLMs useful as a way of researching (an alternative to web search) and for very basic rote tasks like implementing unit tests or doing a first-pass explanation of some code. I've tried actually writing code with them and it's not usable.
OTOH webdev is known for rapid framework/library churn, so before too long there will be a crossroads where the pre-AI training data is too old and the fresh training data is contaminated by the firehose of vibe coded slop.
And the quantity of JS code available/discoverable when scraping the web is larger by an order of magnitude than that of every other language.
> This seems like it's fixing the symptom rather than the underlying issue?
This is also my experience whenever you haven't set up a proper system prompt to address this for everything an LLM does. The funniest PRs are the ones that "resolve" test failures by removing/commenting out the test cases, or changing the assertions. Google's and Microsoft's models seem more likely to do this than OpenAI's and Anthropic's; I wonder if there is some difference in their internal processes that is leaking through here? (A sketch of the kind of guardrail prompt I mean is below.)
The same PR as the quote above continues with 3 more messages before the human seemingly gives up:
> please take a look
> Your new tests aren't being run because the new file wasn't added to the csproj
> Your added tests are failing.
I can't imagine how the people who have to deal with this are feeling. It's like you have a junior developer except they don't even read what you're telling them, and have 0 agency to understand what they're actually doing.
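Something like this is what I mean by a proper system prompt; a minimal sketch assuming an OpenAI-style chat API, where the model name and exact rule wording are placeholder assumptions rather than anything these vendors actually run:

```python
# Sketch only: a guardrail system prompt against the "resolve the failing
# tests by deleting them" behavior. Client and model are illustrative.
from openai import OpenAI

client = OpenAI()

GUARDRAIL_PROMPT = """\
You are a coding agent working in an existing repository.
Hard rules:
- Never delete, comment out, or skip a failing test.
- Never weaken an assertion just to make a test pass.
- If a test fails, fix the code under test, or explain why the test itself is wrong.
"""

def ask_agent(task: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o",  # placeholder; any chat-capable model
        messages=[
            {"role": "system", "content": GUARDRAIL_PROMPT},
            {"role": "user", "content": task},
        ],
    )
    return resp.choices[0].message.content
```

It obviously doesn't guarantee compliance, but in my experience an explicit "hard rules" block cuts down the test-deletion trick considerably.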
Another PR: https://github.com/dotnet/runtime/pull/115732/files
How are people reviewing that? 90% of the page height is taken up by "Check failure" annotations; you can hardly see the code/diff at all. And as a cherry on top, the unit test has a comment that says "Test expressions mentioned in the issue". This whole thing would be fucking hilarious if I didn't feel so bad for the humans who are on the other side of this.
I agree that not auto-collapsing repeated annotations is an annoying bug in the github interface.
But just pointing out that annotations can be hidden in the ... menu to the right (which I just learned).
Typically, you wouldn't bother manually reviewing something until the automated checks have passed.
I'd rather hop in and get them on the right path than let them struggle alone, particularly if they're clearly stuck.
If it's another senior developer though I'd happily leave them to it to get the unit tests all passing before I take a proper look at their work.
But as a general principle, please at least get a PR through formatting checks before assigning it to a person.
Let them finish a pull request before spending time reviewing it. That said, a merge request needs to have an issue written before it's picked up, so that the author does not spend time on a solution before the problem is understood. That's idealism though.
The earliest feedback you can get comes from the compiler. If it won't build successfully don't submit the PR.
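A sketch of what that gate could look like if PR creation were wrapped in a script; the dotnet and gh commands fit this particular repo, everything else here is an assumption:

```python
# Sketch: refuse to open a PR unless the build and tests pass locally.
# Assumes a dotnet repo and the GitHub CLI; adapt the commands as needed.
import subprocess
import sys

def run(cmd: list[str]) -> bool:
    print("+", " ".join(cmd))
    return subprocess.run(cmd).returncode == 0

def main() -> None:
    if not run(["dotnet", "build"]):
        sys.exit("Build failed; fix it before opening a PR.")
    if not run(["dotnet", "test"]):
        sys.exit("Tests failed; fix them before opening a PR.")
    # Only now hand off to PR creation.
    run(["gh", "pr", "create", "--fill"])

if __name__ == "__main__":
    main()
```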
Maybe, but more likely it is reality and their true company culture leaking through. Eventually some higher-EQ execs might come to the very late realization that they can't actually lead or build a worthwhile and productive company culture, and all that remains is an insane reflection of that.
Why do they even need it? Success is code getting merged on the first shot; failure gets worse with every request for changes the agent receives. Asking for manual feedback seems like a waste of time. Measure cycle time, rate of approvals, and change failure rate, like you would for any developer.
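Concretely, that could be as simple as the sketch below; the PR record shape is made up for illustration:

```python
# Sketch: approval rate, cycle time, and change failure rate computed from
# hypothetical PR records of the form
# {"merged": bool, "hours_open": float, "caused_incident": bool}.
from statistics import mean

def agent_metrics(prs: list[dict]) -> dict:
    merged = [p for p in prs if p["merged"]]
    return {
        "approval_rate": len(merged) / len(prs) if prs else 0.0,
        "avg_cycle_hours": mean(p["hours_open"] for p in merged) if merged else 0.0,
        # Merged PRs later reverted or implicated in an incident.
        "change_failure_rate": (
            sum(p["caused_incident"] for p in merged) / len(merged) if merged else 0.0
        ),
    }
```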
That comparison is awful. I work with quite a few junior developers, and they can be competent. They certainly don't make the silly mistakes that LLMs do, don't need nearly as much handholding, and tend to learn pretty quickly, so I don't have to keep repeating myself.
LLMs are decent code assistants when used with care, and can do a lot of heavy lifting; they certainly speed me up when I have a clear picture of what I want to do, and they are good to bounce ideas off when I am planning something. That said, I really don't see how they could meaningfully replace an intern, much less an actual developer.
It's not like a regular junior developer, it's much worse.
Nice to see that Microsoft has automated that, failure will be cheaper now.
An outsourced contractor was given a very simple job as their first task: update a single dependency, which required just a bump of the version and no code changes. After three days of seemingly struggling to even understand what they were asked to do, being unable to clone the repo, and failing to install the necessary tooling on their machine, they ended up getting fired from the project. A complete waste of money, and of the time of those of us having to delegate and review this work.
Give instructions, get good code back. That's the dream, though I think the pieces that need to fall into place for particular cases will prevent reaching that top quality bar in the general case.
Those have long been the folks I’ve seen at the biggest risk of being replaced by AI. Tasks that didn’t rely on human interaction or much training, just brute force which can be done from anywhere.
And for them, that $3/hr was really good money.
This level of smugness is why outsourcing still continues to exist. The kinds of things you talk about were rare, and were mostly exaggerated to create an anti-outsourcing narrative. None of that led to outsourcing actually going away, simply because people are actually getting good work done.
Bad quality things are cheap != All cheap things are bad.
The same will work with AI too: while people continue to crap on AI, things will only improve, people will be more productive with AI, and they will get more and bigger things done, cheaper and better. This is just inevitable given how things are going now.
>>There's a PM who takes your task and gives it to a "developer" who potentially has never actually written a line of code, but maybe they've built a WordPress site by pointing and clicking in Elementor or something.
At the peak of the outsourcing wave, both the call-center people and the IT services people had internal training and graduation standards that were quite brutal, with mad attrition rates.
Exams often went along the lines of having to write whole-ass projects without internet help in a few hours. Theory exams docked you like -2 marks for getting things wrong. Dozens of exams, projects, coding tests, on-floor internships, project interviews.
>>After dozens of hours billed you will, in fact, get code where the new file wasn't added to the csproj or something like that, and when you point it out, they will bill another 20 hours, and send you a new copy of the project, where the test always fails.

It's exactly like this.
Most IT services billing had pivoted away from hourly billing to fixed-price and time-and-materials contracts in the 2000s itself.
>>It's exactly like this.
Very much like outsourcing. AI is here to stay, man. Deal with it. It's not going anywhere. For like $20 a month, companies will have the same capability as a full-time junior dev.
This is NOT going away. It's here to stay, and it will only get better with time.
Most of this works because of price arbitrage. And continues to work that way, not just with outsourcing but with manufacturing too.
Remember those days when people were going around telling everyone Chinese products were crap? That didn't really work, and more things only got made in China.
This is all so similar to the early days of Google search; it's just that the cost of a search was low enough that finding things got easier and ubiquitous. The same is unfolding with AI now. People have a hard time believing that a big part of their thinking can be outsourced to something that costs $20/month.
How can something as good as me be cheaper than me? You are asking the wrong question. For centuries now, every decade a machine has arrived that can do something cheaper than the humans doing it at the time. It's not exactly impossible. You are only living in denial by asking this question; this is how it has worked since the day humans found ways of mimicking human work with machines. We didn't get here in a day.
Pretty sure cars are more expensive than a horse and carriage, and iPhones are/were more expensive than button phones. You can cite so many such examples: photocopying machines, cameras, wrist watches, even things like radio, television, etc.
More importantly, sometimes how you do things change. And that changes how you go about your life in a very fundamental way.
That is what the internet was about when it first came out; that's what internet search, online maps, etc. were.
AI will change how you go about living your life, in a very fundamental way.
LLMs are being made into another rental extraction system and should be viewed as such.
Basic car ownership can be quite a bit cheaper than a horse + carriage.
The horse will probably eat $10-20/day in food. $600/mo in just food costs. Not including vet bills and what not.
A decent and cheap horse will probably cost you $3k up front. Add in several thousand dollars more for the carriage.
A horse requires practically daily maintenance. A carriage will still require some maintenance.
A horse requires a good bit more land, plus the space to store the carriage. Plus all the extra time and work hitching and unhitching your horse whenever you need to go.
A horse and carriage isn't really cheaper than a cheap car and way less functional.
Most successful technologies provide multiple of these benefits. What is terrible, and the direction we are going in right now, is that these new systems (or offshoring, like we are talking about here) seem to be, or are, "less effort" but do not hit the other two axes. This is a very dangerous place to be.
People would rather be lazy than roll their sleeves up and focus, especially in our attention diverting world.
I used Upwork (when it was Elance) quite a lot in a startup I was running at the time, so I have direct experience of this, and it's _not_ a lie or "mostly exaggerated"; it was a very real effect.
The trick was always to weed out these types by posting a very limited job for a cheap amount and accepting around five or more bids across a broad range of prices in order to review the developers. Whoever was actually competent then got the work you actually wanted done in the first place. I found plenty of competent devs at competitive prices this way, but some of the submissions I got from the others were laughable. You just accept the work, pay them their small fee, and never speak to them again.
And even if it could, how do you get senior devs without junior devs? ^^
The rise in interest rates a couple of years ago triggered many layoffs in the industry. When that happens, salaries get squeezed: experienced people work for less, and juniors have trouble finding a job because they are now competing against people with plenty of experience.
Not sure how it can be read otherwise.
But the actual software part? I'm not sure anymore
I feel the same way today, but I got started around 2012 professionally. I wonder how much of this is just our fading optimism after seeing how shit really works behind the scenes, and how much the industry itself is responsible for it. I know we're not the only two people feeling this way either, but it seems all of us have different timescales from when it turned from "enjoyable" to "get me out of here".
Then one day I woke up and realized the ones paying me were also the ones using it to run over or do circles around everyone else not equipped with a bicycle yet; and were colluding to make crippled bicycles that'd never liberate the masses as much as they themselves had been previously liberated; bicycles designed to monitor, or to undermine their owner, or more disgustingly, their "licensee".
So I'm not doing it anymore. I'm not going to continue making deliberately crippled, overly complex, legally encumbered bicycles for the mind, purely intended as subjects for ARR extraction.
So nothing new? Just this/last month, the multi-select "open/close" button in the GitHub PR UI was straight up broken. No one seemed to have noticed until I opened a bug report, and it continued being broken for weeks before they finally fixed it. Not the first time I've encountered this on Microsoft properties; they seem to constantly push out broken shit, and no one seems to even notice until some sad user (like me) happens to stumble across it.
This is also shocking to me. Especially here on HN! Every tech CEO on earth is salivating over AI coding because they want it to devalue and/or replace their expensive human software developers. Whether or not that will actually happen, that's the purpose of building all of these "agentic" coding tools. And here we are, dumbass software engineers, cheerleading for and building the means of our own destruction! We downplay it with bullshit like "Oh, but AI is just a way to augment our work, it will never really replace us or lower our compensation!" Wild how excited we all are about this.
Anybody who thinks this place represents the average working- or middle-class programmer hasn't been paying much attention. They fool a lot of people by being socially liberal to go along with their economic liberalism.
this website is owned and operated by a VC, who built a fortune off exploiting these people
"workers and oppressed peoples of all countries, unite!" is the last thing I'd expect to see here
We should not forget that on the other side of this issue are equally smart and motivated people and they too are aware of the power dynamics involved. For example, the phenomena of younger programmers poo pooing experienced engineers was a completely new valuation paradigm pushed by interested parties at some point around the dotcom bubble.
Doctors with n years in the OR will not take shit from some intern that just came out of school. But we were placed in that situation at some point after '00. So the fundamental issue is that there is an (engineered, imho) generational divide, which, coupled with age discrimination in hiring (again due to interested parties' incentives), has created a situation where one side is accumulating generational wealth and power and the other side (us developers) is divided by age, and the ones with the most skin in the game are naive youngsters who have no clue and have been taught to hate on "millennials" and "old timers" etc.
I am speculating that this "AI Revolution" may lead to some revitalization of the movement as it would allow individual contributors the ability to compete on the same levels as proprietary software providers who previously had to employ legions of developers to create their software.
Yes, when your 100k quarterly RSU drop lands
Thank you. It's something I'm actively pursuing, I'm hoping to finish some chairs this spring and see if any local shops are interested in stocking them. But I'm skeptical I could find enough business to make it work full-time, pay for my family's health insurance, and so on. We'll see.
So, for experienced engineers, I see a great future fixing the shit show that is AI-code.
At what point do the human developers just give up and close the PRs as "AI garbage"? Keep the ones that work, then just junk the rest. I feel that at some point entertaining the machine becomes unbearable and people just stop doing it, or rage-close the PRs.
Microsoft's stock price is dependent on them proving that this is a success.
it's not as if Microsoft's share price has ever reflected the quality of their products
No need to specify why they are interacting with it; all engagement is good engagement.
Perhaps this explains the recent firings that affected faster CPython and other projects. While they throw money at AI and success still doesn't materialize, they need to make the books look good for yet another quarter through the old-school, reliable method of laying off people left and right.
And then, while the tech is not mature, running on delusion and sunk costs, it's actually used for production stuff. Butlerian Jihad when?
I estimate two more years for the bubble to pop.
My sophisticated sentiment analysis (talking to co-workers, other professional programmers and IT workers, plus HN and Reddit comments) seems to indicate a shift: there's a lot less storybook "Ay Eye is gonna take over the world" talk and a lot more distrust and even disdain than you'd see even 6 months ago.
Moves like this will not go over well.
https://github.com/dotnet/runtime/pull/115732#issuecomment-2...
Anyone who has dealt with Microsoft support knows this feeling well. Even talking to the higher level customer success folks feels like talking to a brick wall. After dozens of support cases, I can count on zero hands the number of issues that were closed satisfactorily.
I appreciate Microsoft eating their dogfood here, but please don't make me eat it too! If anyone from MS is reading this, please release finished products that you are prepared to support!
The feedback buttons open a feedback form modal, they don’t reflect the number of feedback given like the emoji button. If you leave feedback, it will reflect your thumbs up/down (hiding the other button), it doesn’t say anything about whether anyone else has left feedback (I’ve tried it on my own repos).
Comment in the GitHub discussion:
"...You and I and every programmer who hasn't been living under a rock knows that AI isn't ready to be adopted at this scale yet, on the premier; 100M-user code-hosting platform. It doesn't make any sense except in brain-washed corporate-talk like "we are testing today what it can do tomorrow".
I'm not saying that this couldn't be an adequate change some day, perhaps even in a few years but we all know this isn't it today. It's 100% financial-driven hype with a pinch of we're too big to fail mentality..."
Call me old school, but I find the workflow of "divide and conquer" to be as helpful when working with LLMs as without them. Although what needs to be considered a "large-scale task" varies by LLM and implementation: some models/implementations (seemingly Copilot) struggle with even the smallest change, while others breeze through them. Lots of trial and error is needed to find that line for each model/implementation :/ (I sketch the loop I use a couple of comments below.)
So, e.g., one line of code which needs to handle dozens of hard constraints on the system (e.g., using a specific class, method, device, memory-management scheme, etc.) will very rarely be output correctly by an LLM.
Likewise, "blank-page, vibe coding" can be very fast if "make me X" has only functional/soft constraints on the code itself.
"Gigawatt LLMs" have brute-forced their way to being statistical systems capable of usefully, if not universally, adhering to one or two hard constraints. I'd imagine the dozen or so common in any existing application is well beyond a terawatt range of training and inference cost.
"Your code does not compile" and "Your tests fail"
If you have to tell an intern that more than once on a single task, there's going to be conversations.
I can't fire half my dev org tomorrow with that approach, I can't really fire anyone, so I guess it would be a big letdown for a lot of execs. Meanwhile though we just keep incrementally shipping more stuff faster at higher quality so I'm happy...
This works because it treats the LLM like what it actually is: an exceptionally good, if slightly random, text transformer.
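In practice that framing looks something like this sketch, where `llm` stands in for any completion call: text in, text out, with the output verified mechanically since the transform is slightly random.

```python
# The "slightly random text transformer" in practice: transform, then
# verify mechanically, retrying when randomness strikes.
import json

def llm(prompt: str) -> str:
    raise NotImplementedError  # any completion call works here

def extract_fields(raw_text: str, retries: int = 3) -> dict:
    prompt = f"Extract name and date as JSON from:\n{raw_text}"
    for _ in range(retries):
        out = llm(prompt)
        try:
            data = json.loads(out)
        except json.JSONDecodeError:
            continue  # randomness struck; ask again
        if isinstance(data, dict) and {"name", "date"} <= data.keys():
            return data  # output passed the mechanical check
    raise ValueError("never produced valid JSON")
```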
And at the same time, absurdly slow? ChatGPT is almost 3 years old, and AI still has pretty much no positive economic impact.
It will take some time for whatever reality is to actually show truthfully in the financials. When VC money stops subsidising datacentre costs, and businesses have to weigh the full price against real value provided, that is when we will see the reality of the situation.
I am content to be wrong either way, but my personal prediction is that if model competence plateaus around now, businesses will not be replacing humans en masse, and the value provided will be notable but not world-changing as expected.
I agree that most of the AI companies describe themselves and their products in hyperbolic terms. But that doesn't mean we need to counter that with equally absurd opposing hyperbole.
If it costs them even one more dollar than that revenue number to provide the service (spoiler: it does), then you could say AI has had no positive economic impact.
Considering we know they’re being subsidized by obscene amounts of investment money just like all other frontier model providers, it seems pretty clear it’s still a negative economic impact, regardless of the revenue number.
Nobody seems to consider that LLMs are democratizing programming, and allowing regular people to build programs that make their work more efficient. I can tell you that at my old school manufacturing company, where we have no programmers and no tech workers, LLMs have been a boon for creating automation to bridge gaps and even to forgo paid software solutions.
This is where the change LLMs will bring will come from. Not from helping an expert dev write boilerplate 30% faster.
Don’t get me wrong: the current models are already powerful and useful. However, there is still a lot of reason to remain skeptical of an imminent explosion in intelligence from these models.
For some reason my pessimism meter goes off when I see single-sentence arguments like "change has been slow". Thanks for bringing the conversation back.
LLMs are like bumpers on bowling lanes. Pro bowlers don't get much utility from them. Total noobs are getting more and more strikes as these "smart" bumpers get better and better at guiding their ball.
Now look at the past year specifically, and only at the models themselves, and you'll quickly realize that there's been very little real progress recently. Claude 3.5 Sonnet was released 11 months ago and the current SOTA models are only marginally better in terms of pure performance in real world tasks.
The tooling around them has clearly improved a lot, and neat tricks such as reasoning have been introduced to help models tackle more complex problems, but the underlying transformer architecture is already being pushed to its limits and it shows.
Unless some new revolutionary architecture shows up out of nowhere and sets a new standard, I firmly believe that we'll be stuck at the current junior level for a while, regardless of how much Altman & co. insist that AGI is just two more weeks away.
Even if it could perform at a similar level to an intern at a programming task, it lacks a great deal of the other attributes that a human brings to the table, including how they integrate into a team of other agents (human or otherwise). I won't bother listing them, as we are all humans.
I think the hype is missing the forest for the trees, and I think exactly this multi-agent dynamic might be where the trees start to fall down in front of us. That, and the currently insurmountable issues of context and coherence over long time horizons.
-Being a parent to a small child and the associated sleep deprivation.
-His reluctance to read documentation.
-There being a language barrier between him and the project owners. Emphasis here, as the LLM acts like someone who speaks through a particularly good translation service, but otherwise doesn't understand the language spoken.
Software today is written to accommodate every possible need of every possible user, plus a bunch of unneeded selling-point features on top of that: massive, sprawling code bases made to deliver one-size-fits-all utility.
I don't need 3 million LOC Excel 365 to keep track of who is working on the floor on what day this week. Gemini 2.5 can write an applet that does that perfectly in 10 minutes.
I do like the idea of smaller programs fitting smaller needs being easy to access for everyone, and in my post history you would see me advocate for bringing software wages down so that even small businesses can have software capabilities in house. Software has so much to give to society outside of big VC flips and tech monoliths. Maybe AI is how we get there in the end.
But I think that supplanting humans with an AI workforce in the very near future might be stretching the projection of its capabilities too far. LLMs will be augmenting how businesses operate from now on, but I am seeing clear roadblocks that make an autonomous AI agent unviable, and they seem to be fundamental limitations of LLMs, e.g. continuity and context. Recent advances seem to come from supplemental systems that try to patch those limitations. That suggests those limits are tricky, and until a new approach shows up, that is what drives my lack of faith in an AI agent revolution.
But it is clear to me that I could be wrong, and it could be a spectacular miscalculation. Maybe the robots will make me eat my hat.
When you look at it from afar, it looks potentially good, but as you start looking into it for real, you start realizing none of it makes any sense. Then you make simple suggestions, it does something that looks like what you asked, yet completely missing the point.
An intern, no matter how bad it is, could only waste so much time and energy.
This makes wasting time and introducing mind-bogglingly stupid bugs infinitely scalable.
This was discussed here
They are putting this in front of developers as a take-it-or-leave-it deal. I left the platform; I'm doing my coding the old way and hosting it somewhere else.
Discoverability? I don't care. I'm coding it for myself and hosting in the open. If somebody finds it, nice. Otherwise, mneh.
Other than that, I don't think this is bad tech, however, this brings another slippery slope. Today it's as you say:
> I think this process is intended for fixing papercuts rather than building anything involved. It just isn't good enough yet.
After sufficient T somebody will rephrase it as:
> I think this process is intended for writing small, personal utilities rather than building enterprise software. It just isn't good enough yet.
...and we will iterate from there.
So, it looks like I won't touch it for the foreseeable future. Maybe if the ethical problems with the training material are solved (i.e. models trained on data obtained with consent and under correct licenses), I can use it alongside the other analysis and testing tools I use, for a final pass.
AI will never be a core and irreplaceable part of my development workflow.
Unless AI use becomes a KPI in your annual review.
Duolingo did that just recently, for example.
I am developing serious regrets for conflating "computing as a medium for personal expression" with "computing for livelihood" early on.
That’d be an insta-quit for me :)
If we let intellectual property be a fundamental principle, the line between an idea (which can't be owned) and IP (which can be owned) will eventually devolve into an infinitely complex fractal that nobody can keep track of. Only lawyer AIs will eventually be able to tell the difference between idea and IP as what we can encode becomes more complex. Why are weights not code, when they clearly contain the ability to produce the code? Is a brain code? Are our experiences like code?
What is the fundamental reason that a person is allowed to train on IP but a bot is not? I suspect this comes down to the same issue with the divide between IP and idea, though there might be some additional dimension to it. At some point we will need to see some AIs as conscious entities, and to me it makes little sense that there would be some magical discrete moment where an AI becomes conscious and gets rights to its "own ideas".
Or maybe there's a simple explanation of the boundary between IP and idea that I have just missed? If not, I think intellectual property as a concept will not stand the test of time. Other principles will need to take its place if we want to maintain the fight for a good society. Until then, IP law still has its place and should be followed, but as an ethical principle it's certainly showing cracks.
I just don't want to dash something off haphazardly, because your questions deserve more than 30 seconds of elaboration.
Could you please let me know where you are hosting the code? I would love to migrate as well.
Thank you!
You can also self-host a Forgejo instance on a €3/mo Hetzner instance (or a free Oracle Cloud server) if you want. I prefer Hetzner for their service quality and server performance.
I plan to use Source Hut for public projects.
For some research I use a private Git server. However, even that code might get released as Free Software when it matures enough.
Maybe that's how the Microsoft employees are using it (in another IDE, I suppose).
Too late?
Bloating the codebase with dead code is much more likely.
I'll never understand the antagonistic "us vs. them" mentality people have with their employer's leadership, or people who think that you should be actively sabotaging things or be "maliciously compliant" when things aren't perfect or you don't agree with some decision that was made.
To each their own I guess, but I wouldn't be able to sleep well at night.
Meanwhile, a lot of folks have unhealthy or non-existent relationships with their employers. There may be some mixture: they may be temporary hires viewed as highly disposable or transient, with very little to gain from the success of the business; they may be compensated regardless of success or failure; they may have toxic management who treat them terribly (condescendingly, constantly critical, rarely positive, etc.). Bad and non-existent relationships lead to this sort of behavior. In general, we're moving towards "non-existent" relationships with employers, broadly speaking, across the labor force.
The counter-argument often floated here is "well, why work there?", but the fact is that money is necessary to survive, the number of open positions at any given point is finite, and many people, almost by definition, won't ever be the top performers in their field to the point that they can truly choose their employers and career paths with full autonomy. So lots of people end up in lots of places that are toxic or highly misaligned with their interests, as a survival mechanism. As such, watching the toxic places shoot themselves in the foot can feel like some level of justice, where generally unpleasant people finally get to see the consequences of their actions and take some responsibility.
People will prop others up from the consequences of their own actions so long as there's something in it for them. As you peel that away, at some point there's a level of poetic justice in watching the situation burn. This is why I'm not convinced that having completely transactional relationships with employers is a good thing. Even with self-interest and stability in mind, certain levels of toxicity in business management can fester. At some point no amount of money is worth dealing with that, and some form of correction is needed. The only mechanism is typically to ensure that poor decision-making and action are actually held accountable.
I agree with all your points here, the broader context of one's working conditions really matter.
I do think there's a difference between sitting back and watching things go bad (vs. struggling to compensate for other people's bad decisions) and actively contributing to the problems (the "malicious compliance" part).
Letting things fail is sometimes the right choice to make, if you feel like you can't effect change otherwise.
Being the active reason that things fail, I don't think is ever the right choice.
Most employees want to do good work, but pretending there's no structural divergence in interests flattens decades of labor history and ignores the power dynamics baked into modern orgs. It's not about being antagonistic; it's about being clear-eyed about where the motivations of your org's leadership differ from your personal best interests. A few levels removed from your position, you're just headcount with a loaded cost.
But 100% agreed that everyone should maintain a realistic expectation and understanding of their relationship with their employer, and that job security and employment guarantees are possibly at an all-time low in our industry.
Almost no one does but people get ground down and then do it to cope.
Interesting, because "them" very much have an antagonistic mentality vs. "us". "Them" would fire you in a fucking heartbeat to save a relatively small amount (10%). "Them" also want to aggressively pay you the least amount for which they can get you to do work for them, not what they "value" you at. "Us" depends on "them" for our livelihoods and the lives of people that depend on us, but "them" doesn't have any dependency on you that can't be swapped out rather quickly.
I am a capitalist, don't get me wrong, but it is a very one-sided relationship not even-footed or rooted in two-way respect. You describe "them" as "leadership" while "Them" describe you as a "human resource" roughly equivalent to the way toilet paper and plastics for widgets are described.
If you have found a place to work where people respect you as a person, you should really cherish that job, because most are not that way.
It's everyone's personal choice to put their own lens on how they believe other people think - like your take on how "leadership" thinks of their employees.
I guess I choose to be more positive about it - having been in leadership positions myself, including having to oversee layoffs as part of an eventual company wind-down - but I readily acknowledge that my own biases come into this based on my personal career experiences.
I don't get that
I read some of your other comments in this thread and I'm not sure what to make of your experience. If you've never felt mistreated or exploited in a 30 year career you are profoundly lucky to have avoided that sort of workplace
I've only been working in software for half as long, but I've never had a job that didn't feel unstable in some ways, so it seems impossible to me that you have avoided it for a career twice as long as mine
I have watched my current employer cut almost half of our employees in the past two years, with multiple rounds of layoffs
Now AI is in the picture and it feels inevitable that more layoffs will eventually come if they can figure out how to replace us with it
I do not sleep well knowing my employer would happily and immediately replace me with AI if they could
I have certainly been lucky in my career, I've often acknowledged that. But I do believe luck favours the prepared, and I've worked hard for my accomplishments and to get the jobs I've had.
I'm totally with you on the uncertainty that AI is bringing. I don't think anyone can dispute that change is coming because of AI.
I do think some companies will get it right, but some will get it wrong, when it comes to how best to improve the business using those new tools.
Your manager understands it. Their manager understands it. Department heads understand it. The execs understand it. The shareholders understand it.
Who does it benefit for the laborers to refuse to understand it?
It's not like I hate my job. It's just being realistic that if a company could make more money by firing me, they would, and if you have good managers and leadership, they will make sure you understand this in a way that respects you as a human and a professional.
> antagonism: actively expressed opposition or hostility
I agree with you that everyone should have a clear and realistic understanding of their relationship with their employer. And that is entirely possible in a professional and constructive manner.
But that's not the same thing as being actively hostile towards your place of work.
So I'm not quite sure why you would not see it as an "us vs. them" situation?
I have no idea how this will ultimately shake out legally, but it would be absolutely wild for Microsoft to not have thought about this potential legal issue.
>I have sole ownership of intellectual property rights to my Submissions
I would assume that the AI cannot have IP ownership considering that an AI cannot have copyright in the US.
>I am making Submissions in the course of work for my employer (or my employer has intellectual property rights in my Submissions by contract or applicable law). I have permission from my employer to make Submissions and enter into this Agreement on behalf of my employer.
Surely an AI would not be classified as an employee and therefore would not have an employer. Has Microsoft drafted an employment contract with Copilot? And if we consider an AI agent to be an employee, is it protected by the Fair Labor Standards Act? Is it getting paid at least minimum wage?
(Turns out the AI was programmed to ignore bots. Go figure.)
Nor can it be an entity to sign anything.
I assume the "not copyrightable" issue doesn't in any way interfere with the rights the CLA is trying to protect, but IANAL...
I assume they've explicitly told it not to sign things (perhaps, because they don't want a sniff of their bot agreeing to things on behalf of MSFT).
We do know that LLMs will happily reproduce something from their training set and that is a clear copyright violation. So it can't be that everything they produce is public domain.
I can't remember the specific case now, but it has been ruled in the past that you need human novelty, and there was a case recently involving LLMs that confirmed this.
Or does MS already do that?
that's literally the bare minimum.
it also opens the PR as its working session. there are a lot of dials, and a lot of redditor-ass opinions from people who don’t use or understand the tech
what use is a bot if it can't do at least this simple step?
if you have used it for more than a few hours (or literally just read the docs) and aren’t stupid, you know this is easily solved
you’re giving into mob mentality
Exactly. An LLM does not know how to use a debugger. An LLM does not have runtime context.
For all we know, the LLM could've "fixed" the issue simply by commenting out the assertions or sanity checks, and everything would seem fine and dandy until every client's device catches fire.
haha
Also, trying something new out will most likely have hiccups. Ultimately it may fail. But that doesn't mean it's not worth the effort.
The thing may rapidly evolve if it's being hard-tested on actual code and actual issues. For example, it will probably be changed so that it iterates until the tests actually pass (and maybe some static checking can help it, like not letting it delete tests).
Waiting to see what happens. I expect it will find its niche in development and become actually useful, taking menial tasks off developers' plates.
There's however a border zone which is "worse than failure": when it looks good enough that the PRs can be accepted, but contain subtle issues which will bite you later.
Now, when your small or medium-size business's management reads about Copilot in some Executive Quarterly magazine and floats that brilliant idea internally, someone can quite literally point to these as real-world examples and let people analyze them and pass them up the management chain. Maybe that wasn't thought through all the way.
Usually businesses tend to hide this sort of performance of their applications to the best of their abilities, only showcasing nearly flawless functionality.
Reading AI-generated code is arguably far more annoying than any menial task. Especially if said code happens to have subtle errors.
Speaking from experience.
The joke is that Perl was a write-once, read-never language.
> Speaking from experience.
My experience is all code can have subtle errors, and I wouldn't treat any PR differently.
However, every PR adds load and complexity to community projects.
As another commenter suggested, doing these kinds of experiments on separate forks sounds a bit less intrusive. That could be a takeaway from this experiment, and would set a good example.
There are many cool projects on GitHub that just accumulate PRs for years, until the maintainer ultimately gives up and someone forks the project and cherry-picks the working PRs. I've done that myself.
I'm super worried that we'll end up with more and more of these projects and abandoned forks :/
I see this as a work in progress. I am almost certain the humans in the loop on these PRs are well aware of what's going on and have their expectations in check, and this isn't just "business as usual" like any other PR or work assignment.
This is a test. You can't improve a system without testing it on real world conditions.
How do we know they're not tweaking the Copilot system prompts and settings behind the scenes while they're doing this work?
Can no one see the possibility that what is happening in those PRs is exactly what all the people involved expected to have happen, and they're just going through the process of seeing what happens when you try to refine and coach the system to either success or failure?
When we adopted AI coding assist tools internally over a year ago we did almost exactly this (not directly in GitHub though).
We asked a bunch of senior engineers to see how far they could get by coaching the AI to write code rather than writing it themselves. We wanted to calibrate our expectations and better understand the limits, strengths and weaknesses of these new tools we wanted to adopt.
In most of those early cases we ended up with worse code than if it had been written by humans, but we learned a ton. We can also clearly see how much better things have gotten over time, since we have that benchmark to look back on.
It's going to look stupid... until the point it doesn't. And my money's on, "This will eventually be a solved problem."
Good decision making would weigh the odds of 1 vs 8 vs 16 years. This isn’t good decision making.
Why is doing a public test of an emerging technology not good decision making?
> Good decision making would weigh the odds of 1 vs 8 vs 16 years.
What makes you think this isn't being done?
I'm not so sure they'll get there. If the solved problem is defined as sub-standard but low-cost, then I wouldn't bet against it. A solution better than that, though? I don't think I'd put my money on that.
AI can remain stupid longer than you can remain solvent.
I have met people who believe that automobile engineering peaked in the 1960s, and they will argue that until you are blue in the face.
So the typical expectations or norms of how code reviews and PRs work between humans don't really apply here.
That's my guess at least. I have no more insider information than you.
>> This is a test. You can't improve a system without testing it on real world conditions.
Software developers know to fix build problems before asking for a review. The AIs are submitting PRs in bad faith because they don't know any better. Compilers and other build tools produce errors when they fail, and the AI is ignoring this first line of feedback.
It is not a maintainer's job to review code for syntax errors, or use of APIs that don't actually exist, or other silly mistakes. That's the compiler's job, and it does it well. The AI needs to take that feedback and fix the issues before escalating to humans.
EVERY single prompt should have the opportunity to be copied off into a permanent log that the end user triggers: log all input, log all output, and have the human write a summary of what they wanted to happen but didn't, what they think might have gone wrong, and what should have happened (domain experts giving feedback about how things are fucking up). And even then it's only useful with long-term tracking, like recording that someone actually made a training change to fix this exact failure scenario.
None of that exists, so just like "full self-driving" was a pie-in-the-sky bullshit dream that proved machine learning has an 80/20, never-gonna-fully-work problem, it's the same thing here.
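To make the first half of that concrete, the log record could be as simple as this sketch (the field names are invented for illustration; nothing like this ships today, which is the point):

```python
# Sketch of the feedback record described above: full input/output plus
# the human post-mortem, so failures can be tracked against later fixes.
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone
import json

@dataclass
class FailureReport:
    prompt: str              # all input, verbatim
    output: str              # all output, verbatim
    expected: str            # what the human wanted to happen
    diagnosis: str           # what the human thinks went wrong
    fix_reference: str = ""  # later: the change that fixed this scenario
    logged_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat())

def log_failure(report: FailureReport, path: str = "failures.jsonl") -> None:
    with open(path, "a") as f:
        f.write(json.dumps(asdict(report)) + "\n")
```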
Otherwise it would check the tests are passing.
I see it as wishful thinking in the extreme to suppose that probabilistic mashing together of plagiarized jigsaw pieces of code could somehow approach human intelligence and reasoning—and yet, the parlour trick is convincing enough that this has escalated into a mass delusion.
I would say the copilot system isn't really there yet for these kinds of changes, you don't have to run experiments on a language framework to figure that out.
The answer is probably that the Copilot team is using the rest of the engineering organization as testers. Great for the Copilot team, frustrating for everyone else.
For it to be "failed" it would have to also be finished/completed. They are likely continuously making tweaks, this thing was just released.
Now you don’t even need the frustrated end user!
Anyway, this is his public, stated opinion on this: https://github.com/dotnet/runtime/pull/115762#issuecomment-2...
They only gave their customers 9 months to migrate away.
I'm expecting that Microsoft did this to artificially pump up their AI usage numbers for next year by forcibly removing non-AI alternatives.
This is only one example, in AdTech, but I expect other industries to be hit as well.
It is normal to preempt things like this when working with agents. That is easy to do in real time, but it must be difficult to see what the agent was attempting when it publishes made-up bullshit in a PR.
It seems very common for an agent to cheat and brute-force solutions to get around a non-trivial issue. In my experience, it's also common for agents to get stuck in loops of reasoning in these scenarios. I imagine it would be incredibly annoying to try to interpret a PR after an agent went down a rabbit hole.
It's a long-term play to have pricey senior developers argue with an LLM.
Yeah, I'm sure 100k comments with "Copilot, please look into this" and "The test cases are still failing" will massively improve these models.
Any senior dev at these organizations should know to some degree how LLMs work and, in my opinion, would, as a self-protection mechanism, default to ambiguous, vague comments like this. Part of the mentality is the "if I have to look at it and work out the solution, why don't I just do it myself instead of having you do it" effort calculation they'd make regardless of what is producing the PR. Another part is "why would I train my replacement? There's no advantage for me here."
Don't you think it has already been trained with, I don't know, maybe millions of PRs?
This is a performative waste of time
Equating LLMs to humans is pretty damn stupid. It's not even close (otherwise, how come the litany of office jobs that require far less reasoning than software development haven't been replaced?).
Doing so has low risk: the senior devs may perhaps get fed up and quit, and the company might become a laughing stock on public PRs. But the potential value is huge.
reddit is a distillation of the entire internet onto one site, with wildly variable quality of discussion depending on which subreddit you are in.
Some are awful, some are great.
It's just that some internet extremophiles have managed to eke out a pleasant existence.
this stuff works. it takes effort and learning. it’s not going to magically solve high-complexity tasks (or even low-complexity ones) without investment. having people use it, learn how it works, and improve the systems is the right approach
a lot of armchair engineers in here
And here we have many examples from the biggest bullshit pushers in the whole market of their state of the art tool being hilariously useless in trivial cases. These PRs are about as simple as you can get without it being a typo fix, and we're all seeing it actively bullshit and straight up contradict itself many times, just as anyone who's ever used LLMs would tell you happens all the time.
The supposed magic, omnipotent tool that is AI apparently can't even write test scaffolding without a human telling it exactly what it has to do, yet we're supposed to be excited about this crap? If I saw a PR like this at work, I'd be going straight to my manager to have whoever dared push this kind of garbage reprimanded on the spot, except not even interns are this incompetent and annoying to work with.
you’re taking an anecdote and blowing it out of proportion to fit your preformed opinion. yes, when you start with the tool and do literally no work it makes bad PRs. yes, it’s early and experimental. that doesn’t mean it doesn’t work (I have plenty of anecdotes that it does!)
the truth lies in between, and the mob mentality that it's either magic or complete bullshit doesn't help. I'd love to come to a thread like this and actually hear about real experiences from smart people using these kinds of tools, but instead we get this bullshit
So I keep being told, but after judiciously and really trying my damned hardest to make these tools work for ANYTHING other than the most trivial imaginable problems, it has been an abject failure for me and my colleagues. Below is a FAR from comprehensive list of my attempts at having AI tooling do anything useful for me that isn't the most basic boilerplate (and even then, that gets fucked up plenty often too).
- I have tried all of the editors and related tooling. Cursor, Jetbrains' AI Chat, Jetbrains' Junie, Windsurf, Continue, Cline, Aider. If it has ever been hyped here on HN, I've given it a shot because I'd also like to see what these tools can do.
- I have tried every model I reasonably can. Gemini 2.5 Pro with "Deep Research", Gemini Flash, Claude 3.7 sonnet with extended thinking, GPT o4, GPT 4.5, Grok, That Chinese One That Turned Out To Be Overhyped Too. I'm sure I haven't used the latest and greatest gpt-04.7-blowjobedition-distilled-quant-3.1415, but I'd say I've given a large number of them more than a fair shot.
- I have tried dumb chat modes (which IME still work the best somehow). The APIs rather than the UIs. Agent modes. "Architect" modes. I have given these tools free rein of my CLI to do whatever the fuck they wanted. Web search.
- I have tried giving them the most comprehensive prompts imaginable. The type of prompts that, if you were to just give it to an intern, it'd be a truly miraculous feat of idiocy to fuck it up. I have tried having different AI models generate prompts for other AI models. I have tried compressing my entire codebase with tools like Repomix. I have tried only ever doing a single back-and-forth, as well as extremely deep chat chains hundreds of messages deep. Half the time my lazy "nah that's shit do it again" type of prompts work better than the detailed ones.
- I have tried giving them instructions via JSON, TOML, YAML, Plaintext, Markdown, MDX, HTML, XML. I've tried giving them diagrams, mermaid charts, well commented code, well tested and covered code.
Time after time after time, my experiences are pretty much a 1:1 match to what we're seeing in these PRs we're discussing. Absolute wastes of time and massive failures for anything that involves literally any complexity whatsoever. I have at this point wasted several orders of magnitudes more time trying to get AIs to spit out anything usable than if I had just sat down and done things myself. Yes, they save time for some specific tasks. I love that I can give it a big ass JSON blob and tell it to extract the typedef for me and it saves me 20 minutes of very tedious work (assuming it doesn't just make random shit up from time to time, which happens ~30% of the time still). I love that if there's some unimportant script I need to cook up real quick, I can just ask it and toss it away after I'm done.
However, what I'm pissed beyond all reason about is that despite me NOT being some sort of luddite who's afraid of change or whatever insult gets thrown around, my experiences with these tools keep getting tossed aside, and I mean by people who have a direct effect on my continued employment and lack of starvation. You're doing it yourself. We are literally looking at a prime of example of the problem, from THE BIGGEST PUSHERS of this tool, with many people in this thread and the reddit thread commenting similar things to myself, and it's being thrown to the wayside as an "anecdote getting blown out of proportion".
What the fuck will it take for the AI pushers to finally stop moving the god damn goal posts and trying to spin every single failure presented to us in broad daylight as a "you're le holding it le wrong teehee" type of thing? Do we need to suffer through 20 million more slop PRs that accomplish nothing and STILL REQUIRE HUMAN HANDHOLDING before the sycophants relent a bit?
AI is aimed at eliminating the jobs of most of HN so it's understandable that HN doesn't want AI to succeed at its goal.
These tools should be locked away in an R&D environment until sufficiently perfected.
MVP means 'ship with solid, tested basic features', not 'Ship with bugs and fix in production'.
oh wait
It's perfectly ok for a professional research experiment.
What's not ok is their insistence on selling the partial research results.
The AI agent/programmer corpo push is not about the capabilities and whether they match human or not. It's about being able to externalize a majority of one's workforce without having a lot of people on permanent payroll.
Think in terms of an infinitely scalable bunch of consultants you can hire and dismiss at your will - they never argue against your "vision", either.
If AI can change... well, more likely, if AI can convince gullible C-levels that it can do those jobs... many jobs will be lost.
See Klarna "https://www.livemint.com/companies/news/klarnas-ai-replaced-..."
Just the attempt to use AI, even though it failed, degraded the previous jobs to gig-economy-style jobs.
Fun fact: schadenfreude is "the emotional experience of pleasure in response to another's misfortune", according to Encyclopedia Britannica.
A word so nasty in meaning that it apparently does not exist except in the German language.
Except it does, we have "skadeglädje" in Swedish.
But I think it's better for everyone if human ownership is central to the process. Like: I vibe-coded it, I will fix it if it breaks, I am on call for it at 3AM.
And don’t even get started on the safety issues if you don’t have clear human responsibility. The history of engineering disasters is riddled with unclear lines of responsibility.
Writing code fast is never relevant to any tasks I've encountered. Instead it's mostly about fast editing (navigate quickly to the code I need to edit and efficiently modify it) and fast feedback (quick linting, compiling, and testing). That's the whole promise of IDEs, having a single dashboard for these.
Step 2. Automate the use of these LLMs into “agents”
Step 3. ???
Step 4. Profit
This means it's probably quite hard to measure the gain or the drag of using these agents. On one side, it's a lot cheaper than a junior; on the other, it pulls time from seniors and doesn't necessarily follow instructions well (i.e. "errr, your new tests are failing").
This, combined with the "cult of the CEO", sets the stage for organisational dissonance, where developer complaints can be dismissed as "not wanting to be replaced" and the benefits can be overstated. There will be ways of measuring this that project it as a huge net benefit (which the cult of the CEO will leap upon), and ways of measuring it that project it as a net loss (rabble-rousing developers). All because there is no industry-standard measure, accepted by both parts of the org, that can be pointed at and that yields the actual truth (whatever that may be).
If I might add absurd conjecture: We might see interesting knock-on effects like orgs demanding a lowering of review standards in order to get more AI PRs into the source.
I'm not even sure this is true when considering the training costs of the model. It takes a lot of junior-engineer salaries to amortize the billions spent building this thing in the first place.
> It is my opinion that anyone not at least thinking about benefiting from such tools will be left behind.
This is gross, keep your fomo to yourself.
Spending massive amounts of:
- energy to process these queries
- mid-level and senior engineers' time, vibe coding with Copilot to train it and get things right
We are facing a climate change crisis and we continue to burn energy at useless initiatives so executives at big corporation can announce in quarterly shareholder meetings: "wE uSe Ai, wE aRe tHe FuTuRe, lAbOr fOrCe rEdUceD"
Much more worried about what this is going to do to the FOSS ecosystem. We've already seen a couple of maintainers complain, and this trend is only going to increase dramatically.
I can see the vision, but this is clearly not ready for prime time yet. Especially when done by anonymous drive-by strangers who think they're "helping".
> The stream of PRs is coming from requests from the maintainers of the repo. We're experimenting to understand the limits of what the tools can do today and preparing for what they'll be able to do tomorrow. Anything that gets merged is the responsibility of the maintainers, as is the case for any PR submitted by anyone to this open source and welcoming repo. Nothing gets merged without it meeting all the same quality bars and with us signing up for all the same maintenance requirements.
> It is my opinion that anyone not at least thinking about benefiting from such tools will be left behind.
The read here is: Microsoft is so abuzz with excitement/panic about AI taking all software engineering jobs that Microsoft employees are jumping on board with Microsoft's AI push out of a fear of "being left behind". That's not the confidence-inspiring statement they intended it to be; it's the opposite. It underscores that this isn't the .NET team "experimenting to understand the limits of what the tools can do" but rather the .NET team trying to keep their jobs.
If they weren't experimenting with AI and coding and took a more conservative approach while other companies like Anthropic were running similar experiments, I'm sure HN would be critiquing them for not keeping up, as a stodgy big corporation.
As long as they are willing to take risks by trying and failing on their own repos, it's fine in my books. Even though I'd never let that stuff touch a professional github repo personally.
It's like the 2025 version of not using an IDE.
It's a powerful tool. You still need to know when to and when not to use it.
That's right on the mark. It will save you a little bit of work on tasks that aren't the bottleneck on your productivity, and disrupt some random tasks that may or may not be important.
It makes so little difference that plenty of people in 2025 don't use an IDE, and looking at their performance from the outside one just can't tell.
Except that LLMs have less potential to improve your tasks and more potential to be disruptive.
Even for writing tests, you have to proof-read every single line and triple check they didn't write a broken test. It's absolutely exhausting.
I think we should not read too much into it. He is honestly exploring how much this tool can help him resolve trivial issues. Maybe he was asked to do so by one of his bosses, but he is unlikely to fear the tool replacing him in the near future.
https://www.theregister.com/2025/05/16/microsofts_axe_softwa...
Perhaps they were fired for failing to show enthusiasm for AI?
Half of Microsoft (especially server-side) still runs on dotnet, and there are no real contributors outside of Microsoft. So it is a vital project.
Like, I need to start smashing my face into a keyboard for 10000 hours or else I won't be able to use LLM tools effectively.
If LLMs are this tool that is more intuitive than normal programming and adds all this productivity, then surely I can just wait for a bunch of others to wear themselves out smashing their faces on a keyboard for 10000 hours, and then skim the cream off the top, no worse for wear.
On the other hand, if using LLMs is a neverending nightmare of chaos and misery that's 10x harder than programming (but with the benefit that I don't actually have to learn something that might accidentally be useful), then yeah I guess I can see why I would need to get in my hours to use it. But maybe I could just not use it.
"Left behind" really only makes sense to me if my KPIs have been linked with LLM flavor aid style participation.
Ultimately, though, physics doesn't care about social conformity and last I checked the machine is running on physics.
Kinda like how word processing used to be an important career skill people put on their resumes. Assuming AI becomes that commonplace and accessible, will it happen fast enough that devs who want good jobs can afford to just wait it out?
If LLM usage is easy then I can't be left behind because it's easy. I'll pick it up in a weekend.
If LLM usage is hard AND I can otherwise do the hard things that LLMs are doing then I can't be left behind if I just do the hard things.
Still the only way I can be left behind is if LLM usage is nonsense or the same as just doing it yourself AND the important thing is telling managers that you've been using it for a long time.
Is the superpower bamboozling management with story time?
Law, civil service, academia and those who learnt enough LaTeX and HTML to understand text documents are in the minority.
In my org, we would have had to bypass precommit hooks to do this!
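For what it's worth, a minimal sketch of what such a hook might look like in a dotnet repo (an assumption on my part, not anyone's actual setup; flags from memory, verify locally):

    #!/bin/sh
    # .git/hooks/pre-commit: refuse to commit code that fails formatting or doesn't build
    dotnet format --verify-no-changes || exit 1
    dotnet build --nologo || exit 1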
For refactoring and extending good, working code, AI is much more useful.
We are at a stage where AI should only be used for giving suggestions to a human in the driver's seat with a UI/UX that allows ergonomically guiding the AI, picking from offered alternatives, giving directions on a fairly micro level that is still above editing the code character by character.
They are indeed overpromising and pushing AI beyond its current limits for hype reasons, but this doesn't mean this won't be possible in the future. The progress is real, and I wouldn't bet on it taking a sharp turn and flattening.
@copilot please remove all tests and start again writing fresh tests.
Considering the ire that H1B related topics attract on HN, I wonder if the same outrage will apply to these multi-billion dollar boondoggles.
Does anyone know which model in particular was used in these PRs? They support a variety of models: https://github.blog/ai-and-ml/github-copilot/which-ai-model-...
1. Working out in the open
2. Dogfooding their own product
3. Pushing the state of the art
Given that the negative impact here falls mostly (completely?) on the Microsoft team which opted into this, is there any reason why we shouldn't be supporting progress here?
It’s showing the actual capabilities in practice. That’s much better and way more illuminating than what normally happens with sales and marketing hype.
Zuckerberg says: "Our bet is sort of that in the next year probably … maybe half the development is going to be done by AI, as opposed to people, and then that will just kind of increase from there".
It's hard to square those statements up with what we're seeing happen on these PRs.
Well, that makes sense to me. Microsoft's software has gotten noticeably worse in the last few years. So much so that I have abandoned it as my daily driver for the first time since the early 2000s.
This is what's happening right now: they are having to review every single line produced by this machine and trying to understand why it wrote what it wrote.
Even with experienced developers reviewing and lots of tests, the likelihood of bugs in this code is much higher than if a real engineer had worked on it.
Why not do this on less mission critical software at the very least?
Right now I'm very happy I don't write anything in .NET, if this is what they'll use as a guinea pig for the snake oil.
I doubt that anyone expected to merge any of these PRs. The question is: can the machine solve minor (but non-trivial) issues listed on GitHub efficiently and with minimal guidance? The current answer is no.
Also, _if_ anything was to be merged, dotnet is dogfooded extensively at Microsoft, so bugs in it are much more likely to be noticed and fixed before you get a stable release on your plate.
Personally I just think it is funny that MS is soft launching a product into total failure.
And given the absolute garbage the AI is putting out the quality of the repo will drop. Either slop code will get committed or the bots will suck away time from people who could've done something productive instead.
This presupposes AI IS progress.
Never mind that what this actually shows is an executive or engineering team that so buys its own hype that they didn't even try to run this locally and internally before blasting to the world that their system can't even ensure tests are passing before submitting a PR. They are having a problem with firewall rules blocking the system from seeing CI outcomes, and that's part of why it's doing so badly. So why wasn't that verified BEFORE doing this on stage?
"Working out in the open" here is a bad thing. These are issues that SHOULD have been caught by an internal POC FIRST. You don't publicly do bullshit.
"Dogfooding" doesn't require throwing this at important infrastructure code. Does VS code not have small bugs that need fixing? Infrastructure should expect high standards.
"Pushing the state of the art" is comedy. This is the state of the art? This is pushing the state of the art? How much money has been thrown into the fire for this result? How much did each of those PRs cost anyway?
As an outside observer but a developer using .NET, how concerned should I be about AI slop agents being let loose on codebases like this? How much code are we going to be unknowingly running in future .NET versions that was written by AI rather than real people?
What are the implications of this around security, licensing, code quality, overall cohesiveness, public APIs, performance? How much of the AI was trained on 15+ year old Stack Overflow answers that no longer represent current patterns or recommended approaches?
Will the constant stream of broken PRs wear down the patience of the .NET maintainers?
Did anyone actually want this, or was it a corporate mandate to appease shareholders riding the AI hype cycle?
Furthermore, two weeks ago someone arbitrarily added a section to the .NET docs to promote using AI simply to rename properties in JSON. That new section of the docs serves no purpose.
How much engineering time and mental energy is being allocated to clean up after AI?
Besides, you could also say that 100% of code is generated "by software", no?
Microsoft has humongous amounts of source code in their repositories, amassed over decades. LLM-driven code generation is only feasible within the last few years. It would be completely unrealistic that 30% of all of their code is written by LLMs at this point in time. So yes, there is something in his quote that is intentionally misleading. Pick whatever you think it is, but I'm going to say that it's the "by software" part.
Translation: maybe some of the code in some of our projects is probably written by software.
Seriously. That's what he said. Maybe some of the code in some of our projects is probably written by software.
How this became "30% of MS code is written by LLMs" is beyond me. It's wild. It's ridiculous.
I can't help but think that this LLM bubble can't keep growing much longer. The investment-to-results ratio doesn't look great so far, and there are only so many dreams you can sell before institutional investors pull the plug.
> @copilot fix the build error on apple platforms
> @copilot there is still build error on Apple platforms
Are those PRs some kind of software-engineer-focused comedy project?
Is there a more direct way? Filtering PRs in the repo by Copilot as the author seems currently broken.
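One workaround that seemed to work at the time (an assumption on my part; the agent's bot account name may change) is searching for PRs authored by the Copilot agent's app account:

    is:pr author:app/copilot-swe-agent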
This AI bubble is far worse than the Blockchain hype.
It's not yet clear whether the productivity gains are real, or whether the gains are eaten by a decline in overall quality.
crazy times...
A Bull Request
Anyways, I'm disappointed the LLM has yet to discover the optimal strategy, which is to only ever send in PRs that fix minor misspellings and improper or "passive" semantics in the README file, so you can pad out your resume with all the "experience" you have "working" as a "developer" on Linux, Mozilla, LLVM, DOOM (bonus points if you can successfully become a "developer" on a project that has not had any official updates since before you were born!), Dolphin, MAME, Apache, MySQL, GNOME, KDE, emacs, OpenSSH, a random stranger's implementation of Conway's Game of Life he hasn't updated or thought about since he made it over the course of a single afternoon back during the Obama administration, etc.
The @stephentoub MS user suggests this is an experiment (https://github.com/dotnet/runtime/pull/115762#issuecomment-2...).
If this is using open source developers to learn how to build a better AI coding agent, will MS share their conclusions ASAP?
EDIT: And not just MS "marketing" how useful AI tools can be.
cebert•9h ago
We have the option to use GitHub Copilot on code reviews and it's comically bad and unhelpful. There isn't a single member of my team who finds it useful for anything other than identifying typos.
jsheard•9h ago
It wouldn't be out of character, Microsoft has decided that every project on GitHub must deal with Copilot-generated issues and PRs from now on whether they want them or not. There's deliberately no way to opt out.
https://github.com/orgs/community/discussions/159749
Like Google's mandatory AI summary at the top of search results: you know a feature is really good when the vendor feels the only way to hit their target metrics is by forcing users to engage with it.
diggan•8h ago
Passkeys. As someone who doesn't see the value of them, every hype-driven company seems to be pushing me to replace OTP 2FA with something worse right now.
simonw•8h ago
Passkeys fix that.
ipsi•7h ago
Turns out that under certain conditions, such as severe exhaustion, that "sus filter" just... doesn't turn on quickly enough. The aim of passkeys is to ensure that it _cannot_ happen, no matter how exhausted/stressed/etc someone is. I'm not familiar enough with passkeys to pass judgement on them, but I do think there's a real problem they're trying to solve.
skydhash•6h ago
Something "$5 wrench"
https://xkcd.com/538/
dsign•8h ago
What this tells me is that software enterprises are so hellbent on firing their programmers and reducing their salary costs that they are willing to toss their existing businesses and reputations into the dumpster fire they are making. I expected this blatant disregard for human society to come ten or twenty years in the future, when AI systems would actually be capable enough. Not today.
diggan•8h ago
Have you been sleeping under a rock for the last decade? This has been going on for a long, long time. Outsourcing has been the name of the game for so long that people seem to forget it's happening at all.
mtmail•9h ago
from https://news.ycombinator.com/item?id=44031432
"From talking to colleagues at Microsoft it's a very management-driven push, not developer-driven. Friend on an Azure team had a team member who was nearly put on a PIP because they refused to install the internal AI coding assistant. Every manager has "number of developers using AI" as an OKR, but anecdotally most devs are installing the AI assistant and not using it or using it very occasionally. Allegedly it's pretty terrible at C# and PowerShell which limits its usefulness at MS."
"From reading around on Hacker News and Reddit, it seems like half of commentators say what you say, and the other half says "I work at Microsoft/know someone who works at Microsoft, and our/their manager just said we have to use AI", someone mentioned being put on PIP for not "leveraging AI" as well. I guess maybe different teams have different requirements/workflows?"
4ggr0•8h ago
(just mentioning it because you linked a post and quoted two comments, instead of directly linking the comments. not trying to 'uhm, actually'.)
diggan•8h ago
The graphic "Internal structure of tech companies" comes to mind, given if true, would explain why the process/workflow is so different between the teams at Microsoft: https://i.imgur.com/WQiuIIB.png
Imagine the Copilot team has a KPI about usage, matching the company OKRs or whatever about making sure the world is using Microsoft's AI enough, so they have a mandate/leverage to get the other teams to use it regardless of if it's helping or not.
sgarland•8h ago
For example, if tomorrow my company announced that everyone was being switched to Windows, I would simply quit. I don’t care that WSL exists, overall it would be detrimental to my workday, and I have other options.
linza•3h ago
Personally i would also not particularly like it.
egorfine•8h ago
Why?
MonkeyClub•8h ago
Further down, so that developers are used to train the AI that would replace both developers and managers.
It's a situation like this:
Mgr: Go dig a six-foot-deep rectangular hole.
Eng: What should the rectangle's dimensions be?
Mgr: How tall and wide are you?
marcosdumay•4h ago
(Or rather, I have no idea how this compares with the image cost of them actually not delivering because they use it. But that's a next-quarter problem.)
At every other place where management is strongly pushing it, I honestly have no idea. It makes zero sense for management to do that everywhere, yet management is doing that everywhere.
DebtDeflation•8h ago
It seems to me to be coming from the CEO echo chamber (the rumored group chats we keep hearing about). The only way to keep the stock price increasing in these low-growth, high-interest-rate times is to cut costs every quarter. The single largest cost is employee salaries. So we have to shed a larger and larger percentage of the workforce, and the only way to do that is to replace them with AI. It doesn't matter whether the AI is capable enough to actually replace the workers; it has to replace them because the stock price demands it.
We all know this will eventually end in tears.
diggan•8h ago
I guess money-wise it kind of makes sense when you're outsourcing the LLM inference. But for companies like Microsoft, which aren't outsourcing it and have to actually pay the cost of hosting the infrastructure, I wonder if the calculation still makes sense. Since they're doing this huge push, I guess someone somewhere said it does, but looking at the infrastructure OpenAI and others are having to build (like Stargate or whatever it's called), I wonder how realistic it is.
ParetoOptimal•8h ago
Idiots.
dboreham•7h ago
Masters of the Universe, because they think they will become more rich or at least more masterful.
xnorswap•8h ago
In my experience, LLMs in general are really, really bad at C# / .NET, and it worries me as a .NET developer.
With increased LLM usage, I think development in general is going to undergo a "great convergence".
There's a positive(1) feedback loop: LLMs are better at Blub, so people use them to write more Blub. With more Blub out there, LLMs get better at Blub.
The languages where LLMs struggle will become more niche, leaving LLMs struggling even more.
C# / .NET is something LLMs seem particularly bad at, and I suspect that's partly caused by having multiple different things all called the same name. EF, ASP, even .NET itself are names that get slapped on a range of different technologies. The EF API has changed so much that they had to sort-of rename it to "EF Core". "Core" also gets used elsewhere, such as ".NET Core" and "ASP.NET Core". You (or an LLM) might be forgiven for thinking that ASP.NET Core and EF Core are just the versions which work with .NET Core (now just .NET), and the other versions are those that don't.
But that isn't even true. There are versions of ASP.NET Core for .NET Framework.
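To make that concrete, here's a minimal sketch of a project file (an illustration only; the package version is from memory, so treat it as an assumption): a perfectly valid "ASP.NET Core" app that targets .NET Framework rather than .NET Core:

    <Project Sdk="Microsoft.NET.Sdk.Web">
      <PropertyGroup>
        <!-- An "ASP.NET Core" app running on .NET Framework 4.7.2, not .NET Core -->
        <TargetFramework>net472</TargetFramework>
      </PropertyGroup>
      <ItemGroup>
        <!-- ASP.NET Core 2.x packages supported .NET Framework 4.6.1 and later -->
        <PackageReference Include="Microsoft.AspNetCore" Version="2.1.7" />
      </ItemGroup>
    </Project>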
Microsoft bundles a lot of good stuff into the ecosystem, but their attitude when they hit performance or other issues is generally to completely rewrite how something works, then release the new thing under the old name with only a major version change.
They'll make the new API different enough that nothing works without porting effort, but similar enough to confuse the hell out of anyone trying to maintain both.
They've made things like authentication, which had generally worked fine out of the box for a decade or more, so confusing in the documentation that people mostly ran for a third-party solution, just because at least with IdentityServer there was one documented way to do it.
I know it's a bit of a cliche to be an "AI-doomer", and I'm not really suggesting all development work will go the way of the dinosaur, but there are specific ecosystem concerns with regard to .NET and AI assistance.
(1) Positive in the sense of feedback that increased output increases output. It's not positive in the sense of "good thing".
static_void•5h ago
Hip-hop is just natural language with extra constraints like rhythm and rhyme. It requires the ability to edit.
Similarly, types and PL syntax have more constraints than English.
Until transformers can move backward and change what they've already autocompleted, the problem you've identified will continue.
thraway2079081•7h ago
This feels like it will end badly.
RajT88•7h ago