> Use these tools as a massive force multiplier of your own skills.
Claude definitely makes me more productive in frameworks I know well, where I can scan and pattern-match quickly on the boilerplate parts.
> Use these tools for rapid onboarding onto new frameworks.
I’m also more productive here. This is an enabler for exploring new areas, and it’s also a boon at big tech companies, where there are simply lots of tech stacks and frameworks in use.
I feel there is an interesting split forming in the ability to gauge AI capabilities - it kinda requires you to be on top of a rapidly changing firehose of techniques and frameworks. If you haven’t spent 100 hours with Claude Code / Claude 4.0, you likely don’t have an accurate picture of its capabilities.
“Enables non-coders to vibe code their way into trouble” might be the median scenario on X, but it’s not so relevant to what expert coders will experience if they put the time in.
One thing I love doing is developing a strong underlying data structure, schema, and internal API, then having CC essentially one-shot a great UI for internal tools.
Being able to think at a higher level beyond grunt work and framework nuances is a game-changer for my career of 16 years.
A few days ago I lost some data including recent code changes. Today I'm trying to recreate the same code changes - i.e. work I've just recently worked through - and for the life of me I can't get it to work the same way again. Even though "just" that is what I set out to do in the first place - no improvements, just to do the same thing over again.
Actually no wait let’s expand it. Why not go say this to Ronnie O’Sullivan too!
The way you’re describing it, there is no determinism behind what is being done. That’s simply not true.
It’s undeniable that humans exhibit stochastic traits, but we’re obviously not stochastic processes in the same sense as LLMs and the like. We have agency, error-correction, and learning mechanisms that make us far more reliable.
In practice, humans (especially experts) have an apparent determinism despite all of the randomness involved (both internally and externally) in many of our actions.
It feels like toil because it's not the interesting or engaging part of the work.
If you're going to build a piece of furniture, the cutting, nailing, and gluing are the "boilerplate" you have to do around the act of creation.
LLMs are just nail guns.
Some amount of boilerplate probably needs to exist, but in general it would be better off minimized. For a decade or so there's sadly been a trend of deliberately increasing it.
The reason Japanese carpenters do (or did) that is that sea air plus high humidity would absolutely rot anything held together with nails and screws.
No furniture is really made from a single tree, though. Trees aren't massive enough.
I agree with the overall sentiment, but the analogy is highly flawed. You can't compare physical things with software: physical things are way more constrained, while software is super abstract.
I actually think I like the idea that maybe, by handing my boilerplate over to AI, we can be more comfortable with having boilerplate to begin with.
These days, people mostly use things like GHC.Generics (generic programming for stuff like serialization that typically ends up being free performance-wise), newtypes and DerivingVia, the powerful and very generalized type system, and so on.
If you've ever run into a problem and thought "this seems tedious and repetitive", the probability that you could straightforwardly fix that is probably higher in Haskell than in any other language except maybe a Lisp.
There are? Rails, for example, has had boilerplate generation commands for a couple of decades.
Python’s subprocess, for example, has a lot of args, and that reflects the reality that creating processes is finicky and there are a lot of subtly different ways to do it. Getting an LLM to understand your use case and create a subprocess call for you is much more realistic than imagining some future version of subprocess where the options are just magically gone and it knows what to do, or where we’ve standardized on only one way to do it and one thing that happens with the pipes and one thing for the return code and all the rest of it.
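To make that concrete, here's a minimal sketch of the kind of call an LLM might produce once it knows your intent. The command and the error-handling policy are illustrative, but each argument is a real subprocess.run option answering one of those subtle questions:

    import subprocess

    # "Run grep over src/, capture the output as text, time out after 30s,
    # and treat 'no matches' as success" - each knob answers one of the
    # questions the parent comment mentions.
    result = subprocess.run(
        ["grep", "-r", "TODO", "src/"],
        capture_output=True,  # what happens with the pipes
        text=True,            # bytes vs. str
        timeout=30,           # what happens if it hangs
        check=False,          # what happens with the return code
    )
    if result.returncode not in (0, 1):  # grep exits 1 for "no matches"
        raise RuntimeError(result.stderr)
    print(result.stdout)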
And then… that just kind of dropped out of the discussion. Throw things at the wall as fast as possible and see what sticks; deal with the consequences later. And to be fair, there were studies showing that the choice of language didn’t actually make as big a difference as the emotions behind the debates suggested. And then the web… designed by committee over years and years, never with the ability to start over. And lots of money meant that we needed lots of manager roles too. And managers elevate their status by having more people. And more people means more opportunity for specialization. It all becomes an unabated positive feedback loop.
I love that it’s meant my salary has steadily climbed over the years, but I’ve actually secretly thought it would be nice if there was bit of a collapse in the field, just so we could get back to solid basics again. But… not if I have to take a big pay cut. :)
Here’s an incomplete list for those traits. For unusual, there are many of the FP languages, Ada, APL, Delphi/Object Pascal, JS, and Perl. For duck typing, there are Ruby, Python, PHP, JS, and Perl. For interpreted-only, there are Ruby, PHP, and Perl (and formerly, for some time, Python and JS). For syntax that’s not necessarily odd (but may be) but lots of people find distasteful, there are Perl, any form of Lisp, APL, Haskell, the ML family, Fortran, JS, and in some camps Python, PHP, Ruby, Go, or anything from the Pascal family. For big languages with lots of interacting parts, there are Perl, Ada, PHP, Lisp with CLOS, and Julia. For slowdowns, there are Julia, Python, PHP, and Ruby. The runtime for Perl is actually pretty fast once it’s up and running, but having to build the app before running it on every invocation makes for a slow start time.
All that said, certain orgs do impressive projects pretty quickly with some of these languages. Some do impressively quick work with even less popular languages like Pike, Ponie, Elixir, Vala, AppScript, Forth, IPL, Factor, Raku, or Haxe. Notice some of those are very targeted, which is another reason boilerplate is minimal. It’s built into the language or environment. That makes development fast, but general reuse of the code pretty low.
Haskell mostly solves boilerplate in a typed way and Lisp mostly solves it in an untyped way (I know, I know, roughly speaking).
To put it bluntly, there's an intellectual difficulty barrier associated with understanding problems well enough to systematize away boilerplate and use these languages effectively.
The difficulty gap between writing a ton of boilerplate in Java and completely eliminating that boilerplate in Haskell is roughly analogous to the difficulty gap between bolting on the wheels at a car factory and programming a robot to bolt on the wheels for you. (The GHC compiler devs might be the robot manufacturers in this analogy.) The latter is obviously harder, and despite the labor savings, sometimes the economics of hiring a guy to sit there bolting on wheels still works out.
You’re asking to shift this job from the editor (you) to the viewer (the browser).
Now we have a way we can get computers to do it!
There is no software you could possibly write that works for everything that'd be as good as "Give me an internal dashboard with these features".
They weren’t just saying ‘AI writes the boilerplate for me.’ They were saying: once you’ve written the same glue the 3rd, 4th, 5th time, you can start folding that pattern into your own custom dev tooling.
AI not as a boilerplate writer but as an assistant for building out a personal scaffolding toolset quickly and organically. Or maybe you think that should be more systematized and less personal?
You don't understand how things evolve.
There have been plenty of platforms that got rid of boilerplate - e.g. Ruby on Rails, about 20 years ago.

But once they become the mainstream, people can get a competitive edge by re-adding loads of complexity and boilerplate. E.g. complex front-end frameworks like React.

If you want your startup to look good, you've got to use the latest trendy front-end thingummy.

Also, to be fair, it's not just fashion. Features that would have been advanced 20 years ago become taken for granted as time goes on, hence we are always working at the current limit of complexity (and that's why we're always overrun with bugs and always coming up with new platforms to solve all the problems and get rid of all the boilerplate, so that we can invent new boilerplate).
Because everyone needs boilerplate, but it's a different boilerplate for everyone, unless you're building the most basic toy apps.
I've felt this lesson just this week - it took creating a small project with 10 clear repetitions, messily made from AI input. But then the magic is creating 'consolidation' tasks, where you can just guide it into unifying markup, styles/JS, whatever you may have on your hands.
I think it was less obvious to me in my day job because in a startup with a lack of strong coding conventions, it's harder to apply these pattern-matching requests since there are fewer patterns. I can imagine in a strict, mature codebase this would be way more effective.
Piling shit on top of shit only pays off on very short time scales - like a month or two. Because once you revisit that shit code all your time savings are out the window. If you have to revisit it more than once you probably slowed yourself down already.
"Use these tools as a massive force multiplier of your own skills" is a great way to formulate it. If your own skills in the area are near-zero, multiplying them by a large factor may still yield a near-zero result. (And negative productivity.)
It seems to me that LLMs help the most at the initial step of getting into some rabbit hole - when you're getting familiar with the jargon, so you can start reading some proper resources without being confused too much. The sooner you manage to move there, the better.
It seems to me that if you have been pattern matching for the majority of your coding career, and then you have an LLM agent pattern match on top of that, it results in a lot of headaches for the people on the team who haven't been working that way.
I think LLM agents are supremely faster at pattern matching than humans, but are not as good at it in general.
That just points to the fact that they've no idea what they're doing and would produce different, equally pointless code by hand, just much slower. This is the paradigm shift: you need a much bigger sieve to filter out the many more orders of magnitude of crap that inexperienced operators of LLMs create.
You cannot outsource thinking to LLMs, at least not yet, if ever. You have to be part of the whole process. You need to have knowledge. If you have no idea what it is doing or what you want it to do, you are going to have a difficult time.
The programming language eliminates some errors (incorrect syntax) while the type system gets rid of others (contract errors). We also have linters that help us with harmful patterns. But the range of possible errors is still enormous. So what’s the probability of the LLM being error-free, or as close as possible to the intended result?
We as humans have reduced the probability of error by having libraries of correct code (or by outsourcing the correction of code), thus giving ourselves a firmer and cognitively manageable foundation on which to create new code, as well as by not having to rely on language alone to solve problems.
Maybe if all you do is code, but that’s not how most people work. Being able to write "I need these things done in this way" and then attend a meeting or start researching the next thing is valuable. And because of my other obligations, there’s no way I could do more without Claude.
Also new languages - our team uses Ruby, and Ruby is easy to read, so I can skip learning the syntax and get the LLM to write the code. I have to make all the decisions, and guide it, but I don't need to learn Ruby to write acceptable-level code [0]. I get to be immediately productive in an unfamiliar environment, which is great.
[0] acceptable-level as defined by the rest of the team - they're checking my PRs.
> Also new languages - our team uses Ruby, and Ruby is easy to read, so I can skip learning the syntax and get the LLM to write the code.
If Ruby is "easy to read" and assuming you know a similar programming language (such as Perl or Python), how difficult is it to learn Ruby and be able to write the code yourself?
> ... but I don't need to learn Ruby to write acceptable-level code [0].
Since the team you work with uses Ruby, why do you not need to learn it?
> [0] acceptable-level as defined by the rest of the team - they're checking my PRs.
Ah. Now I get it.
Instead of learning the lingua franca and being able to verify your own work, "the rest of the team" has to make sure your PR's will not obviously fail.
Here's a thought - has it crossed your mind that team members needing to determine if your PR's are acceptable is "a bad thing", in that it may indicate a lack of trust of the changes you have been introducing?
Furthermore, does this situation qualify as "immediately productive" for the team or only yourself?
EDIT:
If you are not a software engineer by trade and instead a stakeholder wanting to formally specify desired system changes to the engineering team, an approach to consider is authoring RSpec[0] specs to define feature/integration specifications instead of PR's.
This would enable you to codify functional requirements such that their satisfaction is provable, assist the engineering team's understanding of what must be done in the context of existing behavior, identify conflicting system requirements (if any) before engineering effort is expended, provide a suite of functional regression tests, and serve as executable documentation for team members.
0 - https://rspec.info/features/6-1/rspec-rails/feature-specs/fe...
No, not at all.
What I was speaking about was if the person to whom I replied is not a s/w engineer, then perhaps a better contribution to their project would be to define requirements in the form of RSpec specifications (since Ruby is in use) and allow the engineering team to satisfy them as they determine appropriate.
I have seen product/project managers attempt to "contribute" to a development effort much like what was described. Usually there is a power dynamic such that engineers cannot overtly tell the manager(s), "you define the 'what' and we will define the 'how'." Instead, something like the PR flow described is grudgingly accepted and then worked around.
This reminds me of some of the comments made by reviewers during the infamous Schön scientific fraud case. The scientific review process is designed to catch mistakes and honest flaws in research. It is not designed to catch fraud, and the evidence shows that it is bad at it.
Another applicable example would be the bad patches fiasco with the Linux kernel. (And there is going to be a session at the upcoming maintainers' summit about LLM-generated kernel patches.)
Great skill multiplier, right?
I lead the engineering team at my org and we hire almost exclusively C++ engineers (we make games). Our build system, by happenstance, is written in C#, as are all the automation scripts. That's out of our control to change. Should we require every engineer to be competent and write fluent C#, or should we let them just get on with their value adds?
I would expect every engineer to be able to read C#. It’s not that hard.
Reading code doesn't mean you can write it, as any programmer will tell you.
If I want to know whether a string in Ruby begins with another string, is the method starts_with or start_with or startswith like Python's, or is it like Perl, where I have to use some completely different approach? I don't know; better google it.
But if I'm reading and see `str.start_with?("https://")` I know instantly what it's doing.
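For contrast, the Python spelling - close enough to read instantly, different enough to trip you up when writing:

    url = "https://example.com"
    # Python: startswith - one word, no underscore, no trailing "?"
    print(url.startswith("https://"))  # True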
Then I need to expend extra time following everything it did so I can "fix" the problem.
A lot of people have the try-it-and-see-if-it-works approach. That can be insanely wasteful in any moderately complex system. The scientist's way is to have a model that reduces the system to a few parameters. Then you'll see that a lot of libraries are mostly surface work and slightly modified versions of the same thing.
The onboarding angle is huge too: being able to “rent experience” in new frameworks drastically shortens ramp-up time, especially in environments with lots of stacks in play.
And you’re right about evaluation without real hours of use, it’s easy to underestimate what they can actually do. The “non-coders vibe coding” narrative misses that for experienced devs, they’re accelerants, not crutches.
It took a few prompts but I know enough about FFS (the Amiga filesystem) to guide it, and it created exactly the tool I wanted.
"force multiplier of your own skills" is a great description.
One thing where it hasn't shone is configuring my production deployment. I had set this project up with a docker-compose, but my selected CI/CD (Gitlab) and my selected hosting provider (DigitalOcean) seemed to steer me more towards Kubernetes, which I don't know anything about. Gitlab's documentation wanted me to setup Flux (?) and at some point referred to a Helm chart (?)... All words I've heard but their documentation is useless to newcomers ("manage containers in production!": yes, that's obviously what I'm trying to do... "Getting started: run this obscure command with 5 arguments": wth is this path I need to provide? what's this parameter? etc.) I honestly can't believe how complex the recommended setup is, to ultimately run 2 containers that I already have defined in ~20 lines of docker-compose...
Claude got me through it. Took it about 5-6 hours of trying stuff, build failing, trying again. And even then, it still doesn't deploy when I push. It builds, pushes the new container images, and spins up a new pod... which it then immediately kills because my older one is still running and I only want one pod running... Oh well, I'll just keep killing the old pod until I have some more energy to throw at it to try and fix it.
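(For what it's worth, that new-pod/old-pod dance at the end is often just the Deployment update strategy. A hypothetical manifest fragment like the one below - Recreate instead of the default RollingUpdate - tells Kubernetes to kill the old pod before starting the new one; worth checking against your actual manifest.)

    # Hypothetical Deployment fragment: with replicas: 1, the default
    # RollingUpdate tries to run the new and old pods side by side;
    # Recreate terminates the old pod first.
    spec:
      replicas: 1
      strategy:
        type: Recreate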
TL;DR: it's much better at some things than others.
> As a giant caveat, I should note that I have a small bit of prior experience working with kernel modules, and a good amount of experience with C in general
But yeah, the dream of new OSes is sweet...
We're talking about an order of magnitude quicker onboarding. This is absolutely massive.
Just like security holes generated by those LLMs. /s
What's wrong with existing ones?
For example, FreeRTOS doesn't support the 64-bit Intel arch. And you don't "ship an app on FreeRTOS"; it's more of an API and framework you use, where you sort of write a module in C and compile one big app. Quite different from non-embedded app design/shipping. You won't be able to run an Android app on an ESP32, but it should be possible to write apps for an ESP32 and run them on Android-compatible hardware. But FreeRTOS would need optional MMU support, and you'd need extra components to load the app, in addition to hardware support.
If you're asking "why would you do that", it's because I want to write simple purpose-built apps without all the trappings of a larger OS and run them on all types of hardware. You could technically build a 'smart watch' that isn't so smart but runs on a single battery charge for 1 year. But not if you use a power-hungry SoC. Want a more efficient SoC? Good luck figuring that out. Making that whole process easier unlocks more technical solutions and products.
There is a good discussion/interview¹ between Alan Kay & Joe Armstrong about how most code is developed backwards b/c none of the code has a formal specification that can be "compiled" into different targets. If there were a specification other than the old driver code, then porting the driver would be a matter of recompiling the specification for a new kernel target. In the absence of such a specification, you have to substitute human expertise to make sure the invariants in the old code are maintained in the new one, b/c the LLM has no understanding of any of it other than pattern matching to other drivers w/ similar code.
1. The original hardware spec is usually proprietary, and
2. The spec is often what the hardware was supposed to do. But hardware prototype revisions are expensive. So at some point, the company accepts a bunch of hardware bugs, patches around them in software, ships the hardware, and reassigns the teams to a newer product. The hardware documentation won't always be updated.
This is obviously an awful process, but I've seen and heard of versions of it for over 20 years. The underlying factors driving this can be fixed, if you really want to, but it will make your product totally uncompetitive.
I've done _so_ many of these where I go "hmm, this might be useful", plan the project with the free versions of Gemini/ChatGPT into a markdown project file, and then sic Claude on it while I catch up on my shows.
Within a few prompts I've got something workable and I can determine if it was a good idea or not.
Without an LLM I never would've even tried it; I have better and more urgent things to do than code a price-watcher for a very niche Blu-ray seller =)
I think the training data is especially good, and ideally no logic needs to change.
That's even before taking on the brutal Linux kernel mailing lists for code review, explaining what that C code does, when it could be riddled with bugs that Claude generated.
No thanks and no deal.
The last version of the driver that was included in the kernel, right up until it was removed, was version 3.04.
BUT, the author continued to develop the driver independently of kernel releases. In fact, the last known version of the driver was 4.04a, in 2000.
My goal is to continue maintaining this driver for modern kernel versions, 25 years after the last official release." - https://github.com/dbrant/ftape
I doubt it would have been significantly easier to start the porting effort from that vs. the original 2.4.x source.
Test coverage between subsystems in the Linux kernel varies widely. I don't think a lack of tests would prevent inclusion.
> No thanks and no deal.
I mean, now we have a driver for this old hardware that runs on a modern kernel, which we didn't before. I imagine you don't even have that hardware, so why do you care if someone else gets some use out of it?
The negativity here in many of these comments is just staggering. I've only recently started adopting LLM coding tools, and I still remain a skeptic about the whole thing overall, but... damn. Seems like most people aren't thinking critically and are just regurgitating "durrrr LLMs bad" over and over.
A reminder, though: these LLM calls cost energy, and we need reliable power generation to iterate through this next tech cycle.
Hopefully all that useless crypto wasted clock cycle burn is going to LLM clock cycle burn :)
You would certainly need an expert to make sure your air traffic control software is working correctly and not 'vibe coded' the next time you decide to travel abroad safely.
We don't need a new generation who can't read code and are heavily reliant on whatever a chat bot said because: "you're absolutely right!".
> Hopefully all that useless crypto wasted clock cycle burn is going to LLM clock cycle burn :)
Useful enough for Stripe to build their own blockchain, and even that and the rest of them are more energy-efficient than a typical LLM cycle.
But the LLM grift (or even the AGI grift) will not only cost even more than crypto, but the whole purpose of its 'usefulness' is the mass displacement of jobs with no realistic economic alternative other than achieving >10% global unemployment by 2030.
That's a hundred times more disastrous than crypto.
The same approach can be used to modernise other legacy codebases.
I'm thinking of doing this with a 15 year old PHP repo, bringing it up to date with Modern PHP (which is actually good).
> As a giant caveat, I should note that I have a small bit of prior experience working with kernel modules, and a good amount of experience with C in general, so I don’t want to overstate Claude’s success in this scenario. As in, it wasn’t literally three prompts to get Claude to poop out a working kernel module, but rather several back-and-forth conversations and, yes, several manual fixups of the code. It would absolutely not be possible to perform this modernization without a baseline knowledge of the internals of a kernel module.
Of note is the last sentence: "It would absolutely not be possible to perform this modernization without a baseline knowledge of the internals of a kernel module."

This is critical context when using a code generation tool, no matter which one is chosen. Then the author states in the next section:
> Interacting with Claude Code felt like an actual collaboration with a fellow engineer. People like to compare it to working with a “junior” engineer, and I think that’s broadly accurate: it will do whatever you tell it to do, it’s eager to please, it’s overconfident, it’s quick to apologize and praise you for being “absolutely right” when you point out a mistake it made, and so on.
I don't know what "fellow engineers" the author is accustomed to collaborating with, junior or otherwise, but the attributes enumerated above are those of a sycophant, not of any engineer I have worked with. Finally, the author asserts:
> I’m sure that if I really wanted to, I could have done this modernization effort on my own. But that would have required me to learn kernel development as it was done 25 years ago.
This could also be described as "understanding the legacy solution and what needs to be done", when the expressed goal identified in the article title is: "... modernize a 25-year-old kernel driver".

Another key activity, identified in the above quote as a benefit to avoid, is: "... required me to learn ...".
Learning what must be done to implement a device driver in order for it to operate properly is not "gatekeeping." It is a prerequisite.
> I love agents explaining me projects I don‘t know.
Awesome. This is one way to learn about implementations and I applaud you for benefiting from same.
> Recently I cloned sources of Firefox and asked qwen-code (tool not significant) about the AI features of Firefox and how it‘s implemented. Learning has become awesome.
Again, this is not the same as implementing an OS device driver. Even though one could justify saying Firefox is way more complicated than a Linux device driver (and I would agree), the fact is that a defective device driver can lock-up the machine[0], corrupt internal data structures resulting in arbitrary data corruption, and/or cause damage to peripheral devices.
Apparently it's not, though. The author here had some baseline knowledge of how Linux kernel modules work, but the impression I got is that they would not have been able to do this on their own without a lot of learning.
> the fact is that a defective device driver can lock-up the machine[0], corrupt internal data structures resulting in arbitrary data corruption, and/or cause damage to peripheral devices.
Now that's some gatekeeping right there. "Only experts can write kernel modules" is a pretty toxic attitude to have.
On their computers.
Not mine.
"...kernel development as it was done 25 years ago."
Not "...kernel development as it is done today".
That "25 years ago" is important and one might be interested in the latter but not in the former.
I read "junior" as 'subordinate' and 'lacking in discernment'. -- Sycophancy is a good description. I also like "bullshit" (as in 'for the purpose of convincing'). https://en.wikipedia.org/wiki/Bullshit#In_the_philosophy_of_...
The point being, there's nuance to "it felt like a collaboration with another developer (some caveats apply)". -- It's not a straightforward hype of "LLM is perfect for everything", nor is it so simple as "LLM has imperfections, it's not worth using".
> Another key activity, identified in the above quote as a benefit to avoid, is: "... required me to learn ..."
It would be bad to avoid learning fundamentals, or things which will be useful later.
But, it's not bad to say "there are things I didn't need to know to solve a problem".
I think a moderately-skilled developer with experience in C could have done this, with Claude's help, even if they had little or no experience with the Linux kernel. It would probably take longer to do, and debugging would be harder, but it would still be doable.
To answer the second part (why AI won't be of much help): M3 and onwards use a massively different GPU architecture that needs to be worked out, again, from scratch. And all of that while there is a substantial number of subsystems remaining on the M1, M2, and their variants that aren't supported at all, are only partially supported or supported with serious workarounds, or where the code quality needs massive work to get upstreamed into Linux.
And on top of that, a number of contributors burned out along the way, some from dealing with the ultra-neckbeard faction amongst Linux kernel developers, some from other mental health issues, and Alyssa departed for Intel recently.
Seriously though, it does seem a menial task in itself to reverse engineer what's going on. It would be a really powerful show of force by one of the leading AI providers if they set up shop like that and did it in the open… if they could.
Thinking about asking Claude to reimplement it from scratch in Rust…
[1] https://codeberg.org/superseriousbusiness/gotosocial/src/bra...
That’s the reason I posted my comment.
Do you disagree with some part of the statement regarding "AI" in their CoC? Do you think there's a fault in their logic, or do you yourself personally just not care about the ethics at play here?
I find it refreshing personally to see a project taking a clear stance. Kudos to them.
Recently enjoyed reading the Dynamicland project's opinion on the subject very much too[0], which I think is quite a bit deeper of an argument than the one above.
Ethics seems to be, unfortunately, quite low down on the list of considerations of many developers, if it factors in at all to their decisions.
[0] https://dynamicland.org/2024/FAQ/#What_is_Realtalks_relation...
These ethics are definitely derived from a profit motive, however petty it may be.
(and of course without non-profit motivations, none of the open source ecosystem would exist!)
It does nothing to fix the issues of unpaid FOSS labor, though, but that was a problem well before the recent rise of LLMs.
Yeah, that only holds if you look at permissive licenses like MIT and Apache; it most certainly doesn't follow the spirit of other licenses.
However, they're able to do more than just regurgitate code: I can have them explain to me the underlying (mathematical or whatever) concept behind the code, and then write new code from scratch myself with that knowledge.
Can/should this new code be considered as derivative work, if the underlying principles were already documented in literature?
A large C++ emulator project was failing to build with a particular compiler with certain -Werror flags enabled. It came down to reordering a few members (member order matters in C++) and using the universal initializer syntax in a few places. It was a +3/-3 diff. I got lambasted. One notoriously hostile maintainer accused me of making AI slop. The others didn't understand why the order mattered and referred to it as "churn."
If you don't have the technical understanding of a language or framework, there is going to be a lot of ambiguity in your prompts.
This specificity gap leads the LLM to fill in those gaps for you, which may not be what you intended. And that's usually where bugs hide.
I think this is the flip side to being a "force multiplier"
I would hope Claude would be like “Hold on man…”
| I'll create a simple C++ class with a constructor that takes a tuple. Here's an example...

> nah C

| C doesn't have classes or constructors - those are C++ features. In C, you would use a struct and create an initialization function. Here's how you'd handle a tuple-like structure:
|
| This implementation provides:
|
| - Generic Tuple Structure: a flexible tuple that can hold different types of elements (int, double, string, float)
| - Constructor Function: tuple_create() initializes a new tuple with the specified size
| - Type-Safe Setters: tuple_set_int(), tuple_set_double(), tuple_set_string(), tuple_set_float()
| - Type-Safe Getters: tuple_get_int(), tuple_get_double(), tuple_get_string(), tuple_get_float()
| - Memory Management: automatic memory allocation and deallocation
| - Utility Functions: tuple_print() for displaying tuple contents, tuple_destroy() for cleanup

Seems to be a common enough request to be present in the model's training data :D Even the code looked fine, apart from being a very naive and thus slow implementation. If you have access to state-of-the-art coding assistants, try asking this. You can additionally add a requirement to generate a destructor function (that would free the dynamically-allocated fields).
Here the author has a passion/side project they have been on for a while. Upgrading the tooling is a great thing. The community may not support this, since the niche is too narrow. The LLM comes in and helps with the upgrade. This is exactly what we want - software to be custom, for people to solve their unique edge cases.

Yes, the author is technical, but we are lowering the barrier, and it will be lowered even more. Semi-technical people will be able to solve some simpler edge cases, and so on. More power to everyone.
One of the things that makes Claude my go-to option is its ability to start long-running processes, whose output it can read to debug things.
There are a bunch of hacks you could have used here to skip the manual part, like piping dmesg to a local UDP port and having Claude start a listener.
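A minimal sketch of that hack, with the port and filenames illustrative - the sender is a shell one-liner, and the listener is something Claude can start and read on its own:

    # Sender (run once, manually):  dmesg --follow | nc -u 127.0.0.1 9999
    # listener.py - Claude starts this and watches kernel log lines arrive,
    # so no manual copy/paste of dmesg output is needed.
    import socket

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("127.0.0.1", 9999))  # local-only on purpose

    while True:
        data, _addr = sock.recvfrom(65535)
        print(data.decode(errors="replace"), end="")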
Even something simple like getting it to run a dev server in React can have it opening multiple servers and getting confused. I've watched streams where the programmer is constantly telling it to use an already-running server.
It wouldn't surprise me if the drive and the tapes are still somewhere in my parents' storage. Could be a fun weekend project to try it out, though I'm not sure I have any computer with a floppy interface anymore. And I don't think there's anything particularly interesting on those tapes either.
In any case, cool project! Kudos to the author!
I like how Claude Code is more advanced in terms of CLI functionality, but I prefer Codex's output (with the model set to high).

If you do not want to pay for both, then you can pick either one and go with it. I don't think the difference is huge.
Just get the source code published into mainline.
I've been able to do things that I would not have the competence for otherwise, as I do not have a formal software engineering background and my main expertise is writing Python data processing scripts.
E.g., yesterday I fixed a bug [2] by having Claude compare the CarPlay and iOS search implementations. At first it did suggest a code change other than the one that fixed it, but that felt like just a normal part of debugging (you may need to try different things).
Most of my contributions [3] have been enabled by Claude, and it's also been critical for identifying where the code for certain things is located - it's a very powerful search over the code base.
And it is just amazing if you need to write a simple Python script to do something, e.g., in [4].
Now this would obviously not be possible if everyone used AI tools and no one knew the existing code base, so the future for real engineers and architects is bright!
[1] https://codeberg.org/comaps/comaps [2] https://codeberg.org/comaps/comaps/pulls/1792 [3] https://codeberg.org/comaps/comaps/pulls?state=all&type=all&... [4] https://codeberg.org/comaps/comaps/pulls/1782
Hope to make the bridge soon with i18n of cartes.app.
I also use LLMs to work on it. Mistral, mostly.
Nowadays I rely heavily on Claude Code to write code. I start a task by creating a design, then I write a bunch of prompts which cover the design details, the detailed requirements, and the interactions/interfaces with other components. So far so good; it boosts productivity a lot.

But I am really worried, and still not quite able to believe, that this is the new norm of coding.
Only thing I got from this is nostalgia from the old PC with its internals sprawled out everywhere. I still use desktop PCs as much as I can. My main rig is almost ten years old and it's been upgraded countless times although is now essentially "maxed out". Thank god for PC gamers, otherwise I'm not sure we'd still have PCs at all.
It's very useful when you get the answer in several minutes rather than half an hour.
Maybe it is because they generate the code in one pass and cannot go back and fix the issues. LLM makers, you should allow LLMs to review and edit the generated code.
Also, I wanted to add that LLMs (at least free ones) are pretty dumb sometimes and do not notice obvious things. For example, when writing tests they generate a lot of duplicated code and do not move it into a helper function, or do not combine tests using parametrization. I have to do it manually every time.
Do you prompt it to reduce duplicated code? You can tell it to move it, and it'll move it and use the shared code from now on.
For example, imagine you're testing a vector-like collection. In every test case, a dumb LLM creates the vector manually and makes inserts/deletes. That could be replaced by adding a helper function that accepts a sequence of operations and returns the processed vector. Furthermore, once you have that function, you can merge multiple tests with parametrization, by having a test function accept a sequence of operations and an expected result:
    import pytest

    @pytest.mark.parametrize('actions, result', [
        # Test that Remove removes items from the vector
        ([Ins(1, 2, 3, 4), Remove(4)], [1, 2, 3]),
        # ...
    ])
    def test_vector(actions, result):
        assert apply_actions(actions) == result
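For concreteness, a minimal sketch of the pieces this assumes - Ins, Remove, and apply_actions are hypothetical names taken from the example:

    class Ins:
        """Insert the given items at the end of the vector."""
        def __init__(self, *items):
            self.items = items
        def apply(self, vec):
            vec.extend(self.items)

    class Remove:
        """Remove the first occurrence of the given item."""
        def __init__(self, item):
            self.item = item
        def apply(self, vec):
            vec.remove(self.item)

    def apply_actions(actions):
        # The helper: replay a sequence of operations, return the final vector.
        vec = []
        for action in actions:
            action.apply(vec)
        return vec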
But it takes time to write this explanation, and a dumb LLM might not merge all the tests on the first try.

Try "Don't create the vector manually inline in every test case; make a helper function for that," and see what the agent does. It might do something smart. It might do something a bit dumb, but by understanding why exactly it's dumb, you can communicate what correction is needed pretty smoothly.
I’ve used these tools enough to recognize they can go from "productivity multiplier" to "burning your house down" on a dime, and was making sure to create a git commit after each change (which Claude was also instructed to do itself but would forget every so often).
The extension was working almost perfectly, except it had a bug when installed on Android Chromium browsers, so I showed the issue to CC and it was like "right! that makes perfect sense, the problem is [LLM wall of text]".
Annnnnnd CC then proceeded to methodically break all the critical code paths for the actual downloads making the extension unusable on any browser.
I sat down and did 30 min of research actually reading the docs, and what it had described as the issue/fix turned out to be a subtly hallucinated "wouldn't it be great if the Chrome API worked like this" fake-API fix. I reverted to the last commit, which I evidently failed to notice was actually a couple of changes back.
I tried to repeat the prompts that got it into the almost-working state, and it couldn't one-shot it like it did the first time; it kept breaking some part of the request, and I had to spend a couple of hours coaxing it to recreate what was originally only 10 min of work.
Ultimately, yes, it was entirely my fault that I failed to make sure it created a commit when it was 95% working, but I was in the zone and excited, and it had nailed almost every prompt perfectly until then. It still saved a lot of time learning how Chrome extensions work and the Chrome API (although that knowledge would have helped when I finally needed it at the end).
It was a rude reminder of how easy it is to let myself start feeling like it's "magic" again, and that 1) diligent oversight is always necessary even when things seem to be going flawlessly, and 2) these tools really need more deterministic rules, so I could e.g. flip a switch and say "enable commits after every confirmed change" and just use Claude for the step of writing the commit message, instead of it making the decision to commit on its own based on instructions.
Hilarious! https://xkcd.com/1200/
And clearly define what we need with specs and thorough tests.
We are constantly reminded that LLMs are the future despite the real world evidence to the contrary. Look at what happens when LLMs are trained on the output of other LLMs, such as the low quality code flooding the internet. It is all a self-solving problem set in motion.
And LLMs are deterministic too if you freeze the seed.
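In the narrow sense that sampling is the only nondeterministic step, here's a toy sketch of the point (pure Python; the weights are a stand-in for a model's token probabilities):

    import random

    def generate(seed, steps=8):
        # The only randomness is this RNG; an LLM's token sampler is analogous.
        rng = random.Random(seed)
        return [rng.choices("abc", weights=[1, 7, 2])[0] for _ in range(steps)]

    assert generate(42) == generate(42)  # frozen seed => identical output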
There are currently multiple posts per day on HN that escalate into debates on LLMs being useful or not. I think this is a clear example that it can be. And results count. Porting and modernizing some ancient driver is not that easy. There's all sorts of stuff that gets dropped from the kernel because it's just too old to bother maintaining it and when nobody does, deleting code becomes the only option. This is a good example. I imagine, there are enough crusty corners in the kernel that could benefit from a similar treatment.
I've had similar mixed results with agentic coding sometimes impressing me and other times disappointing me. But if you can adapt to some of these limitations it's alright. And this seems to be a bit of a moving goalpost thing as well. Things that were hard a few months ago are now more doable.
These studies keep popping up where they randomly decide whether someone will use AI to assist in a feature or not and it's hard for me to explain just how stupid that is. And how it's a fundamental misunderstanding of when and how you'd want to use these tools.
It's like being a person who hangs up drywall with screws and your boss going "Hey, I'm gonna flip a coin and if it's heads you'll have to use the hammer instead of a screwdriver" and that being the method in which the hammer is judged.
I don't wake and go "I'm going to use AI today". I don't use it to create entire features. I use it like a dumb assistant.
> I've had similar mixed results with agentic coding sometimes impressing me and other times disappointing me. But if you can adapt to some of these limitations it's alright. And this seems to be a bit of a moving goalpost thing as well. Things that were hard a few months ago are now more doable.
Exactly my experience too.
I actually do this now. That's one of those things that went from impossible to doable under some circumstances. Still a bit of a coin flip but it can work well in some code bases. I still have a mental block even asking for these things under the assumption it would not work anyway. But I've been pleasantly surprised a few times where this actually works.
Even before tools like CC it was the case that LLMs enabled venturing into projects/areas that would be intimidating otherwise. But Claude-Code (and codex-cli as of late) has made this massively more true.
For example I recently used CC to do a significant upgrade of the Langroid LLM-Agent framework from Pydantic V1 to V2, something I would not have dared to attempt before CC:
https://github.com/langroid/langroid/releases/tag/0.59.0
I also created nice collapsible html logs [2] for agent interactions and tool-calls, inspired by @badlogic/Zechner’s Claude-trace [3] (which incidentally is a fantastic tool!).
[2] https://github.com/langroid/langroid/releases/tag/0.57.0
[3] https://github.com/badlogic/lemmy/tree/main/apps/claude-trac...
And added a DSL to specify agentic task termination conditions based on event-sequence patterns:
https://langroid.github.io/langroid/notes/task-termination/
Needless to say, the docs are also made with significant CC assistance.
I keep beating the drum on what they correctly point out. It's not perfect, but it saves hours and hours of work in generation compared to the small amount of conceptual debugging.
The era of _needing_ teams of people to spit out boilerplate is coming to an end. I'm not saying don't learn to write it; learning demands doing, making mistakes, and personal growth. But after you've mastered it, there's no need to waste time writing boilerplate on the clock unless you truly enjoy it.
This is a perfect example of time taken to debug small mistakes << time to start from scratch as a human.
Time, the equivalent money, and energy saved - all a testament to what is possible with huge context windows and generic modern LLMs :) :) :)
https://github.com/Godzil/ftape
Could it be that Misanthropic has trained on that one?
> Maybe this driver have problems on 64Bit x86 machines.
Ouch. The part where it says it’s not possible to use a normal floppy and the tape flip anymore seemed odd enough, but those last points should scare anyone away from trying these on anything important.
I was able to port a legacy thermal printer user-mode driver from legacy, convoluted JS to pure modern TypeScript in two to three days, at the end of which the printer did work.

The same caveats apply - I have a decent understanding of both languages, specifically the various legacy JavaScript patterns for modularity that emulate features that didn't exist in JavaScript, such as classes.
It’s literally pathetic how these things just memorize rather than achieve any actual problem-solving.
Anyone with experience with LLMs will have experienced their actual problem solving ability, which is often impressive.
You'd be better off learning to use them, than speculating without basis about why they won't work.
Also, "learn to use them" gives off "you're holding it wrong" vibes.
See also
https://machinelearning.apple.com/research/illusion-of-think...
Would give postmarketOS a boost.
One note: I think the author could have modified the sudoers file to allow loading and unloading the module without a password prompt.
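Something like this would do it - a hypothetical /etc/sudoers.d entry, with the username and module path purely illustrative - scoping the passwordless grant to exactly the two commands the edit/reload loop needs:

    # /etc/sudoers.d/ftape-dev (hypothetical): let "dev" load/unload just this module
    dev ALL=(root) NOPASSWD: /usr/sbin/insmod /home/dev/ftape/ftape.ko, /usr/sbin/rmmod ftape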
Another thought, IIRC in the plugins for Claude code in my IDE, you can "authorize" actions and have manual intervention without having to leave the tool.
My point is there were ways I think they could have avoided copy/paste.
That is a bit different than allowing unconfirmed loading of arbitrary kernel code without proper authentication.
Even a minor typo in kernel code can cause a panic; that’s not a reasonable level of power to hand directly to Claude Code unless you’re targeting a separate development system where you can afford repeated crashes.