Been there, done that!
For those one-off small things, LLMs are rather cool, especially Claude Code and Gemini CLI. I was given an archive of some really old movies recently, but the files bore titles in Croatian instead of the original (mostly English) ones. So I claude --dangerously-skip-permissions'd into the directory with the movies and, in a two-sentence prompt, asked it to rename the files into a given format (the one I tend to use in my archive), finding the original name and year of release for each title and using them in the filename... but, before committing the rename, to give me a before/after list for approval. It took, what, like a minute of writing a prompt.
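The session had roughly this shape (the exact prompt and filenames are from memory, so treat this as a sketch):

    cd ~/old-movies
    claude --dangerously-skip-permissions
    > Rename every movie file here to "Original Title (Year).ext". The current
    > names are Croatian translations; find the original (mostly English) title
    > and year of release for each. Before renaming anything, show me a
    > before/after list for approval.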
Now, for larger things, I'm still exploring the way, the angle, what to do and how. I've tried everything from YOLO prompting to structured and uber-structured approaches, all the way to mimicking product/PRD - architecture - project management/tasks - developer/agents... So far, unless it's a rather simple project, I don't see it happening that way. The most luck I've had was with "some structure" as context and inputs, and then guided prompting during sessions and reviewing the output. Almost pair programming.
I thought large contexts are not necessarily better and sometimes have the opposite effect?
For coding you need to aggressively prune it and only give the minimum adjacent context, or it'll start going off on useless tangents. And if you get stuck, just refresh and start from zero, changing what is included. It's often faster than "arguing" with the LLM in multi-step sessions.
(The above is for existing codebases. For vibe-coding one-off scripts, just go with the vibes; sometimes it works surprisingly well from a quick 2-3 line prompt.)
I've been going down to Sonnet for coding over Opus. Maybe I'm just writing dumb code.
Opus is also way more expensive. (Don’t forget to switch back to Sonnet in all terminals)
This phrasing can be misleading and points to a broader misunderstanding about the nature of doctoral studies, one that has been influenced by the marketing and hype discourse surrounding AI labs.
The assertion that there is a defined "PhD-level knowledge" is pretty useless. The primary purpose of a PhD is not simply to acquire a vast amount of pre-existing knowledge, but rather to learn how to conduct research.
What sticks out to me is Gemini catching bugs before production release, was hoping you’d give a little more insight into that.
The reason being that we expect AI to create bugs and we catch them; but if Gemini is spotting bugs by acting as a kind of QA (not just by writing and passing tests), then that piques my interest.
It’s not like once you have a PhD anyone cares about the subject, right? The only thing that matters is that you learnt to conduct research.
Further, I always assumed "PhD-level knowledge" meant coming up with the right questions. I would say it is at best a "lazy knowledge-rich worker": it won't explore hypotheses if you don't *ask it* to. A PhD would ask those questions of *themselves*. Let me give you a simple example:
The other day, Claude Code (Max Pro subscription) commented out a bunch of test assertions in a related but separate test suite it was coding. It did not care to explore why it was commenting them out, which was a serious bug stemming from a faulty assumption in the original plan. I had to ask it, via the ultra-think/think-hard trick, to explore why the tests were failing, amend the plan, and fix it.
The bug was that the ORM object had null values because it was not refreshed after the commit; it had been fetched earlier by another DB session that had since been closed.
Anyway, I don't think these are "PhD-knowledge" questions, but job-related electrical engineering questions.
Another contender in the "big idea" reasoning camp: DeepSeek R1. It's much slower, but most of the time it can analyze problems and get to the correct solution in one shot.
I find it very sad that people who have been really productive without "AI" now go out of their way to find small anecdotal evidence for "AI".
But I would stake my very life on the fact that the movement by developers we call open-source is the single greatest community and ethos humanity has ever created.
Of course it inherits from enlightenment and other thinking, it doesn't exist in a vacuum, but it is an extension of the ideologies that came before it.
I challenge anyone to name a single modern subculture that has tangibly generated more: one that touches more lives, moves more weight, travels farther, and affects humanity more every single day, from the moment people wake up, than the open source software community (in the catholic sense, obviously).
Both in moral goodness and in measurable improvement in standard of living and understanding of the universe.
Some people's memories are very short indeed; all who pine for who they imagined they were are consumed by a mimetic desire for their imagined selves.
good lord.
People who believe in open source don't believe that knowledge should be secret. I have released a lot of open source myself, but I wouldn't consider myself a "true believer." Even so, I strongly believe that all information about AI must be as open as possible, and I devote a fair amount of time to reverse engineering various proprietary AI implementations so that I can publish the details of how they work.
Why? A couple of reasons:
1) Software development is my profession, and I am not going to let anybody steal it from me, so preventing any entity from establishing a monopoly on IP in the space is important to me personally.
2) AI has some very serious geopolitical implications. This technology is more dangerous than the atomic bomb. Allowing any one country to gain a monopoly on it would be extremely destabilizing to the existing global order, and must be prevented at all costs.
LLMs are very powerful, they will get more powerful, and we have not even scratched the surface yet in terms of fully utilizing them in applications. Staying at the cutting edge of this technology, and making sure that the knowledge remains free and is shared as widely as possible, is a natural evolution for people who share the open source ethos.
The "race against China" is a marketing trick to convince senators to pour billions into "AI". Here is who is financing the whole bubble to a large extent:
LLMs are useful—but there’s no way such an innovation should be a “guarded secret” even at this early stage.
It’s like saying spreadsheets should have remained a secret when they amplified what people could do when they became mainstream.
Consider cilantro. I’m happy to admit there are people out there who don’t like cilantro. But it’s like the people who don’t like cilantro are inventing increasingly absurd conspiracy theories (“Redis is going to add AI features to get a higher valuation”) to support their viewpoint, rather than the much simpler “some people like a thing I don’t like”.
https://metr.org/blog/2025-07-10-early-2025-ai-experienced-o...
Not to mention that, in the study, less than half of the participants had ever used it before.
There's an entire class of investment scammers who string along their marks, claiming that the big payoff is just around the corner while they fleece the victim with the death of a thousand cuts.
If the AI is doing the coding, then that is a threat to some people. I am not sure why; LLMs can be good and you can still enjoy coding... those things are unrelated. The logic seems to be that if LLMs are good, then coding is less fun, lol.
This post has nothing to do with Redis and is even a follow-up to a post I wrote before rejoining the company.
I agree with this, but this is why I use a CLI. You can pipe files instead of copying and pasting.
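For example, something like this (the exact flags depend on your CLI; claude's -p print mode and Simon Willison's llm tool both read stdin):

    # review a diff without any copy-pasting
    git diff main | claude -p "Review this diff for bugs and regressions"

    # or feed whole files to an LLM from the shell
    cat src/parser.py | llm "Explain what this module does"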
Ex: implementing a spec, responding to my review comments, adding wider unit tests, running a role play for usability testing, etc. The main time we do what he describes, manually copying into a web IDE, is the occasional short one-off use of a model, like at the beginning of some plan generation, or debugging from a bunch of context we've assembled manually. For example, we recently solved a nasty GPU code race this way, using a careful mix of logs and distributed code. Most of our job is using Boring Tools to write Boring Code, even if the topic/area is neato: you do not want working in your codebase to be an adventure, so we invest in making it look boring.
I agree with what the other commenter said: I manage context as part of the skill, but by making the AI do it. Doing that by hand is like slowly hand-coding assembly. Instead, I tell Claude Code to do it. Ex: download and crawl some new dependency I'm using for a tricky topic, read in my prompt-template markdown for some task, or generate and self-maintain a plan.md with the high-level context rules I've defined. This is the 80% case.
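Concretely, that looks something like this (the plan.md name and the wording are just my convention, not a universal recipe):

    claude -p "Read plan.md and the context rules at its top. Mark the steps
    we finished this session as done, and append any constraints you
    discovered while crawling the new dependency's docs."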
Maybe one of the disconnects is task latency vs. throughput as trade-offs in human attention. If I need the LLM to get to the right answer faster, so the task is done sooner, I have to lean in more. But my time is valuable and I have a lot to do. I'd rather spend 50% less of my time per task, even if the task takes 4x longer, by letting the LLM spin longer. In that saved human time I can work on another task: I typically have 2-3 terminals running Claude, so I only check in every 5-15 minutes.
We do this ~daily for:
* Multitier webapps
* DevOps infrastructure: docker, aws, ci systems, shell scripts, ...
* Analytics & data processing
* AI investigations (logs, SIEMs, ..) <--- what we sell!
* GPU kernels
* Compilers
* Docs
* Test amplification
* Spec writing
I think ~half the code written by professional software engineers fits into these or other vibe-friendly domains. The stuff antirez does with databases seems close to what we do with compilers, GPU kernels, and infra.
We are still not happy with the production-grade frontend side of coding, though by being strong on API-first design and keeping logic and UI separated, most of our UI code is headless-friendly.
Whether it's vibe coding, agentic coding, or copy-pasting from the web interface to your editor, it's still sad to see the normalization of private (i.e., paid) LLM models. I like the progress that LLMs introduce and I see them as a powerful tool, but I cannot understand how programmers (whether complete nobodies or popular figures) don't mind adding a strong dependency on a third party in order to keep programming. Programming used to be (and still is, to a large extent) an activity that can be done with open and free tools. I am afraid that in a few years that will no longer be possible (as in, most programmers will be so tied to a paid LLM that not using one would be like not using an IDE or vim today), since everyone is using private LLMs. The excuse "but you earn six figures, what's $200/month to you?" doesn't really capture the issue here.
But the price we're paying (and I don't mean money) is very high, imho. We all talk about how good engineers write code that depends on high-level abstractions instead of low-level details, allowing us to replace third-party dependencies easily, test our apps more effectively, and keep the core of our domain "pure". Well, isn't it time we started doing the same with LLMs? I'm not talking about MCP, but rather an open source tool that can plug into either free and open source LLMs or private ones. That would at least allow us to switch to a free and open source version if the companies behind the private LLMs go rogue. I'm afraid, though, that wouldn't be enough, but it's a starting point.
To give an example: what would you think if you had to pay for every single Linux process on your machine? Or for every Git commit you make? Or for every debugging session you perform?
There are open source tools that do exactly that already.
But every single post I read doesn't mention them?
Why would they? Does every single post about a JetBrains feature mention that you can easily switch from JetBrains to an open source editor like VS Code or vim?
Because the models are so much worse that people aren't using them.
Philosophical battles don't pay the bills and for most of us they aren't fun.
There have been periods of my life where I stubbornly persisted in using something inferior for various reasons; maybe I was passionate about it, maybe I wanted it to exist and was willing to spend my time debugging and offering feedback. But there are a finite number of hours in my life, and often I'd much rather pay for something that works well than throw my heart, soul, time, and blood pressure at something that will only give me pain.
It just so happens that closed models are better today.
Has someone computed/estimated the at-cost $$$ value of utilizing these models at full tilt: several messages per minute and at least 500,000-token context windows? What we need is a Wikipedia-like effort to support something truly open and continually improving in quality.
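A rough back-of-envelope, assuming Sonnet-class API pricing of about $3 per million input tokens: 3 messages per minute at 500k tokens each is 1.5M tokens per minute, about $4.50/minute, or roughly $270 per hour of sustained use, before any prompt caching.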
I have been building that for a couple of years now: https://llm.datasette.io
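The same interface works against hosted or local models, for example (model and plugin names here are illustrative):

    llm -m gpt-4o "Summarize the failures" < test.log
    # swap in a local model via a plugin
    llm install llm-gpt4all
    llm -m mistral-7b-instruct-v0 "Summarize the failures" < test.log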
Yet JetBrains has been a business longer than some of my colleagues have been alive, and Microsoft’s Visual Basic/C++/Studio made writing software for Windows much easier, and did not come cheap.
You won't be able to switch to a meaningful vim if you keep channeling your support to closed source software; not for long, anyway.
Best to put money where the mouth is.
Can you make vim work roughly the same way? Probably you can get pretty close. But how many hours do I have to sink into the config? A lot. And suddenly the PyCharm license is cheap.
And it's exactly the same thing with LLMs. You want hand crafted beautiful code, untainted by AI? You can still do that. But I'm paid to solve problems. I can solve them faster/solve more of them? I get more money.
The reason I don't like those arguments is that they conflate two orthogonal things: solving problems and optimizing your tooling. You can optimize PyCharm just as much as you can fiddle with Vim's config. And people are solving problems with Vim just as you do with an IDE. It's just a matter of preference.
In my day job, I have two IDEs, VS Code, and Emacs open. I prefer Emacs for editing and git usage, but there are a few things that only the IDEs can do (as in, I don't bother setting up Emacs to do the same), and VS Code is there because people get dizzy with the way I switch buffers in Emacs.
The alternative is to restrict yourself to “not as good” ones already now.
That problem does not even include lock-in, surveillance, IP theft and all other things that come with SaaS.
Local models are not quite there yet. For now, use the evil bad tools to prepare for the good free tools when they do get there. It's a self-correcting form of technical debt that we will never have to pay down.
Why do I have to prepare? Once the good free tools are available, it should just work no?
To be fair, change is not always good. We still haven't fixed fitness/obesity issues caused (partly) by the invention of the car, 150 years later. I think there's a decent chance LLMs will have the same effect on the brain.
I added a "disclosures" section to my own site recently, in case you're interested: https://simonwillison.net/about/#disclosures
Since when? It starts with computers, the main tool, and their architecture not being free, and goes from there. Major compilers used to not be free. Major IDEs used to not be free. For most things there were decent and (sometimes) superior free alternatives. The same is true for LLMs.
> The excuse "but you earn six figures, what' $200/month to you?" doesn't really capture the issue here.
That "excuse" could exactly capture the issue. It does not, because you chose to make it a weirder issue. Just as before: you will be free to either not use LLMs, use open-source LLMs, or use paid LLMs. Just as before, in the many categories that pertain to programming. It all comes at a cost that you might be willing to pay and somebody else might really not care that much about.
There were and are a lot of non-free ones, but since the 1990s, GCC and interpreted languages and Linux and Emacs and Eclipse and a bunch of kinda-IDEs were all free, and now VS Code is one of the highest marketshare IDEs, and those are all free. Also, the most used and learned programming language is JS, which doesn't need compilers in the first place.
And my point is that that's simply not the case. Different products have always been not free, and continue to be not free. A recent example would be something like Unity, which is not entirely free but has competitors that are entirely free and open source. JetBrains is something someone else brought up.
Again: You have local LLMs and I have every expectation they will improve. What exactly are we complaining about? That people continue to build products that are not free and, gasp, other people will pay for them, as they always have?
There's never been anything stopping you from building your own
Soon there will be. The knowledge of how to do so will be locked behind LLMs, and other sources of knowledge will be rarer and harder to find as a result of everything switching to LLM use
How are LLMs equivalent? People posting their prompts on bulletin boards at cafes?
You have very powerful open-weight models, but they are not the cutting edge. Even those you can't really run locally, so you'd have to pay a third party to run them.
Also, the competition is awesome to see: these companies are all trying hard to get customers, build the best model, and drive prices down, giving you options. No one company has all the power; it's great to see capitalism working.
Just like every other subscription model, including the one in the Black Mirror episode, Common People. The value is too good to be true for the price at the beginning. But you become their prisoner in the long run, with increasing prices and degrading quality.
See "Anthropic tightens usage limits for Claude Code without telling users" (techcrunch.com): https://news.ycombinator.com/item?id=44598254
It isn't specific to software/subscriptions, but there are plenty of examples of quality degradation in the comments.
If you feel it is stealing your life, then please feel free to reclaim your life at any time.
Leave the programming to those of us who actually want to do it. We don't want you to be a part of it either
I've been programming professionally since 2012 and still love it. To me the sweet spot must've been the early-to-mid 2000s, with good-enough search engines and ample documentation online.
I understand your frustration but the problem is mostly people. Not the particular skill itself.
Once it becomes economical to run a Claude 4 class model locally you'll see a lot more people doing that.
The closest you can get right now might be Kimi K2 on a pair of 512GB Mac Studios, at a cost of about $20,000.
And how they don't mind freely opening up their codebase to these bigtech companies.
I am on board to agree that pure LLM + pure original full code as context is the best path at the moment, but I’d love to be able to use some shortcuts like quickly applying changes, checkpoints, etc.
My persistent (and not unfounded?) worry is that all the major tools & plugins (Cursor, Cline/Roo) all play games with their own sub-prompts and context “efficiency”.
What’s the purest solution?
If your codebase fits in the context window, you can also just turn on "MAX" mode and it puts it all in the context for you.
IMHO those two variables are 10x (maybe 100x) more explanatory than any vibe coding setup one can concoct.
Anyone who is befuddled by how the other person {loves, hates} using LLMs to code should ask what kind of problem they are working on and then try to tackle the same problem with AI to get a better sense for their perspective.
Until then, every one of these threads will have dozens of messages saying variations of "you're just not using it right" and "I tried and it sucks", which at this point are just noise, not signal.
After all the effort getting to the point where the generated code is acceptable, one has to wonder: why not just write it yourself? The time spent typing is trivial compared to all the cognitive effort involved in describing the problem, and describing the problem in a rigorous way is the essence of programming.
You know, I would often ask myself that very question...
Then I discovered the stupid robots are good at designing a project: you ask them to produce a design document, argue over it with them for a while, make revisions and changes, explore new ideas, and then, finally, ask them to produce the code. It's like being able to interact with the yaks you're trying to shave; what's not to love about that?
When the change is a very small, self-contained feature/refactor, it can mostly work alone, and if you have tests that cover the feature it is relatively safe (and you can do other stuff because it is running in an action, which is a big plus: write the issue and you are done; sometimes I have had Claude write the issue too).
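The small-task loop is literally just filing the issue, roughly like this (the issue contents here are hypothetical; this assumes the Claude GitHub action is installed and, as in my setup, triggered by an @claude mention):

    gh issue create \
      --title "Add retry with backoff to the flaky fetch call" \
      --body "Wrap the fetch in exponential backoff, max 3 attempts.
    Existing tests must keep passing. @claude"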
When it gets to a medium size, it will often produce something that appears to work but actually doesn't. Maybe I don't have test coverage and it is my fault, but it does this the majority of the time. I have tried writing the issue myself, adding more info to claude.md, and letting Claude write the issue so it is in a language it understands, but nothing works, and it is quite frustrating because you spend time on the review and then see something wrong.
And anything bigger, unsurprisingly, it doesn't do well.
PR reviews are good for small/medium tasks too. The bar is lower here, though: much of it is useless, but it does catch things I have missed.
So, imo, it is still quite a way from being able to do things independently. For small tasks, I just get Claude to write the issue and wait for the PR... that is great. For medium tasks (which are most tasks), I don't need to do much actual coding, just direct Claude... but that means my productivity is still way up.
I did try Gemini, but I found that when you let it off the leash and accept all edits, it would go wild. We have Copilot at work reviewing PRs, and it isn't so great. Maybe Gemini is better on large codebases where, I assume, Claude will struggle.
That said, are there tools that make going through a codebase easier for LLMs? I guess tools like Claude Code simply grep through the codebase and find out what Claude needs. Is that good enough or are there tools which keep a much more thorough view of the codebase?
I used a similar setup until a few weeks ago, but coding agents became good enough recently.
I don't find context management and copy-pasting fun; I let GitHub Copilot Insiders or Claude Code do it. I'm still very much in the loop while vibe coding.
Of course it depends on the code base, and Redis may not benefit much from coding agents.
But I don’t think one should reject vibe coding at this stage, it can be useful when you know what the LLMs are doing.
antirez is a big fuggin deal on HN.
I’m sort of curious if the AI doubting set will show up in force or not.
I ran up a $100 tab trying to get the LLMs to solve this problem, and they were worthless. Yes, I still use them in the workflow within the web console chat; nonetheless, the project is hard and I'm flustered wracking my brain, but at least I don't have to worry about LLMs replacing me just yet.
* I thought about calling the library Playlite but went with something meaning the same as a puppeteer and a playwright, Cordyceps! :)
You can use an LLM to help document a codebase, but it's still an arduous task because you do need to review and fix up the generated docs. It will make mistakes, sometimes glaring, sometimes subtle. And you want your documentation to provide accuracy rather than double down on, or even introduce, misunderstanding.
----
Several codebases I've known have provided a three-stage pipeline: unit tests, integration tests, and e2e tests. Each of these batches of tests depend on the creation of one of three environments, and the code being tested is what ends up in those environments. If you're interested in a particular failing test, you can use the associated environment and just iterate on the failing test.
For humans with a bit of tribal knowledge about the project, humans who have already solved the get-my-dev-environment-set-up problem in more or less uniform way, this works ok. Humans are better at retaining context over weeks and months, whereas you have to spin up a new session with an LLM every few hours or so. So we've created environments for ourselves that we ignore most of the time, but that are too complex to be bite sized for an agent that comes on the scene as a blank slate every few hours. There are too few steps from blank-slate to production, and each of them is too large.
But if successively more complex environments can be built on each other in arbitrarily many steps, then we could achieve finer granularity. As a nix user, my mental model for this is function composition where the inputs and outputs are environments, but an analogous model would be layers in a Dockerfile where you test each layer before building the one on top of it.
Instead of maybe three steps, there are eight or ten. The goal would be to have both the code that builds each environment and the code that tests it paired up into bite-sized chunks, so that a failure in the pipeline points you to a specific stage, which is more specific than "the unit tests are failing". Ideally, test coverage and implementation complexity get distributed uniformly across those stages.
Keeping the scope of the stages small maximizes the amount of your codebase that the LLM can ignore while it works. I have a flake output and nix devshell corresponding to each stage in the pipeline, and I'm using pytest to mark tests based on which stage they should run in. So I run the agent from the devshell that corresponds with whichever stage is relevant at the moment, and I introduce it to only the tests and code that are relevant to that stage (the assumption being that all previous stages are known to be in good shape). Most of the time, it doesn't need to know that it's working on stage 5 of 9, so the codebase "feels" smaller than it actually is.
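In practice each stage boils down to a devshell plus a marker expression, something like this (stage names illustrative):

    # enter the stage-5 environment and run only its tests
    nix develop .#stage5
    pytest -m stage5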
If evidence emerges that I've engaged the LLM at the wrong stage, I abandon the session and start over at the right level (now 6 of 9 or somesuch).