But, hey; it worked on me. I'm going to try it, since I've been looking for exactly this.
Of course, the question is price. It's "free during beta".
The one thing we don't see news stories about is how open source has finally crushed that big evil corp, because AI-multiplier times many open source contributors is Unlimited Power (tm)
Writing any project that has a lot of interactive 'dialogues', or exacting and detailed comments, eats a lot of tokens.
My record for tapping out the Claude Max API quickly was sprint-coding a poker solver and accompanying web front end w/ Opus. The backend had a lot of gpgpu stuff going on, and the front end was extremely verbose w/ a wordy ui/ux.
For example “commit and push”
You can make it somewhat better by adding instructions in CLAUDE.md, but I did notice those instructions getting ignored from time to time unless you "remind it".
See for yourself: https://github.com/search?utf8=%E2%9C%93&q=%F0%9F%A4%96+Gene...
I have not needed multiple agents or running CC over an SSH terminal overnight. The main reason is that LLMs are often incorrect, so I still need time to test: run the whole app, check what broke in CI (GitHub Actions), etc. I do not go through code line by line anymore, and I organize work with tickets (sometimes they are created with CC too).
Both https://github.com/pixlie/Pixlie and https://github.com/pixlie/SmartCrawler are vibe coded (barely any code that I wrote). With LLMs you can generate code 10x faster than writing it manually. It means you can also get 10x the errors. So the manual checks take some time.
Our existing engineering practices are very helpful when generating code through LLMs, and I do not have the mental bandwidth to review a mountain of code. I am not sure that scaling out LLMs will help in building production-quality software. I already see that CC sometimes makes really poor guesses. Imagine many such guesses in parallel, daily.
edit: typo - months/weeks
I mean, it's probably the most linked YouTube video by a factor of 100, so it makes sense for it to be hardcoded in the model.
This genuinely isn't an attack, I just don't think you can? The AI isn't granted copyright over what it produces.
For code generated by an LLM the human user would likely be considered the author if you provided sufficient creative input, direction, or modification.
The level of human involvement matters, simply prompting "write me a function" might not be enough, but providing detailed specifications, reviewing, and modifying the output would strengthen the claim.
the Copyright, Designs and Patents Act 1988 (CDPA), Section 9(3) states, "In the case of a literary, dramatic, musical or artistic work which is computer-generated, the author shall be taken to be the person by whom the arrangements necessary for the creation of the work are undertaken". This was written before LLMs existed, but recent academic literature has supported this position: https://academic.oup.com/jiplp/article/19/1/43/7485196?login...
However, a comparable situation was tested with Thaler v Comptroller-General, where courts emphasised that legal rights require meaningful human involvement, not just ownership of the AI system. - https://www.culawreview.org/journal/unlocking-the-canvas-a-l... and https://www.whitecase.com/insight-our-thinking/uk-supreme-co...
I do acknowledge there is uncertainty, and this is highlighted here in "The Curious Case of Computer-Generated Works under the Copyright, Designs and Patents Act 1988.", with "section 9(3): the section is either unnecessary or unjustifiably extends legal protection to a class of works which belong in the public domain" - https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4072004
Today, I think it's doubtful that a functional application can be entirely vibe coded without decent direction and modification, but I don't think that will always be the case.
for code it hasn't been challenged yet, but I find it doubtful they'd decide differently there
So far, the judge believes that training models on open source code is not a license violation, as the code is public for anyone to read; but whether "distribution or redistribution" (I assume, of the model's outputs?) violates the terms of the license, among other laws, is still up for the court to decide.
The case has currently been moved to the Ninth Circuit without a decision in the district court, as there are other similar cases (such as the Authors Guild's) and they wanted the courts to offer consistent rules. I believe one of the big delays in the case is over damages: I think the plaintiff tried to ask for details of Microsoft's valuation of GitHub when it was acquired, as GitHub's biggest asset is its Git repositories, which may put a monetary value on how much each project is worth. Microsoft is trying to stall and not reveal this.
There was a much appealed case of a monkey taking a photo, where it was decided the photo was in the public domain.
https://en.wikipedia.org/wiki/Monkey_selfie_copyright_disput...
It boiled down to the creator not being a "legal person" and so could not hold copyright.
The real problem for software is where the line is for a "sufficient" transformation from the source material by a human to make it acquire copyright. You can write a Dickens' character derived novel and have copyright in it, but not gain control over those characters as Dickens described them.
Claim partial copyright without specifying clearly what exactly?
People sell annotated Bibles, or Shakespeare, etc. You can transform it into something that can acquire copyright, but it must have an artistic step.
This is a big thing in the fine art world as well, you can take inspiration, you can in some circumstances outright copy, but then you need to transform it sufficiently that it becomes your own art. People argue in front of judges about this stuff, of course.
Verne is a good example too, because if you print an English version, the translator acquires copyright in the translated version.
The "barely" part may be important and I would like to know what others are doing.
I get that people do it anyway but I guess it's kind of a grey-area because it's hard to tell after the fact that some snippet has been copied from SO.
Unless the product includes code licensed by others, then - like any other repo - I don't see any license issue here.
If you mean there is no insight as to whether licensed code is included, that's one of the constraints of vibe-coding (which people often confuse with AI-assisted coding).
It's the job of the user to check and curate the contributions as they would any third-party human input (e.g. via PRs). Again though, that's not an AI coding issue, but a human process decision.
If you tried to sue someone for copyright infringement based on code that an LLM generated for you, you'd be laughed out of court.
Use a hammer, you own the output. Use an intern, the intern does.
Of course if you aren't a person you can't own anything.
They can say the code is in the public domain.
This is distinct from open source, yes, but in almost all cases less restricted than anything with a (open source or otherwise) license.
Take whatever you want and relicense it cause it doesn't belong to the "author"
Lolololololol "author"
The legal standards in the United States for software copyrights are Jaslow and Altai, known to Federal courts as SSO [0] and AFC [1], respectively.
These standards consider the overall structure of code as being copyrightable. This means that you can't just rename a bunch of variables and class names. The overall organization of the code is considered an arbitrary expression. Someone would be infringing on copyright if they took your Java code and converted it to Python with different class, variable and function names but kept the same relationships between classes and the same general structure.
So what does this have to do with LLMs? Well, if the author directed the code to be structured in a certain way, directed it to create specific APIs, etc., then there is a legal argument that the author holds copyright over at least the arbitrary and expressive decisions that were made while building a software system.
[0] https://en.wikipedia.org/wiki/Structure,_sequence_and_organi...
[1] https://en.wikipedia.org/wiki/Abstraction-Filtration-Compari...
Rewriting it to something sane would be harder and more time consuming than just writing a decent implementation upfront.
Edit: And more critically, a lot of work incoming for people that can teach software development.
But: if we reduce cost, then can we not add deterministic guardrails in software that are also maintained at LLM speed and cost? This is pretty much what I am trying to understand. The choice of Rust/TypeScript in my projects is very intentional, and you may see why.
Now the critical point: what if my own quality is inferior to many others? I think we will have this issue a lot, since LLMs can generate code at 10x speeds. The quality will be what the human operator sets. Or, maybe, more and more tools will adopt best practices as guardrails and not give humans much option to steer away from them.
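A deterministic guardrail can be as simple as a mechanical gate that every LLM-generated patch must pass before a human looks at it. A minimal sketch in Python, where the check is just syntax compilation; a real gate would chain linters, type checkers, and the test suite the same way:

```python
import subprocess
import sys

def passes_guardrails(paths):
    """Accept a patch only if every file passes a mechanical check.

    Here the check is merely that the file compiles as Python; in
    practice you would add linters, type checkers and tests to the
    same loop. Rejects the whole patch on the first failure.
    """
    for path in paths:
        result = subprocess.run(
            [sys.executable, "-m", "py_compile", str(path)],
            capture_output=True,
        )
        if result.returncode != 0:
            return False
    return True
```

The appeal is that this runs at machine speed and cost, so it scales with the 10x code volume in a way that human review does not.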
Reinventing this space but making it slow and expensive doesn't seem like a serious business idea. I believe the business idea behind coding LLM SaaS is actually about looking into other corporations and seeing what they do and how.
"Oh hey <boss>, can you update the status page to say we can't really understand the code and don't have an eta but but we're trying to talk the ai into correcting whatever the issue is?"
Probably more that would irk me if I looked closely.
The biggest bottleneck for background agents is code review.
I'm building a tool that can give the first pass so the result of the background agent isn't garbage most of the time.
If people use Claude without a critical eye, our code bases will grow immensely.
Sounds like the baseline for programming in teams. People are more likely to write their own helpers or install dependencies than to learn what's already available in the repository.
And fwiw I use cc with a complex code base and an async workflow like the author describes daily.
Local git worktrees are way faster.
The painful thing for me right now is that setting up the agent instructions, permissions, and sandboxes varies for each of those use cases, and the tooling around that is very rudimentary.
It's already great at spinning up 5+ agents working on different PRs, triggered by just @mentioning claude on any github issue.
EDIT: The command does it now, thanks! I tried it a few weeks ago and it didn't, so this is great.
That said, Terragon (which is akin to Codex and Jules) is often "too autonomous" for my taste. Human in the loop is made more difficult: commenting may not be enough, and I can't edit the code from those tools because the workspace is so ephemeral and/or remote.
How are people just firing them off to build stuff with any confidence?
AI won't magically know your codebase unless it is pretty vanilla - but then you teach it. If it makes a mistake, teach it by adding a rule.
You have to confine the output space or else you quickly get whatever.
I added a web server on top, so I can use Claudia from my phone now: https://github.com/getAsterisk/claudia/pull/216
For example: changing the type signatures of all functions in a module to pass along some extra state, a huge amount of work. I ended up reverting the changes and replacing the functionality with thread local storage (well, dynamically scoped variables).
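The dynamic-scoping replacement can be sketched in Python with `contextvars` (the original was presumably in another language, and `request_id` here is a hypothetical piece of state): instead of threading an extra parameter through every function signature, one variable is set at the boundary and read wherever it's needed.

```python
import contextvars

# Dynamically scoped state: no function signature has to mention it.
request_id = contextvars.ContextVar("request_id", default=None)

def inner():
    # Reads the current value without it being passed in explicitly.
    return f"handling {request_id.get()}"

def outer():
    # Untouched middle layer: no type-signature changes needed here.
    return inner()

def handle(rid):
    # Set the state at the entry point, restore it on the way out.
    token = request_id.set(rid)
    try:
        return outer()
    finally:
        request_id.reset(token)
```

The trade-off is the usual one with dynamic scope: call sites get simpler, but the data flow is no longer visible in the signatures.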
So, definitely not a panacea, but still well worth the money.
I have yet to meet someone in meatspace who is competent and isn't of the opinion that LLM SaaS are fine for querying out a bit of bash or Python here and there, but not much else. Sometimes people or corporations on the Internet are like 'wow, our entire repo is vibbed, so cool' and when I go look it's as if a bootcamped intern wrote it unsupervised but with a lot more Markdown I'm not going to look at, or a toy or a simple tool gluing some libraries.
One of the next features I'm expecting wrappers to add on top is auto-translation. In many work contexts it makes more sense to translate what the user said to English, process that and translate the answer back than ask the model to speak the language natively.
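The wrapper logic is a simple sandwich. A sketch with placeholder functions, where `translate_to_english`, `translate_from_english`, and `ask_model` are hypothetical stand-ins for real translation and LLM APIs:

```python
def translate_to_english(text: str, source_lang: str) -> str:
    # Placeholder: a real wrapper would call a translation API here.
    return text

def translate_from_english(text: str, target_lang: str) -> str:
    # Placeholder, as above.
    return text

def ask_model(prompt: str) -> str:
    # Placeholder for the underlying LLM call.
    return f"answer to: {prompt}"

def answer_in_user_language(user_text: str, user_lang: str) -> str:
    # Translate in, process in English, translate back out.
    english = translate_to_english(user_text, user_lang)
    english_answer = ask_model(english)
    return translate_from_english(english_answer, user_lang)
```

The bet is that the model's English capability is strong enough that two cheap translation hops beat asking it to reason natively in the user's language.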
I easily double my output with very little effect on quality. It does require constant supervision though; I don't see AI producing professional-quality code by any means other than human supervision for a looooong time.
So, it's kinda both. Terragon works on separate tasks in parallel, Claude Code farms out subtasks and sometimes also in parallel.
yes
I'm hoping this will commodify the BaaS sector which is good news if you're competing on efficiency and functionality as opposed to relying on network effects.
I guess a lot of folks are vibe coding 90%+ their side hustles to the point they require 5 instances of CC, each running 8 subagents?
Anyway, lightly discussed at the time https://news.ycombinator.com/item?id=44193933
You aren't a programmer. The basic nature of your work is not serving users with better software, but buying $1000 of currency for $200.
It’s easy to see how trivial it will be to have an adwords type operation where output tokens are replaced with paid values.
For example:
this_variable_is_sponsored_by_starbucks = 42
Unfortunately, I can't say the same for other types of tests like E2E tests. It makes sense why: CC doesn't have access to all of the context to determine what's going wrong. During an E2E test suite run it can't pause to look into the console, view what's on the page, look at the backend logs, etc. I tried writing some utilities to snapshot and log the state of parts of the page when a test failed, and this did help a bit, but not enough to get the same productivity boost for other types of tests.
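For what it's worth, such snapshot utilities can look roughly like this framework-agnostic sketch, where `collect_state` is a hypothetical callback returning whatever you can grab at failure time (console output, page HTML, a backend log tail):

```python
import sys
from contextlib import contextmanager

@contextmanager
def snapshot_on_failure(collect_state, sink=sys.stderr):
    """Run a test step, dumping collected context only if it raises."""
    notes = []
    try:
        # The test logs breadcrumbs as it goes via the yielded callable.
        yield notes.append
    except Exception:
        sink.write("=== failure snapshot ===\n")
        for line in notes:
            sink.write(line + "\n")
        sink.write(repr(collect_state()) + "\n")
        raise  # re-raise so the test still fails normally
```

The point is that the breadcrumbs and state only hit the logs on failure, which is exactly the context the agent otherwise lacks when a suite run goes red.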
Has anyone had any luck with this? Any strategies to share?
I built a tool that understands the codebase and can give the agent the first pass. So you don't need to spend mental bandwidth to review mountains of code.
If anyone is open to trying - email is in profile