This matters a lot to us because the difference in performance of our workflows can be the difference between $10/day and $1,000/day in costs.
Just like TFA stresses, it's the expertise on the team, the instinct to push back against poor AI-generated ideas and code ("Surely this isn't the right way to do this?"), that is keeping our business within reach of cash-flow positive.
- accountability
- reliability
- validation
- security
- liability
Humans can reliably produce text with all of these features. LLMs can reliably produce text with none of them.
If it doesn't have all of these, it could still be worth paying for if it's novel and entertaining. IMO, LLMs can't really do that either.
These are the kinds of people who can use generative AI best, IMO. Deep domain knowledge is needed to spot when the model output is wrong even though it sounds 100% correct. I've seen people take a model's output as correct to a shocking degree, like placing large bets at a horse track after uploading a pic of the schedule to ChatGPT. Many people believe whatever a computer tells them, but, in their defense, no one has had to question a large calculation done by a calculator until now.
That said, I use Antigravity with great success for self-hosted software. I should publish it.
Why haven't I?
* The software is pretty specific to my requirements.
* Antigravity did the vast majority of the work, so it feels unworthy?
* I don't really want a project, but that shouldn't really stop me pushing to a public repo.
* I'm a bit hesitant to "out" myself?
Nonetheless, even though I'm not that person, I'm surprised there isn't more evidence out there.
https://github.com/schoenenbach/thermal-bridge https://thermal-bridge.streamlit.app/
Here are several real stories I dug into:
"My brick-and-mortar business wouldn't even exist without AI" --> meant they used Claude to help them search for lawyers in their local area and summarize permits they needed
"I'm now doing the work of 10 product managers" --> actually meant they create draft PRDs. Did not mention firing 10 PMs
"I launched an entire product line this weekend" --> meant they created a website with a sign-up, and it shows them a single JavaScript page, no customers
"I wrote a novel while I made coffee this morning" --> used a ChatGPT agent to make a messy mediocre PDF
The content of the tweets isn't the thing; bull-posting or invoking Cunningham's Law is. X is the destination for formula posting, and some of those blue checkmarks are getting "reach" rev-share kickbacks.
"I used AI to make an entire NES emulator in an afternoon!" --> a project that has been done hundreds of times and posted all over GitHub with plenty of references
It is really good at this.
Those ideas are things like UI experiments or small tools that help me do stuff.
It's also super great at ELI5'ing anything.
Welcome to the internet
There was a story years ago about someone who published hundreds of novels on Amazon; in aggregate they pulled in a pretty penny. I wonder if someone's doing the same but with ChatGPT instead.
“yes it will”, “no it won’t” - nobody really knows, it's just a bunch of extremely opinionated people rehashing the same tired arguments across 800 comments per thread.
There’s no point in talking about it anymore, just wait to see how it all turns out.
Doesn't help that no one talks about exactly what they are doing and exactly how they are doing it, because capitalism wins out over the kind of open-technology discussion that's meant to uplift the species.
1) The prompts/pipelines pertain to proprietary IP that may or may not be allowed to be shown publicly.
2) The prompts/pipelines are boring and/or embarrassing, and showing them will dispel the myth that agentic coding is some mysterious, magical process and open people up to dunking.
For example, in the case of #2, I recently published the prompts I used to create a terminal MIDI mixer (https://github.com/minimaxir/miditui/blob/main/agent_notes/P...) in the interest of transparency, but those prompts correctly indicate that I barely had an idea how MIDI mixing works, and in hindsight I'm surprised I didn't get harassed for it. Given the contentious climate, I'm uncertain how often I'll be open-sourcing my prompts going forward.
The results (for me) are very much hit-and-miss, and I still see it as a means of last resort rather than a reliable tool whose upsides and downsides I know. There is a pretty good chance you'll be wasting your time, and every now and then it really moves the needle. It is examples like yours that actually help to properly place the tool among the other options.
1) the code AI produces is full of problems, and if you show it, people will make fun of you, or
2) if you actually run the code as a service people can use, you'll immediately get hacked by people to prove that the code is full of problems.
2) There are plenty of services which do not require state or login and can't be meaningfully hacked, so there are still plenty of use cases you can explore. But yes, I do agree that security is still the biggest worry for live production things. Let's be honest, though: if you don't have a real security person on your team, the stuff out there isn't secure anyway. Small companies do not know how to build securely.
There are also lessons from the recent shitstorms in the gaming industry: Sandfall over Expedition 33's use of GenAI, and Larian's comments on using GenAI for concept art. Both received massive backlash because they were transparent in interviews about how GenAI was (inconsequentially) used. The most likely consequence of those incidents is that game developers will be less transparent about their development pipelines.
If your hand is good, throw it down and let the haters weep. If you're scared to show your cards, you don't have a good hand and you're bluffing.
However, I'm not nearly organized enough to save all my prompts! I've tried to do it a few times for my own reference. The thing is, when I use Claude Code, I do a lot of:
- Going back and revising a part of the conversation and trying again—sometimes reverting the code changes, sometimes not.
- Stopping Claude partway through a change so I can make manual edits before I let Claude continue.
- Jumping between entirely different conversation histories with different context.
And so on. I could meticulously document every action, but I find it gets in the way of experimentation. It's not entirely different from trying to write down every intermediate change you make in your code editor (between actual VCS commits). I guess I could record my screen, but (A) I promise you don't actually want to watch me fiddle with Claude for hours and (B) it would make me too self-conscious.
It would be very cool to have a tool that goes through Claude's logs and exports some kind of timeline in a human-readable format, but I would need it to be automated.
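A tool like that could start as a small script. Here's a minimal sketch; note that the log location and schema (JSONL lines with `timestamp`, `role`, and `content` fields) are assumptions for illustration, not the actual Claude Code log format, which would need checking.

```python
import json
import sys
from pathlib import Path

def export_timeline(log_path: str) -> str:
    """Render a JSONL conversation log as a Markdown timeline.

    Assumes (hypothetically) each line is a JSON object with
    'timestamp', 'role', and 'content' fields; adapt to the
    real log schema.
    """
    lines = []
    for raw in Path(log_path).read_text().splitlines():
        if not raw.strip():
            continue
        entry = json.loads(raw)
        ts = entry.get("timestamp", "?")
        role = entry.get("role", "unknown")
        content = str(entry.get("content", ""))
        # Truncate long messages so the timeline stays skimmable.
        snippet = content[:200] + ("..." if len(content) > 200 else "")
        lines.append(f"- **{ts}** `{role}`: {snippet}")
    return "\n".join(lines)

if __name__ == "__main__":
    print(export_timeline(sys.argv[1]))
```

The hard parts the sketch skips are exactly the ones described above: stitching together revised branches of a conversation and interleaving manual edits made between turns.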
---
Also, if you can't tell from the above, my use of Claude is very far from "type a prompt, get a finished program." I do a lot of work in order to get useful output. I happen to really enjoy coding this way, and I've gotten great results, but it's not like I'm typing a prompt and then taking a nap.
You nailed it. Prompting is dull and self-evident. Sure, you need basic skills to formulate a request. But it's not a science, and it has nothing to do with engineering.
A roulette wheel can still be useful if enough of the outcomes are wins, but that doesn't make it something you can get better at...
Edit: I don't find it "dull" though, possibly because I like writing. And, I suppose there is some skill to being able to describe what you want precisely enough for the AI to (hopefully) follow your instructions.
Of course you can also get rich selling scams.
My wife, who has no clue about coding at all, built a very basic Android app with only ChatGPT's guidance. She would never have been able to do this in 5 hours or so otherwise. I DID NOT HELP HER at all.
I'm 'vibecoding' small stuff for sure, non-critical things for sure, but let's be honest: I'm transforming a handful of sentences and requirements into real working code, today.
Gemini 3 and Claude Opus 4.5 definitely feel better than their previous versions.
Do they still fail? Yeah, for sure, but that's not the point.
The industry continues to progress on every single aspect of this: tooling like Claude CLI, Gemini CLI, IntelliJ integration, etc., plus context length, compute, inference time, quality, depth of thinking. There is no plateau visible at all right now.
And it's not just LLMs, it's the whole ecosystem of machine learning: highly efficient weather models from Google, AlphaFold, AlphaZero, robotics movement, environment detection, image segmentation, ...
And the power of Claude, for example, you will only get by learning how to use it: telling it your coding style, your expectations regarding tests, etc. We often assume that an LLM should just be the magic 10x-programmer colleague, but it's everything and nothing. If you don't communicate well enough, it is not helpful.
And LLMs are not just good at coding; they're great at reformulating emails, analysing error messages, writing basic SVG files, explaining Kubernetes cluster status, being a friend for some people (see character.ai), explaining research papers, finding research, summarizing text; the list is way too long.
In 2026 alone, so many new datacenters will go live, adding so much more compute, that the research will continue to get faster and more efficient.
There is also no current bubble to burst. Google fights against Microsoft, Anthropic, and co., while on a global level the USA competes with China and the EU on this technology. The richest companies on the planet are investing in this tech, and they did not do that with Bitcoin, because they understood that Bitcoin is stupid. But AI is not stupid.
Or machine learning, at least, is not stupid.
Do not underestimate the current state of the AI tools we have; do not underestimate the speed, the continuous progress, and the potential exponential growth of all this.
My expected timespan for obvious advancements in AI is 5-15 years. Experts in this field already predict 2027-2030.
But to reiterate: a few years ago, no one had a good idea of how we could transform basic text into complex code in such a robust way, with such diverse input (different languages, missing specs, ...). No one. Not even 'just generating a website'.
If AI were so good today, why isn't there an explosion of successful products? All we see are these half-baked "zomg so good bro!" examples that are technically impressive but decidedly incomplete; really, proofs of concept.
I'm not saying LLMs aren't useful, but they're currently completely misrepresented.
Hype sells clicks, not value. But, whatever floats the investors' boat...
Sitting for 2 hours with an AI agent developing end-to-end products does.
It does not matter if they get the details wrong; it just needs to be vague enough and exciting enough. In fact, vagueness and not sharing the code signals that they are doing something important, or that they are 'in the know' about something they cannot share. The incentives are totally inverted.
But I would never sit down to convince a person who is not a friend. If someone wanted me to do that, I'd expect to charge them for it. So the guys who are doing it for free are either peddling bullshit or they have some other unspecified objective and no one likes that.
We're drowning in tweets, posts, news... (way more than anyone can reasonably consume). So what rises to the top? The most dramatic, attention-grabbing claims. "I built in 1 hour what took a team months" gets 10k retweets. "I used AI to speed up a well-scoped prototype after weeks of architectural thinking" gets...crickets
Social platforms are optimized for engagement, not accuracy. The clarification thread will always get a fraction of the reach of the original hype. And the people posting know this.
The frustrating part is there's no easy fix. Calling it out (like this article does) gets almost no attention. And the nuanced followup never catches up with the viral tweet.
> If you're confident that you know how to securely configure and use Wireguard across multiple devices then great
https://news.ycombinator.com/item?id=46581183
What happened to your overconfidence in LLMs' ability to help people without previous experience do something they were unable to do before?
Masks during Covid and LLMs, used as political pawns. It’s kind of sad.