I've personally found this is where AI helps the most. I'm often building pretty sophisticated models that also need to scale, and nearly all SO/Google-able resources tend to be stuck at the level of "fit/predict" thinking that so many DS people remain limited to.
Being able to ask questions about non-trivial models as you build them, really diving into the details of exactly how certain performance improvements work and what trade-offs there are, and even just getting feedback on your approach, is a huge improvement in my ability to land a solid understanding of the problem and my solution before writing a line of code.
Additionally, it's incredibly easy to make a simple mistake when modeling a complex problem, and getting that immediate feedback is a kind of debugging you can otherwise only get on teams with multiple highly skilled people (which at a certain level is a luxury reserved for people working at large companies).
For my kind of work, vibe-coding is laughably awful, primarily because there aren't tons of examples of large ML systems for the relatively unique problems you're often tasked with. But avoiding mistakes in the initial modeling process feels like a superpower. On top of that, quickly being able to refactor early prototype code into real pipelines speeds up many of the most tedious parts of the process.
Regardless, I do find that o3 is great at auditing my plans or implementations. I will just ask "please audit this code" and it has like a 50% hit rate on giving valuable feedback to improve my work. This feels like it has a meaningful impact on improving the quality of the software that I write, and my understanding of its edge cases.
The writing part was never the bottleneck to begin with...
Figuring out what to write has always been the bottleneck for code
AI doesn't eliminate that. It just changes it to figuring out if the AI wrote the right thing
So having an AI do the dangerous part, the thinking, leaves humans to do what they do best: follow orders.
Even better, AI will take on the responsibility when anything fails: just get the AI to fix it; after all, the AI coded the mistake.
An individual consumer doesn't derive any benefit from companies missing out on automation opportunities.
Would you prefer to buy screws that are individually made on a lathe?
They weren’t cheap soups, but they sure were good.
A high end soup and an affordable soup might be serving two different markets.
I personally think a far more likely scenario is that small businesses of one or a few people become vastly more commonplace. They will be able to do a lot more by themselves, including with less expertise in areas they may not have a lot of knowledge in. I don't think regular employees today should see LLMs as competition, rather they should see it as a tool they can use to level the playing field against current CEOs.
LLMs aren't some magic silver bullet to elevate people out of poverty. Lack of access to capital is an extreme restriction on what most people can actually accomplish on their own; it doesn't matter if they have the world's best LLM helping them or not.
It doesn't matter if you use an LLM to build the most brilliant business in the world if you can't afford to buy real-world things to do real-world business.
Also, historically, when regular people decide to level the playing field against the ultra-wealthy, they use violence.
I don't think anyone should be expecting LLMs to be the great equalizer. The great equalizer has always been violence and probably always will be.
I don't think that was your point, but pressed screws have much better properties than cut screws.
There will not be a "quality" dial that you get to tweak to decide on your perfect quality of soup. There will be gradations, and you will be stuck with whatever the store provides. If you want medium-quality soup, but the store only carries 3 brands of soup (because unlike in your utopia, somebody actually has to maintain an inventory and relationships with their supply chain) and your favourite brand decides to bottom out its quality, it's not "good actually" because of economic whatever. Your soup just sucks now.
Oh but "the market will eventually provide a new brand" is a terrible strategy when they start spicing the soup with lead to give it cinnamon flavor or whatever.
I'm not an ethereal being. I'm a human, I need it to be good now. Not in theory land.
- consolidation, such that there are only a few different choices of soup
- a race to the bottom in quality
- poisoning
These are all possibilities under our current system, and we have mechanisms (laws and market competition) which limit the extent to which they occur.
What is it about extreme automation technology that you think will increase the prevalence of these issues? By what mechanisms will these issues occur more frequently (rather than less frequently) as production technology becomes more capable?
A lot of people think wealth inequality isn't a big deal, but I disagree. The greater the share of money a select few hold compared to everyone else, the higher the likelihood those select few can mold society to their whim. Corruption thrives off of wealth inequality. Without it, it cannot exist.
This is a decent point, but you're describing the world we already live in, no? I mean, we already have significant automation, significant wealth inequality, significant ability for control of money to affect legislation and culture.
But we (in the US at least) generally have abundant access to food which is safe to eat.
> Corruption thrives off of wealth inequality. Without it, it cannot exist.
Corruption can exist without wealth inequality.
Consider a city where teachers and politicians earn the exact same salary (and the same ability to build wealth over time). You might think this eliminates the potential for corruption, but imagine a scenario where politicians heavily rely on teachers' unions for their election campaigns. In this case, politicians might make decisions that benefit the unions (e.g. no performance standards, lifetime tenure, long holidays, keeping underenrolled schools open) in exchange for their support, even if those decisions don't optimize for student achievement (or whatever else taxpayers want schools to promote).
> they start spicing the soup with lead to give it cinnamon flavor or whatever
Like, we all know lead is bad, and we all know that humans are unscrupulous, but at least the human putting lead in the soup knows they're being unscrupulous at that time (probably). For an AI it would just be an entirely favorable optimization.
We're going to find out how far we can trust them, and the limits of that trust will determine where the people need to be.
https://slatestarcodex.com/2014/07/30/meditations-on-moloch/
I want this idea to be drawn to an extreme where I can't buy soup or anything for that matter. Sure I will starve and die soon, but I feel the kind of burning the world will go through will be fun to watch. With tears of course.
“The factory of the future will have only two employees, a man and a dog. The man will be there to feed the dog. The dog will be there to keep the man from touching the equipment” – Warren G. Bennis.
For almost all businesses these days, distribution (getting customers to know about your product and have access to it) is much harder than actually creating the product.
There are gatekeepers to getting an audience, whether it is supermarkets, search engines, big players in the ad tech industry, well-established websites, or traditional media. And they all see the profitability of businesses that rely on them for distribution as money left on the table.
This already exists.
> Just lease a fully-automated factory and a bunch of AI workers,
The current solution is not “fully automated”, but it can be fairly hands off for the owner.
> and you're instantly shipping and making money!
… but this is the hard part. Just because you can make soup doesn’t mean anyone will buy it.
Marketing and distribution are very real challenges in markets like this, and the players in those markets will put tremendous pressure on the owner via their costs such that the owner doesn’t make money unless they commit to producing a relatively high level of volume.
Source: My family members who have retail food product lines.
Every time I do something I add another layer of AI automation/enhancement to my personal dev setup with the goal of trying to see how much I can extend my own ability to produce while delivering high quality projects.
I definitely wouldn't say I'm 10x of what I could do before across the board but a solid 2-3x average.
In some respects like testing, it's perhaps 10x because having proper test coverage is essential to being able to let agentic AI run by itself in a git worktree without fearing that it will fuck everything up.
I do dream of a scenario where I could have a company that's equivalent to 100 or 1000 people with just a small team of close friends and trusted coworkers that are all using this kind of tooling.
I think the feeling of small companies is just better and more intimate and suits me more than expanding and growing by hiring.
Can you give some examples? What’s worked well?
The more you can do to tell the AI what you want via a “code-lint-test” loop, the better the results.
So we get code coverage without all the effort; it works well for well-defined problems that can be verified with tests.
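As a rough illustration, the loop can be as simple as one script the agent runs after every change; the tool names here (ruff, pytest) are placeholders for whatever your project actually uses:

```python
#!/usr/bin/env python3
"""Minimal "code-lint-test" loop an agent can run after each change.

Sketch only: the tools (ruff, pytest) are assumptions; the point is a single
command with readable failures the agent can iterate against.
"""
import subprocess
import sys

STEPS = [
    ("lint", ["ruff", "check", "."]),
    ("test", ["pytest", "-q", "--maxfail=5"]),
]

def main() -> int:
    for name, cmd in STEPS:
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            # Surface only the failing step so the agent gets a focused error.
            print(f"[{name}] FAILED")
            print(result.stdout)
            print(result.stderr, file=sys.stderr)
            return result.returncode
        print(f"[{name}] ok")
    return 0

if __name__ == "__main__":
    sys.exit(main())
```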
- Using AI code gen to make your own dev tools to automate tasks. Everything from "I need a make target to automate updating my staging and production config files when I make certain types of changes" or "make an ETL to clean up this dirty database" to "make a codegen tool to automatically generate library functions from the types I have defined" and "generate a polished CLI for this API for me"
- Using Tilt (tilt.dev) to automatically rebuild and live-reload software on a running Kubernetes cluster within seconds. Essentially, deploy-on-save.
- Much more expansive and robust integration test suites with output such that an AI agent can automatically run integration tests, read the errors and use them to iterate. And with some guidance it can write more tests based on a small set of examples. It's also been great at adding formatted messages to every test assertion to make failed tests easier to understand
- Using an editor where an AI agent has access to the language server, linter, etc. via diagnostics to automatically understand when it makes severe mistakes and fix them
A lot of this is traditional programming but sped up so that things that took hours a few years ago now take literally minutes.
I worry that messing with the AI is the equivalent of tweaking my colour schemes and choosing new fonts.
- anything with good enough adoption is good enough (unless I'm an SME to judge directly)
- build something with it before considering a switch
- they're similar enough that what I learn in one will transfer to others
- everything sucks compared with 2-3 years from now; switching between "sucks" and "sucks+" will look silly in retrospect
I found this didn't take me very long. Try things in order of how popular they seem and keep notes on what you do and don't like.
I personally settled on Zed (because I genuinely like the editor even with the AI bits turned off), Copilot (because Microsoft gave me a free subscription as an active OSS dev) and Claude Sonnet (seems to be a good balance). Other people I work with like Claude Code.
Can you provide concrete details?
When I do projects in this realm, it requires significant discussion with the business to understand how reality is modeled in the database and data, and that info is required before any notion of "clean up" can be defined.
That just leaves the other 80-90% to do manually ;)
Our target deploy environment is K8S if that makes a difference. Right now I’m using mise tasks to run everything
That's a cloud subscription away!
With a good idea and good execution teams can be impressively small.
That is massively wrong, and frankly an insulting worldview that a lot of people on HN seem to have.
The secret is that some companies - usually ones focused on a single highly scalable technology product, and that don't need a large sales team for whatever reason - those companies can be small.
The majority of companies are more technically complex, and often a 1,000 person company includes many, many people doing marketing, sales, integrations with clients, etc.
In many companies, tech, whether good or bad, is not the majority of the workforce, nor is it necessarily the "core competency" of the company, even if they are selling technical products! A much bigger deal is often their sales and marketing, their brand, etc.
It's a bit simplified and idealized, but is actually fairly spot-on.
I have been using AI every day. Just today, I used ChatGPT to translate an app string into 5 languages.
[0] https://www.oneusefulthing.org/p/superhuman-what-can-ai-do-i...
Similar to my experience with the AI voice translation YouTube has: I'd rather listen to the original voice with translated subtitles than a fake voice.
What was useful was that I could explain exactly what the context was, both technical and usability-wise, and it understood enough to provide appropriate translations.
UPDATE: I went and verified it. The translation was absolutely perfect. Not sure what this means for translation services, but it certainly saved me several hundred dollars, and several days, just to add one label prompt to a free app.
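For anyone wondering what this looks like in practice, it's really just a context-rich prompt. A rough sketch using the OpenAI Python client (the string, the context description, and the model name are placeholders, not my actual app's):

```python
from openai import OpenAI  # pip install openai; reads OPENAI_API_KEY from the environment

client = OpenAI()

# Placeholders: swap in the real UI string and a description of where it appears.
ui_string = "Allow notifications for nearby events"
context = (
    "Short prompt label in a free mobile app, shown above a settings toggle. "
    "Keep it concise and idiomatic for a native speaker."
)

for language in ["Spanish (Castilian)", "French", "German", "Italian", "Portuguese"]:
    response = client.chat.completions.create(
        model="gpt-4o",  # assumption: any recent chat model works here
        messages=[
            {"role": "system", "content": f"You are a UI localization assistant. {context}"},
            {"role": "user", "content": f"Translate into {language}: {ui_string}"},
        ],
    )
    print(language, "->", response.choices[0].message.content.strip())
```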
Yes. And sites that give me a poorly translated text (which may or may not be AI-translated) with no means to switch to English are an immediate back-button.
Usually, and especially with technical articles, poor/unreadable translations are identifiable within a few words. If the text seems like it could be interesting, I spend more time searching for the in-English button than I spent reading the text.
It can be plugged into your code forge and fully automated — you push the raw strings and get a PR with every new/modified string translated into every other language supported by your application.
I use its auto-translation feature to prepare quick and dirty translations into five languages, which lets you test right away and saves time for professional translators later — as they have told me.
If anyone is reading this, save yourself the time on AI bullshit and use Weblate — it's a FOSS project.
For my bulk translations, I have used Babbelon[0] for years. They do a great job. I wouldn't dream of replacing them entirely with ChatGPT.
What I would use ChatGPT for is when I need to make a very minor change (like adding a simple prompt label to an app). Maybe just a few words.
Doing that through the translation service is crazy. They have a minimum price, and it can take a day or three to get the results. Since I work quickly, and most of my apps are free (for users; they usually cost me quite a bit), changes can be a problem. Translations are a major "concrete galosh"[1].
With ChatGPT, I can ask, not only for a direct translation, but can also explain the context, so the translation is relevant to the implementation. That's a lot of work for bulk, but quite feasible for small "spot jobs."
As far as responding to reports of issues, that isn't always "black and white." For instance, I live in the US, and Spanish is basically a second US language. But it isn't just "Spanish." We have a dozen different variants, and proponents of each can get very passionate about it.
For Spanish, I have learned to just use Castilian Spanish, most times. No one (except Spaniards) is completely happy, but it prevents too much bellyaching.
In some instances (like highly local sites), choosing a specific dialect may be advisable.
You verify by having a lot of friends, all over, who are true native speakers of languages. They usually aren't up for doing the translations, but are willing to vet the ones you do implement.
Localization is a huge topic, and maybe I'll write about it, sometime.
What happened is that it became the new norm, and the window where you could charge the work of 50 people for a team of 5 was short. Some teams cut their prices to gain market share and we were back to the usual revenue per employee. At some point nobody thought of a CRUD app with a web UI as a big project.
It's probably what will happen here (if AI does give the same productivity boost as languages with memory management and web frameworks did): soon your company with a small team of friends will not be seen by anyone as equivalent to 100 or 1000 people, even if you can achieve the same thing as a company of that size could a few years earlier.
The question is what happens to developers. Will they quit the industry or move to smaller companies?
It's the reverse there, afaict: enterprise + defense tech are booming. AI means we get to do a redo + extension of the code automation era. It's fairly obvious to buyers + investors this time around, so we don't even need to educate. Likewise, in gov/defense tech, Palantir broke the dam, and most of our users there have an instinctive allergic reaction to Palantir+xAI, so it's pretty friendly territory.
I do think the trend of the tiny team is growing, though, and I think the real drivers were the layoffs and downsizings of 2023. People were skeptical about whether Twitter would survive Elon's massive staff cuts, and technically the site has survived.
I think the era of the 2016-2020 empire building is coming to an end. Valuing a manager on their number of reports is now out of fashion, and there's no longer any reason to inflate team sizes.
This morning I used Claude 4 Sonnet to figure out how to build, package and ship a Docker container to GitHub Container Registry in 25 minutes start to finish. Without Claude's help I would expect that to take me a couple of hours at least... and there's a decent chance I would have got stuck on some minor point and given up in frustration.
Transcript: https://claude.ai/share/5f0e6547-a3e9-4252-98d0-56f3141c3694 - write-up: https://til.simonwillison.net/github/container-registry
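If you just want the punchline, the final steps boil down to a build and a push, roughly like this sketch (image name is a placeholder; assumes a prior `docker login ghcr.io` with a personal access token that has the `write:packages` scope):

```python
"""Rough sketch of the final build-and-push steps; names are placeholders."""
import subprocess

IMAGE = "ghcr.io/OWNER/IMAGE_NAME:latest"  # placeholder owner/image

for cmd in (
    ["docker", "build", "-t", IMAGE, "."],  # build from the local Dockerfile
    ["docker", "push", IMAGE],              # push to GitHub Container Registry
):
    subprocess.run(cmd, check=True)
```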
So far, from what I've experienced, AI coding agents automate away the looking-things-up-on-SO part (mostly by violating OSS licenses on GitHub). But that part is only bad because the existing tools for doing it were intentionally enshittified.
My vote for the unintentionally funniest company name. I wonder if they were aware when they landed on it, or if they were so deep in the process that it was too late to change course when they realized what they had done.
AI ended up being a convenient excuse for big tech to justify their layoffs, but Twitter already painted a story about how bloated some organizations were. Now that there is no longer any status in having 9,001 reports, the pendulum has swung the other way: it's now sexy to brag about how few people you employ.
It is really nice to have that, it raises the floor on the skills I'm not good at.
Only if you squint. If you look at the quality of the site, it has suffered tremendously.
The biggest "fuck you" are phishers buying blue checkmarks and putting the face of the CEO and owner to shill scams. But you also have just extremely trash content and clickbaits consistently getting (probably botted) likes and appearing in the top of feeds. You open a political thread and somehow there's a reply of a bear driving a bicycle as the top response.
Twitter is dead, just waiting for someone to call it.
https://www.twz.com/news-features/u-s-has-attacked-irans-nuc...
and see for yourself if Twitter is dead.
It's a shame. Twitter used to be the undefeated king of breaking news.
2. 80% of the posts in that article-thingy are "no longer available".
I highly doubt human nature has changed enough to say that. It's just a down market.
...unless you're shoveling AI itself, I guess.
They were written before the advent of ChatGPT and LLMs in general, especially coding-related ones, so the ceiling must be even greater now. This is doubly true for technical founders, for LLMs aren't perfect, and if your vibed code eventually breaks, you'll need to know how to fix it. But yes, in the future, with agents doing work on your behalf, maybe your own work becomes less and less too.
Revenue per employee, to me, is an aside that distracts from the ideas presented.
Greg Isenberg has some of the best takes on this on X. He articulates the paradigm shift extremely well. @gregisenberg — one example: https://x.com/gregisenberg/status/1936083456611561932?s=46
Ahh yes, fantastic insights.
I doubt it, but maybe?
Tech companies forget that software is easy, the real world is hard. Computers are very isolated and perfect environments. But building real stuff, in meatspace, has more variables than anyone can even conceptualize.
and is also exactly what people want. Having an app is fine and maybe cool, but at some point what I want from my taxi company is to get into a real car with a person who is preferably not a murderer and drive somewhere. The app is not very valuable to me unless it somehow optimizes that desirable part of the exchange.
I've worked a few years in the enterprise now, and the same thing keeps popping up. Startups think they have some cool cutting-edge technology to sell, but we aren't buying technology. We will gladly pay you to take away some real life problem though, but that also means you have to own any problems with your software, since we will be paying for a service, not software.
Digital devices track everything you do, and they then generated so much data that the advertising actually got worse, even though the data was collected with the promise that the adverts would become more appropriate.
Now comes AI to make sense of the data, and the training data (i.e., the internet) is being swamped with AI content, so the training data for AIs is becoming useless.
I wonder what is being invented to remove all the AI content from the training data.
I’m not sure where you are from, but this is not my perspective from Northern California.
1. Apps in general, and Uber in particular, have very much revolutionized the part-time work landscape via gig work. There are plenty of criticisms of gig work if/when people try to do it full time, but as a replacement for part time work, it’s incredible. I always try to strike up a conversation with my uber drivers about what they like about driving, and I have gotten quite a few “make my own schedule” and “earn/save for special things” (e.g., vacations, hobby items, etc.). Many young people I know love the flexibility of the gig apps for part-time work, as the pay is essentially market rate or better for their skill set, and they get to set their own schedule.
2. AirBnB has revolutionized housing. It's easier for folks to realize the middle class dream of buying a house and renting it out fractionally (by the room). I've met several people who have spun up a few of these. Related, mid-term rentals (e.g., weeks or months rather than days or years) are much easier to arrange now than they were 20 years ago. AirBnBs have also created some market efficiency by pricing properties competitively. Note that I think that many of these changes are actually bad (e.g., it's tougher to buy a house where I am), but it's revolutionary nonetheless.
We clearly live in two completely separate parts of the world. I'm from Denmark (where Uber ran away after being told they had to operate as a taxi company), and calling a taxi was never a problem for me. You called the dispatch, said roughly where you were, and they came by with a dude in a car whom you then told where you wanted to go. By now the taxi companies have apps too, but the experience is roughly identical.
The prices suck, but that's not really a usability problem.
> Startups used to brag about valuations and venture capital. Now AI is making revenue per employee the new holy grail.
The corrected form is:
> Startups used to brag about valuations and venture capital. Now AI is making rate of revenue growth per employee the new holy grail.
Specifically, as with all growth capitalism, it is long-term irrelevant how much revenue each employee generates. The factor that is being measured is how much each employee increases the rate of growth of revenue. If a business is growing revenue at +5% YoY, then a worker that can increase that rate by 20% (to +6% YoY) is worth keeping; a worker that can only increase revenue by 5% contributed +0% YoY after the initial boost and will be replaced by automation, AI, etc. (This is also why tech won’t invest in technical debt: it may lower expenses, but those one-time efficiencies are typically irrelevant when increasing the rate of growth of income results in far more income than the costs of the debt.)
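A toy back-of-the-envelope calculation (illustrative numbers only) shows why the compounding rate dwarfs a one-time saving:

```python
# Illustrative only: two otherwise identical businesses, one growing revenue
# 5% YoY and one 6% (i.e. a worker raised the growth *rate* by 20%).
base_revenue = 100.0
years = 10

at_5_percent = base_revenue * 1.05 ** years
at_6_percent = base_revenue * 1.06 ** years

print(f"{at_5_percent:.1f} vs {at_6_percent:.1f} after {years} years")
# Roughly 162.9 vs 179.1: one extra point of growth compounds into far more
# revenue than a similarly sized one-time efficiency gain.
```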
I've noticed across the board, they also spend A LOT of time getting all the data into LLMs so they can talk to them instead of just reading reports, like bro, you don't understand churn fundamentally, why are you looking at these numbers??
I talked to a friend recently who is a plastic surgeon. He told me about a pretty young girl who came in recently with super clear ideas about what she wanted fixed.
Turns out she uploaded her pictures to an LLM and it gave her recommendations.
My friend told her she didn’t need any treatment at this stage but she kept insisting that the LLM had told her this and that.
I’m worried that these young folks just trust whatever these things tell them to do is right.
Adding another bit - the multi-modality brings them a step closer to us. Go ahead and use the physical whiteboard, then take a picture of it.
Probably just a matter of time before someone hooks up Excalidraw/Miro/Freeform into an LLM (MCPs FTW).
The whole point is that with LLMs, you can explore ideas as deeply as you want without tiring them out or burning social capital. You're conflating this with poor judgment about what to ask humans and when.
Taking 'bombard' literally is itself pretty asinine when the real point is about using AI to get thoroughly informed before human collaboration.
And if using AI to explore questions deeply is a sign you're 'not cut out for the work,' then you're essentially requiring omniscience, because no one knows everything about every domain, especially as they constantly evolve.
> You really haven't, since they can't just generate tokens at the rate and consistency of an LLM
Is wrong. It's not because they can't generate tokens at the rate and consistency of an LLM
It's because trying to offload your work onto your coworkers this way would make you a huge jerk
Whether it's because humans can't handle the pace or because it would make you a jerk to try: either way, you just agreed that humans can't/shouldn't handle unlimited questioning. That's precisely why LLMs are valuable for deep exploratory thinking, so when we engage teammates, we're bringing higher-quality, focused questions instead of raw exploration.
And you're also missing that even IF someone were patient enough to take every question you brought them, they still couldn't keep up with the pace and consistency of an LLM. My original point was about what teammates are 'willing to take', which naturally includes both courtesy limits AND capability limits.
This isn't really new though. We used to use search engines and language docs and stack overflow for this
Before that people used mailing lists and reference texts
LLMs don't really get me to answers faster than Google did with SO previously imo
And it still relies on some human having asked and answered the question before, so the LLM could be trained on it
To make my point, let me know when Stack Overflow has a post specifically about the nuances of your private codebase.
Or when Google can help you reason through why your specific API design choices might conflict with a new feature you're considering. Or when a mailing list can walk through the implications of refactoring your particular data model given your team's constraints and timeline.
LLMs aren't just faster search: they're interactive reasoning partners that can engage with your specific context, constraints, and mental models. They can help you think through problems that have never been asked before because they're unique to your situation. That's the 'deep exploratory thinking' I'm talking about.
The fact that you're comparing this to Stack Overflow tells me you're thinking about LLMs as glorified search engines rather than reasoning tools. Which explains why you think teammates can provide the same value: because you're not actually using the technology for what it's uniquely good at.
I know talking to an LLM is not exactly parallel, but it's a similar idea: it's like talking to the guy with Wikipedia instead of batting ideas back and forth and actually thinking about stuff.
:sigh:
So he missed out on the thing we should do when being together: talk and brainstorm, and he didn't help with anything meaningful, because he didn't grasp the requirements.
Exactly the approach I'm taking with Tunnelmole, which as of right now is still a one person company with no investors.
I focused on coding, which I'm good at. I'm also reasonably good at content writing; I have some articles on Hackernoon from before the age of AI.
So far AI has helped with
- Marketing ideas and strategies
- General advice on setting up a company
- Tax stuff, i.e what are my options for paying myself
- The logo. I used stable diffusion and an anime art model from CivitAI, had multiple candidates made, chose one, then did some minor touch ups in Gimp
I'm increasingly using it for more and more coding tasks as it gets better. I'll generally use it for anything repeatable and big refactors.
One of the biggest things, coding-wise, about working alone is code review. I don't have human colleagues at Tunnelmole who can review code for me. So I've gotten into the routine of having AI review all my changes. More than once, bugs have been prevented from being deployed to prod this way.
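The review step itself is nothing fancy; roughly something like this sketch (provider, model, and prompt are placeholders for whatever you actually use, and the real version carries more project context):

```python
"""Rough sketch of a pre-deploy AI review step; details are placeholders."""
import subprocess
from openai import OpenAI  # reads OPENAI_API_KEY from the environment

client = OpenAI()

# Review everything staged for commit; swap for whatever diff range you care about.
diff = subprocess.run(
    ["git", "diff", "--cached"], capture_output=True, text=True, check=True
).stdout

if diff.strip():
    response = client.chat.completions.create(
        model="gpt-4o",  # assumption: any capable chat model
        messages=[
            {"role": "system", "content": "You are a strict code reviewer. "
             "Flag bugs, security issues, and risky edge cases; be concise."},
            {"role": "user", "content": diff},
        ],
    )
    print(response.choices[0].message.content)
else:
    print("Nothing staged to review.")
```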
It's ushering in a new era of valley bullshit. If only journalists tried to falsify their premise before blindly publishing it.
> Jack Clark whether AI’s coding ability meant “the age of the nerds” was over.
When was the "age of the nerds" exactly? What does that even mean? My interpretation is that it means "is the age of having to pay skilled programmers for quality work over?" Which explains Bloomberg's interest.
> “I think it’s actually going to be the era of the manager nerds now,” Clark replied. “I think being able to manage fleets of AI agents and orchestrate them is going to make people incredibly powerful.”
And they're all going to be people on a subscription model and locked into one particular LLM. It's not going to make anyone powerful other than the owner class. This is the worst type of lie. They don't believe any of this. They just really really hate having to pay your salary increases every year.
> AI is sometimes described as providing the capability of “infinite interns.”
More like infinite autistic toddlers. Sure. It can somehow play a perfect copy of Chopin after hearing it once. Is that really where business value comes from? Quickly ripping other people off so you can profit first?
The Bloomberg class I'm sure is so thrilled they don't even have the sense to question any of this self serving propaganda.
As much as I am definitely more productive when it comes to some dumb "JSON plumbing" feature of just adding a field to some protobuf, shuffling around some data, etc., I still can't quite trust it not to make a very subtle mistake, or to generate code in the same style as the current codebase (even using the system prompt to tell it as much). I've had it make such obvious mistakes, which it then doubles down on (either pushing back or not realizing in the first place), before I practically scream at it in the chat and it says "oopsie haha my bad", e.g.
```c++
class Foo
{
    int x_{};
public:
    bool operator==(Foo const& other) const noexcept
    {
        return x_ == x_;  // <- what about other.x_?
    }
};
```
I just don't know at this point how to get it (Gemini or Claude or any of the GPTs) to stop dropping the same subtle mistakes that are very easy to miss in the prolific amount of code it tends to write.
That said, saying "cover this new feature with a comprehensive test suite" saves me from having to go through the verbose gtest setup, which I'm thoroughly grateful for.
If they are 1099 they aren’t part of the team, right?