You wrote this as if it were some rare occurrence, and not a description of the bulk of the production code that exists today, even at high-level tech companies.
It sounds a lot like the Murderbot character in the Apple TV show!
Maybe there’s genuine sentience there, maybe not. Maybe that text explains what’s happening, maybe not.
It would have been cool to see what prompt was used for that page!
I would go further and say it's _always_ fabricated. LLMs are no better able to explain their inner workings than you are able to explain which neurons are firing for a particular thought in your head.
Note, this isn't a statement on the usefulness of LLMs, just their capability. An LLM may eventually be given a tool to enable it to introspect, but IMO it's not natively possible with the LLM architectures today.
An LLM that says "I said orcs are green because I recalled a scene in Lord of the Rings..." is fabricating*. An LLM that says "I talked about white genocide because my system prompt told me to" is very likely telling the truth, because it can literally see the system prompt as it generates the output, even though in the situation I'm referring to the system prompt was hidden from users. That its previous output is what it is because of the system prompt is a logical conclusion from the combination of the system prompt and that output, one that anyone could make with the same degree of confidence if they had access to the full buffer.
* Unless it's reading back from a <thinking> section of the buffer that was potentially hidden from the user.
LLMs have next to no understanding of their own internal processes. There's a significant amount of research that demonstrates this. All explanations of an internal thought process in an LLM are completely reverse engineered to fit the final answer (interestingly, humans are also prone to this – seen especially in split brain experiments).
In addition, the degree to which the author must have prompted the LLM to get it to anthropomorphize this hard makes the rest of the project suspect. How many of the results are repeated human prompting until the author liked the results, and how many come from actual LLM intelligence/analysis skill?
You're stuck on the semantics of anthropomorphizing, but that wasn't the purpose of the exercise.
And as the article said, "an LLM who just spent thousands of words explaining why they're not allowed to use thousands of words", it's just funny to read.
I don't want to be overly negative, but I think it's only fair given the author hasn't graced us with their own thoughts, instead offloading the actual writing to an LLM.
Firefox 113.0.2, how come?
It is good because it highlights the relevant aspects of the design and you can use this, plus some other resources, to replicate the idea.
It sees everything it needs to in one pass, no extra reasoning or instruction tokens around things like MCP that abstract and create hops to simple understanding of where things are at.
I was never a great terminal developer, I can't even type right, but Claude Code by far provides the best software engineering interface in terms of LLM/agent UX.
triyambakam•1d ago
rane•1d ago
triyambakam•1d ago
ramoz•1d ago
SparkyMcUnicorn•1d ago
AiderDesk lets you save snapshots of a point in time, so I create "presets" to restore sets of context files and/or conversation history (you can restore one or both), which is a really nice bonus. You can also add/remove context as needed without the manual copy/pasting work when you forget to include something or accidentally include too much. Its VS Code extension makes adding/removing files from context seamless.
[0] https://github.com/hotovo/aider-desk
__mharrison__•1d ago
ramoz•1d ago
maleldil•1d ago
thegeomaster•1d ago
SatvikBeri•1d ago
In addition, after a 3 hour session I told it to create a CLAUDE.md that would help it program similarly to me, based on my preferences. I then edited that file a bit, and that has helped a lot.
rane•1d ago
Use the @ thing to prod it to read some relevant files for context (kind of like with Aider)
maleldil•17h ago
I find it helps to have a CLAUDE.md file with instructions and thorough documentation. This is on a ~30k LOC Python codebase with type-checking and tests. YMMV with other languages.
SatvikBeri•1d ago
cedws•1d ago
mindwok•1d ago
gaut•1d ago
phillipcarter•1d ago
insane_dreamer•1d ago
Claude Code now also has a PyCharm plugin (probably for other JetBrains IDEs too) that shows you diffs in the PyCharm editor.
scuff3d•1d ago
elliotec•1d ago
I’m building a non-trivial platform as a solo project/business and have been working on it since about January. I’ve gotten more done in two nights than I did in 3 months.
I’m sure there are tons of arguments and great points against what I just said, but it’s my current reality and I still can’t believe it. I shelled out the $100/mo after one night of blowing through the $20 credits I used as a trial.
It does struggle with design and front end. But don’t we all.
john2x•1d ago
petetnt•1d ago
Before those 3 months you mentioned, how much time did you spend coding on average (at work, or as a hobby), percentage-wise?
elliotec•1d ago
I’m not sure how to answer the question on percentage of time coding. I quit my job as a director where coding wasn’t part of the job but have kept up on side stuff and architecture at work. Since the new year when I started this it’s been in bursts, some weeks or nights I’ll go super hard coding and others I’ll focus on other stuff. I go to conferences and study a lot on the subject of the industry so that’s what I do in bursts of the non-coding time.
I hired a virtual assistant to help with the non-coding things so lately it’s been a lot more.
In general I’d estimate at least 50% of my work on this thing since January has been coding but it’s really hard to gauge. Claude over the past 3 days has surpassed my personal coding productivity over the past 3 months though, if it wasn’t clear what I was saying.
scuff3d•1d ago
What I've seen is people feel more productive, until the reality of all the subtle problems starts to set in. Even skilled engineers usually only end up with 10 or 20% productivity gains by the time they reduce its use case to where it's actually not total dog shit, or by the time they go back around and fix all the problems.
The highest quality product I know of where the creator has talked about his use of AI is Ghostty, and he's not claiming massive improvements, just that it's definitely helpful.
elliotec•1d ago
Hopefully it’s obvious that Claude will not have simply written the entire thing but you might get a sense of what it can do quickly as part of a whole - maybe similar to your last sentence but I suppose I am claiming massive improvements (in productivity, no warranty on quality yet).
Also keep in mind I’m entirely solo here. I fully agree with your points that the proof is in the pudding and obviously there’s nuance to all of it. But yeah, I’m not exaggerating with my commentary above.
scuff3d•1d ago
And how much time would you say you spend wrangling the AI, meaning either reprompting or substantially editing what you get back?
shostack•1d ago