How do your abilities compare to the newest models?
edit: Please specify which model and version you are talking about.
So this AI agent... It is much faster at writing code when given specific instructions. But it keeps losing context on architecture, and I can't really let it build complex things with interdependencies that build on each other. At times it feels like pair programming with a guy who is so crazy fast that I'm left behind with my head spinning, wondering how we just jumped from a hello world to a working thing that would have taken me ten iterations. And I get a bad feeling when I then wonder how this app is doing what it does, because my agent can't explain it, and I would be stupid to believe what it hallucinated, because it sounds really solid until you scratch the surface.
At the beginning I was almost euphoric about my new friend. Now I'm sometimes disappointed, sometimes confused, but I am learning to give better, more concise instructions and to take smaller development steps. It is tempting to set a long-haul goal and let it run. But I think for now, even if it is much faster at the small things, it would also be faster at building a catastrophic spaghetti-code nightmare if not used with great care.
Are you using Sonnet 4.6?
> So this AI agent... It is much faster at writing code when given specific instructions. But it keeps losing context on architecture, and I can't really let it build complex things with interdependencies that build on each other.
I've only built small things (< 1000 lines) with these systems, so I might be missing this problem.
Is it better than you at building small self-contained things?
> And I get a bad feeling when I then wonder how this app is doing what it does, because my agent can't explain it, and I would be stupid to believe what it hallucinated, because it sounds really solid until you scratch the surface.
Do you ask it to generate test suites for the things that it builds?
> it would also be faster at building a catastrophic spaghetti-code nightmare if not used with great care.
noted
Is it bad at designing systems that don't have a bunch of integrations?
If the task is simple, I spend more time telling it what to do than doing it myself. But if the task is complex, I use certain skills/commands and create intermediate files (more than necessary) between each step (analysis, planning, design, workflow, and implementation) and clear the context between each of them. The result is fairly accurate, but not perfect.
My take is, we remain the architects of our code, and AI agents are an excellent tool that we need to master.
Generally speaking, if the problem is common, the model has likely already been trained to solve it.
If it's truly complex and/or specific to my needs, I can try using a reasoning model to think through a solution before moving on to implementation.
I use the agent to conduct research, find resources to understand the complexity, best practices, feedback, etc., and to write a Markdown analysis file on the topic.
Then I can use this file as a basis to precisely define what I want to do and brainstorm with the agent in thinking mode. The more the task is described and defined, the more accurate the result will be.
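The staged workflow described above (one intermediate file per step, with the context cleared in between) could be sketched roughly like this. It is only an illustration: `run_stage` is a hypothetical stand-in for whatever agent command or skill is actually used, and the file naming is an assumption, not something a particular tool requires.

```python
from pathlib import Path

# Stage names taken from the comment above.
STAGES = ["analysis", "planning", "design", "workflow", "implementation"]


def run_stage(stage: str, previous_notes: str) -> str:
    """Hypothetical placeholder for one agent call started with a fresh context.

    A real version would send `previous_notes` to the agent and return its answer.
    """
    return f"# {stage}\n\nBuilt from {len(previous_notes)} characters of prior notes.\n"


def run_pipeline(workdir: Path, task: str) -> list[Path]:
    """Run every stage in order, writing one Markdown file per stage."""
    workdir.mkdir(parents=True, exist_ok=True)
    notes = task
    written = []
    for index, stage in enumerate(STAGES, start=1):
        output = run_stage(stage, notes)
        path = workdir / f"{index:02d}_{stage}.md"
        path.write_text(output, encoding="utf-8")
        # Only the file content is carried forward, which mimics clearing
        # the agent's context between stages.
        notes = path.read_text(encoding="utf-8")
        written.append(path)
    return written
```

The point of the design is that each file can be reviewed and corrected by hand before the next stage reads it.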
drrob•1h ago
par1970•1h ago
drrob•1h ago
It kept producing code that looked to the eye like it might work, but each time I ran it, it would just throw schoolboy exceptions. I got tired of telling it to correct the things it kept forgetting to check for (nulls, path starts, empty lists), and just coded it from scratch myself.
I find ChatGPT is like pair-programming with a junior, except I'm not getting paid to coach them like I would if it were an actual graduate hire.
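For readers wondering what the forgotten checks mentioned above (nulls, path starts, empty lists) might look like, here is a minimal sketch. The original code is not shown in the thread, so `join_under_root` and its behavior are purely hypothetical examples of that kind of guard.

```python
from typing import Optional


def join_under_root(root: Optional[str], names: Optional[list]) -> list:
    """Join file names under `root`, guarding the usual edge cases.

    Hypothetical helper: it rejects None arguments, tolerates an empty list,
    and strips leading separators so every name stays relative to `root`.
    """
    if root is None or names is None:
        raise ValueError("root and names must not be None")
    if not names:
        # Empty list: nothing to do, and no exception either.
        return []
    base = root.rstrip("/")
    result = []
    for name in names:
        if name is None:
            continue  # skip null entries instead of crashing on them
        relative = name.lstrip("/\\")  # handle names that start with a separator
        if not relative:
            continue  # skip empty names and names made only of separators
        result.append(base + "/" + relative)
    return result
```

For example, `join_under_root("/data", ["/a.txt", "b.txt", None, ""])` skips the last two entries and returns paths for the first two.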
chiengineer•1h ago
Learn how to prompt better and you'll be fine.
drrob•1h ago
multidude•1h ago
Given a long-haul goal with instructions and everything, they will reinvent the wheel four times, and one of those times you will get a square. Reminds me of that monkey's paw wish thing. You look at your finished app. It looks beautiful, but its inner workings are a ball of confusion.