Why "just prompt better" doesn't work

https://www.bicameral-ai.com/blog/tech-debt-meeting

29•jinkuan•2h ago

Comments

0xbadcafebee•1h ago

  In summary, the user research we have conducted thus far uncovered the central tension that underlies the use of coding assistants:
      1.  Most technical constraints require cross-functional alignment, but communicating them during stakeholder meetings is challenging due to context gap and cognitive load
      2.  Code generation cannibalizes the implementation phase where additional constraints were previously caught, shifting the burden of discovery to code review — where it’s even harder and more expensive to resolve
  
  How to get around this conundrum? The context problem must be addressed at its inception: during product meetings, where there is cross-functional presence and different ideas can be entertained without rework cost. If AI handles the implementation, then the planning phase has to absorb the discovery work that manual implementation used to provide.

They're emphasizing one thing too much and another not enough.

First, the communication problem. Either the humans are getting the right information and communicating it, or they aren't. The AI has nothing to do with this; it's not preventing communication at all. If anything, it will now demand more of it, which is good.

Second, the "implementation feedback". Yes, 'additional constraints' were previously encountered by developers trying to implement asinine asks, and would force them to go back and ask for more feedback. But now the AI goes ahead and implements crap. And this is perfectly fine, because after it churns out the software in a day rather than a week, anyone who tries to use the software will see the problem, and then go back and ask for more detail. AI is making the old feedback loop faster. It's just not at implementation-time anymore.

noduerme•1h ago

Well, that or it's taking a situation where the client didn't understand the software but the dev did, and turning it into a situation where no one understands what anyone's babbling about at a meeting.

How do you explain the constraints to the stakeholders if you didn't try to solve them yourself and you don't fully understand why they are constraints?

[edit] Just to add to this thought: It might be more useful to do the initial exploratory work oneself, to find out what's involved in fulfilling a request and where the constraints are, and then ask an AI to summarize that for a client along with an estimate of the work involved. Because to me, the pain point in those meetings is getting mired in explaining technical details about asynchronous operational/code processes or things like that, trying to convey the trade-offs involved.

noduerme•1h ago

I found this interesting:

>> Small decisions have to be made by design/eng based on discovery of product constraints, but communicating this to stakeholders is hard and time consuming and often doesn’t work.

This implies that a great deal of extraneous work and headaches result from the stakeholders not having a clear mental model of what they need software to do, versus what is either secondary or could be disposed of with minor tweaks to some operational flow, usage guidance, or terms of service document. In my experience, even more valuable than having my own mental model of a large piece of software is having an interlocutor representing the stakeholders and end users, who understands the business model completely and has the authority to say: (A) We absolutely need to remove this constraint, or (B) If this is going to cost an extra 40 hours of coding, maybe we can find a workflow on our side that gets around it, or find a shortcut, and shelve this for now so you can move on with the rest of the project.

Clients usually have a poor understanding of where constraints are and why some seemingly easy problems are very hard, or why some problems that seem hard to them are actually quite easy. I find that giving them a clear idea of the effort involved in each part of fulfilling a request often leads to me talking to someone directly who can make a call as to whether it's actually necessary.

colechristensen•25m ago

It's almost always more productive for stakeholders to argue about the current solution that exists rather than the hypothetical one you're going to build.

charcircuit•1h ago

>it’s that it can’t refuse to write bad ones

It can. It totally is able to refuse and then give me options for how it thinks it should do something.

refactor_master•32m ago

I think it's meant to be taken more in the abstract. Yes, an LLM can refuse your request, and yes, you can ask it to prepend "have you checked that it already exists?", but it can't directly challenge your super long-range assumptions the same way as another person saying at the standup "this unrelated feature already does something similar, so maybe you can modify it to accomplish both the original goal and your goal", or "we have this feature coming up, which will solve your goal". Without proper alignment you're just churning out duplicate code at a faster rate now.

siriusastrebe•59m ago

Can this be solved by a question-and-answer session?

You ask the coding assistant for a brand new feature.

The coding assistant says, we have two or three or four different paths we could go about doing it. Maybe the coding assistant can recommend a specific one. Once you pick the option, the coding assistant can ask more specific questions.

The database looks like this right now, should we modify this table which would be the simplest solution, or create a new one? If you will in the future want a many-to-one relationship for this component, we should create a new table and reference it via a join table. Which approach do you prefer?

What about the frontend? We can surface controls for this on our existing pages, however for reasons x, y, and z I'd recommend creating a new page for the CRUD operations on this new feature. Which would you prefer?

Now that we've gotten the big questions squared away, do you want to proceed with code generation, or would you like to dig deeper into either the backend or the frontend implementation?
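A minimal sketch of that flow in Python, in case it helps make the idea concrete. Everything here is hypothetical: `Question`, `PlanningSession`, and the option strings are illustrative names, not how any real coding assistant works. The only point it shows is that code generation stays gated until each design question has an answer, with the assistant's recommendation as the default.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Question:
    """One design decision to settle before generating code (hypothetical type)."""
    topic: str
    prompt: str
    options: list[str]
    recommended: int = 0  # index into options; the assistant's suggested default


@dataclass
class PlanningSession:
    """Collects answers to design questions; not any real tool's API."""
    questions: list[Question]
    answers: dict[str, str] = field(default_factory=dict)

    def next_question(self) -> Question | None:
        # Return the first question that has no recorded answer yet.
        for q in self.questions:
            if q.topic not in self.answers:
                return q
        return None

    def answer(self, topic: str, choice: int | None = None) -> None:
        # Record the chosen option, falling back to the recommendation.
        q = next(q for q in self.questions if q.topic == topic)
        index = q.recommended if choice is None else choice
        self.answers[topic] = q.options[index]

    def ready_to_generate(self) -> bool:
        # Code generation is allowed only once every question is answered.
        return self.next_question() is None


# The two big questions from the walkthrough above, with assumed option labels.
session = PlanningSession(questions=[
    Question("database", "Modify the existing table or create a new one?",
             ["modify existing table", "new table with join table"]),
    Question("frontend", "Extend existing pages or add a new CRUD page?",
             ["extend existing pages", "new CRUD page"], recommended=1),
])
```

Calling `session.answer("database", 1)` and then `session.answer("frontend")` records the join-table choice and accepts the recommended new page, after which `session.ready_to_generate()` returns `True`.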

jaggederest•41m ago

You're describing existing behavior of codex and claude at the moment, for what it's worth. They don't always catch every edge case (or even most) in depth or discuss things thoroughly, depending on the prompt, but if you say "ask questions and be sure to clarify any ambiguity or technical issues" they'll run right through many of the outstanding concerns.

wmeredith•31m ago

This is how I use Cursor IDE with its Planning mode.

clktmr•12m ago

You don't know what you want. That's why asking questions doesn't work. You think you know it, but only after you've spent some time iterating in the space of solutions, you'll see the path forward.

OrderlyTiamat•2m ago

> You think you know it, but only after you've spent some time iterating in the space of solutions, you'll see the path forward.

I'd turn it around: this is the reason asking questions does work! When you don't know what you want, someone asking you for more specifics is sometimes very illuminating, whether that someone is real or not.

LLMs have played this role well for me in some situations, and atrociously in others.
