
Ask HN: Should AI credits be refunded on mistakes?

13•ed_elliott_asc•14h ago
Something I’ve noticed (I’m on a Claude subscription, so no refunds, but the same issue applies to usage windows) is that AI sometimes makes mistakes. So if something is important, I tell Claude Code to spin up a couple of sub-agents and verify the information; often there will be a mistake, and it gets rectified.

It feels unfair that I have to pay (or lose some usage) for this.

Interested in other people’s thoughts.
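
For concreteness, a minimal sketch of the verify-then-rectify loop described above, assuming the Anthropic Python SDK; the model name, prompts, and the ask helper are illustrative, not the OP's actual setup:

    # Sketch of the "second pass as verifier" pattern. Note that the
    # verification call is billed exactly like the original one.
    import anthropic

    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

    def ask(prompt: str) -> str:
        msg = client.messages.create(
            model="claude-sonnet-4-20250514",  # illustrative model name
            max_tokens=1024,
            messages=[{"role": "user", "content": prompt}],
        )
        return msg.content[0].text

    answer = ask("Summarise the licensing terms of this library.")

    # The "sub-agent" verification pass: these tokens are paid for whether
    # or not the first answer contained a mistake.
    verdict = ask(
        "Independently verify the following answer and correct any mistakes:\n\n"
        + answer
    )
    print(verdict)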

Comments

sturza•14h ago
Do you pay your employer when you introduce bugs? I think you're lucky if you get usable output that you don't consider a mistake. Also, you're mistaken if you think you're paying for a deterministic service.

edit: typo

meltyness•14h ago
It's interesting on the grounds of aligning incentives.

It's not interesting in that it suggests humans are still in the loop on some slow-cycle improvements; that'd never get past any board. In fact, the selection of model modes implies it's your responsibility, so that meal was scraped into your flowerpot years ago.

I'd say fat chance.

verdverm•13h ago
No, you should know that no man or machine writes bug- or mistake-free code. You are paying for tokens (electricity and cooling), not for what those tokens represent. How would you define mistakes in non-code tasks?

Would you give money back to your employer when you make a mistake?

sloaken•13h ago
Interesting, you have just identified a potential market distinction. First we need a group (à la Consumer Reports) to evaluate the different services. Then services would be motivated to perform the sub-agent verification automatically as a competitive advantage.
ed_elliott_asc•9h ago
Just parse responses for "sorry about that"!
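
A toy version of that heuristic (the marker phrases are made up):

    # Crude retraction detector: flag responses containing apology phrases.
    APOLOGY_MARKERS = ("sorry about that", "you're right, i", "apologies for the")

    def looks_like_a_retraction(response: str) -> bool:
        lowered = response.lower()
        return any(marker in lowered for marker in APOLOGY_MARKERS)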
codingdave•11h ago
LLMs hallucinate. This is known.

If you choose to use them, you go in knowing they need help to be accurate. You clearly know how to use the tools to reach the accuracy you desire, but asking for that usage to be free seems to rest on a false premise: there was never an expectation of accuracy in LLM output in the first place.

bauldursdev•10h ago
I think it would be expensive to check. For a coding task, any reviewer would need to understand programming (these people aren't cheap), the domain context, and cultural differences (e.g. American "cookie" vs British "biscuit"), and then make a determination.

If the AI companies just paid all of that out of the goodness of their pocketbook I'd be fine with it, but in reality I think they'd just pass on the costs, the same way that basically every business passes on spoilage, theft, return rates, etc. So I think the value would be risk mitigation rather than cost savings: you'd know that if you pay for $10 worth of tokens, you'll get $10 worth of good tokens, but the individual token price would need to account for all the tokens the company doesn't get paid for.
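
The pass-through arithmetic is simple enough to sketch; the base price and refund rate below are hypothetical:

    # If a provider refunds a fraction of tokens as "mistakes", the list
    # price must rise so that revenue per *paid* token stays constant.
    base_price_per_mtok = 3.00  # dollars per million tokens, before refunds
    refund_rate = 0.20          # fraction of tokens refunded as mistakes

    adjusted_price = base_price_per_mtok / (1 - refund_rate)
    print(f"${adjusted_price:.2f} per million tokens")  # $3.75: everyone pays the spread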

philipnee•9h ago
Probably not, but they should be more explicit about the usage, not just "you've used up 5%."
bhanuhai2•8h ago
I think they are already doing it on a case-by-case basis, but the support experience is the worst.
omgwtfbyobbq•8h ago
I think so. IMO, at this point, AI systems should also be using expert/rule systems to validate their output and avoid bad, obvious mistakes. In ambiguous or complex cases I don't think a refund is warranted, but in certain circumstances the output is ridiculous and could have been caught by a relatively simple expert system/rules engine, likely something the AI itself could have helped build.
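
A minimal sketch of what such a rules pass could look like; the two rules below are invented examples, not any vendor's actual checks:

    # Rules-engine pass over model output before it is returned to the user.
    import re

    RULES = [
        ("no years beyond 2099 in dates",
         lambda text: not re.search(r"\b2[1-9]\d{2}\b", text)),
        ("percentages stay within 0-100",
         lambda text: all(0.0 <= float(p) <= 100.0
                          for p in re.findall(r"(\d+(?:\.\d+)?)\s*%", text))),
    ]

    def validate(output: str) -> list[str]:
        """Return the names of any rules the output violates."""
        return [name for name, check in RULES if not check(output)]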
AStrangeMorrow•7h ago
I mean, “mistakes” can be hard to define. IMHO there is a shared area of responsibility between the LLM, the LLM user, and the code itself.

Did it make a mistake because it didn’t follow instructions properly or hallucinated some content?

Did it make a mistake because the prompt was unclear/open to interpretation or plain wrong?

Did it make a mistake because it lacked some context? Or because it had too much context and started getting confused?

Is failing to handle edge cases automatically, when that was not requested, a mistake?

I am not just trying to defend LLMs; in many cases they make obvious mistakes and just don’t follow my arguably clear instructions. But sometimes it is not so clear cut. Maybe I didn’t link a relevant file (you can argue it could have looked for it), maybe my prompt just wasn’t that clear, etc.
