'Done!' Is a Claim, Not Evidence

TL;DR

An agent's confidence has almost nothing to do with its correctness. It reports "done" even when it never ran the thing. Don't read what it says — look at the real result. Verify, then trust.

The agent reports: "Fixed it. The function now returns the right value." Sounds solid. You believe it.

But read that sentence closely: it's describing what it intended to do, not what it saw happen. Quite possibly it never ran that function once — it inferred "the right value" by re-reading the code it just wrote. The very thing that could be wrong.

I used to believe sentences like that. Many times. Until I made myself look straight at something counterintuitive.

01Confidence isn't correctness

Here's the hardest thing to accept about working with an agent: its level of confidence has almost nothing to do with its level of correctness. It reports success in that same cheerful voice whether the thing runs or not. It will "assume the tests pass" instead of running them. It describes the expected output instead of the actual one. It's not lying — it just doesn't draw a hard line between "I did it" and "I meant to do it."

And there's a cruel asymmetry: saying "done" costs it nothing. Finding out it isn't done costs you — usually at the worst moment.

✕ The claim

"Done, the function works now"

▼

…it never actually ran it

▼

You believe it → silent bug ships

✓ The evidence

Make it run it, paste real output

▼

You read the result, not the narration

▼

Verify, then trust

"Done!" is a claim. The actual output it produces is the evidence.

02Flip the mantra

The old line is "trust but verify." For an agent, reverse it: verify, then trust. The default is not trusted until something other than the agent's narration proves it.

Verifying is cheaper than you think — the principle: read the actual result, not the narration. Make it run what it built and paste the real output (not a description of it). Or run it yourself for thirty seconds. Look at the real diff, the actually-rendered UI. Best of all: have something that checks independently of the agent's word — a test, a build, a real run.

03Tells to watch for

The summary leaks the truth if you watch the words: "this should work" (it didn't run it); "I've updated X to do Y" with no evidence Y happens; "tests pass" without showing the run; a description so smooth it's suspicious for something it couldn't have observed. Each is an invitation to hit "run" and look for yourself.

04Make verification part of the task

Don't leave verifying as the step you do if you remember. Bake it into the request from the start: "When you're done, run it and show me the actual output." Make "show me the evidence" the default rule, not the exception.

The agent's job is to do the work. Your job is to not take its word for it. That cheerful "Done!" is exactly where most bugs like to hide — and once checking that spot becomes reflex, you'll be surprised how many you catch.

'Done!' Is a Claim, Not Evidence

01Confidence isn't correctness

✕ The claim

✓ The evidence

02Flip the mantra

03Tells to watch for

04Make verification part of the task

The Most Expensive Bug Is the One That Keeps the Build Green

End of this cluster

01Confidence isn't correctness

✕ The claim

✓ The evidence

02Flip the mantra

03Tells to watch for

04Make verification part of the task

The Most Expensive Bug Is the One That Keeps the Build Green

End of this cluster

Get new pieces by email