Four control gates that let an agent ship real work

TL;DR

When you hand real work to an agent, don't just say "be careful." Add four gates: make it plan and stop, split the work into checkpoints, map impact before shared changes, and only trust "done" when it shows real output.

Use this when the agent can already do useful work, but you still feel forced to hover: it changes more than you asked, chooses a risky direction on its own, says "done" without running the thing, or produces something that looks right and breaks days later.

Do this on the next task:

Before it writes code, make it produce a plan and stop.
If the task is bigger than one review pass, split it into checkpoints.
Before it changes a function, field, API, shared file, or format, make it list what depends on that thing.
Before accepting "done," ask for the command it ran and the real output.

If you remember one sentence, make it this: the agent can run freely between gates, but it must stop at the gate.

Plan, then stop

"Explain the approach, files you'll touch, and what you're unsure about. Stop before editing."

Checkpoint

"Do stage one first. Stop and show me before moving to the next stage."

Check impact

"Before changing anything shared, list who or what depends on it."

Show evidence

"Run it and paste the output. Don't describe what you expect to happen."

That's the version to use immediately. The rest explains why these four gates cover the points where agents most often go wrong.

01 When to use these gates

You don't need all four gates for every tiny task. If you're asking for a wording tweak, a throwaway name, or a rough list of options, let it run. The four gates earn their keep when the work has one of these signs:

The agent will edit something other people or later steps depend on.
The task has multiple steps, where step two depends on step one.
A mistake may stay quiet: the build stays green, the page still renders, the dashboard still shows a number.
You cannot review the whole output in a few minutes.

That's the dangerous zone: the agent works faster than you, but it also drifts faster than you. The problem isn't that it's stupid. The problem is treating it like an obedient machine, when in practice it's closer to someone very fast, very bold, and very forgetful.

Blazing fast: types 10× faster than you

Fearless: will touch anything

Amnesiac: wakes up having forgotten

You're not commanding a machine. You're supervising a junior — brilliant, and forgetful.

This isn't just a cute metaphor. It's an operating checklist: speed needs gates; boldness needs scope; forgetfulness needs context and evidence.

You don't need to watch every line. Do that and the agent's speed evaporates. Your job is to choose where it must stop. Most tasks share the same lifecycle: receive the job, work through it, touch a shared boundary, then report done. The four gates sit on that lifecycle.

A task's lifecycle: task in → Plan (before it starts) → Checkpoint (between stages) → Impact (before shared code) → Verify (before you trust "done") → ship

The gates aren't four separate tricks. They're four moments where stepping in early keeps the mistake cheap.

02 Gate 1: make it plan, then stop

The expensive mistake often starts before the first line of code. The agent misreads the scope, chooses a direction that's too large, or plans to touch a place you didn't expect. If you let it think and act in the same breath, by the time you see the drift, it may have built half the feature on the wrong foundation.

Prompt to use:

Before doing the work, state:
- what you think the goal is
- your approach
- files/areas you will touch
- anything uncertain or needing my confirmation

Then STOP. Do not edit anything until I approve.

Good result: you read the plan and nothing surprises you. If the plan does surprise you, the gate just saved you.
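If you drive the agent through your own harness, the plan gate can be enforced in code instead of trusted to the prompt. Here is a minimal sketch, assuming a hypothetical `PlanGate` that your harness calls before every edit. The class, its fields, and its methods are illustrative names, not part of any real agent framework.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Plan:
    """The four things the prompt asks for (hypothetical structure)."""
    goal: str
    approach: str
    files: list[str]
    uncertainties: list[str] = field(default_factory=list)


class PlanGate:
    """Blocks edits until a human approves the plan, and rejects out-of-plan files."""

    def __init__(self) -> None:
        self.plan: Optional[Plan] = None
        self.approved = False

    def submit(self, plan: Plan) -> None:
        # A new or revised plan always resets approval: the agent stops again.
        self.plan = plan
        self.approved = False

    def approve(self) -> None:
        if self.plan is None:
            raise RuntimeError("nothing to approve: no plan submitted")
        self.approved = True

    def check_edit(self, path: str) -> None:
        """Call before every edit; raises instead of letting the edit through."""
        if self.plan is None or not self.approved:
            raise PermissionError("plan not approved: stop before editing")
        if path not in self.plan.files:
            raise PermissionError(f"{path} is not in the approved plan")
```

The second check is the useful one in practice: an edit to a file the plan never mentioned is exactly the surprise the gate exists to surface.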

03 Gate 2: split the work into checkpoints

A large task done in one pass is where an agent drifts furthest. Each step builds on the one before it. If step two is slightly wrong, step three can build very reasonably on that wrong step, and step four can look even more coherent. By the end, the output has a story, but the story is wrong.

Use checkpoints so the wrong turn appears early. A good checkpoint is small enough to review in one sitting, produces something visible, and can be thrown away without dragging the rest down.

Prompt to use:

Split this into 2-4 stages.
Each stage must produce output I can inspect.
After stage 1, stop and report:
- what changed
- what I should check
- whether you deviated from the original plan

Good result: you don't wait until the very end to find out it went wrong. You catch drift at the nearest checkpoint.
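The same idea can be sketched as a small runner that refuses to start the next stage until the current report is accepted. This is an illustration under assumptions: `StageReport`, `run_with_checkpoints`, and the `approve` callback are hypothetical names, and in real use `approve` would be you reading the report, not a function.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class StageReport:
    stage: str
    changed: str     # what changed
    to_check: str    # what the reviewer should inspect
    deviated: bool   # did the stage depart from the original plan?


def run_with_checkpoints(
    stages: list[tuple[str, Callable[[], StageReport]]],
    approve: Callable[[StageReport], bool],
) -> list[StageReport]:
    """Run stages in order and stop at the first report the reviewer rejects."""
    completed = []
    for _name, work in stages:
        report = work()
        completed.append(report)
        if not approve(report):
            break  # later stages never run on top of a rejected one
    return completed
```

Because the loop breaks before the next stage starts, a wrong step two costs you one stage of rework, not three.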

04 Gate 3: check impact before changing shared things

A shared thing is anything someone or something else depends on: field names, function signatures, schemas, APIs, config, file formats, routes, templates, shared prompts. The agent can see the file in front of it without seeing every place that depends on it.

Before changing shared ground, make it map the blast radius. No essay needed. You need a concrete list.

Prompt to use:

Before changing this, list what depends on it:
- files/functions/routes/templates that call or read it
- data shapes or formats affected
- tests or smokes to run afterward

If you're not sure you found everything, say exactly where you're unsure.

Good result: if it changes a boundary, it also changes or checks the places that rely on that boundary.
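A first pass at the concrete list can be as simple as a text search for the name you are about to change. The sketch below assumes a hypothetical helper, `find_dependents`, that scans a directory tree for whole-word matches of an identifier. It is a plain text search, so it misses dynamic lookups, names built from strings, and anything outside the tree you point it at. Treat the result as a floor, not a complete map.

```python
import re
from pathlib import Path


def find_dependents(root, name, suffixes=(".py",)):
    """Return (path, line number, line) for every whole-word match of `name`.

    Assumes `name` is a plain identifier; files are filtered by suffix.
    """
    pattern = re.compile(rf"\b{re.escape(name)}\b")
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if pattern.search(line):
                hits.append((str(path), lineno, line.strip()))
    return hits
```

For example, `find_dependents("src", "get_user")` would list each line under `src` that mentions `get_user` as a whole word, while skipping `get_user_name`. Both names are made up for illustration.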

05 Gate 4: trust "done" only when there is evidence

"Done" is a claim, not evidence. An agent can sound confident even when it hasn't run the command, opened the page, or checked the output. The most dangerous sentence often isn't "I don't know." It's "this should work."

Prompt to use:

Before calling this done, show real evidence:
- command/test/smoke you ran
- key output
- if you couldn't run it, say why and what remains unverified

Good result: you see evidence outside the agent's narration. No evidence means the status is not "done"; it is "edited."
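If the agent reports through a script, you can make "evidence" a required field. The following is a minimal sketch using the standard library's `subprocess.run`; the `Evidence` record, `collect_evidence`, and `status` are hypothetical names, and the extra "failed" status is an assumption added here for runs that exit non-zero.

```python
import subprocess
from dataclasses import dataclass
from typing import Optional


@dataclass
class Evidence:
    command: list[str]
    exit_code: int
    output: str  # combined stdout and stderr, exactly as produced


def collect_evidence(command: list[str], timeout: float = 60) -> Evidence:
    """Run the command and capture what actually happened, not what was expected."""
    result = subprocess.run(
        command, capture_output=True, text=True, timeout=timeout
    )
    return Evidence(command, result.returncode, result.stdout + result.stderr)


def status(evidence: Optional[Evidence]) -> str:
    """No evidence means "edited", never "done"."""
    if evidence is None:
        return "edited"
    return "done" if evidence.exit_code == 0 else "failed"
```

A zero exit code only proves the command ran cleanly. Whether it was the right command to run is still your call when you read the output.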

06 When not to use all four gates

Don't turn every small request into ceremony. If the output is disposable, easy to inspect, or harmless if wrong, one gate is enough — usually verification. If the work is creative and needs divergence, don't over-gate the first idea phase; let it explore, then add gates when you choose a direction that will become real.

The point is not to use all four every time. The point is to match the gate to the risk:

Risk of misunderstanding → plan gate.
Risk of drifting mid-run → checkpoint.
Risk of breaking something elsewhere → impact gate.
Risk of fake "done" → verify gate.
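The mapping above is small enough to write down as a lookup. This sketch uses made-up risk labels and a hypothetical `gates_for` helper. Falling back to the verify gate when no risk is named follows the "one gate is enough, usually verification" advice for low-stakes work, and is an assumption about how you would want the default to behave.

```python
# Risk labels are illustrative; rename them to match your own checklist.
GATE_FOR_RISK = {
    "misunderstanding": "plan",
    "drift": "checkpoint",
    "shared_change": "impact",
    "fake_done": "verify",
}

# Gates in the order they occur in a task's lifecycle.
LIFECYCLE_ORDER = ["plan", "checkpoint", "impact", "verify"]


def gates_for(risks):
    """Return the gates to apply, in lifecycle order, for the named risks."""
    wanted = {GATE_FOR_RISK[risk] for risk in risks}
    if not wanted:
        return ["verify"]  # low-risk default: one gate, usually verification
    return [gate for gate in LIFECYCLE_ORDER if gate in wanted]
```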

07 The scaffolding beats the model

These gates don't slow you down. What slows you down is the aftermath of not having them: the agent breaks something once, you lose trust, and from then on you re-check every line it writes. The speed you bought disappears because you no longer dare to delegate.

Step in now: ~30 seconds

Clean up if you skip it: an afternoon, or a 2 a.m. phone call

The cheap move is stopping at the right moment. The expensive move is cleaning up after the agent has already run past the wrong turn.

The stronger the model, the more the gates matter. Not because stronger models are bad, but because when they go wrong, they go wrong quickly and persuasively. A good agent doesn't need you beside it at every step. It needs a small system that makes it stop at the right places.

So next time, don't start with "do this for me" and hope. Start with: plan, then stop. From there, you're no longer delegating by faith. You're giving the agent a track with gates.

Each of the four gates deserves its own piece: Gate 1 — gate it, don't just ask · Gate 2 — stages & checkpoints · Gate 3 — read the blast radius · Gate 4 — "Done!" is just a claim.
