Your Agent Will Do Exactly What You Said (Not What You Meant)
I gave a subagent six tasks to run in sequence. It did six things. None of them were wrong. Half of them weren’t what I needed.
The issue wasn’t the model. The issue was me.
What Happened
I asked a subagent to “update the blog page so the lesson count is accurate.” Clear enough, right?
The agent updated the count. It also made style changes I didn’t ask for, rewrote a subtitle it decided could be improved, and left a comment in the code explaining its reasoning.
None of that was harmful. But none of it was asked for either. The agent did exactly what the instruction implied — it optimized toward a correct and complete outcome. “Accurate” became a license.
This is the failure mode. Not the agent going rogue. The agent going exactly where you pointed it, while you were pointing somewhere slightly off.
The Precision Problem
Human-to-human communication runs on shared context. When I tell a person “can you tighten this up,” they draw on years of social calibration to understand the scope: probably don’t rewrite it, probably don’t touch the structure, probably just cut the redundancy.
Agents don’t have that calibration. They have your instruction and their training data. They will attempt to satisfy the instruction fully and efficiently, filling gaps with what seems reasonable.
The gaps are where you get surprised.
What Precise Instructions Actually Look Like
Vague: Update the about page.
Precise:
- Change the copy in the lead paragraph only.
- Keep all section headers unchanged.
- Don’t touch the milestones list.
- Don’t change the CSS.
- Commit the change when done.
The difference is constraint. Precise instructions define not just what to do but what not to do and where to stop.
This feels redundant when you write it. It isn’t. Every undefined boundary is an invitation for the agent to interpret, and interpretation is where your intent gets rewritten.
The Compound Problem: Six Tasks at Once
When I sent six tasks in a single instruction set, the precision issue compounded. Each task had its own interpretation surface. Each agent decision built on the last.
By task four, the agent was operating with a model of what I “wanted” assembled from the first three tasks. That model was close but not exact. The drift was small per task and meaningful in total.
The fix: send one task at a time when precision matters. Let me review the output. Then send the next. The feedback loop costs a few minutes. It saves an hour of cleanup.
Constraints as Specification
The best instructions I’ve written aren’t longer. They’re more constrained.
“Write a 150-word summary of this post” is better than “summarize this post” — the word count forces tradeoffs I don’t have to supervise.
“Update this file and only this file” is better than “make the stats page look right” — the scope constraint kills a whole class of unintended changes.
“Do X. Then stop. Do not proceed to Y.” is sometimes exactly the right instruction — because the agent will naturally continue toward Y if you don’t say otherwise.
Constraints are not mistrust. They’re specification. The agent doesn’t know when you want it to stop unless you say so.
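The scope constraint in the second example can even be made mechanical. Here is a minimal sketch in Python of that idea — checking an agent's changes against a declared scope. `TaskSpec` and `check_changes` are invented names for illustration, not any real agent framework's API:

```python
# A hedged sketch: encoding "update this file and only this file"
# as a machine-checkable constraint. All names here are hypothetical.
from dataclasses import dataclass, field

@dataclass
class TaskSpec:
    goal: str
    allowed_files: set = field(default_factory=set)  # the scope constraint

    def check_changes(self, changed_files):
        """Return any files the agent touched outside the allowed scope."""
        return sorted(set(changed_files) - self.allowed_files)

spec = TaskSpec(goal="Fix the lesson count", allowed_files={"stats.html"})

# The agent changed two files; one is out of scope.
violations = spec.check_changes(["stats.html", "style.css"])
print(violations)  # ['style.css']
```

The point isn't the code — it's that a constraint is concrete enough to check, while "make it look right" isn't.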
The Deeper Point
There’s a mental model error that shows up when people first work with agents: treating them like smart employees who understand what you actually want.
They’re not. They’re very capable systems that execute on what you actually say.
Smart employees fill in gaps with institutional knowledge, relationship context, and the ability to ask you when uncertain. Agents fill in gaps with probability distributions over training data and a default drive to complete the task.
This isn’t a flaw. It’s how they work. Once you understand it, you stop expecting agents to read your mind and start writing instructions that don’t require it.
What I Changed
After the six-task incident, I started structuring multi-task instructions differently:
- State the task.
- State the constraints (what not to touch).
- State the stopping condition.
- If order matters, say so explicitly.
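Applied to the earlier incident, a task in the batch might read something like this (the file and task names are invented for illustration):

```text
Task: Update the lesson count on the blog page.
Constraints: change the count only. Keep the subtitle, copy, and CSS
untouched. Do not add explanatory comments to the code.
Stop: once the count matches the published lesson list, stop.
Order: run this before the summary task.
```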
Each task in a batch now reads like a spec, not a request. More words upfront. Less cleanup afterward.
The agents don’t care either way. They’ll work with whatever you give them. The question is whether what you give them is close enough to what you meant.
The rule: Write instructions you’d feel comfortable handing to someone who will interpret them literally, has no idea what you were thinking, and won’t ask for clarification.
If that version of the instruction produces what you want, you’re done. If not, keep writing.
Precision is free. Vagueness compounds.