Builders Log · January 27, 2026

Why I Don't Let My AI Agents Plan (When the Process Is Known)

The model tried to spawn an agent type that didn't exist.

Not a hallucination in text - a hallucination in action. The orchestrator called Task with subagent_type: "general-purpose". This is a built-in subagent type documented by Anthropic. It was valid in the SDK, but invalid in our registry.

The run did not crash immediately. It appeared to proceed, then failed the completion check because required subagents were never invoked. It took three debugging sessions before I found the root cause: the model was defaulting to generic agent types we hadn't registered.

This failure - invisible, plausible, and preventable - changed how I think about building with AI agents.


The belief we held

When we started this project, I held a belief that agent builders often hold: autonomy is the point. Give the model tools, give it a goal, let it figure out the path. The agent decides what to do and when.

Anthropic draws a distinction that reframed this for me:

Workflows are systems where LLMs and tools are orchestrated through predefined code paths.

Agents are systems where LLMs dynamically direct their own processes and tool usage, maintaining control over how they accomplish tasks.

Our process was not open-ended. We knew the workflow: fetch sources, extract entities, prefilter, score, write report. The judgment work required models. The orchestration did not.

We were not building an autonomous agent. We were building an agentic workflow: a fixed sequence with model judgment at specific nodes. Once I understood that distinction, the architecture became obvious.


The setup

Client: A venture-backed organization evaluating early-stage climate and energy technologies.

Problem: Their tech scouts spent most of their time on discovery - scanning funding announcements, competition results, regulatory approvals, industry publications. The judgment work (evaluating fit, deciding who to meet) was squeezed into whatever time remained.

Goal: Automate the discovery phase so humans could focus on evaluation.

Constraints:

  • Auditable pipeline. Every decision traceable.
  • Deterministic cost and latency. No surprise $500 runs.
  • Bias for recall with explicit fallback paths. Missing a good company is worse than reviewing a mediocre one.

Build note: We used AI-assisted development (Claude Code + Codex), but every architectural decision was validated in code, logs, and runs.


What we built

A multi-agent pipeline using the Claude Agent SDK:

  1. Orchestrator drives a fixed sequence: prefetch -> extract -> prefilter -> score -> report
  2. Subagents do judgment work: entity extraction (Sonnet), relevance scoring (Sonnet), keyword prefilter + optional Haiku prefilter (Haiku), report writing (Sonnet)
  3. Deterministic code handles fetching, caching, deduplication, persistence

The key decision: it's an agentic workflow, not an autonomous agent. The SDK gives us hooks, allowlists, and orchestration primitives. We use them for controlled delegation, not open-ended planning.
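
To make that concrete, here is a minimal, runnable sketch of the shape: a fixed sequence with model judgment only at the marked nodes. The helpers (prefetch_sources, run_subagent, keyword_prefilter) are stand-ins for illustration, not our production modules or the SDK's API.

python
# Illustrative shape only: fixed sequence, model judgment at specific nodes.
import asyncio

async def prefetch_sources(urls: list[str]) -> list[str]:
    """Stand-in for deterministic fetch + cache + dedupe."""
    return urls

async def run_subagent(agent_type: str, payload: object) -> object:
    """Stand-in for controlled delegation to a registered subagent."""
    return payload

def keyword_prefilter(entities: object) -> object:
    """Stand-in for deterministic keyword scoring (optionally followed by a Haiku prefilter)."""
    return entities

async def run_discovery(urls: list[str]) -> None:
    cached = await prefetch_sources(urls)                       # deterministic code
    entities = await run_subagent("entity-extractor", cached)   # model judgment
    shortlist = keyword_prefilter(entities)                     # deterministic code
    scored = await run_subagent("relevance-scorer", shortlist)  # model judgment
    await run_subagent("report-writer", scored)                 # model judgment

asyncio.run(run_discovery(["https://example.com/funding-announcement"]))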

The receipts

Deterministic fetch vs. SDK WebFetch (same source):

Metric     httpx + BeautifulSoup   SDK WebFetch
Time       128 ms                  13,348 ms
Content    8,533 chars (raw)       1,300 chars

Net: the deterministic path was 104x faster and returned 6.6x more content.

The SDK WebFetch tool always summarizes and does not return raw HTML; it answers a prompt about the page. We needed raw text for entity extraction. Deterministic fetch won.
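
For reference, a deterministic fetch along these lines is a few lines of httpx and BeautifulSoup. This is a minimal sketch, not our production fetcher; retries, caching, and error handling are omitted.

python
import httpx
from bs4 import BeautifulSoup

async def fetch_raw_text(url: str, timeout: float = 10.0) -> str:
    """Fetch a page and return its visible text for entity extraction."""
    async with httpx.AsyncClient(timeout=timeout, follow_redirects=True) as client:
        response = await client.get(url)
        response.raise_for_status()
    soup = BeautifulSoup(response.text, "html.parser")
    for tag in soup(["script", "style", "noscript"]):  # drop non-content markup
        tag.decompose()
    return soup.get_text(separator="\n", strip=True)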

Chunking constraints:

  • Chunk size: 20,000 characters
  • Overlap: 1,000 characters
  • Max chunks per source: 5
  • Total content cap: 100,000 characters

If a source exceeds these limits, we truncate and mark it. The extractor processes what is available.
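
A minimal sketch of these chunking rules, assuming character-based splitting (the function name and return shape are illustrative):

python
CHUNK_SIZE = 20_000          # characters per chunk
OVERLAP = 1_000              # characters shared between adjacent chunks
MAX_CHUNKS_PER_SOURCE = 5
TOTAL_CONTENT_CAP = 100_000  # characters per source

def chunk_source(raw: str) -> tuple[list[str], bool]:
    """Split source text into overlapping chunks; return (chunks, truncated flag)."""
    text = raw[:TOTAL_CONTENT_CAP]
    chunks: list[str] = []
    start = 0
    while start < len(text) and len(chunks) < MAX_CHUNKS_PER_SOURCE:
        chunks.append(text[start : start + CHUNK_SIZE])
        start += CHUNK_SIZE - OVERLAP
    truncated = len(raw) > TOTAL_CONTENT_CAP or start < len(text)
    return chunks, truncated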

Default caps and thresholds:

  • entity_cap: 25 (40 in broad-sweep mode)
  • prefilter_min_score: 1.0 (0.5 in broad-sweep mode)
  • prefilter_cap: 75 (entity_cap x 3)
  • rescore_after_days: 30

These are tuned to balance recall against cost and keep runs predictable.
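
One way to keep these knobs in a single place is a small config object. The class and field names below are assumptions for illustration, not our actual settings module.

python
from dataclasses import dataclass

@dataclass(frozen=True)
class DiscoveryLimits:
    broad_sweep: bool = False
    rescore_after_days: int = 30

    @property
    def entity_cap(self) -> int:
        return 40 if self.broad_sweep else 25

    @property
    def prefilter_min_score(self) -> float:
        return 0.5 if self.broad_sweep else 1.0

    @property
    def prefilter_cap(self) -> int:
        return self.entity_cap * 3  # entity_cap x 3

limits = DiscoveryLimits()
assert limits.prefilter_cap == 75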


Deep dive #1: Invalid subagent types

What broke: The orchestrator called Task with subagent_type: "general-purpose".
How it presented: The run looked normal at first, then failed a completion check because required subagents were missing.

Internal run transcript (redacted):

Starting general-purpose
...
ERROR in orchestrator: Missing required subagent invocations: entity-extractor, report-writer, relevance-scorer

Root cause: The SDK documents general-purpose as a built-in subagent type, and models will attempt it when Task is allowlisted. Our registry only defined four: entity-extractor, prefilter-scorer, relevance-scorer, report-writer.

The model did not know our registry was smaller than the SDK's capabilities.

The fix: Validate subagent_type in pre_tool_hook before the SDK processes it.
python
async def pre_tool_hook(self, input_data, tool_use_id, context):
    tool_name = input_data.get("tool_name")
    tool_input = input_data.get("tool_input", {})

    if tool_name == "Task":
        agent_type = tool_input.get("subagent_type", "unknown")
        description = tool_input.get("description", "")
        now = datetime.now(timezone.utc)  # requires: from datetime import datetime, timezone
        if self.allowed_subagents and agent_type not in self.allowed_subagents:
            return self._reject_subagent(agent_type, description, tool_use_id, now)

The rejection returns a blocking response that the SDK passes back to the model:

python
def _reject_subagent(self, agent_type, description, tool_use_id, now):
    self.invalid_subagent_attempts += 1
    message = (
        f"BLOCKED: Invalid subagent_type '{agent_type}'. "
        f"You MUST use one of: {', '.join(sorted(self.allowed_subagents))}. "
        f"Retry immediately with a valid subagent_type."
    )

    if self.invalid_subagent_attempts > self.max_invalid_subagent_attempts:
        raise InvalidSubagentError("Exceeded max invalid subagent attempts.")

    return {"decision": "block", "reason": message, "systemMessage": message}

The model sees the error, retries with a valid type, and the run continues. If it fails repeatedly (default: 2 attempts), we abort entirely.

Defense in depth: We also added explicit "DO NOT USE" lines in the orchestrator prompt:
FORBIDDEN subagent types (DO NOT USE):
  - "general-purpose" <- NEVER USE THIS
  - "Explore" <- NEVER USE THIS

The prompt helps. The hook enforces.


Deep dive #2: Tools used by the wrong agent

What broke: The relevance scorer attempted to fetch source content.
How it presented: Scorer returned partial results with fetch errors mixed in.
Root cause: The SDK allows any registered tool by default. We had a global allowlist, but it was too coarse. The scorer only needs get_themes and check_entities_bulk, not fetch_source_chunk.

Different agents have different trust levels. A scorer should never fetch. An extractor should never save. Global allowlists do not capture this.

The fix: Per-agent tool allowlists that override the global list.
python
tracker = SubagentTracker(
    run_id=context.run_id,
    allowed_tools=allowed_tools_for_tracker,  # Global fallback
    allowed_tools_by_agent={
        "entity-extractor": {
            "mcp__client-discovery-tools__fetch_source_chunk",
        },
        "prefilter-scorer": {
            "mcp__client-discovery-tools__record_haiku_prefilter",
        },
        "relevance-scorer": {
            "mcp__client-discovery-tools__get_themes",
            "mcp__client-discovery-tools__check_entities_bulk",
        },
        "report-writer": {
            "mcp__client-discovery-tools__generate_report",
        },
    },
)

Validation in pre_tool_hook:

python
def _validate_tool_call(self, tool_name, tool_input, tool_use_id, now, agent_type):
    allowed_for_agent = None
    if agent_type and agent_type in self.allowed_tools_by_agent:
        allowed_for_agent = self.allowed_tools_by_agent[agent_type]
    elif self.allowed_tools:
        allowed_for_agent = self.allowed_tools

    if allowed_for_agent and tool_name not in allowed_for_agent:
        return self._block_tool_call(
            tool_name,
            "Tool not allowlisted for this session.",
            tool_use_id,
            now,
        )

Now we always: Define per-agent allowlists from day one. Least privilege by role, enforced in code.

Deep dive #3: Duplicate fetches

What broke: The entity extractor fetched the same source multiple times in one run.
How it presented: Costs spiked. Runs took 3x longer than expected. Some sources appeared in the extraction output multiple times.
Root cause: Models do not reliably track state across long contexts. The extractor would process a chunk, extract entities, then - apparently forgetting it had already fetched - request the same source again.

This is a fundamental issue with context-based memory. Over long tool-use sequences, models lose track of what they have done. Never trust model memory for expensive operations.

The fix: Track fetch counts per source_id in hooks. Block after limit.
python
tracker = SubagentTracker(
    fetch_tool_name="mcp__client-discovery-tools__fetch_source_chunk",
    max_fetches_per_source=5,  # Allow sequential chunks, block repeats
)

Validation:

python
if self.fetch_tool_name and tool_name == self.fetch_tool_name:
    source_id = tool_input.get("source_id")
    if source_id:
        count = self.fetch_counts_by_source.get(source_id, 0)
        if self.max_fetches_per_source and count >= self.max_fetches_per_source:
            return self._block_tool_call(
                tool_name,
                f"Fetch for '{source_id}' exceeded limit {self.max_fetches_per_source}. "
                "You already received the allowed chunks for this source.",
                tool_use_id,
                now,
            )
        self.fetch_counts_by_source[source_id] = count + 1

Broader lesson: Any expensive operation - network calls, database writes, API requests - needs hook-level tracking. Do not trust the model to remember what it has done.

Deep dive #4: Truncation -> Read loop -> timeout

What broke: Large sources exceeded tool response limits. The SDK wrote full output to a file. The model tried repeated Read calls to page through it.
How it presented: Runs timing out. Tool call logs showing 15+ sequential Read operations on the same file path.
Root cause: We granted Read access "for flexibility." The SDK truncates tool responses over a certain size and writes the full output to a file, expecting the model to read it. The model obliged - inefficiently. It would read 2000 lines, realize there was more, read another 2000, and loop until timeout.

The fix: Remove Read access. Provide chunked tools that return exactly one chunk per call.
python
@tool(
    "fetch_source_chunk",
    "Fetch a single cached content chunk for a source URL (cache only).",
    {"source_id": str, "run_id": str, "chunk_index": int}
)
async def fetch_source_chunk_tool(args):
    source_id = args["source_id"]
    chunk_index = args["chunk_index"]
    # Cache lookup elided in this excerpt: `chunks` and `truncated` come from
    # the run's content cache. Returns exactly one chunk, plus metadata about
    # total chunks.
    return {
        "source_id": source_id,
        "chunk_count": len(chunks),
        "chunk_index": chunk_index,
        "chunk_text": chunks[chunk_index],
        "truncated": truncated,
    }

The model calls with chunk_index=0 and gets the first chunk plus chunk_count. It can then step through the remaining chunks up to chunk_index = chunk_count - 1. Deterministic. No paging.

Now we always: Use chunked tools for large inputs. No file access for agents. The tool decides the chunk size, not the model.

Other issues (brief mentions)

Dual-runtime migration collision: Our Python agents and TypeScript server both run migrations on the same SQLite database. Python added a new column first; TypeScript crashed on startup with a duplicate column error. Fix: make migrations idempotent. Check before applying.
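
For the SQLite case, "check before applying" can be as simple as inspecting PRAGMA table_info before an ALTER TABLE. The table and column names below are placeholders, not our schema; the TypeScript side needs the equivalent guard.

python
import sqlite3

def add_column_if_missing(conn: sqlite3.Connection, table: str, column: str, ddl: str) -> None:
    """Apply an ALTER TABLE only if the column is not already there."""
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    if column not in existing:
        conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {ddl}")
        conn.commit()

conn = sqlite3.connect("discovery.db")  # placeholder path
add_column_if_missing(conn, "entities", "prefilter_score", "REAL")  # safe to run twice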

Haiku prefilter results not captured: The prefilter scorer returned results in the subagent response, but we were not logging them. We only logged tool calls, not tool results. Fix: add explicit record_haiku_prefilter tool that the prefilter scorer must call. Audit logging happens in the tool handler, not in response parsing.
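
A sketch of that tool, using the same @tool pattern as fetch_source_chunk above. The schema, return shape, and the JSONL audit file are assumptions for illustration; the point is that the log entry is written inside the handler.

python
import json
from datetime import datetime, timezone

@tool(
    "record_haiku_prefilter",
    "Record Haiku prefilter scores so they land in the audit log.",
    {"run_id": str, "scores_json": str}
)
async def record_haiku_prefilter_tool(args):
    # Audit logging happens here, in the tool handler, not in response parsing.
    entry = {
        "event": "haiku_prefilter",
        "run_id": args["run_id"],
        "scores": json.loads(args["scores_json"]),
        "at": datetime.now(timezone.utc).isoformat(),
    }
    with open("audit_log.jsonl", "a", encoding="utf-8") as fh:  # placeholder path
        fh.write(json.dumps(entry) + "\n")
    return {"recorded": True}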

Rescore gating observability gap: Entities skipped by rescore gating were not visible in audit logs. We logged the tool call to check_entities_bulk but not its response. Fix: log tool results in post_tool_hook, not just tool invocations.
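
A sketch of that post-tool hook, mirroring the pre_tool_hook signature shown earlier. The input field names and the self.audit_entries list are assumptions; verify against the SDK's actual post-tool payload before relying on them.

python
async def post_tool_hook(self, input_data, tool_use_id, context):
    # Assumed payload fields; check the SDK's post-tool hook schema.
    tool_name = input_data.get("tool_name")
    tool_response = input_data.get("tool_response")

    if tool_name == "mcp__client-discovery-tools__check_entities_bulk":
        # Log the result, not just the invocation, so rescore-gating
        # decisions show up in the audit trail.
        self.audit_entries.append({
            "event": "tool_result",
            "tool": tool_name,
            "tool_use_id": tool_use_id,
            "response": tool_response,
        })
    return {}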


Operator takeaways

The counter-narrative: I am not anti-autonomy. I am pro fit-for-purpose.

If your process is known - if you can write down the steps - use agentic workflows. Fixed sequences with model judgment at specific nodes. Hooks enforce constraints. Allowlists grant least privilege.

If your process is open-ended - if the task genuinely requires exploration and you cannot pre-specify the workflow - use autonomous agents. Accept the cost and latency variability that comes with it.

Most production systems are the former. Discovery pipelines, data processing, content generation, code review - these have known workflows. The judgment work is bounded. Autonomy is not the default; fit-for-purpose is.

What changed in our defaults:

  1. Validate subagent types in hooks before the SDK processes them. Block invalid types, return error to model, abort after N failures.
  2. Define per-agent tool allowlists. Not just global. Different agents have different trust levels.
  3. Track and limit expensive operations at the hook level. Fetch counts, API calls, database writes. Do not trust model memory.
  4. Use chunked tools for large inputs. No file access for agents. The tool decides chunk size.
  5. Make migrations idempotent when multiple runtimes share a database. Check before applying. Always.
  6. Log tool results, not just tool calls. Observability gaps compound. If you cannot see it, you cannot debug it.

What I would do differently:

  • Start with hooks, not prompts. Prompts guide; hooks enforce. Build the enforcement layer first.
  • Define allowlists from day one. Every new tool gets added to specific agent allowlists, not to a global list.
  • Make observability a first-class output. The audit log should be as important as the final report. Design for it.

Minimal code patterns

Pattern 1: Subagent type validation

python
# Sanitized pattern (from SubagentTracker.pre_tool_hook)
class SubagentTracker:
    def __init__(self, allowed_subagents: set[str], max_invalid_attempts: int = 2):
        self.allowed_subagents = set(allowed_subagents)
        self.max_invalid_attempts = max_invalid_attempts
        self.invalid_attempts = 0

    async def pre_tool_hook(self, input_data, tool_use_id, context):
        tool_name = input_data.get("tool_name")
        tool_input = input_data.get("tool_input", {})

        if tool_name == "Task":
            agent_type = tool_input.get("subagent_type")
            if agent_type not in self.allowed_subagents:
                self.invalid_attempts += 1
                if self.invalid_attempts > self.max_invalid_attempts:
                    raise RuntimeError("Max invalid subagent attempts exceeded")
                return {
                    "decision": "block",
                    "systemMessage": f"Invalid subagent_type. Use: {self.allowed_subagents}"
                }
        return {}  # Allow

Pattern 2: Per-agent tool allowlists

python
def validate_tool_for_agent(tool_name, agent_type, allowlists_by_agent, global_allowlist):
    if agent_type in allowlists_by_agent:
        allowed = allowlists_by_agent[agent_type]
    else:
        allowed = global_allowlist

    if allowed and tool_name not in allowed:
        return {"decision": "block", "reason": "Tool not allowed for this agent"}
    return None  # Allow

Pattern 3: Fetch count limiter

python
class FetchLimiter:
    def __init__(self, max_per_source: int = 5):
        self.counts: dict[str, int] = {}
        self.max_per_source = max_per_source

    def check_and_increment(self, source_id: str) -> str | None:
        count = self.counts.get(source_id, 0)
        if count >= self.max_per_source:
            return f"Fetch limit exceeded for {source_id}"
        self.counts[source_id] = count + 1
        return None  # Allowed

When the process is defined, workflows beat autonomy. The SDK gives you hooks, allowlists, and orchestration primitives. Use them.


References

[1] Anthropic. "Building effective agents." December 2024.

[2] Anthropic. "Subagents in the SDK." Claude Developer Docs. Accessed 2026-01-22.
