L / 003

What's Sub?

The very fine engineers at wild have been watching the AI engineering tooling space evolve from "here's a chatbot that knows about code" to "here's a distributed system of specialized AI agents that you accidentally architected while trying to refactor a single function." It's been educational.

The latest development in this trajectory is subagents: your main AI agent's ability to spawn smaller, more focused agents to handle specific subtasks. Think of it as delegation, but for silicon-based workers who never complain about meeting overload.

Predetermined Hierarchy

GitHub Copilot has been integrating subagents into their workflow, and we've been experimenting with them across our projects. The implementation is deterministic, which in this context means "predictable and well-behaved," two qualities we value highly in both our infrastructure and caffeine intake.

In Copilot's cloud-based agents, there's a default subagent called "explore" that operates as a context retriever. When the main agent needs to understand a codebase or locate specific documentation, it dispatches the explore subagent, which performs the research and returns only the relevant findings. The main agent's context window stays focused on the actual problem-solving rather than getting bloated with every file it considered and rejected.
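The dispatch-and-summarize loop can be sketched in a few lines. This is a hypothetical illustration of the pattern, not Copilot's actual implementation; the `Subagent` and `MainAgent` classes and the fake summary are our own stand-ins for real model calls.

```python
from dataclasses import dataclass, field

@dataclass
class Subagent:
    name: str
    system_prompt: str
    # Each subagent keeps its own context; the main agent never sees it.
    context: list = field(default_factory=list)

    def run(self, task: str) -> str:
        self.context.append(task)
        # A real agent would run a model loop here; we fake a summary.
        return f"[{self.name}] summary of: {task}"

@dataclass
class MainAgent:
    context: list = field(default_factory=list)

    def dispatch(self, subagent: Subagent, task: str) -> str:
        summary = subagent.run(task)
        # Only the summary enters the main context window, not every
        # file the subagent read and rejected along the way.
        self.context.append(summary)
        return summary

main = MainAgent()
explore = Subagent("explore", "You locate relevant code and docs.")
print(main.dispatch(explore, "Where is auth handled?"))
```

The point of the sketch is the asymmetry: the subagent's context can grow as large as the research demands, while the main agent's context gains exactly one line per dispatch.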

We've been deploying this pattern for two specific use cases: codebase research and documentation management. When we need to understand how a particular feature works across multiple services (say, tracing authentication flows through an API layer), the subagent approach prevents the main conversation from degenerating into a 47-file archaeological expedition through our commit history.

The technical advantage here is context isolation. Each subagent operates in its own context window, performs its specific task, and reports back with a summary. The main agent's working memory remains comprehensible to humans, which turns out to be an underrated feature when you're trying to understand what decisions were made three hours ago.

For documentation work, this becomes especially useful. A subagent can read through existing docs, cross-reference them with current code, identify discrepancies, and produce a report, all without cluttering the main thread with the equivalent of "I read 23 markdown files and here are my stream-of-consciousness notes about each one."
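The core of that docs-audit job is a set comparison. Here's a toy sketch of what the subagent would report back; the documented and exported names are invented examples, and real parsing of markdown and code is elided.

```python
# Names mentioned in the docs vs. names the code actually exports.
# In practice a subagent would extract these by reading the files.
documented = {"login", "logout", "refresh_token"}
exported = {"login", "logout", "rotate_token"}

# The report contains only the discrepancies, not the reading notes.
report = {
    "missing_docs": sorted(exported - documented),
    "stale_docs": sorted(documented - exported),
}
print(report)
```

That small dictionary is the whole payload the main agent needs, which is the entire argument for pushing the 23-markdown-file slog into an isolated context.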

Subagents++ with Kimi 2.5

Kimi 2.5 takes a layered approach. Like Copilot, it supports predefined subagents with their own specialized prompts and tool configurations. These are the proven workhorses: well-tested, reliable, and optimized for common scenarios.

But Kimi also provides an experimental tool called CreateSubagent that lets the main agent dynamically define new subagents during a session based on the current task. This addresses a practical constraint of predefined-only systems: you can't anticipate every possible subtask configuration ahead of time.

The key advantage is flexibility without infinite upfront planning. Predefined subagents handle the 80% use case: code review, implementation, refactoring, the things we do repeatedly and want consistent behavior for. The CreateSubagent tool handles the remaining 20%: unusual one-off investigations, domain-specific analysis that doesn't warrant a permanent subagent definition, or experimental approaches where you're still figuring out what the helper should even do.

With CreateSubagent, the main agent can specify custom system prompts and tool access for new subagents on the fly. This is particularly useful for exploratory work where the exact nature of the subtask isn't known until you're already three steps into the problem.
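A minimal sketch of what dynamic creation amounts to, loosely modeled on the CreateSubagent idea. The function shape, the registry, and the "docker-profiler" helper are all our assumptions for illustration, not Kimi's actual API.

```python
def create_subagent(registry: dict, name: str, system_prompt: str,
                    tools: list[str]) -> dict:
    """Register a new subagent definition mid-session."""
    agent = {"name": name, "system_prompt": system_prompt, "tools": tools}
    registry[name] = agent
    return agent

# Predefined workhorses exist before the session starts.
registry = {"explore": {"name": "explore",
                        "system_prompt": "Locate relevant code and docs.",
                        "tools": ["read_file", "grep"]}}

# Mid-task, the main agent decides it needs a one-off specialist.
create_subagent(
    registry,
    name="docker-profiler",
    system_prompt="Inspect Dockerfiles for layer-cache inefficiencies.",
    tools=["read_file", "shell"],
)
print(sorted(registry))
```

The design choice worth noting: a dynamically created subagent is just data (a prompt plus a tool allowlist), so nothing stops the main agent from minting one for a problem nobody anticipated at configuration time.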

We've been running experiments with this on our deployment tooling. When debugging a complex performance issue that might involve Docker configurations, network settings, database queries, or application-level bottlenecks, having the main agent spawn specialized subagents for each investigation area lets us parallelize the diagnosis without manually coordinating multiple chat sessions.
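The fan-out itself is ordinary concurrency. A hedged sketch, with the investigation areas from our debugging scenario and a fake probe standing in for each subagent's model loop:

```python
from concurrent.futures import ThreadPoolExecutor

def investigate(area: str) -> str:
    # A real subagent would run its own isolated agent loop per area.
    return f"{area}: no anomalies found"

areas = ["docker", "network", "database", "application"]

# One subagent per suspect area, run concurrently; map preserves order,
# so reports line up with areas for the final synthesis step.
with ThreadPoolExecutor() as pool:
    reports = list(pool.map(investigate, areas))

for report in reports:
    print(report)
```

The main agent then synthesizes the four reports instead of interleaving four investigations in one thread of conversation, which is the part we previously did by hand across multiple chat sessions.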

The technical implementation uses a Task tool that launches subagents with isolated contexts. These subagents can't access the main agent's conversation history; all necessary information must be explicitly provided in the task prompt. This is simultaneously a feature (clean separation of concerns) and a constraint (you can't be lazy about what context you provide).
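What that constraint looks like in practice, as a toy sketch (the function, the file path, and the symptom are illustrative, not from any real tool):

```python
def run_task(task_prompt: str, shared_history=None) -> str:
    # Isolated context: a subagent launched via a Task-style tool
    # starts clean and cannot reach back into the caller's history.
    assert shared_history is None, "subagents start from a clean context"
    return f"done: {task_prompt}"

# Everything the helper needs must be spelled out up front:
# the goal, the relevant files, and the observed symptom.
prompt = (
    "Audit retry logic in the payments service. "
    "Relevant file: services/payments/retry.py (hypothetical path). "
    "Known symptom: duplicate charges under timeout."
)
print(run_task(prompt))
```

Writing that prompt forces you to articulate what you actually know about the problem, which is the "can't be lazy" part earning its keep.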

The Practical Reality

Both approaches share a common benefit: they reduce token consumption and keep the primary conversation comprehensible. When you're five hours into a complex refactoring task, the ability to review your main conversation thread without scrolling past extensive research tangents is not a minor convenience.

The architectural pattern also maps reasonably well to how we already think about complex tasks. You don't implement a feature by considering every possible approach simultaneously; you break down the problem, research specific aspects, and synthesize the findings. Subagents formalize that process in a way that makes it visible to both the AI and the human.

There are still rough edges. Deciding when to spawn a subagent versus handling something directly isn't always obvious. Providing sufficient context to a subagent without essentially explaining the entire problem requires some calibration. And occasionally you'll watch an AI agent spawn a subagent to research something that it could have answered in two seconds, which is either inefficient or a very advanced form of delegation depending on your perspective.

But as we continue building AI-powered applications at wild for various client projects with increasingly complex deployment requirements, having tools that help manage AI context and task complexity is valuable. Subagents aren't solving every problem in AI engineering, but they're a great step toward making these systems more usable for actual development work. They're definitely an essential part of the new AI software engineering stack.

And that's more or less what's sub.