How does multi-persona AI agent orchestration work?
What a persona actually is
Inside KanBots a persona is a short prompt snippet — usually 100 to 400 words — that gets appended to the agent's system prompt via --append-system-prompt (Claude Code) or the Codex equivalent. It is stored as a row in the local SQLite database in .kanbots/db.sqlite, never leaves your machine, and survives across workspaces if you copy the file.
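As a rough illustration of how a persona snippet could reach the Claude Code CLI, the sketch below builds an argument list. The `Persona` shape and the `buildClaudeArgs` helper are hypothetical names introduced here, not KanBots code; only the `--append-system-prompt` flag comes from the text above, and the use of `-p` (Claude Code's non-interactive print mode) is an assumption about how a run might be launched.

```typescript
// Hypothetical persona record; the actual SQLite schema is not shown in this article.
interface Persona {
  name: string;
  prompt: string; // the short snippet, usually 100 to 400 words
}

// Build CLI arguments for one run. The persona text is appended to the
// system prompt; it does not replace it and does not change the model.
function buildClaudeArgs(persona: Persona, task: string): string[] {
  return ["-p", task, "--append-system-prompt", persona.prompt];
}

const tester: Persona = {
  name: "Tester",
  prompt: "Refuse to merge until a test exercises the actual path.",
};

const args = buildClaudeArgs(tester, "Add /users CRUD endpoint with input validation.");
```

The resulting array could be handed to a process spawner; the model selector and containment mode would be passed separately, since persona is independent of both.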
Built-in personas ship with the desktop: Product Manager, Senior Engineer, UX Designer, Growth Lead, Reliability Engineer, Tester. "New persona" in the persona picker lets you write your own — give it a name, paste a prompt, save. Custom personas are bound to the workspace and reusable forever.
Persona is not a model choice and not a tool allowlist. Those are separate knobs (model selector and containment mode). Persona is the lens — what the agent thinks it is supposed to optimize for on this run.
Round-robin in five lines
The Feature-Dev autopilot session stores a single integer: cycle_index. When a slot is ready for work it claims the next index atomically, picks personas[index % personas.length], increments the counter, and dispatches. The lock is a single JavaScript promise chain in packages/api/src/autopilot/feature-dev.ts — single process, single source of truth, no two slots can ever land on the same index.
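A minimal sketch of that claim step, assuming a promise chain used as an in-process lock. This is not the code in feature-dev.ts; `RoundRobin`, `claim`, `personaFor`, and `firstEightCycles` are illustrative names.

```typescript
type Claim = { cycle: number; persona: string };

// Pure selection rule: personas[index % personas.length].
function personaFor(cycle: number, personas: readonly string[]): string {
  return personas[cycle % personas.length];
}

class RoundRobin {
  private cycleIndex = 0; // the single stored integer
  private lock: Promise<unknown> = Promise.resolve(); // tail of the promise chain
  private readonly personas: readonly string[];

  constructor(personas: readonly string[]) {
    if (personas.length === 0) throw new Error("roster must not be empty");
    this.personas = personas;
  }

  // Each claim is queued behind the previous one, so reading and
  // incrementing the counter never interleaves between slots.
  claim(): Promise<Claim> {
    const next = this.lock.then(() => {
      const cycle = this.cycleIndex;
      this.cycleIndex += 1;
      return { cycle, persona: personaFor(cycle, this.personas) };
    });
    this.lock = next;
    return next;
  }
}

// Claim eight cycles at once, as two busy slots would over time.
function firstEightCycles(): Promise<Claim[]> {
  const rr = new RoundRobin(["product author", "engineer", "reviewer", "tester"]);
  return Promise.all(Array.from({ length: 8 }, () => rr.claim()));
}
```

Because every claim runs behind the previous one on the same chain, two slots asking at the same moment still receive distinct, consecutive indices.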
With four personas and parallelism 2 the first eight cycles look like:
cycle 0 → slot A: product author
cycle 1 → slot B: engineer
cycle 2 → slot A: reviewer
cycle 3 → slot B: tester
cycle 4 → slot A: product author
cycle 5 → slot B: engineer
cycle 6 → slot A: reviewer
cycle 7 → slot B: tester
Each cycle is a real dispatch: own worktree, own kanbots/issue-N-runId branch, own cost rollup. Slots do not share state with each other; they share only the parent issue and any subtasks it has spawned.
Why this beats a single-persona loop
Run the engineer persona alone on a feature and you get a recurring failure mode: it over-implements. The engineer is biased to ship — it will add caching, retries, and a metrics counter to a CRUD endpoint because those look like engineering. The reviewer persona on a second pass reads the diff and flags the surplus: the spec said write the endpoint, not redesign the data layer. The tester persona on a third pass refuses to merge until there is a test that exercises the actual path. The product author on a fourth pass splits the leftover into a fresh card.
That feedback loop is the value. A reviewer who arrives after the engineer commits is structurally better at catching slop than the engineer reviewing its own work, even if both are running the same model. The persona prompt is what biases the run away from rationalizing its own choices.
Configuring 2 personas vs 4
Two personas plus parallelism 2 is the cheap default. Pick engineer and tester. Slot 0 implements, slot 1 writes or runs tests, they alternate. Total cycles to a CRUD endpoint with tests: 3 to 5. Approximate spend with Sonnet at medium effort: $2.50 to $6.
Four personas plus parallelism 2 is what you reach for on a real feature. Product author, engineer, reviewer, tester. Slot 0 and slot 1 alternate through all four. Product author splits the issue, engineer implements each subtask, reviewer reads the diff, tester runs the suite, then the cycle restarts. Total cycles: 8 to 14. Approximate spend at Sonnet medium: $10 to $20.
Four personas plus parallelism 4 is the maximum the MAX_PARALLELISM = 4 guard allows. Use it for big parent issues that obviously decompose (a multi-screen UI, a refactor that touches many files). On a 16GB laptop two CLI processes are comfortable; four will swap. Budget accordingly.
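The three configurations above can be written down as plain data, with a check modeled on the parallelism guard. This is a sketch under assumptions: the `FeatureDevConfig` shape, the preset names, and `validateConfig` are hypothetical, and whether the real guard rejects or clamps an over-limit value is not stated in this article (the sketch rejects).

```typescript
const MAX_PARALLELISM = 4; // the guard value named above

// Hypothetical config shape for a Feature-Dev session.
interface FeatureDevConfig {
  personas: string[];
  parallelism: number;
}

const presets = {
  cheapDefault: { personas: ["engineer", "tester"], parallelism: 2 },
  realFeature: {
    personas: ["product author", "engineer", "reviewer", "tester"],
    parallelism: 2,
  },
  bigDecomposable: {
    personas: ["product author", "engineer", "reviewer", "tester"],
    parallelism: 4,
  },
};

// Reject configs with an empty roster or an out-of-range parallelism.
function validateConfig(config: FeatureDevConfig): FeatureDevConfig {
  if (config.personas.length === 0) {
    throw new Error("pick at least one persona");
  }
  if (!Number.isInteger(config.parallelism) || config.parallelism < 1) {
    throw new RangeError("parallelism must be a positive integer");
  }
  if (config.parallelism > MAX_PARALLELISM) {
    throw new RangeError(`parallelism must not exceed ${MAX_PARALLELISM}`);
  }
  return config;
}
```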
A worked Feature-Dev session
- Card #87: "Add /users CRUD endpoint with input validation." Open autopilot Feature-Dev. Pick engineer and tester. Parallelism 2. Effort medium. Budget $8.
- Cycle 0 (slot A, engineer): creates the worktree, scaffolds the route handler, returns 503 on bad input. Cost $0.62.
- Cycle 1 (slot B, tester): spawns a separate worktree on the same branch base, writes a test that POSTs malformed JSON and asserts 400. Test fails because the handler returns 503. Tester reports the mismatch by emitting a decision event. Cost $0.41.
- Cycle 2 (slot A, engineer): reads the tester's artifact, swaps 503 to 400, commits. Cost $0.55.
- Cycle 3 (slot B, tester): re-runs the suite, all green. Emits a result event with no failures. Cost $0.34.
- Slots return. Session total $1.92. Two promotable worktrees on disk. You read the diffs, pick the engineer's, promote-to-PR.
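The session total above is a straight sum of the per-cycle costs, checked against the budget. A small sketch of that rollup follows, using integer cents to avoid floating-point drift; the `CycleCost` shape and function names are hypothetical, and the figures are the ones from the worked session.

```typescript
interface CycleCost {
  cycle: number;
  slot: "A" | "B";
  persona: string;
  cents: number; // cost in integer cents
}

const session: CycleCost[] = [
  { cycle: 0, slot: "A", persona: "engineer", cents: 62 },
  { cycle: 1, slot: "B", persona: "tester", cents: 41 },
  { cycle: 2, slot: "A", persona: "engineer", cents: 55 },
  { cycle: 3, slot: "B", persona: "tester", cents: 34 },
];

// Sum all cycle costs for the session.
function sessionTotalCents(costs: CycleCost[]): number {
  return costs.reduce((sum, c) => sum + c.cents, 0);
}

// True while the session is still at or under its budget cap.
function withinBudget(costs: CycleCost[], budgetCents: number): boolean {
  return sessionTotalCents(costs) <= budgetCents;
}

const totalCents = sessionTotalCents(session); // 192, i.e. $1.92
const underCap = withinBudget(session, 800); // budget of $8
```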
Failure modes
Two engineer personas and nothing else. You set the roster wrong — same persona twice, or two engineering-flavored personas. The round-robin still works, but the second pass does not challenge the first. Symptom: the diff grows monotonically with no rejections. Fix: open the persona chip row and remove duplicates; always have at least one persona that does not write code.
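The roster rule above (no duplicates, at least one persona that does not write code) can be checked mechanically before a session starts. The sketch below is illustrative only: the `writesCode` flag and `rosterProblems` helper are assumptions, not fields or functions KanBots is known to expose.

```typescript
// Hypothetical roster entry; writesCode is an illustrative flag.
interface RosterEntry {
  name: string;
  writesCode: boolean;
}

// Return a list of human-readable problems; an empty list means the roster passes.
function rosterProblems(roster: RosterEntry[]): string[] {
  const problems: string[] = [];
  const seen = new Set<string>();
  for (const entry of roster) {
    const key = entry.name.trim().toLowerCase();
    if (seen.has(key)) {
      problems.push(`duplicate persona: ${entry.name}`);
    }
    seen.add(key);
  }
  if (roster.every((entry) => entry.writesCode)) {
    problems.push("roster has no persona that does not write code");
  }
  return problems;
}
```

Note that a name-based check only catches exact duplicates; two differently named but engineering-flavored personas are flagged only if both are marked as writing code.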
Persona prompt is too long. A 1,500-word persona prompt eats context and pushes the actual issue body deeper into the system prompt. The agent then ignores the spec because it is focused on persona instructions. Fix: trim personas to under 400 words. They are biases, not specifications.
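A quick way to enforce that limit is a word count on save. This is a minimal sketch; `wordCount`, `isPersonaTooLong`, and the `MAX_PERSONA_WORDS` constant are names chosen here for illustration, with the 400-word threshold taken from the guidance above.

```typescript
const MAX_PERSONA_WORDS = 400; // threshold from the guidance above

// Count whitespace-separated words; an empty or blank prompt counts as zero.
function wordCount(text: string): number {
  const trimmed = text.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

function isPersonaTooLong(prompt: string): boolean {
  return wordCount(prompt) > MAX_PERSONA_WORDS;
}
```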
Reviewer never finishes a cycle. The reviewer keeps finding new things to ask about because the persona prompt says "leave no concern unaddressed." Symptom: reviewer cycles run 4–6 minutes each, no progress on engineer side. Fix: add a sentence to the reviewer persona: "Surface at most three concerns per pass. If the diff is mergeable, say so."
When personas are wrong
Personas are wrong when the work has no review surface — fixing a typo, renaming a single variable, bumping a single dep. The round-robin overhead (worktree per cycle, cost rollup, context re-load) costs more than just running one engineer dispatch. Personas also do not help when the parent issue has no acceptance criterion: the reviewer cannot reject what was never defined.
For the outer loop that drives this, see autopilot mode; for how the backlog grows under personas see self-evolving backlog.
Try it on your own folder
Drop a folder, get a board, dispatch parallel agents. The desktop runs locally on macOS, Linux, and Windows.
Related questions
- What is autopilot mode for Claude Code? Autopilot picks personas, parallelism, and budget. It loops until the work converges or the cost cap hits. The mental model and when to use it.
- How do you put a budget cap on AI coding agents? Per-run cost tracking, per-card rollups, per-autopilot-session caps. Stop runaway spend before it stops you.
- How do AI agents fit a feature-branch workflow? One agent → one branch → one PR, isolated by worktree, with pre-push hooks preventing agent-side pushes. The exact branch naming and promote flow.
- Can an AI agent backlog evolve itself? When personas split a parent issue into subtasks, the backlog grows. How to keep that growth productive instead of runaway.