How do you build an AI agent PR review workflow?

Spawn a reviewer persona in its own git worktree against the PR's head SHA. The agent reads the diff, runs pnpm typecheck && pnpm test inside that worktree, and posts a structured verdict to the card. The difference from "AI review comments on GitHub" is that this agent actually executes the tests.

Most AI PR review products read the diff and produce a list of opinions. KanBots runs a reviewer agent with shell access to a fresh worktree at the PR's head SHA. It can read the diff, but it can also run your typecheck, your unit tests, your linter, and a build. The verdict on the card is grounded in observed behavior, not pattern-matching on the patch text.

The worktree isolation is what makes this safe to run on every PR. Humans editing the main checkout don't see contention; the reviewer agent isn't trampling files someone else has open in their editor. When the run finishes, the worktree stays around so you can inspect what it did or start its dev server with one click.

The reviewer persona pattern

KanBots ships a default reviewer persona at .kanbots/personas/reviewer.md. Its prompt is explicit about three things: the agent must run the test commands declared in the workspace config before posting any verdict, it must structure its output as "verdict / blocking concerns / non-blocking observations," and it must call out any place the diff touches code paths the test suite doesn't cover.

The check commands come from .kanbots/config.json — same source the QA autopilot uses. A typical config looks like {"typecheck":"pnpm typecheck","tests":"pnpm test","lint":"pnpm lint"}. The reviewer persona runs all three and pastes the results into the verdict section. If pnpm test fails, the verdict header says request changes; if everything passes and the diff is small, it says approve; everything in between is comment.
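The verdict rule described above can be sketched roughly as follows. This is an illustration, not KanBots' actual implementation: the function names, the injectable runner, and the SMALL_DIFF_MAX_LINES threshold are all assumptions, since the text only says the diff must be "small".

```python
import json
import subprocess
from typing import Callable, Dict

CHECK_NAMES = ("typecheck", "tests", "lint")

# Hypothetical threshold: the article only says "small", so this number is an assumption.
SMALL_DIFF_MAX_LINES = 200


def run_check(command: str, cwd: str) -> bool:
    """Run one check command inside the worktree; True means exit code 0."""
    return subprocess.run(command, shell=True, cwd=cwd).returncode == 0


def decide_verdict(results: Dict[str, bool], diff_lines: int) -> str:
    """Map check results and diff size to a verdict header."""
    # A failing (or missing) test run always means "request changes".
    if not results.get("tests", False):
        return "request changes"
    # Everything green and a small diff is the only path to "approve".
    if all(results.values()) and diff_lines <= SMALL_DIFF_MAX_LINES:
        return "approve"
    # Everything in between is a comment.
    return "comment"


def review(config_json: str, cwd: str, diff_lines: int,
           runner: Callable[[str, str], bool] = run_check) -> str:
    """Run the configured checks, then decide the verdict."""
    config = json.loads(config_json)
    results = {name: runner(config[name], cwd)
               for name in CHECK_NAMES if name in config}
    return decide_verdict(results, diff_lines)
```

Passing the runner as a parameter keeps the decision logic testable without a real repo; in a real run the default run_check would execute each command in the worktree directory.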

How a review run wires up

In GitHub mode, KanBots subscribes to pull_request.opened and pull_request.synchronize webhooks. When either fires for a repo bound to the workspace, KanBots creates a card in the review column with a link to the PR, the head SHA, and the diff URL.
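As a rough sketch of that webhook-to-card step: the payload fields used below (action, pull_request.html_url, pull_request.head.sha, pull_request.diff_url, repository.full_name) are part of GitHub's pull_request webhook payload, while the ReviewCard shape and the bound_repos set are hypothetical stand-ins for whatever KanBots stores internally.

```python
from dataclasses import dataclass
from typing import Optional

# The two pull_request actions the article says trigger a card.
HANDLED_ACTIONS = {"opened", "synchronize"}


@dataclass
class ReviewCard:
    """Hypothetical card shape; the real KanBots schema is not documented here."""
    column: str
    pr_url: str
    head_sha: str
    diff_url: str


def card_from_webhook(event: str, payload: dict, bound_repos: set) -> Optional[ReviewCard]:
    """Return a review card for a handled PR event, or None to ignore it."""
    # `event` is the value of the X-GitHub-Event header.
    if event != "pull_request" or payload.get("action") not in HANDLED_ACTIONS:
        return None
    # Only repos bound to the workspace produce cards.
    if payload["repository"]["full_name"] not in bound_repos:
        return None
    pr = payload["pull_request"]
    return ReviewCard(
        column="review",
        pr_url=pr["html_url"],
        head_sha=pr["head"]["sha"],
        diff_url=pr["diff_url"],
    )
```

Note that synchronize fires on every push to the PR branch, so a real handler would probably update the existing card's head SHA rather than create a duplicate.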

Click Dispatch on the card. The dispatcher creates a worktree off the PR's head SHA (not the default branch — important; you're reviewing the PR's state, not main) and spawns the agent with the reviewer persona prompt. The agent first runs the check commands, then reads the diff, then writes its verdict to a file in the worktree. The dispatcher reads that file and posts the body to the card thread.
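Creating a worktree pinned to the PR's head SHA can be done with plain git. The sketch below is an assumption about how a dispatcher might do it, not KanBots' code; the function names and the review-<sha> directory naming are hypothetical. It also assumes the commit has already been fetched into the local repository.

```python
import subprocess
from pathlib import Path
from typing import List


def worktree_add_command(worktree_dir: str, head_sha: str) -> List[str]:
    """Build the git command that checks out one exact commit in a new worktree."""
    # --detach checks out the commit itself without creating or moving a branch,
    # so the review sees the PR's state rather than main.
    return ["git", "worktree", "add", "--detach", worktree_dir, head_sha]


def create_review_worktree(repo_dir: str, worktree_root: str, head_sha: str) -> Path:
    """Create the worktree and return its path (directory naming is hypothetical)."""
    target = Path(worktree_root) / f"review-{head_sha[:12]}"
    subprocess.run(worktree_add_command(str(target), head_sha),
                   cwd=repo_dir, check=True)
    return target
```

Using a detached checkout avoids git's restriction that a branch can only be checked out in one worktree at a time, which matters if a human already has the PR branch open in the main checkout.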

Cost transparency: the run drawer shows tokens spent. A reviewer pass on a 100-line diff with Sonnet typically lands at $0.30–$0.60. If your typecheck and test runs are slow, the agent waits — token cost is for the reviewer's reading and verdict, not for the shell waits.
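The cost arithmetic is simple enough to sanity-check by hand. In the sketch below, the token counts and the per-million-token rates are placeholder assumptions chosen for illustration, not published pricing and not measured KanBots numbers; substitute the figures from your own run drawer and your provider's current price list.

```python
def run_cost_usd(input_tokens: int, output_tokens: int,
                 input_rate_per_mtok: float, output_rate_per_mtok: float) -> float:
    """Token cost of one run; rates are dollars per million tokens."""
    input_cost = (input_tokens / 1_000_000) * input_rate_per_mtok
    output_cost = (output_tokens / 1_000_000) * output_rate_per_mtok
    return input_cost + output_cost


# Placeholder numbers: 90k tokens read, 6k tokens written,
# at assumed rates of 3.00 and 15.00 dollars per million tokens.
example = run_cost_usd(90_000, 6_000, 3.0, 15.0)
print(f"${example:.2f}")  # prints "$0.36"
```

Time spent waiting on pnpm test does not appear anywhere in this formula, which is why a slow suite delays the verdict without raising the bill.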

Walkthrough — wiring it into your repo

  1. Open KanBots on the repo. In Settings → Workspace → Check commands, set typecheck, tests, and lint to whatever your project uses. These override the defaults in .kanbots/config.json.
  2. Connect GitHub via the GitHub App install flow if you haven't already. Bind your repo to the project. KanBots will start receiving PR webhooks.
  3. Open a PR somewhere in the repo. Within a minute it appears as a card in the review column with the head SHA and a link.
  4. Click Dispatch. The agent starts; the live thread shows it running your check commands first, then reading the diff.
  5. When the verdict lands, the card body has a structured comment: verdict header, test/lint output in fenced blocks, blocking concerns, non-blocking notes. Read it; if you agree, copy the verdict body into a GitHub review comment with one click from the run drawer.
  6. If you want this to be automatic on every PR open, switch on the workspace autopilot rule "auto-dispatch reviewer on review column entry." Cards will get reviewed without your input.
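The structured comment from step 5 could be modeled along these lines. The Verdict class, its field names, and the exact text layout are assumptions made for illustration; only the four sections (verdict header, check output in fenced blocks, blocking concerns, non-blocking notes) come from the description above.

```python
from dataclasses import dataclass, field
from typing import Dict, List

FENCE = "`" * 3  # code-fence marker for the rendered comment


@dataclass
class Verdict:
    header: str                      # "approve", "comment", or "request changes"
    check_output: Dict[str, str]     # check name -> captured command output
    blocking: List[str] = field(default_factory=list)
    non_blocking: List[str] = field(default_factory=list)


def render(v: Verdict) -> str:
    """Render the verdict as the body of a card or review comment."""
    lines = [f"Verdict: {v.header}", ""]
    # Each check's output goes in its own fenced block.
    for name, output in v.check_output.items():
        lines += [f"{name}:", FENCE, output.rstrip(), FENCE, ""]
    lines.append("Blocking concerns:")
    lines += [f"- {item}" for item in v.blocking] or ["- none"]
    lines += ["", "Non-blocking observations:"]
    lines += [f"- {item}" for item in v.non_blocking] or ["- none"]
    return "\n".join(lines)
```

Keeping the sections in a fixed order makes the body easy to skim and easy to paste into a GitHub review comment unchanged.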

Failure modes and fixes

Your test suite is slow and the agent's verdict lags

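Symptom: pnpm test takes 12 minutes and the reviewer run is mostly shell-waiting. The token cost is still cheap, but you don't get the verdict until the suite finishes. Fix: in the reviewer persona prompt, scope the tests to what the PR changed. pnpm's --filter flag selects workspace packages, not files, so in a monorepo use pnpm --filter "...[origin/main]" test to run only the packages changed since origin/main plus their dependents. In a single-package repo, hand the changed files to your test runner's related-tests mode instead, for example pnpm exec jest --findRelatedTests $(git diff --name-only main...HEAD) if you use Jest. Or split the autopilot: a fast reviewer pass with typecheck + lint only, plus a scheduled full-suite pass via QA autopilot every hour.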
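One way to scope a test run to the changed files is to turn the output of git diff --name-only into a related-tests command. The sketch below assumes the project uses Jest (whose --findRelatedTests option takes a list of source files) and that only the listed suffixes count as source; both are assumptions, and the helper names are hypothetical.

```python
from typing import List

# Assumption: which file types are worth mapping to tests.
SOURCE_SUFFIXES = (".ts", ".tsx", ".js", ".jsx")


def changed_source_files(name_only_output: str) -> List[str]:
    """Parse `git diff --name-only` output and keep only source files."""
    files = [line.strip() for line in name_only_output.splitlines()]
    return [f for f in files if f.endswith(SOURCE_SUFFIXES)]


def related_tests_command(files: List[str]) -> List[str]:
    """Build the scoped test command; an empty list means nothing to run."""
    if not files:
        return []
    return ["pnpm", "exec", "jest", "--findRelatedTests", *files]
```

Returning an empty command for a docs-only diff lets the caller skip the test step entirely instead of falling back to the full 12-minute suite.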

The agent rubber-stamps

Symptom: every PR comes back approve and the human reviewer stops trusting the bot. Fix: the default persona prompt is too lenient on missing test coverage. Edit .kanbots/personas/reviewer.md to add "If the diff touches a code path with no test coverage, the verdict is at most comment, never approve." Re-run on a known-bad PR and recalibrate.
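The tightened rule and the recalibration step can be expressed as two small checks. This is only a sketch of the logic the prompt edit is meant to produce; in KanBots the rule lives in the persona prompt, and the function names here are hypothetical.

```python
from typing import List


def cap_for_coverage(verdict: str, touches_uncovered_code: bool) -> str:
    """Apply 'at most comment, never approve' when the diff hits untested code."""
    if touches_uncovered_code and verdict == "approve":
        return "comment"
    return verdict


def approve_rate(verdicts: List[str]) -> float:
    """Fraction of verdicts that were 'approve'; 0.0 for an empty list."""
    if not verdicts:
        return 0.0
    return sum(v == "approve" for v in verdicts) / len(verdicts)
```

When re-running the reviewer on a set of known-bad PRs, an approve_rate above zero is a signal that the persona prompt is still too lenient.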

The worktree fails to set up the repo

Symptom: the agent reports command not found: pnpm or module not found errors before reading the diff. Fix: the worktree shares the main checkout's node_modules by default (KanBots symlinks it), but if your project has post-install hooks that write to node_modules from the working directory, symlinking breaks. Add a worktree.setup command in .kanbots/config.json that runs pnpm install --frozen-lockfile inside each worktree on creation. If the error is specifically command not found: pnpm, also check that pnpm is on the PATH of the shell the agent runs in, since reinstalling dependencies will not fix a missing binary.

When agent review is the wrong tool

Three cases. First, PRs that change infrastructure — Terraform, Helm, CI configs. The agent can read the diff but it can't apply it to anything real, and the consequence of a wrong verdict is much higher than for application code. Use it as a first pass at best; require a human approver who has actually run the change in staging.

Second, security review. Crypto, auth, session handling, anything that touches a secrets path. The agent's verdict is a useful prompt for the human security reviewer but it should never be the only sign-off.

Third, "design intent" review. If half the value of a PR review is "is this the right abstraction for where we want the system to go in six months," the agent doesn't have the six-month context. It reviews what the diff is, not what the diff should have been.

Related reading: how the reviewer fits into a multi-persona feature-dev cycle, and the one-agent-one-branch-one-PR pattern the worktrees enforce.
