Claude Code Adds Multi-Agent PR Review and CI Auto-Fix

Anthropic used the London leg of Code with Claude 2026 to put Claude Code Review — the multi-agent pull-request reviewer first announced at the San Francisco keynote on May 6 — on the road to general availability for Teams and Enterprise customers. The Code Review feature dispatches a team of specialist sub-agents in parallel against every PR, each looking for a different class of issue (logic bugs, security holes, performance regressions, test gaps, style drift), and posts consolidated comments with suggested fixes. Paired with the new CI auto-fix capability that opens repair PRs against failing test runs, the combination changes the economics of code review at scale. Anthropic did not announce a new model at the conference — the entire focus was tooling, orchestration, and reliability for the Claude Code product.

What’s actually new in Claude Code Review

Three features moved meaningfully forward between the San Francisco keynote on May 6 and the London continuation on May 19-20. Claude Code Review is the headline. It is a multi-agent system that runs server-side when a developer opens a pull request in a repository the admin has enabled. A lead orchestrator agent dispatches several specialist sub-agents in parallel — each with its own prompt, tool set, and focus — and the orchestrator merges their findings into a single review with suggested code changes. The system runs as comments and review threads on the PR, not as a separate dashboard, so it lives where reviewers already work.
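
The fan-out and merge described above can be sketched in a few lines. This is a minimal illustration, not Anthropic's implementation: the `Finding` and `Specialist` types, the `mergeFindings` and `reviewPullRequest` functions, and the stub specialists are all hypothetical names, and real sub-agents would call a model rather than return fixed data.

```typescript
// Hypothetical types for illustration; none of these names come from Anthropic.
type Finding = { role: string; file: string; line: number; message: string };
type Specialist = { role: string; review: (diff: string) => Promise<Finding[]> };

// Merge per-specialist results: drop exact duplicates, then order by file and line.
function mergeFindings(results: Finding[][]): Finding[] {
  const seen = new Set<string>();
  const merged: Finding[] = [];
  for (const findings of results) {
    for (const finding of findings) {
      const key = `${finding.file}:${finding.line}:${finding.message}`;
      if (seen.has(key)) continue; // already reported by another specialist
      seen.add(key);
      merged.push(finding);
    }
  }
  return merged.sort((a, b) => a.file.localeCompare(b.file) || a.line - b.line);
}

// Fan out to every specialist concurrently. A specialist that throws contributes
// no findings instead of failing the whole review.
async function reviewPullRequest(diff: string, specialists: Specialist[]): Promise<Finding[]> {
  const results = await Promise.all(
    specialists.map((s) => s.review(diff).catch((): Finding[] => []))
  );
  return mergeFindings(results);
}

// Stub specialists standing in for model-backed sub-agents.
const stubSpecialists: Specialist[] = [
  {
    role: "security",
    review: async () => [{ role: "security", file: "db.ts", line: 12, message: "unparameterized SQL" }],
  },
  {
    role: "tests",
    review: async () => [{ role: "tests", file: "auth.ts", line: 3, message: "no test for expired token" }],
  },
  {
    role: "performance",
    review: async () => {
      throw new Error("specialist timed out");
    },
  },
];

reviewPullRequest("example diff text", stubSpecialists).then((findings) => {
  for (const f of findings) console.log(`${f.file}:${f.line} [${f.role}] ${f.message}`);
});
```

Catching failures per specialist is a design choice in this sketch: one slow or broken sub-agent degrades the review rather than blocking it.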

CI auto-fix is the second new capability. When a CI pipeline fails (failing test, build error, lint violation), Claude Code automatically attempts to diagnose the cause and produce a fix as a new pull request against the same branch. Human review is still required to merge, but the time from “test broke at 2 AM” to “candidate fix waiting in review” drops from hours to minutes. Auto-fix is opt-in per repository and works with most major CI systems via the standard GitHub Actions integration.

Third, the Advisor + Executor pattern shipped as a first-class orchestration primitive in the Claude Agent SDK. The pattern pairs Claude Opus (the more capable, slower, more expensive model) as the strategic planner with Claude Sonnet (faster, cheaper) as the tactical executor. The Opus advisor reviews the plan, intervenes on hard decisions, and lets Sonnet handle the high-volume tactical work. The result is better quality than running everything on Sonnet and lower cost than running everything on Opus.
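
To make the division of labor concrete, here is a minimal sketch of the routing idea using plain functions in place of models. It does not use the Agent SDK: `PlanStep`, `ModelFn`, `StepResult`, and `runPlan` are hypothetical names, and the `highStakes` flag is an assumed stand-in for however the real primitive decides when the advisor should intervene.

```typescript
// Hypothetical shapes for illustration; this is not the Agent SDK's API.
type PlanStep = { description: string; highStakes: boolean };
type ModelFn = (prompt: string) => string;
type StepResult = { step: string; handledBy: "advisor" | "executor"; output: string };

// Route each step: the advisor (expensive model) handles only steps flagged as
// high stakes; the executor (cheap model) handles everything else.
function runPlan(steps: PlanStep[], advisor: ModelFn, executor: ModelFn): StepResult[] {
  return steps.map((step): StepResult => {
    const model = step.highStakes ? advisor : executor;
    return {
      step: step.description,
      handledBy: step.highStakes ? "advisor" : "executor",
      output: model(step.description),
    };
  });
}

// Stub models so the sketch runs without any API calls.
const stubAdvisor: ModelFn = (prompt) => `advisor decision for: ${prompt}`;
const stubExecutor: ModelFn = (prompt) => `executor output for: ${prompt}`;

const examplePlan: PlanStep[] = [
  { description: "rename helper functions", highStakes: false },
  { description: "choose token expiry policy", highStakes: true },
  { description: "update unit tests", highStakes: false },
];

for (const result of runPlan(examplePlan, stubAdvisor, stubExecutor)) {
  console.log(`${result.handledBy}: ${result.step}`);
}
```

In this example plan, two of the three steps go to the cheaper executor, which is where the cost saving in the pattern comes from.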

Why it matters for Claude Code Review and the agent stack

  • Multi-agent code review is the first end-to-end product from a frontier lab built on agent orchestration. The pattern (an orchestrator plus specialist sub-agents running in parallel) has been studied for two years; this is its first widespread production deployment.
  • The Code Review economics matter at scale. Anthropic’s internal usage data, shared at the keynote, shows teams catching meaningful bugs that experienced human reviewers missed — and saving 15-30% of human reviewer time on routine PRs.
  • CI auto-fix changes incident response. Failing main-branch tests at 3 AM are no longer a wake-the-on-call event; they become a queue of candidate fixes to review the next morning.
  • No new model was a deliberate signal. Anthropic spent the keynote making Claude Code work better, not announcing Mythos or Opus 5. The message: orchestration and tooling are the current frontier, not raw capability.
  • Advisor + Executor is a generalizable pattern. The Advisor/Executor primitive in the Agent SDK is available for any application, not just Claude Code — expect to see it adopted broadly in 2026 agent builds.
  • Code Review’s per-PR billing is novel. Anthropic announced a separate metered billing model for Code Review (cost per PR reviewed, not per token) — a packaging shift that simplifies enterprise procurement.

How to use Claude Code Review today

Code Review is available now for Claude Teams and Enterprise customers. CI auto-fix is in expanded preview as of May 19, 2026. Here’s the setup path.

  1. Confirm your workspace is on a Claude Teams or Enterprise plan. Code Review is not available on Claude Pro or free tiers. A workspace admin can check at console.claude.ai → Settings → Plan.
  2. From the workspace console, open Claude Code → Code Review → Repositories. Connect your GitHub organization via OAuth. Anthropic supports GitHub and GitLab today; Bitbucket is in beta.
  3. Select repositories to enable. Code Review is opt-in per repository — enabling on every repo at once is not the default. Pick a moderate-traffic repo for the first rollout.
# Repo-level configuration (.claude/code-review.yaml)
version: 1
enabled: true
reviewers:
  - role: security
    focus: ["sql_injection", "xss", "secrets", "auth"]
  - role: performance
    focus: ["n_plus_1", "memory_leaks", "blocking_io"]
  - role: tests
    focus: ["coverage", "test_quality", "edge_cases"]
  - role: style
    focus: ["naming", "complexity", "dead_code"]

# Run on
triggers:
  - pull_request_opened
  - pull_request_updated

# Where to post
output:
  inline_comments: true
  summary_comment: true
  suggestions_as_diffs: true
  4. For CI auto-fix, additionally enable the integration at Settings → CI Integration. Authenticate the Claude GitHub App and grant it write access to the repository; auto-fix opens PRs, so it needs write scope.
# GitHub Actions integration (example workflow)
# .github/workflows/claude-auto-fix.yml
name: Claude CI Auto-Fix
on:
  workflow_run:
    workflows: ["CI"]
    types: [completed]
    branches: [main]

jobs:
  auto-fix:
    if: ${{ github.event.workflow_run.conclusion == 'failure' }}
    runs-on: ubuntu-latest
    steps:
      - uses: anthropics/claude-auto-fix@v1
        with:
          api-key: ${{ secrets.ANTHROPIC_API_KEY }}
          workflow-run-id: ${{ github.event.workflow_run.id }}
          fix-budget: 3
          require-human-review: true
  5. For developers writing agents on top of Claude rather than using Code Review directly, the Advisor + Executor primitive is in the Claude Agent SDK 0.5+.
// Using Advisor + Executor in the Agent SDK (TypeScript)
import { AdvisorExecutor } from '@anthropic/agent-sdk';

const orchestrator = new AdvisorExecutor({
  advisor: {
    model: 'claude-opus-4-7',
    role: 'Strategic planner. Reviews each step before execution.',
    intervene_on: ['high_stakes_decision', 'ambiguous_state']
  },
  executor: {
    model: 'claude-sonnet-4-6',
    role: 'Tactical executor. Handles routine steps under advisor guidance.'
  }
});

const result = await orchestrator.run({
  task: 'Refactor authentication module to use scoped tokens',
  tools: ['read_file', 'write_file', 'run_tests'],
  budget_usd: 5.00
});
console.log(result.summary);
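
Before committing the repository configuration from step 3, it can help to check it for obvious mistakes. The sketch below assumes the YAML has already been parsed into an object. The field names mirror the YAML shown in step 3, but the `CodeReviewConfig` type, the `validateConfig` function, and every rule it enforces (including treating version 1 as the only valid version) are assumptions for illustration, not a documented schema.

```typescript
// Field names mirror the YAML in step 3; the validation rules are assumptions.
type ReviewerConfig = { role: string; focus: string[] };
type CodeReviewConfig = {
  version: number;
  enabled: boolean;
  reviewers: ReviewerConfig[];
  triggers: string[];
};

// Return a list of human-readable problems; an empty list means the config passed.
function validateConfig(config: CodeReviewConfig): string[] {
  const problems: string[] = [];
  if (config.version !== 1) problems.push(`unsupported version: ${config.version}`);
  if (config.reviewers.length === 0) problems.push("no reviewers configured");
  const seenRoles = new Set<string>();
  for (const reviewer of config.reviewers) {
    if (seenRoles.has(reviewer.role)) problems.push(`duplicate role: ${reviewer.role}`);
    seenRoles.add(reviewer.role);
    if (reviewer.focus.length === 0) problems.push(`role ${reviewer.role} has an empty focus list`);
  }
  if (config.triggers.length === 0) problems.push("no triggers configured");
  return problems;
}

// A config with two deliberate mistakes: a repeated role and an empty focus list.
const draftConfig: CodeReviewConfig = {
  version: 1,
  enabled: true,
  reviewers: [
    { role: "security", focus: ["sql_injection", "xss"] },
    { role: "security", focus: ["secrets"] },
    { role: "style", focus: [] },
  ],
  triggers: ["pull_request_opened"],
};

for (const problem of validateConfig(draftConfig)) console.log(problem);
```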

How it compares to GitHub Copilot Code Review and other tools

| Feature | Claude Code Review | GitHub Copilot Code Review | CodeRabbit | Greptile |
|---|---|---|---|---|
| Multi-agent parallel review | Yes (orchestrator + specialists) | Single model | Single model with prompts | Single model |
| Base model | Claude Opus 4.7 + Sonnet 4.6 | GPT-5.5 / Claude Opus 4.7 | Choice of model | Claude Opus 4.7 |
| CI auto-fix | Yes (opens fix PRs) | Partial (suggestions only) | No | No |
| Inline suggestions as diffs | Yes | Yes | Yes | Limited |
| Per-repo configuration | YAML in .claude/ | Org-level + per-repo | YAML in repo | YAML in repo |
| Pricing model | Per-PR metered (Teams/Enterprise) | Per-seat ($10-19/mo) | Per-PR (with seat option) | Per-user ($15-25/mo) |
| Customizable specialist roles | Yes | No | Partial | No |
| GitHub support | Yes | Yes (native) | Yes | Yes |
| GitLab support | Yes | Yes (Premium) | Yes | Yes |

Claude Code Review’s main differentiators are the multi-agent parallel architecture and CI auto-fix. The multi-agent design produces qualitatively different reviews because each specialist agent can run with a different prompt, different tool access, and different temperature — they catch different bugs than a single-pass review would. CI auto-fix is unique today; competitors offer fix suggestions but not opened-as-PR fixes.

The trade-offs are also clear. GitHub Copilot Code Review wins on tight integration with the rest of the Copilot ecosystem (you get review + autocomplete + chat for one seat price). CodeRabbit and Greptile have longer track records and broader rule customization. For a team standardizing on Claude across other workflows (Claude Code CLI, Anthropic API), Claude Code Review is the natural choice; for teams already on Copilot, the gain of switching depends on how much CI auto-fix matters.

What’s next for Claude Code Review and the agent stack

Three things to watch through summer 2026 and into autumn. First, the auto-fix scope. Today’s auto-fix handles failing tests, broken builds, and lint violations. The next category Anthropic has signaled is dependency upgrades — automatically opening PRs that bump pinned dependency versions, run the full test suite, and report back. Dependabot-style functionality with full test verification is a real engineering improvement.

Second, the move from review to author. The same orchestration primitive that powers Code Review will eventually power feature implementation — a developer opens a ticket, the system researches, plans, codes, tests, and opens a draft PR. Anthropic showed previews of this at the keynote but did not commit to a date. Most engineers expect a public preview by Q4 2026.

Third, the broader Code with Claude ecosystem. The conference is moving from a one-time keynote to a recurring developer-relations cadence. Tokyo Code with Claude is scheduled for July 2026, and New York for September. Expect more orchestration primitives, more managed agent patterns, and a steady drumbeat of Claude Code feature additions through the rest of the year.

Frequently Asked Questions

What does Claude Code Review cost?

Anthropic announced a metered pricing model for Code Review: cost per pull request reviewed, with the exact rate depending on PR size (lines of code, files touched) and depth of review (which specialist roles are enabled). At the keynote, Anthropic said the median PR review cost was “under $1” with the standard configuration. Teams and Enterprise plans include a monthly Code Review allowance; overage is metered at per-PR rates. Exact pricing is in the Claude billing console under Code Review usage.
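
The allowance-plus-overage structure is simple arithmetic, sketched below. Every number here is hypothetical, since the article gives no rate card. The sketch also simplifies in two ways that are assumptions: it treats the allowance as a count of PRs, and it uses one flat overage rate even though the real rate is said to vary with PR size and review depth. `ReviewUsage` and `monthlyOverageCents` are illustrative names.

```typescript
// All rates and allowances here are hypothetical; the article gives no rate card.
type ReviewUsage = {
  prsReviewed: number; // PRs reviewed this month
  includedPrs: number; // PRs covered by the plan allowance (assumed to be counted in PRs)
  overageRateCents: number; // flat per-PR overage rate in cents (a simplification)
};

// Only PRs beyond the allowance are billed. Amounts stay in integer cents
// to avoid floating-point rounding.
function monthlyOverageCents(usage: ReviewUsage): number {
  const billablePrs = Math.max(0, usage.prsReviewed - usage.includedPrs);
  return billablePrs * usage.overageRateCents;
}

const exampleMonth: ReviewUsage = { prsReviewed: 1200, includedPrs: 1000, overageRateCents: 80 };
console.log(`Overage: $${(monthlyOverageCents(exampleMonth) / 100).toFixed(2)}`);
```

With these made-up inputs, 200 PRs fall outside the allowance, so the overage is 200 × $0.80 = $160.00. A month that stays under the allowance produces no overage at all.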

Does Claude Code Review replace human code review?

No. Anthropic is explicit that Code Review is designed to augment human review, not replace it. The system catches a class of issues (security holes, performance traps, missing tests, style drift) that experienced reviewers often skim past, while humans remain the decision-makers for architecture, design choices, and business logic. The product positioning is “make every PR faster and more thorough” — not “no humans needed.”

Can Code Review run on private code without sending it to Anthropic?

No. Code Review runs as a managed service in Anthropic’s infrastructure. PR contents are sent to Anthropic’s API the same way any Claude API call would be. The standard Claude Teams/Enterprise terms apply: data is not used for training; access is logged; SOC 2 and ISO 27001 controls are in place. For organizations that cannot send code outside their environment, Code Review is not a fit. The Claude Code CLI (running against your own Anthropic API account) is the closest alternative, but it still sends prompts to Anthropic’s API and does not have the multi-agent review orchestration.

How does CI auto-fix decide when to attempt a fix?

CI auto-fix triggers on failed workflows that match configured patterns. By default, it attempts fixes on test failures, build errors, and common lint violations. It does not attempt fixes for failures it can’t reliably diagnose (flaky tests, infrastructure failures, network issues). The fix-budget parameter caps how many fix attempts run per PR before giving up; the default is three. Auto-fix opens its fix as a new draft PR against the same branch with a clear “Generated by Claude” label, never merging directly.
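
The decision described above reduces to two checks: is the failure a kind worth attempting, and is there budget left? The sketch below is an assumption-laden illustration, not the product's logic. The `FailureKind` categories are taken from the lists in the answer above, but how the real system classifies a failure is not described, and `isFixable` and `shouldAttemptFix` are hypothetical names.

```typescript
// Hypothetical failure categories, based on the ones listed in the answer above.
type FailureKind = "test" | "build" | "lint" | "flaky_test" | "infrastructure" | "network";

// Failures that auto-fix attempts by default, per the article.
function isFixable(kind: FailureKind): boolean {
  switch (kind) {
    case "test":
    case "build":
    case "lint":
      return true;
    default:
      return false; // flaky tests, infrastructure and network failures are skipped
  }
}

// Attempt a fix only for fixable failures and only while budget remains.
// The default budget of 3 matches the fix-budget default the article states.
function shouldAttemptFix(kind: FailureKind, attemptsSoFar: number, fixBudget = 3): boolean {
  return isFixable(kind) && attemptsSoFar < fixBudget;
}

console.log(shouldAttemptFix("test", 0)); // first attempt on a failing test
console.log(shouldAttemptFix("test", 3)); // budget of three already used
console.log(shouldAttemptFix("network", 0)); // not a failure auto-fix attempts
```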

What is the Advisor + Executor pattern and when should I use it?

Advisor + Executor pairs a more capable, expensive model (Claude Opus) with a faster, cheaper model (Claude Sonnet). The Advisor reviews each step before execution, intervenes on hard decisions, and otherwise lets the Executor handle the routine work. The pattern produces better quality than running everything on Sonnet alone, at significantly lower cost than running everything on Opus. Use it for any agent task that has a mix of high-stakes decisions and routine execution — most production agent workloads fit this shape.

How does this compare to OpenAI Codex Cloud or Devin?

Claude Code Review and Codex Cloud / Devin solve different problems. Codex Cloud and Devin focus on autonomous coding — given a task, produce a working PR. Claude Code Review focuses on reviewing PRs after a human has authored them. There is overlap on the eventual roadmap (both will move into authoring + review), but as of May 2026 they are complementary tools. Many teams use Devin or Codex for new feature implementation and Claude Code Review for review on all PRs, including those Devin produces.
