Debate


Multi-agent debate orchestrator. Spawn 2-5 dynamic agents with unique characters, run structured debates, and produce judge-verified decisions. Three modes for different goals: select the best option, synthesize strategy, or find all weaknesses.

✏️

CHALLENGE

Generate or receive variants, debate to select the best one. Defenders argue FOR, critics attack. Outcome: selected variant with justification.

STRATEGY

Deep analysis — each agent independently forms their approach, then debate to converge. Outcome: synthesized strategy or ranked approaches.

🔍

CRITIC

Find all weaknesses and risks in a given solution, plan, or code. All agents are critics with different perspectives. Outcome: prioritized issue list.

Quick Reference

| Field | Value |
| --- | --- |
| Command | /brewtools:debate |
| Arguments | [topic] [-m challenge\|strategy\|critic] [-n 2-5] [-r max-rounds] [--review] |
| Modes | Challenge (default), Strategy, Critic |
| Agents | 2-5 dynamic agents + secretary + judge (main session) |
| Rounds | 1-10 (default: 5) |
| Log format | JSONL — one entry per debate turn |
| Model | opus |
| Output | decisions.md, summary.md, debate-log.jsonl |

Quick Start

# Challenge mode -- compare options (default)
/brewtools:debate "React vs Vue vs Svelte for our new dashboard"

# Strategy mode -- deep analysis
/brewtools:debate "Migration plan from monolith to microservices" -m strategy

# Critic mode -- find all weaknesses
/brewtools:debate "Review our authentication flow" -m critic

# Custom agent count and max rounds
/brewtools:debate "Kubernetes vs ECS" -n 4 -r 8

# Auto-detect mode from keywords
/brewtools:debate "Which database should we pick for time-series data"

Discovery Phase

Every debate begins with evidence, not opinions. Before any argument starts, 2-3 research agents run in parallel to build a shared evidence base. This makes decisions grounded in real project code and current industry knowledge, not hallucinated reasoning.

🔍

Codebase Explorer

Searches your project for relevant code, patterns, dependencies, and existing conventions. Finds what already exists before debating what should exist.

🔗

Web Researcher

Searches the internet for current best practices, official documentation, benchmarks, and community discussions. Brings external context into the debate.

📄

Evidence Aggregation

Combines all findings into discovery.md with full sources — file paths, URLs, code snippets. Every debate agent receives this as context.

Caution

Evidence mandate: Every debate argument MUST reference findings from Discovery. Unsourced claims are challenged by the judge and do not count toward consensus.

Workflow

  1. Phase 0 — Validation

    Run validate.sh to check all skill files exist. Load archetypes into context. Stop on failure.

  2. Phase 1-2 — Parse and Init

    Parse arguments: topic, mode (-m), agent count (-n), max rounds (-r). Auto-detect mode from keywords if not specified. Create report directory and empty JSONL log.

  3. Phase 3 — User Interview

    Confirm or adjust: mode, agent count, agent profiles (auto or custom), max rounds. Interactive via AskUserQuestion.

  4. Phase 4 — Agent Generation

    Detect domain, assign roles (mode-aware), select character archetypes. Display full agent table: name, role, character, perspective, rationale. User confirms or adjusts.

  5. Phase 4.5 — Discovery (Research)

    2-3 research agents run in parallel: Codebase Explorer searches the project for relevant code, patterns, and dependencies; Web Researcher(s) search the internet for current best practices, official docs, and community discussions. All findings are documented with sources (file paths, URLs) in discovery.md. This evidence base is injected into every debate agent’s context.

  6. Phase 5 — Debate

    Execute mode-specific flow. Every argument must cite Discovery evidence — unsourced claims are challenged by the judge. Each agent sees previous entries via JSONL log. Judge monitors for consensus, stalemate, or max rounds. Sequential execution ensures coherent argumentation.

  7. Phase 6 — Summary

    Secretary agent writes summary.md with key arguments, turning points, and areas of agreement/disagreement.

  8. Phase 7 — Decision

    Judge (main session) writes decisions.md with final verdict, reasoning, and minority opinions.

  9. Phase 8 — Final Output

    Status table, outcome (consensus / partial / none), decision bullets, links to all artifacts. Optional /brewcode:review if the --review flag was set.
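The shared JSONL log written during Phase 5 is append-only, one entry per turn, and the skill enforces its schema via Bash scripts. A minimal Python sketch of the same schema-checked append (illustrative only; the function name is hypothetical and this is not the tool's actual implementation):

```python
import datetime
import json

# Entry types and required fields taken from the log format documented on this page.
ENTRY_TYPES = {"argument", "counter", "proposal", "agree", "question", "redirect"}
REQUIRED = ("ts", "from", "to", "what", "why", "type", "mode")

def append_turn(log_path, entry):
    """Validate a debate turn against the log schema, then append one JSONL line."""
    entry.setdefault("ts", datetime.datetime.now().isoformat(timespec="seconds"))
    missing = [k for k in REQUIRED if k not in entry]
    if missing or entry["type"] not in ENTRY_TYPES:
        raise ValueError(f"invalid entry: missing={missing}, type={entry.get('type')}")
    with open(log_path, "a") as f:
        f.write(json.dumps(entry) + "\n")
```

An append-only, one-object-per-line log means each agent can be handed the file verbatim as context, and a malformed turn fails loudly instead of silently corrupting the shared state.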

Modes Deep Dive

Challenge

Goal: Select the best variant from multiple options.

Agent Roles: Defenders argue FOR their assigned variant(s); Critics attack all variants, probing weaknesses. Split (defenders/critics): 2 agents = 1/1, 3 = 1/2, 4 = 2/2, 5 = 2/3.

Flow:

  1. After Discovery, defenders present variants with evidence-backed arguments
  2. Critics challenge each variant citing counter-evidence
  3. Multiple rounds until judge detects consensus or max rounds reached

Best for: Technology choices, architecture decisions, comparing approaches, “A vs B” questions.
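The defender/critic split for Challenge mode follows a simple rule: defenders get the floor of half the agents, and critics take the extra seat on odd counts. A sketch (the function name is hypothetical):

```python
def challenge_split(n: int) -> tuple[int, int]:
    """Return (defenders, critics) for n agents; critics take the extra seat on odd counts."""
    if not 2 <= n <= 5:
        raise ValueError("agent count must be 2-5")
    defenders = n // 2
    return defenders, n - defenders

# challenge_split(2) -> (1, 1); challenge_split(5) -> (2, 3)
```

Giving critics the majority on odd counts biases the debate toward stress-testing rather than advocacy.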

Strategy

Goal: Synthesize the strongest approach from independent proposals.

Agent Roles: All agents are Strategists with different archetypes — no defenders or critics.

Flow:

  1. After Discovery, each agent formulates their approach independently (parallel thinking)
  2. Judge picks opening order — most divergent proposals go first
  3. Agents discuss, challenge, and synthesize, citing sources
  4. Goal is convergence, not winner selection

Best for: Migration plans, system design, strategic decisions, “how should we approach X” questions.

Critic

Goal: Find every weakness, risk, and flaw in a given solution.

Agent Roles: All agents are Critics with different perspectives (operational, security, UX, financial, architectural). No defender — the target IS the document/plan/code.

Flow:

  1. After Discovery, each critic analyzes from their unique angle, citing evidence
  2. Rounds build on previous findings
  3. Judge consolidates into prioritized issue list with severity ratings

Best for: Code review, architecture review, plan validation, risk assessment, security audit.
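The judge's consolidated output in Critic mode can be thought of as issue records sorted by severity. A sketch under assumptions: the record shape, severity scale, and example issues below are hypothetical, not the tool's actual schema.

```python
# Hypothetical severity scale; lower rank sorts first.
SEVERITY_RANK = {"critical": 0, "high": 1, "medium": 2, "low": 3}

def prioritize(issues):
    """Sort issue records so the most severe appear first (stable within a tier)."""
    return sorted(issues, key=lambda issue: SEVERITY_RANK[issue["severity"]])

# Hypothetical findings, as critics might raise them in an auth review.
issues = [
    {"title": "Token never expires", "severity": "critical"},
    {"title": "Verbose error messages", "severity": "low"},
    {"title": "No rate limiting", "severity": "high"},
]
# prioritize(issues)[0]["title"] == "Token never expires"
```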

Agent Archetypes

10 character archetypes define HOW an agent argues. Combined with a role (defender, critic, or strategist), each archetype yields a unique debate persona.

Pragmatist

Results-oriented, impatient with theory. “What actually works in production?” Cites real-world outcomes, dismisses hypotheticals.

Visionary

Big-picture, future-oriented. “Where is this heading in 5 years?” Argues from trends, first principles, emerging patterns.

🔍

Skeptic

Cautious, evidence-demanding. “Show me the data.” Finds edge cases, stress-tests assumptions, demands proof.

✏️

Architect

Systematic, pattern-focused. “How does this fit the bigger system?” Argues from design principles, consistency, modularity.

🔧

Operator

Reliability-focused, operations-minded. “Who maintains this at 3 AM?” Argues from operational reality and incident response.

❤️

Advocate

User-centric, empathetic. “What does the end user experience?” Argues from UX, adoption, accessibility, learning curve.

💰

Economist

Cost-conscious, ROI-focused. “What’s the total cost of ownership?” Argues with numbers, trade-off matrices, opportunity cost.

📚

Historian

Precedent-aware, pattern-matching. “We tried this in 2019 and it failed because…” Argues from past failures and case studies.

🔥

Provocateur

Contrarian, challenges consensus. “What if we’re solving the wrong problem?” Reframes the debate, asks uncomfortable questions.

🤝

Diplomat

Consensus-seeking, synthesizing. “I hear both sides — what if we combine…” Finds common ground, proposes compromises.

Tip

Archetypes are auto-selected to create productive tension. Challenge mode pairs contrasting styles (e.g., Pragmatist defender vs Visionary critic). Avoid pairing archetypes that argue identically.

Configuration

| Flag | Default | Description |
| --- | --- | --- |
| -m | auto-detect | Mode: challenge, strategy, critic |
| -n | 3 | Agent count: 2-5 |
| -r | 5 | Max debate rounds: 1-10 |
| --review | off | Run /brewcode:review on final output |
| (positional) | | Topic text or file path |

Auto-detect mode when -m is omitted:

| Keywords in topic | Detected mode |
| --- | --- |
| compare, choose, select, best, vs, versus, pick, which | challenge |
| strategy, approach, plan, how to, design, architecture | strategy |
| critique, weakness, risk, flaw, review, audit, problem | critic |
| (no match) | challenge (default) |
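The keyword-based auto-detection could be implemented roughly as below. This is a sketch, not the tool's actual code: the matching order (challenge checked first), word-boundary handling, and the special case for the two-word phrase "how to" are all assumptions.

```python
import re

# Keyword lists from the auto-detect table above.
CHALLENGE_KW = {"compare", "choose", "select", "best", "vs", "versus", "pick", "which"}
STRATEGY_KW = {"strategy", "approach", "plan", "design", "architecture"}
CRITIC_KW = {"critique", "weakness", "risk", "flaw", "review", "audit", "problem"}

def detect_mode(topic: str) -> str:
    """Pick a debate mode from topic keywords; fall back to challenge."""
    words = set(re.findall(r"[a-z]+", topic.lower()))
    if words & CHALLENGE_KW:
        return "challenge"
    if words & STRATEGY_KW or "how to" in topic.lower():
        return "strategy"
    if words & CRITIC_KW:
        return "critic"
    return "challenge"  # default when nothing matches
```

Matching whole words (rather than substrings) avoids false positives such as "vs" inside "observability".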

Output Format

All artifacts are written to .claude/reports/{TS}_debate/, where {TS} is the run timestamp.

JSONL Log

Each debate turn is a single line in debate-log.jsonl:

{"ts":"2026-04-05T14:30:00","from":"agent-1","to":["agent-2"],"what":"React has larger ecosystem","why":"More packages, hiring pool, community support","type":"argument","mode":"challenge"}
{"ts":"2026-04-05T14:30:15","from":"agent-2","to":["agent-1"],"what":"Ecosystem size != quality","why":"Vue's curated ecosystem avoids dependency hell","type":"counter","mode":"challenge"}
{"ts":"2026-04-05T14:31:00","from":"agent-3","to":["all"],"what":"Both miss the point","why":"Developer experience and bundle size matter more for our team size","type":"redirect","mode":"challenge"}

Entry types: argument, counter, proposal, agree, question, redirect
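Because each turn is a single JSON object per line, the log is easy to script against. For example, tallying turns by entry type (an illustrative helper; the function name is not part of the tool):

```python
import json
from collections import Counter

def summarize_log(path: str) -> Counter:
    """Count debate turns by entry type in a debate-log.jsonl file."""
    counts = Counter()
    with open(path) as f:
        for line in f:
            if line.strip():  # skip blank lines defensively
                counts[json.loads(line)["type"]] += 1
    return counts
```

Run against the three-line example above, this would report one argument, one counter, and one redirect.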

Artifacts

Each debate produces four files:

| File | Content |
| --- | --- |
| discovery.md | Research findings with sources: file paths, URLs, code snippets, best practices |
| decisions.md | Judge verdict, reasoning, minority opinions |
| summary.md | Secretary summary: key arguments, turning points, agreement areas |
| debate-log.jsonl | Full debate log, one entry per turn |

Examples

# Compare frontend frameworks for a new project
/brewtools:debate "React vs Vue vs Svelte for our dashboard" -n 3
# 3 agents: 1 defender (Pragmatist), 2 critics (Skeptic, Visionary)
# Output: selected framework with full justification

# Strategy for migration
/brewtools:debate "How to migrate from REST to GraphQL" -m strategy -n 4
# 4 strategists: Architect, Operator, Economist, Pragmatist
# Each proposes independently, then debate to converge

# Critic mode on authentication flow
/brewtools:debate "Critique our JWT auth implementation in auth/" -m critic -n 5
# 5 critics: Skeptic (edge cases), Operator (ops risks),
# Architect (design flaws), Advocate (UX), Economist (cost)

# Database selection with extended rounds
/brewtools:debate "PostgreSQL vs DynamoDB for our event store" -n 4 -r 8
# 4 agents, up to 8 rounds for thorough analysis
# Auto-detects challenge mode from "vs" keyword

# Find weaknesses in a deployment plan
/brewtools:debate "Review our Kubernetes migration plan" -m critic --review
# All-critic team reviews the plan
# --review flag triggers /brewcode:review on output

Note

Tips for effective debates:

  • Be specific in the topic — “React vs Vue for our 5-person team building a real-time dashboard” produces better arguments than “React vs Vue”
  • Use 3 agents for focused debates, 4-5 for complex multi-faceted topics
  • Strategy mode works best when the problem space is open-ended
  • Critic mode shines when you already have a solution and want to stress-test it
  • The judge (main session) can intervene mid-debate to redirect if agents go off-topic

Note

Design decisions:

  • Agent tool for dynamic prompts (no pre-defined agent files)
  • Sequential execution (each agent sees previous entries)
  • JSONL shared state (minimal tokens, scriptable)
  • Bash scripts for I/O (schema enforcement)
  • Judge as main session (full context, no overhead)
  • Mode-specific flows (token efficiency)

🚀

Latest Release

Download, changelog, and installation instructions.

🔗

View on GitHub

Source code, README, and configuration files.

Updating plugins

Use /brewtools:plugin-update to check and update the brewcode plugin suite in one command. See the FAQ for details.