Tester

Caution

Flaky tests, regressions, and mysterious CI failures cost hours of developer time. When tests start failing after a refactor, the root cause is rarely obvious — stack traces mislead, shared state hides bugs, and sleep-based waits mask timing issues. The tester agent exists to handle exactly this class of problem without pulling the developer away from production code.

Tip

Tester runs tests, diagnoses failures, and fixes test-level issues autonomously. Test code, assertion logic, and flakiness are fixed in place; production bugs are routed to developer with a precise report. Trigger with: “run tests”, “tests failing”, “flaky test”, “debug test”, “test coverage”.

Quick reference

Field | Value
Model | sonnet
Tools | Read, Write, Edit, Glob, Grep, Bash, Task
Triggers | “run tests”, “tests failing”, “flaky test”, “debug test”, “test coverage”
Scope YES | Run tests, analyze failures, debug flaky, fix test code, configure test runs, report issues
Scope NO | Fix production code (→ developer), substantial test rewrites (→ developer)

When to use

  • Post-refactor validation — tests broke after a code change, need failure triage before developer continues
  • Flaky test elimination — tests that pass sometimes and fail other times, timing or state issues
  • CI failure investigation — build is red, need a structured failure report with root cause
  • Coverage gaps — need to understand what’s covered and what’s missing before a release
  • Framework migration — existing tests need to run correctly under a new test runner or config

Examples

"Run the tests, several are failing after my refactor"
"This test passes sometimes and fails other times — debug it"
"Check test coverage before we release"

Flow

  1. Pre-analysis

    Reads all rules (.claude/rules/*-best-practice.md, .claude/rules/*-avoid.md) and CLAUDE.md for test commands, frameworks, and coverage requirements. Analyzes existing test patterns before touching anything.

  2. Stack detection

    Identifies test framework from project indicators: jest.config.* → Jest, pytest.ini/conftest.py → pytest, *Test.java/pom.xml → JUnit, *_test.go → go test, *.spec.ts → Jasmine/Mocha, Cargo.toml + #[test] → Rust. Also determines test level: unit, integration, E2E, or component.

  3. Execute and capture

    Runs the test command from project config. Captures full output — pass/fail counts, stack traces, timing. No guessing at commands; uses the project-defined entrypoint.

  4. Analyze failures

    Reads stack traces bottom-up, compares expected vs actual values. Categorizes each failure: TEST BUG (fix here), PRODUCTION BUG (route to developer), ENVIRONMENT issue, or FLAKY (fix here). Applies GIVEN/WHEN/THEN structure to new or modified tests.

  5. Fix in scope

    Fixes test-level issues directly: flaky timing, shared state, over-mocking, conditional assertions, sleep-based waits. Enforces quality rules — unit tests under 100ms, integration under 5s, E2E under 30s.

  6. Report

    Produces a structured report with scope, command, duration, summary counts, failure details with root cause and fix suggestions, flaky list, coverage numbers, and next steps. Production bugs go to developer with a precise reproduction path.
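
As a rough illustration of step 2 (stack detection), the indicator-to-framework mapping could be expressed as a small lookup over file names. This is a sketch, not the agent's actual logic: the names `INDICATORS` and `detect_frameworks` are hypothetical, and the Rust rule (Cargo.toml plus `#[test]`) is left out because it needs a look inside source files rather than a name match.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Indicator patterns copied from step 2; the list structure is an assumption.
INDICATORS = [
    ("jest.config.*", "Jest"),
    ("pytest.ini", "pytest"),
    ("conftest.py", "pytest"),
    ("*Test.java", "JUnit"),
    ("pom.xml", "JUnit"),
    ("*_test.go", "go test"),
    ("*.spec.ts", "Jasmine/Mocha"),
]


def detect_frameworks(file_names):
    """Return frameworks whose indicator matches any file name, in first-seen order."""
    found = []
    for name in file_names:
        base = PurePosixPath(name).name  # match on the file name, not the directory
        for pattern, framework in INDICATORS:
            if fnmatchcase(base, pattern) and framework not in found:
                found.append(framework)
    return found
```

A project containing both `jest.config.js` and `*.spec.ts` files would report two candidates here; how the agent resolves such overlaps is not specified in this page.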

Internals

Output format

=== TEST EXECUTION REPORT ===
Scope: [level] | Command: [cmd] | Duration: [time]
SUMMARY: ✅ Passed: X | ❌ Failed: Y | Skipped: Z

FAILURES (→ DEVELOPER):
1. [Test#method] File: [path:line]
   Error: [msg] | Expected: [x] | Actual: [y]
   Root cause: [analysis] | Fix: [suggestion]

FLAKY (I will fix): [list]
COVERAGE: Line [%] | Branch [%]
NEXT: Developer fixes [list] → Re-run
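
To make the template above concrete, here is a minimal sketch of how its header, summary, failure, and flaky lines might be filled in. The `Failure` and `Report` classes and the `render` function are hypothetical names for illustration only; the coverage and next-step lines are omitted for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class Failure:
    test: str
    location: str  # "path:line"
    error: str
    expected: str
    actual: str
    root_cause: str
    fix: str


@dataclass
class Report:
    scope: str
    command: str
    duration: str
    passed: int
    failed: int
    skipped: int
    failures: list = field(default_factory=list)
    flaky: list = field(default_factory=list)


def render(report):
    """Render the report fields in the layout of the template."""
    lines = [
        "=== TEST EXECUTION REPORT ===",
        f"Scope: {report.scope} | Command: {report.command} | Duration: {report.duration}",
        f"SUMMARY: ✅ Passed: {report.passed} | ❌ Failed: {report.failed} | Skipped: {report.skipped}",
        "",
        "FAILURES (→ DEVELOPER):",
    ]
    for i, f in enumerate(report.failures, start=1):
        lines.append(f"{i}. {f.test} File: {f.location}")
        lines.append(f"   Error: {f.error} | Expected: {f.expected} | Actual: {f.actual}")
        lines.append(f"   Root cause: {f.root_cause} | Fix: {f.fix}")
    lines.append("")
    lines.append(f"FLAKY (I will fix): {', '.join(report.flaky)}")
    return "\n".join(lines)
```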

Anti-patterns the agent fixes

Pattern | Fix
Flaky tests | Add proper waits, remove timing dependencies
Shared state | Reset before each test
Over-mocking | Use real objects where practical
Conditional assertions | Assert preconditions first
Sleep-based waits | Use polling/async utilities
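
The sleep-based-wait fix in the table above can be sketched as a small polling helper. This is one possible shape, not the utility the agent necessarily uses: `wait_until` and its parameters are hypothetical names. The idea is to replace a fixed `time.sleep(2)` with a loop that returns as soon as the condition holds and fails loudly on timeout.

```python
import time


def wait_until(condition, timeout=2.0, interval=0.01):
    """Poll `condition` until it returns a truthy value or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout}s")
        time.sleep(interval)  # short pause between checks, not a fixed wait
```

In a test, `wait_until(lambda: job.is_done())` (with `job` being whatever object the test observes) finishes as soon as the work completes, so the test is both faster and less sensitive to machine load than a fixed sleep.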

Test quality rules enforced

Aspect | Rule
Names | Describe behavior clearly
Structure | Arrange/Act/Assert or GIVEN/WHEN/THEN
Assertions | Single focus, concrete values, with description
Speed | Unit <100ms, integration <5s, E2E <30s
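
A short sketch of what these rules might look like in practice, assuming a pytest-style test. Everything named here is hypothetical: `apply_discount` stands in for production code, and `over_budget` is just one way to check recorded durations against the speed limits.

```python
# Speed budgets from the rules above, in seconds.
BUDGET_SECONDS = {"unit": 0.1, "integration": 5.0, "e2e": 30.0}


def over_budget(durations, level):
    """Return names of tests whose duration is not under the budget for `level`."""
    limit = BUDGET_SECONDS[level]
    return sorted(name for name, seconds in durations.items() if seconds >= limit)


def apply_discount(total, percent):
    """Hypothetical production function used only for this example."""
    return round(total * (1 - percent / 100), 2)


def test_ten_percent_discount_reduces_total_by_one_tenth():
    # GIVEN an order total
    total = 200.0
    # WHEN a 10% discount is applied
    discounted = apply_discount(total, percent=10)
    # THEN the total drops by exactly one tenth (single focus, concrete value)
    assert discounted == 180.0, "10% off 200.00 should be 180.00"
```

The test name states the behavior, the body follows GIVEN/WHEN/THEN, and the single assertion compares against a concrete value with a description.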

Scope boundary

Tester fixes test code and configuration. Any failure that requires changing production code is handed off to developer with a report including test name, file path, expected vs actual values, and root cause analysis.
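
The boundary can be pictured as a routing rule over the four failure categories used during analysis. This is a minimal sketch with hypothetical names (`Category`, `route`, `handoff`); the page does not say who owns ENVIRONMENT issues, so the sketch assumes they are only reported.

```python
from enum import Enum


class Category(Enum):
    TEST_BUG = "TEST BUG"
    PRODUCTION_BUG = "PRODUCTION BUG"
    ENVIRONMENT = "ENVIRONMENT"
    FLAKY = "FLAKY"


# Categories that tester fixes in place.
FIXED_BY_TESTER = {Category.TEST_BUG, Category.FLAKY}


def route(category):
    """Return who acts on a failure of the given category."""
    if category in FIXED_BY_TESTER:
        return "tester"
    if category is Category.PRODUCTION_BUG:
        return "developer"
    return "report"  # assumption: environment issues are surfaced, not fixed


def handoff(test_name, file_path, expected, actual, root_cause):
    """Bundle the fields a developer handoff report includes."""
    return {
        "test_name": test_name,
        "file_path": file_path,
        "expected": expected,
        "actual": actual,
        "root_cause": root_cause,
    }
```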

Developer

Receives production bug reports from tester. Implements fixes, features, and refactors.

Reviewer

Architecture, security, and performance review after tests pass.

GitHub source

Agent definition, role boundaries, and output format specification.

Brewcode overview

All brewcode agents and skills — infinite task execution, quorum reviews, session handoff.

Updating plugins

Use /brewtools:plugin-update to check and update the brewcode plugin suite in one command. See the FAQ for details.