glm-zai-specialist

sonnet auto-delegated

Caution

The Z.ai API uses OpenAI-compatible schema but its own endpoint and key. Mixing up ZAI_API_KEY with OPENROUTER_API_KEY, or pointing at the wrong base URL, produces 401 errors with no helpful message. The agent always validates the environment before sending a request.

Tip

This agent is auto-delegated by /brewui:glm-design-to-code when Z.ai is the selected provider (the default). You can also invoke it directly for ad-hoc API calls, rate-limit debugging, or model selection questions.

Quick reference

FieldValue
Agent nameglm-zai-specialist
Modelsonnet
ToolsRead, Write, Edit, Bash, Glob, Grep
Triggers”zai api”, “glm request”, “z.ai”, “send to glm”, “glm vision”, “glm model”, “design to code api”, “glm-5v”, “glm-4.6v”
Endpointhttps://api.z.ai/api/paas/v4/chat/completions
Fallbackhttps://openrouter.ai/api/v1/chat/completions

When to use

  • Vision request — send a screenshot or image to a GLM vision model and get structured output (HTML, code, JSON)
  • Model selection — pick the right GLM tier (free flash vs paid turbo) for your budget and quality target
  • Rate limit debugging — diagnose and recover from 429 errors with backoff or provider switch
  • Response parsing — extract ===FILE: ...=== blocks from multi-file GLM output
  • Pipeline troubleshooting — validate ZAI_API_KEY, jq, base64, and pipeline scripts exist before any request

Examples

"Send this screenshot to GLM for design-to-code conversion"
"GLM is returning 429 errors, fix the request"
"Which GLM model should I use for a free iteration round?"

Flow

  1. Validate prerequisites

    Checks ZAI_API_KEY (or OPENROUTER_API_KEY for fallback), jq, base64, curl, and pipeline scripts under $BU_PLUGIN_ROOT/skills/glm-design-to-code/scripts/. Stops with an explicit error if anything is missing — never silently sends a broken request.

  2. Select model

    Free dev/test: glm-4.6v-flash. Budget production: glm-4.6v. Max quality: glm-5v-turbo. Vision models required for image input — text-only models (glm-4.5-flash, glm-5-turbo) are rejected for image payloads.

  3. Build payload

    Encodes the image as base64, wraps it in the OpenAI-compatible content array, merges the system prompt and context file, and sets max_tokens. Uses glm-build-request.sh or constructs manually with jq for non-standard inputs.

  4. Send request

    Calls glm-request.sh which adds —retry 3 —retry-delay 5 and streams the response to a file. Checks HTTP status, logs token usage (prompt / completion / reasoning), and reports estimated cost.

  5. Parse response

    Extracts choices[0].message.content. Checks finish_reasonstop is complete, length means truncated (increases max_tokens or splits the task). If the response uses ===FILE: path=== markers, runs glm-extract.sh to write each file to the output directory.

  6. Report

    Summarises: model used, tokens consumed, estimated cost, list of extracted files, and next step (e.g., open Playwright verification or hand off to tester).

Z.ai model matrix & rate limits
ModelVisionInput $/1MOutput $/1MContextNotes
glm-5v-turboimage + video$1.20$4.00202KBest quality, CogViT
glm-4.6v-flashimageFREEFREE131KFree dev/test
glm-4.6vimage + video$0.30$0.90131KBudget production
glm-5-turbotext only$1.20$4.00202KText flagship
glm-4.7-flashtext onlyFREEFREE202KFree text
glm-4.5-flashtext onlyFREEFREE131KFree text

Rate limit handling:

ScenarioSolution
Free tier 429glm-request.sh retries: 5 s → 10 s → 20 s
Persistent 429Switch to paid model or wait 60 s
Image too largeResize to 1024 px max side, reduce JPEG quality

Common errors:

ErrorCauseFix
401 UnauthorizedInvalid or missing API keyCheck ZAI_API_KEY / OPENROUTER_API_KEY
429 Too Many RequestsFree-tier rate limitRetry with delay or use paid tier
400 Bad RequestMalformed payloadjq empty payload.json
Empty contentReasoning-only responseRead reasoning_content field instead
finish_reason: lengthOutput truncatedIncrease max_tokens (up to 131072)
🤖

GLM OpenRouter Specialist

Same capabilities routed via OpenRouter — use as a fallback when Z.ai is rate-limited or unavailable.

🧩

glm-design-to-code

The skill that orchestrates this agent — screenshot or URL to multi-framework code in one command.

🔗

GitHub source

Agent definition, pipeline scripts, and prompt templates.

📄

Brewui overview

All brewui skills and agents — image generation, design-to-code, GLM providers.

Updating plugins

Use /brewtools:plugin-update to check and update the brewcode plugin suite in one command. See the FAQ for details.