glm-zai-specialist

sonnet auto-delegated

Caution

The Z.ai API uses OpenAI-compatible schema but its own endpoint and key. Mixing up ZAI_API_KEY with OPENROUTER_API_KEY, or pointing at the wrong base URL, produces 401 errors with no helpful message. The agent always validates the environment before sending a request.

Tip

This agent is auto-delegated by /brewui:glm-design-to-code when Z.ai is the selected provider (the default). You can also invoke it directly for ad-hoc API calls, rate-limit debugging, or model selection questions.

Quick reference

Field	Value
Agent name	`glm-zai-specialist`
Model	sonnet
Tools	Read, Write, Edit, Bash, Glob, Grep
Triggers	”zai api”, “glm request”, “z.ai”, “send to glm”, “glm vision”, “glm model”, “design to code api”, “glm-5v”, “glm-4.6v”
Endpoint	`https://api.z.ai/api/paas/v4/chat/completions`
Fallback	`https://openrouter.ai/api/v1/chat/completions`

When to use

Vision request — send a screenshot or image to a GLM vision model and get structured output (HTML, code, JSON)
Model selection — pick the right GLM tier (free flash vs paid turbo) for your budget and quality target
Rate limit debugging — diagnose and recover from 429 errors with backoff or provider switch
Response parsing — extract ===FILE: ...=== blocks from multi-file GLM output
Pipeline troubleshooting — validate ZAI_API_KEY, jq, base64, and pipeline scripts exist before any request

Examples

"Send this screenshot to GLM for design-to-code conversion"

"GLM is returning 429 errors, fix the request"

"Which GLM model should I use for a free iteration round?"

Flow

Validate prerequisites
Checks ZAI_API_KEY (or OPENROUTER_API_KEY for fallback), jq, base64, curl, and pipeline scripts under $BU_PLUGIN_ROOT/skills/glm-design-to-code/scripts/. Stops with an explicit error if anything is missing — never silently sends a broken request.
Select model
Free dev/test: glm-4.6v-flash. Budget production: glm-4.6v. Max quality: glm-5v-turbo. Vision models required for image input — text-only models (glm-4.5-flash, glm-5-turbo) are rejected for image payloads.
Build payload
Encodes the image as base64, wraps it in the OpenAI-compatible content array, merges the system prompt and context file, and sets max_tokens. Uses glm-build-request.sh or constructs manually with jq for non-standard inputs.
Send request
Calls glm-request.sh which adds —retry 3 —retry-delay 5 and streams the response to a file. Checks HTTP status, logs token usage (prompt / completion / reasoning), and reports estimated cost.
Parse response
Extracts choices[0].message.content. Checks finish_reason — stop is complete, length means truncated (increases max_tokens or splits the task). If the response uses ===FILE: path=== markers, runs glm-extract.sh to write each file to the output directory.
Report
Summarises: model used, tokens consumed, estimated cost, list of extracted files, and next step (e.g., open Playwright verification or hand off to tester).

Z.ai model matrix & rate limits

Model	Vision	Input $/1M	Output $/1M	Context	Notes
`glm-5v-turbo`	image + video	$1.20	$4.00	202K	Best quality, CogViT
`glm-4.6v-flash`	image	FREE	FREE	131K	Free dev/test
`glm-4.6v`	image + video	$0.30	$0.90	131K	Budget production
`glm-5-turbo`	text only	$1.20	$4.00	202K	Text flagship
`glm-4.7-flash`	text only	FREE	FREE	202K	Free text
`glm-4.5-flash`	text only	FREE	FREE	131K	Free text

Rate limit handling:

Scenario	Solution
Free tier 429	`glm-request.sh` retries: 5 s → 10 s → 20 s
Persistent 429	Switch to paid model or wait 60 s
Image too large	Resize to 1024 px max side, reduce JPEG quality

Common errors:

Error	Cause	Fix
401 Unauthorized	Invalid or missing API key	Check `ZAI_API_KEY` / `OPENROUTER_API_KEY`
429 Too Many Requests	Free-tier rate limit	Retry with delay or use paid tier
400 Bad Request	Malformed payload	`jq empty payload.json`
Empty content	Reasoning-only response	Read `reasoning_content` field instead
`finish_reason: length`	Output truncated	Increase `max_tokens` (up to 131072)

🤖

GLM OpenRouter Specialist

Same capabilities routed via OpenRouter — use as a fallback when Z.ai is rate-limited or unavailable.

🧩

glm-design-to-code

The skill that orchestrates this agent — screenshot or URL to multi-framework code in one command.

🔗

GitHub source

Agent definition, pipeline scripts, and prompt templates.

📄

Brewui overview

All brewui skills and agents — image generation, design-to-code, GLM providers.

Updating plugins

Use /brewtools:plugin-update to check and update the brewcode plugin suite in one command. See the FAQ for details.