image-gen

AI image generation via multiple providers, with anti-AI-slop prompt engineering. Four modes, five providers, three art styles. A fast path covers 99% of cases: just type a prompt.

GLM-image — flagship Z.ai model

Top-tier image quality at just $0.015/image. The best quality-to-price ratio among all providers. See the Z.ai GLM-image docs.

GENERATE

Text-to-image generation. Default mode. Anti-slop prefix applied automatically based on style. Supports batch generation up to 10 images.

EDIT

Modify existing images. Supported by Gemini (native edit) and OpenAI (DALL-E 2). Provide image path and edit instructions.

CONFIG

Set up API keys for providers. Interactive key entry, validation, and storage to .env, ~/.zshrc, or .claude.local.md.

UPDATE

Check provider APIs for latest models, pricing, and breaking changes. Web search for current documentation.

Quick Reference

| Field | Value |
| --- | --- |
| Command | /brewui:image-gen |
| Arguments | [prompt] [--edit image.png 'instructions'] [--config] [--update] [--service] [--style] [--count N] [--output dir] [--size WxH] |
| Modes | generate (default), edit, config, update |
| Providers | 5 (Gemini Imagen 4, OpenRouter Gemini 2.5 Flash, OpenRouter GPT-5, Z.ai GLM-image, OpenAI DALL-E 3) |
| Styles | photo, illustration, art |
| Model | haiku |
| Output | Images + sidecar JSON metadata |

Quick Start

# Generate with defaults (gemini, photo style, 1024x1024)
/brewui:image-gen "a cozy coffee shop at sunset with warm lighting"

# Specific provider and style
/brewui:image-gen --service openrouter --style illustration "tech blog header"

# Edit existing image
/brewui:image-gen --edit photo.png "add warm golden hour lighting"

# Configure API keys
/brewui:image-gen --config

# Batch generation
/brewui:image-gen --count 4 "mountain lake at dawn, misty atmosphere"

Providers

Gemini Imagen 4

Google’s latest image model. Very high quality. Requires paid Gemini plan. Native edit support. Default provider.

OpenRouter Gemini 2.5 Flash

Cheapest option (~$0.001/image). Fast, high quality. Via OpenRouter API. No edit support.

OpenRouter GPT-5 Image

Highest quality. ~$0.01/image. Via OpenRouter API. Medium speed. No edit support.

Z.ai GLM-image

Flagship Z.ai model. Top-tier quality at ~$0.015/image. Best quality-to-price ratio. Via Z.ai API.

OpenAI DALL-E 3

Reliable, most expensive ($0.04-0.12/image). 1 image per request. Edit via DALL-E 2 fallback.

Anti-Slop System

Style-aware prompt engineering that prevents common AI image artifacts. Applied automatically as a prefix to your prompt.

Style: photo. Target: Physically accurate photography.

Enforces: real lighting physics, correct human anatomy, natural material textures, proper depth of field, no plastic skin, no impossible reflections, no symmetrical faces.

Style: illustration. Target: Professional illustration.

Enforces: clean line work, proper color theory, organic imperfections, consistent style, no airbrushed gradients, no floating elements, no generic clip-art feel.

Style: art. Target: Consistent artistic medium.

Enforces: unified brushwork, intentional composition, coherent color temperature, visible artistic technique, no mixed-media confusion, no over-processed look.
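
As a rough illustration of how a style-aware prefix could be applied, here is a minimal Python sketch. The prefix wording is condensed from the "Enforces" lists above, and the names ANTI_SLOP_PREFIXES and build_prompt are hypothetical; the plugin's actual prefix text and implementation are not shown in this page.

```python
# Hypothetical prefixes, condensed from the "Enforces" lists above.
# The plugin's real prefix text may differ.
ANTI_SLOP_PREFIXES = {
    "photo": (
        "Physically accurate photography: real lighting physics, correct human "
        "anatomy, natural material textures, proper depth of field."
    ),
    "illustration": (
        "Professional illustration: clean line work, proper color theory, "
        "organic imperfections, consistent style."
    ),
    "art": (
        "Consistent artistic medium: unified brushwork, intentional composition, "
        "coherent color temperature, visible artistic technique."
    ),
}


def build_prompt(user_prompt: str, style: str = "photo") -> str:
    """Prepend the prefix for the chosen style to the user's prompt."""
    if style not in ANTI_SLOP_PREFIXES:
        raise ValueError(f"unknown style: {style!r}")
    return f"{ANTI_SLOP_PREFIXES[style]} {user_prompt.strip()}"
```

Calling build_prompt("a cozy coffee shop at sunset") would return the photo prefix followed by the original prompt, since photo is the default style.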

Workflow

  1. Phase 0 — Parse Arguments

    Run parse-args.sh to extract flags and detect mode from natural language context. Priority: explicit flags > context analysis > default (generate).

  2. Phase 1 — Validate and Gather

    Load environment, check API key (.env > shell env > ask user). Fast path: if prompt provided with no flags, skip service/style/count questions and use defaults.

  3. Phase 2 — Build and Generate

    Load anti-slop prefix for style. Build enhanced prompt. Load provider specs. Construct JSON payload. Send API request. Parse response (base64 or URL).

  4. Phase 3 — Save Images

    Generate kebab-case title from prompt. Save each image with sidecar JSON metadata (prompt, service, style, size, timestamp).

  5. Phase 4 — Report

    Display results table with file paths, provider, style, size, cost estimate. Offer: generate more, different prompt, edit, or done.
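
The save step in Phase 3 can be sketched as follows. This is an assumption-laden illustration, not the plugin's code: the function names are hypothetical, the four-word title limit is inferred from the example file names on this page, and the metadata keys simply mirror the fields listed in Phase 3.

```python
import json
import re
from datetime import datetime, timezone
from pathlib import Path


def kebab_title(prompt: str, max_words: int = 4) -> str:
    """Build a kebab-case title from the first few words of the prompt.

    The four-word limit is an assumption inferred from the example file names.
    """
    words = re.findall(r"[a-z0-9]+", prompt.lower())
    return "-".join(words[:max_words]) or "image"


def save_image(data: bytes, prompt: str, service: str, style: str,
               size: str, out_dir: Path, index: int = 1) -> Path:
    """Write the image plus a sidecar JSON file with the same stem."""
    out_dir.mkdir(parents=True, exist_ok=True)
    image_path = out_dir / f"{kebab_title(prompt)}-{index:03d}.png"
    image_path.write_bytes(data)
    metadata = {
        "prompt": prompt,
        "service": service,
        "style": style,
        "size": size,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    sidecar = image_path.with_suffix(".json")
    sidecar.write_text(json.dumps(metadata, indent=2), encoding="utf-8")
    return image_path
```

With these assumptions, the prompt "hero image for blog post about AI coding" yields the stem hero-image-for-blog-001, which matches the file name used in the Brewpage example.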

Modes Deep Dive

Generate: default mode (99% of cases).

Fast path: when $ARGUMENTS is just a prompt text with no flags, skip interactive questions (count, service, style, output) and use defaults: count=1, service=gemini, style=photo, output=.claude/reports/images/.

Only asks user questions when: prompt is missing, or API key is not found.

Agent invocation: When called from an agent, all provided args are treated as final. No confirmation step if all params are explicit.
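
The fast-path rule can be illustrated with a small Python sketch. The default values come from the text above; the function name resolve_params and the has_api_key argument are hypothetical, and the real skill may decide this differently.

```python
# Defaults as stated in the fast-path description.
DEFAULTS = {
    "count": 1,
    "service": "gemini",
    "style": "photo",
    "output": ".claude/reports/images/",
}


def resolve_params(provided: dict, has_api_key: bool):
    """Merge provided arguments over the defaults.

    Returns (params, questions), where questions lists the only things
    the user would still be asked about: a missing prompt or API key.
    """
    params = dict(DEFAULTS)
    params.update({k: v for k, v in provided.items() if v is not None})
    questions = []
    if not params.get("prompt"):
        questions.append("prompt")
    if not has_api_key:
        questions.append("api_key")
    return params, questions
```

When only a prompt is supplied and a key is available, the questions list is empty, so nothing interactive is needed.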

Edit: modify existing images.

Requires: image path + edit instructions. Validates that the image exists and has a valid image MIME type.

Supported providers: Gemini (native edit), OpenAI (DALL-E 2 fallback). OpenRouter redirects user to supported provider.
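
A pre-flight check for edit mode might look like the sketch below. It is only an approximation: check_edit_request and EDIT_CAPABLE are hypothetical names, and mimetypes.guess_type infers the type from the file extension rather than inspecting the file contents, which may be weaker than the plugin's own validation.

```python
import mimetypes
from pathlib import Path

# Providers described above as supporting edits (by --service name).
EDIT_CAPABLE = {"gemini", "openai"}


def check_edit_request(image_path: str, service: str) -> list:
    """Return a list of problems; an empty list means the request looks valid."""
    problems = []
    path = Path(image_path)
    if not path.is_file():
        problems.append(f"image not found: {image_path}")
    else:
        # guess_type looks at the extension only, not the file contents.
        mime, _ = mimetypes.guess_type(path.name)
        if mime is None or not mime.startswith("image/"):
            problems.append(f"not an image MIME type: {mime}")
    if service not in EDIT_CAPABLE:
        problems.append(f"{service} does not support edit; use gemini or openai")
    return problems
```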

Config: set up API keys.

Interactive flow: select service > enter API key > validate with test request > choose storage location (.env, ~/.zshrc, .claude.local.md).

Key URLs: Gemini (aistudio.google.com/apikey), OpenRouter (openrouter.ai/keys), OpenAI (platform.openai.com/api-keys).
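
For the .env storage option, the write could be an insert-or-replace of a single KEY=value line. The sketch below works on the file's text so it is easy to try out; the function name and the variable name GEMINI_API_KEY are assumptions, since this page does not state which variable names the plugin reads.

```python
def upsert_env_var(env_text: str, name: str, value: str) -> str:
    """Replace the line defining `name`, or append one if it is absent."""
    lines = env_text.splitlines()
    new_line = f"{name}={value}"
    for i, line in enumerate(lines):
        if line.strip().startswith(f"{name}="):
            lines[i] = new_line
            break
    else:
        # No existing definition was found, so add one at the end.
        lines.append(new_line)
    return "\n".join(lines) + "\n"


# Hypothetical variable name, used for illustration only.
updated = upsert_env_var("DEBUG=1\n", "GEMINI_API_KEY", "your-key-here")
```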

Update: check for API changes.

Web searches each provider for latest models, pricing changes, and breaking API changes. Compares against current references/providers.md and offers to update.

Configuration

| Parameter | Default | Options | Description |
| --- | --- | --- | --- |
| --service | gemini | gemini, openrouter, openrouter-gpt5, zai, openai | Image generation provider |
| --style | photo | photo, illustration, art | Anti-slop style preset |
| --count | 1 | 1-10 | Number of images to generate |
| --size | 1024x1024 | WxH format | Image dimensions |
| --output | .claude/reports/images/ | Directory path | Where to save images |
| --edit | | <image.png> 'instructions' | Edit mode with image path |
| --config | | | Enter config mode |
| --update | | | Check for API updates |
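
To make the flag semantics concrete, the table can be mirrored with Python's argparse. This is not the plugin's parse-args.sh, and it does not model the natural-language context analysis mentioned in the workflow; build_parser and detect_mode are hypothetical names, and the precedence among --config, --update, and --edit is an assumption.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser whose defaults and choices follow the configuration table."""
    parser = argparse.ArgumentParser(prog="image-gen")
    parser.add_argument("prompt", nargs="?")
    parser.add_argument("--service", default="gemini",
                        choices=["gemini", "openrouter", "openrouter-gpt5", "zai", "openai"])
    parser.add_argument("--style", default="photo",
                        choices=["photo", "illustration", "art"])
    parser.add_argument("--count", type=int, default=1, choices=range(1, 11))
    parser.add_argument("--size", default="1024x1024")
    parser.add_argument("--output", default=".claude/reports/images/")
    parser.add_argument("--edit", nargs=2, metavar=("IMAGE", "INSTRUCTIONS"))
    parser.add_argument("--config", action="store_true")
    parser.add_argument("--update", action="store_true")
    return parser


def detect_mode(args: argparse.Namespace) -> str:
    """Explicit flags select the mode; otherwise fall back to generate."""
    if args.config:
        return "config"
    if args.update:
        return "update"
    if args.edit:
        return "edit"
    return "generate"
```

For example, parsing ["--count", "4", "--service", "openrouter", "abstract pattern"] gives count 4 and generate mode, while ["--edit", "photo.png", "add warm lighting"] selects edit mode.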

Examples

# Blog post OG image with illustration style
/brewui:image-gen --style illustration "minimalist tech blog header with dark mode theme, code brackets, purple accent"

# Output: .claude/reports/images/minimalist-tech-blog-header-001.png
# Sidecar: .claude/reports/images/minimalist-tech-blog-header-001.json

# Photo-realistic hero image
/brewui:image-gen "developer working at a standing desk, morning light through large windows, multiple monitors showing code"

# Uses defaults: gemini, photo style, 1024x1024

# Generate 4 variations to compare
/brewui:image-gen --count 4 --service openrouter "abstract geometric pattern, blue and orange gradient"

# Output: 4 images with sequential numbering

# Edit existing image
/brewui:image-gen --edit ./hero.png "remove the background, add soft drop shadow, increase contrast"

# Requires Gemini or OpenAI provider

Cost optimization

OpenRouter Gemini 2.5 Flash is the cheapest at ~$0.001/image. Use it for iterations and drafts. Switch to Gemini Imagen 4 or GPT-5 for final high-quality output.
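
The saving from drafting on the cheapest provider can be checked with simple arithmetic. The sketch below uses the approximate per-image prices quoted in the Providers section. The keys follow the --service option names, and mapping openrouter to Gemini 2.5 Flash and openrouter-gpt5 to GPT-5 is an assumption. Gemini Imagen 4 is omitted because no price is quoted for it here.

```python
# Approximate prices per image in USD, as quoted in the Providers section.
PRICE_PER_IMAGE_USD = {
    "openrouter": 0.001,       # assumed to be OpenRouter Gemini 2.5 Flash
    "openrouter-gpt5": 0.01,   # assumed to be OpenRouter GPT-5 Image
    "zai": 0.015,
    "openai": 0.04,            # lower bound of the quoted $0.04-0.12 range
}


def estimate_cost(service: str, count: int) -> float:
    """Estimated cost in USD for `count` images on one provider."""
    return round(PRICE_PER_IMAGE_USD[service] * count, 4)


drafts = estimate_cost("openrouter", 20)         # 20 draft iterations
finals = estimate_cost("openrouter-gpt5", 2)     # 2 final renders
all_gpt5 = estimate_cost("openrouter-gpt5", 22)  # everything on GPT-5
print(f"drafts + finals: ${drafts + finals:.2f}, all GPT-5: ${all_gpt5:.2f}")
```

Under these prices, twenty drafts plus two finals come to about $0.04, against about $0.22 for running all 22 images on GPT-5.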

Remote viewing with Brewpage

Working from a remote terminal or mobile? Can’t open the generated image locally? Combine with /brewdoc:publish to get a shareable URL instantly:

# Generate an image
/brewui:image-gen "hero image for blog post about AI coding"

# Publish to get a viewable URL
/brewdoc:publish .claude/reports/images/hero-image-for-blog-001.png
# → https://brewpage.app/your-namespace/hero-image-for-blog-001

This works great from SSH sessions, Telegram bots, or any headless environment where you can’t open files directly.

API key priority

Keys are resolved in order: explicit in prompt > .env file > shell environment variable > interactive config. For team projects, use .env (add to .gitignore). For personal use, ~/.zshrc works well.
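
The resolution order can be sketched in Python as a chain of fallbacks. The function names are hypothetical, the .env parsing is deliberately simplistic, and the variable name passed in is up to the caller, because this page does not list the names the plugin actually reads.

```python
import os
from pathlib import Path


def read_env_file(path: Path) -> dict:
    """Parse simple KEY=VALUE lines; comments and blank lines are skipped."""
    values = {}
    if not path.is_file():
        return values
    for line in path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def resolve_api_key(name, explicit=None, env_file=Path(".env"), environ=None):
    """Return (key, source) following the priority order described above."""
    if explicit:
        return explicit, "prompt"
    from_file = read_env_file(env_file).get(name)
    if from_file:
        return from_file, ".env"
    environ = os.environ if environ is None else environ
    if environ.get(name):
        return environ[name], "shell environment"
    # Nothing found: the caller would fall back to interactive config.
    return None, "interactive config"
```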

Updating plugins

Use /brewtools:plugin-update to check and update the brewcode plugin suite in one command. See the FAQ for details.