OpenAI image generation
The orchestrator routes OpenAI image requests to OpenAI's hosted APIs via the imageGen step. Five models, each with its own behaviour and quality tier:
| Model | Operations | Notes |
|---|---|---|
| gpt-image-2 | createImage / editImage | Default — latest GPT-Image model. Arbitrary width/height (not fixed presets), optional maskImage for regional edits. No background control. |
| gpt-image-1.5 | createImage / editImage | Previous flagship. Fixed size enum, 4 images max, quality + background controls. |
| gpt-image-1 | createImage / editImage | Older GPT-Image. Up to 10 images per call. Supports background: "transparent". |
| dall-e-3 | createImage only | Stand-alone natural vs vivid style control, standard / hd quality, up to 1792 px. |
| dall-e-2 | createImage / editImage | Legacy. Only supports square outputs (256² / 512² / 1024²) and 1000-char prompts. Use only for compatibility reasons. |
Default choice for new integrations: model: "gpt-image-2". It's OpenAI's latest flagship with flexible output dimensions and mask-based editing. Fall back to gpt-image-1.5 if you need background control or prefer the fixed size enum; gpt-image-1 for transparent backgrounds or quantity > 4; dall-e-3 for style-controlled vivid output; avoid dall-e-2 unless you specifically need it.
Prerequisites
- A Civitai orchestration token (Quick start → Prerequisites)
- For editImage: one or more source image URLs, data URLs, or Base64 strings
- For gpt-image-2 mask-based edits: a mask image whose fully transparent pixels (alpha = 0) indicate the region to edit
gpt-image-2 (default)
```http
POST https://orchestration.civitai.com/v2/consumer/workflows?wait=60
Authorization: Bearer <your-token>
Content-Type: application/json

{
  "steps": [{
    "$type": "imageGen",
    "input": {
      "engine": "openai",
      "model": "gpt-image-2",
      "operation": "createImage",
      "prompt": "A photorealistic portrait of a woman with flowers in her hair, golden hour lighting",
      "width": 1024,
      "height": 1024,
      "quantity": 1,
      "quality": "high"
    }
  }]
}
```
Parameters
| Field | Default | Allowed | Notes |
|---|---|---|---|
| prompt | — (required) | ≤ 32 000 chars | Natural-language works best. |
| width | 1024 | 256–3840, multiple of 16 | Explicit width in pixels. |
| height | 1024 | 256–3840, multiple of 16 | Explicit height in pixels. |
| quantity | 1 | 1–4 | |
| quality | high | low / medium / high | Drives pricing (see Cost below). |
| outputFormat | jpeg | jpeg / png / webp | Inherited from the imageGen step. |
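If you compute dimensions dynamically, a small client-side guard keeps them inside the accepted range before submitting. `snap_dimension` is a hypothetical helper, not part of any SDK, and the orchestrator's own rounding behaviour (if any) is not documented here — this just pre-validates on the client:

```python
def snap_dimension(px: int) -> int:
    """Clamp a requested dimension into gpt-image-2's accepted range
    (256-3840) and round to the nearest multiple of 16."""
    px = max(256, min(3840, px))
    return int((px + 8) // 16) * 16
```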
Editing (editImage)
```json
{
  "steps": [{
    "$type": "imageGen",
    "input": {
      "engine": "openai",
      "model": "gpt-image-2",
      "operation": "editImage",
      "prompt": "Make it a winter scene with snow falling",
      "images": [
        "https://image.civitai.com/.../source.jpeg"
      ]
    }
  }]
}
```
In edit mode width / height are optional. When both are omitted, the output size is inferred from the input images (image_size: "auto" is sent to the model). To force explicit output dimensions, set both fields.
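The both-or-neither rule is easy to violate when building request bodies programmatically. A hypothetical builder (`edit_input` is not part of any SDK) that enforces it might look like:

```python
def edit_input(prompt, images, width=None, height=None, **extra):
    """Build a gpt-image-2 editImage input dict.

    Enforces the both-or-neither rule for width/height: omitting both
    lets the model infer the output size from the input images.
    """
    if (width is None) != (height is None):
        raise ValueError("set both width and height, or neither")
    body = {
        "engine": "openai",
        "model": "gpt-image-2",
        "operation": "editImage",
        "prompt": prompt,
        "images": list(images),
        **extra,
    }
    if width is not None:
        body["width"] = width
        body["height"] = height
    return body
```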
To restrict the edit to a specific region, pass maskImage — a URL, data URL, or Base64 string where fully transparent pixels (alpha = 0) mark the area to modify:
```json
{
  "steps": [{
    "$type": "imageGen",
    "input": {
      "engine": "openai",
      "model": "gpt-image-2",
      "operation": "editImage",
      "prompt": "Replace the background with a tropical beach",
      "images": ["https://image.civitai.com/.../source.jpeg"],
      "maskImage": "https://image.civitai.com/.../mask.png"
    }
  }]
}
```
Only the first image in images[] is masked; additional reference images are ignored by the mask.
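Because images[] and maskImage accept data URLs, you can inline local files without hosting them first. A minimal encoding helper (`to_data_url` is a hypothetical client-side function, stdlib only):

```python
import base64

def to_data_url(image_bytes: bytes, mime: str = "image/png") -> str:
    """Encode raw image bytes as a data URL suitable for the
    images[] or maskImage fields."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime};base64,{b64}"
```

For masks, remember the content requirement above: the PNG's fully transparent pixels (alpha = 0) mark the region to edit.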
gpt-image-1.5
```http
POST https://orchestration.civitai.com/v2/consumer/workflows?wait=60
Authorization: Bearer <your-token>
Content-Type: application/json

{
  "steps": [{
    "$type": "imageGen",
    "input": {
      "engine": "openai",
      "model": "gpt-image-1.5",
      "operation": "createImage",
      "prompt": "A photorealistic portrait of a woman with flowers in her hair, golden hour lighting",
      "size": "1024x1024",
      "quantity": 1,
      "quality": "high"
    }
  }]
}
```
Parameters
| Field | Default | Allowed | Notes |
|---|---|---|---|
| prompt | — (required) | ≤ 32 000 chars | Natural-language works best. |
| size | 1024x1024 | 1024x1024 / 1536x1024 / 1024x1536 | Exact pixel dimensions as a string, not width/height. |
| quantity | 1 | 1–4 | Lower cap than gpt-image-1 (which allows up to 10). |
| background | auto | auto / transparent / opaque | Transparent backgrounds require PNG-compatible output. |
| quality | high | low / medium / high | 1.5 doesn't expose auto; pick one explicitly. |
Editing (editImage)
```json
{
  "steps": [{
    "$type": "imageGen",
    "input": {
      "engine": "openai",
      "model": "gpt-image-1.5",
      "operation": "editImage",
      "prompt": "Make it a winter scene with snow falling",
      "size": "1024x1024",
      "images": [
        "https://image.civitai.com/.../source.jpeg"
      ]
    }
  }]
}
```
Pass one or more source images in images[]. The Edit shape also accepts a mask (string URL/DataURL/Base64) if you want to inpaint a specific region.
gpt-image-1 (older GPT-Image, transparent backgrounds)
Same shape as gpt-image-1.5 but older weights. Reach for it when you need:
- quantity > 4 (up to 10)
- background: "transparent" (1.5 also supports it, but older clients may already be wired for 1)
- quality: "auto"
```json
{
  "steps": [{
    "$type": "imageGen",
    "input": {
      "engine": "openai",
      "model": "gpt-image-1",
      "operation": "createImage",
      "prompt": "A stylized logo mark for a coffee brand, simple vector illustration",
      "size": "1024x1024",
      "quantity": 1,
      "background": "transparent",
      "quality": "high"
    }
  }]
}
```
size, background, images[], and mask work exactly like 1.5; the only differences are that quantity goes up to 10 and quality accepts auto as a fourth option.
dall-e-3 (style-controlled, natural vs vivid)
dall-e-3 is create-only and exposes a style dimension the GPT-Image models don't:
```json
{
  "steps": [{
    "$type": "imageGen",
    "input": {
      "engine": "openai",
      "model": "dall-e-3",
      "operation": "createImage",
      "prompt": "A majestic fantasy landscape with floating islands",
      "size": "1024x1024",
      "style": "vivid",
      "quality": "hd"
    }
  }]
}
```
| Field | Default | Allowed | Notes |
|---|---|---|---|
| prompt | — (required) | ≤ 4 000 chars | DALL·E 3 silently rewrites short/vague prompts — write verbose directives for control. |
| size | — (required) | 1024x1024 / 1792x1024 / 1024x1792 | Required field on DALL·E 3 (unlike GPT-Image, which has a default). |
| style | vivid | natural / vivid | vivid pushes hyper-real colours; natural stays closer to the prompt. |
| quality | auto | auto / standard / hd | All quality tiers cost the same Buzz (300 flat per image); hd increases runtime only. |
DALL·E 3 doesn't support editImage or multiple samples — quantity isn't on the schema.
dall-e-2 (legacy)
Only include for legacy compatibility. Prompt ≤ 1000 chars, square outputs only, and much lower output quality than newer models:
```json
{
  "steps": [{
    "$type": "imageGen",
    "input": {
      "engine": "openai",
      "model": "dall-e-2",
      "operation": "createImage",
      "prompt": "A vintage postcard illustration of a mountain town",
      "size": "512x512",
      "quantity": 1
    }
  }]
}
```
| Field | Allowed | Notes |
|---|---|---|
| size | 256x256 / 512x512 / 1024x1024 | Square only. |
| prompt | ≤ 1000 chars | Much tighter than newer models. |
| quantity | 1–10 | |
Supports editImage with image (a single source string, not an array), but prefer GPT-Image 1 or 1.5 unless you have a specific reason not to.
Reading the result
All models emit the standard imageGen output:
```json
{
  "status": "succeeded",
  "steps": [{
    "name": "0",
    "$type": "imageGen",
    "status": "succeeded",
    "output": {
      "images": [
        { "id": "blob_...", "url": "https://.../signed.png" }
      ]
    }
  }]
}
```

Blob URLs are signed and expire — refetch the workflow or call GetBlob for a fresh URL.
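A small sketch for pulling the signed URLs out of a finished workflow response (`image_urls` is a hypothetical helper; it assumes only the output shape shown above):

```python
def image_urls(workflow: dict) -> list[str]:
    """Collect the signed image URLs from every imageGen step
    in a workflow response."""
    return [
        img["url"]
        for step in workflow.get("steps", [])
        if step.get("$type") == "imageGen"
        for img in step.get("output", {}).get("images", [])
    ]
```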
Runtime
OpenAI's API queue is the dominant factor — Civitai routes your request straight through. Typical wall times:
| Model | Per-image wall time | wait recommendation |
|---|---|---|
| dall-e-2 | 3–8 s | wait=30 fine |
| dall-e-3 (standard) | 10–20 s | wait=60 fine |
| dall-e-3 (hd) | 15–40 s | wait=60 usually fine |
| gpt-image-1 / 1.5 | 10–30 s per image | wait=60 fine for quantity: 1; fall back to wait=0 for batches |
| gpt-image-2 | 15–45 s per image (larger dims take longer) | wait=60 fine for quantity: 1; wait=0 + poll for batches or 4K |
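For jobs where wait expires, the usual pattern is to submit with wait=0 and poll GetWorkflow until a terminal status. A sketch of the polling loop, with the fetch call injected so it works with any HTTP client; the terminal status names other than "succeeded" are assumptions:

```python
import time

def poll_workflow(fetch, interval=5.0, timeout=300.0, sleep=time.sleep):
    """Poll until the workflow reaches a terminal status.

    `fetch` is any callable returning the workflow JSON as a dict
    (e.g. a GET to the workflow's URL); it is injected here so the
    loop can be exercised without a network.
    """
    deadline = time.monotonic() + timeout
    while True:
        workflow = fetch()
        # "succeeded" is shown in the docs; "failed"/"canceled" are assumed.
        if workflow.get("status") in ("succeeded", "failed", "canceled"):
            return workflow
        if time.monotonic() >= deadline:
            raise TimeoutError("workflow did not finish in time")
        sleep(interval)
```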
Cost
Billed in Buzz on the workflow's transactions. Use whatif=true for an exact preview; see Payments (Buzz) for currency selection.
gpt-image-2 (size-aware)
Unlike the earlier GPT-Image models, gpt-image-2 prices scale with output dimensions as well as quality. The orchestrator finds the cheapest canonical tier whose dimensions cover your requested width/height (the comparison is rotation-aware, so a landscape tier also covers the matching portrait request), then bills at that tier:
| Dimensions | low | medium | high |
|---|---|---|---|
| 1024 × 768 | 13 | 52 | 195 |
| 1024 × 1024 | 13 | 78 | 286 |
| 1024 × 1536 | 13 | 65 | 221 |
| 1920 × 1080 | 13 | 52 | 208 |
| 2560 × 1440 | 13 | 78 | 299 |
| 3840 × 2160 | 26 | 143 | 533 |
All values are Buzz per image. Final cost is tier × quantity, plus any priority / output-format surcharges applied by the imageGen step. Requests above 3840 × 2160 clamp to that row; requests smaller than 1024 × 768 floor to the cheapest row.
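The lookup can be sketched client-side for rough budgeting. One caveat: taken literally, "cheapest covering tier" can undercut the table's own examples (the 1920 × 1080 tier covers a 1024 × 1024 request at a lower high price), so this sketch breaks ties by choosing the smallest covering tier, which reproduces the published example costs — that tie-break is an assumption, and whatif=true remains the authoritative preview:

```python
# Canonical gpt-image-2 pricing tiers (Buzz per image), from the table above.
TIERS = [
    (1024, 768,  {"low": 13, "medium": 52,  "high": 195}),
    (1024, 1024, {"low": 13, "medium": 78,  "high": 286}),
    (1024, 1536, {"low": 13, "medium": 65,  "high": 221}),
    (1920, 1080, {"low": 13, "medium": 52,  "high": 208}),
    (2560, 1440, {"low": 13, "medium": 78,  "high": 299}),
    (3840, 2160, {"low": 26, "medium": 143, "high": 533}),
]

def estimate_buzz(width, height, quality="high", quantity=1):
    """Rough per-request Buzz estimate for gpt-image-2 (hypothetical helper)."""
    def covers(tw, th):
        # Rotation-aware: a tier covers the request in either orientation.
        return (tw >= width and th >= height) or (tw >= height and th >= width)

    covering = [(tw * th, prices[quality]) for tw, th, prices in TIERS if covers(tw, th)]
    if not covering:
        # Larger than the biggest tier: clamp to the 3840x2160 row.
        return TIERS[-1][2][quality] * quantity
    _, price = min(covering)  # smallest covering tier (assumed tie-break)
    return price * quantity
```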
In edit mode with width/height omitted (image_size: "auto"), we estimate cost from the first input image's dimensions.
Other models (flat per-quality)
total = base × quantity

| Model | quality | Base (Buzz per image) |
|---|---|---|
| gpt-image-1.5 | low | 25 |
| gpt-image-1.5 | medium | 100 |
| gpt-image-1.5 | high (default) | 375 |
| gpt-image-1 | low | 25 |
| gpt-image-1 | medium | 100 |
| gpt-image-1 | high | 375 |
| gpt-image-1 | auto | 300 |
| dall-e-3 / dall-e-2 | (any) | 300 |
Examples:
- gpt-image-2 high, 1024×1024, quantity: 1 → ~286 Buzz
- gpt-image-2 high, 1920×1080, quantity: 2 → ~416 Buzz
- gpt-image-2 medium, 3840×2160, quantity: 1 → ~143 Buzz
- gpt-image-1.5 high, quantity: 1 → ~375 Buzz
- dall-e-3 hd, quantity: 1 → ~300 Buzz
For the older models size, background, and style don't change the Buzz price — only quality (GPT-Image) and quantity do. For gpt-image-2, width / height also affect the price via the tier lookup above.
Troubleshooting
| Symptom | Likely cause | Fix |
|---|---|---|
| 400 with "size must be one of" | Sent 1536x1024 to DALL·E 3 (it wants 1792x1024) or 1792x1024 to GPT-Image | Match the size enum for the model you're using — the tables above list each model's allowed set. |
| 400 with "style is not a valid property" | Sent style outside DALL·E 3 | Only DALL·E 3 exposes style. |
| 400 with "quantity must be ≤ 4" on gpt-image-1.5 | Using 1.5 with gpt-image-1's higher quantity ceiling | Drop to quantity: 4 or use gpt-image-1 (up to 10). |
| 400 with "prompt too long" on dall-e-2 | DALL·E 2's 1000-char prompt cap | Trim the prompt or move to a newer model. |
| Output is unexpectedly stylised on DALL·E 3 | style: "vivid" default | Set style: "natural" for closer-to-prompt output. |
| Output is PNG when you wanted JPEG | Transparent backgrounds force PNG | Set background: "opaque" if you want JPEG, or leave outputFormat unset and accept whatever OpenAI returns. |
| Request timed out (wait expired) | Large quantity or hd on DALL·E 3 during busy periods | Resubmit with wait=0 and poll. |
| Step failed, reason = "blocked" | OpenAI's content filter or Civitai moderation | Don't retry the same input — see Errors & retries → Step-level failures. |
Related
- SubmitWorkflow — operation used by every example here
- GetWorkflow — for polling
- Google image generation — alternative commercial tier with Imagen 4 and Nano Banana
- Gemini image generation — Google's Gemini 2.5 Flash Image direct-API route
- Flux 2 / Flux 1 / Qwen — open-weights alternatives on Civitai-hosted workers
- Image upscaling — chain after imageGen for higher-res output
- Prompt enhancement — LLM-rewrite a prompt before feeding it in via $ref
- Full parameter catalog: the OpenAIGpt1CreateImageInput, OpenAIGpt1EditImageInput, OpenAIGpt15CreateImageInput, OpenAIGpt15EditImageInput, OpenAIDallE3CreateImageGenInput, OpenAIDallE2CreateImageGenInput, and OpenAIDallE2EditImageInput schemas in the API reference
- imageGen endpoint OpenAPI spec — standalone OpenAPI 3.1 YAML covering the full imageGen surface