
OpenAI image generation

The orchestrator routes imageGen steps with engine: "openai" to OpenAI's hosted APIs. Five models are available, each with its own behaviour and quality tier:

| Model | Operations | Notes |
|---|---|---|
| gpt-image-2 | createImage / editImage | Default; the latest GPT-Image model. Arbitrary width/height (not fixed presets), optional maskImage for regional edits. No background control. |
| gpt-image-1.5 | createImage / editImage | Previous flagship. Fixed size enum, 4 images max, quality + background controls. |
| gpt-image-1 | createImage / editImage | Older GPT-Image. Up to 10 images per call. Supports background: "transparent". |
| dall-e-3 | createImage only | Stand-alone natural vs vivid style control, standard / hd quality, up to 1792 px. |
| dall-e-2 | createImage / editImage | Legacy. Square outputs only (256² / 512² / 1024²), 1000-char prompts. Use only for compatibility reasons. |

Default choice for new integrations: model: "gpt-image-2". It's OpenAI's latest flagship with flexible output dimensions and mask-based editing. Fall back to gpt-image-1.5 if you need background control or prefer the fixed size enum; gpt-image-1 for transparent backgrounds or quantity > 4; dall-e-3 for style-controlled vivid output; avoid dall-e-2 unless you specifically need it.
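That fallback guidance can be sketched as a small helper. This is illustrative only; the function and flag names are ours, not part of any SDK or of the orchestration API:

```python
def pick_openai_model(needs_edit: bool = False, needs_transparent: bool = False,
                      needs_style: bool = False, quantity: int = 1) -> str:
    """Default to gpt-image-2; fall back only when a hard requirement
    rules it out (mirrors the guidance above)."""
    if needs_style:
        if needs_edit:
            raise ValueError("dall-e-3 is create-only: drop style or the edit")
        return "dall-e-3"       # natural / vivid style control
    if quantity > 4:
        return "gpt-image-1"    # the only model allowing up to 10 per call
    if needs_transparent:
        return "gpt-image-1.5"  # background: "transparent" (gpt-image-1 works too)
    return "gpt-image-2"        # flexible dimensions + mask-based edits
```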

Prerequisites

  • A Civitai orchestration token (Quick start → Prerequisites)
  • For editImage: one or more source image URLs, data URLs, or Base64 strings
  • For gpt-image-2 mask-based edits: a mask image whose fully transparent pixels (alpha = 0) indicate the region to edit

gpt-image-2 (default)

```http
POST https://orchestration.civitai.com/v2/consumer/workflows?wait=60
Authorization: Bearer <your-token>
Content-Type: application/json

{
  "steps": [{
    "$type": "imageGen",
    "input": {
      "engine": "openai",
      "model": "gpt-image-2",
      "operation": "createImage",
      "prompt": "A photorealistic portrait of a woman with flowers in her hair, golden hour lighting",
      "width": 1024,
      "height": 1024,
      "quantity": 1,
      "quality": "high"
    }
  }]
}
```
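When driving the endpoint from code, it can help to validate the documented ranges client-side before submitting. A minimal sketch; the builder function is ours, not part of any SDK:

```python
def gpt_image_2_step(prompt: str, width: int = 1024, height: int = 1024,
                     quantity: int = 1, quality: str = "high") -> dict:
    """Build an imageGen step for a gpt-image-2 createImage call,
    enforcing the documented limits before the request leaves your machine."""
    for name, value in (("width", width), ("height", height)):
        if not 256 <= value <= 3840 or value % 16:
            raise ValueError(f"{name} must be 256-3840 and a multiple of 16")
    if not 1 <= quantity <= 4:
        raise ValueError("quantity must be 1-4")
    if quality not in ("low", "medium", "high"):
        raise ValueError("quality must be low, medium, or high")
    if len(prompt) > 32_000:
        raise ValueError("prompt exceeds 32 000 characters")
    return {
        "$type": "imageGen",
        "input": {
            "engine": "openai",
            "model": "gpt-image-2",
            "operation": "createImage",
            "prompt": prompt,
            "width": width,
            "height": height,
            "quantity": quantity,
            "quality": quality,
        },
    }
```

Wrap the result as `{"steps": [gpt_image_2_step("...")]}` and POST it with your bearer token using any HTTP client.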

Parameters

| Field | Default | Allowed | Notes |
|---|---|---|---|
| prompt | (required) | ≤ 32 000 chars | Natural-language prompts work best. |
| width | 1024 | 256–3840, multiple of 16 | Explicit width in pixels. |
| height | 1024 | 256–3840, multiple of 16 | Explicit height in pixels. |
| quantity | 1 | 1–4 | |
| quality | high | low / medium / high | Drives pricing (see Cost below). |
| outputFormat | jpeg | jpeg / png / webp | Inherited from the imageGen step. |

Editing (editImage)

```json
{
  "steps": [{
    "$type": "imageGen",
    "input": {
      "engine": "openai",
      "model": "gpt-image-2",
      "operation": "editImage",
      "prompt": "Make it a winter scene with snow falling",
      "images": [
        "https://image.civitai.com/.../source.jpeg"
      ]
    }
  }]
}
```

In edit mode width / height are optional. When both are omitted, the output size is inferred from the input images (image_size: "auto" is sent to the model). To force explicit output dimensions, set both fields.

To restrict the edit to a specific region, pass maskImage — a URL, data URL, or Base64 string where fully transparent pixels (alpha = 0) mark the area to modify:

```json
{
  "steps": [{
    "$type": "imageGen",
    "input": {
      "engine": "openai",
      "model": "gpt-image-2",
      "operation": "editImage",
      "prompt": "Replace the background with a tropical beach",
      "images": ["https://image.civitai.com/.../source.jpeg"],
      "maskImage": "https://image.civitai.com/.../mask.png"
    }
  }]
}
```

Only the first image in images[] is masked; additional reference images are ignored by the mask.
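Masks can be passed by URL, but a local file works too once wrapped as a data URL. A small sketch of both sides of the contract (helper names are ours): the standard `data:image/png;base64,` form for transport, and the alpha = 0 rule for which pixels are editable:

```python
import base64

def mask_data_url(png_bytes: bytes) -> str:
    """Wrap a local mask PNG as a data URL for the maskImage field."""
    return "data:image/png;base64," + base64.b64encode(png_bytes).decode("ascii")

def editable_fraction(rgba: bytes) -> float:
    """Fraction of pixels the model may repaint: alpha byte exactly 0."""
    if not rgba or len(rgba) % 4:
        raise ValueError("expected packed RGBA pixels, 4 bytes each")
    alphas = rgba[3::4]
    return sum(1 for a in alphas if a == 0) / len(alphas)
```

For example, `"maskImage": mask_data_url(open("mask.png", "rb").read())` keeps the mask out of your blob storage entirely.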

gpt-image-1.5

```http
POST https://orchestration.civitai.com/v2/consumer/workflows?wait=60
Authorization: Bearer <your-token>
Content-Type: application/json

{
  "steps": [{
    "$type": "imageGen",
    "input": {
      "engine": "openai",
      "model": "gpt-image-1.5",
      "operation": "createImage",
      "prompt": "A photorealistic portrait of a woman with flowers in her hair, golden hour lighting",
      "size": "1024x1024",
      "quantity": 1,
      "quality": "high"
    }
  }]
}
```

Parameters

| Field | Default | Allowed | Notes |
|---|---|---|---|
| prompt | (required) | ≤ 32 000 chars | Natural-language prompts work best. |
| size | 1024x1024 | 1024x1024 / 1536x1024 / 1024x1536 | Exact pixel dimensions as a string, not width/height. |
| quantity | 1 | 1–4 | Lower cap than gpt-image-1 (which allows up to 10). |
| background | auto | auto / transparent / opaque | Transparent backgrounds require PNG-compatible output. |
| quality | high | low / medium / high | 1.5 doesn't expose auto; pick one explicitly. |

Editing (editImage)

```json
{
  "steps": [{
    "$type": "imageGen",
    "input": {
      "engine": "openai",
      "model": "gpt-image-1.5",
      "operation": "editImage",
      "prompt": "Make it a winter scene with snow falling",
      "size": "1024x1024",
      "images": [
        "https://image.civitai.com/.../source.jpeg"
      ]
    }
  }]
}
```

Pass one or more source images in images[]. The edit shape also accepts a mask (a URL, data URL, or Base64 string) if you want to inpaint a specific region.

gpt-image-1 (previous flagship, transparent backgrounds)

Same shape as gpt-image-1.5 but older weights. Reach for it when you need:

  • quantity > 4 (up to 10)
  • background: "transparent" (1.5 also supports it but older clients may already be wired for 1)
  • quality: "auto"
```json
{
  "steps": [{
    "$type": "imageGen",
    "input": {
      "engine": "openai",
      "model": "gpt-image-1",
      "operation": "createImage",
      "prompt": "A stylized logo mark for a coffee brand, simple vector illustration",
      "size": "1024x1024",
      "quantity": 1,
      "background": "transparent",
      "quality": "high"
    }
  }]
}
```

size, background, images[], and mask work exactly like 1.5; the only differences are that quantity goes up to 10 and quality accepts auto as a fourth option.

dall-e-3 (style-controlled, natural vs vivid)

dall-e-3 is create-only and exposes a style dimension the GPT-Image models don't:

```json
{
  "steps": [{
    "$type": "imageGen",
    "input": {
      "engine": "openai",
      "model": "dall-e-3",
      "operation": "createImage",
      "prompt": "A majestic fantasy landscape with floating islands",
      "size": "1024x1024",
      "style": "vivid",
      "quality": "hd"
    }
  }]
}
```
| Field | Default | Allowed | Notes |
|---|---|---|---|
| prompt | (required) | ≤ 4 000 chars | DALL·E 3 silently rewrites short or vague prompts; write verbose directives for control. |
| size | (required) | 1024x1024 / 1792x1024 / 1024x1792 | Required field on DALL·E 3 (unlike GPT-Image, which has a default). |
| style | vivid | natural / vivid | vivid pushes hyper-real colours; natural stays closer to the prompt. |
| quality | auto | auto / standard / hd | All quality tiers cost the same Buzz (300 flat per image); hd increases runtime only. |

DALL·E 3 doesn't support editImage or multiple samples — quantity isn't on the schema.

dall-e-2 (legacy)

Use dall-e-2 only for legacy compatibility. Prompts are capped at 1000 characters, outputs are square only, and quality is far below the newer models:

```json
{
  "steps": [{
    "$type": "imageGen",
    "input": {
      "engine": "openai",
      "model": "dall-e-2",
      "operation": "createImage",
      "prompt": "A vintage postcard illustration of a mountain town",
      "size": "512x512",
      "quantity": 1
    }
  }]
}
```
| Field | Allowed | Notes |
|---|---|---|
| size | 256x256 / 512x512 / 1024x1024 | Square only. |
| prompt | ≤ 1000 chars | Much tighter than newer models. |
| quantity | 1–10 | |

Supports editImage with image (a single source string, not an array), but prefer gpt-image-1 or gpt-image-1.5 unless you specifically need DALL·E 2.

Reading the result

All models emit the standard imageGen output:

```json
{
  "status": "succeeded",
  "steps": [{
    "name": "0",
    "$type": "imageGen",
    "status": "succeeded",
    "output": {
      "images": [
        { "id": "blob_...", "url": "https://.../signed.png" }
      ]
    }
  }]
}
```

Blob URLs are signed and expire — refetch the workflow or call GetBlob for a fresh URL.
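A polling sketch for when wait expires before the workflow finishes. The fetch callable is injected so the example stays HTTP-client-agnostic (it should GET the workflow by id and return the parsed JSON body); note that only "succeeded" appears in this page, so the other terminal status names here are assumptions to verify against the workflow API:

```python
import time
from typing import Callable

# "succeeded" is documented above; "failed" and "canceled" are assumed names.
TERMINAL = ("succeeded", "failed", "canceled")

def wait_for_workflow(fetch: Callable[[], dict], timeout: float = 300.0,
                      interval: float = 2.0) -> dict:
    """Poll until the workflow reaches a terminal status or we time out."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        workflow = fetch()
        if workflow.get("status") in TERMINAL:
            return workflow
        time.sleep(interval)
    raise TimeoutError("workflow not finished within timeout")

def image_urls(workflow: dict) -> list[str]:
    """Collect the signed (expiring) blob URLs from every step's output."""
    return [image["url"]
            for step in workflow.get("steps", [])
            for image in step.get("output", {}).get("images", [])]
```

Because the URLs expire, extract and download them promptly after polling rather than storing them long-term.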

Runtime

OpenAI's API queue is the dominant factor — Civitai routes your request straight through. Typical wall times:

| Model | Per-image wall time | wait recommendation |
|---|---|---|
| dall-e-2 | 3–8 s | wait=30 fine |
| dall-e-3 (standard) | 10–20 s | wait=60 fine |
| dall-e-3 (hd) | 15–40 s | wait=60 usually fine |
| gpt-image-1 / 1.5 | 10–30 s | wait=60 fine for quantity: 1; fall back to wait=0 for batches |
| gpt-image-2 | 15–45 s (larger dims take longer) | wait=60 fine for quantity: 1; wait=0 + poll for batches or 4K |

Cost

Billed in Buzz on the workflow's transactions. Use whatif=true for an exact preview; see Payments (Buzz) for currency selection.

gpt-image-2 (size-aware)

Unlike the earlier GPT-Image models, gpt-image-2 prices scale with output dimensions as well as quality. The orchestrator finds the cheapest canonical tier whose dimensions cover your requested width/height (the comparison is rotation-aware, so a 1080 × 1920 request matches the 1920 × 1080 tier), then bills at that tier:

| Dimensions | low | medium | high |
|---|---|---|---|
| 1024 × 768 | 13 | 52 | 195 |
| 1024 × 1024 | 13 | 78 | 286 |
| 1024 × 1536 | 13 | 65 | 221 |
| 1920 × 1080 | 13 | 52 | 208 |
| 2560 × 1440 | 13 | 78 | 299 |
| 3840 × 2160 | 26 | 143 | 533 |

All values are Buzz per image. Final cost is tier × quantity, plus any priority / output-format surcharges applied by the imageGen step. Requests above 3840 × 2160 clamp to that row; requests smaller than 1024 × 768 floor to the cheapest row.

In edit mode with width/height omitted (image_size: "auto"), we estimate cost from the first input image's dimensions.
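A sketch of the tier lookup, useful as a sanity check before calling whatif. One caveat: taken literally, "cheapest covering tier" would not reproduce the published per-tier numbers for every size, so this sketch selects the smallest covering tier by area, which does match the worked cost examples in this section. Treat the whatif=true preview as authoritative:

```python
# Buzz per image for gpt-image-2, keyed by (long side, short side);
# values transcribed from the tier table above.
TIERS = {
    (1024, 768): {"low": 13, "medium": 52, "high": 195},
    (1024, 1024): {"low": 13, "medium": 78, "high": 286},
    (1536, 1024): {"low": 13, "medium": 65, "high": 221},
    (1920, 1080): {"low": 13, "medium": 52, "high": 208},
    (2560, 1440): {"low": 13, "medium": 78, "high": 299},
    (3840, 2160): {"low": 26, "medium": 143, "high": 533},
}

def gpt_image_2_cost(width: int, height: int, quality: str = "high",
                     quantity: int = 1) -> int:
    """Estimate Buzz for a gpt-image-2 request.

    Rotation-aware: orientation is normalised before comparing, so
    1080x1920 bills like 1920x1080. Requests no tier covers clamp to
    the 3840x2160 row, per the note above.
    """
    long_side, short_side = max(width, height), min(width, height)
    covering = [(tl * ts, prices[quality])
                for (tl, ts), prices in TIERS.items()
                if tl >= long_side and ts >= short_side]
    base = min(covering)[1] if covering else TIERS[(3840, 2160)][quality]
    return base * quantity
```

Priority and output-format surcharges applied by the imageGen step are not modelled here.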

Other models (flat per-quality)

total = base × quantity
| Model | quality | Base (Buzz per image) |
|---|---|---|
| gpt-image-1.5 | low | 25 |
| gpt-image-1.5 | medium | 100 |
| gpt-image-1.5 | high (default) | 375 |
| gpt-image-1 | low | 25 |
| gpt-image-1 | medium | 100 |
| gpt-image-1 | high | 375 |
| gpt-image-1 | auto | 300 |
| dall-e-3 / dall-e-2 | (any) | 300 |

Examples:

  • gpt-image-2 high, 1024×1024, quantity: 1 → ~286 Buzz
  • gpt-image-2 high, 1920×1080, quantity: 2 → ~416 Buzz
  • gpt-image-2 medium, 3840×2160, quantity: 1 → ~143 Buzz
  • gpt-image-1.5 high, quantity: 1 → ~375 Buzz
  • dall-e-3 hd, quantity: 1 → ~300 Buzz

For the older models, size, background, and style don't change the Buzz price; only quality (GPT-Image) and quantity do. For gpt-image-2, width / height also affect the price via the tier lookup above.
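The flat formula is easy to mirror client-side. A sketch with the table above baked in; these are estimates only, and whatif=true remains authoritative:

```python
# Flat per-quality Buzz prices transcribed from the table above.
FLAT_BUZZ = {
    ("gpt-image-1.5", "low"): 25, ("gpt-image-1.5", "medium"): 100,
    ("gpt-image-1.5", "high"): 375,
    ("gpt-image-1", "low"): 25, ("gpt-image-1", "medium"): 100,
    ("gpt-image-1", "high"): 375, ("gpt-image-1", "auto"): 300,
}

def flat_cost(model: str, quality: str = "high", quantity: int = 1) -> int:
    """total = base * quantity; both DALL-E models are a flat 300 Buzz per
    image regardless of quality (dall-e-3 always has quantity 1)."""
    if model in ("dall-e-3", "dall-e-2"):
        return 300 * quantity
    return FLAT_BUZZ[(model, quality)] * quantity
```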

Troubleshooting

| Symptom | Likely cause | Fix |
|---|---|---|
| 400 with "size must be one of" | Sent 1536x1024 to DALL·E 3 (it wants 1792x1024) or 1792x1024 to GPT-Image | Match the size enum for the model you're using; the tables above list each model's allowed set. |
| 400 with "style is not a valid property" | Sent style outside DALL·E 3 | Only DALL·E 3 exposes style. |
| 400 with "quantity must be ≤ 4" on gpt-image-1.5 | Using 1.5 with gpt-image-1's quantity expectations | Drop to quantity: 4 or use gpt-image-1 (up to 10). |
| 400 with "prompt too long" on dall-e-2 | DALL·E 2's 1000-char prompt cap | Trim the prompt or move to a newer model. |
| Output is unexpectedly stylised on DALL·E 3 | style: "vivid" default | Set style: "natural" for closer-to-prompt output. |
| Output is PNG when you wanted JPEG | Transparent backgrounds force PNG | Set background: "opaque" if you want JPEG, or leave outputFormat unset and accept what OpenAI returns. |
| Request timed out (wait expired) | Large quantity or hd on DALL·E 3 during busy periods | Resubmit with wait=0 and poll. |
| Step failed, reason = "blocked" | OpenAI's content filter or Civitai moderation | Don't retry the same input; see Errors & retries → Step-level failures. |

Civitai Developer Documentation