Chat completion
chatCompletion routes text (and optionally image) inputs through large language models. Any model available on OpenRouter is supported, plus Civitai-hosted AIR models. The request and response shapes follow the OpenAI Chat Completions API.
Access paths
Two ways to use chat completion, depending on your use case:
| Path | When to use |
|---|---|
POST /v1/chat/completions | Drop-in replacement for the OpenAI API. Accepts stream: true for SSE streaming. |
chatCompletion workflow step | Chain with other steps (imageGen, convertImage, etc.) in a multi-step workflow. |
Both paths share the same input schema and produce the same output format.
Basic text completion
Via the OpenAI-compatible endpoint
POST https://orchestration.civitai.com/v1/chat/completions
Authorization: Bearer <your-token>
Content-Type: application/json
{
"model": "openai/gpt-4o-mini",
"messages": [
{ "role": "system", "content": "You are a helpful assistant." },
{ "role": "user", "content": "What is the capital of France?" }
]
}Via SubmitWorkflow
POST https://orchestration.civitai.com/v2/consumer/workflows?wait=60
Authorization: Bearer <your-token>
Content-Type: application/json
{
"steps": [{
"$type": "chatCompletion",
"input": {
"model": "openai/gpt-4o-mini",
"messages": [
{ "role": "system", "content": "You are a helpful assistant." },
{ "role": "user", "content": "What is the capital of France?" }
]
}
}]
}/v2/consumer/workflowsRequest body — edit to customize (e.g. swap the image URL or prompt)
Vision (image inputs)
Pass images in user message content parts. Any vision-capable model (e.g. openai/gpt-4o, google/gemini-2.0-flash) can process them.
{
"model": "openai/gpt-4o",
"messages": [{
"role": "user",
"content": [
{ "type": "text", "text": "Describe this image in detail." },
{
"type": "image_url",
"image_url": {
"url": "https://image.civitai.com/.../photo.jpeg",
"detail": "auto"
}
}
]
}],
"max_tokens": 300
}detail can be "auto" (default), "low", or "high". The image source can be a public URL, a data URL (data:image/jpeg;base64,...), or raw Base64 — the orchestrator uploads it to blob storage before dispatching the job.
/v2/consumer/workflowsRequest body — edit to customize (e.g. swap the image URL or prompt)
Image generation
Set "modalities": ["image", "text"] on the request to generate images through /v1/chat/completions. The response carries an images array on the assistant message, where each entry is a base64 data URI — the same shape OpenRouter uses, so existing OpenRouter-style SDK code works unmodified.
POST https://orchestration.civitai.com/v1/chat/completions
Authorization: Bearer <your-token>
Content-Type: application/json
{
"model": "google/gemini-2.5-flash-image",
"messages": [
{ "role": "user", "content": "A cat in a teacup, soft window light" }
],
"modalities": ["image", "text"],
"image_config": {
"aspect_ratio": "1:1",
"image_size": "1K"
}
}Response:
{
"id": "chatcmpl-...",
"model": "google/gemini-2.5-flash-image",
"choices": [{
"index": 0,
"message": {
"role": "assistant",
"content": "",
"images": [{
"type": "image_url",
"image_url": { "url": "data:image/png;base64,iVBOR..." }
}]
},
"finish_reason": "stop"
}]
}Image editing (multi-turn)
Pass a prior generated image (or any image URL / data URI) as a content part on a user message and the request routes through the engine's edit operation:
{
"model": "google/gemini-2.5-flash-image",
"messages": [{
"role": "user",
"content": [
{ "type": "text", "text": "Make it a dog instead of a cat." },
{ "type": "image_url", "image_url": { "url": "data:image/png;base64,..." } }
]
}],
"modalities": ["image", "text"]
}Supported models
model | Engine | Operations |
|---|---|---|
google/gemini-2.5-flash-image | Gemini 2.5 Flash Image | create, edit |
openai/gpt-image-1 | OpenAI gpt-image-1 | create, edit |
openai/dall-e-3 | OpenAI DALL·E 3 | create |
openai/dall-e-2 | OpenAI DALL·E 2 | create, edit |
black-forest-labs/flux.2-dev | Flux 2 Dev | create, edit |
black-forest-labs/flux.2-flex | Flux 2 Flex | create, edit |
black-forest-labs/flux.2-pro | Flux 2 Pro | create, edit |
black-forest-labs/flux.2-max | Flux 2 Max | create, edit |
black-forest-labs/flux.2-klein | Flux 2 Klein | create, edit |
The provider prefix (google/, openai/, black-forest-labs/) is optional — short names like gemini-2.5-flash-image, gpt-image-1, flux-2-dev are also accepted. Unknown model names with modalities: ["image"] return 400 with the supported list.
Civitai AIR URNs
Pass a Civitai AIR URN as model to use a community checkpoint. The ecosystem segment of the AIR (sd1, sdxl, flux1, anima) selects the engine; the AIR is forwarded as the checkpoint:
{
"model": "urn:air:sdxl:checkpoint:civitai:101055@128078",
"messages": [{ "role": "user", "content": "A cyberpunk samurai" }],
"modalities": ["image", "text"],
"image_config": { "aspect_ratio": "1:1", "image_size": "1K" }
}| Ecosystem | Engine | Operations | Notes |
|---|---|---|---|
sd1 | SD 1.5 (sd-cpp) | create, variant | Pass an input image to trigger img2img variant. |
sdxl | SDXL (sd-cpp) | create, variant | Same — img2img variant when an input image is supplied. |
flux1 | Flux 1 (sd-cpp) | create, edit | Edit operation accepts up to 2 input images; width/height clamped to 832–1216. |
anima | Anima (sd-cpp) | create | Anima checkpoints; no img2img path through chat-completions. |
Other ecosystems (zimage, qwen, wan, flux2) hardcode their checkpoints — pass the matching named model instead (e.g. flux-2-dev) and use the imageGen workflow step directly when you need to override the checkpoint.
image_config
| Field | Values | Effect |
|---|---|---|
aspect_ratio | 1:1, 16:9, 9:16, 4:3, 3:4, 21:9 | Sets width/height ratio. OpenAI engines snap to their nearest allowed size. Gemini ignores (always 1024×1024). |
image_size | 0.5K, 1K, 2K, 4K | Approximate megapixel target. Engines clamp to their supported range. |
n | 1–10 | Number of images. Falls back to the top-level n. Engines clamp to their supported max. |
For full per-engine knobs (samplers, LoRAs, guidance scales, advanced operations), use the imageGen workflow step directly instead — chat-completions is a thin facade tuned for SDK compatibility, not a full passthrough of every engine parameter.
Multi-turn conversations
Include prior turns as assistant messages to maintain context:
{
"model": "openai/gpt-4o-mini",
"messages": [
{ "role": "system", "content": "You are a concise assistant." },
{ "role": "user", "content": "Write a haiku about the ocean." },
{ "role": "assistant", "content": "Waves crash endlessly,\nSalt..." },
{ "role": "user", "content": "Now write one about mountains." }
],
"temperature": 0.7
}/v2/consumer/workflowsRequest body — edit to customize (e.g. swap the image URL or prompt)
Streaming
Via /v1/chat/completions
Set "stream": true and handle Server-Sent Events (SSE). The response is a stream of data: {...} lines ending with data: [DONE]:
POST https://orchestration.civitai.com/v1/chat/completions
Authorization: Bearer <your-token>
Content-Type: application/json
{
"model": "openai/gpt-4o-mini",
"messages": [{ "role": "user", "content": "Tell me a short story." }],
"stream": true
}Via workflow step
Set stream: true in the step metadata field:
{
"steps": [{
"$type": "chatCompletion",
"metadata": { "stream": true },
"input": {
"model": "openai/gpt-4o-mini",
"messages": [{ "role": "user", "content": "Tell me a short story." }]
}
}]
}When streaming is enabled, the orchestrator stores the raw NDJSON chunks in a streaming blob and assembles them into the standard ChatCompletionOutput shape for the workflow output.
Tool use (function calling)
Define tools as JSON Schema function definitions. The model decides when and how to call them:
{
"model": "openai/gpt-4o",
"messages": [
{ "role": "user", "content": "What is the weather in Paris?" }
],
"tools": [{
"type": "function",
"function": {
"name": "get_weather",
"description": "Get current weather for a city",
"parameters": {
"type": "object",
"properties": {
"city": { "type": "string" }
},
"required": ["city"]
}
}
}],
"tool_choice": "auto"
}When the model calls a tool, the assistant message in the response contains a tool_calls array instead of (or alongside) content. Submit the tool result back as a tool message:
{
"role": "tool",
"tool_call_id": "call_abc123",
"content": "{\"temperature\": 18, \"condition\": \"sunny\"}"
}Model selection
model accepts any string that identifies a model on OpenRouter or a Civitai AIR URI:
| Format | Example | Notes |
|---|---|---|
| OpenRouter ID | openai/gpt-4o-mini | Any model from openrouter.ai/models. |
| OpenAI shorthand | gpt-4o, gpt-4o-mini | OpenRouter also accepts bare OpenAI model names. |
| AIR URI | urn:air:llm:model:civitai:<modelId>@<versionId> | Routes to a Civitai-hosted model. |
Parameters reference
| Field | Default | Notes |
|---|---|---|
model | — ✅ | Model ID (OpenRouter) or AIR URI. |
messages | — ✅ | Array of role-discriminated messages (at least 1). |
temperature | 1 | 0–2. Higher = more random output. |
topP | 1 | 0–1. Nucleus sampling. Alternative to temperature; usually set one or the other. |
maxTokens | null | Max output tokens, 1–128 000. Unlimited when omitted. |
n | 1 | Number of completions to generate, 1–128. |
stop | null | Up to 4 stop sequences. |
presencePenalty | 0 | -2 to 2. Positive values discourage repeating topics. |
frequencyPenalty | 0 | -2 to 2. Positive values discourage repeating exact tokens. |
seed | null | Integer seed for deterministic output (beta). |
user | null | End-user identifier for abuse monitoring. |
logprobs | null | Return log probabilities for generated tokens. |
topLogprobs | null | 0–20. Number of top log-prob candidates per token (requires logprobs: true). |
tools | null | Function definitions available to the model. |
tool_choice | null | "auto", "none", "required", or { "type": "function", "function": { "name": "..." } }. |
chatTemplateKwargs | null | Extra kwargs passed to the model's chat template (vLLM-specific). |
modalities | null | Output modalities. Include "image" to route the request through the image-generation pipeline. See Image generation. |
imageConfig | null | Image-generation parameters (aspect_ratio, image_size, n). Only consulted when modalities includes "image". |
Messages reference
Messages are discriminated by the role field:
system
{ "role": "system", "content": "You are a helpful assistant.", "name": "optional" }user
Content can be a plain string or an array of content parts:
{ "role": "user", "content": "Plain text" }{
"role": "user",
"content": [
{ "type": "text", "text": "What's in this image?" },
{ "type": "image_url", "image_url": { "url": "https://...", "detail": "auto" } }
]
}assistant
{ "role": "assistant", "content": "Prior response text." }Or with tool calls (as returned by the model):
{
"role": "assistant",
"content": null,
"tool_calls": [{
"id": "call_abc123",
"type": "function",
"function": { "name": "get_weather", "arguments": "{\"city\":\"Paris\"}" }
}]
}tool
{
"role": "tool",
"tool_call_id": "call_abc123",
"content": "{\"temperature\": 18}"
}Reading the result
The output is an OpenAI-compatible chat.completion object:
{
"status": "succeeded",
"steps": [{
"name": "0",
"$type": "chatCompletion",
"status": "succeeded",
"output": {
"id": "chatcmpl-...",
"object": "chat.completion",
"created": 1748000000,
"model": "openai/gpt-4o-mini",
"choices": [{
"index": 0,
"message": {
"role": "assistant",
"content": "The capital of France is Paris."
},
"finish_reason": "stop"
}],
"usage": {
"prompt_tokens": 24,
"completion_tokens": 9,
"total_tokens": 33
}
}
}]
}The /v1/chat/completions endpoint returns the output object directly (not wrapped in a workflow envelope).
Cost
Cost depends on whether the model routes through OpenRouter or is a Civitai AIR model.
OpenRouter models
Cost is computed from actual token usage with a 30% margin, converted to Buzz (1 000 Buzz = 1 USD):
buzzCost = actualCostUsd × 1000 × 1.3 (minimum 1 Buzz)Before execution, the orchestrator estimates cost using OpenRouter's published per-token prices. After execution, the final Buzz charge is based on the tokens actually consumed by the model.
Different models have very different per-token prices — check openrouter.ai/models for current pricing. Representative examples:
| Model | Input (per 1M tokens) | Output (per 1M tokens) | Typical single call |
|---|---|---|---|
openai/gpt-4o-mini | $0.15 | $0.60 | < 1 Buzz |
openai/gpt-4o | $2.50 | $10.00 | 2–15 Buzz |
anthropic/claude-3-5-sonnet | $3.00 | $15.00 | 4–20 Buzz |
meta-llama/llama-3.3-70b-instruct | $0.12 | $0.30 | < 1 Buzz |
Use whatif=true on your first request to get an exact preview before committing.
AIR models (Civitai-hosted)
Flat-rate pricing based on image count and number of completions requested:
total = 1 × (imageCount × 2) × nKnown limitation
For text-only requests to AIR models (imageCount = 0), the images factor collapses the product to 0 Buzz. This is a known bug — expect it to be corrected in a future release. For now, AIR model text-only calls cost 0 Buzz.
Runtime
Most chat completions finish in 5–30 seconds depending on model and output length. Use wait=60 for simple requests; add wait=0 + polling for long outputs, large n, or slow models. The /v1/chat/completions endpoint waits up to 60 seconds before timing out with 504.
Troubleshooting
| Symptom | Likely cause | Fix |
|---|---|---|
400 with "messages must not be empty" | Empty messages array | Include at least one message. |
400 with "model is required" | Missing model field | model is always required. |
504 Gateway Timeout (via /v1) | Slow model or long output | Retry with wait=0 via SubmitWorkflow + polling. |
400 with "topLogprobs requires logprobs" | Sent topLogprobs without logprobs: true | Set "logprobs": true alongside topLogprobs. |
| Response truncated mid-sentence | maxTokens reached | Raise maxTokens or omit it to let the model decide. |
| Tool call in response instead of content | Expected behaviour | The model chose to call a tool — feed the tool_calls back as a tool message in the next turn. |
Step failed, reason = "no_provider_available" | AIR model offline or no worker available | Retry shortly. |
Related
SubmitWorkflow— for the workflow-step pathGetWorkflow— for polling- Results & webhooks — production result handling
- Prompt enhancement — uses a
chatCompletion-like step to rewrite image prompts - Image conversion — 1-Buzz utility step to post-process generated images