Chroma / ERNIE / Qwen / Z-Image LoRA training
Five smaller image-LoRA ecosystems share this page: each has its own ecosystem value and base checkpoint, but the request shape is otherwise the AI Toolkit standard.
| ecosystem | Base | Buzz / epoch | Best for |
|---|---|---|---|
| chroma | lodestones/Chroma1-HD | 200 | Chroma community model fine-tunes |
| ernie | baidu/ERNIE-Image | 100 | ERNIE Image LoRAs |
| qwen | Qwen-Image (versioned) | 200 | Qwen Image / Qwen-Image-Edit LoRAs |
| zimageturbo | ostris/Z-Image-De-Turbo (+ Z-Image-Turbo extras) | 100 | Z-Image Turbo LoRAs (cheap, fast inference) |
| zimagebase | Tongyi-MAI/Z-Image | 100 | Z-Image base LoRAs |
Each ecosystem has its own subsection with a runnable example. The shared schema lives in Common parameters; ecosystem-specific quirks are in each subsection.
Long-running step
Always submit with wait=0. These ecosystems run anywhere from ~10s/epoch (Z-Image Turbo) to ~2min/epoch (Chroma/Qwen). See Results & webhooks.
The request shape
```jsonc
{
  "$type": "training",
  "input": {
    "engine": "ai-toolkit",
    "ecosystem": "chroma" // chroma | ernie | qwen | zimageturbo | zimagebase
  }
}
```

Prerequisites
- A Civitai orchestration token (Quick start → Prerequisites)
- A training-data zip (signed R2 URL, Civitai R2 AIR, or any HTTPS URL)
- An accurate count of images in the zip
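Putting the prerequisites together, a minimal submission sketch (the endpoint is the one shown on this page; the token, dataset URL, and helper names are placeholders you would swap for your own):

```python
import json
import urllib.request

# Endpoint from this page; always submit with wait=0 (training is long-running).
API = "https://orchestration.civitai.com/v2/consumer/workflows?wait=0"


def build_training_step(ecosystem: str, source_url: str, count: int,
                        epochs: int = 5) -> dict:
    """Assemble the standard AI Toolkit training step shared by the
    five ecosystems on this page (chroma, ernie, qwen, zimageturbo, zimagebase)."""
    return {
        "$type": "training",
        "priority": "normal",
        "retries": 2,
        "input": {
            "engine": "ai-toolkit",
            "ecosystem": ecosystem,
            "epochs": epochs,
            "trainingData": {"type": "zip", "sourceUrl": source_url, "count": count},
        },
    }


def submit(token: str, step: dict) -> bytes:
    """POST the workflow; returns the raw response body (the workflow envelope)."""
    req = urllib.request.Request(
        API,
        data=json.dumps({"tags": ["training"], "steps": [step]}).encode(),
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:  # network call
        return resp.read()
```

Per-ecosystem fields (resolution, lr, networkDim, samples, etc.) go inside `input`, as in the full examples below.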
Chroma
Trains on the Chroma1-HD base. Uses TextToImageV2Job for sample renders; output LoRA is usable wherever Chroma is supported.
```http
POST https://orchestration.civitai.com/v2/consumer/workflows?wait=0
Authorization: Bearer <your-token>
Content-Type: application/json

{
  "tags": ["training"],
  "steps": [{
    "$type": "training",
    "priority": "normal",
    "retries": 2,
    "input": {
      "engine": "ai-toolkit",
      "ecosystem": "chroma",
      "epochs": 5,
      "resolution": 1024,
      "lr": 0.0001,
      "trainTextEncoder": false,
      "lrScheduler": "cosine",
      "optimizerType": "adamw8bit",
      "networkDim": 16,
      "networkAlpha": 16,
      "trainingData": {
        "type": "zip",
        "sourceUrl": "https://civitai-delivery-worker-prod.5ac0637cfd0766c97916cefa3764fbdf.r2.cloudflarestorage.com/training-images/5418/2382561TrainingData.B6Tr.zip",
        "count": 10
      },
      "samples": {
        "prompts": [
          "woman with red hair, playing chess at the park, dramatic explosion in background",
          "a woman holding a coffee cup, in a beanie, sitting at a cafe",
          "a horse acting as a DJ at a night club, fisheye lens, smoke machine, laser lights"
        ]
      }
    }
  }]
}
```

Edit the request body to customize (e.g. swap the image URL or prompts).
Chroma defaults: networkDim: 16, optimizerType: adamw8bit, trainTextEncoder: false, lrScheduler: cosine. 200 Buzz / epoch.
ERNIE
Trains on Baidu's ERNIE-Image. Comfy-based ecosystem with built-in diffuser. Uses ComfyImageGenJob for sample renders.
```http
POST https://orchestration.civitai.com/v2/consumer/workflows?wait=0
Authorization: Bearer <your-token>
Content-Type: application/json

{
  "tags": ["training"],
  "steps": [{
    "$type": "training",
    "priority": "normal",
    "retries": 2,
    "input": {
      "engine": "ai-toolkit",
      "ecosystem": "ernie",
      "epochs": 5,
      "lr": 0.0001,
      "trainTextEncoder": false,
      "lrScheduler": "cosine",
      "optimizerType": "adamw8bit",
      "networkDim": 32,
      "networkAlpha": 32,
      "trainingData": {
        "type": "zip",
        "sourceUrl": "urn:air:other:other:civitai-r2:civitai-delivery-worker-prod@training-images/7918795/2435272TrainingData.bJ7P.zip",
        "count": 10
      },
      "samples": {
        "prompts": ["a portrait of TOK", "TOK walking through a comic book city"]
      }
    }
  }]
}
```

Edit the request body to customize (e.g. swap the image URL or prompts).
ERNIE defaults: networkDim: 32, optimizerType: adamw8bit, trainTextEncoder: false, lrScheduler: cosine. 100 Buzz / epoch.
Qwen
Trains on Qwen-Image. The version field selects a specific Qwen-Image release:
| version | Base resolved to |
|---|---|
| latest (default) | Qwen/Qwen-Image-Edit-2512 |
| 2509 | urn:air:qwen:checkpoint:civitai:1864281@2110043 |
| 2512 | Qwen/Qwen-Image-Edit-2512 (same as latest) |
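The version-to-base mapping above can be expressed as a simple lookup (values copied from the table; the function name is illustrative, not part of the API):

```python
# Base checkpoints each Qwen "version" value resolves to, per the table above.
QWEN_BASES = {
    "latest": "Qwen/Qwen-Image-Edit-2512",
    "2509": "urn:air:qwen:checkpoint:civitai:1864281@2110043",
    "2512": "Qwen/Qwen-Image-Edit-2512",  # same as latest
}


def resolve_qwen_base(version: str = "latest") -> str:
    """Map the request's version field to the base checkpoint it resolves to."""
    if version not in QWEN_BASES:
        raise ValueError("version must be one of: latest, 2509, 2512")
    return QWEN_BASES[version]
```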
```http
POST https://orchestration.civitai.com/v2/consumer/workflows?wait=0
Authorization: Bearer <your-token>
Content-Type: application/json

{
  "tags": ["training"],
  "steps": [{
    "$type": "training",
    "priority": "normal",
    "retries": 2,
    "input": {
      "engine": "ai-toolkit",
      "ecosystem": "qwen",
      "version": "latest",
      "epochs": 1,
      "resolution": 1024,
      "lr": 0.00011,
      "trainTextEncoder": false,
      "lrScheduler": "cosine",
      "optimizerType": "adamw8bit",
      "networkDim": 16,
      "networkAlpha": 16,
      "trainingData": {
        "type": "zip",
        "sourceUrl": "urn:air:other:other:civitai-r2:civitai-delivery-worker-prod@training-images/3315022/2526079TrainingData.o4S8.zip",
        "count": 10
      },
      "samples": {
        "prompts": [
          "woman with red hair, playing chess at the park, dramatic explosion in background",
          "a woman holding a coffee cup, in a beanie, sitting at a cafe"
        ]
      }
    }
  }]
}
```

Edit the request body to customize (e.g. swap the image URL or prompts).
Qwen defaults: networkDim: 16, optimizerType: adamw8bit, trainTextEncoder: false, lrScheduler: cosine. 200 Buzz / epoch.
Z-Image Turbo
Trains on ostris/Z-Image-De-Turbo and pulls in the original Tongyi-MAI/Z-Image-Turbo as an extras model. Output LoRA is usable in Z-Image generation on the turbo model.
```http
POST https://orchestration.civitai.com/v2/consumer/workflows?wait=0
Authorization: Bearer <your-token>
Content-Type: application/json

{
  "tags": ["training"],
  "steps": [{
    "$type": "training",
    "priority": "normal",
    "retries": 2,
    "input": {
      "engine": "ai-toolkit",
      "ecosystem": "zimageturbo",
      "epochs": 7,
      "resolution": 512,
      "lr": 0.000611,
      "trainTextEncoder": false,
      "lrScheduler": "cosine",
      "optimizerType": "adamw8bit",
      "networkDim": 32,
      "networkAlpha": 32,
      "trainingData": {
        "type": "zip",
        "sourceUrl": "urn:air:other:other:civitai-r2:civitai-delivery-worker-prod@training-images/3315022/2526079TrainingData.o4S8.zip",
        "count": 10
      },
      "samples": {
        "prompts": ["a photo of TOK", "TOK in a garden", "TOK portrait"]
      }
    }
  }]
}
```

Edit the request body to customize (e.g. swap the image URL or prompts).
Z-Image Turbo defaults: networkDim: 32, optimizerType: adamw8bit, trainTextEncoder: false. 100 Buzz / epoch.
Z-Image Base
Trains on Tongyi-MAI/Z-Image. The orchestrator overrides optimizerType to automagic and lr to 0.000001 regardless of what you submit — the input fields are accepted but ignored. Use the Z-Image Turbo recipe instead unless you specifically need a base-model LoRA.
```http
POST https://orchestration.civitai.com/v2/consumer/workflows?wait=0
Authorization: Bearer <your-token>
Content-Type: application/json

{
  "tags": ["training"],
  "steps": [{
    "$type": "training",
    "priority": "normal",
    "retries": 2,
    "input": {
      "engine": "ai-toolkit",
      "ecosystem": "zimagebase",
      "epochs": 7,
      "resolution": 512,
      "lr": 0.000611,
      "trainTextEncoder": false,
      "lrScheduler": "cosine",
      "networkDim": 32,
      "networkAlpha": 32,
      "trainingData": {
        "type": "zip",
        "sourceUrl": "urn:air:other:other:civitai-r2:civitai-delivery-worker-prod@training-images/3315022/2526079TrainingData.o4S8.zip",
        "count": 10
      },
      "samples": {
        "prompts": ["a photo of TOK", "TOK in a garden", "TOK portrait"]
      }
    }
  }]
}
```

Edit the request body to customize (e.g. swap the image URL or prompts).
Z-Image Base defaults: networkDim: 32, optimizerType: automagic (overridden), lr: 0.000001 (overridden), trainTextEncoder: false. 100 Buzz / epoch.
Common parameters
Defaults shown are the post-ApplyDefaults values; per-ecosystem deviations are noted above.
| Field | Required | Default | Notes |
|---|---|---|---|
| engine | ✅ | — | Always ai-toolkit. |
| ecosystem | ✅ | — | One of: chroma, ernie, qwen, zimageturbo, zimagebase. |
| version | (qwen only) | latest | latest, 2509, 2512. Selects the Qwen-Image base release. |
| epochs | | 5 | 1–20. Billed per epoch. |
| numberOfRepeats | | varies (see ecosystem) | 1–5000. ERNIE / Z-Image auto-derive ceil(200 / count); Chroma / Qwen don't auto-set. |
| lr | | 0.0001 | UNet learning rate. |
| trainTextEncoder | | false | All five ecosystems leave the text encoder frozen. |
| lrScheduler | | cosine | constant, constant_with_warmup, cosine, linear, step. |
| optimizerType | | adamw8bit (automagic for Z-Image Base) | Full enum on the SDXL/SD1 page. |
| networkDim | | 32 (16 for Chroma / Qwen) | 1–256. |
| networkAlpha | | matches networkDim | 1–256. |
| noiseOffset | | 0 | 0–1. |
| flipAugmentation | | false | Random horizontal flips. |
| shuffleTokens / keepTokens | | false / 0 | Caption-tag shuffling. |
| triggerWord | | (none) | Activation token. Recommended for character / style LoRAs on Chroma, Z-Image. |
| trainingData.{type, sourceUrl, count} | ✅ | — | type: "zip". |
| samples.prompts[] | | [] | Per-epoch preview prompts rendered with the trained LoRA. |
| samples.negativePrompt | | (none) | — |
Reading the result
Same envelope as the other training recipes — see SDXL/SD1 → Reading the result. Each epoch yields a .safetensors LoRA blob plus any sample images.
The trained LoRA is usable in the corresponding generation recipe — Chroma LoRAs in any Chroma workflow, ERNIE LoRAs in ERNIE image generation, Qwen LoRAs in Qwen image generation, Z-Image LoRAs in Z-Image generation.
Runtime
Per-epoch wall time, default settings on a 10-image dataset:
| Ecosystem | Per-epoch | Typical full run |
|---|---|---|
| chroma | ~60–120 s | 5–15 min for 5 epochs |
| ernie | ~30–60 s | 3–8 min for 5 epochs |
| qwen | ~60–120 s | 5–15 min for 5 epochs |
| zimageturbo | ~10–25 s | 1–4 min for 7 epochs |
| zimagebase | ~10–25 s | 1–4 min for 7 epochs |
Always use wait=0.
Cost
total = costPerEpoch × epochs

| Ecosystem | Buzz / epoch | epochs: 5 | epochs: 10 |
|---|---|---|---|
| chroma | 200 | 1000 | 2000 |
| ernie | 100 | 500 | 1000 |
| qwen | 200 | 1000 | 2000 |
| zimageturbo | 100 | 500 | 1000 |
| zimagebase | 100 | 500 | 1000 |
Sample-prompt rendering is billed separately at each ecosystem's image-generation rate. Submit with whatif=true to confirm exact charges before committing.
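The cost formula and the per-epoch rates from the table can be combined into a small estimator (this covers the training charge only, not the separately billed sample renders):

```python
# Buzz per epoch, from the cost table above.
COST_PER_EPOCH = {
    "chroma": 200,
    "ernie": 100,
    "qwen": 200,
    "zimageturbo": 100,
    "zimagebase": 100,
}


def training_cost(ecosystem: str, epochs: int) -> int:
    """total = costPerEpoch × epochs; epochs must be within the 1–20 limit."""
    if not 1 <= epochs <= 20:
        raise ValueError("epochs must be 1-20")
    return COST_PER_EPOCH[ecosystem] * epochs
```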
Troubleshooting
| Symptom | Likely cause | Fix |
|---|---|---|
| 400 with "ecosystem unknown" | Typo, or not one of chroma / ernie / qwen / zimageturbo / zimagebase | Check spelling. |
| 400 with "version not allowed" (Qwen only) | version not one of latest / 2509 / 2512 | Use one of the listed values. |
| Z-Image Base: optimizerType you set seems ignored | Intentional — ApplyDefaults overrides to automagic | Use Z-Image Turbo if you need full optimizer control. |
| Trained LoRA underbaked | Too few epochs / too low lr | Raise epochs to 8–15 (these ecosystems often need more epochs than SDXL); keep lr ≤ 5e-4. |
| Trained LoRA overcooked | Too many epochs or networkDim too high | Drop networkDim to 16, lower epochs. |
| Step failed, moderationStatus: "Rejected" | Dataset failed content moderation | Replace flagged images. |
Related
- SDXL & SD1 LoRA training — classic Stable Diffusion ecosystems
- Flux 1 LoRA training / Flux 2 Klein LoRA training — Flux family
- Wan video LoRA training / LTX2 video LoRA training — video LoRAs
- Generation recipes for these ecosystems: Z-Image, Qwen, ERNIE
- Results & webhooks
- SubmitWorkflow / GetWorkflow — endpoint OpenAPI spec