
WAN video generation

WAN is Alibaba's open video-generation model family. The orchestrator exposes every shipped version, across multiple providers, under a single videoGen step. This recipe walks through the full surface: which version to pick, which provider to route to, and how to invoke each operation.

Versions at a glance

| Version | Providers | Operations | Notes |
|---|---|---|---|
| v2.7 | fal | text-to-video, image-to-video, reference-to-video, edit-video | Current flagship on FAL. Adds edit-video. |
| v2.6 | fal | text-to-video, image-to-video, reference-to-video | FAL production default for new integrations. |
| v2.5 | fal | text-to-video, image-to-video | Still supported; fewer operations than 2.6/2.7. |
| v2.2 | fal, comfy | text-to-video, image-to-video | Only version with a native ComfyUI path. Supports LoRAs + Turbo mode. |
| v2.1 | fal, civitai | text-to-video, image-to-video | Legacy — prefer 2.6+ unless you specifically need Civitai-hosted inference. |

Default choice for new integrations: version: "v2.6", provider: "fal".

The request shape

Every WAN request is a single videoGen step on SubmitWorkflow. Four keys select which WAN variant runs:

```json
{
  "$type": "videoGen",
  "input": {
    "engine":    "wan",
    "version":   "v2.6",         // 2.1 | 2.2 | 2.5 | 2.6 | 2.7
    "provider":  "fal",          // fal | comfy | civitai (version-dependent)
    "operation": "text-to-video" // see table above
  }
}
```

The orchestrator dispatches to the matching input schema (Wan26FalTextToVideoInput, Wan22ComfyVideoGenInput, etc.), so only the fields valid for that combination are accepted — SubmitWorkflow will 400 on unknown ones.
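Because validity depends on the full (version, provider, operation) triple, a thin client-side guard that mirrors the "Versions at a glance" table can fail fast before a round-trip. This is our own sketch, not part of any SDK; the table data is from this page.

```python
# Mirror of the "Versions at a glance" table, keyed by (version, provider).
WAN_OPERATIONS = {
    ("v2.7", "fal"): {"text-to-video", "image-to-video", "reference-to-video", "edit-video"},
    ("v2.6", "fal"): {"text-to-video", "image-to-video", "reference-to-video"},
    ("v2.5", "fal"): {"text-to-video", "image-to-video"},
    ("v2.2", "fal"): {"text-to-video", "image-to-video"},
    ("v2.2", "comfy"): {"text-to-video", "image-to-video"},
    ("v2.1", "fal"): {"text-to-video", "image-to-video"},
    ("v2.1", "civitai"): {"text-to-video", "image-to-video"},
}

def check_wan_combo(version: str, provider: str, operation: str) -> None:
    """Fail fast on combos the orchestrator would reject with a 400."""
    operations = WAN_OPERATIONS.get((version, provider))
    if operations is None:
        raise ValueError(f"WAN {version} is not served by provider {provider!r}")
    if operation not in operations:
        raise ValueError(f"{operation!r} is not available on WAN {version} via {provider}")
```

Run it on your payload's input block before calling SubmitWorkflow; anything it rejects would have 400'd anyway.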

Operations

All examples target production and use <your-token> in place of your Bearer token. The request timeout is 100 s, so wait is capped accordingly. See Results & webhooks for anything longer.

text-to-video

Prompt → video. The most common operation; supported on every WAN version.

```http
POST https://orchestration.civitai.com/v2/consumer/workflows?whatif=false&wait=60
Authorization: Bearer <your-token>
Content-Type: application/json

{
  "steps": [{
    "$type": "videoGen",
    "input": {
      "engine": "wan",
      "version": "v2.6",
      "provider": "fal",
      "operation": "text-to-video",
      "prompt": "A serene forest with sunlight filtering through the trees, cinematic quality",
      "resolution": "1080p",
      "aspectRatio": "16:9",
      "duration": 5,
      "enablePromptExpansion": true
    }
  }]
}
```

image-to-video

One or more source images animate into a clip. Supported on every version.

```json
{
  "engine": "wan",
  "version": "v2.6",
  "provider": "fal",
  "operation": "image-to-video",
  "images": [
    "https://image.civitai.com/.../19325406.jpeg"
  ],
  "prompt": "A dancing cat moving gracefully",
  "resolution": "1080p",
  "duration": 5
}
```

v2.7 image-to-video uses startImage + endImage (not images[]). Pass startImage to seed the first frame and optionally endImage to constrain the last frame (useful for loops and transitions). The images[] array accepted by v2.6 is not available on v2.7.

reference-to-video (v2.6, v2.7)

Pass one or more reference videos; refer to them from the prompt via @Video1, @Video2, @Video3 to transfer subjects / motion / style.

```json
{
  "engine": "wan",
  "version": "v2.6",
  "provider": "fal",
  "operation": "reference-to-video",
  "referenceVideoUrls": [
    "https://example.com/reference.mp4"
  ],
  "prompt": "@Video1 is walking through a beautiful garden",
  "resolution": "1080p",
  "aspectRatio": "16:9",
  "duration": 5
}
```

Reference video URL

The example above uses https://example.com/reference.mp4 as a placeholder — replace with a real publicly fetchable video URL before submitting.

edit-video (v2.7 only)

Input video + prompt → transformed video. Preserves timing; rewrites content.

```json
{
  "engine": "wan",
  "version": "v2.7",
  "provider": "fal",
  "operation": "edit-video",
  "videoUrl": "https://example.com/input.mp4",
  "prompt": "Transform the scene into a cyberpunk aesthetic with neon lighting",
  "resolution": "1080p",
  "audioSetting": "auto"
}
```

Source video URL

Replace https://example.com/input.mp4 with a real publicly fetchable video URL before submitting.

Common parameters

These appear on most (version, operation) combinations; the schema for your chosen variant is the source of truth.

| Field | Typical values | Notes |
|---|---|---|
| resolution | 720p, 1080p | 1080p costs more and takes longer. |
| aspectRatio | 16:9, 9:16, 1:1 | Vertical for reels/shorts. |
| duration | 5, 10 (seconds) | Longer clips push you past the 100 s wait cap — use webhooks. |
| enablePromptExpansion | true \| false | Let the model expand short prompts. Disable for reproducibility. |
| enableSafetyChecker | true (default) | Disable only if you handle moderation yourself. |
| audioUrl / audioSetting | URL or auto | Attach background audio (2.6+) or drive audio inference (2.7 edit). |

Provider-specific features

FAL (all versions)

Hosted inference with low queue time. FAL is the production default. enablePromptExpansion and audio attachment only exist on FAL variants.

Comfy (v2.2 only)

Runs on Civitai's ComfyUI workers. Three features aren't available on FAL:

  • LoRAs via the loras array with AIR identifiers:
    ```json
    "loras": [{ "air": "urn:air:lora:civitai:123456@789012", "strength": 0.8 }]
    ```
  • Turbo mode (useTurbo: true) + frame-interpolator models (interpolatorModel: "film") for faster runs at lower quality
  • Multi-step workflows — chain videoGen → videoInterpolation → videoUpscaler in one steps array

Civitai (v2.1 only)

Legacy self-hosted path. Accepts explicit model AIRs and width/height instead of resolution/aspectRatio. Migrate to FAL 2.6+ unless you have a specific reason.

Reading the result

On success each videoGen step emits a single video blob:

```json
{
  "id": "wf_01HXYZ...",
  "status": "succeeded",
  "steps": [{
    "name": "0",
    "$type": "videoGen",
    "status": "succeeded",
    "output": {
      "video": { "id": "blob_...", "url": "https://.../signed.mp4" }
    }
  }]
}
```

Blob URLs are signed and expire — refetch the workflow or call GetBlob for a fresh URL; don't cache them long-term. Download and store the bytes yourself if you need durable storage.

Long-running jobs

WAN jobs routinely run longer than 100 s (any 1080p clip ≥ 10 s; reference-to-video; edit-video). The 100-second request timeout means wait is capped — use wait=90 for a best-effort inline attempt, then fall back to:

  • Webhooks (preferred): register a callback with type: ["workflow:succeeded", "workflow:failed"] — see Results & webhooks
  • Polling: GET /v2/consumer/workflows/{workflowId} on a 5 s → 10 s → 30 s cadence
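The polling fallback can be sketched as below. `fetch` stands in for your HTTP client (an assumption, not part of the API), and the cadence matches the 5 s → 10 s → 30 s suggestion above.

```python
import itertools
import time

def poll_delays():
    """The 5 s -> 10 s -> 30 s cadence suggested above (30 s thereafter)."""
    yield from (5, 10)
    yield from itertools.repeat(30)

def poll_workflow(fetch, workflow_id: str, max_attempts: int = 20) -> dict:
    """Poll GET /v2/consumer/workflows/{workflow_id} until a terminal status.

    `fetch(workflow_id)` must return the parsed workflow JSON; wire it to
    whatever HTTP client you use.
    """
    for _, delay in zip(range(max_attempts), poll_delays()):
        workflow = fetch(workflow_id)
        if workflow["status"] in ("succeeded", "failed"):
            return workflow
        time.sleep(delay)
    raise TimeoutError(f"{workflow_id} still processing after {max_attempts} polls")
```

Prefer the webhook route in production; polling is a fallback for environments that can't receive callbacks.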

Troubleshooting

| Symptom | Likely cause | Fix |
|---|---|---|
| 400 with unknown field | Field isn't valid for this (version, provider, operation) combo | Check the specific Wan<X><Provider><Op>Input schema via SubmitWorkflow. |
| Step failed, error.code = "no_provider" | No capacity for that resolution/duration on the chosen provider | Retry, drop to 720p, or switch provider. |
| workflow:processing after wait=90 returns | Job ran past the 100 s timeout | Expected — continue via webhook or poll. |
| Blob URL 403 after a few minutes | Signed URL expired | Refetch the workflow to get a fresh URL. |
| Reference prompt ignored | @VideoN tokens missing or misnumbered | Tokens are 1-indexed and must match items in referenceVideoUrls. |
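The last symptom in the table is easy to guard against client-side; a small sketch (the regex and helper are our own):

```python
import re

def check_reference_tokens(prompt: str, reference_video_urls: list[str]) -> None:
    """Raise if any @VideoN token in the prompt lacks a matching (1-indexed)
    entry in referenceVideoUrls."""
    for n in (int(m) for m in re.findall(r"@Video(\d+)", prompt)):
        if not 1 <= n <= len(reference_video_urls):
            raise ValueError(f"@Video{n} has no matching entry in referenceVideoUrls")
```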

Cost

Billed in Buzz on the workflow's transactions. Use whatif=true for an exact preview; see Payments (Buzz) for currency selection.

WAN video pricing varies by version, provider, resolution, and acceleration flags. Figures below are per-video totals unless explicitly stated as per-second rates.

v2.7 (FAL)

Flat per-second across resolutions:

total = 130 × duration_seconds
  • 5 s → ~650 Buzz
  • 10 s → ~1 300 Buzz
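The v2.7 rate reduces to a one-line helper — a budgeting sketch, not an official SDK call:

```python
def wan27_fal_buzz(duration_seconds: int) -> int:
    """Flat v2.7 FAL rate from above: 130 Buzz per second, any resolution."""
    return 130 * duration_seconds
```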

v2.6 (FAL)

Resolution-scaled per-second:

| Resolution | Buzz per second |
|---|---|
| 720p | 130 |
| 1080p | 195 |

  • 720p × 5 s → ~650 Buzz
  • 1080p × 5 s → ~975 Buzz

v2.5 (FAL)

Resolution-scaled per-second:

total = 100 × resolutionFactor × duration

with resolutionFactor = 1 (480p) / 2 (720p) / 3 (1080p).

  • 720p × 5 s → 1 000 Buzz
  • 720p × 10 s → 2 000 Buzz
  • 480p × 5 s → 500 Buzz
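The v2.5 formula (v2.1 below uses the same shape) as a runnable sketch; the helper name is ours:

```python
# resolutionFactor from above: 1 (480p), 2 (720p), 3 (1080p).
RESOLUTION_FACTOR = {"480p": 1, "720p": 2, "1080p": 3}

def wan25_fal_buzz(resolution: str, duration_seconds: int) -> int:
    """total = 100 x resolutionFactor x duration (v2.5 FAL)."""
    return 100 * RESOLUTION_FACTOR[resolution] * duration_seconds
```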

v2.2 (FAL, text-to-video / image-to-video)

Driven by Turbo / LoRA / standard, plus resolution:

| Mode | 720p | 580p | 480p |
|---|---|---|---|
| Turbo | ~130 / video (flat) | ~97.5 | ~65 |
| With LoRA | ~94.9 × duration | ~94.9 × duration | ~94.9 × duration |
| Standard | ~104 × duration | ~78 × duration | ~52 × duration |

Typical: 720p Turbo 5 s → ~130; 720p standard 5 s → ~520; 720p LoRA 5 s → ~475.

v2.2-5b (FAL)

| Mode | Buzz per video |
|---|---|
| Fast-wan 720p | ~32.5 |
| Distill | ~75.9 |
| Standard | ~142.35 |
| Image-to-video | ~142.35 (flat) |

v2.2 (Comfy, Civitai-hosted)

Variable per-pixel-per-step formula with an 8× markup, minimum 100 Buzz, rounded up to the nearest 25:

```
areaCost = max(a × width × height + b, 0)    // per-frame per-step compute factor
duration = length × steps × areaCost
buzz     = max(100, ceil((duration × 420/3600 × 8) / 25) × 25)
```

Where (a, b) is (1.22e-6, -0.14) for image-to-video, (2.53e-7, -0.0259) for text-to-video. This path is noticeably more expensive per second than the FAL routes — FAL is the default for a reason.
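For budgeting the Comfy path, the formula above translates directly to Python. Here length is the frame count and steps the sampler steps, per the formula's own names; the concrete sample values in the test are our assumptions, not defaults.

```python
import math

# (a, b) pairs from above, keyed by the two Comfy operations.
COMFY_AB = {
    "text-to-video": (2.53e-7, -0.0259),
    "image-to-video": (1.22e-6, -0.14),
}

def wan22_comfy_buzz(operation: str, width: int, height: int,
                     length: int, steps: int) -> int:
    """max(100, ceil((length*steps*areaCost * 420/3600 * 8) / 25) * 25)."""
    a, b = COMFY_AB[operation]
    area_cost = max(a * width * height + b, 0.0)   # per-frame per-step compute factor
    duration = length * steps * area_cost
    return max(100, math.ceil((duration * 420 / 3600 * 8) / 25) * 25)
```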

v2.1 (legacy)

total = 100 × resolutionFactor × duration

with resolutionFactor = 1 (480p) / 2 (720p) / 3 (1080p). 720p × 5 s → ~1 000 Buzz.

Quick reference

For new integrations on v2.6 / v2.7 at 720p × 5 s with no LoRAs, expect ~650–975 Buzz per video. Always whatif=true before long-duration / high-res submissions — costs scale linearly with duration and can escalate fast.

  • SubmitWorkflow — operation used by every example here
  • GetWorkflow — for polling
  • Results & webhooks — production-ready result handling
  • Full parameter catalog: the Wan<version><Provider><Operation>Input schemas in the API reference (e.g. Wan26FalTextToVideoInput, Wan27FalEditVideoInput)
  • videoGen endpoint OpenAPI spec — standalone OpenAPI 3.1 YAML covering the full videoGen surface (WAN, LTX2, Flux, etc.); import into Postman / OpenAPI Generator

Civitai Developer Documentation