Kling video generation

Kuaishou's Kling model family, available in two generations through the videoGen step:

| engine | Models | Notes |
| --- | --- | --- |
| kling | v1, v1.5, v1.6, v2, v2.5-turbo | Original Kling. Text-to-video and image-to-video. |
| kling-v3 | (version-agnostic) | Kling V3. Five operations including video-to-video and reference-to-video. Duration in seconds (3–15). |

Default choice for new integrations: engine: "kling-v3" with operation: "text-to-video". For faster, cheaper generations, use mode: "Standard"; for the highest quality, use mode: "Professional".

All Kling jobs exceed the 100-second timeout — always submit with wait=0 and handle results via webhooks or polling.

Kling (original)

Text-to-video

http
POST https://orchestration.civitai.com/v2/consumer/workflows?wait=0
Authorization: Bearer <your-token>
Content-Type: application/json

{
  "steps": [{
    "$type": "videoGen",
    "input": {
      "engine": "kling",
      "model": "v2.5-turbo",
      "prompt": "A serene mountain lake at dawn with mist rolling over the water",
      "aspectRatio": "16:9",
      "duration": "5"
    }
  }]
}
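
The same request can be sketched in Python with only the standard library. The helper name `build_submit_request` is an illustrative assumption, not part of any Civitai SDK; the URL, headers, and body mirror the HTTP example above. The request is only constructed here, not sent.

```python
import json
import urllib.request

WORKFLOWS_URL = "https://orchestration.civitai.com/v2/consumer/workflows"


def build_submit_request(token: str, step_input: dict) -> urllib.request.Request:
    """Build (but do not send) a POST that submits one videoGen step with wait=0."""
    body = {"steps": [{"$type": "videoGen", "input": step_input}]}
    return urllib.request.Request(
        url=f"{WORKFLOWS_URL}?wait=0",  # wait=0: do not block on the result
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_submit_request(
    "<your-token>",
    {
        "engine": "kling",
        "model": "v2.5-turbo",
        "prompt": "A serene mountain lake at dawn with mist rolling over the water",
        "aspectRatio": "16:9",
        "duration": "5",  # string enum on the original kling engine
    },
)
# To submit, pass `request` to urllib.request.urlopen and parse the JSON response.
```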

Image-to-video

Pass sourceImage (URL, data URL, or Base64) to animate a start frame:

json
{
  "engine": "kling",
  "model": "v1.6",
  "prompt": "The subject slowly turns to face the camera",
  "sourceImage": "https://image.civitai.com/.../photo.jpeg",
  "aspectRatio": "16:9",
  "duration": "5",
  "mode": "Standard"
}

Parameters

| Field | Default (✅ = required) | Notes |
| --- | --- | --- |
| engine | — ✅ | "kling" |
| model | — ✅ | "v1" / "v1.5" / "v1.6" / "v2" / "v2.5-turbo" |
| prompt | — ✅ | Generation prompt. |
| negativePrompt | null | What to avoid. |
| mode | "Standard" | "Standard" or "Professional". Affects quality and cost for v1/v1.5/v1.6. Ignored for v2/v2.5-turbo. |
| aspectRatio | "16:9" | "16:9", "9:16", "1:1" |
| duration | "5" | "5" or "10" (seconds). String enum. |
| cfgScale | 0.5 | 0–1. Prompt adherence. |
| sourceImage | null | URL / data URL / Base64. Enables image-to-video. |
| cameraControl | null | Fine camera motion; see Camera control below. |
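
A client-side check can catch most 400 responses before a job is submitted. The following is a minimal sketch that encodes the constraints from the table above; `check_kling_input` is a hypothetical helper, and the server remains the authority on what is valid.

```python
KLING_MODELS = ("v1", "v1.5", "v1.6", "v2", "v2.5-turbo")


def check_kling_input(step_input: dict) -> list[str]:
    """Return the problems found in an original-kling input (empty list if none)."""
    problems = []
    if step_input.get("engine") != "kling":
        problems.append('engine must be "kling"')
    if step_input.get("model") not in KLING_MODELS:
        problems.append("model must be one of: " + ", ".join(KLING_MODELS))
    if not step_input.get("prompt"):
        problems.append("prompt is required")
    # Omitted optional fields fall back to the defaults listed in the table.
    if step_input.get("mode", "Standard") not in ("Standard", "Professional"):
        problems.append('mode must be "Standard" or "Professional"')
    if step_input.get("aspectRatio", "16:9") not in ("16:9", "9:16", "1:1"):
        problems.append('aspectRatio must be "16:9", "9:16", or "1:1"')
    if step_input.get("duration", "5") not in ("5", "10"):
        problems.append('duration must be the string "5" or "10"')
    cfg = step_input.get("cfgScale", 0.5)
    if not isinstance(cfg, (int, float)) or not 0 <= cfg <= 1:
        problems.append("cfgScale must be a number from 0 to 1")
    return problems
```

An integer duration such as `5` is reported as a problem here, which matches the string-enum requirement of this engine.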

Cost

| Model | 5 s | 10 s |
| --- | --- | --- |
| v1 / v1.5 / v1.6 Standard | 600 | 1 200 |
| v1 / v1.5 / v1.6 Professional | 1 050 | 2 100 |
| v2 | 1 200 | 2 400 |
| v2.5-turbo | 600 | 1 200 |
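
For budgeting, the table can be turned into a small lookup. This is a sketch only: `kling_cost` is a hypothetical helper, the values are copied from the table above in whatever unit the table uses, and mode is ignored for v2 and v2.5-turbo as described under Parameters.

```python
# Costs copied from the table above, keyed by pricing group and duration string.
KLING_COSTS = {
    "v1-family Standard": {"5": 600, "10": 1200},
    "v1-family Professional": {"5": 1050, "10": 2100},
    "v2": {"5": 1200, "10": 2400},
    "v2.5-turbo": {"5": 600, "10": 1200},
}


def kling_cost(model: str, duration: str = "5", mode: str = "Standard") -> int:
    """Look up the listed cost of one original-kling generation."""
    if model in ("v1", "v1.5", "v1.6"):
        group = f"v1-family {mode}"
    elif model in ("v2", "v2.5-turbo"):
        group = model  # mode does not affect these models
    else:
        raise ValueError(f"unknown model: {model!r}")
    return KLING_COSTS[group][duration]
```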

Camera control

Available on all models. Provide a cameraControl object with a config sub-object containing any of these axes (all -10 to 10, default null = no control):

| Axis | Effect |
| --- | --- |
| horizontal | Translate left (−) / right (+) |
| vertical | Translate down (−) / up (+) |
| pan | Rotate left (−) / right (+) around Y axis |
| tilt | Rotate down (−) / up (+) around X axis |
| roll | Counter-clockwise (−) / clockwise (+) around Z axis |
| zoom | Narrow FOV (−) / widen FOV (+) |

json
{
  "cameraControl": {
    "config": { "zoom": -3, "pan": 2 }
  }
}
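
A small builder can enforce the axis names and the -10 to 10 range before the object is sent. This is a sketch under the constraints stated above; `camera_control` is a hypothetical helper name.

```python
CAMERA_AXES = ("horizontal", "vertical", "pan", "tilt", "roll", "zoom")


def camera_control(**axes) -> dict:
    """Build a cameraControl fragment, rejecting unknown axes and out-of-range values."""
    config = {}
    for name, value in axes.items():
        if name not in CAMERA_AXES:
            raise ValueError(f"unknown camera axis: {name!r}")
        if value is None:
            continue  # null means no control on this axis, so leave it out
        if not -10 <= value <= 10:
            raise ValueError(f"{name} must be between -10 and 10, got {value}")
        config[name] = value
    return {"cameraControl": {"config": config}}


fragment = camera_control(zoom=-3, pan=2)
```

The resulting `fragment` can be merged into the step input alongside `engine`, `model`, and `prompt`.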

Kling V3 (engine: "kling-v3")

Kling V3 introduces a richer operation set via the operation discriminator.

Operations

| operation | Description | Key inputs |
| --- | --- | --- |
| text-to-video | Generate from a text prompt | prompt |
| image-to-video | Animate a start frame (optionally to an end frame) | sourceImage, optionally endImage |
| reference-to-video | Stylize video from reference images | images[] |
| video-to-video-edit | Edit an existing video guided by a prompt | videoUrl |
| video-to-video-reference | Reference an existing video's motion/structure | videoUrl, optionally images[] |
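
The key inputs per operation can be checked before submission. The sketch below treats every non-optional key input in the table as required, which is an assumption for reference-to-video, and it assumes the `images[]` column maps to a JSON key named `images`. `missing_inputs` is a hypothetical helper.

```python
# Non-optional key inputs per operation, taken from the table above.
KEY_INPUTS = {
    "text-to-video": ["prompt"],
    "image-to-video": ["sourceImage"],
    "reference-to-video": ["images"],  # assumed JSON key for images[]
    "video-to-video-edit": ["videoUrl"],
    "video-to-video-reference": ["videoUrl"],
}


def missing_inputs(step_input: dict) -> list[str]:
    """List the key inputs that are absent or empty for the chosen operation."""
    operation = step_input.get("operation", "text-to-video")  # documented default
    if operation not in KEY_INPUTS:
        raise ValueError(f"unknown operation: {operation!r}")
    return [field for field in KEY_INPUTS[operation] if not step_input.get(field)]
```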

Text-to-video

json
{
  "engine": "kling-v3",
  "operation": "text-to-video",
  "prompt": "A timelapse of a flower blooming in a sunlit meadow",
  "aspectRatio": "16:9",
  "duration": 5,
  "mode": "Standard"
}

Image-to-video

json
{
  "engine": "kling-v3",
  "operation": "image-to-video",
  "prompt": "The cat stretches and yawns, then looks directly into the camera",
  "sourceImage": "https://image.civitai.com/.../photo.jpeg",
  "aspectRatio": "16:9",
  "duration": 5
}

Add endImage to interpolate between a start frame and an end frame:

json
{
  "engine": "kling-v3",
  "operation": "image-to-video",
  "prompt": "Smooth cinematic transition",
  "sourceImage": "https://.../start.jpeg",
  "endImage":   "https://.../end.jpeg",
  "duration": 5
}

Placeholder URLs

The first-last-frame example uses placeholder URLs for sourceImage and endImage. Replace them with publicly accessible image URLs before submitting.

Video-to-video

Edit an existing video, or use it as a motion reference:

json
{
  "engine": "kling-v3",
  "operation": "video-to-video-edit",
  "prompt": "Transform the scene into a vintage 1970s film aesthetic with grain",
  "videoUrl": "https://example.com/input.mp4",
  "duration": 5,
  "mode": "Standard"
}

Use video-to-video-reference to guide generation from a video's motion without directly editing it.

Multi-prompt (Kling V3)

multiPrompt lets you sequence different prompts across a video timeline. Each entry has a prompt and a duration (seconds that prompt controls):

json
{
  "engine": "kling-v3",
  "operation": "text-to-video",
  "prompt": "Base scene description",
  "multiPrompt": [
    { "prompt": "The camera slowly pushes in on the subject", "duration": 3 },
    { "prompt": "The subject looks up and the scene brightens", "duration": 4 }
  ]
}
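
When segments are assembled programmatically, it helps to track the total time they cover. The sketch below builds the multiPrompt array from (prompt, seconds) pairs and returns the summed duration. `multi_prompt` is a hypothetical helper, and whether the total must match the top-level duration or stay within 3–15 seconds is not stated here, so the function only reports it.

```python
def multi_prompt(segments):
    """Build a multiPrompt list from (prompt, seconds) pairs.

    Returns the list and the total number of seconds the segments cover.
    """
    entries = []
    total = 0
    for prompt, seconds in segments:
        if not prompt:
            raise ValueError("each segment needs a prompt")
        if type(seconds) is not int or seconds <= 0:
            # Assumption: durations are positive whole seconds, as in the example above.
            raise ValueError(f"duration must be a positive integer, got {seconds!r}")
        entries.append({"prompt": prompt, "duration": seconds})
        total += seconds
    return entries, total


entries, total = multi_prompt([
    ("The camera slowly pushes in on the subject", 3),
    ("The subject looks up and the scene brightens", 4),
])
```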

Audio generation (Kling V3)

Set generateAudio: true to produce a synchronized audio track. Optionally provide voiceIds to use a specific voice profile:

json
{
  "generateAudio": true,
  "voiceIds": ["voice_abc123"]
}

For video-to-video operations, keepAudio: true (default) preserves the original video's audio.

Parameters (Kling V3)

| Field | Default (✅ = required) | Notes |
| --- | --- | --- |
| engine | — ✅ | "kling-v3" |
| operation | "text-to-video" | See operations table above. |
| prompt | — ✅ | Generation prompt. |
| mode | "Standard" | "Standard" or "Professional". |
| duration | 5 | 3–15 seconds (integer, unlike the original kling engine). |
| aspectRatio | "16:9" | "16:9", "9:16", "1:1" |
| sourceImage | null | Start frame for image-to-video. |
| endImage | null | End frame for first-last-frame interpolation. |
| images[] | [] | Reference images for reference-to-video. |
| videoUrl | null | Source video for video-to-video-* operations. |
| generateAudio | false | Generate a synchronized audio track. |
| voiceIds | null | Voice profile IDs for audio generation. |
| keepAudio | true | Preserve source video audio in video-to-video operations. |
| multiPrompt[] | null | Time-sequenced prompts { prompt, duration }. |
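
Because the two engines encode duration differently, code that targets both can route the value through one conversion step. A minimal sketch, with `duration_for` as a hypothetical helper name:

```python
def duration_for(engine: str, seconds: int):
    """Return `seconds` in the form the given engine expects.

    kling    -> the string "5" or "10"
    kling-v3 -> an integer from 3 to 15
    """
    if type(seconds) is not int:
        raise TypeError("seconds must be an int")
    if engine == "kling":
        if seconds not in (5, 10):
            raise ValueError("kling supports only 5 or 10 seconds")
        return str(seconds)
    if engine == "kling-v3":
        if not 3 <= seconds <= 15:
            raise ValueError("kling-v3 supports 3 to 15 seconds")
        return seconds
    raise ValueError(f"unknown engine: {engine!r}")
```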

Cost (Kling V3)

Cost scales linearly with duration. All costs are in Buzz per second:

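| Operation group | Mode | Audio | Buzz/s |
| --- | --- | --- | --- |
| t2v / i2v / ref | Standard | No | 219 |
| t2v / i2v / ref | Standard | Yes | 292 |
| t2v / i2v / ref | Professional | No | 292 |
| t2v / i2v / ref | Professional | Yes | 364 |
| v2v-edit / v2v-ref | Standard | — | 328 |
| v2v-edit / v2v-ref | Professional | — | 437 |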

Examples at duration: 5:

| Scenario | Buzz |
| --- | --- |
| Standard t2v, no audio, 5 s | ~1 095 |
| Standard t2v, with audio, 5 s | ~1 460 |
| Professional t2v, no audio, 5 s | ~1 460 |
| Professional t2v, with audio, 5 s | ~1 820 |
| Standard video-to-video, 5 s | ~1 640 |
| Professional video-to-video, 5 s | ~2 185 |
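
The per-second rates and the worked examples can be reproduced with a short estimator. This is a sketch: `kling_v3_cost` is a hypothetical helper, and it assumes the audio setting does not change the video-to-video rate, since the rate table lists no audio variants for those operations.

```python
# Buzz per second, copied from the rate table above.
GENERATION_RATES = {
    ("Standard", False): 219,
    ("Standard", True): 292,
    ("Professional", False): 292,
    ("Professional", True): 364,
}
VIDEO_TO_VIDEO_RATES = {"Standard": 328, "Professional": 437}


def kling_v3_cost(operation: str, duration: int = 5, mode: str = "Standard",
                  generate_audio: bool = False) -> int:
    """Estimate the Buzz cost of one Kling V3 job as rate x duration."""
    if operation in ("video-to-video-edit", "video-to-video-reference"):
        rate = VIDEO_TO_VIDEO_RATES[mode]  # assumed independent of audio
    else:
        rate = GENERATION_RATES[(mode, bool(generate_audio))]
    return rate * duration
```

For example, `kling_v3_cost("text-to-video", 10, "Professional", True)` multiplies the 364 Buzz/s rate by 10 seconds.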

Reading the result

json
{
  "status": "succeeded",
  "steps": [{
    "name": "0",
    "$type": "videoGen",
    "status": "succeeded",
    "output": {
      "video": { "id": "blob_...", "url": "https://.../signed.mp4" }
    }
  }]
}

Blob URLs are signed and expire — refetch the workflow or call GetBlob for a fresh URL.
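
Pulling the video URL out of a response shaped like the one above might look like the following sketch. `video_urls` is a hypothetical helper; it relies only on the fields shown in the sample and skips steps that have not succeeded.

```python
def video_urls(workflow: dict) -> list[str]:
    """Collect output video URLs from the succeeded videoGen steps of a workflow."""
    urls = []
    for step in workflow.get("steps", []):
        if step.get("$type") != "videoGen" or step.get("status") != "succeeded":
            continue
        video = (step.get("output") or {}).get("video") or {}
        if video.get("url"):
            urls.append(video["url"])
    return urls


sample = {
    "status": "succeeded",
    "steps": [{
        "name": "0",
        "$type": "videoGen",
        "status": "succeeded",
        "output": {"video": {"id": "blob_...", "url": "https://.../signed.mp4"}},
    }],
}
```

Because the URLs expire, it is safer to download or re-fetch promptly than to store them long term.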

Long-running jobs

Kling V3 Standard at 5 s typically completes in 2–5 minutes; Professional and longer durations take longer. Always use wait=0 and handle via:

  • Webhooks (recommended): type: ["workflow:succeeded", "workflow:failed"] — see Results & webhooks
  • Polling: GET /v2/consumer/workflows/{workflowId} on a 10 s → 30 s → 60 s cadence
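
The polling cadence above can be sketched as a loop with a growing delay. Several details here are assumptions: `fetch` is a caller-supplied function that performs GET /v2/consumer/workflows/{workflowId} and returns the parsed JSON, "failed" is treated as a terminal status by analogy with the workflow:failed webhook event, and `max_polls` is an arbitrary safety limit.

```python
import itertools
import time

TERMINAL_STATUSES = ("succeeded", "failed")  # "failed" inferred from the webhook event name


def poll_delays():
    """Yield the wait between polls in seconds: 10, then 30, then 60 from there on."""
    yield 10
    yield 30
    while True:
        yield 60


def wait_for_workflow(fetch, workflow_id, sleep=time.sleep, max_polls=60):
    """Poll until the workflow reaches a terminal status, then return it."""
    for delay in itertools.islice(poll_delays(), max_polls):
        workflow = fetch(workflow_id)
        if workflow.get("status") in TERMINAL_STATUSES:
            return workflow
        sleep(delay)  # back off before the next poll
    raise TimeoutError(f"workflow {workflow_id} did not finish after {max_polls} polls")
```

Passing `sleep` as a parameter keeps the loop testable without real waiting.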

Troubleshooting

| Symptom | Likely cause | Fix |
| --- | --- | --- |
| 400 with "duration must be one of" (kling) | Sent integer instead of string | The original kling engine uses string duration: "5" or "10". |
| 400 with "model is required" (kling) | Missing model on the original engine | model is required for kling; it is not used by kling-v3. |
| 400 with "sourceImage is required" | Used image-to-video without an image | Provide sourceImage for image-to-video. |
| 400 with "videoUrl is required" | Used video-to-video-* without a source video | Provide videoUrl for video-to-video operations. |
| Step failed, reason = "no_provider_available" | No Kling worker available | Retry shortly. |
| Output doesn't match end frame | endImage ignored for text-to-video | Use operation: "image-to-video" with both sourceImage and endImage to interpolate frames. |

Civitai Developer Documentation