Kling video generation
Kuaishou's Kling model family, available in two generations through the videoGen step:
engine | Models | Notes |
|---|---|---|
kling | v1, v1.5, v1.6, v2, v2.5-turbo | Original Kling. Text-to-video and image-to-video. |
kling-v3 | (version-agnostic) | Kling V3. Five operations including video-to-video and reference-to-video. Duration in seconds (3–15). |
Default choice for new integrations: engine: "kling-v3" with operation: "text-to-video". For speed + cost, use mode: "Standard"; for highest quality, mode: "Professional".
All Kling jobs exceed the 100-second timeout — always submit with wait=0 and handle results via webhooks or polling.
Kling (original)
Text-to-video
POST https://orchestration.civitai.com/v2/consumer/workflows?wait=0
Authorization: Bearer <your-token>
Content-Type: application/json
{
"steps": [{
"$type": "videoGen",
"input": {
"engine": "kling",
"model": "v2.5-turbo",
"prompt": "A serene mountain lake at dawn with mist rolling over the water",
"aspectRatio": "16:9",
"duration": "5"
}
}]
}/v2/consumer/workflowsRequest body — edit to customize (e.g. swap the image URL or prompt)
Image-to-video
Pass sourceImage (URL, data URL, or Base64) to animate a start frame:
{
"engine": "kling",
"model": "v1.6",
"prompt": "The subject slowly turns to face the camera",
"sourceImage": "https://image.civitai.com/.../photo.jpeg",
"aspectRatio": "16:9",
"duration": "5",
"mode": "Standard"
}/v2/consumer/workflowsRequest body — edit to customize (e.g. swap the image URL or prompt)
Parameters
| Field | Default | Notes |
|---|---|---|
engine | — ✅ | "kling" |
model | — ✅ | "v1" / "v1.5" / "v1.6" / "v2" / "v2.5-turbo" |
prompt | — ✅ | Generation prompt. |
negativePrompt | null | What to avoid. |
mode | "Standard" | "Standard" or "Professional". Affects quality and cost for v1/v1.5/v1.6. Ignored for v2/v2.5-turbo. |
aspectRatio | "16:9" | "16:9", "9:16", "1:1" |
duration | "5" | "5" or "10" (seconds). String enum. |
cfgScale | 0.5 | 0–1. Prompt adherence. |
sourceImage | null | URL / data URL / Base64. Enables image-to-video. |
cameraControl | null | Fine camera motion — see Camera control below. |
Cost
| Model | 5 s | 10 s |
|---|---|---|
v1 / v1.5 / v1.6 Standard | 600 | 1 200 |
v1 / v1.5 / v1.6 Professional | 1 050 | 2 100 |
v2 | 1 200 | 2 400 |
v2.5-turbo | 600 | 1 200 |
Camera control
Available on all models. Provide a cameraControl object with a config sub-object containing any of these axes (all -10 to 10, default null = no control):
| Axis | Effect |
|---|---|
horizontal | Translate left (−) / right (+) |
vertical | Translate down (−) / up (+) |
pan | Rotate left (−) / right (+) around Y axis |
tilt | Rotate down (−) / up (+) around X axis |
roll | Counter-clockwise (−) / clockwise (+) around Z axis |
zoom | Narrow FOV (−) / widen FOV (+) |
{
"cameraControl": {
"config": { "zoom": -3, "pan": 2 }
}
}Kling V3 (engine: "kling-v3")
Kling V3 introduces a richer operation set via the operation discriminator.
Operations
operation | Description | Key inputs |
|---|---|---|
text-to-video | Generate from a text prompt | prompt |
image-to-video | Animate a start frame (optionally to an end frame) | sourceImage, optionally endImage |
reference-to-video | Stylize video from reference images | images[] |
video-to-video-edit | Edit an existing video guided by a prompt | videoUrl |
video-to-video-reference | Reference an existing video's motion/structure | videoUrl, optionally images[] |
Text-to-video
{
"engine": "kling-v3",
"operation": "text-to-video",
"prompt": "A timelapse of a flower blooming in a sunlit meadow",
"aspectRatio": "16:9",
"duration": 5,
"mode": "Standard"
}/v2/consumer/workflowsRequest body — edit to customize (e.g. swap the image URL or prompt)
Image-to-video
{
"engine": "kling-v3",
"operation": "image-to-video",
"prompt": "The cat stretches and yawns, then looks directly into the camera",
"sourceImage": "https://image.civitai.com/.../photo.jpeg",
"aspectRatio": "16:9",
"duration": 5
}Add endImage to interpolate between a start frame and an end frame:
{
"engine": "kling-v3",
"operation": "image-to-video",
"prompt": "Smooth cinematic transition",
"sourceImage": "https://.../start.jpeg",
"endImage": "https://.../end.jpeg",
"duration": 5
}/v2/consumer/workflowsRequest body — edit to customize (e.g. swap the image URL or prompt)
Placeholder URLs
The first-last-frame example uses https://example.com/ placeholders. Replace them with publicly accessible image URLs before submitting.
Video-to-video
Edit or reference the motion of an existing video:
{
"engine": "kling-v3",
"operation": "video-to-video-edit",
"prompt": "Transform the scene into a vintage 1970s film aesthetic with grain",
"videoUrl": "https://example.com/input.mp4",
"duration": 5,
"mode": "Standard"
}Use video-to-video-reference to guide generation from a video's motion without directly editing it.
Multi-prompt (Kling V3)
multiPrompt lets you sequence different prompts across a video timeline. Each entry has a prompt and a duration (seconds that prompt controls):
{
"engine": "kling-v3",
"operation": "text-to-video",
"prompt": "Base scene description",
"multiPrompt": [
{ "prompt": "The camera slowly pushes in on the subject", "duration": 3 },
{ "prompt": "The subject looks up and the scene brightens", "duration": 4 }
]
}Audio generation (Kling V3)
Set generateAudio: true to produce a synchronized audio track. Optionally provide voiceIds to use a specific voice profile:
{
"generateAudio": true,
"voiceIds": ["voice_abc123"]
}For video-to-video operations, keepAudio: true (default) preserves the original video's audio.
Parameters (Kling V3)
| Field | Default | Notes |
|---|---|---|
engine | — ✅ | "kling-v3" |
operation | "text-to-video" | See operations table above. |
prompt | — ✅ | Generation prompt. |
mode | "Standard" | "Standard" or "Professional". |
duration | 5 | 3–15 seconds (integer, unlike the original kling engine). |
aspectRatio | "16:9" | "16:9", "9:16", "1:1" |
sourceImage | null | Start frame for image-to-video. |
endImage | null | End frame for first-last-frame interpolation. |
images[] | [] | Reference images for reference-to-video. |
videoUrl | null | Source video for video-to-video-* operations. |
generateAudio | false | Generate a synchronized audio track. |
voiceIds | null | Voice profile IDs for audio generation. |
keepAudio | true | Preserve source video audio in video-to-video operations. |
multiPrompt[] | null | Time-sequenced prompts { prompt, duration }. |
Cost (Kling V3)
Cost scales linearly with duration. All costs are in Buzz per second:
| Operation group | Mode | Audio | Buzz/s |
|---|---|---|---|
| t2v / i2v / ref | Standard | No | 219 |
| t2v / i2v / ref | Standard | Yes | 292 |
| t2v / i2v / ref | Professional | No | 292 |
| t2v / i2v / ref | Professional | Yes | 364 |
| v2v-edit / v2v-ref | Standard | — | 328 |
| v2v-edit / v2v-ref | Professional | — | 437 |
Examples at duration: 5:
| Scenario | Buzz |
|---|---|
| Standard t2v, no audio, 5 s | ~1 095 |
| Standard t2v, with audio, 5 s | ~1 460 |
| Professional t2v, no audio, 5 s | ~1 460 |
| Professional t2v, with audio, 5 s | ~1 820 |
| Standard video-to-video, 5 s | ~1 640 |
| Professional video-to-video, 5 s | ~2 185 |
Reading the result
{
"status": "succeeded",
"steps": [{
"name": "0",
"$type": "videoGen",
"status": "succeeded",
"output": {
"video": { "id": "blob_...", "url": "https://.../signed.mp4" }
}
}]
}Blob URLs are signed and expire — refetch the workflow or call GetBlob for a fresh URL.
Long-running jobs
Kling V3 Standard at 5 s typically completes in 2–5 minutes; Professional and longer durations take longer. Always use wait=0 and handle via:
- Webhooks (recommended):
type: ["workflow:succeeded", "workflow:failed"]— see Results & webhooks - Polling:
GET /v2/consumer/workflows/{workflowId}on a 10 s → 30 s → 60 s cadence
Troubleshooting
| Symptom | Likely cause | Fix |
|---|---|---|
400 with "duration must be one of" (kling) | Sent integer instead of string | The original kling engine uses string duration: "5" or "10". |
400 with "model is required" (kling) | Missing model on the original engine | model is required for kling; it is not used by kling-v3. |
400 with "sourceImage is required" | Used image-to-video without an image | Provide sourceImage for image-to-video. |
400 with "videoUrl is required" | Used video-to-video-* without a source video | Provide videoUrl for video-to-video operations. |
Step failed, reason = "no_provider_available" | No Kling worker available | Retry shortly. |
| Output doesn't match end frame | endImage ignored for text-to-video | Use operation: "image-to-video" with both sourceImage and endImage to interpolate frames. |
Related
SubmitWorkflow— operation used by every example hereGetWorkflow— for polling- Results & webhooks — production result handling
- WAN video generation — comparable open-source alternative
- Veo 3 video generation — Google alternative for commercial-grade video