Seedance 2.0 Workflow & Prompt Guide

This guide walks through how to use Seedance 2.0, ByteDance's top-ranked AI video model, in Kaiber Canvas. It covers setup, adding reference media, writing prompts and popular workflows.

Written By Christine Larsen

Last updated 13 days ago

Seedance 2.0 in Kaiber Canvas offers a new level of control with mixed media references, unprecedented character consistency and realistic motion. Seedance works with four types of input: text, images, video and audio. Use one, or combine all four in the same generation.

Getting Started

Add a Create Video Flow to the Canvas. In the Model menu on the side of the Flow, select Seedance 2.0.

Settings

Under Advanced Features, you’ll find:

Mode: Set to Reference when using reference media, or Generate for text-to-video or animating an image.
Model: Choose between Seedance 2.0 and Seedance 2.0 Fast.
Duration: 4 to 15 seconds
Resolution: 480p, 720p, 1080p or 4K
Aspect ratio: 16:9, 9:16, 4:3, 3:4, 21:9 and 1:1
Generate Audio: On by default. Leave it ticked to generate video with sound.
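If you plan generations in a script or spreadsheet before opening the Canvas, the limits above can be captured in a small pre-flight check. This is a minimal Python sketch, not a Kaiber or Seedance API: SeedanceSettings and validate are hypothetical names, and the default duration and resolution are placeholders chosen for the example. Only the allowed values come from the list above.

```python
from dataclasses import dataclass

# Allowed values copied from the Settings list in this guide.
MODES = {"Reference", "Generate"}
MODELS = {"Seedance 2.0", "Seedance 2.0 Fast"}
RESOLUTIONS = {"480p", "720p", "1080p", "4K"}
ASPECT_RATIOS = {"16:9", "9:16", "4:3", "3:4", "21:9", "1:1"}


@dataclass
class SeedanceSettings:
    """Hypothetical container mirroring the Advanced Features panel."""
    mode: str = "Generate"
    model: str = "Seedance 2.0"
    duration_s: int = 8          # placeholder default, not from the guide
    resolution: str = "720p"     # placeholder default, not from the guide
    aspect_ratio: str = "16:9"
    generate_audio: bool = True  # on by default, as in the panel


def validate(settings: SeedanceSettings) -> list[str]:
    """Return a list of problems; an empty list means the settings fit the guide."""
    problems = []
    if settings.mode not in MODES:
        problems.append(f"unknown mode: {settings.mode}")
    if settings.model not in MODELS:
        problems.append(f"unknown model: {settings.model}")
    if not 4 <= settings.duration_s <= 15:
        problems.append(f"duration must be 4 to 15 seconds, got {settings.duration_s}")
    if settings.resolution not in RESOLUTIONS:
        problems.append(f"unknown resolution: {settings.resolution}")
    if settings.aspect_ratio not in ASPECT_RATIOS:
        problems.append(f"unknown aspect ratio: {settings.aspect_ratio}")
    return problems
```

Calling validate(SeedanceSettings(duration_s=20)) returns a single message about the duration, which is usually enough to catch a bad row before you generate.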

Generate Mode: text-to-video or animate image

The simplest way to start. Generate a video from text or animate an image.

Select Generate Mode in the Advanced Features.
Add an image to the Upload Image field. This will become the first frame of your video.
Add your subject prompt.
Choose the resolution, aspect ratio and length of your video.
To generate with audio, leave Generate Audio on in the Advanced Features.

Reference Mode: generate from reference media and a prompt

Add multiple image, video and audio references to create your clip.

Image references

Upload up to 9 images (30 MB each). Images work as visual anchors. The model locks onto character appearance, wardrobe, color palette and scene details from whatever you upload.

Reference each image in your prompt as @Image1, @Image2 and so on.

“@Image1 as the character. Black leather jacket, short dark hair. Walking through a neon-lit city street at night. Slow push-in. Cinematic lighting.”

Video references

Upload up to 3 videos (15 seconds total, up to 50 MB). Video references work as motion anchors. The model extracts camera movement, editing rhythm, pacing and shot transitions from whatever you upload. If you want a specific handheld feel, a whip pan or a Hitchcock zoom, show it rather than describe it.

Reference each video in your prompt as @Video1, @Video2 and so on.

“Reference @Video1 for camera movement and pacing. @Image1 is the character. Walking into a warehouse, rain outside. Medium shot, slow pull-back.”

Audio references

Upload up to 3 audio files (15 seconds total, up to 15 MB each). Seedance 2.0 analyzes the beat structure of your track and uses it to drive visual timing. Cuts, camera movement and energy shifts follow the music.

Reference each audio file in your prompt as @Audio1, @Audio2 and so on.

Audio generation is toggled on by default. Leave this checked to include audio in your output or turn it off for video without audio.

“@Image1 is the character. Reference @Audio1 for rhythm and beat timing. Full body shot, movement synced to the music energy. Neon-lit stage. Medium camera, slow push-in.”
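The upload limits for the three reference types can be checked before you start dragging files in. The sketch below is a rough local helper, not part of Kaiber: MediaFile and check_references are hypothetical names, one MB is assumed to be 1024 x 1024 bytes, and the 50 MB video limit is read as a per-file limit because the guide does not say whether it applies per file or in total.

```python
from dataclasses import dataclass

MB = 1024 * 1024  # assumption: binary megabytes


@dataclass
class MediaFile:
    """Hypothetical description of one reference file."""
    kind: str                # "image", "video" or "audio"
    size_bytes: int
    duration_s: float = 0.0  # ignored for images


def check_references(files: list[MediaFile]) -> list[str]:
    """Compare a set of reference files against the limits in this guide."""
    images = [f for f in files if f.kind == "image"]
    videos = [f for f in files if f.kind == "video"]
    audio = [f for f in files if f.kind == "audio"]
    problems = []

    if len(images) > 9:
        problems.append("more than 9 images")
    if any(f.size_bytes > 30 * MB for f in images):
        problems.append("an image is larger than 30 MB")

    if len(videos) > 3:
        problems.append("more than 3 videos")
    if sum(f.duration_s for f in videos) > 15:
        problems.append("videos total more than 15 seconds")
    # Assumption: 50 MB is treated as a per-file limit.
    if any(f.size_bytes > 50 * MB for f in videos):
        problems.append("a video is larger than 50 MB")

    if len(audio) > 3:
        problems.append("more than 3 audio files")
    if sum(f.duration_s for f in audio) > 15:
        problems.append("audio totals more than 15 seconds")
    if any(f.size_bytes > 15 * MB for f in audio):
        problems.append("an audio file is larger than 15 MB")

    return problems
```

An empty list means the set fits. Two 10-second clips, for example, come back with "videos total more than 15 seconds", which tells you to trim before uploading.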

Combining references

This is where Seedance 2.0 stands out. Stack images, video and audio together in one generation and use @mentions to tell the model exactly what each file is responsible for.

“@Image1 is the character. @Image2 is the outfit and color reference. Reference @Video1 for camera movement and pacing. Reference @Audio1 for rhythm and beat timing. 15-second music video sequence. Nighttime rooftop. Cinematic lighting. 16:9.”

The practical sweet spot is 3 to 5 image references plus 1 to 2 video references. Using all the slots at once often produces worse results. The model tries to satisfy too many constraints and things start to conflict.
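When several references are stacked, it is easy to mention a file that was never uploaded or to upload one and forget to give it a job. A minimal sketch of a check for that, assuming the @Image1, @Video1, @Audio1 naming described above; check_mentions is a hypothetical helper, and treating an unmentioned upload as a problem is this example's choice rather than a documented rule.

```python
import re

# Matches @Image1, @Video2, @Audio3 and so on.
MENTION = re.compile(r"@(Image|Video|Audio)(\d+)")


def check_mentions(prompt: str, images: int, videos: int, audio: int) -> list[str]:
    """Flag @mentions with no matching upload and uploads that are never mentioned."""
    uploaded = {"Image": images, "Video": videos, "Audio": audio}
    mentioned = {(kind, int(number)) for kind, number in MENTION.findall(prompt)}
    problems = []

    for kind, number in sorted(mentioned):
        if number < 1 or number > uploaded[kind]:
            problems.append(f"@{kind}{number} has no matching upload")

    for kind, count in uploaded.items():
        for number in range(1, count + 1):
            if (kind, number) not in mentioned:
                problems.append(f"@{kind}{number} is uploaded but never mentioned")

    return problems
```

Run against the combined prompt above with two images, one video and one audio file, the function returns an empty list.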

Prompting

This is where most results are won or lost. A few rules consistently make a difference.

Lead with the subject

Seedance 2.0 uses the opening of your prompt to lock in the subject and core action before processing the rest. Don’t bury your character in a list of style notes. Put who’s in the frame and what they’re doing at the top.

The basic formula: subject + action + environment + camera + lighting + style + quality constraints

“A woman in a silver bodysuit sprints through a collapsing tunnel. Dust fills the air. Handheld camera, tracking behind her. Dramatic backlighting. Cinematic, photorealistic, 35mm film look.”
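If you write many prompts, the formula can be turned into a small template so the subject and action always land first. This is a minimal sketch; build_prompt and its parameter names are made up for this example and simply join the parts in the order the formula gives.

```python
def build_prompt(
    subject: str,
    action: str,
    environment: str = "",
    camera: str = "",
    lighting: str = "",
    style: str = "",
    constraints: str = "",
) -> str:
    """Join the formula parts in order, with subject and action in the first sentence."""
    lead = f"{subject} {action}".strip()
    parts = [lead, environment, camera, lighting, style, constraints]
    # Drop empty parts and trailing periods so each part becomes one sentence.
    sentences = [part.strip().rstrip(".") for part in parts if part.strip()]
    return ". ".join(sentences) + "."


prompt = build_prompt(
    subject="A woman in a silver bodysuit",
    action="sprints through a collapsing tunnel",
    environment="Dust fills the air",
    camera="Handheld camera, tracking behind her",
    lighting="Dramatic backlighting",
    style="Cinematic, photorealistic, 35mm film look",
)
print(prompt)
```

The call reproduces the example prompt above. Optional parts that are left empty are skipped rather than leaving a gap.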

Declare your shot structure upfront

For multi-shot sequences, state the number of shots, total duration and aspect ratio at the very top of your prompt. Everything else follows from there.

“Montage, 5 shots, 15 seconds, 16:9.”

Then describe each shot individually. Number them. Give each one a clear action and a camera move. An escalation arc works well: start calm, build tension, land the moment.
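The header-then-numbered-shots layout is mechanical enough to generate. The sketch below is one possible way to do it; build_multishot is a hypothetical helper, and the "Montage" label is just the word used in the example above.

```python
def build_multishot(
    shots: list[str],
    duration_s: int,
    aspect_ratio: str,
    label: str = "Montage",
) -> str:
    """Put shot count, duration and aspect ratio first, then number each shot."""
    if not shots:
        raise ValueError("at least one shot description is required")
    header = f"{label}, {len(shots)} shots, {duration_s} seconds, {aspect_ratio}."
    lines = [header]
    for number, description in enumerate(shots, start=1):
        lines.append(f"Shot {number}: {description}")
    # A blank line between shots keeps each one readable on its own.
    return "\n\n".join(lines)


sequence = build_multishot(
    shots=[
        "Wide shot, character stands still on a rooftop. Camera slow push-in.",
        "Close-up on face. Wind catches hair.",
        "Wide shot, character leaps from the edge. Slow motion.",
    ],
    duration_s=15,
    aspect_ratio="16:9",
)
print(sequence)
```

Because the shot count in the header is derived from the list, it cannot drift out of step with the shots you actually describe.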

Text handles space. Video references handle time.

Keep this split in mind when building prompts. Text is best for spatial decisions: subject, wardrobe, lighting, environment, mood, color. Video references are best for temporal decisions: timing, rhythm, camera motion, the exact shape of movement.

You can write “slow push-in” in a prompt. But that’s still being interpreted. A reference clip doesn’t describe the move. It contains it.

Keep prompts under 2000 characters

Longer doesn’t mean better. Cut anything that restates something you’ve already said.
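A quick way to apply this rule is to count characters and look for repeated sentences before pasting a prompt in. The helpers below are a rough sketch with hypothetical names; splitting on periods is crude (it would also split "2.0"), so treat the repeat list as a hint rather than a verdict.

```python
MAX_CHARS = 2000  # the guideline from this section


def prompt_budget(prompt: str) -> int:
    """Characters left before the prompt reaches the 2000-character guideline."""
    return MAX_CHARS - len(prompt)


def repeated_sentences(prompt: str) -> list[str]:
    """Sentences that appear more than once, ignoring case and extra spaces."""
    seen = set()
    repeats = []
    for sentence in prompt.split("."):
        key = " ".join(sentence.lower().split())
        if not key:
            continue
        if key in seen and key not in repeats:
            repeats.append(key)
        seen.add(key)
    return repeats
```

A positive budget means the prompt is under the limit. Anything returned by repeated_sentences is a candidate for cutting.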

Use physics language

Seedance 2.0 handles movement well. Mentioning things like “wind blowing through hair” or “water splashing on impact” activates that strength and tends to lift overall output quality.

Prompt examples

Character dancing to a track

“@Image1 is the character. Reference @Audio1 for rhythm and beat timing. Full body shot of the character dancing, movement synced to the music energy. Expressive, fluid motion. Neon-lit stage, dark background. Medium camera, slow push-in. Cinematic lighting. Photorealistic. 16:9. 15 seconds.”

Multi-shot action sequence

“Montage, multi-shot, 6 shots, 15 seconds, 16:9. Cinematic lighting, photorealistic, 35mm film quality, ARRI ALEXA aesthetic. @Image1 is the character.

Shot 1: Wide shot, character standing at the edge of a rain-soaked rooftop. City lights below. Camera slow push-in.

Shot 2: Close-up on face. Determination. Wind catches hair.

Shot 3: Medium shot, character steps forward. Handheld shaky camera.

Shot 4: Wide shot, character leaps from the edge. Slow motion. City blurs below.

Shot 5: Low angle, landing on the next rooftop. Impact shockwave.

Shot 6: Medium shot, character looks back.”

Camera replication

“Reference @Image1 for the character’s appearance. Reference @Video1 for camera movements. @Image2 is the environment. One continuous shot. No cuts.”

Text to video (no references)

“Single continuous shot, 15 seconds, 16:9. A woman in a white coat walks through a heavy rainstorm toward a glass building. Streetlights reflect off wet pavement. Tracking shot, slightly behind her. Cinematic, photorealistic, 35mm film grain. Avoid static camera, avoid blurry motion.”

Popular workflows

Some of the Seedance 2.0 workflows people can’t stop talking about.

1. Transformation / Henshin

A character transforms with fluid motion into something dramatic, such as a warrior or fantasy hero. The face stays consistent throughout.

Workflow: Upload an AI-generated portrait of your character into Image Reference (@Image1).

Prompt structure: "@Image1 is the character. Medium shot. They stand in [location]. Camera slowly pushes in. A burst of [golden/electric/cosmic] light surrounds them and their outfit transforms into [armor/fantasy costume/etc], piece by piece. Camera circles as [cherry blossoms/sparks/particles] fall around them. Cinematic lighting, 9:16 vertical."

Set duration to 8–15s. Toggle audio on.
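The bracketed slots in a prompt structure like this one can be filled in order with a few lines of code, which helps when you want several variations of the same transformation. This is a sketch only: fill_template is a hypothetical helper, and it assumes every pair of square brackets in the template is a slot to replace.

```python
import re

# Any text inside square brackets counts as one slot.
PLACEHOLDER = re.compile(r"\[([^\[\]]+)\]")


def fill_template(template: str, values: list[str]) -> str:
    """Replace each [placeholder] in order with the matching entry in values."""
    slots = PLACEHOLDER.findall(template)
    if len(slots) != len(values):
        raise ValueError(f"template has {len(slots)} placeholders, got {len(values)} values")
    remaining = iter(values)
    return PLACEHOLDER.sub(lambda _match: next(remaining), template)


template = (
    "@Image1 is the character. Medium shot. They stand in [location]. "
    "A burst of [golden/electric/cosmic] light surrounds them and their outfit "
    "transforms into [armor/fantasy costume/etc], piece by piece."
)
print(fill_template(template, ["a ruined temple", "golden", "silver armor"]))
```

The count check raises an error if a slot is left without a value, so a stray bracket never reaches the prompt field.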

2. Beat-synced dance

Upload a track and a character reference, and the model generates movement that follows the audio's beat structure. Cuts, energy shifts, and camera movement can all sync to the music. The character's face stays stable across the whole clip.

Workflow: Upload your AI character portrait to Image Reference (@Image1). Upload your track (trim to 15s max) to Audio Reference (@Audio1). Optionally upload a dance reference clip to Video Reference (@Video1) if you want specific choreography rather than generic dancing.

Prompt structure: "@Image1 is the character. Reference @Audio1 for rhythm and beat timing. @Video1 for body movement and choreography. Full body shot, fluid expressive dancing, movement synced to the music energy. [Describe environment]. Medium shot transitioning to close-up. Maintain consistent face and clothing throughout." Turn Generate Audio on.

3. Action / fight sequences

Multi-shot fight choreography with cinematic camera work — tracking shots, slow motion, impact cuts. The motion physics set Seedance 2.0 apart here.

Workflow: Upload your character portrait to Image Reference (@Image1). If you have a reference fight clip with the camera movement or choreography you want, add it to Video Reference (@Video1).

Prompt structure: "@Image1 is the fighter. Reference @Video1 for camera movement and fight choreography. [Number] shots, [duration]s, 16:9. [Describe the fight setup, location, opponent]. Handheld camera, slow motion on key impacts. Cinematic lighting. Photorealistic, no 3D, no cartoon."

For best motion results, describe each shot explicitly with timing.

4. Satisfying loops / abstract content

Morphing objects, flowing liquids, paint collisions, infinite rotations.

Workflow: Use a reference image or work directly from text prompts. Choose Generate Mode.

Prompt structure: [subject] + [physics interaction] + [camera] + [lighting/mood] + [style].

Prompt example: "Two streams of luminescent paint colliding in zero gravity, one electric blue and one molten gold, creating fractal patterns as they merge and separate. Suspended droplets catching light. Extreme slow motion. Dark background. Mesmerizing and meditative."

Set a loop-friendly duration (8s works well). Turn Generate Audio on for an ambient soundscape.

5. Cinematic short film (chained shots)

Four 15-second clips generated sequentially with the same character reference, then stitched in Editor. This is your workaround for the clip length limit to produce one-minute short films. The key is locking the character reference across every single generation so the face doesn't drift between clips.

Workflow: Generate each clip as a separate generation. For every clip, load the same character portrait into Image Reference (@Image1). This keeps your character consistent between clips. Write each shot's prompt separately with a consistent character description. Connect your clips in Editor.
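Planning the clips up front makes it easier to keep the character line identical in every prompt. A minimal sketch of such a plan, assuming one story beat per 15-second clip; plan_clips and the dictionary keys are hypothetical, and nothing here talks to Kaiber. It only prepares text to paste into each generation.

```python
CLIP_SECONDS = 15  # maximum clip length from the Settings list


def plan_clips(character_line: str, beats: list[str]) -> list[dict]:
    """One generation per story beat, each reusing the same @Image1 character line."""
    plan = []
    for index, beat in enumerate(beats):
        plan.append({
            "clip": index + 1,
            "starts_at_s": index * CLIP_SECONDS,  # position in the stitched film
            "image_reference": "@Image1",         # same portrait for every clip
            "prompt": f"{character_line} {beat}",
        })
    return plan


character = "@Image1 is the character, a courier in a red raincoat."
plan = plan_clips(character, [
    "She waits at a tram stop in the rain. Wide shot, slow push-in.",
    "She runs across a bridge. Tracking shot from behind.",
    "She stops at a locked door. Close-up on her hands.",
    "The door opens onto a bright room. Medium shot, slow pull-back.",
])
for clip in plan:
    print(clip["clip"], clip["starts_at_s"], clip["prompt"])
```

Four beats at 15 seconds each give the one-minute film described above, and every prompt opens with the same character description.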

6. Short drama / character scene

A two- or three-shot emotional scene (e.g. an entrance, a reaction, a reveal) where the same character appears across every shot and the AI handles the camera cuts. Describe the story beats in the prompt and let the model figure out the cinematography. The result feels directed rather than generated.

Workflow: Upload your character portrait to Image Reference (@Image1).

Prompt structure: "@Image1 is [character name/description]. [Shot count] shots, [duration]s. Shot 1: [location, action, camera]. Shot 2: [continuation, emotional beat, camera]. Shot 3: [resolution, camera]. Cinematic lighting, natural dialogue [if audio], maintain consistent face and clothing throughout. Handheld / steady cam [choose based on tone]." Turn Generate Audio on if you want ambient sound or dialogue. Keep dialogue to under 20 words total across the clip if using it.
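The dialogue limit is easy to overshoot once a scene has two speakers. A small word counter, sketched here with hypothetical names, keeps it in check; it counts whitespace-separated words across all spoken lines and reads "under 20" as strictly fewer than 20.

```python
MAX_DIALOGUE_WORDS = 20  # guideline from this workflow


def dialogue_word_count(lines: list[str]) -> int:
    """Total whitespace-separated words across all spoken lines."""
    return sum(len(line.split()) for line in lines)


def dialogue_fits(lines: list[str]) -> bool:
    """True when the total stays under the 20-word guideline."""
    return dialogue_word_count(lines) < MAX_DIALOGUE_WORDS


spoken = ["You came back.", "I never left."]
print(dialogue_word_count(spoken), dialogue_fits(spoken))
```

If dialogue_fits returns False, trim lines before generating rather than hoping the model compresses them.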