Getting started with Lip Sync in Kaiber Canvas

Make a character sing or speak with your audio using Image or Video Lip Sync

Written By Christine Larsen

Last updated About 12 hours ago

Both Image Lip Sync and Video Lip Sync are available inside the Create Video flow in Kaiber Canvas. Add a Create Video flow to the canvas and select the one you need from the model menu.
Image Lip Sync: If you are starting with an image and audio
Video Lip Sync: If you are starting with a video and audio

For best Lip Sync results:

  • Use a forward-facing shot of a single person’s face

  • Consider camera distance

    • If the character is too far from the camera, lip movement may not come through clearly

    • If the character is too close, facial expressions may appear distorted

  • Choose audio with clear speech and no cross-talk

  • For clearer lip movement, use AudioShake Stem Separator to extract the vocals from your audio and generate with the vocals only. Add the full audio back when you edit the final video

  • Test outputs with a short section of audio before creating a longer video
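
One way to prepare a short test section of audio outside Kaiber is sketched below. It is a minimal example that assumes your audio is an uncompressed WAV file and uses only Python's standard wave module. The function name trim_wav and the file names are hypothetical, and other formats such as MP3 would need a different tool.

```python
import wave


def trim_wav(src_path, dst_path, seconds):
    """Copy the first `seconds` of a WAV file into a new WAV file.

    Returns the duration, in seconds, of the clip that was written.
    """
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        # Never ask for more frames than the source actually contains.
        frames_to_keep = min(src.getnframes(), int(seconds * src.getframerate()))
        frames = src.readframes(frames_to_keep)

    with wave.open(dst_path, "wb") as dst:
        # Reuse the source format; the frame count in the header is
        # corrected automatically when the file is closed.
        dst.setparams(params)
        dst.writeframes(frames)

    return frames_to_keep / params.framerate


# Example (hypothetical file names):
# trim_wav("vocals.wav", "vocals_test.wav", 10)
```

Upload the trimmed file to the Flow for a quick test, then switch to the full audio once the result looks right.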

Image Lip Sync

Create a lip sync video by adding audio to an image of a person.

Length: Matches your audio, up to 300 seconds.


To use Image Lip Sync:

  • In the Create Video flow's model menu, select Image Lip Sync

  • Add your image and audio to the Flow

  • Click the generate button to create your video

Video Lip Sync

Create a lip sync video by adding audio to a video of a person.

Length: Matches your audio, up to 30 seconds. If the audio is longer than the video, the video loops.
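
The length rule above can be checked ahead of time with a little arithmetic. The sketch below is illustrative only and is not part of Kaiber: the function names are hypothetical, the 30-second limit is taken from this article, and the number of plays is simply the audio length divided by the video length, rounded up. How Kaiber handles audio over the limit, or exactly how the loop is stitched, is not covered here.

```python
import math

# Limit stated in this article for Video Lip Sync.
# (Image Lip Sync is listed as 300 seconds.)
VIDEO_LIP_SYNC_LIMIT_SECONDS = 30


def fits_limit(audio_seconds, limit_seconds=VIDEO_LIP_SYNC_LIMIT_SECONDS):
    """Return True if the audio is within the stated length limit."""
    return audio_seconds <= limit_seconds


def video_plays_needed(audio_seconds, video_seconds):
    """Estimate how many times the video plays to cover the audio.

    A result of 1 means the video is long enough and does not loop.
    """
    if audio_seconds <= 0 or video_seconds <= 0:
        raise ValueError("durations must be positive")
    return math.ceil(audio_seconds / video_seconds)


print(fits_limit(24))              # True
print(video_plays_needed(24, 10))  # 3
print(video_plays_needed(8, 10))   # 1
```

If the estimate shows several loops, consider a longer source video or a shorter audio section so the repeat is less noticeable.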

To use Video Lip Sync:

  • In the Create Video flow's model menu, select Video Lip Sync

  • Add your video and audio to the Flow

  • Click the generate button to create your video

Use Video Lip Sync to:

  • Create lip sync videos with dynamic character and background movement

  • Add new or translated audio to a character in an existing video

For best results with Video Lip Sync, choose a video where:

  • The character’s face is clear and stays in the shot for the entire video

  • The character’s movement is smooth

  • The character is stationary or the camera tracks the character’s face as they move

  • The character is not talking in the original video

Test with short clips to refine your settings before generating longer videos.