
How to Use the Reference-to-Video Tool

A comprehensive guide to using reference videos to guide AI video generation for more controlled, predictable results.

Reference-to-video is one of the most powerful features on Kensa. Instead of describing every detail of motion, camera work, and pacing in a text prompt, you can simply upload an existing video clip as a reference and let the AI learn from it. This guide walks you through the complete workflow, from choosing a good reference video to downloading your final result.

What Is Reference-to-Video?

Reference-to-video (sometimes called video-to-video or style transfer) is an AI generation mode where you provide an existing video clip as a guiding reference. The AI analyzes the reference for characteristics such as:

  • Camera movement -- pans, tilts, dollies, tracking shots, or static framing
  • Motion pacing -- slow and cinematic, fast and energetic, or something in between
  • Visual style -- color grading, lighting mood, contrast levels
  • Scene composition -- how subjects are arranged and how the frame evolves over time

The AI then generates a brand-new video that follows these learned patterns while incorporating whatever subject matter and scene you describe in your text prompt.

When to Use Reference-to-Video

Reference-to-video is not always the right choice. Here is a quick comparison to help you decide which tool fits your project:

Generation Mode       Best For                                                  Input Required
Text-to-Video         Quick creative exploration, simple scenes                 Text prompt only
Image-to-Video        Animating a specific still image                          Image + text prompt
Reference-to-Video    Replicating a specific motion style or camera technique   Video clip + text prompt

Choose reference-to-video when you:

  • Have a specific camera movement you want to replicate (such as a smooth drone flyover)
  • Want consistent motion pacing across a series of videos (such as a brand video campaign)
  • Need to match the "feel" of an existing clip but with entirely different subject matter
  • Are creating variations on an existing concept with controlled stylistic consistency

Stick with text-to-video or image-to-video when you:

  • Do not have a reference clip available
  • Want the AI to be fully creative with no motion constraints
  • Need to animate a specific static image
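The decision rules above can be condensed into a small helper. This is purely illustrative (the function and its parameters are not part of any Kensa API), but it captures the branching logic:

```python
def choose_mode(has_reference_clip: bool, has_still_image: bool,
                wants_motion_match: bool) -> str:
    """Pick a generation mode following the guidance above.

    Illustrative only -- these names are not part of Kensa's interface.
    """
    # A reference clip plus a desire to replicate its motion -> reference-to-video.
    if has_reference_clip and wants_motion_match:
        return "reference-to-video"
    # A specific still image to animate -> image-to-video.
    if has_still_image:
        return "image-to-video"
    # No constraining inputs -> let the AI be fully creative.
    return "text-to-video"
```

For example, `choose_mode(True, False, True)` returns `"reference-to-video"`, while having neither a clip nor an image falls through to plain text-to-video.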

Prerequisites

  • A Kensa account (free signup available)
  • Credits in your account (new users receive free credits)
  • A reference video clip on your device (MP4, MOV, or WebM format)

Detailed Steps

Step 1: Upload a Reference Video

Navigate to the Reference-to-Video tool on Kensa. You will see an upload area where you can drag and drop or browse for your reference video file.

Key guidelines for your reference clip:

  • Duration: Keep it between 5 and 15 seconds. Longer clips are trimmed automatically, but a clip that already fits this range gives the AI a clearer signal.
  • Resolution: At least 720p is recommended. Higher resolution reference clips help the AI understand fine details.
  • Stability: If you want a steady output, use a steady reference. Shaky handheld footage will produce shaky results.
  • Simplicity: A single continuous shot works better than a clip with multiple cuts. The AI reads motion across the entire clip, so abrupt transitions can confuse the analysis.

Once your clip is uploaded, you will see a thumbnail preview and basic metadata (duration, resolution, file size). Confirm that the correct file is loaded before moving on.
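If you batch-prepare reference clips, the guidelines above can serve as a pre-flight check. The sketch below assumes you have already read the clip's metadata (for example with a probing tool); the function itself is illustrative, not part of Kensa:

```python
def check_reference_clip(duration_s: float, width: int, height: int) -> list[str]:
    """Return warnings for a reference clip, mirroring the guidelines above."""
    warnings = []
    # Duration: 5-15 seconds gives the clearest motion signal.
    if duration_s < 5 or duration_s > 15:
        warnings.append("Duration should be 5-15 seconds for the clearest signal.")
    # Resolution: at least 720p on the shorter edge is recommended.
    if min(width, height) < 720:
        warnings.append("At least 720p is recommended so the AI can read fine detail.")
    return warnings
```

An 8-second 1080p clip passes cleanly, while a 2-second 480p clip triggers both warnings.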

Step 2: Write a Descriptive Prompt

Below the upload area, you will find the text prompt field. This is where you describe the content of the video you want to generate. The AI uses the reference video for style and motion guidance, and uses your prompt for the actual scene content.

Tips for writing effective prompts with references:

  • Focus on the subject and scene, not the camera movement. The reference video already provides the motion information, so your prompt should describe what appears in the frame.
  • Be specific about visual details: mention colors, lighting, time of day, environment, and atmosphere.
  • Include style keywords if you want to push the aesthetic further. Terms like "cinematic," "documentary-style," "neon-lit," or "soft pastel tones" can steer the final look.
  • Mention what should differ from the reference. For example: "Same smooth tracking shot as the reference, but set in a snowy mountain landscape instead of an urban setting."

Example prompt:

A majestic eagle soaring over a vast canyon at golden hour, dramatic clouds in the background, cinematic color grading with warm amber tones. The camera follows the same sweeping arc as the reference video.
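The prompt tips above follow a repeatable pattern: subject and scene first, style keywords next, then an explicit note on what should differ from the reference. A small helper can make that pattern concrete; the function and its parameters are hypothetical, not a Kensa feature:

```python
def build_reference_prompt(subject: str, environment: str,
                           style_keywords: list[str],
                           differs_from_reference: str = "") -> str:
    """Assemble a reference-to-video prompt following the tips above."""
    # Subject and scene content -- the reference already supplies the motion.
    parts = [f"{subject} in {environment}"]
    # Optional style keywords to steer the final look.
    if style_keywords:
        parts.append(", ".join(style_keywords))
    # Spell out what should differ from the reference clip.
    if differs_from_reference:
        parts.append(f"Same camera movement as the reference, but {differs_from_reference}")
    return ". ".join(parts) + "."
```

Called with the eagle example's details, this yields a prompt that leads with the subject and keeps the camera description anchored to the reference.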

Step 3: Choose a Model and Configure Settings

After entering your prompt, select an AI model from the model picker. Not all models support reference-to-video equally, so here is a breakdown:

Model              Reference Support   Duration Range   Recommended For
Sora 2             Full                10-15s           Cinematic reference matching
Kling 3            Full                5-15s            Realistic reference with audio
Seedance 1.5 Pro   Partial             5-10s            Motion style transfer

Configure these settings:

  • Duration: Choose how long the output video should be. For the closest match to your reference, set the output duration to match the reference clip length.
  • Aspect Ratio: Use the same aspect ratio as your reference video. A 16:9 reference paired with a 9:16 output will produce inconsistent framing.
  • Quality: Standard quality is fine for drafts. Use High or 1080P for final outputs when you are satisfied with the creative direction.
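The duration and aspect-ratio advice above amounts to a consistency check between reference and output settings. A minimal sketch, with hypothetical names that are not part of Kensa:

```python
def check_settings(ref_duration_s: float, ref_aspect: str,
                   out_duration_s: float, out_aspect: str) -> list[str]:
    """Flag output settings likely to drift from the reference."""
    warnings = []
    # Mismatched aspect ratios (e.g. 16:9 reference, 9:16 output) hurt framing.
    if out_aspect != ref_aspect:
        warnings.append(
            f"Output aspect {out_aspect} does not match reference {ref_aspect}.")
    # The closest match comes from an output duration near the reference length.
    if abs(out_duration_s - ref_duration_s) > 2:
        warnings.append("Output duration differs noticeably from the reference.")
    return warnings
```

Matching settings return no warnings; a 16:9 reference paired with a short 9:16 output trips both checks.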

Step 4: Generate, Review, and Download

Click the Generate button to start the process. Here is what happens behind the scenes:

  1. The AI analyzes your reference video frame by frame, extracting motion vectors, camera path data, and stylistic features.
  2. It combines this motion blueprint with your text prompt to plan the new video.
  3. The video is generated progressively and typically takes 3-7 minutes depending on the model and duration.
  4. Once complete, the video appears in your generation queue with a status of Completed.
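If you script around generation rather than watching the queue in the browser, the waiting step boils down to a polling loop. The `client` object and its `status()` method below are hypothetical stand-ins (Kensa's documented interface is the web dashboard); only the waiting logic is the point:

```python
import time

def wait_for_video(client, job_id: str, poll_s: float = 15.0,
                   timeout_s: float = 600.0) -> str:
    """Poll a generation job until it reaches Completed or fails.

    `client.status(job_id)` is assumed to return one of "queued",
    "processing", "completed", or "failed".
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        state = client.status(job_id)
        if state == "completed":
            return state
        if state == "failed":
            raise RuntimeError(f"Generation {job_id} failed")
        time.sleep(poll_s)  # typical runs take 3-7 minutes, so poll gently
    raise TimeoutError(f"Generation {job_id} did not finish in {timeout_s}s")
```

The 10-minute default timeout leaves headroom over the typical 3-7 minute generation window.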

Reviewing your result:

  • Play back the generated video and compare it side-by-side with your reference.
  • Check whether the camera movement matches your expectations.
  • Look at the pacing -- does the speed and rhythm of motion feel right?
  • Evaluate the subject matter -- does the scene match your prompt description?

If the result is not quite right, you can:

  • Adjust your prompt to be more specific about elements that were missed.
  • Try a different model -- each model interprets reference videos slightly differently.
  • Use a different section of your reference clip if the original had varying motion styles.

When you are satisfied, click Download to save the video to your device. You can also find all your generated videos in the Dashboard under the Videos tab.

Tips for Choosing Good Reference Videos

The quality of your reference video has a direct impact on the quality of your output. Follow these guidelines to get the best results:

  1. Single continuous shot: Avoid clips with jump cuts or scene transitions. One uninterrupted camera movement gives the AI the clearest instruction.
  2. Consistent motion: A smooth, steady pan is easier for the AI to learn from than erratic, unpredictable movement. If you want dynamic motion, make sure it is intentionally dynamic throughout the clip.
  3. Good lighting: Well-lit reference footage produces more predictable style transfer. Dark or inconsistently lit clips can cause unexpected shifts in the output.
  4. Minimal text or overlays: Avoid reference clips with watermarks, subtitles, or on-screen graphics. The AI may attempt to replicate these elements.
  5. Appropriate length: 5-10 seconds is the sweet spot. Very short clips (under 3 seconds) do not provide enough motion data. Very long clips (over 20 seconds) may dilute the signal.
  6. Match your intent: If you want a slow, cinematic feel, use a slow, cinematic reference. If you want fast-paced action, use an energetic reference. The AI mirrors what it sees.

Common Use Cases

  • Brand video series: Upload one branded video as the reference and generate multiple variations with different products or messages, all maintaining the same visual style.
  • Social media content: Use a trending video style as a reference to create original content that matches the current aesthetic without copying the original.
  • Storyboard previsualization: Record a rough camera movement with your phone and use it as a reference to generate a polished, AI-enhanced version of the scene.
  • Style exploration: Take a reference from a film or commercial you admire and generate new scenes in that same visual language.

Troubleshooting

Issue                                    Likely Cause                                      Solution
Output motion does not match reference   Reference clip has multiple cuts                  Use a single continuous shot
Output looks blurry                      Low-resolution reference or low quality setting   Use a higher-resolution reference and set quality to High
AI ignores the prompt content            Prompt conflicts with reference style             Simplify the prompt or choose a more neutral reference
Generation takes too long                High-resolution output with long duration         Reduce duration or quality for faster drafts