Advanced Veo 3.1 Prompting: Professional Techniques

The official formula, timestamp scripting, multi-shot workflows, and audio direction - everything Google recommends for production-quality results.

Basic prompts get basic results. This guide covers the advanced techniques Google recommends for professional Veo 3.1 output - the same workflows used by WPP, Pocket FM, and QuickFrame in production.

The 5-Part Prompting Formula

Google's recommended structure for consistent, high-quality results:

[Cinematography] + [Subject] + [Action] + [Context] + [Style & Ambiance]

Element | What to Include | Example
Cinematography | Camera work, shot type, movement | "Medium shot", "Tracking shot", "Crane shot ascending"
Subject | Main character or focal point | "A tired corporate worker", "A female explorer"
Action | What the subject is doing | "Rubbing his temples in exhaustion"
Context | Environment, background, setting | "In a cluttered 1980s office late at night"
Style & Ambiance | Aesthetic, mood, lighting, film style | "Retro aesthetic, shot on 1980s colour film, slightly grainy"

Full Example

Medium shot, a tired corporate worker, rubbing his temples in exhaustion, in front of a bulky 1980s computer in a cluttered office late at night. The scene is lit by harsh fluorescent overhead lights and the green glow of the monochrome monitor. Retro aesthetic, shot as if on 1980s colour film, slightly grainy.

All five elements are present, which leaves the model little to guess about framing, subject, action, setting, or look.
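
As a minimal sketch of the formula, the five elements can be held in a small structure and rendered into one prompt string. The class name, field names, and the punctuation used when joining are illustrative assumptions, not something Veo requires.

```python
from dataclasses import dataclass


@dataclass
class ShotPrompt:
    """The five formula elements (names here are illustrative)."""
    cinematography: str
    subject: str
    action: str
    context: str
    style: str

    def render(self) -> str:
        # Refuse to render if any of the five elements was left blank.
        missing = [name for name, value in vars(self).items() if not value.strip()]
        if missing:
            raise ValueError(f"missing elements: {', '.join(missing)}")
        parts = [self.cinematography, self.subject, self.action, self.context]
        # First four elements form one comma-separated sentence;
        # style and ambiance follow as a sentence of their own.
        head = ", ".join(p.strip().rstrip(".") for p in parts)
        return f"{head}. {self.style.strip().rstrip('.')}."
```

Calling `render()` on a `ShotPrompt` built from the table's example values yields a prompt in the same order as the full example above, and a blank field raises an error before anything is sent to the model.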

Cinematography Language That Works

The cinematography element is your most powerful tool for controlling tone and emotion.

Camera Movement

Term | Effect
Dolly shot | Camera physically moves toward/away from subject
Tracking shot | Camera follows subject laterally
Crane shot | Camera moves vertically (ascending/descending)
Aerial view | Bird's eye perspective
Slow pan | Gradual horizontal rotation
POV shot | First-person perspective

Crane Shot Example

Crane shot starting low on a lone hiker and ascending high above, revealing they are standing on the edge of a colossal, mist-filled canyon at sunrise. Epic fantasy style, awe-inspiring, soft morning light.

Composition

Term | What It Does
Wide shot | Shows full environment, subject in context
Close-up | Face or object detail
Extreme close-up | Isolated detail (eyes, hands, texture)
Low angle | Camera below subject, looking up (power, dominance)
High angle | Camera above subject, looking down (vulnerability)
Two-shot | Two subjects in frame together

Lens & Focus

Term | Visual Effect
Shallow depth of field | Subject sharp, background blurred (cinematic, intimate)
Deep focus | Everything sharp (documentary, wide establishing)
Wide-angle lens | Expanded perspective, slight distortion at edges
Macro lens | Extreme close-up of small objects
Soft focus | Dreamy, romantic quality

Shallow Depth of Field Example

Close-up with very shallow depth of field, a young woman's face, looking out a bus window at the passing city lights with her reflection faintly visible on the glass. Inside a bus at night during a rainstorm. Melancholic mood with cool blue tones, moody, cinematic.

Directing Audio

Veo 3.1 generates synchronised audio from text instructions. Use these conventions:

Dialogue

Use quotation marks for specific speech:

A woman turns to the camera and says, "We have to leave now."

Sound Effects

Use the SFX: prefix for specific sounds:

SFX: Thunder cracks in the distance.
SFX: The metallic clang of a sword being drawn.
SFX: Footsteps echo on wet pavement.

Ambient Noise

Define the background soundscape:

Ambient noise: The quiet hum of a starship bridge.
Ambient noise: Busy cafe chatter, espresso machine, soft jazz.
Ambient noise: Forest at dawn - birdsong, rustling leaves, distant stream.

Combined Example

Medium shot of a barista preparing coffee behind an espresso machine. She tamps the grounds, locks in the portafilter, and pulls the shot. Ambient noise: Busy cafe atmosphere with soft conversation. SFX: The hiss of steam, the clatter of cups. She looks up and says, "The usual?" Warm lighting, indie coffee shop aesthetic.
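
The audio conventions above can be applied mechanically. The helper below is a sketch only: the function name and parameters are hypothetical, and it simply appends "Ambient noise:" and "SFX:" cues and a quoted dialogue line to a visual description.

```python
from typing import Sequence


def with_audio(visual: str, ambient: str = "", sfx: Sequence[str] = (),
               dialogue: str = "") -> str:
    """Append audio cues to a visual description using the prefixes above."""
    parts = [visual.strip()]
    if ambient:
        parts.append(f"Ambient noise: {ambient.strip()}")
    parts.extend(f"SFX: {sound.strip()}" for sound in sfx)
    if dialogue:
        # The caller supplies the full line, including the quotation marks,
        # e.g. 'She looks up and says, "The usual?"'
        parts.append(dialogue.strip())
    # Give every part terminal punctuation before joining into one prompt.
    closed = [p if p.endswith((".", "!", "?", '"')) else p + "." for p in parts]
    return " ".join(closed)
```

Keeping the cues as separate arguments makes it easy to reuse one visual description with different soundscapes.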

Timestamp Prompting: Multi-Shot Sequences

Timestamp prompting is the most advanced technique in this guide: you script an entire multi-shot sequence in a single generation using timestamp markers.

Syntax

[00:00-00:02] First shot description...
[00:02-00:04] Second shot description...
[00:04-00:06] Third shot description...
[00:06-00:08] Fourth shot description...

Full Example

[00:00-00:02] Medium shot from behind a young female explorer with a leather satchel
and messy brown hair in a ponytail, as she pushes aside a large jungle vine to
reveal a hidden path.

[00:02-00:04] Reverse shot of the explorer's freckled face, her expression filled
with awe as she gazes upon ancient, moss-covered ruins in the background.
SFX: The rustle of dense leaves, distant exotic bird calls.

[00:04-00:06] Tracking shot following the explorer as she steps into the clearing
and runs her hand over the intricate carvings on a crumbling stone wall.
Emotion: Wonder and reverence.

[00:06-00:08] Wide, high-angle crane shot, revealing the lone explorer standing
small in the centre of the vast, forgotten temple complex, half-swallowed by
the jungle. SFX: A swelling, gentle orchestral score begins to play.

Result

A complete 8-second sequence with four distinct shots, varied angles, consistent character, and directed audio - all from one generation.
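
A timestamped script like the one above can be assembled from a list of shot descriptions. This is a minimal sketch; the function name is hypothetical, and the even split across an 8-second clip is an assumption taken from the example rather than a rule of the model.

```python
from typing import Sequence


def timestamp_script(shots: Sequence[str], total_seconds: int = 8) -> str:
    """Split total_seconds evenly across shots and prefix each with [MM:SS-MM:SS]."""
    if not shots or total_seconds % len(shots) != 0:
        raise ValueError("total_seconds must divide evenly across at least one shot")
    step = total_seconds // len(shots)

    def mmss(seconds: int) -> str:
        return f"{seconds // 60:02d}:{seconds % 60:02d}"

    blocks = []
    for index, shot in enumerate(shots):
        start, end = index * step, (index + 1) * step
        blocks.append(f"[{mmss(start)}-{mmss(end)}] {shot.strip()}")
    # Blank lines between blocks mirror the layout of the full example.
    return "\n\n".join(blocks)
```

Passing four descriptions with the default length produces the 2-second segments shown in the syntax section.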

Why This Matters

Without timestamp prompting, you'd need to:

  1. Generate four separate clips
  2. Hope for visual consistency
  3. Edit them together manually
  4. Add audio separately

Timestamp prompting does it all in one pass.

First & Last Frame: Controlled Transitions

Create a specific camera movement or transformation between two defined images.

Step 1: Generate your starting frame

Medium shot of a female pop star singing passionately into a vintage microphone. Dark stage, lit by a single dramatic spotlight. Eyes closed, emotional moment. Photorealistic, cinematic.

Step 2: Generate your ending frame

POV shot from behind the singer on stage, looking out at a large, cheering crowd. Stage lights creating lens flare. Back of singer's head and shoulders in foreground. Audience is a sea of lights and silhouettes. Energetic atmosphere.

Step 3: Animate with Veo 3.1

The camera performs a smooth 180-degree arc shot, starting with the front-facing view of the singer and circling around her to seamlessly end on the POV shot from behind. The singer sings "when you look me in the eyes, I can see a million stars."
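
The three steps can be sketched as a small pipeline. The two callables are placeholders for whatever image and video clients you use; their names and signatures are assumptions for illustration and do not correspond to any real SDK.

```python
from typing import Callable

# Hypothetical signatures: an image generator takes a prompt and returns image
# bytes; a video generator takes a first frame, a last frame, and a prompt.
ImageFn = Callable[[str], bytes]
VideoFn = Callable[[bytes, bytes, str], bytes]


def animate_between_frames(start_prompt: str, end_prompt: str, motion_prompt: str,
                           make_image: ImageFn, make_video: VideoFn) -> bytes:
    """Run the first-and-last-frame workflow in order."""
    first_frame = make_image(start_prompt)  # Step 1: starting frame
    last_frame = make_image(end_prompt)     # Step 2: ending frame
    # Step 3: describe the camera move that connects the two frames.
    return make_video(first_frame, last_frame, motion_prompt)
```

Injecting the generators keeps the workflow independent of any particular API and makes it easy to test with stubs.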

Use Cases

  • Product reveals: Start on packaging, end on product in use
  • Before/after: Transition between two states
  • Location changes: Move from one setting to another
  • Perspective shifts: Front to back, inside to outside

Ingredients to Video: Multi-Reference Consistency

Maintain consistent characters and settings across multiple shots by providing reference images as "ingredients."

Step 1: Generate reference images

  • Character A (the detective)
  • Character B (the mysterious woman)
  • Setting (the noir office)

Step 2: Generate Shot 1 using all references

Using the provided images for the detective, the woman, and the office setting, create a medium shot of the detective behind his desk. He looks up at the woman and says in a weary voice, "Of all the offices in this town, you had to walk into mine."

Step 3: Generate Shot 2 with same references

Using the provided images for the detective, the woman, and the office setting, create a shot focusing on the woman. A slight, mysterious smile plays on her lips as she replies, "You were highly recommended."

Result

Two shots with consistent character appearances, consistent setting, directed dialogue, and matching visual style. This is how you build scenes, not just clips.
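
Because both shots open with the same reference clause, that clause can be generated once and reused. The helper below is a sketch with a hypothetical name; it only builds prompt text and assumes the reference images are attached separately.

```python
from typing import List, Sequence


def shot_prompts(references: Sequence[str], shots: Sequence[str]) -> List[str]:
    """Prefix every shot with an identical clause naming the reference images."""
    if not references:
        raise ValueError("at least one reference is required")
    if len(references) == 1:
        listed = references[0]
    elif len(references) == 2:
        listed = f"{references[0]} and {references[1]}"
    else:
        listed = ", ".join(references[:-1]) + f", and {references[-1]}"
    preamble = f"Using the provided images for {listed}, create "
    return [preamble + shot.strip() for shot in shots]
```

Identical wording for the references in every shot removes one source of drift between generations.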

Negative Prompting

Tell the model what to exclude, but describe the scene you want rather than issuing a bare negative:

Instead of... | Use...
"No buildings" | "A desolate landscape with no buildings or roads"
"No text" | "Clean frame without any text overlays or watermarks"
"Not shaky" | "Smooth, stabilised camera movement"

The model responds better to descriptive exclusions than simple negatives.
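
A rough way to catch bare negatives before submitting a prompt is a simple lint pass. This is a heuristic sketch: the word list and the 20-character cutoff are arbitrary assumptions, and the function name is hypothetical.

```python
import re
from typing import List

# A clause counts as a bare negative if it opens with a negation word and
# carries only a few more characters, e.g. "No text" or "Not shaky".
BARE_NEGATIVE = re.compile(r"^\s*(no|not|without|don't|never)\b[^.,]{0,20}$",
                           re.IGNORECASE)


def bare_negatives(prompt: str) -> List[str]:
    """Return the clauses of a prompt that are nothing more than a terse negation."""
    clauses = [c.strip() for c in re.split(r"[.;\n]", prompt) if c.strip()]
    return [c for c in clauses if BARE_NEGATIVE.match(c)]
```

Anything the function returns is a candidate for rewriting as a descriptive exclusion in the style of the table above.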

Quick Reference: Audio Syntax

Type | Syntax | Example
Dialogue | Quotation marks | says, "Hello there."
Sound effects | SFX: prefix | SFX: Door slams shut.
Ambient | Ambient noise: prefix | Ambient noise: Ocean waves, seagulls.
Music | Describe in style | Gentle piano score plays.
Silence | Explicit request | No audio. / Silent.

Quick Reference: Timestamp Format

[MM:SS-MM:SS] Shot description with all five elements.

  • Use 2-second increments for 8-second videos: 00:00-00:02, 00:02-00:04, etc.
  • Include cinematography, subject, action, context, and style in each segment
  • Add SFX: and dialogue within each timestamp block
  • Maintain character consistency by describing the same subject across shots
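
The format rules above can be checked before a script is submitted. The sketch below assumes segments should be contiguous and end exactly at the clip length; the function name and the wording of the messages are illustrative.

```python
import re
from typing import List

# Matches a [MM:SS-MM:SS] marker at the start of a line.
SEGMENT = re.compile(r"^\[((\d{2}):(\d{2})-(\d{2}):(\d{2}))\]", re.MULTILINE)


def check_timestamps(script: str, total_seconds: int = 8) -> List[str]:
    """Return a list of problems; an empty list means the timing is consistent."""
    problems = []
    expected_start = 0
    for match in SEGMENT.finditer(script):
        start = int(match[2]) * 60 + int(match[3])
        end = int(match[4]) * 60 + int(match[5])
        if start != expected_start:
            problems.append(f"[{match[1]}] starts at {start}s, expected {expected_start}s")
        if end <= start:
            problems.append(f"[{match[1]}] has no positive duration")
        expected_start = end
    if expected_start != total_seconds:
        problems.append(f"segments end at {expected_start}s, expected {total_seconds}s")
    return problems
```

Gaps, overlaps, and scripts that run short or long are each reported as a separate message.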

Summary

Technique | When to Use
5-part formula | Every prompt - it's the foundation
Timestamp prompting | Multi-shot sequences in one generation
First & Last Frame | Controlled transitions between two points
Ingredients to Video | Consistent characters across multiple shots
Audio syntax | Dialogue, SFX, ambient sound direction
Negative prompting | Excluding unwanted elements

Master these and you're not just generating video - you're directing it.
