Audio Voiceover Direction Pack

Category: creative
Subcategory: audio-production
Difficulty: beginner
Target models: claude-sonnet, gpt, gemini-pro
Variables: {{project_goal}} {{audience}} {{script_text}} {{tone_targets}} {{duration_target}} {{constraints}}
Tags: audio, voiceover, narration, creative-ops, speech
Updated April 23, 2026

The Prompt

You are a voice director. Build a practical direction pack that helps a narrator, dubbing editor, or synthetic-voice operator deliver the script with the intended tone and timing.

PROJECT GOAL:
{{project_goal}}

AUDIENCE:
{{audience}}

SCRIPT TEXT:
{{script_text}}

TONE TARGETS:
{{tone_targets}}

DURATION TARGET:
{{duration_target}}

CONSTRAINTS:
{{constraints}}

Return exactly:
1) Direction check
   - what the narration is trying to accomplish
   - what is still unclear
2) Master narrator brief
   - emotional job
   - audience stance
   - do and do-not cues
3) Three direction lanes
   - lane name
   - emotional posture
   - pacing guidance
   - emphasis map
   - pause cues
   - pronunciation or term watch-outs
4) Line-by-line script notes
   - delivery instruction per line or sentence
5) Timing and edit risks
   - lines likely to overrun the duration target
   - lines that may need simplification or visual support
6) Review checklist
   - clarity
   - tone fit
   - pacing
   - naturalness
   - audience fit
7) Localization and dubbing notes
   - terms to keep unchanged
   - phrases that may break when translated or dubbed

Rules:
- Keep the output tool-agnostic.
- Do not use vendor-specific tags, voice IDs, or proprietary parameters.
- Keep directions concrete and actionable enough for a human narrator or a dubbing editor to use immediately.
- If the script is too dense for the duration target, say so plainly and suggest where to cut.

When to Use

Use this when the script already exists but the delivery does not. It is useful before human recording, TTS generation, or dubbing passes for product demos, explainers, campaign voiceovers, onboarding videos, and short-form recap clips.

This matters more now that modern voice and dubbing tools preserve emotion and timing better than older systems: they can follow direction, but only when they are given a clean direction brief. Most bad synthetic narration is a script-direction failure long before it becomes a model-quality failure.

Variables

| Variable | Description | Example |
|---|---|---|
| project_goal | What the narration should accomplish | “Help new users understand the workflow in under 90 seconds” |
| audience | Who will listen and how familiar they are with the topic | “Procurement leads who know the category but not our product” |
| script_text | The current script draft | “Welcome to the new reporting flow…” |
| tone_targets | Desired style words or voice qualities | “Calm, precise, warm, not salesy” |
| duration_target | Approximate runtime goal | “75 seconds” |
| constraints | Brand, legal, pronunciation, accessibility, or language constraints | “Keep product names exact, avoid hype words, accessible pace” |
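
The variables above appear in the prompt as `{{name}}` placeholders. A minimal sketch of one way to fill them before sending the prompt, assuming the template is stored as a plain string. The `fill_prompt` helper and the shortened `TEMPLATE` are hypothetical and only illustrate the substitution step; they are not part of this pack.

```python
import re

# Hypothetical, shortened stand-in for the full prompt text.
TEMPLATE = """PROJECT GOAL:
{{project_goal}}

DURATION TARGET:
{{duration_target}}
"""

# Matches placeholders of the form {{variable_name}}.
PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")


def fill_prompt(template: str, values: dict[str, str]) -> str:
    """Replace each {{name}} marker and fail loudly if a value is missing."""
    missing = sorted(set(PLACEHOLDER.findall(template)) - set(values))
    if missing:
        raise KeyError("missing values for: " + ", ".join(missing))
    return PLACEHOLDER.sub(lambda match: values[match.group(1)], template)


filled = fill_prompt(
    TEMPLATE,
    {
        "project_goal": "Help new users understand the workflow in under 90 seconds",
        "duration_target": "75 seconds",
    },
)
print(filled)
```

Raising on a missing value, instead of leaving the marker in place, keeps an unfilled `{{constraints}}` from reaching the model as literal text.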

Tips & Variations

  • Ask for a “host-read” and “neutral narrator” lane when delivery style is still undecided.
  • Add a pronunciation table if the script includes names, product terms, acronyms, or multilingual phrases.
  • Request an accessibility version with slower pacing and clearer phrasing when captions or comprehension are a priority.
  • If the script will be dubbed later, ask the model to flag lines that are likely to drift in timing or meaning after translation.
  • If your tooling cannot synthesize audio directly, use the output as a recording brief for a human narrator.

Example Output

Master narrator brief: Speak like someone clarifying a process, not selling a miracle. Confidence comes from precision, not intensity.

Timing risk: Paragraph 2 is too dense for a 75-second read and should either be shortened or split across visuals.
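
A timing risk like the one above can be sanity-checked with a rough word count before any recording or synthesis. The sketch below assumes a speaking rate of about 150 words per minute, which is an assumed default and not a figure from this pack; adjust it to the narrator, language, or voice in use. The function names are hypothetical.

```python
def estimate_seconds(text: str, words_per_minute: float = 150.0) -> float:
    """Estimate read time from a whitespace word count (a coarse approximation)."""
    words = len(text.split())
    return words * 60.0 / words_per_minute


def overrun_report(paragraphs: list[str], target_seconds: float,
                   words_per_minute: float = 150.0) -> dict:
    """Compare the estimated total read time against the duration target."""
    per_paragraph = [estimate_seconds(p, words_per_minute) for p in paragraphs]
    total = sum(per_paragraph)
    # 1-based index of the paragraph with the longest estimated read time.
    longest = per_paragraph.index(max(per_paragraph)) + 1 if per_paragraph else None
    return {
        "total_seconds": total,
        "over_by": max(0.0, total - target_seconds),
        "per_paragraph": per_paragraph,
        "longest_paragraph": longest,
    }


# Placeholder script: two paragraphs of 100 and 125 filler words.
script = ["word " * 100, "word " * 125]
report = overrun_report(script, target_seconds=75)
print(report["total_seconds"], report["over_by"], report["longest_paragraph"])
```

A word count ignores pauses, emphasis, and hard-to-pronounce terms, so treat the estimate as a first filter for where to cut, not as a replacement for a timed read.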