Selecting a Voice for Narration

Every video created in ngram AI Studio includes an AI-generated voiceover that narrates the script.

Written by Rishikesh from ngram

Last updated about 1 month ago

The voice setting lets you choose which voice reads your script aloud, giving you control over the tone, personality, and feel of your video's narration.

Where to Find It

The voice selector is in the settings bar below the prompt box. It displays the name of the currently selected voice (e.g., "Alexandra").

Click the Voice button to open the voice selection dialog.

[Screenshot: voice selector dialog showing available voices]

How the Voice Selection Dialog Works

When you click the Voice button, a dialog opens showing all available voices. The dialog includes:

A search bar at the top so you can quickly find a voice by name
A grid of voice options, each showing the voice name
A highlight on the currently selected voice

To select a voice, click it. The dialog closes and your selection appears in the settings bar.

Available Voices

ngram AI Studio offers a variety of AI voices, each with different characteristics. Here are some of the available options:

Voice	Characteristics
Alexandra	Clear and professional, well-suited for corporate and product content
Cassidy	Warm and approachable, works well for tutorials and friendly brand content
Fable	Engaging and narrative-driven, good for storytelling and brand videos
Sage	Calm and authoritative, ideal for educational and explainer content
Coral	Bright and energetic, effective for marketing and promotional videos
Onyx	Deep and confident, strong choice for product demos and announcements

Additional voices are available in the dialog. Each voice brings a different energy and personality to your video.

How the Voice Affects Your Video

The voice setting controls the voiceover narration of your video. Specifically:

The spoken audio track is generated using the selected voice
The tone and personality of the narration change based on the voice
Pacing and emphasis may vary slightly between voices

The voice does not affect:

The script text itself (the text model writes it)
The visual content (the image and video models produce it)
The background music (it is generated separately)

Choosing the Right Voice

The right voice depends on your content type, brand personality, and target audience. Here are some guidelines:

For Product Demos and Corporate Videos

Choose a clear, professional voice. Alexandra or Sage work well for content that needs to feel polished and credible.

For Tutorials and Onboarding

Pick a warm, friendly voice. Cassidy or Fable create an inviting tone that makes learning feel approachable rather than intimidating.

For Marketing and Social Content

Go with an energetic voice. Coral brings enthusiasm that holds viewers' attention in shorter, attention-grabbing content.

For Announcements and Launches

A confident, authoritative voice sets the right tone. Onyx conveys importance and gravitas for major product launches or company announcements.

Finding a Specific Voice

If you know the name of the voice you want, use the search bar at the top of the voice dialog. Type part of the name and the grid filters as you type.

This is particularly helpful as more voices are added to the platform over time.
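
As a rough illustration of what filtering by a partial name means, here is a minimal Python sketch. It is not the dialog's actual implementation: the function name filter_voices is hypothetical, the voice list contains only the six names from the table in this article, and case-insensitive substring matching is an assumption about how the search behaves.

```python
# Voice names taken from the table in this article; the real dialog lists more.
VOICES = ["Alexandra", "Cassidy", "Fable", "Sage", "Coral", "Onyx"]


def filter_voices(query: str, voices: list[str] = VOICES) -> list[str]:
    """Return the voices whose name contains the query text.

    Case-insensitive substring matching is an assumption, not documented behavior.
    """
    needle = query.strip().lower()
    if not needle:
        # An empty search box shows every voice.
        return list(voices)
    return [name for name in voices if needle in name.lower()]


print(filter_voices("co"))  # ['Coral']
```

Typing more characters narrows the result further, which is why partial names are usually enough to find a voice.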

Tips for Voice Selection

Match the voice to your brand. Think about your brand's personality. A playful startup might prefer Coral or Cassidy, while an enterprise SaaS product might lean toward Alexandra or Sage.
Consider your audience. Who will watch this video? The voice should resonate with your target viewers. A developer audience might prefer a calm, no-nonsense voice, while a consumer audience might respond better to an upbeat, engaging voice.
Be consistent across videos. If you are creating a series of videos (e.g., multiple product feature demos), use the same voice across all of them to build brand recognition and a cohesive feel.
The voice does not change the script. Switching voices does not rewrite your script. The same words will be spoken regardless of which voice you choose. If you want different phrasing or tone in the text itself, adjust your prompt or edit the script directly.
You can change the voice later. If you start the video creation process and then realize you want a different voice, you can regenerate the voiceover with a different voice selection without redoing the entire video.

How Voiceover Generation Works

Here is what happens behind the scenes:

The text model writes the script
The script is split into segments, one per scene
Each segment is sent to the text-to-speech system with your chosen voice
The audio is generated and synchronized with the corresponding video clip
The final video combines the voiceover audio, video clips, and background music

Voiceover generation typically takes less than a minute and runs in parallel with video animation.
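
The per-scene step above can be pictured with a short Python sketch. Everything in it is illustrative: build_tts_requests and the request fields (scene, voice, text) are hypothetical names, not part of ngram AI Studio, and the scene texts are made up. The point it shows is that the chosen voice is attached to every segment while the script text passes through unchanged.

```python
def build_tts_requests(scene_texts: list[str], voice: str) -> list[dict]:
    """Pair each scene's narration with the chosen voice.

    The request shape is hypothetical; it only illustrates "one segment per scene".
    """
    return [
        {"scene": number, "voice": voice, "text": text}
        for number, text in enumerate(scene_texts, start=1)
    ]


# Made-up script segments, one per scene.
scene_texts = ["Meet the new dashboard.", "Set it up in minutes."]

sage_requests = build_tts_requests(scene_texts, "Sage")
onyx_requests = build_tts_requests(scene_texts, "Onyx")

# Switching voices changes only the voice field; the words stay the same.
same_text = [r["text"] for r in sage_requests] == [r["text"] for r in onyx_requests]
print(same_text)  # True
```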

Credit Usage

Voiceover generation costs 1 credit per second of audio, with a minimum reservation of 30 credits. Longer scripts (from higher duration settings) produce longer audio and use more credits. The voice selection itself does not change the credit cost -- all voices cost the same number of credits for the same audio duration.
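
To make the arithmetic concrete, here is a small Python sketch for estimating how many credits are set aside for a voiceover. It rests on two assumptions that this article does not confirm: that partial seconds are rounded up, and that the 30-credit minimum acts as a floor on the reservation. The function name reserved_credits is hypothetical, and the sketch does not model whether unused reserved credits are returned.

```python
import math

CREDITS_PER_SECOND = 1   # from this article
MIN_RESERVATION = 30     # from this article


def reserved_credits(audio_seconds: float) -> int:
    """Estimate the credits reserved for a voiceover of the given length."""
    if audio_seconds < 0:
        raise ValueError("audio_seconds must not be negative")
    # Assumption: a partial second is billed as a full second.
    usage = math.ceil(audio_seconds) * CREDITS_PER_SECOND
    # Assumption: the minimum reservation is a floor on the amount set aside.
    return max(usage, MIN_RESERVATION)


print(reserved_credits(12))  # 30 (below the minimum, so the floor applies)
print(reserved_credits(45))  # 45
```

The voice name is not an input to the calculation, which matches the rule that all voices cost the same for the same audio duration.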

Next Steps

Choosing a Visual Style -- set the art direction for your video's images
Setting Video Duration and Aspect Ratio -- the duration affects how much narration is generated
Choosing a Text Model -- the text model writes the script that the voice will narrate