Choosing an Image Model

The image model is the AI that generates the keyframe images for each scene in your video's storyboard.

Written By Rishikesh from ngram

Last updated 3 months ago

After the script is written, the image model creates visual frames that represent the start and end of each scene. These keyframes are then animated by the video model to produce the final video clips.

Where to Find It

The image model selector is in the settings bar below the prompt box. Look for the button labeled with the current image model name.

Image model selector in the settings bar

Click the button to open a dropdown showing all available image generation providers.

Available Providers

Google (Default)

Google is the default image model provider with several options: Nano Banana Pro (default), Nano Banana, and Nano Banana 2 (newest). These produce high-quality, versatile images that work well across a wide range of visual styles. Google's image generation integrates tightly with the rest of the ngram AI Studio pipeline, making it the most reliable choice for consistent results.

Best for: General-purpose image generation, most video styles, reliable consistency across scenes.

OpenAI

OpenAI offers GPT Image 1.5, which produces good-quality images across a wide variety of subjects.

Best for: Alternative to Google's models, general-purpose image generation.

Seedance

Seedance offers the Seedream 4.5 and Seedream 4.0 models, which are strongest at distinctive, stylized images that stand out from standard AI-generated visuals.

Best for: Videos where you want a unique visual look, creative or artistic content.

Flux

Flux offers multiple models: FLUX.2 Pro, FLUX.2 Flex, and Flux Kontext Max. These models are known for generating highly detailed, photorealistic images. If your video needs images that look close to real photographs -- such as product shots, corporate settings, or realistic environments -- Flux is a strong option.

Best for: Realistic photography style, product demos, corporate videos, scenarios requiring high detail.

How the Image Model Affects Your Video

The image model controls the visual foundation of your video. Each scene in your storyboard gets keyframe images, and these images determine:

Visual quality -- how sharp, detailed, and polished each scene looks
Style accuracy -- how well the generated images match your selected visual style (e.g., Realistic Photo, Flat Vector, Watercolor)
Scene consistency -- how visually coherent images look across all scenes in the video
Subject accuracy -- how well the image represents what the script describes

The image model does not affect:

The script or narration (that is the text model)
The animation and motion (that is the video model)
The voiceover (that is the voice setting)

Choosing the Right Image Model

Stick with Google (the default) if you are just getting started. It handles the widest range of styles and subjects reliably.

Consider switching if:

You need photorealistic images: Try Flux for its strength in realistic detail
You want a more artistic or stylized look: Experiment with Seedance (Seedream models)
You want an alternative general-purpose model: Try OpenAI's GPT Image 1.5
Your current results are not matching expectations: Try a different model to see if it interprets your scene descriptions better
You are exploring creative directions: Generate the same storyboard with different image models to compare visual approaches
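
The guidance above can be summarized as a simple lookup. The following is only an illustrative sketch: the provider names come from this article, but the need labels and the suggest_provider function are hypothetical and are not part of any ngram API.

```python
from typing import Optional

# Illustrative only: provider names come from this article; the need
# labels and this function are hypothetical, not an ngram API.
SUGGESTIONS = {
    "photorealistic": "Flux",          # realistic detail, product shots
    "stylized": "Seedance",            # artistic or distinctive looks (Seedream models)
    "general_alternative": "OpenAI",   # GPT Image 1.5 as a general-purpose alternative
}

DEFAULT_PROVIDER = "Google"


def suggest_provider(need: Optional[str] = None) -> str:
    """Return a provider to try first, falling back to the default."""
    if need is None:
        return DEFAULT_PROVIDER
    return SUGGESTIONS.get(need, DEFAULT_PROVIDER)


print(suggest_provider("photorealistic"))
print(suggest_provider())
```

If results still do not match expectations, the article's advice is to try another provider on the same storyboard before rewriting the prompt.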

Tips for Better Image Results

Pair your image model with the right visual style. The Style setting (Auto, Realistic Photo, Flat Vector, etc.) works together with the image model. Some models handle certain styles better than others.
Write descriptive prompts. The more detail in your original prompt about the visual look you want, the better the image model can deliver. Mention specific visual elements, environments, or moods.
Review the storyboard carefully. After the AI generates your storyboard, check each scene's keyframe images. If a particular scene does not look right, you can ask the AI to regenerate just that scene.
Experiment across models. Different image models can produce surprisingly different results from the same scene description. If you are not satisfied with one model's output, try another before rewriting your prompt.

Understanding Keyframes

Each scene in your storyboard typically has two keyframe images:

Start keyframe -- the image shown at the beginning of the scene
End keyframe -- the image shown at the end of the scene

The video model then animates between these two keyframes, creating smooth motion for the scene. This is why the quality and accuracy of your keyframe images directly affect the quality of your final video.

You can navigate between keyframes in the storyboard panel using the arrow buttons on each scene card.

Credit Usage

Different image models have different credit costs:

The default Google model offers a good balance of quality and credit efficiency
Specialized models like Flux or Seedance may use more credits per image
The total credit cost depends on the number of scenes (more scenes means more keyframe images)

You can monitor your credit usage in real time at the top of the chat page during video creation.
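
To see how scene count drives image credit cost, here is a rough back-of-the-envelope sketch. It assumes two keyframes per scene (the typical case described in this article) and uses made-up per-image credit rates; actual credit costs are not stated here and will differ.

```python
# Assumption: two keyframes (start and end) per scene, the typical case.
KEYFRAMES_PER_SCENE = 2


def estimate_image_credits(num_scenes: int, credits_per_image: float) -> float:
    """Estimate credits spent on keyframe images for one storyboard.

    credits_per_image is a placeholder value; real rates depend on the
    image model and are not given in this article.
    """
    if num_scenes < 0 or credits_per_image < 0:
        raise ValueError("num_scenes and credits_per_image must be non-negative")
    return num_scenes * KEYFRAMES_PER_SCENE * credits_per_image


# Hypothetical rates, for illustration only.
print(estimate_image_credits(6, 1.0))  # 6 scenes at 1.0 credit per image
print(estimate_image_credits(6, 1.5))  # same storyboard, pricier model
```

With these placeholder rates, the same six-scene storyboard costs 12.0 credits on the cheaper model and 18.0 on the more expensive one, which shows why both the model choice and the number of scenes matter.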

Next Steps

Choosing a Video Model -- select the AI that animates your keyframe images into video clips
Choosing a Visual Style -- pick an art direction that works with your image model
Choosing a Text Model -- the script quality influences how well image generation works