Choosing an Image Model

The image model is the AI that generates the keyframe images for each scene in your video's storyboard.

Written by Rishikesh from ngram

Last updated about 1 month ago

After the script is written, the image model creates visual frames that represent the start and end of each scene. These keyframes are then animated by the video model to produce the final video clips.

Where to Find It

The image model selector is in the settings bar below the prompt box. Look for the button labeled with the current image model name.

Figure: Image model selector in the settings bar

Click the button to open a dropdown showing all available image generation providers.

Available Providers

Google (Default)

Google is the default image model provider. It produces high-quality, versatile images that work well across a wide range of visual styles. Google's image generation integrates tightly with the rest of the ngram AI Studio pipeline, making it the most reliable choice for consistent results.

Best for: General-purpose image generation, most video styles, reliable consistency across scenes.

Seedance

Seedance is a specialized image generation model suited to distinctive, stylized imagery. Its output tends to stand out from typical AI-generated visuals.

Best for: Videos where you want a unique visual look, creative or artistic content.

Flux

Flux is known for its ability to generate highly detailed and photorealistic images. If your video needs images that look close to real photographs -- such as product shots, corporate settings, or realistic environments -- Flux is a strong option.

Best for: Realistic photography style, product demos, corporate videos, scenarios requiring high detail.

Wan

Wan offers an alternative approach to image generation, with its own strengths in composition and color. It can be particularly effective for some subjects and visual styles, so it is worth testing when the other models do not give you the look you want.

Best for: Experimentation, alternative visual interpretations of your scenes.

How the Image Model Affects Your Video

The image model controls the visual foundation of your video. Each scene in your storyboard gets keyframe images, and these images determine:

  • Visual quality -- how sharp, detailed, and polished each scene looks

  • Style accuracy -- how well the generated images match your selected visual style (e.g., Realistic Photo, Flat Vector, Watercolor)

  • Scene consistency -- how visually coherent images look across all scenes in the video

  • Subject accuracy -- how well the image represents what the script describes

The image model does not affect:

  • The script or narration (that is the text model)

  • The animation and motion (that is the video model)

  • The voiceover (that is the voice setting)

Choosing the Right Image Model

Stick with Google (the default) if you are just getting started. It handles the widest range of styles and subjects reliably.

Consider switching if:

  • You need photorealistic images: Try Flux for its strength in realistic detail

  • You want a more artistic or stylized look: Experiment with Seedance

  • Your current results are not matching expectations: Try a different model to see if it interprets your scene descriptions better

  • You are exploring creative directions: Generate the same storyboard with different image models to compare visual approaches
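
The guidance above can be read as a simple decision rule. The sketch below is only an illustration of that rule: the function name and its parameters are hypothetical and are not part of ngram AI Studio, which exposes this choice through the dropdown rather than through code. It does not cover Wan or side-by-side comparisons, which are better handled by experimenting in the app.

```python
def suggest_image_model(needs_photorealism: bool = False,
                        wants_stylized_look: bool = False) -> str:
    """Return the provider to try first, following the article's guidance.

    Hypothetical helper for illustration only; not an ngram API.
    """
    if needs_photorealism:
        # Product shots, corporate settings, realistic environments
        return "Flux"
    if wants_stylized_look:
        # Creative or artistic content with a distinctive look
        return "Seedance"
    # The default handles the widest range of styles and subjects
    return "Google"


print(suggest_image_model())                         # Google
print(suggest_image_model(needs_photorealism=True))  # Flux
```

If the first suggestion does not match your expectations, the next step is still to try a different model on the same storyboard before rewriting the prompt.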

Tips for Better Image Results

  1. Pair your image model with the right visual style. The Style setting (Auto, Realistic Photo, Flat Vector, etc.) works together with the image model. Some models handle certain styles better than others.

  2. Write descriptive prompts. The more detail in your original prompt about the visual look you want, the better the image model can deliver. Mention specific visual elements, environments, or moods.

  3. Review the storyboard carefully. After the AI generates your storyboard, check each scene's keyframe images. If a particular scene does not look right, you can ask the AI to regenerate just that scene.

  4. Experiment across models. Different image models can produce surprisingly different results from the same scene description. If you are not satisfied with one model's output, try another before rewriting your prompt.

Understanding Keyframes

Each scene in your storyboard typically has two keyframe images:

  • Start keyframe -- the image shown at the beginning of the scene

  • End keyframe -- the image shown at the end of the scene

The video model then animates between these two keyframes, creating smooth motion for the scene. This is why the quality and accuracy of your keyframe images directly impact the quality of your final video.

You can navigate between keyframes in the storyboard panel using the arrow buttons on each scene card.

Credit Usage

Different image models have different credit costs:

  • The default Google model offers a good balance of quality and credit efficiency

  • Specialized models like Flux or Seedance may use more credits per image

  • The total credit cost depends on the number of scenes (more scenes means more keyframe images)

You can monitor your credit usage in real time at the top of the chat page during video creation.
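
As a rough way to reason about the points above, image credits scale with the number of keyframe images, which is the number of scenes times the keyframes per scene (typically two). The sketch below assumes a flat per-image rate. The function name is hypothetical, and the rates in the example are placeholders chosen for the arithmetic, not ngram's actual pricing, which this article does not state.

```python
def estimate_image_credits(num_scenes: int,
                           credits_per_image: float,
                           keyframes_per_scene: int = 2) -> float:
    """Estimate image credits for a storyboard.

    Assumes every keyframe costs the same flat rate (an assumption,
    not documented ngram behavior).
    """
    if num_scenes < 0 or keyframes_per_scene < 0 or credits_per_image < 0:
        raise ValueError("inputs must be non-negative")
    total_images = num_scenes * keyframes_per_scene
    return total_images * credits_per_image


# Placeholder rates for illustration only (not real prices)
print(estimate_image_credits(num_scenes=6, credits_per_image=1.0))  # 12.0
print(estimate_image_credits(num_scenes=6, credits_per_image=2.0))  # 24.0
```

The same storyboard costs twice as much at twice the per-image rate, so trimming scenes is the most direct way to reduce image credits when using a more expensive model.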

Next Steps