Google DeepMind Introduces Nano Banana Pro: the Gemini 3 Pro Image Model for Text Accurate and Studio Grade Visuals

Nano Banana Pro, also called Gemini 3 Pro Image, is Google DeepMind’s new image generation and editing model built on Gemini 3 Pro. It is positioned as a state-of-the-art system for creating and editing images that must respect structure, world knowledge and text layout, not only style. Nano Banana Pro follows Nano Banana, which was based on Gemini 2.5 Flash Image and focused on fast, casual image editing such as restoring photos and generating figurines.

From Gemini 2.5 Flash Image to Gemini 3 Pro Image

The earlier Nano Banana model targeted quick creative edits for casual creators. It helped restore old photos and build stylized 3D mini figurines from a simple prompt. Nano Banana Pro keeps that editing flow but runs on top of Gemini 3 Pro, which brings stronger reasoning and real-world knowledge into the image stack.

The model can turn prototypes, data tables and handwritten notes into diagrams and infographics that reflect the underlying information, rather than producing only decorative art.

Reasoning Guided, Search Grounded Visuals

A core design point for Nano Banana Pro is reasoning-guided generation. Using Gemini 3 Pro, the model can consume text, structured content and references, then plan the image as an explanation of that content. Nano Banana Pro can also connect to Google Search, using the search index as a real-time knowledge source.

Clear Text and Multilingual Layouts

Text inside images is a long-standing failure mode for many diffusion-based generators. Nano Banana Pro addresses this explicitly. Google states that it is the best model in the Gemini family for producing images with correctly rendered and legible text, for both short taglines and full paragraphs.

Gemini 3 Pro’s multilingual reasoning flows into the image model. Nano Banana Pro can render text in multiple languages and also translate text that already appears in products or posters. The documentation shows beverage cans where English text is translated into Korean while the visual design and layout stay unchanged.

Studio Level Control, Consistency and Upscaling

Nano Banana Pro exposes a set of controls aimed at design and production workflows rather than single-shot art prompts. On the composition side, the model can use up to 14 input images and maintain the consistency and resemblance of up to 5 people in one workflow. This supports tasks such as combining reference photos into a single fashion editorial, transforming sketches into product shots or keeping the same cast across multiple scenes.
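The composition limits above can be checked on the client side before any request is sent. The following is a minimal sketch under stated assumptions: `CompositionRequest` and `validate` are hypothetical names invented for illustration and are not part of any Google SDK. Only the two numeric limits (14 input images, 5 people) come from the article.

```python
from dataclasses import dataclass, field
from typing import List

# Limits as stated in the article; everything else here is illustrative.
MAX_INPUT_IMAGES = 14
MAX_CONSISTENT_PEOPLE = 5


@dataclass
class CompositionRequest:
    """Hypothetical container for a multi-image job (not a real SDK type)."""
    prompt: str
    image_paths: List[str] = field(default_factory=list)
    people: List[str] = field(default_factory=list)  # labels for recurring subjects


def validate(request: CompositionRequest) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if len(request.image_paths) > MAX_INPUT_IMAGES:
        problems.append(
            f"{len(request.image_paths)} input images exceed the limit of {MAX_INPUT_IMAGES}"
        )
    # Count each person once, even if they appear in several reference images.
    distinct_people = set(request.people)
    if len(distinct_people) > MAX_CONSISTENT_PEOPLE:
        problems.append(
            f"{len(distinct_people)} people exceed the limit of {MAX_CONSISTENT_PEOPLE}"
        )
    return problems
```

A caller would run `validate(...)` first and only build the actual API request when the returned list is empty.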

The studio control section of the model page lists several families of controls. Users can vary camera angle and shot type, including wide shot, panoramic and close-up, while controlling depth of field and focus on specific subjects in the image. Color and lighting can be adjusted, for example changing day to night, replacing volumetric lighting with bokeh or applying a strong chiaroscuro effect without losing subject identity.

Nano Banana Pro supports explicit upscaling. The official Google blog states that it can generate crisp visuals at 1K, 2K or 4K resolution, and provides examples of progressive zoom-in operations that keep detail and composition. Aspect ratio is also programmable. Prompts can convert between ratios such as 1:1, 4:3, 16:9 and cinematic formats while keeping the main character locked in place and adjusting only the background.
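To make the interaction between resolution tier and aspect ratio concrete, here is a small arithmetic sketch. It assumes that the 1K, 2K and 4K tiers correspond to a long edge of 1024, 2048 and 4096 pixels. That mapping is an assumption for illustration, not something the article or Google's documentation is quoted as saying here, and `target_size` is a hypothetical helper.

```python
from fractions import Fraction
from typing import Tuple

# Assumption: each tier names the long edge in pixels.
LONG_EDGE = {"1k": 1024, "2k": 2048, "4k": 4096}


def target_size(aspect_ratio: str, tier: str) -> Tuple[int, int]:
    """Return (width, height) for a ratio like '16:9' at a tier like '2k'."""
    w, h = (int(part) for part in aspect_ratio.split(":"))
    ratio = Fraction(w, h)  # exact width / height
    long_edge = LONG_EDGE[tier.lower()]
    if ratio >= 1:
        # Landscape or square: width is the long edge.
        return long_edge, round(long_edge / ratio)
    # Portrait: height is the long edge.
    return round(long_edge * ratio), long_edge


print(target_size("16:9", "2k"))  # (2048, 1152)
print(target_size("4:3", "4k"))   # (4096, 3072)
```

Using `Fraction` keeps the ratio exact, so rounding happens only once when the short edge is computed.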

Key Takeaways

  • Nano Banana Pro is Gemini 3 Pro Image, an upgraded image generation and editing model that succeeds Nano Banana, which was based on Gemini 2.5 Flash Image, and is optimized for higher quality and control.
  • The model integrates Gemini 3 Pro reasoning and Google Search grounding so it can turn factual content, documents and real-time data into infographics, recipes, process diagrams and other information-dense visuals.
  • It provides strong text rendering and multilingual support, producing legible typography in images and enabling translation or localization of existing on-image text while preserving layout and design.
  • Nano Banana Pro supports up to 14 input images and maintains resemblance for up to 5 people, with studio-style controls for camera angle, depth of field, lighting, aspect ratios and upscaling to 1K, 2K and 4K resolutions.
  • The model is being deployed across the Gemini app, AI Mode in Search, NotebookLM, Google Ads, Workspace apps, the Gemini API, Google AI Studio, Vertex AI, Antigravity and Flow, with all outputs watermarked using SynthID plus tier-specific visible watermarks.

Editorial Comments

Nano Banana Pro positions Gemini 3 Pro Image as a production-oriented image system that links Gemini 3 Pro reasoning, Google Search grounding and structured controls for layout, text and upscaling. It directly addresses long-standing issues in text rendering, multilingual localization and subject consistency, while keeping SynthID and visible watermarks as default provenance signals across tiers and surfaces. This launch moves Google’s image stack closer to an integrated, API-first visual platform for developers and enterprises.



The post Google DeepMind Introduces Nano Banana Pro: the Gemini 3 Pro Image Model for Text Accurate and Studio Grade Visuals appeared first on MarkTechPost.