Google to Launch Imagen 4, Imagen 4 Ultra, and Veo 3 Models at Google I/O 2025

Posted by: CometAPI

Via Viblo Asia

Google is set to unveil its next-generation generative AI models (Imagen 4, Imagen 4 Ultra, and Veo 3) during its annual Google I/O developer conference on May 20, 2025. Early leaks of preview identifiers (e.g., imagen-4.0-generate-preview-05-20, imagen-4.0-ultra-generate-exp-05-20, veo-3.0-generate-preview) signal a staged rollout and multiple capability tiers across both image and video synthesis. Imagen 4 aims to deliver significant gains in photorealism, prompt fidelity, and stylistic consistency over Imagen 3, while the “Ultra” variant may offer even higher resolution or specialized performance modes. On the video side, Veo 3 promises more coherent clip-to-clip continuity and more robust style adherence than Veo 2. All three models are expected to integrate tightly with Google’s Gemini AI ecosystem, enabling seamless transitions from text prompts to images or videos within the same workflow.


Preview Identifiers and Rollout Strategy

Staged Previews: Internal references such as

  • imagen-4.0-generate-preview-05-20
  • imagen-4.0-ultra-generate-exp-05-20
  • veo-3.0-generate-preview

have surfaced in code repositories and API previews, indicating Google’s intention to offer both standard and “Ultra” performance tiers for image generation, as well as an advanced video model preview for early testers.
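The identifiers listed above appear to share a common shape: model family, version, an optional tier, a release stage, and an optional date suffix. As a minimal sketch, assuming that shape holds, they can be split into their parts with a regular expression. The pattern, the `ModelId` class, and the `parse_model_id` helper are hypothetical names inferred from these three strings only; none of this reflects a naming scheme documented by Google.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumption: this pattern is inferred from the three leaked identifiers only.
# It is not an official Google naming convention and may not match other models.
_ID_PATTERN = re.compile(
    r"^(?P<family>[a-z]+)-(?P<version>\d+\.\d+)"   # e.g. imagen-4.0, veo-3.0
    r"(?:-(?P<tier>ultra))?"                        # optional "ultra" tier
    r"-generate-(?P<stage>preview|exp)"             # release stage
    r"(?:-(?P<date>\d{2}-\d{2}))?$"                 # optional MM-DD suffix
)


@dataclass(frozen=True)
class ModelId:
    family: str
    version: str
    tier: str            # "standard" when no tier segment is present
    stage: str           # "preview" or "exp"
    date: Optional[str]  # "MM-DD", or None when the identifier has no date


def parse_model_id(identifier: str) -> ModelId:
    """Split a preview identifier into its parts (hypothetical helper)."""
    match = _ID_PATTERN.match(identifier)
    if match is None:
        raise ValueError(f"unrecognized identifier: {identifier!r}")
    return ModelId(
        family=match["family"],
        version=match["version"],
        tier=match["tier"] or "standard",
        stage=match["stage"],
        date=match["date"],
    )


if __name__ == "__main__":
    for name in (
        "imagen-4.0-generate-preview-05-20",
        "imagen-4.0-ultra-generate-exp-05-20",
        "veo-3.0-generate-preview",
    ):
        print(parse_model_id(name))
```

Read this way, the standard Imagen 4 identifier carries a "preview" stage while the Ultra identifier carries "exp", which fits the staged-rollout reading above. The meaning of "exp" (presumably "experimental") is itself an assumption.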

Google I/O Launch: These identifiers strongly suggest Google will showcase the models, and potentially grant preview access to developers, at I/O on May 20, 2025, mirroring previous rollouts for Imagen 3 and Veo 2.


What’s New in Imagen 4

Photorealism and Fidelity

  • Enhanced Rendering: Imagen 4 reportedly achieves greater photorealistic detail, reducing artifacts and improving color accuracy. Early rumors suggest it also handles complex prompts better, such as those involving nuanced lighting or reflections.
  • Prompt Adherence: The model is expected to follow user instructions more precisely, delivering images that better match both content and style directives (e.g., “oil painting of sunset over mountains”).

Style Consistency

  • Multi-Image Cohesion: Imagen 4 is designed to maintain a consistent visual style across multiple outputs, benefiting use cases like storyboarding or product catalog creation, where uniformity is critical.
  • Ultra Variant: The “Ultra” tier (imagen-4.0-ultra) likely offers higher-resolution outputs or specialized optimizations (e.g., ultra-high fidelity for print media) for enterprise and creative professionals.

What’s New in Veo 3

Improved Coherence

  • Clip-to-Clip Continuity: Veo 3 aims to generate video sequences in which successive shots maintain consistent framing, lighting, and character appearance, addressing Veo 2’s tendency toward visual drift over time.
  • Style Fidelity: The model focuses on replicating artistic or cinematic styles more faithfully, making it easier to produce videos in a desired aesthetic (e.g., noir, pastel animation).

Integration of SynthID Watermarking

  • Digital Watermarking: Leveraging DeepMind’s SynthID technology (already applied to Veo 2 output), Veo 3 will embed imperceptible watermarks to help identify AI-generated content and curb misuse.

Integration with Gemini AI

  • Seamless Access: Both Imagen 4 and Veo 3 are expected to be directly accessible through Google’s Gemini interfaces—enabling users to generate images or videos within chat-based prompts or through product interfaces like Google Photos and Google Slides.
  • Gemini Gems: Customized AI “Gems” may incorporate these models, allowing users to create specialized assistants (e.g., a travel-planning Gem that generates itinerary images and overview videos) and share them in a marketplace similar to OpenAI’s GPT Store.

Availability and Next Steps

Public Preview: Developers and enterprise testers may receive invites to experiment with Imagen 4 (standard and Ultra) and Veo 3 beginning May 20, 2025, at Google I/O, with a broader rollout to Labs and Vertex AI in the following weeks.

Feedback and Iteration: As with prior launches, Google will likely solicit user feedback to refine safety filters, watermarking robustness, and performance optimizations before general availability.

Watch This Space: Interested developers should monitor CometAPI for updates.

The new model APIs will be listed on CometAPI, which promises lower prices than Google’s to ease integration. Keep an eye on the API doc for details.
