Google is taking AI-generated image and video creation to the next level

Google has been making serious waves in the AI world of late, and its newest announcement shows it has no plans to slow down. With Gemini 2.0 just announced, you'd think Google would be content to rest on its laurels for the rest of 2024. But the arrival of the new Veo 2, an updated Imagen 3, and Whisk is evidence that there's more to Google's AI plans than we expected it to deliver at this point.

Veo 2 isn't just an incremental improvement; it marks a significant leap in the quality and realism of AI-generated video. In tests where human raters compared Veo 2 against other leading models, it consistently came out on top. This new iteration demonstrates a deeper understanding of real-world physics, human movement, and even the nuances of cinematic expression. Want something as complex as a low-angle tracking shot with a shallow depth of field? Veo 2 can deliver, at resolutions up to 4K and with extended video lengths.

One of the most impressive aspects of Veo 2 is its ability to understand and respond to cinematic language. You can specify the genre, lens type, and desired effects, and Veo 2 will incorporate those elements into the generated video. This level of control opens up exciting possibilities for filmmakers and creators.

Imagen 3 has also received a major boost, producing brighter, more compelling images with richer detail and textures. It now excels at recreating a wide range of artistic styles, from photorealism and impressionism to abstract art and anime.

Google says that Imagen 3 now follows prompts more faithfully, giving users greater control over the final output. Like Veo 2, Imagen 3 has undergone rigorous testing, achieving state-of-the-art results in comparisons with other leading image generation models.

Alongside the upgraded image and video models, Google has introduced a brand new tool called Whisk. This experimental platform allows users to input or create images and then remix them into unique creations. Whether you want to design a digital plushie or generate ideas for a sticker, Whisk can help you visualize your concepts.

Whisk leverages the power of Imagen 3 and combines it with Gemini's visual understanding and description capabilities. Gemini analyzes your input images and generates detailed captions, which are then fed into Imagen 3 to create variations and remixes.

Veo 2 is currently available through Google Labs' VideoFX tool, with plans to expand to YouTube Shorts and other products next year. The latest version of Imagen 3 is rolling out in ImageFX in more than 100 countries. Whisk is launching in the U.S. as part of Google Labs.

Google's commitment to responsible AI development carries over to these new tools. The company is gradually expanding access to Veo 2 so it can carefully monitor and improve quality and safety. All outputs from these models include an invisible SynthID watermark that identifies them as AI-generated, helping to combat misinformation.

Vivid News Wave
