
Google introduces PaliGemma 2 vision-language AI models [Video]


Google has introduced PaliGemma 2, a new family of vision-language models offering scalable performance, long captioning, and support for specialized tasks.

PaliGemma 2 was announced December 5, 2024, nearly seven months after the original PaliGemma launched as the first vision-language model in the Gemma family. Building on Gemma 2, PaliGemma 2 models can see, understand, and interact with visual input, according to Google.

PaliGemma 2 makes it easier for developers to add more sophisticated vision-language features to apps, Google said. It also improves captioning, including identifying emotions and actions in images.

The family scales across three model sizes (3B, 10B, and 28B parameters) and three input resolutions (224px, 448px, and 896px), so developers can trade accuracy against compute for a given task. Long captioning in PaliGemma 2 generates detailed, contextually relevant captions that go beyond simple object identification to describe actions, emotions, and the overall narrative of the scene, Google said.
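To make the size and resolution options concrete, here is a minimal Python sketch that enumerates the nine combinations listed above and picks one under a parameter budget. The sizes and resolutions come from the article; the label format produced by `variant_name` and the selection rule in `pick_variant` are hypothetical, chosen only for illustration, and are not Google's official checkpoint names or guidance.

```python
from itertools import product

# Sizes and resolutions as listed in the article.
MODEL_SIZES = ("3B", "10B", "28B")
RESOLUTIONS_PX = (224, 448, 896)


def variant_name(size: str, resolution_px: int) -> str:
    """Build an illustrative label (hypothetical naming scheme)."""
    return f"paligemma2-{size.lower()}-{resolution_px}px"


def all_variants() -> list[str]:
    """Every size/resolution combination, smallest model first."""
    return [variant_name(s, r) for s, r in product(MODEL_SIZES, RESOLUTIONS_PX)]


def pick_variant(max_params_b: int, min_resolution_px: int) -> str:
    """Assumed rule: largest model within budget, lowest sufficient resolution."""
    sizes = [s for s in MODEL_SIZES if int(s.rstrip("B")) <= max_params_b]
    resolutions = [r for r in RESOLUTIONS_PX if r >= min_resolution_px]
    if not sizes or not resolutions:
        raise ValueError("no variant satisfies the constraints")
    return variant_name(sizes[-1], resolutions[0])


if __name__ == "__main__":
    for name in all_variants():
        print(name)
    print("chosen:", pick_variant(max_params_b=10, min_resolution_px=400))
```

A task such as reading small text in documents might justify a higher `min_resolution_px`, while a latency-sensitive app might lower `max_params_b`; the right trade-off depends on the workload.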
