Google has updated Gemini 1.5 Pro so that it can process audio input. The model can now analyze audio files, from corporate earnings calls to the audio in video content, without relying on a text transcript.
Announced at the Google Cloud Next event, the update also marks the first time Gemini 1.5 Pro is available to the public, through Vertex AI, Google’s platform for building AI applications. First unveiled in February, Gemini 1.5 Pro outperforms even Gemini Ultra, the largest model in the Gemini family, in speed and efficiency. It is designed to follow complex instructions, which removes the need to fine-tune the model.
However, Gemini 1.5 Pro remains available only to Vertex AI users. Most people encounter Gemini through its chatbot interfaces, and Gemini Ultra powers the advanced chatbot. Gemini Ultra can parse long instructions, but Gemini 1.5 Pro is faster.
Google is also updating Imagen 2, the model behind Gemini’s image generation features, with inpainting and outpainting. Inpainting lets users add or remove elements within an image, while outpainting extends an image beyond its original borders. In addition, Google’s SynthID technology is now applied to all Imagen-generated images. It embeds an imperceptible watermark so that an image’s origin can be traced with specialized detection tools.
Inpainting and outpainting are already offered by other leading text-to-image models, and similar features are increasingly available in consumer devices such as the latest Samsung Galaxy smartphones.
To keep AI-generated responses accurate and current, Google is exploring integrations with Google Search so that responses are grounded in up-to-date information. The company remains cautious about sensitive topics and has notably chosen to steer its AI-generated content clear of the 2024 US election.
As these features roll out, they show how quickly Google is expanding what its AI models can take in and create.
By Andrej Kovacevic
Updated on 23rd April 2024