In a stunning leap forward for artificial intelligence, Google DeepMind has unveiled two groundbreaking models in its Gemini Robotics family, redefining what robots can do in real-world settings. Imagine a future where robots don’t just respond to commands but intelligently plan and execute tasks with unparalleled precision!

The newly launched models, Gemini Robotics-ER 1.5 and Gemini Robotics 1.5, work hand in hand to enhance reasoning, vision, and action capabilities in robots. This dual-model approach addresses the shortcomings of previous generations, where one system struggled to both plan and perform, often leading to frustrating errors and delays.

The star of the show, Gemini Robotics-ER 1.5, acts as the planner—think of it as the brain behind the brawn. This vision-language model (VLM) excels in advanced reasoning and tool integration, crafting complex, multi-stage plans with ease. It can even tap into the vast information available on Google Search, making decisions informed by real-time data. Imagine a robot that can analyze a task, consult the internet for best practices, and then execute flawlessly!

Once a plan has been set in motion, it’s the Gemini Robotics 1.5’s turn to shine. This vision-language-action (VLA) model translates the planner’s blueprints into actionable commands for the robot. It identifies the most efficient path to success while providing understandable explanations of its decision-making process. In other words, these robots aren’t just following orders—they’re thinking on their feet!

Picture this: a robot sorting items into compost, recycling, and trash bins after consulting local recycling guidelines online. It analyzes the objects, plans the sorting process, and executes it flawlessly. The Gemini system’s ability to tackle such complex, multi-step tasks sets a new standard for robotics.

DeepMind has designed these AI models to be adaptable, accommodating robots of all shapes and sizes thanks to their advanced spatial awareness. While developers can currently access the orchestrator model via the Gemini API in Google AI Studio, Gemini Robotics 1.5 is presently available to select partners. This is just the beginning of an AI revolution in robotics!