Google’s New AI Models: Gemini Robotics To Handle the Messiness of the Physical World

  • Home
  • Technology
  • Google’s New AI Models: Gemini Robotics To Handle the Messiness of the Physical World

This isn’t Google’s first foray into robotics. We all remember the much talked-about ‘Everyday Robotics,’ by Google’s parent company, Alphabet, which met its end in 2023 due to budget cuts. But this time, the tech giant isn’t stopping at simple chores like trash sorting or table cleaning. Enter Gemini Robotics—a bold step forward, aiming to equip robots with intelligence and adaptability whose purpose, as Kanishka Rao, the DeepMind engineer, said, is to “be able to deal with the messiness (of our world).”

That’s right! Google is introducing two new AI models for robotics built upon the foundations of dexterity, interaction, and spatial understanding, aiming to achieve what has always been a challenge in the robotics world—handling unfamiliar scenarios.

The two debuting models for robotics are:

  • Gemini Robotics: an advanced vision-language-action (VLA) model that was built on Gemini 2.0 layered with physical actions as a new output modality for the purpose of directly controlling robots.
  • Gemini Robotics ER: a Gemini model with advanced spatial understanding, enabling roboticists to run their own programs using Gemini’s embodied reasoning (ER) abilities.

Soure: (Bing Images) DeepMind introducing Google’s Gemini Robotics 

The Real Promise: What is the Key to Sorting the Mess?

Rao’s statement about robots and how they will tackle their awaited role in handling the mess of our world is directly proportional to the scope of their spatial understanding; in other words, how far along is robotics in handling unfamiliar situations and why Google aims to take it to the next step.

Sundar Pichai, CEO, Google and Alphabet, told in an interview about the key focus of robotics and AI industries, which he called agentic workflow.

“Anywhere you can describe a task in natural language, and AI can act on your behalf and accomplish that.”

That, according to Pichai, is the real powerful impact that the companies are chasing and the fulfillment of the “real promise of having an AI assistant.” The progress in this is so far 85%.

This promise is what Google is trying to materialize through Gemini Robotics, which brings AI into the physical world. Multiple advanced features are imbued in the models that make real-life learning and problem-solving possible.

Features of Gemini Robotics

  • Generality: Gemini Robotics can generalize to novel situations, environments, and tasks it hasn’t been trained on.
  • Interactivity: It uses Gemini 2.0’s advanced language understanding to respond to conversational commands in multiple languages and adapt to environmental changes.
  • Dexterity: The model can handle intricate tasks requiring fine motor skills, such as origami folding or handling objects delicately.

Google’s Focus on The Ethical Aspect

Through collaborations with multiple partners, Google is not letting the leash loose on security measures. They are determined to take a step-by-step approach to slowly bring these Gemini models into the social expanse. A DeepMind research scientist, Vikas Sindhwani, on the concerns for safety ensured that these Gemini models had a “strong understanding of common sense safety” in the physical environment. 

Source: (Bing Images) Gemini Just Got Physical!

We have been warned time and again about the dangers of unchecked use and development of AI and technology. Hence, Google does not plan on direct human interaction; rather, they plan to gradually introduce these models over time, hence terming it as an “early exploration.” These are among the several safety precautions that the team has shared with us regarding the debut of these innovative models that are bound to reshape the world of robotics.

Conclusion, Collaborations and Competition: Apptronik, Meta and More

Google announced plans to begin exploring Gemini’s robotics capabilities through collaborations with industry leaders, including Apptronik, with whom it is working to create humanoid robots. Other collaborators trialing its Gemini Robotics-ER model include Agile Robots and Boston Dynamics, a company Alphabet purchased in 2013 and later sold to SoftBank Group. But the question is, what does this new innovation streak mean for the competitors? The realm of robotics and AI definitely has multiple players at work: Meta, OpenAI, etc. Expert take is that this bold step by Google is bound to set up a chain reaction. We can see ourselves entering into a new and fresh wave of technological revolution. 

Source: (Bing Images) Google Gemini enables Robots to respond to unfamiliar situations

At vero eos et accusamus et iusto odio digni goikussimos ducimus qui to bonfo blanditiis praese. Ntium voluum deleniti atque.

Melbourne, Australia
(Sat - Thursday)
(10am - 05 pm)