Google DeepMind’s robotics researchers are exploring how generative AI and large foundation models can be applied to robotics, with the aim of giving robots a better understanding of what people want from them. Traditionally, robots have been limited to performing a single task, but the newly announced AutoRT system harnesses large foundation models to expand what they can do. AutoRT uses a Visual Language Model (VLM) for situational awareness and manages a fleet of camera-equipped robots, while a large language model (LLM) suggests tasks that the robots can carry out. The system has been tested with up to 20 robots at once and 52 different devices in total, collecting over 77,000 trials. Another development is RT-Trajectory, which takes video input and overlays a sketch of the arm in action on it to train robots. This method has shown double the success rate of previous training methods. RT-Trajectory can also draw on existing robot datasets, unlocking the knowledge they contain to improve robot control policies. Taken together, these advances aim to help robots move accurately and efficiently in novel situations.
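
The AutoRT flow described above (a VLM reads each robot's camera view, an LLM proposes tasks, and the resulting trials are collected) can be pictured with a toy loop. This is only an illustrative sketch, not DeepMind's implementation: the names `Robot`, `Trial`, `describe_scene`, `suggest_tasks` and `run_fleet` are hypothetical, the two model calls are replaced by trivial stubs, and picking the first suggested task is an assumption made purely for brevity.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Robot:
    name: str
    camera_frame: str  # stand-in for an image: a comma-separated list of visible objects


@dataclass
class Trial:
    robot: str
    objects: List[str]
    task: str


def describe_scene(camera_frame: str) -> List[str]:
    """Stub for the VLM step: turn a camera frame into a list of visible objects."""
    return [item.strip() for item in camera_frame.split(",") if item.strip()]


def suggest_tasks(objects: List[str]) -> List[str]:
    """Stub for the LLM step: propose one candidate task per visible object."""
    return [f"pick up the {obj}" for obj in objects]


def run_fleet(robots: List[Robot]) -> List[Trial]:
    """Run one round over the fleet and log a trial for each robot that gets a task."""
    trials = []
    for robot in robots:
        objects = describe_scene(robot.camera_frame)
        tasks = suggest_tasks(objects)
        if not tasks:
            continue  # nothing to do in an empty scene
        # Assumption: take the first suggestion; a real system would select and vet tasks.
        trials.append(Trial(robot=robot.name, objects=objects, task=tasks[0]))
    return trials


if __name__ == "__main__":
    fleet = [Robot("robot-1", "sponge, cup"), Robot("robot-2", "")]
    for trial in run_fleet(fleet):
        print(trial)
```

In this sketch the stubs keep the example self-contained; swapping them for real model calls would leave the orchestration loop unchanged.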
