Voyager: An Open-Ended Embodied Agent with Large Language Models

The paper introduces Voyager, an embodied lifelong learning agent in Minecraft that continuously explores the world, acquires diverse skills, and makes novel discoveries without human intervention. Voyager consists of three key components: an automatic curriculum that maximizes exploration, an ever-growing skill library of executable code for storing and retrieving complex behaviors, and a new iterative prompting mechanism that incorporates environment feedback, execution errors, and self-verification for program improvement. Voyager interacts with GPT-4 via blackbox queries, which bypasses the need to fine-tune model parameters.

Empirically, Voyager shows strong in-context lifelong learning capability and exceptional proficiency in playing Minecraft. It obtains 3.3x more unique items, travels 2.3x longer distances, and unlocks key tech tree milestones up to 15.3x faster than the prior state of the art. Voyager can also use its learned skill library in a new Minecraft world to solve novel tasks from scratch, while other techniques struggle to generalize.

The paper also discusses the challenges of building generally capable embodied agents that continuously explore, plan, and develop new skills in open-ended worlds. The authors argue that classical approaches rely on reinforcement learning (RL) and imitation learning that operate on primitive actions, which can make systematic exploration, interpretability, and generalization difficult. Recent large language model (LLM) based agents instead harness the world knowledge encapsulated in pre-trained LLMs to generate consistent action plans or executable policies. Such agents have been applied to embodied tasks like games and robotics, as well as to NLP tasks without embodiment. The paper concludes that Voyager serves as a starting point for developing powerful generalist agents without tuning the model parameters.
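To make the skill library and the iterative prompting mechanism described above more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not Voyager's actual implementation: the names `SkillLibrary`, `iterative_prompting`, `fake_generate`, `fake_execute`, and `fake_verify` are all hypothetical, retrieval is done by simple word overlap as a stand-in for whatever retrieval the real system uses, and the GPT-4 query and the Minecraft environment are replaced by stub functions.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class SkillLibrary:
    """Toy skill store (hypothetical): maps a task description to program source."""
    skills: Dict[str, str] = field(default_factory=dict)

    def add(self, description: str, program: str) -> None:
        self.skills[description] = program

    def retrieve(self, task: str, k: int = 2) -> List[str]:
        # Assumption: rank stored skills by word overlap with the task.
        task_words = set(task.lower().split())
        scored = [
            (len(task_words & set(desc.lower().split())), desc)
            for desc in self.skills
        ]
        scored = [(score, desc) for score, desc in scored if score > 0]
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        return [self.skills[desc] for _, desc in scored[:k]]


def iterative_prompting(
    task: str,
    library: SkillLibrary,
    generate: Callable[[str, List[str], str], str],
    execute: Callable[[str], Tuple[bool, str]],
    verify: Callable[[str, str], bool],
    max_rounds: int = 4,
) -> Optional[str]:
    """Generate a program, run it, and feed the log back until it verifies."""
    feedback = ""
    for _ in range(max_rounds):
        context = library.retrieve(task)              # reuse related skills
        program = generate(task, context, feedback)   # blackbox model query
        ran_ok, log = execute(program)                # environment feedback / errors
        if ran_ok and verify(task, log):              # self-verification step
            library.add(task, program)                # store the new skill
            return program
        feedback = log                                # refine on the next round
    return None


# --- Stubs standing in for the LLM and the game (purely illustrative) ---
def fake_generate(task: str, context: List[str], feedback: str) -> str:
    if "pickaxe" in feedback:
        return "craft_pickaxe(); mine('iron_ore')"
    return "mine('iron_ore')"


def fake_execute(program: str) -> Tuple[bool, str]:
    if "craft_pickaxe" not in program:
        return False, "error: need a pickaxe to mine iron_ore"
    return True, "inventory: iron_ore x1"


def fake_verify(task: str, log: str) -> bool:
    return "iron_ore" in log


library = SkillLibrary()
result = iterative_prompting(
    "mine iron ore", library, fake_generate, fake_execute, fake_verify
)
print(result)
print(library.retrieve("mine some iron"))
```

In this toy run the first generated program fails, the error log is fed back into the next query, and the corrected program is stored so that a later, similarly worded task can retrieve it. An automatic curriculum would sit one level above this loop and decide which `task` to propose next.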

