LanCon-Learn: Learning with Language to Enable Generalization in Multi-Task Manipulation Domains

Andrew Silva1, Nina Moorman1, William Silva1, Zulfiqar Zaidi1, Nakul Gopalan2, Matthew Gombolay1

  • 1Georgia Institute of Technology
  • 2Arizona State University

Details

16:35 - 16:40 | Tue 24 May | Room 115A | TuB06.12

Session: Imitation Learning

Abstract

To realize the vision of ubiquitous and useful robot assistance in the real world, robots must be capable of learning from previously solved tasks and generalizing that knowledge to quickly perform new tasks. While multi-task learning research has produced agents capable of performing multiple tasks, these tasks are often encoded as one-hot goals. In contrast, natural language specifications offer an accessible means both for (1) users to describe a set of new tasks to the robot and (2) robots to reason about the similarities and differences among tasks through language-based task embeddings. Until now, multi-task learning with language has been limited to navigation-based tasks and has not been applied to continuous manipulation tasks, which require precision to grasp and move objects. We present LanCon-Learn, a novel attention-based approach to language-conditioned multi-task learning in manipulation domains that enables learning agents to reason about relationships between skills and task objectives through natural language and interaction. We evaluate LanCon-Learn for both reinforcement learning and imitation learning, across multiple virtual robot domains and in a demonstration on a physical robot. LanCon-Learn achieves up to a 200% improvement in zero-shot task success rate and transfers known skills to novel tasks faster than non-language-based baselines, demonstrating the utility of language for goal specification.
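A minimal sketch of the abstract's key contrast, assuming toy embedding vectors rather than the paper's actual architecture or encoder: one-hot task goals are mutually orthogonal, so a novel task shares no structure with known tasks, whereas language-based task embeddings can place related instructions near each other, giving a conditioned policy a transfer signal. The instruction strings and vectors below are hypothetical stand-ins for the output of a sentence encoder.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two task-embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# One-hot goals for three known tasks: every pair is equally unrelated,
# so nothing learned on one task transfers to another by construction.
one_hot = np.eye(3)
assert cosine(one_hot[0], one_hot[1]) == 0.0

# Toy "language" embeddings (hypothetical values standing in for a real
# sentence encoder). The two push instructions share the verb and object
# type, so their vectors are close; the drawer task is not.
lang = {
    "push the red block":  np.array([0.9, 0.1, 0.8, 0.0]),
    "push the blue block": np.array([0.9, 0.1, 0.0, 0.8]),
    "open the drawer":     np.array([0.0, 0.9, 0.1, 0.1]),
}

sim_related   = cosine(lang["push the red block"], lang["push the blue block"])
sim_unrelated = cosine(lang["push the red block"], lang["open the drawer"])

# A language-conditioned policy can exploit this structure at test time:
# a novel instruction lands near skills the agent already knows.
print(sim_related > sim_unrelated)
```

This is why language conditioning helps zero-shot transfer in the abstract's sense: the task representation itself encodes task similarity, rather than treating every task as an unrelated index.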