An Aspect Representation for Object Manipulation Based on Convolutional Neural Networks

Li Yang Ku1, Erik Learned-Miller2, Rod Grupen3

  • 1University of Massachusetts Amherst
  • 2University of Massachusetts Amherst
  • 3University of Massachusetts Amherst

Details

11:50 - 11:55 | Tue 30 May | Room 4411/4412 | TUB4.5

Session: Autonomous Agent

Abstract

We propose an intelligent visuomotor system that interacts with the environment and memorizes the consequences of actions. In previous work, we introduced the aspect transition graph (ATG), which represents how actions lead from one observation to another. In this work, we propose a novel aspect representation based on hierarchical convolutional neural network (CNN) features that supports manipulation and captures the essential affordances of an object from RGB-D images. In a traditional planning system, robots are given a pre-defined set of actions that take the robot from one symbolic state to another. However, symbolic states often lack the flexibility to generalize across similar situations. Our proposed representation is grounded in the robot's observations and lies in a continuous space, which allows the robot to handle similar but unseen situations. The hierarchical CNN features within this representation also allow the robot to act precisely with respect to the spatial location of individual features. We evaluate the robustness of this representation on the Washington RGB-D Objects Dataset and show that it achieves state-of-the-art results for instance pose estimation. We then test this representation in conjunction with an ATG on a drill grasping task on Robonaut-2. We show that, given grasp, drag, and turn demonstrations on the drill, the robot is capable of planning sequences of learned actions to compensate for reachability constraints.
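The core idea of hierarchical CNN features can be illustrated with a minimal sketch: a strong filter response in a high convolutional layer is traced back to a location in a finer, lower layer, giving a feature identity together with a precise spatial position. The sketch below is an assumption-laden illustration only, not the authors' implementation; it uses a standard pre-trained AlexNet from torchvision, arbitrary layer choices (conv1 and conv5), and a hypothetical helper hierarchical_features.

    # Minimal sketch (not the paper's method): pair strong high-layer
    # filters with spatial locations on a finer low-layer feature grid.
    import torch
    import torchvision.models as models
    import torchvision.transforms as T
    from PIL import Image

    model = models.alexnet(weights=models.AlexNet_Weights.DEFAULT).eval()

    # Capture activations from a low and a high convolutional layer.
    activations = {}
    def save_to(name):
        def hook(module, inputs, output):
            activations[name] = output.detach()
        return hook

    model.features[0].register_forward_hook(save_to("conv1"))   # low layer
    model.features[10].register_forward_hook(save_to("conv5"))  # high layer

    preprocess = T.Compose([
        T.Resize((224, 224)),
        T.ToTensor(),
        T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
    ])

    def hierarchical_features(image_path, top_k=5):
        """Return (high-layer filter id, low-layer grid location) pairs."""
        img = preprocess(Image.open(image_path).convert("RGB")).unsqueeze(0)
        with torch.no_grad():
            model(img)
        conv1, conv5 = activations["conv1"][0], activations["conv5"][0]

        # Pick the k conv5 filters with the strongest total response.
        strengths = conv5.sum(dim=(1, 2))
        top_filters = torch.topk(strengths, top_k).indices

        pairs = []
        for f in top_filters:
            # Location of this filter's peak response on conv5's coarse grid...
            idx = conv5[f].argmax()
            y5, x5 = divmod(idx.item(), conv5.shape[2])
            # ...rescaled to conv1's finer grid, standing in for a more
            # precise spatial position a controller could target.
            y1 = int(y5 * conv1.shape[1] / conv5.shape[1])
            x1 = int(x5 * conv1.shape[2] / conv5.shape[2])
            pairs.append((int(f), (y1, x1)))
        return pairs

In this toy version the high layer supplies feature identity (which filter fired) while the low layer supplies spatial precision, which is the division of labor the abstract attributes to hierarchical CNN features; the actual representation in the paper is built on RGB-D input and learned aspect transitions rather than this simple peak-tracing rule.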