A Self-Supervised Learning System for Object Detection Using Physics Simulation and Multi-View Pose Estimation

Chaitanya Mitash1, Kostas E. Bekris2, Abdeslam Boularias3

  • 1Amazon Robotics
  • 2Rutgers, the State University of New Jersey
  • 3Rutgers University

Details

10:45 - 11:00 | Mon 25 Sep | Room 217 | MoAT14.2

Session: Computer Vision for Automation I

Abstract

Impressive progress has been achieved in object detection with the use of deep learning. Nevertheless, such tools typically require a large amount of training data and significant manual effort for labeling objects. This limits their applicability in robotics, where solutions must scale to a large number of objects and a variety of conditions. The present work proposes a fully autonomous process to train a Convolutional Neural Network (CNN) for object detection and pose estimation in robotic setups. The target application is the detection of objects placed in clutter and in tight spaces, such as shelves. In particular, given access to 3D object models, several aspects of the environment are simulated and the models are placed in physically realistic poses with respect to their environment to generate a labeled synthetic dataset. To further improve object detection, the network self-trains over real images that are labeled using a robust multi-view pose estimation process. The proposed training process is evaluated on several existing datasets and on a dataset collected with a Motoman robotic manipulator. Results show that the proposed process outperforms popular training pipelines relying on synthetic data generation alone or on manual annotation.
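The multi-view labeling idea can be illustrated with a small sketch: once an object's 3D pose has been estimated and verified across views, projecting the object's 3D extent into each calibrated camera yields a 2D bounding-box label for that view at no manual cost. The sketch below is illustrative only and is not the paper's implementation; the function names, the pinhole camera parameters, and the unit-cube object are all assumptions introduced for this example.

```python
import numpy as np

def project_points(points_w, R, t, K):
    """Project Nx3 world points into an image using extrinsics (R, t)
    and a 3x3 pinhole intrinsics matrix K; returns Nx2 pixel coords."""
    cam = R @ points_w.T + t.reshape(3, 1)   # 3xN points in the camera frame
    uv = K @ cam                             # homogeneous pixel coordinates
    return (uv[:2] / uv[2]).T                # perspective divide

def bbox_label(corners_w, R, t, K):
    """Axis-aligned 2D bounding box of the projected 3D object corners:
    one automatically generated detection label for this view."""
    uv = project_points(corners_w, R, t, K)
    x0, y0 = uv.min(axis=0)
    x1, y1 = uv.max(axis=0)
    return (x0, y0, x1, y1)

# Hypothetical example: a 0.2 m cube placed 2 m in front of the camera.
half = 0.1
corners = np.array([[sx * half, sy * half, 2.0 + sz * half]
                    for sx in (-1, 1) for sy in (-1, 1) for sz in (-1, 1)])
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
box = bbox_label(corners, np.eye(3), np.zeros(3), K)
```

In the actual system one such projection would be computed per camera view of the scene, so a single verified 6D pose produces a consistent set of labels across all views.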