High-Dimensional Motion Segmentation by Variational Autoencoder and Gaussian Processes

Masatoshi Nagano1, Tomoaki Nakamura2, Takayuki Nagai3, Daichi Mochihashi4, Ichiro Kobayashi5, Wataru Takano3

  • 1The University of Electro-Communications
  • 2The University of Electro-Communications
  • 3Osaka University
  • 4Institute of Statistical Mathematics
  • 5Ochanomizu University

Details

11:30 - 11:45 | Tue 5 Nov | L1-R3 | TuAT3.3

Session: Learning and Adaptive Systems I

Abstract

Humans perceive continuous high-dimensional information by dividing it into meaningful segments such as words and units of motion. We believe that such unsupervised segmentation is also important for robots, enabling them to learn language and motion. To this end, we previously proposed the hierarchical Dirichlet process--Gaussian process--hidden semi-Markov model (HDP-GP-HSMM). However, an important drawback of this model is that it cannot segment high-dimensional time-series data; low-dimensional features must be extracted in advance. Segmentation largely depends on the design of these features, and designing effective features is difficult, especially for high-dimensional data. To overcome this problem, this paper proposes a hierarchical Dirichlet process--variational autoencoder--Gaussian process--hidden semi-Markov model (HVGH). The parameters of HVGH are estimated through a mutual learning loop between the variational autoencoder and our previously proposed HDP-GP-HSMM. Hence, HVGH can extract features from high-dimensional time-series data while simultaneously segmenting them in an unsupervised manner. In experiments using various motion-capture data, we show that the proposed model estimates the correct number of classes and produces more accurate segments than baseline methods. Moreover, we show that the proposed method learns a latent space suitable for segmentation.
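The mutual learning loop described above alternates between learning a low-dimensional representation and segmenting the resulting latent trajectory. The following is a minimal sketch of that alternation, not the paper's method: a tied-weight linear autoencoder stands in for the VAE, and a simple change-point heuristic stands in for the HDP-GP-HSMM; all names, thresholds, and the toy data are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy high-dimensional time series: a 1-D motion signal whose dynamics
# switch every 50 frames, lifted into 50 observed dimensions.
T, D, latent_dim = 200, 50, 2
t = np.arange(T)
z_true = np.where((t // 50) % 2 == 0, np.sin(0.3 * t), np.cos(0.1 * t))
lift = rng.normal(size=(1, D))                  # random linear "embedding"
X = z_true[:, None] * lift + 0.01 * rng.normal(size=(T, D))

# Stand-in for the VAE: a tied-weight linear autoencoder trained by
# gradient descent on the reconstruction loss 0.5 * ||X W W^T - X||^2.
def train_autoencoder(X, W, lr=1e-3, epochs=200):
    for _ in range(epochs):
        err = X @ W @ W.T - X                   # reconstruction error
        W -= lr * (X.T @ err @ W + err.T @ X @ W) / len(X)
    return W

# Stand-in for the HDP-GP-HSMM: cut a segment wherever the latent
# trajectory jumps by more than an adaptive threshold.
def segment(Z):
    diffs = np.linalg.norm(np.diff(Z, axis=0), axis=1)
    return np.where(diffs > diffs.mean() + 2 * diffs.std())[0] + 1

# Mutual learning loop: alternately refine the features, then re-segment
# the latent trajectory they produce.
W = rng.normal(scale=0.01, size=(D, latent_dim))
for _ in range(3):
    W = train_autoencoder(X, W)
    Z = X @ W                                   # latent features
    boundaries = segment(Z)
```

In HVGH itself, the VAE additionally conditions on the current segmentation so that the latent space is shaped to be easy to segment; this sketch only captures the alternating structure of the estimation, not that coupling.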