Deep Learning Based Vehicle Position and Orientation Estimation Via Inverse Perspective Mapping Image

Youngseok Kim1, Dongsuk Kum2

  • 1Korea Advanced Institute of Science and Technology
  • 2KAIST

Details

10:08 - 10:19 | Mon 10 Jun | Berlioz Auditorium | MoAM2_Oral.4

Session: Vision Sensing and Perception

10:08 - 10:19 | Mon 10 Jun | Room 4 | MoAM2_Oral.4

Session: Poster 1: (Orals) AV + Vision

Abstract

In this paper, we present a method for estimating the position, size, and orientation of vehicles from a single monocular image. The proposed method makes use of inverse perspective mapping (IPM) to effectively estimate distance from the image. It consists of two stages: 1) cancel the pitch and roll motion of the camera using an inertial measurement unit (IMU) and project the corrected front-view image onto a bird’s-eye view using inverse perspective mapping; 2) detect the position, size, and orientation of the vehicle using a convolutional neural network. The camera motion cancellation keeps the vanishing point at the same location regardless of changes in the ego vehicle’s attitude. As a result, the projected bird’s-eye-view image is parallel and linear to the x-y plane of the world coordinate system. The convolutional neural network predicts not only the position and size but also the orientation of the vehicle for 3D localization. The predicted oriented bounding box in the bird’s-eye-view image is converted into metric units by the inverse projection matrix. The proposed method was evaluated on the KITTI raw dataset using root mean square error, mean absolute percentage error, and average precision as metrics. Despite its conceptually simple architecture, the proposed method achieves better performance than other image-based approaches. A video demonstration is available online [1].
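The core geometry behind the IPM stage can be illustrated with a minimal sketch: a pixel’s viewing ray is intersected with the road plane to recover metric ground coordinates. This is an illustrative reconstruction, not the paper’s implementation; it assumes a pinhole camera whose pitch and roll have already been cancelled (optical axis level, axes x right, y down, z forward), a known intrinsic matrix `K`, and a known camera mounting height.

```python
import numpy as np

def ipm_pixel_to_ground(u, v, K, cam_height):
    """Map an image pixel (u, v) to ground-plane coordinates by
    intersecting its viewing ray with the road plane y = cam_height
    (the ground lies cam_height metres below a levelled camera).
    K is the 3x3 camera intrinsic matrix; returns (lateral, longitudinal)
    distances in metres."""
    # Back-project the pixel into a viewing ray in camera coordinates.
    d = np.linalg.inv(K) @ np.array([u, v, 1.0])
    if d[1] <= 0:
        # Rays at or above the horizon never reach the ground plane.
        raise ValueError("pixel is at or above the horizon")
    # Scale the ray so it reaches the ground plane y = cam_height.
    t = cam_height / d[1]
    p = t * d
    return p[0], p[2]  # lateral offset (m), longitudinal distance (m)
```

For example, with focal length 700 px, principal point (640, 360), and a camera 1.5 m above the road, a pixel 140 rows below the principal point maps to a point 7.5 m ahead of the camera. Applying this mapping over a grid of ground points (or, equivalently, warping with the induced homography) yields the bird’s-eye-view image on which the detector operates.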