Policy Gradient based Reinforcement Learning Approach for Autonomous Highway Driving

Szilard Aradi1, Tamás Bécsi, Peter Gaspar

  • 1Budapest University of Technology and Economics

Details

11:40 - 12:00 | Thu 23 Aug | Fredensborg | ThA3.6

Session: Autonomous Systems

Abstract

The paper presents the application of the policy gradient reinforcement learning method to vehicle control. The aim of the research is to design end-to-end behavior control for a kinematic vehicle model placed in a simulated highway environment, using a reinforcement learning approach. The environment model for the surrounding traffic uses microscopic simulation to present the agent with varied situations. The environment sensing model is based on high-level sensor information, e.g. odometry, lane position, and surrounding vehicle states, which can be obtained from current automotive sensors such as camera- and radar-based systems. The objectives were reached by defining a reward system with subrewards and penalties that enforce the desired speed, lane keeping, keeping right, and collision avoidance. After the theoretical basis is described, the environment model and its reward function are detailed. Finally, the experiments with the learning process are presented, and the evaluation results are given with respect to selected aspects of control quality and safety.
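As an illustration of how a reward built from subrewards and penalties might be composed, the following is a minimal sketch only. The abstract does not give the actual reward function, so the state fields, the weights, the linear shaping of each term, and the collision penalty value are all assumptions introduced here and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class HighwayState:
    """Hypothetical high-level observation of the ego vehicle (assumed fields)."""
    speed: float           # ego speed in m/s
    lateral_offset: float  # signed distance from the lane center in m
    lane_index: int        # 0 is assumed to be the rightmost lane
    collided: bool         # True if a collision occurred in this step


def highway_reward(state, desired_speed=36.0, lane_half_width=1.75,
                   weights=(0.5, 0.3, 0.2), collision_penalty=-10.0):
    """Combine assumed subrewards for speed, lane keeping, and keeping right.

    All default values are illustrative assumptions, not values from the paper.
    """
    # A collision overrides every other term with a fixed penalty.
    if state.collided:
        return collision_penalty

    w_speed, w_lane, w_right = weights

    # 1.0 at the desired speed, falling linearly to 0.0 as the error grows.
    speed_term = max(0.0, 1.0 - abs(state.speed - desired_speed) / desired_speed)

    # 1.0 at the lane center, falling linearly to 0.0 at the lane edge.
    lane_term = max(0.0, 1.0 - abs(state.lateral_offset) / lane_half_width)

    # Full bonus only when driving in the rightmost lane.
    right_term = 1.0 if state.lane_index == 0 else 0.0

    return w_speed * speed_term + w_lane * lane_term + w_right * right_term
```

Under these assumptions, a policy gradient agent would receive this scalar at every simulation step. Keeping each subreward in the range 0 to 1 before weighting makes the relative importance of the objectives easy to read off the weights.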