Zhaojian Li1, Tianshu Chu2, Uros V. Kalabic3
16:30 - 16:50 | Tue 20 Aug | Lau, 5-203 | TuC1.4
Reinforcement learning (RL) is in essence a trial-and-error process that involves exploratory actions. These explorations can lead to system constraint violations and physical damage, impeding RL’s use in many real-world engineered systems. In this paper, we develop a safe RL framework that integrates model-free learning with model-based safety supervision to bridge this gap. We exploit the underlying system dynamics and safety-related constraints to construct a safety set using recursive feasibility techniques. We then integrate the safety set into RL’s exploration to guarantee safety while preserving exploration efficiency through hit-and-run sampling. We design a novel efforts-to-remain-safe penalty to effectively guide RL to learn the system constraints. We apply the proposed safe RL framework to an active suspension system in which actuation and state constraints arise from ride comfort, road handling, and actuation limits. We show that the developed safe RL framework is able to learn a safe control policy without violating constraints during training, while outperforming a nominal controller.
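
The abstract does not spell out how exploratory actions are drawn from the safety set. As a minimal sketch of the general hit-and-run idea, the Python below assumes (purely for illustration) that the set of safe actions at the current state is a bounded polytope {u : A u <= b} and that a strictly interior starting point is known. The names `hit_and_run_step` and `sample_safe_actions`, and the box-shaped example set, are hypothetical and are not taken from the paper.

```python
import numpy as np


def hit_and_run_step(A, b, u, rng):
    """One hit-and-run move inside the bounded polytope {u : A u <= b}.

    Assumes u is feasible and the polytope is bounded (illustrative assumption).
    """
    # Draw a direction uniformly on the unit sphere.
    d = rng.standard_normal(u.shape[0])
    d /= np.linalg.norm(d)

    # Along u + t*d, constraint i reads: (A u)_i + t * (A d)_i <= b_i.
    slack = b - A @ u      # non-negative for a feasible point
    rate = A @ d

    pos = rate > 1e-12     # constraints that bound t from above
    neg = rate < -1e-12    # constraints that bound t from below
    t_max = np.min(slack[pos] / rate[pos])
    t_min = np.max(slack[neg] / rate[neg])

    # Pick a point uniformly on the feasible chord.
    t = rng.uniform(t_min, t_max)
    return u + t * d


def sample_safe_actions(A, b, u0, n_samples, seed=0):
    """Run a hit-and-run chain from u0 and return the visited points."""
    rng = np.random.default_rng(seed)
    u = np.asarray(u0, dtype=float)
    samples = np.empty((n_samples, u.shape[0]))
    for k in range(n_samples):
        u = hit_and_run_step(A, b, u, rng)
        samples[k] = u
    return samples


# Hypothetical safe set: the box -1 <= u1 <= 1, -0.5 <= u2 <= 0.5.
A_box = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
b_box = np.array([1.0, 1.0, 0.5, 0.5])
actions = sample_safe_actions(A_box, b_box, u0=[0.0, 0.0], n_samples=200)
```

Every point in the chain stays inside the set by construction, since each move is restricted to the feasible chord through the current point. In a safe RL loop, such samples could serve as candidate exploratory actions; how the paper combines them with the learned policy is not stated in the abstract.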