Ryan Self, Michael Harlan, Rushikesh Kamalapurkar
15:30 - 15:50 | Mon 19 Aug | Lau, 6-211 | MoC4.1
This paper develops an online inverse reinforcement learning (IRL) technique for a class of nonlinear systems. The approach uses observed state and input trajectories to estimate the unknown reward function and the unknown value function online. A parameter estimation technique is used to facilitate estimation of the reward function in the presence of unknown dynamics. Theoretical guarantees for convergence of the reward function estimates are established, and simulation results are presented to demonstrate the performance of the developed technique.
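To make the idea of recovering a reward function from observed state and input trajectories concrete, the following is a minimal sketch on a hypothetical toy problem. It is not the algorithm developed in the paper. It assumes a scalar linear system x' = a*x + b*u with known dynamics (the paper treats unknown dynamics through a separate parameter estimator, which is omitted here), a reward of the form q*x**2 + u**2 with the input weight fixed to 1 to remove the scale ambiguity, and a value function p*x**2. All names and numerical values (`A_TRUE`, `B_TRUE`, `Q_TRUE`, `expert_gain`, `expert_trajectory`, `estimate_reward`) are illustrative assumptions. The unknowns p and q enter the Bellman residual and the optimality condition linearly, so they can be fitted by batch least squares.

```python
import math

# Hypothetical toy problem (an assumption, not taken from the paper):
# dynamics x' = a*x + b*u, reward r(x, u) = q*x**2 + u**2, value V(x) = p*x**2.
A_TRUE, B_TRUE, Q_TRUE = -1.0, 1.0, 2.0


def expert_gain(a, b, q):
    """Positive root p of the scalar Riccati equation 2*a*p - (b*p)**2 + q = 0."""
    return (a + math.sqrt(a * a + b * b * q)) / (b * b)


def expert_trajectory(a, b, q, x0=2.0, dt=0.1, steps=30):
    """Noise-free (state, input) samples generated by the optimal feedback u = -b*p*x."""
    p = expert_gain(a, b, q)
    closed_loop = a - b * b * p
    samples = []
    for k in range(steps):
        x = x0 * math.exp(closed_loop * k * dt)  # exact closed-loop solution
        samples.append((x, -b * p * x))
    return samples


def estimate_reward(samples, a, b):
    """Least-squares estimate of (p, q) from observed (x, u) pairs.

    Each sample contributes two equations that are linear in w = (p, q):
      Bellman residual:      2*p*x*(a*x + b*u) + q*x**2 = -u**2
      optimality condition:  2*b*p*x                    = -2*u
    """
    # Accumulate the 2x2 normal equations (M^T M) w = M^T y.
    s11 = s12 = s22 = t1 = t2 = 0.0
    for x, u in samples:
        rows = (
            (2.0 * x * (a * x + b * u), x * x, -u * u),
            (2.0 * b * x, 0.0, -2.0 * u),
        )
        for m1, m2, y in rows:
            s11 += m1 * m1
            s12 += m1 * m2
            s22 += m2 * m2
            t1 += m1 * y
            t2 += m2 * y
    det = s11 * s22 - s12 * s12
    if abs(det) < 1e-12:
        raise ValueError("data are not informative enough to identify (p, q)")
    p_hat = (s22 * t1 - s12 * t2) / det
    q_hat = (s11 * t2 - s12 * t1) / det
    return p_hat, q_hat


samples = expert_trajectory(A_TRUE, B_TRUE, Q_TRUE)
p_hat, q_hat = estimate_reward(samples, A_TRUE, B_TRUE)
print(f"estimated value weight p = {p_hat:.6f}, reward weight q = {q_hat:.6f}")
```

In this noise-free setting the fit should recover the state weight used to generate the data. The Bellman residual alone would not be enough here, because along a single feedback trajectory its two regressors are proportional to each other; stacking the optimality condition is what makes the pair (p, q) identifiable in this sketch.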