Maximum Entropy Differential Dynamic Programming

Oswin So¹, Ziyi Wang², Evangelos Theodorou²

¹Massachusetts Institute of Technology
²Georgia Institute of Technology

Details

15:50 - 15:55 | Tue 24 May | Room 123 | TuB17.05

Session: Optimization and Optimal Control II

Abstract

In this paper, we present a novel maximum entropy formulation of the Differential Dynamic Programming algorithm and derive two variants using unimodal and multimodal value function parameterizations. By combining the maximum entropy Bellman equations with a particular approximation of the cost function, we obtain a new formulation of Differential Dynamic Programming that can escape local minima via exploration with a multimodal policy. To demonstrate the efficacy of the proposed algorithm, we provide experimental results on four systems, using tasks whose cost functions have multiple local minima, and compare the results against vanilla Differential Dynamic Programming. Furthermore, we discuss connections with previous work on the linearly solvable stochastic control framework and its extensions in relation to compositionality.
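
As a rough illustration of the maximum entropy Bellman backup mentioned in the abstract, the sketch below contrasts the ordinary minimum over a small set of candidate controls with its entropy-regularized counterpart, a "soft minimum" computed with a log-sum-exp, together with the Boltzmann policy it induces. This is the generic textbook form for a discrete set of controls, not the paper's algorithm. The function names `soft_min` and `boltzmann_policy`, the temperature `alpha`, and the example cost-to-go values are all assumptions made for illustration.

```python
import math


def soft_min(q_values, alpha):
    """Entropy-regularized minimum: -alpha * log(sum(exp(-q / alpha))).

    q_values: cost-to-go of each candidate control (hypothetical numbers).
    alpha:    temperature; as alpha -> 0 this approaches min(q_values).
    """
    q_min = min(q_values)  # shift by the minimum for numerical stability
    total = sum(math.exp(-(q - q_min) / alpha) for q in q_values)
    return q_min - alpha * math.log(total)


def boltzmann_policy(q_values, alpha):
    """Probabilities proportional to exp(-q / alpha) over the candidates."""
    q_min = min(q_values)
    weights = [math.exp(-(q - q_min) / alpha) for q in q_values]
    normalizer = sum(weights)
    return [w / normalizer for w in weights]


# Two nearly equal low-cost controls and one clearly worse control.
q = [1.0, 1.2, 5.0]

hard_value = min(q)                    # standard Bellman backup
soft_value = soft_min(q, alpha=0.5)    # maximum entropy backup
policy = boltzmann_policy(q, alpha=0.5)

print("hard min :", hard_value)
print("soft min :", soft_value)
print("policy   :", policy)
```

With a larger `alpha`, the policy spreads probability across both low-cost controls instead of committing to one, which is the kind of exploratory behavior the abstract attributes to a multimodal policy. As `alpha` shrinks, the soft minimum approaches the ordinary minimum and the policy concentrates on the single best control.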