Reinforcement Learning for Stabilizing an Inverted Pendulum Naturally Leads to Intermittent Feedback Control as in Human Quiet Standing

Details

08:45 - 09:00 | Wed 17 Aug | Fantasia B | WeAT2.4

Session: Posture and Balance

Abstract

Intermittent feedback control for stabilizing human upright stance is a promising alternative to the standard time-continuous stiffness control. Here we show that such an intermittent controller can emerge naturally through reinforcement learning. To this end, we used a single inverted pendulum model of the upright posture and a very simple reward function that imposes a penalty whenever the inverted pendulum falls or changes its position in the state space. We found that the acquired feedback controller exhibits hallmarks of the intermittent feedback control strategy: the feedback action is switched off intermittently whenever the state of the pendulum lies near the stable manifold of the saddle-type upright equilibrium of the uncontrolled inverted pendulum. Switching off in this region exploits the transiently converging dynamics toward the unstable upright position without the help of active feedback control. We then speculate about a possible physiological mechanism for such reinforcement learning and suggest that it may be related to neural activity in the pedunculopontine tegmental nucleus (PPN) of the brainstem. This hypothesis is supported by recent evidence indicating that the PPN may play critical roles in the generation and regulation of postural tonus, in reward prediction, and in postural instability in patients with Parkinson's disease.
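The abstract does not specify the learning algorithm or the model parameters, so the following is only a minimal illustrative sketch of the kind of setup described: tabular Q-learning on a discretized single inverted pendulum, with a small set of ankle torques that includes a zero-torque ("control off") action, and a reward that penalizes falls and displacement in the state space. All numerical values (mass, torque levels, penalty sizes, grid resolution) and function names are assumptions for illustration, not the authors' implementation.

import numpy as np

# --- Hypothetical parameters (illustrative only, not the paper's values) ---
M, G, L = 60.0, 9.81, 1.0        # body mass (kg), gravity (m/s^2), CoM height (m)
I = M * L**2                      # moment of inertia about the ankle (kg*m^2)
DT = 0.01                         # integration step (s)
FALL_ANGLE = np.deg2rad(10.0)     # |theta| beyond this counts as a fall

ACTIONS = np.array([-300.0, 0.0, 300.0])   # ankle torques (N*m); 0.0 = control "off"

N_TH, N_OM = 21, 21               # grid resolution for tabular Q-learning
TH_BINS = np.linspace(-FALL_ANGLE, FALL_ANGLE, N_TH)
OM_BINS = np.linspace(-1.0, 1.0, N_OM)

def discretize(theta, omega):
    """Map a continuous state (angle, angular velocity) to a grid cell."""
    i = np.clip(np.digitize(theta, TH_BINS), 0, N_TH - 1)
    j = np.clip(np.digitize(omega, OM_BINS), 0, N_OM - 1)
    return i, j

def step(theta, omega, torque):
    """One Euler step of the single inverted pendulum pivoting at the ankle."""
    alpha = (M * G * L * np.sin(theta) - torque) / I
    omega_new = omega + alpha * DT
    theta_new = theta + omega_new * DT
    return theta_new, omega_new

def reward(theta, omega, theta_new, omega_new, fallen):
    """Punish falls heavily; otherwise penalize displacement in state space."""
    if fallen:
        return -100.0                               # assumed fall penalty
    return -np.hypot(theta_new - theta, omega_new - omega)

def train(episodes=2000, lr=0.1, gamma=0.99, eps=0.1, seed=0):
    """Epsilon-greedy tabular Q-learning over the discretized pendulum."""
    rng = np.random.default_rng(seed)
    Q = np.zeros((N_TH, N_OM, len(ACTIONS)))
    for _ in range(episodes):
        theta = rng.uniform(-0.02, 0.02)            # start near upright
        omega = rng.uniform(-0.05, 0.05)
        s = discretize(theta, omega)
        for _ in range(1000):                       # at most 10 s per episode
            a = rng.integers(len(ACTIONS)) if rng.random() < eps \
                else int(np.argmax(Q[s]))
            theta_new, omega_new = step(theta, omega, ACTIONS[a])
            fallen = abs(theta_new) > FALL_ANGLE
            r = reward(theta, omega, theta_new, omega_new, fallen)
            s_new = discretize(theta_new, omega_new)
            target = r if fallen else r + gamma * np.max(Q[s_new])
            Q[s][a] += lr * (target - Q[s][a])
            if fallen:
                break
            theta, omega, s = theta_new, omega_new, s_new
    return Q

if __name__ == "__main__":
    Q = train()
    # Where does the greedy policy choose the zero-torque ("off") action?
    off = np.argmax(Q, axis=2) == 1
    print(f"'off' action selected in {off.mean():.1%} of state-space cells")

After training, inspecting where the greedy policy selects the zero-torque action gives a rough picture of whether the learned controller switches off over part of the state space; whether that region concentrates near the stable manifold of the saddle, as reported in the abstract, will depend on the actual model, reward scaling, and learning algorithm used by the authors.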