Xiang Shen1, Xiang Zhang2, Yifan Huang3, Shuhang Chen1, Yiwen Wang1
16:30 - 18:30 | Thu 21 Mar | Grand Ballroom A | ThPO.61
16:30 - 18:30 | Thu 21 Mar | Grand Ballroom B | ThPO.61
Reinforcement learning (RL) schemes interpret movement intentions in brain-machine interfaces (BMIs) using a reward signal. This reward can be an external reward (food or water) or an internal representation of the reward, which mimics the learning procedure by which subjects link the correct movement to the external reward. The medial prefrontal cortex (mPFC) has been shown to be closely related to reward-guided learning. In this paper, we propose to model mPFC activity associated with different actions as an internal representation of the reward in the RL framework. A support vector machine (SVM) is adopted to distinguish rewarded from unrewarded trials based on mPFC signals, considering different action spaces. The discrimination result is then used to train a decoder. Here we introduce attention-gated reinforcement learning (AGREL) as the decoder to generate a mapping between motor cortex (M1) activity and action states. To evaluate our approach, we test it on real neurophysiological data collected from rats performing a two-lever discrimination task. AGREL using the internal reward evaluation achieves a prediction accuracy of 94.8%, which is very close to that of AGREL using the explicit external reward. This indicates the feasibility of modeling mPFC activity as an internal representation of the reward in the RL framework.
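The idea of swapping the external reward for a classifier-derived internal reward can be illustrated with a small sketch. This is not the authors' implementation: the data are synthetic, the decoder is a single-layer softmax policy with a reward-gated update on the selected action only (a simplification in the spirit of AGREL, without its hidden layer or feedback weights), and the mPFC classifier is replaced by a hypothetical stand-in, `make_internal_reward`, that reports the true trial outcome with an assumed accuracy. All function names and parameter values are assumptions chosen for illustration.

```python
import math
import random


def softmax(scores):
    """Convert raw action scores into selection probabilities."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def synthetic_trial(rng, n_channels=3):
    """Synthetic stand-in for one trial: (M1-like features, correct lever)."""
    lever = rng.randrange(2)
    mean = 1.0 if lever == 0 else -1.0
    # The leading 1.0 is a bias term; the rest are noisy channel activations.
    features = [1.0] + [rng.gauss(mean, 0.5) for _ in range(n_channels)]
    return features, lever


def external_reward(was_correct):
    """Explicit reward: 1 when the chosen lever was correct, else 0."""
    return 1.0 if was_correct else 0.0


def make_internal_reward(accuracy, rng):
    """Hypothetical stand-in for the mPFC classifier.

    Reports the true trial outcome with probability `accuracy`
    and the opposite outcome otherwise.
    """
    def internal_reward(was_correct):
        reported = was_correct if rng.random() < accuracy else not was_correct
        return 1.0 if reported else 0.0
    return internal_reward


def action_scores(weights, features):
    """Linear score of each action for the given features."""
    return [sum(w * x for w, x in zip(row, features)) for row in weights]


def train_decoder(reward_fn, rng, n_trials=2000, lr=0.05, n_channels=3):
    """Train a two-action decoder from trial-by-trial reward feedback."""
    weights = [[0.0] * (n_channels + 1) for _ in range(2)]
    for _ in range(n_trials):
        features, lever = synthetic_trial(rng, n_channels)
        probs = softmax(action_scores(weights, features))
        # Stochastic action selection: exactly one action wins per trial.
        action = rng.choices([0, 1], weights=probs)[0]
        reward = reward_fn(action == lever)
        # Reward prediction error for the selected action.
        delta = reward - probs[action]
        # Only the weights of the selected action are updated.
        for i, x in enumerate(features):
            weights[action][i] += lr * delta * x
    return weights


def greedy_accuracy(weights, rng, n_trials=500, n_channels=3):
    """Fraction of fresh synthetic trials decoded correctly (greedy choice)."""
    hits = 0
    for _ in range(n_trials):
        features, lever = synthetic_trial(rng, n_channels)
        scores = action_scores(weights, features)
        hits += int(scores.index(max(scores)) == lever)
    return hits / n_trials


if __name__ == "__main__":
    ext = train_decoder(external_reward, random.Random(0))
    rng = random.Random(2)
    internal = train_decoder(make_internal_reward(0.9, rng), rng)
    print("external reward:", greedy_accuracy(ext, random.Random(1)))
    print("internal reward:", greedy_accuracy(internal, random.Random(3)))
```

Comparing the two printed accuracies on this toy problem shows how a decoder trained on an imperfect internal reward estimate can be set against one trained on the explicit reward; the numbers it produces say nothing about the rat data reported above.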