TY - GEN
T1 - Mix-up Consistent Cross Representations for Data-Efficient Reinforcement Learning
AU - Liu, Shiyu
AU - Cao, Guitao
AU - Liu, Yong
AU - Li, Yan
AU - Wu, Chunwei
AU - Xi, Xidong
N1 - Publisher Copyright:
© 2022 IEEE.
PY - 2022
Y1 - 2022
N2 - Deep reinforcement learning (RL) has achieved remarkable performance in sequential decision-making problems. However, it is challenging for deep RL methods to extract task-relevant semantic information when interacting with the environment using limited data. In this paper, we propose Mix-up Consistent Cross Representations (MCCR), a novel self-supervised auxiliary task that aims to improve data efficiency and encourage representation prediction. Specifically, we compute a contrastive loss between low-dimensional and high-dimensional representations of different state observations to increase the mutual information between states, thereby improving data efficiency. Furthermore, we employ a mix-up strategy to generate intermediate samples, increasing data diversity and the smoothness of representation prediction at nearby timesteps. Experimental results show that MCCR achieves competitive results compared with state-of-the-art approaches on complex control tasks in the DeepMind Control Suite, notably improving the ability of pretrained encoders to generalize to unseen tasks.
AB - Deep reinforcement learning (RL) has achieved remarkable performance in sequential decision-making problems. However, it is challenging for deep RL methods to extract task-relevant semantic information when interacting with the environment using limited data. In this paper, we propose Mix-up Consistent Cross Representations (MCCR), a novel self-supervised auxiliary task that aims to improve data efficiency and encourage representation prediction. Specifically, we compute a contrastive loss between low-dimensional and high-dimensional representations of different state observations to increase the mutual information between states, thereby improving data efficiency. Furthermore, we employ a mix-up strategy to generate intermediate samples, increasing data diversity and the smoothness of representation prediction at nearby timesteps. Experimental results show that MCCR achieves competitive results compared with state-of-the-art approaches on complex control tasks in the DeepMind Control Suite, notably improving the ability of pretrained encoders to generalize to unseen tasks.
KW - mutual information
KW - reinforcement learning
KW - self-supervised learning
KW - smoothness
UR - https://www.scopus.com/pages/publications/85140711365
U2 - 10.1109/IJCNN55064.2022.9892416
DO - 10.1109/IJCNN55064.2022.9892416
M3 - Conference contribution
AN - SCOPUS:85140711365
T3 - Proceedings of the International Joint Conference on Neural Networks
BT - 2022 International Joint Conference on Neural Networks, IJCNN 2022 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2022 International Joint Conference on Neural Networks, IJCNN 2022
Y2 - 18 July 2022 through 23 July 2022
ER -