Causal deconfounding deep reinforcement learning for mobile robot motion planning

Wenbing Tang, Fenghua Wu, Shang-Wei Lin, Zuohua Ding, Jing Liu*, Yang Liu, Jifeng He

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

7 Scopus citations

Abstract

Deep reinforcement learning (DRL) has emerged as an efficient approach for motion planning in mobile robot systems, leveraging offline training to enhance real-time computational efficiency. In DRL-based methods, the models are trained to compute an action based on the current state of the robot and the surrounding obstacles. However, the trained models may capture spurious correlations through potential confounders, resulting in non-robust state representations, which limits the models' robustness and generalizability. In this paper, we propose a Causal Deconfounding DRL method for Motion Planning, CD-DRL-MP, to address spurious correlations and learn robust and generalizable policies. Specifically, we formalize the temporal causal relationships between states and actions using a structural causal model. We then extract the minimal sufficient state representation set by blocking the backdoor paths in the causal model. Finally, using the representation set, CD-DRL-MP learns the causal effect between states and actions while mitigating the detrimental influence of potential confounders, and computes motion commands for mobile robots. Comprehensive experiments show that the proposed method significantly outperforms non-causal DRL methods and existing causal methods while exhibiting strong robustness and generalizability.
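The backdoor-blocking step described in the abstract can be illustrated with a toy discrete model. This is a minimal sketch of the general backdoor-adjustment idea, not the paper's actual algorithm: all variables, distributions, and function names below are hypothetical. A confounder `u` influences both an observed state feature `s` and the action `a`, so the naive observational distribution P(a | s) mixes in the spurious path a ← u → s, while the backdoor adjustment P(a | do(s)) = Σ_u P(a | s, u) P(u) blocks it.

```python
# Toy structural causal model: u -> s and u -> a (a confounded pair).
# In this example the action depends only on the confounder, so any
# apparent dependence of a on s is spurious. All numbers are illustrative.

P_u = {0: 0.5, 1: 0.5}                      # confounder prior P(u)
P_s_given_u = {0: {0: 0.9, 1: 0.1},         # P(s | u)
               1: {0: 0.2, 1: 0.8}}
P_a_given_su = {(0, 0): 0.3, (0, 1): 0.7,   # P(a = 1 | s, u); note the
                (1, 0): 0.3, (1, 1): 0.7}   # value depends only on u

def p_a_given_s(s):
    """Naive observational P(a = 1 | s): contaminated by the confounder."""
    num = sum(P_a_given_su[(s, u)] * P_s_given_u[u][s] * P_u[u] for u in P_u)
    den = sum(P_s_given_u[u][s] * P_u[u] for u in P_u)
    return num / den

def p_a_do_s(s):
    """Backdoor-adjusted P(a = 1 | do(s)) = sum_u P(a | s, u) P(u)."""
    return sum(P_a_given_su[(s, u)] * P_u[u] for u in P_u)

# Observationally, a appears to depend on s; after adjustment it does not.
print(p_a_given_s(0), p_a_given_s(1))   # differ: spurious correlation
print(p_a_do_s(0), p_a_do_s(1))         # both 0.5: no causal effect of s
```

The adjusted quantities are identical for both interventions, correctly revealing that `s` has no causal effect on `a` here, whereas the observational conditionals differ. A deconfounded policy trained on such adjusted (or backdoor-blocked) representations would ignore the spurious feature, which is the intuition behind extracting a minimal sufficient state representation.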

Original language: English
Article number: 112406
Journal: Knowledge-Based Systems
Volume: 303
DOIs
State: Published - 4 Nov 2024

Keywords

  • Backdoor paths
  • Causal inference
  • Deep reinforcement learning
  • Mobile robots
  • Motion planning

