TY - GEN
T1 - ENHANCED DEEP REINFORCEMENT LEARNING FOR PARCEL SINGULATION IN NON-STATIONARY ENVIRONMENTS
AU - Shen, Jiwei
AU - Lu, Hu
AU - Zhang, Hao
AU - Lyu, Shujing
AU - Lu, Yue
N1 - Publisher Copyright:
© 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - In the rapidly expanding logistics sector, parcel singulation has emerged as a significant bottleneck. To address this, we propose an automated parcel singulator utilizing a sparse actuator array, which presents an optimal balance between cost and efficiency, albeit requiring a sophisticated control policy. In this study, we frame the parcel singulation issue as a Markov Decision Process with a variable state space dimension, addressed through a deep reinforcement learning (RL) algorithm complemented by a State Space Standardization Module (S3). Distinct from previous RL approaches, our methodology initially considers the non-stationary environment during the problem modeling phase. To counter this challenge, the S3 module standardizes the dynamic input state, thereby stabilizing the RL training process. We validate our method through simulation experiments in complex environments, comparing it with several baseline algorithms. Results indicate that our algorithm excels in parcel singulation tasks, achieving a higher success rate and enhanced efficiency.
AB - In the rapidly expanding logistics sector, parcel singulation has emerged as a significant bottleneck. To address this, we propose an automated parcel singulator utilizing a sparse actuator array, which presents an optimal balance between cost and efficiency, albeit requiring a sophisticated control policy. In this study, we frame the parcel singulation issue as a Markov Decision Process with a variable state space dimension, addressed through a deep reinforcement learning (RL) algorithm complemented by a State Space Standardization Module (S3). Distinct from previous RL approaches, our methodology initially considers the non-stationary environment during the problem modeling phase. To counter this challenge, the S3 module standardizes the dynamic input state, thereby stabilizing the RL training process. We validate our method through simulation experiments in complex environments, comparing it with several baseline algorithms. Results indicate that our algorithm excels in parcel singulation tasks, achieving a higher success rate and enhanced efficiency.
KW - Markov decision process
KW - Reinforcement learning
KW - nonstationary environment
KW - parcel singulation
KW - state space standardization
UR - https://www.scopus.com/pages/publications/85195425412
U2 - 10.1109/ICASSP48485.2024.10446437
DO - 10.1109/ICASSP48485.2024.10446437
M3 - Conference contribution
AN - SCOPUS:85195425412
T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
SP - 86
EP - 90
BT - 2024 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2024 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2024 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2024
Y2 - 14 April 2024 through 19 April 2024
ER -