TY - JOUR
T1 - End-to-end hardware-in-the-loop temperature control for semiconductor laser based on deep reinforcement learning
AU - Zeng, Chunlan
AU - Chen, Shaoqiang
AU - Liu, Qi
AU - Zhang, Zeli
AU - Weng, Guoen
N1 - Publisher Copyright:
© 2026 IOP Publishing Ltd. All rights, including for text and data mining, AI training, and similar technologies, are reserved.
PY - 2026/5
Y1 - 2026/5
AB - This paper presents a temperature control system for semiconductor lasers, employing a novel end-to-end hardware-in-the-loop control strategy that integrates deep reinforcement learning control with mechanical structure design. Three mechanical structures were proposed and simulated to enhance temperature control flexibility and uniformity: single-stage (Fan only, thermoelectric cooler (TEC) only) and dual-stage (TEC + Fan). Simulations demonstrate that the dual-stage structure under the same driving conditions provides superior temperature control flexibility and uniformity. The key innovation lies in the end-to-end hardware-in-the-loop control strategy based on deep reinforcement learning. The End-to-end Deep Reinforcement Learning (E2EDRL) algorithm is capable of autonomously exploring optimal control policies without manual tuning. Simulation results demonstrate that, compared with the Proximal Policy Optimization (PPO) and Soft Actor-Critic (SAC) algorithms, E2EDRL not only achieves the highest performance metrics but also converges within a reasonable number of episodes (approximately 45). Compared with Proportional-Integral-Derivative (PID), E2EDRL achieves approximately 50% improvement in settling time, overshoot, and steady-state error. This approach achieves structure–algorithm synergy, thereby comprehensively accounting for system-level factors. Experimental results demonstrate substantial performance improvements, achieving temperature fluctuation control within ±0.8 °C, optical power fluctuations limited to 2.2%, and an 87% reduction in central wavelength redshift. Furthermore, the system demonstrates robust intelligent behavior and adaptability across a broad range of temperature control scenarios, underscoring its potential for advanced thermal management applications in semiconductor lasers.
KW - hardware-in-the-loop
KW - online learning
KW - reinforcement learning
KW - semiconductor laser
KW - temperature control
UR - https://www.scopus.com/pages/publications/105037431268
U2 - 10.1088/2631-8695/ae56d0
DO - 10.1088/2631-8695/ae56d0
M3 - Article
AN - SCOPUS:105037431268
SN - 2631-8695
VL - 8
JO - Engineering Research Express
JF - Engineering Research Express
IS - 9
ER -