TY - GEN
T1 - MonitorLight
T2 - 31st ACM International Conference on Information and Knowledge Management, CIKM 2022
AU - Fang, Zekuan
AU - Zhang, Fan
AU - Wang, Ting
AU - Lian, Xiang
AU - Chen, Mingsong
N1 - Publisher Copyright: © 2022 ACM.
PY - 2022/10/17
Y1 - 2022/10/17
N2 - Although Reinforcement Learning (RL) has achieved significant success in Traffic Signal Control (TSC), most existing approaches focus on the design of RL elements and neglect the impact of phase duration. Because dynamic phase duration is left unexplored, the overall performance and convergence rate of RL-based TSC approaches cannot be guaranteed, which may result in poor adaptability of RL methods to different traffic conditions. To address these issues, we formulate a novel phase-duration-aware TSC (PDA-TSC) problem and propose an effective RL-based TSC approach named MonitorLight. Our approach adopts a new traffic indicator, mixed pressure, which enables RL agents to simultaneously analyze the impacts of stationary and moving vehicles on intersections. Based on the observed mixed pressure of intersections, RL agents can autonomously determine in real time whether to change the current signals. In addition, MonitorLight can adjust its control method for scenarios with different real-time requirements and achieve excellent results in different situations. Extensive experiments on both real-world and synthetic datasets demonstrate that MonitorLight outperforms the current state-of-the-art IPDALight by up to 2.84% and 5.71% in average vehicle travel time, respectively. Moreover, our method significantly speeds up convergence, leading IPDALight by 36.87% and 34.58% in start-to-converge episode and jumpstart performance, respectively.
KW - average travel time
KW - fairness
KW - phase duration
KW - reinforcement learning
KW - traffic signal control
UR - https://www.scopus.com/pages/publications/85140848974
U2 - 10.1145/3511808.3557400
DO - 10.1145/3511808.3557400
M3 - Conference contribution
AN - SCOPUS:85140848974
T3 - International Conference on Information and Knowledge Management, Proceedings
SP - 478
EP - 487
BT - CIKM 2022 - Proceedings of the 31st ACM International Conference on Information and Knowledge Management
PB - Association for Computing Machinery
Y2 - 17 October 2022 through 21 October 2022
ER -