TY - GEN
T1 - D3L
T2 - 33rd ACM International Conference on Multimedia, MM 2025
AU - Liu, Wenxiang
AU - Liu, Yongkang
AU - Meng, Weiliang
AU - He, Gaoqi
AU - Li, Jianhua
N1 - Publisher Copyright: © 2025 ACM.
PY - 2025/10/27
Y1 - 2025/10/27
N2 - Monocular 3D lane detection is a challenging task for autonomous driving systems. Recent advances primarily focus on one-step methods for lane detection based on front-view features, which show promising results on straight lanes. However, curved lanes are difficult to handle with one-step prediction, which performs prediction in a single leap without gradual refinement. To address this issue, we propose a novel Denoising Diffusion Model for 3D Lane Detection framework (D3L). The main idea is to leverage the progressive generation capability of the diffusion model to generate accurate 3D curved lanes while ensuring lane continuity through curvature constraints. The framework includes three key components: a coarse-to-fine denoiser (CFD), a curvature-constrained loss (CCL), and a multi-sampling aggregation strategy (MSAS). CFD integrates both lane-level and point-level transformer blocks to denoise 3D lanes accurately, capturing both global and local features. CCL is designed to reduce deviations in lane curvature, resulting in smoother lane continuity. This loss enhances both the accuracy and geometric consistency of lane detection, especially in complex curved scenes. MSAS selects the optimal lane point by point from multiple candidates, which significantly improves the robustness of lane prediction. Extensive experiments on two popular 3D lane detection benchmarks demonstrate that D3L outperforms state-of-the-art methods.
AB - Monocular 3D lane detection is a challenging task for autonomous driving systems. Recent advances primarily focus on one-step methods for lane detection based on front-view features, which show promising results on straight lanes. However, curved lanes are difficult to handle with one-step prediction, which performs prediction in a single leap without gradual refinement. To address this issue, we propose a novel Denoising Diffusion Model for 3D Lane Detection framework (D3L). The main idea is to leverage the progressive generation capability of the diffusion model to generate accurate 3D curved lanes while ensuring lane continuity through curvature constraints. The framework includes three key components: a coarse-to-fine denoiser (CFD), a curvature-constrained loss (CCL), and a multi-sampling aggregation strategy (MSAS). CFD integrates both lane-level and point-level transformer blocks to denoise 3D lanes accurately, capturing both global and local features. CCL is designed to reduce deviations in lane curvature, resulting in smoother lane continuity. This loss enhances both the accuracy and geometric consistency of lane detection, especially in complex curved scenes. MSAS selects the optimal lane point by point from multiple candidates, which significantly improves the robustness of lane prediction. Extensive experiments on two popular 3D lane detection benchmarks demonstrate that D3L outperforms state-of-the-art methods.
KW - 3D lane detection
KW - curvature constraint
KW - denoising diffusion model
UR - https://www.scopus.com/pages/publications/105024069781
U2 - 10.1145/3746027.3755667
DO - 10.1145/3746027.3755667
M3 - Conference contribution
AN - SCOPUS:105024069781
T3 - MM 2025 - Proceedings of the 33rd ACM International Conference on Multimedia, Co-Located with MM 2025
SP - 4923
EP - 4931
BT - MM 2025 - Proceedings of the 33rd ACM International Conference on Multimedia, Co-Located with MM 2025
PB - Association for Computing Machinery, Inc
Y2 - 27 October 2025 through 31 October 2025
ER -