TY - JOUR
T1 - DANIM
T2 - Domain adaptation network with intermediate domain masking for night-time scene parsing
AU - Tian, Qijian
AU - Wang, Sen
AU - Yi, Ran
AU - Zhang, Zufeng
AU - Sheng, Bin
AU - Tan, Xin
AU - Ma, Lizhuang
N1 - Publisher Copyright:
© 2025 Elsevier Ltd
PY - 2026/5
Y1 - 2026/5
N2 - Night-time scene parsing is important for practical applications such as autonomous driving and robot vision. Since annotation is time-consuming, Unsupervised Domain Adaptation (UDA) is an effective solution for night-time scene parsing. Due to the low illumination, over/under-exposure, and motion blur in night-time scenes, existing methods cannot bridge daytime and night-time scenes well, which limits their performance. Some methods rely on day-night paired images, which are costly to collect and therefore impractical. In this paper, we propose DANIM, a self-training UDA network for night-time scene parsing. We introduce an intermediate domain that explicitly models the connection between daytime and night-time scenes in terms of lighting and structure. The intermediate domain shares similar structure information with the night-time target domain and similar lighting information with the daytime source domain. By harnessing the rich prior knowledge of a pre-trained text-driven generative model, the intermediate domain can be generated, and we propose a scoring mechanism to select high-quality samples for training. In addition, we propose intermediate domain masking to address the inconsistency between the intermediate domain and the target domain. We further design a coupled mask strategy to make the masking more effective. Extensive experiments show that DANIM achieves first place on the DarkZurich leaderboard and outperforms state-of-the-art methods on other widely used night-time scene parsing benchmarks, i.e., ACDC-night, NightCity, and NighttimeDriving.
AB - Night-time scene parsing is important for practical applications such as autonomous driving and robot vision. Since annotation is time-consuming, Unsupervised Domain Adaptation (UDA) is an effective solution for night-time scene parsing. Due to the low illumination, over/under-exposure, and motion blur in night-time scenes, existing methods cannot bridge daytime and night-time scenes well, which limits their performance. Some methods rely on day-night paired images, which are costly to collect and therefore impractical. In this paper, we propose DANIM, a self-training UDA network for night-time scene parsing. We introduce an intermediate domain that explicitly models the connection between daytime and night-time scenes in terms of lighting and structure. The intermediate domain shares similar structure information with the night-time target domain and similar lighting information with the daytime source domain. By harnessing the rich prior knowledge of a pre-trained text-driven generative model, the intermediate domain can be generated, and we propose a scoring mechanism to select high-quality samples for training. In addition, we propose intermediate domain masking to address the inconsistency between the intermediate domain and the target domain. We further design a coupled mask strategy to make the masking more effective. Extensive experiments show that DANIM achieves first place on the DarkZurich leaderboard and outperforms state-of-the-art methods on other widely used night-time scene parsing benchmarks, i.e., ACDC-night, NightCity, and NighttimeDriving.
KW - Masked image modeling
KW - Night-time scene parsing
KW - Unsupervised domain adaptation
UR - https://www.scopus.com/pages/publications/105024999090
U2 - 10.1016/j.patcog.2025.112796
DO - 10.1016/j.patcog.2025.112796
M3 - Article
AN - SCOPUS:105024999090
SN - 0031-3203
VL - 173
JO - Pattern Recognition
JF - Pattern Recognition
M1 - 112796
ER -