TY - JOUR
T1 - DS3former
T2 - Dual-Stream Semantic Separation Transformer for Single Image Reflection Separation
AU - Yin, Wenbin
AU - Zhang, Junkang
AU - Fang, Faming
AU - Zhang, Guixu
N1 - Publisher Copyright:
© 1999-2012 IEEE.
PY - 2026
Y1 - 2026
N2 - Single image reflection separation is a challenging and ill-posed problem owing to diverse reflective surfaces and lighting conditions. This paper introduces DS3former, a dual-stream transformer network employing a semantic separation strategy to effectively distinguish between the transmission (T) and reflection (R) layers. We observe that within pre-trained deep semantic features of mixed images, individual channels exhibit varying affinities towards either the T or R layer, facilitating their differentiation. Based on this observation, we propose a novel semantic separation attention mechanism that adaptively extracts layer-specific features from different channels and performs inter-stream feature transfer and aggregation to enhance separation. To further improve performance at the semantic level, features from deeper decoder stages and external pre-trained models are integrated to guide the separation process in shallower encoder layers. Experimental results show that the proposed method outperforms state-of-the-art reflection separation methods in terms of quantitative metrics and visual quality.
AB - Single image reflection separation is a challenging and ill-posed problem owing to diverse reflective surfaces and lighting conditions. This paper introduces DS3former, a dual-stream transformer network employing a semantic separation strategy to effectively distinguish between the transmission (T) and reflection (R) layers. We observe that within pre-trained deep semantic features of mixed images, individual channels exhibit varying affinities towards either the T or R layer, facilitating their differentiation. Based on this observation, we propose a novel semantic separation attention mechanism that adaptively extracts layer-specific features from different channels and performs inter-stream feature transfer and aggregation to enhance separation. To further improve performance at the semantic level, features from deeper decoder stages and external pre-trained models are integrated to guide the separation process in shallower encoder layers. Experimental results show that the proposed method outperforms state-of-the-art reflection separation methods in terms of quantitative metrics and visual quality.
KW - Bidirectional Semantic Attention
KW - Dual-Branch Architecture
KW - Fast Fourier Convolution
KW - Pre-trained Model
KW - Single Image Reflection Separation
UR - https://www.scopus.com/pages/publications/105027965366
U2 - 10.1109/TMM.2026.3654343
DO - 10.1109/TMM.2026.3654343
M3 - Article
AN - SCOPUS:105027965366
SN - 1520-9210
JO - IEEE Transactions on Multimedia
JF - IEEE Transactions on Multimedia
ER -