TY - JOUR
T1 - Optimal Transport with Arbitrary Prior for Dynamic Resolution Network
AU - Zhang, Zhizhong
AU - Li, Shujun
AU - Zhang, Chenyang
AU - Ma, Lizhuang
AU - Tan, Xin
AU - Xie, Yuan
N1 - Publisher Copyright: © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2025.
PY - 2025/9
Y1 - 2025/9
AB - Dynamic resolution networks have proved crucial in reducing computational redundancy by automatically assigning a suitable resolution to each input image. However, resolution choices are observed to collapse often: prior works tend to assign images to the resolution routes whose computational cost is close to the required FLOPs. In this paper, we propose a novel optimal transport dynamic resolution network (OTD-Net) by establishing an intrinsic connection between resolution assignment and the optimal transport problem. In this framework, each sample owns a resolution assignment choice and is viewed as a supplier, while each resolution requires unallocated images and is considered a demander. With two assignment priors, OTD-Net benefits from a theoretically supported non-collapsing division and produces the desired assignment policy by balancing the computation budget and prediction accuracy. On that basis, a multi-resolution inference scheme is proposed to ensemble low-resolution predictions. Extensive experiments, including image classification, object detection, and depth estimation, show that our approach is both efficient and effective for both ResNet and Transformer architectures, achieving state-of-the-art performance on various benchmarks.
KW - Computational Redundancy
KW - Dynamic Inference
KW - Dynamic Resolution Network
KW - Model Compression
UR - https://www.scopus.com/pages/publications/105006479660
U2 - 10.1007/s11263-025-02483-7
DO - 10.1007/s11263-025-02483-7
M3 - Article
AN - SCOPUS:105006479660
SN - 0920-5691
VL - 133
SP - 6187
EP - 6200
JO - International Journal of Computer Vision
JF - International Journal of Computer Vision
IS - 9
ER -