TY - GEN
T1 - SkipVSR
T2 - 32nd ACM International Conference on Multimedia, MM 2024
AU - Ai, Zekun
AU - Luo, Xiaotong
AU - Qu, Yanyun
AU - Xie, Yuan
N1 - Publisher Copyright:
© 2024 ACM.
PY - 2024/10/28
Y1 - 2024/10/28
AB - Deep neural networks have shown enormous potential in video super-resolution (VSR), yet their high computational cost limits deployment on resource-limited devices and in real-world scenarios, especially when restoring multiple frames simultaneously. Existing VSR models contain considerable redundant filters, which reduce inference efficiency. To accelerate the inference of VSR models, we propose a scalable method based on adaptive patch routing to achieve practical speedup. Specifically, we design a confidence estimator to predict each block's performance in aggregating adjacent patch information. It learns to dynamically perform block skipping, i.e., to choose which basic blocks of the VSR network to execute during inference, so as to reduce total computation as much as possible without dramatically degrading reconstruction accuracy. However, we observe that skipping errors are amplified as the hidden states propagate through the recurrent network. To alleviate this issue, we design temporal feature alignment to preserve performance. In essence, this provides an adaptive routing scheme for each patch. Extensive experiments demonstrate that our method not only accelerates inference but also provides strong quantitative and qualitative results. Built upon the BasicVSR model, our method achieves a speedup of 20% on average, and as high as 50% for some images, while maintaining competitive performance on REDS4.
KW - adaptive inference
KW - dynamic network
KW - video super-resolution
UR - https://www.scopus.com/pages/publications/85209797449
U2 - 10.1145/3664647.3681637
DO - 10.1145/3664647.3681637
M3 - Conference contribution
AN - SCOPUS:85209797449
T3 - MM 2024 - Proceedings of the 32nd ACM International Conference on Multimedia
SP - 5874
EP - 5882
BT - MM 2024 - Proceedings of the 32nd ACM International Conference on Multimedia
PB - Association for Computing Machinery, Inc
Y2 - 28 October 2024 through 1 November 2024
ER -