TY - GEN
T1 - Rethinking Gradient Projection Continual Learning
T2 - 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2023
AU - Zhao, Zhen
AU - Zhang, Zhizhong
AU - Tan, Xin
AU - Liu, Jun
AU - Qu, Yanyun
AU - Xie, Yuan
AU - Ma, Lizhuang
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - Continual learning aims to incrementally learn novel classes over time, while not forgetting the learned knowledge. Recent studies have found that learning would not forget if the updated gradient is orthogonal to the feature space. However, previous approaches require the gradient to be fully orthogonal to the whole feature space, leading to poor plasticity, as the feasible gradient direction becomes narrow when the tasks continually come, i.e., feature space is unlimitedly expanded. In this paper, we propose a space decoupling (SD) algorithm to decouple the feature space into a pair of complementary subspaces, i.e., the stability space I and the plasticity space R. I is established by conducting space intersection between the historic and current feature space, and thus I contains more task-shared bases. R is constructed by seeking the orthogonal complementary subspace of I and thus R mainly contains task-specific bases. By putting distinguishing constraints on R and I, our method achieves a better balance between stability and plasticity. Extensive experiments are conducted by applying SD to gradient projection baselines, and show SD is model-agnostic and achieves SOTA results on publicly available datasets.
AB - Continual learning aims to incrementally learn novel classes over time, while not forgetting the learned knowledge. Recent studies have found that learning would not forget if the updated gradient is orthogonal to the feature space. However, previous approaches require the gradient to be fully orthogonal to the whole feature space, leading to poor plasticity, as the feasible gradient direction becomes narrow when the tasks continually come, i.e., feature space is unlimitedly expanded. In this paper, we propose a space decoupling (SD) algorithm to decouple the feature space into a pair of complementary subspaces, i.e., the stability space I and the plasticity space R. I is established by conducting space intersection between the historic and current feature space, and thus I contains more task-shared bases. R is constructed by seeking the orthogonal complementary subspace of I and thus R mainly contains task-specific bases. By putting distinguishing constraints on R and I, our method achieves a better balance between stability and plasticity. Extensive experiments are conducted by applying SD to gradient projection baselines, and show SD is model-agnostic and achieves SOTA results on publicly available datasets.
KW - Transfer, continual, low-shot, meta, or long-tail learning
UR - https://www.scopus.com/pages/publications/85173959820
U2 - 10.1109/CVPR52729.2023.00362
DO - 10.1109/CVPR52729.2023.00362
M3 - Conference contribution
AN - SCOPUS:85173959820
T3 - Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
SP - 3718
EP - 3727
BT - Proceedings - 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2023
PB - IEEE Computer Society
Y2 - 18 June 2023 through 22 June 2023
ER -