
Low-cost Lipschitz-independent adaptive importance sampling of stochastic gradients

  • Huikang Liu
  • Xiaolu Wang
  • Jiajin Li
  • Anthony Man Cho So
  • Chinese University of Hong Kong

Research output: Chapter in book/report/conference proceeding; conference contribution; peer-reviewed

Abstract

Stochastic gradient descent (SGD) usually samples training data from the uniform distribution, which may not be a good choice because of the high variance of the resulting stochastic gradient. Importance sampling methods have therefore been considered in the literature to improve performance. Most previous work on SGD-based methods with importance sampling requires knowledge of the Lipschitz constants of all component gradients, which are in general difficult to estimate. In this paper, we study an adaptive importance sampling method for common SGD-based methods that exploits local first-order information without knowing any Lipschitz constants. In particular, we periodically change the sampling distribution using only the gradient norms from the past few iterations. We prove that our adaptive importance sampling non-asymptotically reduces the variance of the stochastic gradients in SGD, and thus better convergence bounds than those for vanilla SGD can be obtained. We extend this sampling method to several other widely used stochastic gradient algorithms, including SGD with momentum and ADAM. Experiments on common convex learning problems and deep neural networks show notably improved performance with the adaptive sampling strategy.
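
The abstract does not state the exact update rule, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a 1-D least-squares problem, sampling probabilities proportional to the most recently observed gradient norms, a mixture with the uniform distribution so that every sample keeps a nonzero probability, and a fixed refresh period. The names `sampling_probs`, `adaptive_is_sgd`, `mix`, and `refresh` are hypothetical. Each step is reweighted by 1/(n p_i), which keeps the stochastic gradient an unbiased estimate of the full gradient.

```python
import random


def sampling_probs(grad_norms, mix=0.5):
    """Probabilities proportional to recorded gradient norms,
    mixed with the uniform distribution (assumed safeguard)."""
    n = len(grad_norms)
    total = sum(grad_norms)
    if total == 0.0:
        # No gradient information: fall back to uniform sampling.
        return [1.0 / n] * n
    return [mix / n + (1.0 - mix) * g / total for g in grad_norms]


def adaptive_is_sgd(xs, ys, steps=2000, lr=0.01, refresh=50, mix=0.5, seed=0):
    """SGD on f(w) = (1/n) * sum_i 0.5 * (w * x_i - y_i)^2 with a
    sampling distribution that is refreshed every `refresh` steps."""
    rng = random.Random(seed)
    n = len(xs)
    w = 0.0
    norms = [1.0] * n          # last observed |grad f_i|, assumed initial value
    probs = [1.0 / n] * n      # start from uniform sampling
    for t in range(steps):
        if t % refresh == 0:
            # Periodic update using only recently observed gradient norms.
            probs = sampling_probs(norms, mix)
        i = rng.choices(range(n), weights=probs, k=1)[0]
        g = (w * xs[i] - ys[i]) * xs[i]   # gradient of the i-th component
        norms[i] = abs(g)
        # Importance weight 1 / (n * p_i) keeps the estimate unbiased.
        w -= lr * g / (n * probs[i])
    return w


if __name__ == "__main__":
    # Noise-free data generated with w = 2.
    print(adaptive_is_sgd([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0]))
```

No Lipschitz constants appear anywhere in the sketch: the distribution is driven only by gradient norms that the SGD loop computes anyway, which is the low-cost aspect the abstract emphasizes.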

Original language: English
Title of host publication: Proceedings of ICPR 2020 - 25th International Conference on Pattern Recognition
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 2150-2157
Number of pages: 8
ISBN (electronic): 9781728188089
DOI
Publication status: Published - 2020
Externally published
Event: 25th International Conference on Pattern Recognition, ICPR 2020 - Virtual, Online, Italy
Duration: 10 Jan 2021 to 15 Jan 2021

Publication series

Name: Proceedings - International Conference on Pattern Recognition
ISSN (print): 1051-4651

Conference

Conference: 25th International Conference on Pattern Recognition, ICPR 2020
Country/Territory: Italy
Virtual, Online
Period: 10/01/21 to 15/01/21
