
BaLeNAS: Differentiable Architecture Search via the Bayesian Learning Rule

  • Miao Zhang
  • Shirui Pan
  • Xiaojun Chang
  • Steven Su*
  • Jilin Hu
  • Gholamreza Haffari
  • Bin Yang
  • *Corresponding author for this work
  • Aalborg University
  • Monash University
  • AAII
  • Royal Melbourne Institute of Technology University
  • Shandong First Medical University & Shandong Academy of Medical Sciences

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-reviewed

Abstract

Differentiable Architecture Search (DARTS) has received massive attention in recent years, mainly because it significantly reduces the computational cost through weight sharing and continuous relaxation. However, more recent works find that existing differentiable NAS techniques struggle to outperform naive baselines, yielding deteriorating architectures as the search proceeds. Rather than directly optimizing the architecture parameters, this paper formulates neural architecture search as a distribution learning problem by relaxing the architecture weights into Gaussian distributions. By leveraging natural-gradient variational inference (NGVI), the architecture distribution can be easily optimized based on existing codebases without incurring additional memory or computational cost. We demonstrate how differentiable NAS benefits from Bayesian principles, enhancing exploration and improving stability. The experimental results on NAS benchmark datasets confirm the significant improvements the proposed framework can make. In addition, instead of simply applying the argmax on the learned parameters, we further leverage the recently proposed training-free proxies in NAS to select the optimal architecture from a group of architectures drawn from the optimized distribution, where we achieve state-of-the-art results on the NAS-Bench-201 and NAS-Bench-1shot1 benchmarks. Our best architecture in the DARTS search space also obtains competitive test errors of 2.37%, 15.72%, and 24.2% on CIFAR-10, CIFAR-100, and ImageNet, respectively.
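The selection step mentioned at the end of the abstract (drawing several architectures from the learned Gaussian distribution and picking one by a proxy score, rather than taking the argmax) can be illustrated with a minimal sketch. This is not the authors' implementation. The toy search space (`NUM_EDGES`, `NUM_OPS`), the function names, and the `proxy_score` callable are all hypothetical placeholders; the actual training-free proxies and the NGVI update are not reproduced here.

```python
import random

# Hypothetical toy search space: 4 edges, 3 candidate operations per edge.
NUM_EDGES, NUM_OPS = 4, 3

def sample_architecture(mu, sigma, rng):
    """Draw architecture weights from N(mu, sigma^2); keep the top op on each edge."""
    arch = []
    for e in range(NUM_EDGES):
        weights = [rng.gauss(mu[e][o], sigma[e][o]) for o in range(NUM_OPS)]
        arch.append(max(range(NUM_OPS), key=lambda o: weights[o]))
    return tuple(arch)

def argmax_architecture(mu):
    """Baseline: pick the op with the largest mean weight on each edge."""
    return tuple(
        max(range(NUM_OPS), key=lambda o: mu[e][o]) for e in range(NUM_EDGES)
    )

def select_by_proxy(mu, sigma, proxy_score, num_samples=16, seed=0):
    """Sample candidates from the distribution and return the best-scoring one.

    proxy_score is an assumed stand-in for a training-free proxy: any callable
    mapping an architecture tuple to a number, where higher is better.
    """
    rng = random.Random(seed)
    candidates = {sample_architecture(mu, sigma, rng) for _ in range(num_samples)}
    candidates.add(argmax_architecture(mu))  # keep the argmax choice as a candidate
    # Sort first so that ties are broken deterministically.
    return max(sorted(candidates), key=proxy_score)
```

Because the argmax architecture is always among the candidates, the selected architecture never scores lower under the chosen proxy than the argmax baseline. Whether that translates into better test accuracy depends on how well the proxy correlates with trained performance.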

Original language: English
Title of host publication: Proceedings - 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022
Publisher: IEEE Computer Society
Pages: 11861-11870
Number of pages: 10
ISBN (Electronic): 9781665469463
DOI
Publication status: Published - 2022
Externally published
Event: 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022 - New Orleans, United States
Duration: 19 Jun 2022 to 24 Jun 2022

Publication series

Name: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
Volume: 2022-June
ISSN (Print): 1063-6919

Conference

Conference: 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022
Country/Territory: United States
City: New Orleans
Period: 19/06/22 to 24/06/22
