
BBTv2: Towards a Gradient-Free Future with Large Language Models

  • Tianxiang Sun
  • Zhengfu He
  • Hong Qian
  • Yunhua Zhou
  • Xuanjing Huang
  • Xipeng Qiu*
  • *Corresponding author for this work
  • Fudan University

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

Abstract

Most downstream adaptation methods tune all or part of the parameters of pre-trained models (PTMs) through gradient descent, where the tuning cost increases linearly with the growth of the model size. By contrast, gradient-free methods only require the forward computation of the PTM to tune the prompt, retaining the benefits of efficient tuning and deployment. However, past work on gradient-free tuning often introduces gradient descent to seek a good initialization of the prompt and lacks versatility across tasks and PTMs. In this paper, we present BBTv2, an improved version of Black-Box Tuning (Sun et al., 2022b), to drive PTMs for few-shot learning. We prepend continuous prompts to every layer of the PTM and propose a divide-and-conquer gradient-free algorithm to optimize the prompts at different layers alternately. Extensive experiments across various tasks and PTMs show that BBTv2 can achieve comparable performance to full model tuning and state-of-the-art parameter-efficient methods (e.g., Adapter, LoRA, BitFit, etc.) under few-shot settings while maintaining far fewer tunable parameters.
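
The alternating, forward-only optimization described in the abstract can be illustrated with a toy sketch. This is not the paper's implementation: `toy_loss` is a hypothetical stand-in for a forward pass of a frozen model, and a simple accept-if-better random search is assumed in place of whatever derivative-free optimizer BBTv2 actually uses. All names and hyperparameters below are illustrative assumptions.

```python
import random


def toy_loss(prompts):
    """Hypothetical stand-in for a frozen model's forward pass.

    Takes one prompt vector per layer and returns a scalar loss.
    No gradients are computed anywhere.
    """
    targets = [0.5 * (i + 1) for i in range(len(prompts))]
    return sum((x - t) ** 2 for p, t in zip(prompts, targets) for x in p)


def alternating_gradient_free(num_layers=3, dim=4, rounds=20, steps=30,
                              sigma=0.3, seed=0):
    """Optimize per-layer prompts one layer at a time (assumed random search)."""
    rng = random.Random(seed)
    prompts = [[0.0] * dim for _ in range(num_layers)]
    best = toy_loss(prompts)
    history = [best]
    for _ in range(rounds):
        # Divide and conquer: perturb only one layer's prompt at a time,
        # keeping the prompts at all other layers fixed.
        for layer in range(num_layers):
            for _ in range(steps):
                candidate = [x + rng.gauss(0.0, sigma) for x in prompts[layer]]
                trial = prompts[:layer] + [candidate] + prompts[layer + 1:]
                loss = toy_loss(trial)  # forward computation only
                if loss < best:  # keep the candidate only if it improves
                    prompts, best = trial, loss
        history.append(best)
    return prompts, history


if __name__ == "__main__":
    _, history = alternating_gradient_free()
    print("initial loss:", history[0])
    print("final loss:", history[-1])
```

Because a candidate is accepted only when it lowers the loss, the recorded loss never increases from one round to the next. A real setting would replace `toy_loss` with a few-shot task loss computed from the model's outputs.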

Original language: English
Title of host publication: Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022
Editors: Yoav Goldberg, Zornitsa Kozareva, Yue Zhang
Publisher: Association for Computational Linguistics (ACL)
Pages: 3916-3930
Number of pages: 15
ISBN (Electronic): 9781959429401
DOI
Publication status: Published - 2022
Event: 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022 - Hybrid, Abu Dhabi, United Arab Emirates
Duration: 7 Dec 2022 to 11 Dec 2022

Publication series

Name: Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022

Conference

Conference: 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022
Country/Territory: United Arab Emirates
City: Hybrid, Abu Dhabi
Period: 7/12/22 to 11/12/22
