
Hierarchical Safety Realignment: Lightweight Restoration of Safety in Pruned Large Vision-Language Models

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

With the growing size of Large Vision-Language Models (LVLMs), network pruning techniques designed to compress these models for deployment in resource-constrained environments have attracted significant attention. However, we observe that pruning frequently results in a degradation in safety performance. To address this issue, we propose a novel and lightweight approach, named Hierarchical Safety Realignment (HSR). HSR operates by first quantifying the contribution of each attention head to safety, identifying the most critical ones, and then selectively restoring neurons directly within these attention heads that play a pivotal role in maintaining safety. This process hierarchically realigns the safety of pruned LVLMs, progressing from the attention head level to the neuron level. We validate HSR across various models and pruning strategies, consistently achieving notable improvements in safety performance. To the best of our knowledge, this is the first work explicitly focused on restoring safety in LVLMs post-pruning. The code will be available at https://github.com/TheShineyue/HSR.
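
The abstract only outlines the two-level procedure, so the sketch below is one hypothetical reading of it and not the authors' implementation. It assumes that per-head safety scores and per-weight safety importance scores have already been computed (the abstract does not say how HSR obtains them), and it treats individual weights as the unit restored inside a head. The names `restore_safety_weights`, `head_scores`, `weight_scores`, `top_heads`, and `restore_frac` are illustrative assumptions.

```python
import numpy as np


def restore_safety_weights(original, pruned_mask, head_scores, weight_scores,
                           head_dim, top_heads, restore_frac):
    """Hypothetical two-level restoration: pick heads, then weights inside them.

    original:      (d_out, d_in) dense weight matrix before pruning
    pruned_mask:   boolean array, True where pruning kept the weight
    head_scores:   (n_heads,) assumed safety contribution of each head
    weight_scores: (d_out, d_in) assumed safety importance of each weight
    head_dim:      number of consecutive output rows that form one head
    top_heads:     how many heads to treat as safety-critical
    restore_frac:  fraction of a critical head's pruned weights to restore
    """
    new_mask = pruned_mask.copy()

    # Level 1 (head level): keep the heads with the highest safety score.
    critical_heads = np.argsort(head_scores)[::-1][:top_heads]

    for h in critical_heads:
        start = h * head_dim
        rows = slice(start, start + head_dim)
        removed = ~pruned_mask[rows]  # weights pruned away in this head
        n_restore = int(restore_frac * removed.sum())
        if n_restore == 0:
            continue

        # Level 2 (weight level): rank only the removed weights by importance.
        scores = np.where(removed, weight_scores[rows], -np.inf)
        flat_idx = np.argsort(scores, axis=None)[::-1][:n_restore]
        r, c = np.unravel_index(flat_idx, scores.shape)
        new_mask[r + start, c] = True

    # Restored weights take their original (pre-pruning) values.
    return original * new_mask, new_mask
```

In this sketch only the mask changes, so the restored model stays sparse everywhere except for the selected weights in the selected heads. That matches the "lightweight" framing of the abstract, though the actual selection criteria in HSR may differ.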

Original language: English
Title of host publication: Findings of the Association for Computational Linguistics
Subtitle of host publication: ACL 2025
Editors: Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Publisher: Association for Computational Linguistics (ACL)
Pages: 7600-7612
Number of pages: 13
ISBN (Electronic): 9798891762565
DOI
Publication status: Published - 2025
Event: 63rd Annual Meeting of the Association for Computational Linguistics, ACL 2025 - Vienna, Austria
Duration: 27 Jul 2025 → 1 Aug 2025

Publication series

Name: Proceedings of the Annual Meeting of the Association for Computational Linguistics
ISSN (Print): 0736-587X

Conference

Conference: 63rd Annual Meeting of the Association for Computational Linguistics, ACL 2025
Country/Territory: Austria
City: Vienna
Period: 27/07/25 → 1/08/25
