TY - JOUR
T1 - It’s Morphing Time
T2 - Unleashing the Potential of Multiple LLMs via Multi-Objective Optimization
AU - Li, Bingdong
AU - Di, Zixiang
AU - Yang, Yanting
AU - Qian, Hong
AU - Yang, Peng
AU - Hao, Hao
AU - Tang, Ke
AU - Zhou, Aimin
N1 - Publisher Copyright: © 1997-2012 IEEE.
PY - 2025
Y1 - 2025
N2 - In this paper, we introduce a novel approach to the multi-objective optimization problem in large language model merging, based on black-box multi-objective optimization algorithms. The goal of model merging is to combine multiple models, each excelling in different tasks, into a single model that outperforms any of the individual source models. However, the effectiveness of conventional model merging methods is constrained by their reliance on human intuition or domain knowledge. While existing optimization-based model merging methods can automatically search for model merging parameter configurations, they often struggle to find a satisfactory configuration within a limited evaluation budget. To address this challenge, we propose a novel and sample-efficient automated model merging method, named MM-MO. This method leverages multi-objective Bayesian optimization algorithms to autonomously search for high-quality merging configurations across various tasks. In MM-MO, we propose an enhanced acquisition strategy and an auxiliary optimization objective to improve the search process. Our enhanced acquisition strategy integrates a weak-to-strong method to refine the acquisition function, enabling previously evaluated superior configurations to guide the search for new ones. Meanwhile, Fisher information is used to further filter these configurations, increasing the likelihood of finding high-quality merging configurations. Additionally, we design a sparsity metric as an auxiliary optimization objective, which further enhances the model's generalization performance across different tasks. We conduct comprehensive experiments comparing MM-MO with other mainstream model merging methods, demonstrating that the proposed algorithm is competitive and effective in achieving high-quality model merging.
AB - In this paper, we introduce a novel approach to the multi-objective optimization problem in large language model merging, based on black-box multi-objective optimization algorithms. The goal of model merging is to combine multiple models, each excelling in different tasks, into a single model that outperforms any of the individual source models. However, the effectiveness of conventional model merging methods is constrained by their reliance on human intuition or domain knowledge. While existing optimization-based model merging methods can automatically search for model merging parameter configurations, they often struggle to find a satisfactory configuration within a limited evaluation budget. To address this challenge, we propose a novel and sample-efficient automated model merging method, named MM-MO. This method leverages multi-objective Bayesian optimization algorithms to autonomously search for high-quality merging configurations across various tasks. In MM-MO, we propose an enhanced acquisition strategy and an auxiliary optimization objective to improve the search process. Our enhanced acquisition strategy integrates a weak-to-strong method to refine the acquisition function, enabling previously evaluated superior configurations to guide the search for new ones. Meanwhile, Fisher information is used to further filter these configurations, increasing the likelihood of finding high-quality merging configurations. Additionally, we design a sparsity metric as an auxiliary optimization objective, which further enhances the model's generalization performance across different tasks. We conduct comprehensive experiments comparing MM-MO with other mainstream model merging methods, demonstrating that the proposed algorithm is competitive and effective in achieving high-quality model merging.
KW - Large language model
KW - model merging
KW - multi-objective optimization
UR - https://www.scopus.com/pages/publications/105017446869
U2 - 10.1109/TEVC.2025.3613937
DO - 10.1109/TEVC.2025.3613937
M3 - Article
AN - SCOPUS:105017446869
SN - 1089-778X
JO - IEEE Transactions on Evolutionary Computation
JF - IEEE Transactions on Evolutionary Computation
ER -