TY - GEN
T1 - Max-Min Diversification with Fairness Constraints
T2 - 2023 SIAM International Conference on Data Mining, SDM 2023
AU - Wang, Yanhao
AU - Mathioudakis, Michael
AU - Li, Jia
AU - Fabbri, Francesco
N1 - Publisher Copyright: © 2023 by SIAM.
PY - 2023
Y1 - 2023
AB - Diversity maximization aims to select a diverse and representative subset of items from a large dataset. It is a fundamental optimization task that finds applications in data summarization, feature selection, web search, recommender systems, and elsewhere. However, in a setting where data items are associated with different groups according to sensitive attributes like sex or race, it is possible that algorithmic solutions for this task, if left unchecked, will under- or over-represent some of the groups. Therefore, we are motivated to address the problem of max-min diversification with fairness constraints, aiming to select k items to maximize the minimum distance between any pair of selected items while ensuring that the number of items selected from each group falls within predefined lower and upper bounds. In this work, we propose an exact algorithm based on integer linear programming that is suitable for small datasets as well as a (1−ε)/5-approximation algorithm for any parameter ε ∈ (0, 1) that scales to large datasets. Extensive experiments on real-world datasets demonstrate the superior performance of our proposed algorithms over existing ones.
KW - algorithmic fairness
KW - max-min diversification
UR - https://www.scopus.com/pages/publications/85166175320
M3 - Conference contribution
AN - SCOPUS:85166175320
T3 - 2023 SIAM International Conference on Data Mining, SDM 2023
SP - 91
EP - 99
BT - 2023 SIAM International Conference on Data Mining, SDM 2023
PB - Society for Industrial and Applied Mathematics Publications
Y2 - 27 April 2023 through 29 April 2023
ER -