TY - GEN
T1 - Distance-Based Sampling of Software Configuration Spaces
AU - Kaltenecker, Christian
AU - Grebhahn, Alexander
AU - Siegmund, Norbert
AU - Guo, Jianmei
AU - Apel, Sven
N1 - Publisher Copyright:
© 2019 IEEE.
PY - 2019/5
Y1 - 2019/5
N2 - Configurable software systems provide a multitude of configuration options to adjust and optimize their functional and non-functional properties. For instance, to find the fastest configuration for a given setting, a brute-force strategy measures the performance of all configurations, which is typically intractable. Addressing this challenge, state-of-the-art strategies rely on machine learning, analyzing only a few configurations (i.e., a sample set) to predict the performance of other configurations. However, to obtain accurate performance predictions, a representative sample set of configurations is required. Addressing this task, different sampling strategies have been proposed, which come with different advantages (e.g., covering the configuration space systematically) and disadvantages (e.g., the need to enumerate all configurations). In our experiments, we found that most sampling strategies do not achieve a good coverage of the configuration space with respect to covering relevant performance values. That is, they miss important configurations with distinct performance behavior. Based on this observation, we devise a new sampling strategy, called distance-based sampling, that is based on a distance metric and a probability distribution to spread the configurations of the sample set according to a given probability distribution across the configuration space. This way, we cover different kinds of interactions among configuration options in the sample set. To demonstrate the merits of distance-based sampling, we compare it to state-of-the-art sampling strategies, such as t-wise sampling, on 10 real-world configurable software systems. Our results show that distance-based sampling leads to more accurate performance models for medium to large sample sets.
AB - Configurable software systems provide a multitude of configuration options to adjust and optimize their functional and non-functional properties. For instance, to find the fastest configuration for a given setting, a brute-force strategy measures the performance of all configurations, which is typically intractable. Addressing this challenge, state-of-the-art strategies rely on machine learning, analyzing only a few configurations (i.e., a sample set) to predict the performance of other configurations. However, to obtain accurate performance predictions, a representative sample set of configurations is required. Addressing this task, different sampling strategies have been proposed, which come with different advantages (e.g., covering the configuration space systematically) and disadvantages (e.g., the need to enumerate all configurations). In our experiments, we found that most sampling strategies do not achieve a good coverage of the configuration space with respect to covering relevant performance values. That is, they miss important configurations with distinct performance behavior. Based on this observation, we devise a new sampling strategy, called distance-based sampling, that is based on a distance metric and a probability distribution to spread the configurations of the sample set according to a given probability distribution across the configuration space. This way, we cover different kinds of interactions among configuration options in the sample set. To demonstrate the merits of distance-based sampling, we compare it to state-of-the-art sampling strategies, such as t-wise sampling, on 10 real-world configurable software systems. Our results show that distance-based sampling leads to more accurate performance models for medium to large sample sets.
KW - Configurable Systems
KW - Configuration Sampling
KW - Distance Based Sampling
KW - Performance Modeling
UR - https://www.scopus.com/pages/publications/85062345445
U2 - 10.1109/ICSE.2019.00112
DO - 10.1109/ICSE.2019.00112
M3 - 会议稿件
AN - SCOPUS:85062345445
T3 - Proceedings - International Conference on Software Engineering
SP - 1084
EP - 1094
BT - Proceedings - 2019 IEEE/ACM 41st International Conference on Software Engineering, ICSE 2019
PB - IEEE Computer Society
T2 - 41st IEEE/ACM International Conference on Software Engineering, ICSE 2019
Y2 - 25 May 2019 through 31 May 2019
ER -