TY - GEN
T1 - Mirage
T2 - 40th IEEE International Conference on Data Engineering, ICDE 2024
AU - Wang, Qingshuai
AU - Li, Hao
AU - Hu, Zirui
AU - Zhang, Rong
AU - Yang, Chengcheng
AU - Cai, Peng
AU - Zhou, Xuan
AU - Zhou, Aoying
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - To optimize query parallelism techniques, substantial workloads are required with specific query plans and a customized output size for each operator (denoted as a cardinality constraint). To this end, a rich body of query-aware database generators (QAGs) has been proposed. However, the complex data dependencies hidden behind queries leave previous QAGs deficient in supporting complex operators and in controlling generation errors. In this paper, we design a new generator, Mirage, that supports complex operators well and achieves low error bounds for cardinality constraints. First, Mirage leverages Query Rewriting and Set Transforming Rules to decouple dependencies between key and non-key columns, so that each can be generated individually. Then, for the non-key columns, Mirage abstracts the cardinality constraints of operators as placement requirements within each column's domain, and models the generation problem as a classic bin packing problem. Finally, for the key columns, Mirage proposes a uniform representation of join cardinality constraints for all types of PK-FK joins and partitions the data according to the matching status between PK and FK columns. It then formulates key population as a Constraint Programming (CP) problem, which can be solved by an existing CP solver. Experiments show that Mirage outperforms all previous work in either operator support or generation error.
AB - To optimize query parallelism techniques, substantial workloads are required with specific query plans and a customized output size for each operator (denoted as a cardinality constraint). To this end, a rich body of query-aware database generators (QAGs) has been proposed. However, the complex data dependencies hidden behind queries leave previous QAGs deficient in supporting complex operators and in controlling generation errors. In this paper, we design a new generator, Mirage, that supports complex operators well and achieves low error bounds for cardinality constraints. First, Mirage leverages Query Rewriting and Set Transforming Rules to decouple dependencies between key and non-key columns, so that each can be generated individually. Then, for the non-key columns, Mirage abstracts the cardinality constraints of operators as placement requirements within each column's domain, and models the generation problem as a classic bin packing problem. Finally, for the key columns, Mirage proposes a uniform representation of join cardinality constraints for all types of PK-FK joins and partitions the data according to the matching status between PK and FK columns. It then formulates key population as a Constraint Programming (CP) problem, which can be solved by an existing CP solver. Experiments show that Mirage outperforms all previous work in either operator support or generation error.
KW - benchmarking
KW - performance evaluation
KW - query optimization
KW - query-aware database generator
UR - https://www.scopus.com/pages/publications/85200452210
U2 - 10.1109/ICDE60146.2024.00306
DO - 10.1109/ICDE60146.2024.00306
M3 - Conference contribution
AN - SCOPUS:85200452210
T3 - Proceedings - International Conference on Data Engineering
SP - 3989
EP - 4001
BT - Proceedings - 2024 IEEE 40th International Conference on Data Engineering, ICDE 2024
PB - IEEE Computer Society
Y2 - 13 May 2024 through 17 May 2024
ER -