TY - GEN
T1 - Parallelizing multiple pipelines of one query in a main memory database cluster
AU - Fang, Zhuhe
AU - Weng, Chuliang
AU - Wang, Li
AU - Zhou, Aoying
N1 - Publisher Copyright:
© 2018 IEEE.
PY - 2018/10/24
Y1 - 2018/10/24
N2 - To fully use the advanced resources of a main memory database cluster, we take independent parallelism into account to parallelize multiple pipelines of one query. However, scheduling resources to multiple pipelines is an intractable problem. Traditional static approaches to this problem may lead to a serious waste of resources and suboptimal execution order of pipelines, because it is hard to predict the actual data distribution and fluctuating workloads at compile time. In response, we propose a dynamic scheduling algorithm, List with Filling and Preemption (LFPS), based on two techniques. (1) Adaptive filling improves resource utilization by issuing more extra pipelines to adaptively fill idle resource 'holes' during execution. (2) Cost-based preemption strictly guarantees scheduling the pipelines on a critical path first at run time. We implement LFPS in our prototype database system. Under the workloads of TPC-H, experiments show our work improves the finish time of parallelizable pipelines from one query up to 2.3X than a static approach and 1.7X than a serialized execution.
AB - To fully use the advanced resources of a main memory database cluster, we take independent parallelism into account to parallelize multiple pipelines of one query. However, scheduling resources to multiple pipelines is an intractable problem. Traditional static approaches to this problem may lead to a serious waste of resources and suboptimal execution order of pipelines, because it is hard to predict the actual data distribution and fluctuating workloads at compile time. In response, we propose a dynamic scheduling algorithm, List with Filling and Preemption (LFPS), based on two techniques. (1) Adaptive filling improves resource utilization by issuing more extra pipelines to adaptively fill idle resource 'holes' during execution. (2) Cost-based preemption strictly guarantees scheduling the pipelines on a critical path first at run time. We implement LFPS in our prototype database system. Under the workloads of TPC-H, experiments show our work improves the finish time of parallelizable pipelines from one query up to 2.3X than a static approach and 1.7X than a serialized execution.
KW - Adaptive Filling
KW - Independent Parallelism
KW - Multiple Pipelines
KW - Preemption
UR - https://www.scopus.com/pages/publications/85057106839
U2 - 10.1109/ICDE.2018.00123
DO - 10.1109/ICDE.2018.00123
M3 - 会议稿件
AN - SCOPUS:85057106839
T3 - Proceedings - IEEE 34th International Conference on Data Engineering, ICDE 2018
SP - 1256
EP - 1259
BT - Proceedings - IEEE 34th International Conference on Data Engineering, ICDE 2018
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 34th IEEE International Conference on Data Engineering, ICDE 2018
Y2 - 16 April 2018 through 19 April 2018
ER -