TY - GEN
T1 - IMBridge
T2 - 2024 International Conference on Management of Data, SIGMOD 2024
AU - Zhang, Chenyang
AU - Peng, Junxiong
AU - Xu, Chen
AU - Xu, Quanqing
AU - Yang, Chuanhui
N1 - Publisher Copyright: © 2024 ACM.
PY - 2024/6/9
Y1 - 2024/6/9
N2 - Prediction queries that apply machine learning (ML) models to perform analysis on data stored in the database are prevalent with the advance of research. Thanks to the prosperity of ML frameworks in Python, current database systems introduce Python UDFs into query engines for inference invocation. However, there are impedance mismatches between database engines and prediction query execution with this approach. In particular, the database engine is oblivious to the semantics within prediction functions, which incurs the repetitive inference context setup. Moreover, the evaluation of the prediction function is coupled with the operator, which results in an undesirable inference batch size with low inference throughput. To mitigate these, we propose a system called IMBridge, which leverages a prediction function rewriter to eliminate redundant inference context setup and introduces a decoupled prediction operator to ensure that the evaluation batch size matches the desirable inference batch size. In this demonstration, we will showcase how IMBridge addresses these mismatches and boosts prediction query execution.
AB - Prediction queries that apply machine learning (ML) models to perform analysis on data stored in the database are prevalent with the advance of research. Thanks to the prosperity of ML frameworks in Python, current database systems introduce Python UDFs into query engines for inference invocation. However, there are impedance mismatches between database engines and prediction query execution with this approach. In particular, the database engine is oblivious to the semantics within prediction functions, which incurs the repetitive inference context setup. Moreover, the evaluation of the prediction function is coupled with the operator, which results in an undesirable inference batch size with low inference throughput. To mitigate these, we propose a system called IMBridge, which leverages a prediction function rewriter to eliminate redundant inference context setup and introduces a decoupled prediction operator to ensure that the evaluation batch size matches the desirable inference batch size. In this demonstration, we will showcase how IMBridge addresses these mismatches and boosts prediction query execution.
KW - machine learning prediction query
KW - query optimization
UR - https://www.scopus.com/pages/publications/85196378375
U2 - 10.1145/3626246.3654754
DO - 10.1145/3626246.3654754
M3 - Conference contribution
AN - SCOPUS:85196378375
T3 - Proceedings of the ACM SIGMOD International Conference on Management of Data
SP - 456
EP - 459
BT - SIGMOD-Companion 2024 - Companion of the 2024 International Conference on Management of Data
PB - Association for Computing Machinery
Y2 - 9 June 2024 through 15 June 2024
ER -