
SylphDB: An Active and Adaptive LSM Engine for Update-Intensive Workloads

  • Jun Peng Zhu
  • Zhiwei Ye
  • Xiaolong He
  • Peng Cai*
  • Xuan Zhou
  • Aoying Zhou
  • Dunbo Cai
  • Ling Qian
  • Kai Xu
  • Liu Tang
  • Qi Liu
  • *Corresponding author for this work
  • East China Normal University
  • PingCAP
  • Ltd.

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; Peer-reviewed

Abstract

Update-intensive workloads are prevalent in contemporary OLTP and AI/ML scenarios. An update operation typically involves deleting the old version of the target record and then inserting a new version. In this work, we demonstrate that an LSM-tree faces two issues when dealing with update-intensive workloads. First, deleted old versions are not garbage collected until they merge with their new versions during compaction, which may lead to wasted space and write amplification. Second, it is common for an update operation to modify only a small fraction of a data record, such as one of a hundred attributes. However, state-of-the-art LSM-trees fail to make effective use of an incremental storage strategy, which stores only the updated fraction rather than the entire new version. In this paper, we propose two techniques, active and fast garbage collection and adaptive incremental updating, to address these two issues, respectively. Active and fast garbage collection probes the distribution of invalid data versions in an LSM-tree and performs garbage collection more promptly. Adaptive incremental updating applies different storage modes to update operations to balance write and read amplification as far as possible. Based on these techniques, we introduce SylphDB, which is built on the RocksDB codebase and optimized for update-intensive workloads. Experimental results demonstrate that, compared to traditional LSM-tree based systems, SylphDB can improve the efficiency of garbage collection by 2× and reduce write amplification by 20%.

Original language: English
Title of host publication: Proceedings - 2025 IEEE 41st International Conference on Data Engineering, ICDE 2025
Publisher: IEEE Computer Society
Pages: 4360-4372
Number of pages: 13
ISBN (Electronic): 9798331536039
DOI
Publication status: Published - 2025
Event: 41st IEEE International Conference on Data Engineering, ICDE 2025 - Hong Kong, China
Duration: 19 May 2025 to 23 May 2025

Publication series

Name: Proceedings - International Conference on Data Engineering
ISSN (Print): 1084-4627
ISSN (Electronic): 2375-0286

Conference

Conference: 41st IEEE International Conference on Data Engineering, ICDE 2025
Country/Territory: China
City: Hong Kong
Period: 19/05/25 to 23/05/25
