TY - GEN
T1 - Workload-Aware Log-Structured Merge Key-Value Store for NVM-SSD Hybrid Storage
AU - Chen, Lixiang
AU - Chen, Ruihao
AU - Yang, Chengcheng
AU - Han, Yuxing
AU - Zhang, Rong
AU - Zhou, Xuan
AU - Jin, Peiquan
AU - Qian, Weining
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2023
Y1 - 2023
AB - The log-structured merge tree (LSM-tree) has been widely adopted as a backbone of modern key-value stores. However, the multiple exponentially increasing levels of the LSM-tree make it suffer from high write amplification. Existing studies often improve write performance by sacrificing read performance, which is an inefficient way to trade off update and search efficiency. In this paper, we exploit nonvolatile memory (NVM) to address the write amplification issue for systems with NVM-SSD hybrid storage, and further propose a reinforcement learning method to navigate between update and search efficiency under varying workloads. Specifically, we first propose a lightweight hot data identification method to efficiently capture access recency as well as frequency in NVM with relatively large capacity. On this basis, we can eliminate different versions of frequently updated data in high-performance NVM without pushing them to SSD. To improve data access locality and facilitate fine-grained index tuning in each level, we devise a virtual-split method to partition the key space gradually without extra write amplification. Finally, we propose a cost-based Q-learning algorithm to adaptively tune the data organization of each partition according to the changing access patterns. Experimental results show that our approach outperforms existing methods by up to 2.67×.
UR - https://www.scopus.com/pages/publications/85167728164
U2 - 10.1109/ICDE55515.2023.00171
DO - 10.1109/ICDE55515.2023.00171
M3 - Conference contribution
AN - SCOPUS:85167728164
T3 - Proceedings - International Conference on Data Engineering
SP - 2207
EP - 2219
BT - Proceedings - 2023 IEEE 39th International Conference on Data Engineering, ICDE 2023
PB - IEEE Computer Society
T2 - 39th IEEE International Conference on Data Engineering, ICDE 2023
Y2 - 3 April 2023 through 7 April 2023
ER -