TY - GEN
T1 - BBS
T2 - 41st IEEE International Conference on Data Engineering, ICDE 2025
AU - Peng, Xiaoshuang
AU - Fan, Xiaopeng
AU - Cheng, Shi
AU - Meng, Lingbin
AU - Fu, Cuiyun
AU - Zhou, Wenchao
AU - Weng, Chuliang
N1 - Publisher Copyright: © 2025 IEEE.
PY - 2025
Y1 - 2025
AB - Many cloud databases provide fine-grained regular snapshots, sparsely delete snapshots based on importance, and dynamically maintain large numbers of snapshots to ensure data security and mine the value of cold data. However, in existing snapshot technologies, the write amplification of Copy-on-Write (CoW) introduces additional expensive I/O operations in a cloud environment. In Redirect-on-Write (RoW), the modified data blocks are scattered among the snapshots, creating dependencies between snapshots that seriously degrade recovery performance. In this paper, we observe that access to snapshots exhibits locality and continuity. We therefore propose an efficient Batch-Based Snapshot index, called BBS, which batches snapshot indexes according to the database workload and the access behavior of snapshots. Specifically, we use two key techniques, Shared-Subtrees Indexing and Batch-Based Dividing, to split the dependencies of the snapshot index. The snapshot index dependency chain is divided into batches, with no dependencies between snapshot indexes in different batches. Within a batch, snapshot indexes reduce memory overhead by sharing subtrees. The index can locate data blocks directly instead of through iterative traversal. In addition, the snapshot index deletion method is designed to fit the sparse snapshot deletion model. We have implemented a working system in Ceph. Evaluation results on datasets demonstrate that, compared with existing techniques, BBS effectively balances index memory overhead against recovery time.
KW - Block device
KW - Index
KW - Snapshot recovery
UR - https://www.scopus.com/pages/publications/105015507688
U2 - 10.1109/ICDE65448.2025.00317
DO - 10.1109/ICDE65448.2025.00317
M3 - Conference contribution
AN - SCOPUS:105015507688
T3 - Proceedings - International Conference on Data Engineering
SP - 4248
EP - 4261
BT - Proceedings - 2025 IEEE 41st International Conference on Data Engineering, ICDE 2025
PB - IEEE Computer Society
Y2 - 19 May 2025 through 23 May 2025
ER -