TY - JOUR
T1 - SAFE
T2 - A source deduplication framework for efficient cloud backup services
AU - Tan, Yujuan
AU - Jiang, Hong
AU - Sha, Edwin Hsing Mean
AU - Yan, Zhichao
AU - Feng, Dan
PY - 2013/9
Y1 - 2013/9
N2 - Due to the relatively low bandwidth of WAN that supports cloud backup services and the increasing amount of backed-up data stored at service providers, the deduplication scheme used in the cloud backup environment must remove the redundant data for backup operations to reduce backup times and storage costs and for restore operations to reduce restore times. In this paper, we propose SAFE, a source deduplication framework for efficient cloud backup and restore operations. SAFE consists of three salient features, (1) Hybrid Deduplication, combining the global file-level and local chunk-level deduplication to achieve an optimal tradeoff between the deduplication efficiency and overhead to achieve a short backup time; (2) Semantic-aware Elimination, exploiting file semantics to narrow the search space for the redundant data in hybrid deduplication process to reduce the deduplication overhead; and (3) Unmodified Data Removal, removing the files and data chunks that are kept intact from data transmission for some restore operations. Through extensive experiments driven by real-world datasets, the SAFE framework is shown to maintain a much higher deduplication efficiency/overhead ratio than existing solutions, shortening the backup time by an average of 38.7 %, and reduce the restore time by a ratio of up to 9.7: 1.
AB - Due to the relatively low bandwidth of WAN that supports cloud backup services and the increasing amount of backed-up data stored at service providers, the deduplication scheme used in the cloud backup environment must remove the redundant data for backup operations to reduce backup times and storage costs and for restore operations to reduce restore times. In this paper, we propose SAFE, a source deduplication framework for efficient cloud backup and restore operations. SAFE consists of three salient features, (1) Hybrid Deduplication, combining the global file-level and local chunk-level deduplication to achieve an optimal tradeoff between the deduplication efficiency and overhead to achieve a short backup time; (2) Semantic-aware Elimination, exploiting file semantics to narrow the search space for the redundant data in hybrid deduplication process to reduce the deduplication overhead; and (3) Unmodified Data Removal, removing the files and data chunks that are kept intact from data transmission for some restore operations. Through extensive experiments driven by real-world datasets, the SAFE framework is shown to maintain a much higher deduplication efficiency/overhead ratio than existing solutions, shortening the backup time by an average of 38.7 %, and reduce the restore time by a ratio of up to 9.7: 1.
KW - Cloud backup services
KW - Data backup
KW - Data deduplication
KW - Data restore
UR - https://www.scopus.com/pages/publications/84880325208
U2 - 10.1007/s11265-013-0775-x
DO - 10.1007/s11265-013-0775-x
M3 - 文章
AN - SCOPUS:84880325208
SN - 1939-8018
VL - 72
SP - 209
EP - 228
JO - Journal of Signal Processing Systems
JF - Journal of Signal Processing Systems
IS - 3
ER -