Storage and recreation trade-off for multi-version data management

  • Yin Zhang
  • Huiping Liu
  • Cheqing Jin*
  • Ye Guo

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Scopus citations

Abstract

With the rapid development of data acquisition technology, massive volumes of observation data have accumulated in scientific disciplines. Because successive observations differ only slightly, it is critical to use multi-version data management technology to compress the data and minimize both storage and recreation costs. However, existing work in this field optimizes only the total storage and total recreation costs and ignores the recreation cost of individual versions. In this paper, we therefore investigate the trade-off among three metrics: total storage cost, total recreation cost, and the maximum recreation cost of an individual version. We formulate two problems: (1) find a storage plan that lowers the total recreation cost and the individual recreation costs when the total storage is limited; (2) find a storage plan that minimizes the total storage when the total recreation cost and the individual recreation costs are restricted. To solve these problems, we model all versions as a directed graph and devise two efficient algorithms based on spanning trees. A series of experiments indicates that our proposals are effective and efficient in dealing with these problems.
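
The three metrics named in the abstract can be made concrete with a small sketch. This is not the authors' algorithm: it only assumes, as is common in this setting, that a storage plan is a spanning tree of the version graph rooted at a dummy node, and it evaluates a plan against the three metrics. The graph `EDGES`, the cost values, and the function names `evaluate` and `shortest_path_tree` are hypothetical, and storage and recreation costs are set equal on each edge purely for simplicity.

```python
import heapq

# Hypothetical version graph: edge (u, v) -> (storage_cost, recreation_cost).
# Node 0 is a dummy root; an edge 0 -> v means "store version v in full",
# any other edge u -> v means "store v as a delta against u".
EDGES = {
    (0, 1): (100, 100),
    (0, 2): (105, 105),
    (0, 3): (110, 110),
    (1, 2): (10, 10),
    (1, 3): (20, 20),
    (2, 3): (12, 12),
}


def evaluate(plan, edges):
    """Return (total storage, total recreation, max recreation) for a plan.

    `plan` maps each version to its parent in the spanning tree.
    """
    def recreation(v):
        # Recreating v means replaying every edge on the path root -> v.
        cost = 0
        while v != 0:
            cost += edges[(plan[v], v)][1]
            v = plan[v]
        return cost

    storage = sum(edges[(parent, v)][0] for v, parent in plan.items())
    per_version = [recreation(v) for v in plan]
    return storage, sum(per_version), max(per_version)


def shortest_path_tree(edges, root=0):
    """Dijkstra on recreation cost: the plan with the cheapest recreation
    for every version, ignoring storage entirely."""
    adjacency = {}
    for (u, v), (_, rec_cost) in edges.items():
        adjacency.setdefault(u, []).append((v, rec_cost))
    dist, parent = {root: 0}, {}
    heap = [(0, root)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, rec_cost in adjacency.get(u, []):
            if d + rec_cost < dist.get(v, float("inf")):
                dist[v] = d + rec_cost
                parent[v] = u
                heapq.heappush(heap, (d + rec_cost, v))
    return parent


chain_plan = {1: 0, 2: 1, 3: 2}          # store v1 in full, then deltas
full_plan = shortest_path_tree(EDGES)    # every version stored in full here
print(evaluate(chain_plan, EDGES))
print(evaluate(full_plan, EDGES))
```

On this toy graph the delta chain uses far less storage (122 versus 315) but has a higher maximum recreation cost (122 versus 110), which is the kind of trade-off the two formulated problems constrain from opposite sides.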

Original language: English
Title of host publication: Web and Big Data - Second International Joint Conference, APWeb-WAIM 2018, Proceedings
Editors: Yi Cai, Yoshiharu Ishikawa, Jianliang Xu
Publisher: Springer Verlag
Pages: 394-409
Number of pages: 16
ISBN (Print): 9783319968926
State: Published - 2018
Event: 2nd Asia Pacific Web and Web-Age Information Management Joint Conference on Web and Big Data, APWeb-WAIM 2018 - Macau, China
Duration: 23 Jul 2018 → 25 Jul 2018

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 10988 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 2nd Asia Pacific Web and Web-Age Information Management Joint Conference on Web and Big Data, APWeb-WAIM 2018
Country/Territory: China
City: Macau
Period: 23/07/18 → 25/07/18

Keywords

  • Multi-version data management
  • Scientific data management
  • Storage and recreation trade-off
