A scalable framework for universal data generation in parallel

Ling Gu, Minqi Zhou, Qiangqiang Kang, Aoying Zhou

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

Abstract

Nowadays, more and more companies, such as Amazon and Twitter, are facing the big data problem, which demands higher performance for managing extremely large data sets. Data management systems with new architectures that take full advantage of computer hardware are emerging, with the aim of maximizing system performance and fulfilling customers’ current and even future requirements. Testing the performance and confirming the suitability of a new data management system therefore becomes a primary task for these companies. Hence, generating a scaled data set of the desired volume and at the desired velocity becomes a pressing problem, together with the goal of preserving as many characteristics of the real data set as possible (realism). In this paper, we propose PSUG to generate a realistic database with the required volume and velocity in a scalable, parallel manner. Our extensive experimental studies confirm the efficiency and effectiveness of the proposed method.

Original language: English
Title of host publication: Performance Characterization and Benchmarking
Subtitle of host publication: Traditional to Big Data - 6th TPC Technology Conference, TPCTC 2014, Revised Selected Papers
Editors: Meikel Poess, Raghunath Nambiar
Publisher: Springer Verlag
Pages: 64-81
Number of pages: 18
ISBN (Electronic): 9783319153490
State: Published - 2015
Event: 6th TPC Technology Conference on Performance Evaluation and Benchmarking, TPCTC 2014, held in conjunction with the 40th International Conference on Very Large Data Bases, VLDB 2014 - Hangzhou, China
Duration: 1 Sep 2014 to 5 Sep 2014

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 8904
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 6th TPC Technology Conference on Performance Evaluation and Benchmarking, TPCTC 2014, held in conjunction with the 40th International Conference on Very Large Data Bases, VLDB 2014
Country/Territory: China
City: Hangzhou
Period: 1/09/14 to 5/09/14
