TY - GEN
T1 - A scalable framework for universal data generation in parallel
AU - Gu, Ling
AU - Zhou, Minqi
AU - Kang, Qiangqiang
AU - Zhou, Aoying
N1 - Publisher Copyright: © 2015 Springer International Publishing Switzerland.
PY - 2015
Y1 - 2015
AB - Nowadays, more and more companies, such as Amazon and Twitter, are facing the big data problem, which requires higher performance to manage tremendously large data sets. Data management systems with new architectures that take full advantage of computer hardware are emerging, with the aim of maximizing system performance and fulfilling customers' current and even future requirements. Testing the performance and confirming the suitability of a new data management system has become a primary task for these companies. Hence, effectively generating a scaled data set with the desired volume and at the desired velocity has become an imperative problem, together with the goal of keeping as many characteristics of the real data set as possible (realism). In this paper, we propose PSUG to generate a realistic database in terms of required volume and velocity in a scalable, parallel manner. Our extensive experimental studies confirm the efficiency and effectiveness of the proposed method.
UR - https://www.scopus.com/pages/publications/84922359088
U2 - 10.1007/978-3-319-15350-6_5
DO - 10.1007/978-3-319-15350-6_5
M3 - Conference contribution
AN - SCOPUS:84922359088
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 64
EP - 81
BT - Performance Characterization and Benchmarking
A2 - Poess, Meikel
A2 - Nambiar, Raghunath
PB - Springer Verlag
T2 - 6th TPC Technology Conference on Performance Evaluation and Benchmarking, TPCTC 2014 held in conjunction with 40th International Conference on Very Large Data Bases, VLDB 2014
Y2 - 1 September 2014 through 5 September 2014
ER -