TY - GEN
T1 - Design of a more scalable database system
AU - Zhuang, Hang
AU - Lu, Kun
AU - Li, Changlong
AU - Sun, Mingming
AU - Chen, Hang
AU - Zhou, Xuehai
N1 - Publisher Copyright: © 2015 IEEE.
PY - 2015/7/7
Y1 - 2015/7/7
N2 - With the development of cloud computing and the internet, revenue from e-Commerce, e-Business and the corporate world is increasing at a high rate. These areas require scalable and consistent databases. NoSQL databases such as HBase have been proven to offer scalability and good performance on cloud computing platforms. However, some special data inevitably grows little but is accessed frequently, which leads to hotspot data and an unbalanced access distribution across data storage servers. Due to their properties, these data often cannot be stored in multiple tables. Some storage nodes become the bottleneck of the distributed storage system; it therefore becomes difficult to improve performance by increasing the number of nodes, which severely limits the scalability of the storage system. To make the performance of the cluster increase with the size of the cluster, we devise a new distributed database storage framework that addresses these issues by changing the storage and read-write mode of the hotspot data. This structure guarantees that hotspot data will not aggregate on the same storage node and that the data on any single storage node is not too hot. We implement the scalable database on Apache HBase; it achieves almost double the throughput under heavy read-write pressure with only double the reading substitutes. Moreover, heavily loaded nodes caused by hotspot data no longer appear in the new distributed database.
AB - With the development of cloud computing and the internet, revenue from e-Commerce, e-Business and the corporate world is increasing at a high rate. These areas require scalable and consistent databases. NoSQL databases such as HBase have been proven to offer scalability and good performance on cloud computing platforms. However, some special data inevitably grows little but is accessed frequently, which leads to hotspot data and an unbalanced access distribution across data storage servers. Due to their properties, these data often cannot be stored in multiple tables. Some storage nodes become the bottleneck of the distributed storage system; it therefore becomes difficult to improve performance by increasing the number of nodes, which severely limits the scalability of the storage system. To make the performance of the cluster increase with the size of the cluster, we devise a new distributed database storage framework that addresses these issues by changing the storage and read-write mode of the hotspot data. This structure guarantees that hotspot data will not aggregate on the same storage node and that the data on any single storage node is not too hot. We implement the scalable database on Apache HBase; it achieves almost double the throughput under heavy read-write pressure with only double the reading substitutes. Moreover, heavily loaded nodes caused by hotspot data no longer appear in the new distributed database.
KW - HBase
KW - Hotspot
KW - Scalability
KW - Storage balance
KW - Throughput
UR - https://www.scopus.com/pages/publications/84941253900
U2 - 10.1109/CCGrid.2015.70
DO - 10.1109/CCGrid.2015.70
M3 - Conference contribution
AN - SCOPUS:84941253900
T3 - Proceedings - 2015 IEEE/ACM 15th International Symposium on Cluster, Cloud, and Grid Computing, CCGrid 2015
SP - 1213
EP - 1216
BT - Proceedings - 2015 IEEE/ACM 15th International Symposium on Cluster, Cloud, and Grid Computing, CCGrid 2015
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 15th IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing, CCGrid 2015
Y2 - 4 May 2015 through 7 May 2015
ER -