Abstract
Objectives: Real-time spatiotemporal data are being acquired at a rapidly growing rate for applications such as smart cities and real-time air-quality monitoring, and traditional database technologies cannot meet the resulting demands for large-scale data indexing, querying, and storage. NoSQL databases, which are scalable and offer fast input/output, are a viable alternative.
Methods: We propose an approach based on the Hilbert curve and Cassandra for efficiently indexing and storing large-scale spatiotemporal datasets, with the aim of providing an effective framework for processing, querying, and analyzing large amounts of data with spatial and temporal features. Vehicle trajectory datasets, for example, contain valuable spatial and temporal features that are used in real-world applications. The collected spatiotemporal datasets are preprocessed to fit the proposed structures for different applications. Two types of query are common in practice: the spatiotemporal range query and the query by vehicle ID. Two corresponding indexing structures are designed and implemented to serve them. The S2 Geometry Library, open-sourced by Google, is used to divide the Earth's surface into grids, and data points falling within the grids are assigned specific IDs that serve as keys. The keys and columns are designed, using the Hilbert curve and Cassandra, so that spatially neighboring data points are physically stored close to each other, which makes the resulting structures better suited to large-scale spatiotemporal querying and analysis.
Results: Datasets acquired from real applications are used in computational experiments to validate the efficiency of the proposed approach. Query efficiency and the time taken to store large amounts of spatiotemporal data are measured and benchmarked against several existing database technologies.
Conclusions: The computational experiments show that the proposed approach outperforms the existing methods: the time required to store (insert) data is reduced by a factor of 6, and the time needed to query data is reduced by a factor of at least 10. The efficiency of the approach is further validated by applying it to queries on the trajectories of vehicles that gather real-time air-quality data.
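The key idea in the Methods, that a Hilbert-curve key keeps spatially neighboring points close together in storage, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the paper uses the S2 Geometry Library, whereas this sketch uses a plain latitude/longitude grid and the textbook Hilbert index; the names `hilbert_index`, `cell_key`, and `EXAMPLE_CQL`, and the table layout, are hypothetical and not taken from the paper.

```python
def hilbert_index(n: int, x: int, y: int) -> int:
    """Map cell (x, y) of an n-by-n grid (n a power of two) to its
    position along the Hilbert curve (textbook iterative algorithm)."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve is oriented correctly.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d


def cell_key(lat: float, lon: float, order: int = 16) -> int:
    """Hypothetical key: bin (lat, lon) into a 2^order x 2^order grid
    and return the Hilbert index of the cell. A simplification; S2
    projects the sphere onto cube faces instead of using a flat grid."""
    n = 1 << order
    x = min(n - 1, int((lon + 180.0) / 360.0 * n))
    y = min(n - 1, int((lat + 90.0) / 180.0 * n))
    return hilbert_index(n, x, y)


# Hypothetical Cassandra table for range queries (not the paper's schema):
# the cell key is the partition key, and rows within a partition are
# clustered by time, so a range query scans a few contiguous key ranges.
EXAMPLE_CQL = """
CREATE TABLE IF NOT EXISTS points_by_cell (
    cell_key   bigint,
    ts         timestamp,
    vehicle_id text,
    lat        double,
    lon        double,
    PRIMARY KEY ((cell_key), ts, vehicle_id)
);
"""
```

Under these assumptions, two points a few metres apart fall in the same cell at a moderate grid order and therefore share a key, while cells that are consecutive along the curve are always edge-adjacent on the grid.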
| Translated title of the contribution | Hilbert Curve and Cassandra Based Indexing and Storing Approach for Large-Scale Spatiotemporal Data |
|---|---|
| Original language | Traditional Chinese |
| Pages (from-to) | 620-629 |
| Number of pages | 10 |
| Journal | Wuhan Daxue Xuebao (Xinxi Kexue Ban)/Geomatics and Information Science of Wuhan University |
| Volume | 46 |
| Issue | 5 |
| DOI | |
| Publication status | Published - 5 May 2021 |
UN Sustainable Development Goals
This output contributes to the following Sustainable Development Goals:
- SDG 11: Sustainable Cities and Communities
Keywords
- Cassandra
- Database keys
- Distributed storage
- Spatial encoding
- Spatiotemporal data
- Vehicle trajectory