
Neos: A NVMe-GPUs Direct Vector Service Buffer in User Space

  • Yuchen Huang
  • Xiaopeng Fan
  • Song Yan
  • Chuliang Weng*
  • *Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

With the development of AI-generated content and LLMs (Large Language Models), the demand for vector management has brought prosperity to vector databases. However, the fact that vectors cannot be retrieved before being indexed harms the timeliness of vector databases, while updating indexes immediately when new vectors are added reduces storage throughput. Because of this contradiction, a vector service that relies solely on a vector database cannot offer both real-time searches and high-throughput storage when facing streaming data. This paper proposes a vector buffer engine, Neos. It is designed for real-time unindexed-vector searches on streaming input and for buffering vectors with high throughput before loading them into vector databases. On one hand, we build a lightweight storage layer on a raw NVMe device and decouple throughput from indexes to maximize storage performance. On the other hand, we realize a direct NVMe-GPUs I/O stack and a CPU-GPU heterogeneous task architecture for low-latency unindexed-vector searches on streaming data. Experiments show that our approach achieves 1.5x to 3.4x the bandwidth and as low as 20% of the latency of existing I/O stacks, and up to orders-of-magnitude higher vector storage throughput under concurrent R/W workloads. Further, Neos can handle real-time unindexed-vector searches with millisecond-level latency on streaming input, a capability that current vector systems lack.

Original language: English
Title of host publication: Proceedings - 2024 IEEE 40th International Conference on Data Engineering, ICDE 2024
Publisher: IEEE Computer Society
Pages: 3767-3781
Number of pages: 15
ISBN (Electronic): 9798350317152
DOI
Publication status: Published - 2024
Event: 40th IEEE International Conference on Data Engineering, ICDE 2024 - Utrecht, Netherlands
Duration: 13 May 2024 to 17 May 2024

Publication series

Name: Proceedings - International Conference on Data Engineering
ISSN (Print): 1084-4627
ISSN (Electronic): 2375-0286

Conference

Conference: 40th IEEE International Conference on Data Engineering, ICDE 2024
Country/Territory: Netherlands
City: Utrecht
Period: 13/05/24 to 17/05/24
