
SAI: Latency-Aware Satellite Edge LAM Inference with Looped Transformer

  • Honggang Yuan
  • Zixin Wang
  • Yuning Jiang
  • Xin Liu
  • Yuanming Shi
  • Ting Wang*
  • *Corresponding author for this work
  • East China Normal University
  • Hong Kong University of Science and Technology
  • Automatic Control Laboratory
  • ShanghaiTech University

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

Abstract

The rapid advancements in computing and communication capabilities of Low Earth Orbit (LEO) satellites have made it feasible to execute complex and collaborative in-orbit computation missions. Transformer-based large AI models (LAMs), known for their exceptional performance in in-context learning (ICL) and prompt-based reasoning, have attracted significant attention, providing powerful intelligence across sectors such as industry and aerospace. However, the significant parameter volume of LAMs poses a substantial challenge for direct deployment on satellites with constrained computing power and energy provision. To address this, the looped Transformer model reduces parameter requirements through layer-wise parameter sharing, achieving performance comparable to vanilla Transformer-based LAMs in ICL tasks. Despite this efficiency, the limited and heterogeneous space-borne computing and storage capabilities complicate the orchestration for balanced workload allocation during multi-satellite cooperation. In this paper, we propose SAI, a collaborative multi-satellite space AI system that exploits the memory efficiency of the looped Transformer and the inherent parallelism in batch data processing. SAI enables accelerated on-satellite inference by integrating heterogeneous onboard resources and introducing a novel hybrid approach combining data and pipeline parallelism. This approach supports cross-satellite cooperation with parallelism planning and asynchronous inter-batch overlapping, significantly reducing inference latency and enhancing resource efficiency. Furthermore, SAI optimizes inference latency by formulating it as a shortest-path problem, effectively solved via Dijkstra's algorithm. Extensive evaluations demonstrate SAI's superior performance in reducing inference latency and runtime memory usage compared to existing baselines.
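As a rough illustration of how a latency plan can be cast as a shortest-path problem, the sketch below assigns the iterations of a looped Transformer to a chain of satellites and solves the resulting graph with Dijkstra's algorithm. The graph construction, the function name plan_pipeline, and the parameters per_loop_time and link_time are assumptions made for this example; the abstract does not describe SAI's actual formulation, and this toy model covers only the sequential latency of a single batch, not data parallelism or inter-batch overlapping.

```python
import heapq


def plan_pipeline(num_loops, per_loop_time, link_time):
    """Toy planner (hypothetical, not the paper's formulation).

    A node (done, sat) means `done` loop iterations are finished and
    satellite `sat` is next in the chain. An edge lets satellite `sat`
    run n >= 0 iterations; if work remains afterwards, the activations
    are forwarded over link `sat` at cost link_time[sat].
    Returns (total latency, iterations assigned to each satellite).
    """
    num_sats = len(per_loop_time)
    source = (0, 0)
    dist = {source: 0.0}
    prev = {}  # node -> (predecessor node, iterations run on that edge)
    heap = [(0.0, source)]

    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        done, sat = node
        if done == num_loops:
            # First finished node popped is optimal; rebuild the allocation.
            alloc = [0] * num_sats
            while node in prev:
                parent, n = prev[node]
                alloc[parent[1]] = n
                node = parent
            return d, alloc
        for n in range(num_loops - done + 1):
            finished = done + n == num_loops
            if not finished and sat == num_sats - 1:
                continue  # last satellite has nobody to hand off to
            cost = n * per_loop_time[sat]
            if not finished:
                cost += link_time[sat]
            nxt = (done + n, sat + 1)
            if d + cost < dist.get(nxt, float("inf")):
                dist[nxt] = d + cost
                prev[nxt] = (node, n)
                heapq.heappush(heap, (d + cost, nxt))
    raise ValueError("no feasible plan")


if __name__ == "__main__":
    # Two satellites: a slow one followed by a fast one, joined by one link.
    print(plan_pipeline(4, [3.0, 1.0], [2.0]))
```

All edge weights are non-negative, which is the condition Dijkstra's algorithm needs; a realistic model would also have to encode memory limits and overlapping batches in the graph.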

Original language: English
Title of host publication: ICC 2025 - IEEE International Conference on Communications
Editors: Matthew Valenti, David Reed, Melissa Torres
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 2472-2477
Number of pages: 6
ISBN (Electronic): 9798331505219
DOI
Publication status: Published - 2025
Event: 2025 IEEE International Conference on Communications, ICC 2025 - Montreal, Canada
Duration: 8 Jun 2025 to 12 Jun 2025

Publication series

Name: IEEE International Conference on Communications
ISSN (Print): 1550-3607

Conference

Conference: 2025 IEEE International Conference on Communications, ICC 2025
Country/Territory: Canada
City: Montreal
Period: 8/06/25 to 12/06/25

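UN Sustainable Development Goals

This output contributes to the following Sustainable Development Goals:

  1. Sustainable Development Goal 8 - Decent Work and Economic Growth
  2. Sustainable Development Goal 12 - Responsible Consumption and Production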
