SAI: Latency-Aware Satellite Edge LAM Inference with Looped Transformer

  • Honggang Yuan
  • Zixin Wang
  • Yuning Jiang
  • Xin Liu
  • Yuanming Shi
  • Ting Wang*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

The rapid advancements in the computing and communication capabilities of Low Earth Orbit (LEO) satellites have made it feasible to execute complex and collaborative in-orbit computation missions. Transformer-based large AI models (LAMs), known for their exceptional performance in in-context learning (ICL) and prompt-based reasoning, have attracted significant attention, providing powerful intelligence across sectors such as industry and aerospace. However, the large parameter volume of LAMs poses a substantial challenge for direct deployment on satellites with constrained computing power and energy provision. To address this, the looped Transformer model reduces parameter requirements through layer-wise parameter sharing, achieving performance comparable to vanilla Transformer-based LAMs in ICL tasks. Despite this efficiency, the limited and heterogeneous space-borne computing and storage capabilities complicate the orchestration of balanced workload allocation during multi-satellite cooperation. In this paper, we propose SAI, a collaborative multi-satellite space AI system that exploits the memory efficiency of the looped Transformer and the inherent parallelism in batch data processing. SAI enables accelerated on-satellite inference by integrating heterogeneous onboard resources and introducing a novel hybrid approach combining data and pipeline parallelism. This approach supports cross-satellite cooperation with parallelism planning and asynchronous inter-batch overlapping, significantly reducing inference latency and enhancing resource efficiency. Furthermore, SAI optimizes inference latency by formulating it as a shortest-path problem, effectively solved via Dijkstra's algorithm. Extensive evaluations demonstrate SAI's superior performance in reducing inference latency and runtime memory usage compared to existing baselines.

Original language: English
Title of host publication: ICC 2025 - IEEE International Conference on Communications
Editors: Matthew Valenti, David Reed, Melissa Torres
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 2472-2477
Number of pages: 6
ISBN (Electronic): 9798331505219
DOIs
State: Published - 2025
Event: 2025 IEEE International Conference on Communications, ICC 2025 - Montreal, Canada
Duration: 8 Jun 2025 - 12 Jun 2025

Publication series

Name: IEEE International Conference on Communications
ISSN (Print): 1550-3607

Conference

Conference: 2025 IEEE International Conference on Communications, ICC 2025
Country/Territory: Canada
City: Montreal
Period: 8/06/25 - 12/06/25

Keywords

  • In-context Learning
  • Satellite
  • Space Computing
  • Transformer
