Ace-Sniper: Cloud-Edge Collaborative Scheduling Framework with DNN Inference Latency Modeling on Heterogeneous Devices

Weihong Liu, Jiawei Geng, Zongwei Zhu*, Yang Zhao, Cheng Ji, Changlong Li, Zirui Lian, Xuehai Zhou

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

18 Scopus citations

Abstract

Cloud-edge collaborative inference requires efficient scheduling of artificial intelligence (AI) tasks to the appropriate edge intelligence devices. DNN inference latency has become a vital basis for improving scheduling efficiency. However, edge devices are highly heterogeneous due to differences in hardware architectures, computing power, etc. Meanwhile, the diverse deep neural networks (DNNs) continue to iterate over time. The diversity of devices and DNNs introduces high computational costs for measurement-based methods, while invasive prediction methods face significant development effort and application limitations. In this article, we propose and develop Ace-Sniper, a scheduling framework with DNN inference latency modeling on heterogeneous devices. First, to address device heterogeneity, a unified hardware resource modeling (HRM) is designed by treating the platforms as black-box functions that output feature vectors. Second, neural network similarity (NNS) is introduced for feature extraction of diverse and frequently iterated DNNs. Finally, with the results of HRM and NNS as input, a performance characterization network is designed to predict the latencies of given unseen DNNs on heterogeneous devices, which can be combined with most time-based scheduling algorithms. Experimental results show that the average relative error of DNN inference latency prediction is 11.11%, and the prediction accuracy reaches 93.2%. Compared with non-time-aware scheduling methods, the average waiting time for tasks is reduced by 82.95%, and platform throughput is improved by 63% on average.
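The abstract describes a pipeline: HRM and NNS each yield a feature vector, a characterization network maps the pair to a predicted latency, and that prediction plugs into a time-based scheduler. As a rough illustration only, the sketch below uses a hypothetical dot-product surrogate in place of the paper's trained characterization network, and a greedy earliest-estimated-completion-time policy as one example of a time-based scheduling algorithm; all names and feature encodings here are assumptions, not the authors' implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    """An edge device described by an HRM-style feature vector (assumed encoding)."""
    name: str
    features: list          # black-box hardware feature vector
    busy_until: float = 0.0 # when the device finishes its current queue


def predict_latency(device_feats, dnn_feats):
    # Stand-in for the performance characterization network:
    # a simple dot product, NOT the paper's trained model.
    return sum(d * m for d, m in zip(device_feats, dnn_feats))


def schedule(tasks, devices, now=0.0):
    """Greedily assign each DNN task (an NNS-style feature vector)
    to the device with the earliest estimated completion time."""
    plan = []
    for dnn_feats in tasks:
        best = min(
            devices,
            key=lambda dev: max(dev.busy_until, now)
                            + predict_latency(dev.features, dnn_feats),
        )
        start = max(best.busy_until, now)
        finish = start + predict_latency(best.features, dnn_feats)
        best.busy_until = finish
        plan.append((best.name, start, finish))
    return plan


devices = [Device("edge-a", [1.0, 2.0]), Device("edge-b", [2.0, 1.0])]
tasks = [[1.0, 0.0], [0.0, 1.0]]  # hypothetical DNN feature vectors
plan = schedule(tasks, devices)
```

Because assignment uses predicted completion time rather than a fixed round-robin order, each task lands on the device where its estimated latency (plus queueing delay) is lowest, which is the mechanism behind the reduced waiting time the abstract reports.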

Original language: English
Pages (from-to): 534-547
Number of pages: 14
Journal: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Volume: 43
Issue number: 2
State: Published - 1 Feb 2024

Keywords

  • Cloud-edge collaborative
  • hardware resource modeling (HRM)
  • heterogeneous platform
  • inference latency modeling

