TY - JOUR
T1 - Over-the-Air Computation Empowered Vertically Split Inference
AU - Yang, Peng
AU - Wen, Dingzhu
AU - Zeng, Qunsong
AU - Zhou, Yong
AU - Wang, Ting
AU - Cai, Haibin
AU - Shi, Yuanming
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - To tackle the issue of heterogeneous raw input data samples obtained by different devices and to enhance the feature extraction capability of edge devices, we propose an edge-device collaborative artificial intelligence (AI) inference framework based on a vertically split neural network. The local results calculated by lightweight sub-networks at the edge devices are transmitted to and aggregated at the server for the downstream inference task. However, transmitting such high-dimensional local results incurs severe communication overhead. To resolve this issue, over-the-air computation (AirComp) is adopted to enable low-latency aggregation. The same entry of all devices’ local results is transmitted over the same wireless resource block and aggregated via the waveform superposition property. Furthermore, to simultaneously support the aggregation of all dimensions of the local results, we consider a broadband channel and leverage orthogonal frequency division multiplexing (OFDM) to divide the system bandwidth into multiple subcarriers, which are then assigned to different dimensions. This introduces an extra degree of freedom in designing the aggregation of all dimensions. We then propose a scheme of joint subcarrier allocation, power allocation, and receiver beamforming to minimize the aggregation distortion and enhance inference performance. Extensive experiments are conducted to verify the superiority of the proposed design over benchmarks.
AB - To tackle the issue of heterogeneous raw input data samples obtained by different devices and to enhance the feature extraction capability of edge devices, we propose an edge-device collaborative artificial intelligence (AI) inference framework based on a vertically split neural network. The local results calculated by lightweight sub-networks at the edge devices are transmitted to and aggregated at the server for the downstream inference task. However, transmitting such high-dimensional local results incurs severe communication overhead. To resolve this issue, over-the-air computation (AirComp) is adopted to enable low-latency aggregation. The same entry of all devices’ local results is transmitted over the same wireless resource block and aggregated via the waveform superposition property. Furthermore, to simultaneously support the aggregation of all dimensions of the local results, we consider a broadband channel and leverage orthogonal frequency division multiplexing (OFDM) to divide the system bandwidth into multiple subcarriers, which are then assigned to different dimensions. This introduces an extra degree of freedom in designing the aggregation of all dimensions. We then propose a scheme of joint subcarrier allocation, power allocation, and receiver beamforming to minimize the aggregation distortion and enhance inference performance. Extensive experiments are conducted to verify the superiority of the proposed design over benchmarks.
KW - Split inference
KW - neural networks
KW - over-the-air computation
UR - https://www.scopus.com/pages/publications/85209109095
U2 - 10.1109/TWC.2024.3485678
DO - 10.1109/TWC.2024.3485678
M3 - Article
AN - SCOPUS:85209109095
SN - 1536-1276
VL - 23
SP - 19634
EP - 19648
JO - IEEE Transactions on Wireless Communications
JF - IEEE Transactions on Wireless Communications
IS - 12
ER -