AOBO: A Fast-Switching Online Binary Optimizer on AArch64

Research output: Contribution to journal › Article › peer-review

1 Scopus citation

Abstract

As the complexity of real-world server applications continues to grow, performance optimization for large-scale applications is becoming increasingly challenging. The success of the online optimization offered by OCOLOS and Dynimize shows that binary rewriting based on edge profiling data can significantly accelerate these applications. However, no similar online binary optimizer is currently available on the AArch64 platform. In response to the growing adoption of AArch64, this article introduces AOBO, a fast-switching online binary optimizer designed specifically for AArch64. In addition to providing practical and efficient engineering support for AArch64-specific features, AOBO overcomes the lack of hardware counters for edge profiling on most commercially available AArch64 servers. In particular, AOBO uses a novel edge weight estimation scheme to deliver more accurate edge weight estimates, which in turn allow AOBO's binary rewriter to generate more efficient code. Furthermore, AOBO's online code replacement stage is optimized to complete at a subsecond level, enabling a fast switch from running the original binary to running the optimized one. We evaluate AOBO with CINT2017, GCC, MySQL, and MongoDB, measuring the accuracy and coverage of the estimated edge weights, the performance improvements of the optimized binaries, and the online optimization cost. For a fair comparison, we use as the baseline the performance of binaries generated by the default compilation scripts in the software packages. Experimental data show that AOBO offers more accurate edge weight estimation and generates binaries with superior performance. Furthermore, AOBO achieves online optimization with very small overhead and significantly improves the performance of large-scale applications. Compared with the baselines, AOBO's online optimization achieves performance improvements of 24.7% and 31.11% for MySQL and MongoDB, respectively. Notably, application pause time is reduced from 1,599.8 milliseconds to 462.1 milliseconds for MySQL, and from 1,765.9 milliseconds to 507.1 milliseconds for MongoDB.

Original language: English
Article number: 82
Journal: ACM Transactions on Architecture and Code Optimization
Volume: 22
Issue number: 2
State: Published - 1 Jul 2025

Keywords

  • AArch64 instruction set architecture
  • Post-link optimization
  • code layout optimization
  • online code replacement
