MuDP: multi-granularity data placement for uniform loops on SPM-DRAM architectures to minimize latency

Research output: Contribution to journal › Article › peer-review

Abstract

Scratch-pad memory (SPM) is widely used in embedded systems because it allows software-controlled data placement. Well-designed data placement strategies can find solutions with minimal memory access latency for loops on SPM-DRAM architectures. Although existing works effectively reduce latency with fine-grained data placement methods, they cannot handle non-consecutive array accesses. Moreover, a fine-grained strategy can incur excessive memory activation overhead, which makes it less efficient. In this paper, we first propose a fine-grained dynamic programming algorithm, called FiDP, to handle this unsolved case and minimize latency. To mitigate the frequent activations before data accesses, we then add a medium-grained scheme to our strategy. The resulting method, called MuILP, formulates the problem as an integer linear program (ILP) over multiple granularities and achieves a better solution than FiDP. Furthermore, to compensate for the high time complexity of ILP, we develop a heuristic multi-granularity data placement algorithm, called HMuDP, which achieves a near-optimal solution with lower complexity. Experimental results show that FiDP reduces the total latency by 75.90%, 47.70%, and 12.34% compared with an LRU cache, a greedy comparison method (called Uday), and a dynamic-programming-based comparison method (called DLAA), respectively. In addition, MuILP and HMuDP yield lower latency than FiDP, with average improvements of 45.10% and 43.14%, respectively.
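
To make the idea of dynamic-programming-based data placement concrete, the following is a minimal sketch of the classic 0/1-knapsack formulation of SPM allocation. It is not the FiDP algorithm from the paper, whose details are not given in this abstract. The function name `place_in_spm`, the per-block sizes and access counts, and the latency values (1 cycle for SPM, 10 cycles for DRAM) are all assumptions chosen for illustration.

```python
from typing import List, Tuple


def place_in_spm(
    sizes: List[int],
    accesses: List[int],
    spm_capacity: int,
    spm_latency: int = 1,    # assumed cost per SPM access
    dram_latency: int = 10,  # assumed cost per DRAM access
) -> Tuple[int, List[int]]:
    """Pick the data blocks to keep in SPM so total access latency is minimal.

    Illustrative 0/1 knapsack only: each block is placed whole, and the
    latency saved by a block is accesses * (dram_latency - spm_latency).
    """
    n = len(sizes)
    saving = [a * (dram_latency - spm_latency) for a in accesses]

    # best[c] = largest latency saving achievable with SPM capacity c
    best = [0] * (spm_capacity + 1)
    # chosen[i][c] records whether block i was taken to reach best[c]
    chosen = [[False] * (spm_capacity + 1) for _ in range(n)]

    for i in range(n):
        # iterate capacity downwards so each block is used at most once
        for c in range(spm_capacity, sizes[i] - 1, -1):
            candidate = best[c - sizes[i]] + saving[i]
            if candidate > best[c]:
                best[c] = candidate
                chosen[i][c] = True

    # walk back through the decisions to recover the placement
    in_spm: List[int] = []
    c = spm_capacity
    for i in range(n - 1, -1, -1):
        if chosen[i][c]:
            in_spm.append(i)
            c -= sizes[i]
    in_spm.reverse()

    all_dram = sum(a * dram_latency for a in accesses)
    return all_dram - best[spm_capacity], in_spm


if __name__ == "__main__":
    total, blocks = place_in_spm(sizes=[4, 3, 2], accesses=[10, 8, 3], spm_capacity=5)
    print(total, blocks)
```

In this toy instance, placing blocks 1 and 2 together saves more latency than placing the most frequently accessed block 0 alone, which is the kind of trade-off a greedy choice can miss. The sketch ignores activation overhead and access granularity, which are the aspects the multi-granularity methods in the abstract are designed to address.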

Original language: English
Article number: 195107
Journal: Frontiers of Computer Science
Volume: 19
Issue number: 5
State: Published - May 2025

Keywords

  • data placement
  • embedded system
  • loops
  • scratch-pad memory
