TY - JOUR
T1 - MuDP
T2 - multi-granularity data placement for uniform loops on SPM-DRAM architectures to minimize latency
AU - Du, Yixuan
AU - Sha, Edwin Hsing Mean
AU - Song, Yuhong
AU - Guo, Yibo
AU - Xu, Longshan
AU - Zhuge, Qingfeng
N1 - Publisher Copyright:
© Higher Education Press 2025.
PY - 2025/5
Y1 - 2025/5
N2 - Scratch-pad memory (SPM) has been widely used in embedded systems because it allows software-controlled data placement. By designing data placement strategies, optimal solutions with minimal memory access latency for loops on SPM-DRAM architecture can be explored. Although existing works effectively reduce the latency by using fine-grained data placement methods, they fail in solving the case of inconsecutive array access. Meanwhile, fine-grained strategy can lead to excessive memory activation overhead, making it less efficient. Therefore, in this paper, we first propose a finegrained dynamic programming algorithm, called FiDP, to tackle unsolved case and minimize latency. In order to mitigate the frequent activation before data access, we then add a medium-grained scheme to our strategy. It can achieve a better solution than FiDP by strictly formulating an integer linear programming (ILP) problem and considering multiple granularities, which is called MuILP. Furthermore, to compensate for the high time complexity of ILP, we develop a heuristic multi-granularity data placement algorithm, called HMuDP, which achieves a near-optimal solution with lower complexity. Experimental results show that our FiDP reduces the total latency by 75.90%, 47.70% and 12.34% compared with LRU-cache, a greedy-based comparison method (called Uday) and a dynamic programming-based comparison method (called DLAA). Besides, our MuILP and HMuDP yield less latency than FiDP with 45.10% and 43.14% average improvement, respectively.
AB - Scratch-pad memory (SPM) has been widely used in embedded systems because it allows software-controlled data placement. By designing data placement strategies, optimal solutions with minimal memory access latency for loops on SPM-DRAM architecture can be explored. Although existing works effectively reduce the latency by using fine-grained data placement methods, they fail in solving the case of inconsecutive array access. Meanwhile, fine-grained strategy can lead to excessive memory activation overhead, making it less efficient. Therefore, in this paper, we first propose a finegrained dynamic programming algorithm, called FiDP, to tackle unsolved case and minimize latency. In order to mitigate the frequent activation before data access, we then add a medium-grained scheme to our strategy. It can achieve a better solution than FiDP by strictly formulating an integer linear programming (ILP) problem and considering multiple granularities, which is called MuILP. Furthermore, to compensate for the high time complexity of ILP, we develop a heuristic multi-granularity data placement algorithm, called HMuDP, which achieves a near-optimal solution with lower complexity. Experimental results show that our FiDP reduces the total latency by 75.90%, 47.70% and 12.34% compared with LRU-cache, a greedy-based comparison method (called Uday) and a dynamic programming-based comparison method (called DLAA). Besides, our MuILP and HMuDP yield less latency than FiDP with 45.10% and 43.14% average improvement, respectively.
KW - data placement
KW - embedded system
KW - loops
KW - scratch-pad memory
UR - https://www.scopus.com/pages/publications/85209765639
U2 - 10.1007/s11704-023-3566-y
DO - 10.1007/s11704-023-3566-y
M3 - 文章
AN - SCOPUS:85209765639
SN - 2095-2228
VL - 19
JO - Frontiers of Computer Science
JF - Frontiers of Computer Science
IS - 5
M1 - 195107
ER -