Performance optimization for parallel systems with shared DWM via retiming, loop scheduling, and data placement

Research output: Contribution to journal › Article › peer-review

3 Scopus citations

Abstract

Domain Wall Memory (DWM), an ideal candidate for replacing traditional memories, especially in parallel systems, has many desirable characteristics such as low leakage power, high density, and low access latency. However, because of the tape-like architecture of DWM, shift operations have a major impact on performance. For data-intensive applications with massive loops and arrays, increasing loop parallelism and choosing appropriate loop scheduling and data placement on DWM can significantly improve the performance of parallel systems. This paper explores optimizing the performance of parallel systems through retiming, loop scheduling, and data placement, especially when the data are arrays. It proposes an Integer Linear Programming (ILP) formulation and a Scheduling While Placing (SWP) algorithm to generate optimal or nearly optimal loop scheduling and data placement with minimum execution time. The experimental results show that SWP and ILP effectively reduce execution time compared with the greedy List Scheduling First Access First Place (LF) algorithm. In addition, this paper proposes a Threshold Retiming Repetition (TRR) algorithm to combine the retiming technique with SWP and ILP. The experimental results show that SWP+TRR and ILP+TRR further reduce execution time compared with the results obtained without retiming.
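
To illustrate why data placement matters on a tape-like memory, the following minimal sketch counts shift operations under a simplified model that is an assumption made here, not the paper's formulation: a single track, a single access port, and a cost of one shift per position moved. The function name `shift_cost` and the sample access sequence are hypothetical. It is not the ILP, SWP, LF, or TRR method described in the abstract.

```python
def shift_cost(placement, accesses, port_start=0):
    """Count shifts needed to serve `accesses` in order.

    Assumed model (illustrative only): one track, one access port,
    and moving the port alignment by k positions costs k shifts.
    """
    # Map each data item to its position on the track.
    position = {name: index for index, name in enumerate(placement)}
    head = port_start
    total = 0
    for name in accesses:
        target = position[name]
        total += abs(target - head)  # shifts needed to align the item with the port
        head = target
    return total


# Hypothetical access sequence produced by some loop schedule.
accesses = ["a", "c", "a", "c", "b"]

print(shift_cost(["a", "b", "c"], accesses))  # 7 shifts
print(shift_cost(["a", "c", "b"], accesses))  # 4 shifts
```

In this toy model, placing the frequently alternating items `a` and `c` next to each other lowers the shift count from 7 to 4 for the same access order, which is the kind of interaction between scheduling and placement that the abstract refers to.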

Original language: English
Article number: 101842
Journal: Journal of Systems Architecture
Volume: 112
State: Published - Jan 2021

Keywords

  • Data placement
  • Domain wall memory
  • Loop scheduling
  • Retiming
  • Shift operation
