TY - GEN
T1 - Optimizing Data Layout for Racetrack Memory in Embedded Systems
AU - Hui, Peng
AU - Sha, Edwin H.M.
AU - Zhuge, Qingfeng
AU - Xu, Rui
AU - Wang, Han
N1 - Publisher Copyright:
© 2023 Copyright held by the owner/author(s).
PY - 2023/1/16
Y1 - 2023/1/16
N2 - Racetrack memory (RTM), which consists of multiple domain block clusters (DBC) and access ports, is a novel non-volatile memory and has potential as scratchpad memory (SPM) in embedded devices due to its high density and low access latency. However, too many shift operations decrease the performance of RTM and cause unpredictable performance. In this paper, we propose three schemes to optimize the performance of RTM from different aspects, including intra-DBC, inter-DBC, and hybrid SPM with SRAM and RTM. Firstly, a balanced group-based data placement method for the data layout inside one DBC is proposed to reduce shifts. Second, a grouping method for the data allocation among DBCs is proposed. It helps with the shift reduction while using fewer DBCs by using one DBC as multiple DBCs. Finally, we use SRAM to further help the cost reduction, and a cost evaluation metric is proposed to assist the shrinking method which determines the data allocation for hybrid SPM with SRAM and RTM. Experiments show that the proposed schemes can significantly improve the performance of pure RTM and hybrid SPM while using fewer DBCs.
AB - Racetrack memory (RTM), which consists of multiple domain block clusters (DBC) and access ports, is a novel non-volatile memory and has potential as scratchpad memory (SPM) in embedded devices due to its high density and low access latency. However, too many shift operations decrease the performance of RTM and cause unpredictable performance. In this paper, we propose three schemes to optimize the performance of RTM from different aspects, including intra-DBC, inter-DBC, and hybrid SPM with SRAM and RTM. Firstly, a balanced group-based data placement method for the data layout inside one DBC is proposed to reduce shifts. Second, a grouping method for the data allocation among DBCs is proposed. It helps with the shift reduction while using fewer DBCs by using one DBC as multiple DBCs. Finally, we use SRAM to further help the cost reduction, and a cost evaluation metric is proposed to assist the shrinking method which determines the data allocation for hybrid SPM with SRAM and RTM. Experiments show that the proposed schemes can significantly improve the performance of pure RTM and hybrid SPM while using fewer DBCs.
KW - data layout
KW - hybrid SPM
KW - racetrack memory
KW - shift operation
UR - https://www.scopus.com/pages/publications/85148485078
U2 - 10.1145/3566097.3567854
DO - 10.1145/3566097.3567854
M3 - 会议稿件
AN - SCOPUS:85148485078
T3 - Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC
SP - 110
EP - 115
BT - ASP-DAC 2023 - 28th Asia and South Pacific Design Automation Conference, Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 28th Asia and South Pacific Design Automation Conference, ASP-DAC 2023
Y2 - 16 January 2023 through 19 January 2023
ER -