TY - GEN
T1 - Loop scheduling with complete memory latency hiding on multi-core architecture
AU - Xue, Chun
AU - Shao, Zili
AU - Liu, Meilin
AU - Qiu, Meikang
AU - Sha, Edwin H.M.
PY - 2006
Y1 - 2006
N2 - The widening gap between processor and memory performance is the main obstacle to high processor utilization in modern computer systems. In this paper, we propose a new technique that combines loop scheduling with memory management, Iterational Retiming with Partitioning (IRP), which can completely hide memory latencies for applications with multi-dimensional loops on architectures like the CELL processor [1]. In IRP, the iteration space is first partitioned carefully. Then a two-part schedule, consisting of a processor part and a memory part, is produced such that the execution time of the memory part never exceeds the execution time of the processor part. The two parts execute simultaneously, so memory latency is hidden completely. Experiments on DSP benchmarks show that IRP consistently produces optimal solutions and significantly improves on previous techniques.
UR - https://www.scopus.com/pages/publications/34047199043
U2 - 10.1109/ICPADS.2006.58
DO - 10.1109/ICPADS.2006.58
M3 - Conference contribution
AN - SCOPUS:34047199043
SN - 0769526128
SN - 9780769526126
T3 - Proceedings of the International Conference on Parallel and Distributed Systems - ICPADS
SP - 375
EP - 382
BT - Proceedings - 12th International Conference on Parallel and Distributed Systems, ICPADS 2006
T2 - 12th International Conference on Parallel and Distributed Systems, ICPADS 2006
Y2 - 12 July 2006 through 15 July 2006
ER -