TY - GEN
T1 - Combined partitioning and data padding for scheduling multiple loop nests
AU - Wang, Zhong
AU - Sha, Edwin H.M.
AU - Hu, Xiaobo
N1 - Publisher Copyright:
Copyright 2001 ACM.
PY - 2001/11/16
Y1 - 2001/11/16
N2 - With the widening performance gap between processors and main memory, efficient memory access behavior is necessary for good program performance. Loop partitioning is an effective way to exploit data locality. Traditional loop partition techniques, however, consider only a singleton nested loop. This paper presents a multiple loop partition scheduling technique, which combines loop partitioning and data padding to generate a detailed partition schedule. Computation and data prefetching are balanced in the partition schedule, such that the long memory latency can be hidden efficiently. Multiple loop partition scheduling explores parallelism among computations, and exploits data locality both between different loop nests and within each loop nest. Data padding is applied in our technique to eliminate cache interference, which overcomes the problem of cache conflict misses arising from loop partitioning. Therefore, our technique can be applied in architectures with low-associativity caches. The experiments show that multiple loop partition scheduling achieves significant improvement over existing methods.
AB - With the widening performance gap between processors and main memory, efficient memory access behavior is necessary for good program performance. Loop partitioning is an effective way to exploit data locality. Traditional loop partition techniques, however, consider only a singleton nested loop. This paper presents a multiple loop partition scheduling technique, which combines loop partitioning and data padding to generate a detailed partition schedule. Computation and data prefetching are balanced in the partition schedule, such that the long memory latency can be hidden efficiently. Multiple loop partition scheduling explores parallelism among computations, and exploits data locality both between different loop nests and within each loop nest. Data padding is applied in our technique to eliminate cache interference, which overcomes the problem of cache conflict misses arising from loop partitioning. Therefore, our technique can be applied in architectures with low-associativity caches. The experiments show that multiple loop partition scheduling achieves significant improvement over existing methods.
UR - https://www.scopus.com/pages/publications/84870713429
U2 - 10.1145/502217.502228
DO - 10.1145/502217.502228
M3 - Conference contribution
AN - SCOPUS:84870713429
T3 - CASES 2001 - Proceedings of the 2001 International Conference on Compilers, Architecture, and Synthesis for Embedded Systems
SP - 67
EP - 75
BT - CASES 2001 - Proceedings of the 2001 International Conference on Compilers, Architecture, and Synthesis for Embedded Systems
PB - Association for Computing Machinery, Inc
T2 - 2nd International Conference on Compilers, Architecture, and Synthesis for Embedded Systems, CASES 2001
Y2 - 16 November 2001 through 17 November 2001
ER -