Abstract
Massive uniform nested loops are widely used in scientific and DSP applications. Because such applications handle large amounts of data, optimizing data accesses by fully utilizing local memory and minimizing communication overhead is important for improving overall system performance. Most traditional partitioning strategies do not consider the effect of data access on computational performance. In this study, a multi-level partitioning method, based on a static data scheduling technique known as carrot-hole data scheduling, is proposed to control the data traffic between different levels of memory. Based on this data schedule, the optimal partition vector, scheduling vector, and partition size are chosen so as to minimize communication overhead. The partitioning scheme ultimately produces partitions of non-homogeneous size, which yields a significant performance improvement. Experiments show that this technique significantly reduces local cache misses compared with traditional methods.
| Original language | English |
|---|---|
| Pages (from-to) | 612-619 |
| Number of pages | 8 |
| Journal | IEEE Symposium on Parallel and Distributed Processing - Proceedings |
| State | Published - 1995 |
| Externally published | Yes |
| Event | Proceedings of the 1995 7th IEEE Symposium on Parallel and Distributed Processing - San Antonio, TX, USA. Duration: 25 Oct 1995 → 28 Oct 1995 |