TY - GEN
T1 - Nested loops optimization for multiprocessor architecture design
AU - Leonardi, Andrea
AU - Passos, Nelson L.
AU - Sha, Edwin H.M.
N1 - Publisher Copyright: © 1999 IEEE.
PY - 1998
Y1 - 1998
AB - Multi-dimensional systems, including image processing, geophysical signal processing, and fluid dynamics, are becoming some of the most important targets of computational improvement studies. Most of the optimized solutions to those problems point to the use of application-specific integrated circuits (ASICs). From the analysis of multi-dimensional programming code, one can observe that nested-loop-like structures are often the most time-consuming parts. Designing ASICs with multiple processing units is usually the appropriate solution to achieve the required computational performance. This paper presents a new loop transformation algorithm that allows efficient utilization of the multiprocessor system. Uniform nested loops are modeled as multi-dimensional data flow graphs. New loop structures are generated so that an arbitrary number of processors available in the system can run in parallel. An example demonstrates the effectiveness of the algorithm.
UR - https://www.scopus.com/pages/publications/85045581810
U2 - 10.1109/MWSCAS.1998.759519
DO - 10.1109/MWSCAS.1998.759519
M3 - Conference contribution
AN - SCOPUS:85045581810
T3 - Midwest Symposium on Circuits and Systems
SP - 415
EP - 418
BT - Proceedings - 1998 Midwest Symposium on Circuits and Systems, MWSCAS 1998
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 1998 Midwest Symposium on Circuits and Systems, MWSCAS 1998
Y2 - 9 August 1998 through 12 August 1998
ER -