TY - GEN
T1 - Parallel embedded systems
T2 - Emerging Information Technology Conference 2005
AU - Sha, Edwin H.M.
PY - 2005
Y1 - 2005
N2 - With the advance of system-level integration and system-on-chip, the high-tech industry is now moving toward multiple-core parallel embedded systems using a hardware/software co-design approach. Designing and optimizing an embedded system and its software is technically hard because of the strict requirements on timing, code size, memory, power consumption, security, etc., and optimizing a parallel embedded system is even more challenging. We focus on loops because they are usually the most critical parts to optimize in DSP and other computation-intensive applications. Because of space limits, this paper shows only the basic ideas of fully parallelizing nested loops while minimizing code size overhead. Using our technique based on multidimensional retiming, any uniform nested loop can be transformed with minimal overhead such that all the computations in the new loop body can be executed simultaneously. This is the best possible result and can be applied to many applications executed on VLIW or other types of parallel systems.
AB - With the advance of system-level integration and system-on-chip, the high-tech industry is now moving toward multiple-core parallel embedded systems using a hardware/software co-design approach. Designing and optimizing an embedded system and its software is technically hard because of the strict requirements on timing, code size, memory, power consumption, security, etc., and optimizing a parallel embedded system is even more challenging. We focus on loops because they are usually the most critical parts to optimize in DSP and other computation-intensive applications. Because of space limits, this paper shows only the basic ideas of fully parallelizing nested loops while minimizing code size overhead. Using our technique based on multidimensional retiming, any uniform nested loop can be transformed with minimal overhead such that all the computations in the new loop body can be executed simultaneously. This is the best possible result and can be applied to many applications executed on VLIW or other types of parallel systems.
UR - https://www.scopus.com/pages/publications/33751195499
U2 - 10.1109/EITC.2005.1544328
DO - 10.1109/EITC.2005.1544328
M3 - Conference contribution
AN - SCOPUS:33751195499
SN - 0780393295
SN - 9780780393295
T3 - Emerging Information Technology Conference 2005
SP - 5
EP - 8
BT - Emerging Information Technology Conference 2005
Y2 - 15 August 2005 through 16 August 2005
ER -