Multiloop Parallelisation Using Unrolling and Fission

Bibliographic Details
Main Authors: Yuet Ming Lam, José Gabriel F. Coutinho, Chun Hok Ho, Philip Heng Wai Leong, Wayne Luk
Format: Article
Language: English
Published: Wiley 2010-01-01
Series: International Journal of Reconfigurable Computing
Online Access: http://dx.doi.org/10.1155/2010/475620
Description
Summary: A technique for parallelising multiple loops in a heterogeneous computing system is presented. Loops are first unrolled and then broken up into multiple tasks which are mapped to reconfigurable hardware. A performance-driven optimisation is applied to find the best unrolling factor for each loop under hardware size constraints. The approach is demonstrated using three applications: speech recognition, image processing, and the N-Body problem. Experimental results show that a maximum speedup of 34 is achieved on a 274 MHz FPGA for the N-Body over a 2.6 GHz microprocessor, which is 4.1 times higher than that of an approach without unrolling.
ISSN: 1687-7195, 1687-7209