Christian Pilato and Francesca Palumbo
Modern System-on-Chip (SoC) architectures and CPU+FPGA computing platforms are moving towards heterogeneous systems that feature an increasing number of hardware accelerators. These specialized components deliver high performance with high energy efficiency while retaining some flexibility, since they can combine different configurations that are changed dynamically. It is therefore crucial to understand how to design and optimize such components together with their associated memory in order to implement the desired functionality efficiently. This lecture presents and discusses the current challenges of accelerating an application in hardware. It also presents a complete system-level methodology that addresses most of these issues in the generation of highly optimized dataflow accelerators with specialized, multi-bank memories. We will show how a set of functionalities can be specified as an execution graph, and how this graph can be processed to deploy the optimal accelerator configuration and the proper memory architecture. The methodology combines high-level synthesis methods for the automatic generation of the accelerators with prototype CAD tools for the optimization of the resources and the memory subsystem.
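
As a purely illustrative sketch, and not the methodology presented in the lecture, the idea of specifying functionality as an execution graph can be pictured as below. The actor names, the dependency structure, and the cyclic bank mapping are all assumptions made for this example. The code only shows how a small dataflow graph yields a valid firing order, and how consecutive array elements might be spread across memory banks so that they can be accessed in parallel.

```python
from graphlib import TopologicalSorter

# Hypothetical execution graph: each actor maps to the set of actors
# whose outputs it consumes. Names are illustrative only.
graph = {
    "load": set(),
    "filter": {"load"},
    "scale": {"load"},
    "merge": {"filter", "scale"},
    "store": {"merge"},
}


def firing_order(deps):
    """Return one valid order in which the actors can fire."""
    return list(TopologicalSorter(deps).static_order())


def cyclic_bank(index, num_banks):
    """Map an array index to (bank, offset) under cyclic partitioning.

    This is an assumed mapping for illustration: consecutive indices
    fall into different banks, so they can be read in the same cycle.
    """
    return index % num_banks, index // num_banks


order = firing_order(graph)
print(order)
print([cyclic_bank(i, 4) for i in range(8)])
```

In this sketch, "filter" and "scale" have no mutual dependency, so an accelerator could run them concurrently. The bank mapping places indices 0 to 3 in four distinct banks.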