FPGA devices have proven to be one of the most popular prototyping solutions, and nanoelectronics is one of the fields that prototypes its architectures on these devices. Many FPGA vendors have recently included embedded processors in their devices, such as Xilinx with ARM Cortex-A cores, together with programmable logic cells. These devices are known as Programmable Systems-on-Chip (PSoC). Their ARM cores (embedded in the processing system, or PS) communicate with the programmable logic cells (PL) using ARM-standard interface buses. ARM proposed the Advanced Microcontroller Bus Architecture (AMBA) as an open standard; its third generation included the Advanced eXtensible Interface (AXI) to reach higher performance. In this paper we analyse the performance of exhaustive data transfers between PS and PL for a Xilinx Zynq FPGA in a real HW/SW co-design scenario for a Convolutional Neural Network (CNN) accelerator. This CNN accelerator processes, in dedicated hardware, a stream of visual information from a neuromorphic visual sensor for classification. On the PS side, a Linux operating system collects visual events from the neuromorphic sensor into a normalized frame, transfers these frames to the multi-layered CNN accelerator, and reads back the results, using an AXI-DMA bus on a per-layer basis. As these kinds of accelerators try to process information as quickly as possible, data bandwidth becomes critical. Maintaining a well-balanced data throughput rate requires some considerations, such as data partitioning techniques to balance RX and TX transfers, and different transfer management techniques: polling versus a dedicated interrupt-based kernel-level driver. For sufficiently long packets, the kernel-level driver solution improves global computation timings in a CNN classification example. The kernel-level driver also provides a safer solution and enables OS task scheduling for better computation distribution.
Antonio Ríos-Navarro, Ricardo Tapiador-Morales, Ángel Jiménez-Fernández, C. Amaya, Manuel Jesus Dominguez Morales, Tobi Delbrück, Alejandro Linares-Barranco (2018). Performance evaluation over HW/SW co-design SoC memory transfers for a CNN accelerator. DOI: https://doi.org/10.1109/nano.2018.8626313.
Type: Article
Year: 2018
Authors: 7
Datasets: 0
Total Files: 0
Language: en
DOI: https://doi.org/10.1109/nano.2018.8626313