0 Datasets
0 Files
Get instant academic access to this publication’s datasets.
Yes. After verification, you can browse and download datasets at no cost. Some premium assets may require author approval.
Files are stored on encrypted storage. Access is restricted to verified users and all downloads are logged.
Yes, message the author after sign-up to request supplementary files or replication code.
Join 50,000+ researchers worldwide. Get instant access to peer-reviewed datasets, advanced analytics, and global collaboration tools.
✓ Immediate verification • ✓ Free institutional access • ✓ Global collaborationJoin our academic network to download verified datasets and collaborate with researchers worldwide.
Get Free AccessSpecialized image processing accelerators are necessary to deliver the performance and energy efficiency required by important applications in computer vision, computational photography, and augmented reality. But creating, “programming,” and integrating this hardware into a hardware/software system is difficult. We address this problem by extending the image processing language Halide so users can specify which portions of their applications should become hardware accelerators, and then we provide a compiler that uses this code to automatically create the accelerator along with the “glue” code needed for the user’s application to access this hardware. Starting with Halide not only provides a very high-level functional description of the hardware but also allows our compiler to generate a complete software application, which accesses the hardware for acceleration when appropriate. Our system also provides high-level semantics to explore different mappings of applications to a heterogeneous system, including the flexibility of being able to change the throughput rate of the generated hardware. We demonstrate our approach by mapping applications to a commercial Xilinx Zynq system. Using its FPGA with two low-power ARM cores, our design achieves up to 6× higher performance and 38× lower energy compared to the quad-core ARM CPU on an NVIDIA Tegra K1, and 3.5× higher performance with 12× lower energy compared to the K1’s 192-core GPU.
Jing Pu, Steven Bell, Xuan Yang, Jeff Setter, Stephen Richardson, Jonathan Ragan‐Kelley, Mark Horowitz (2017). Programming Heterogeneous Systems from an Image Processing DSL. ACM Transactions on Architecture and Code Optimization, 14(3), pp. 1-25, DOI: 10.1145/3107953.
Datasets shared by verified academics with rich metadata and previews.
Authors choose access levels; downloads are logged for transparency.
Students and faculty get instant access after verification.
Type
Article
Year
2017
Authors
7
Datasets
0
Total Files
0
Language
English
Journal
ACM Transactions on Architecture and Code Optimization
DOI
10.1145/3107953
Access datasets from 50,000+ researchers worldwide with institutional verification.
Get Free Access