DRAM Bandwidth and Latency Stacks: Visualizing DRAM Bottlenecks

Abstract

For memory-bound applications, memory bandwidth utilization and memory access latency determine performance. DRAM specifications quote a maximum peak bandwidth and an uncontended read latency, but these figures are never achieved in practice. Many factors limit the bandwidth that is actually achieved, and it is often not obvious to hardware architects or software developers how higher bandwidth usage, and thus higher performance, can be attained. Similarly, latency is impacted by numerous technology constraints and by queueing in the memory controller. DRAM bandwidth stacks intuitively visualize the memory bandwidth consumption of an application and indicate where potential bandwidth is lost. The top of the stack is the peak bandwidth, while the bottom component shows the actually achieved bandwidth. The other components show how much bandwidth is lost to DRAM refresh, precharge and activate commands, or to (parts of) the DRAM chip sitting idle when no memory operations are available. DRAM latency stacks show the average latency of a memory read operation, divided into base read time, row conflict, and multiple queueing components. DRAM bandwidth and latency stacks are complementary to CPI stacks and speedup stacks, providing additional insight for optimizing the performance of an application or improving the hardware.
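The bandwidth-stack idea described above can be sketched as follows. This is a minimal illustration, not the authors' tool: the peak figure corresponds to one DDR4-3200 channel (3200 MT/s × 8 bytes), while the per-cause loss numbers are hypothetical placeholders standing in for what a profiler or simulator would measure.

```python
# Minimal sketch of assembling a DRAM bandwidth stack.
# Loss values below are hypothetical illustrations, not measurements.

PEAK_BW_GBPS = 25.6  # one DDR4-3200 channel: 3200 MT/s * 8 bytes/transfer

# Hypothetical bandwidth lost per cause (GB/s).
losses = {
    "refresh": 1.3,             # periodic refresh commands occupy the bus
    "precharge/activate": 4.1,  # row-buffer management overhead
    "idle": 7.9,                # no memory operations available to issue
}

# Achieved bandwidth is whatever remains of the peak after all losses.
achieved = PEAK_BW_GBPS - sum(losses.values())

# The stack: achieved bandwidth at the bottom, loss components above it;
# the components sum to the peak bandwidth at the top of the stack.
stack = [("achieved", achieved)] + list(losses.items())

for name, gbps in reversed(stack):
    print(f"{name:>20}: {gbps:5.1f} GB/s")
```

By construction the components always sum to the peak, so the stack shows at a glance both how much bandwidth the application obtained and which mechanism consumed the rest.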


Proceedings Paper
2022
English
