Introduction to the memory mountain and graph analysis
The memory mountain is a tool for comprehensively characterizing the memory hierarchy. It shows the read bandwidth available at each level of the hierarchy, and it shows how program performance varies with temporal and spatial locality. From the data of the memory mountain we can even read off some hardware parameters of the memory system.
T. Stricker introduced the idea of the memory mountain as a comprehensive characterization of the memory system in his 1997 paper, and suggested the term "memory mountain" in later work. Section 6.6.1 of Computer Systems: A Programmer's Perspective (Randal Bryant, professor at Carnegie Mellon University) also introduces the memory mountain and analyzes it in detail.
The memory mountain is a three-dimensional plot of read throughput; the hierarchy it covers consists mainly of the caches and main memory. Below is the memory mountain of an Intel Core i7 processor.
The upper-left corner of the figure lists the CPU parameters. The key ones are the sizes of the three levels of cache: it is these three numbers that give the memory mountain the shape seen in the figure.
First, look at the axes. Working set size is the amount of data the program repeatedly accesses. Stride is the step between consecutive accesses: a stride of 1 reads the data sequentially, a stride of 2 skips every other element, and so on, so the stride determines the program's spatial locality. Read throughput is the rate at which data can be read from the memory system; in short, the higher, the better.
Now look at the mountain itself. The boundary between the green and purple regions is at about 32 KB, the boundary between green and yellow at about 256 KB, and the boundary between light purple and blue at about 8 MB: exactly the sizes of the L1, L2, and L3 caches. The explanation is straightforward. If the working set is small enough to fit entirely in the L1 cache, the read throughput is essentially the L1 access speed; each time the working set crosses a cache-size boundary, throughput drops sharply. This produces the four-level structure (L1, L2, L3, and main memory) seen in the figure.
The larger the stride, the lower the throughput. Large strides destroy spatial locality: each cache block that is loaded contributes at most one useful element, so the access rate falls.
In summary, I hope three things are now clear. First, when the stride is very small, throughput stays in the purple region even for large working sets, because spatial locality is fully exploited: only the first access to each cache block misses, and that one miss loads the rest of the block's data along with it. Second, when the stride is very large but the working set is very small, throughput is also high; here the stride hardly matters, because the entire working set fits in the cache and temporal locality takes over. Third, with a large working set and a large stride there is no locality at all: misses occur on nearly every access, the cache might as well not exist, and throughput falls to main-memory speed, the light blue region in the figure.