This post is original work released under the CC 3.0 license. If you repost it, please credit the source: http://blog.csdn.net/lux_veritas/article/details/24766015
Bandwidth is a memory bandwidth benchmark. It mainly targets x86 and x86_64 platforms and measures the system's memory bandwidth by reading and writing data blocks of different sizes, both sequentially and randomly.
Project address
Bandwidth ships a set of support routines written in assembly language for architecture-specific operations, such as reading the contents of certain registers.
This assembly library detects the CPU model and supported features of the current system, and the program then selects the corresponding working mode. For example, the CPU in the author's machine reports:

    CPU family: GenuineIntel
    CPU features: MMX SSE SSE2 SSE3 SSSE3 SSE4.1 SSE4.2 XD Intel64

When the main program runs, it selects the working mode based on the CPU features:

    if (mode == SSE2) {
        print (L"(128-bit), size = ");
    } else if (mode == AVX) {
        print (L"(256-bit), size = ");
    } else {
    #ifdef __x86_64__
        print (L"(64-bit), size = ");
    #else
        print (L"(32-bit), size = ");
    #endif
    }

The author's CPU supports SSE2 but not AVX, so memory reads and writes use a 128-bit data width.
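
The selection step can be sketched in a few lines of C. This is only an illustration, not Bandwidth's own code: the `enum mode` names, `pick_mode`, `mode_bits`, and `detect_mode` are hypothetical, the "prefer the widest supported width" rule is an assumption, and feature detection uses the GCC/Clang builtin `__builtin_cpu_supports` instead of Bandwidth's assembly library.

```c
/* Hypothetical mode names, loosely mirroring the snippet above. */
enum mode { MODE_NATIVE, MODE_SSE2, MODE_AVX };

/* Pure selection logic. Assumption: the widest supported width wins. */
enum mode pick_mode(int has_sse2, int has_avx) {
    if (has_avx)  return MODE_AVX;
    if (has_sse2) return MODE_SSE2;
    return MODE_NATIVE;
}

/* Data width in bits for each mode; the fallback is the native pointer
 * width, which matches the 64-bit / 32-bit branch in the snippet above. */
int mode_bits(enum mode m) {
    switch (m) {
    case MODE_AVX:  return 256;
    case MODE_SSE2: return 128;
    default:        return (int)(sizeof(void *) * 8);
    }
}

/* Runtime detection through a compiler builtin (GCC/Clang, x86 only).
 * On any other target this sketch falls back to the native mode. */
enum mode detect_mode(void) {
#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
    __builtin_cpu_init();
    return pick_mode(__builtin_cpu_supports("sse2") != 0,
                     __builtin_cpu_supports("avx") != 0);
#else
    return MODE_NATIVE;
#endif
}
```

For a CPU like the author's (SSE2 present, AVX absent), `pick_mode(1, 0)` yields `MODE_SSE2` and `mode_bits` reports 128.
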
Taking the author's machine as an example, the test is mainly divided into the following parts:
| Test | 128-bit | 64-bit |
| :--- | :---: | :---: |
| Sequential read | | |
| Random read | | |
| Sequential write | | |
| Random write | | |
You can choose whether to bypass all levels of cache. The CPU cache of the author's machine is as follows:

    Cache 0: L1 data cache, line size 64, 8-ways, 64 sets, size 32k
    Cache 1: L1 instruction cache, line size 64, 8-ways, 64 sets, size 32k
    Cache 2: L2 unified cache, line size 64, 16-ways, 4096 sets, size 4096k

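
The sizes in this listing follow from the other three numbers: for a set-associative cache, capacity is line size × ways × sets. A small check in C (the function name `cache_size_bytes` is made up for this example):

```c
#include <stddef.h>

/* Capacity of a set-associative cache in bytes. */
size_t cache_size_bytes(size_t line_size, size_t ways, size_t sets) {
    return line_size * ways * sets;
}
```

With the values above, `cache_size_bytes(64, 8, 64)` gives 32768 bytes (32k) for each L1 cache, and `cache_size_bytes(64, 16, 4096)` gives 4194304 bytes (4096k) for the L2 cache.
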
The size of the data blocks used for reads and writes increases from 128 B up to the megabyte range. Because the cache levels differ in size, small data blocks stay in cache during the read and write operations, while large blocks spill through the caches into main memory. As the block size grows, the measured bandwidth therefore changes sharply at a few points: each one marks where the block outgrows the capacity of a cache level and accesses fall through to the next, slower level. Bandwidth writes the test results to a log file and a chart. These steps are easiest to see in the chart, where bandwidth drops markedly at 32 KB and at 4 MB, the sizes of the L1 data cache and the L2 cache on the author's machine.
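
To make the idea concrete, here is a rough sketch of a sequential-read measurement in portable C. It is not how Bandwidth itself measures (the tool uses hand-written assembly routines): `expected_level` and `seq_read_mb_per_s` are hypothetical names, the cache sizes are hard-coded from the author's listing, the loads are 64 bits wide, and timing uses the coarse standard `clock()` function, so the numbers are indicative only.

```c
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

/* Cache sizes taken from the author's listing; other machines differ. */
#define L1D_BYTES (32u * 1024u)
#define L2_BYTES  (4096u * 1024u)

/* Smallest level a block of this size fits in:
 * 1 = L1 data cache, 2 = L2 cache, 0 = main memory. */
int expected_level(size_t block_bytes) {
    if (block_bytes <= L1D_BYTES) return 1;
    if (block_bytes <= L2_BYTES)  return 2;
    return 0;
}

/* Read a block sequentially `passes` times with 64-bit loads and return
 * the throughput in MB/s, or -1.0 if the buffer cannot be allocated. */
double seq_read_mb_per_s(size_t block_bytes, unsigned passes) {
    size_t n = block_bytes / sizeof(uint64_t);
    if (n == 0) return -1.0;
    uint64_t *buf = malloc(n * sizeof *buf);
    if (buf == NULL) return -1.0;
    for (size_t i = 0; i < n; i++) buf[i] = i;  /* touch every page first */

    /* Volatile view so the compiler cannot drop or hoist the loads. */
    const volatile uint64_t *src = buf;
    uint64_t sum = 0;
    clock_t t0 = clock();
    for (unsigned p = 0; p < passes; p++)
        for (size_t i = 0; i < n; i++)
            sum += src[i];
    clock_t t1 = clock();
    (void)sum;
    free(buf);

    double secs = (double)(t1 - t0) / CLOCKS_PER_SEC;
    if (secs <= 0.0) secs = 1e-9;  /* timer too coarse: avoid dividing by zero */
    return (double)(n * sizeof(uint64_t)) * passes / secs / (1024.0 * 1024.0);
}
```

Calling `seq_read_mb_per_s` for doubling block sizes and grouping the results by `expected_level` should show the same kind of steps as the chart, although absolute figures will differ from Bandwidth's.
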