1. Check your system cache size:
$ cat /sys/devices/system/cpu/cpu0/cache/index2/size
My system is CentOS 5.8. The command above shows the size of the level 2 cache; on my server it is 256K. Remember this number, because the test program needs it.
2. Check the size of the cache line:
$ cat /sys/devices/system/cpu/cpu0/cache/index2/coherency_line_size
On my server it is 64, in bytes. Remember this number too, because the test program also needs it.
3. Write the test program cache.c:
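The two values above can also be read from inside a program. The following is only a minimal sketch: the helper names `parse_cache_size` and `read_cache_attr` are made up for illustration, and it assumes the sysfs files contain either a plain number (such as `64`) or a number followed by `K` or `M` (such as `256K`).

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: convert a sysfs size string such as "256K" or "8M"
   to bytes. A plain number such as "64" is returned unchanged.
   Returns -1 if the string does not start with a positive number. */
long parse_cache_size(const char *text)
{
    char *end = NULL;
    long value;

    assert(text != NULL);
    value = strtol(text, &end, 10);
    if (end == text || value <= 0)
        return -1;
    if (*end == 'K' || *end == 'k')
        return value * 1024L;
    if (*end == 'M' || *end == 'm')
        return value * 1024L * 1024L;
    return value;
}

/* Hypothetical helper: read one sysfs cache attribute and return its value
   in bytes, or -1 if the file cannot be opened or parsed. */
long read_cache_attr(const char *path)
{
    char buf[64];
    FILE *fp = fopen(path, "r");
    long result = -1;

    if (fp == NULL)
        return -1;
    if (fgets(buf, sizeof buf, fp) != NULL)
        result = parse_cache_size(buf);
    fclose(fp);
    return result;
}
```

Calling `read_cache_attr` with the two paths used in steps 1 and 2 should give the cache size and the line size in bytes on a system that exposes those files.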
int matrix[8192][16];   // 4 * 8192 * 16 = 2^19 bytes = 512 KB

void bad_access(void)
{
    int k, j, sum = 0;

    // Column-by-column walk: consecutive reads are 16 ints = 64 bytes apart,
    // so each read lands on a different cache line.
    for (k = 0; k < 16; k++)
        for (j = 0; j < 8192; j++)
            sum += matrix[j][k];
}

int main(void)
{
    int i;

    for (i = 0; i < 5000000; i++)
        bad_access();
    return 0;
}

The code above is simple, but to understand it you need to know the basic structure and principle of the cache: the cache is made up of lines of 64 or 128 bytes, organized into sets (that is, multi-way set associative). On every cache miss, the cache fetches data from memory in units of one cache line (here, 64 bytes at a time).
Step 1 showed that the level 2 data cache is 256 KB in total, and step 2 showed that each cache line is 64 bytes, so the level 2 data cache has 256 KB / 64 B = 2^12 = 4096 lines.
Picture a table with 64 bytes per row and 4096 rows, 256 KB in total: that is the simplified structure of our cache. To make sure every data fetch misses, we must step through the data with a stride of at least 64 bytes.
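The arithmetic above can be written down as a small sketch. The function names `cache_line_count` and `ints_per_line` are hypothetical, chosen only for this illustration; the inputs are the two numbers from steps 1 and 2.

```c
#include <assert.h>
#include <stddef.h>

/* Number of cache lines = total cache size / size of one line. */
size_t cache_line_count(size_t cache_bytes, size_t line_bytes)
{
    assert(line_bytes > 0 && cache_bytes % line_bytes == 0);
    return cache_bytes / line_bytes;
}

/* Number of int elements that fit in one line. Stepping an int index by
   this amount moves the address forward by exactly one cache line. */
size_t ints_per_line(size_t line_bytes)
{
    assert(line_bytes > 0 && line_bytes % sizeof(int) == 0);
    return line_bytes / sizeof(int);
}
```

With a 256 KB cache and 64-byte lines, `cache_line_count(256 * 1024, 64)` gives 4096. With a 4-byte int, `ints_per_line(64)` gives 16, which is where the second dimension of `matrix[8192][16]` comes from.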
First, create a 512 KB array, twice the size of the cache. If the array were only 256 KB, then once the first pass ended and the loop started reading from the beginning of the array again, the data would already be in the cache and nothing would be replaced, so no more cache misses would occur. To make sure every fetch misses, the array must be at least twice the cache size.
Then iterate over the array: read one int, advance 64 bytes, read from the next cache line, and repeat until all the data in the array has been read.
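The access pattern just described can be sketched for a flat buffer with the line size passed in as a parameter. This is an illustration only: `strided_walk` is a hypothetical name, and writing the sum out through a pointer is my own addition, so that an optimizing compiler is less likely to discard the loads as unused.

```c
#include <assert.h>
#include <stddef.h>

/* Read every int in buf, but in an order where consecutive reads are one
   cache line apart: first the ints at offset 0 of every line, then the ints
   at offset 1 of every line, and so on. Returns the number of reads and
   stores the sum through sum_out. */
size_t strided_walk(const int *buf, size_t buf_bytes, size_t line_bytes,
                    long *sum_out)
{
    size_t step = line_bytes / sizeof(int);   /* ints per cache line */
    size_t count = buf_bytes / sizeof(int);   /* ints in the buffer */
    size_t offset, i, reads = 0;
    long sum = 0;

    assert(buf != NULL && sum_out != NULL && step > 0);
    for (offset = 0; offset < step; offset++) {
        for (i = offset; i < count; i += step) {
            sum += buf[i];
            reads++;
        }
    }
    *sum_out = sum;
    return reads;
}
```

For a 512 KB buffer and a 64-byte line this visits the same elements, in the same order, as the two nested loops over `matrix[j][k]`.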
OProfile has a minimum count for cache miss events (2,000,000 in my version, 0.9.8), so if too few misses occur they are not recorded. For that reason the loop count is raised to 5,000,000.
4. At this point the test should produce a 100% cache miss rate, but when I ran it, the cache misses did not occur. I could not work out why until I asked my boss and was reminded that x86 processors have a stream-buffer hardware prefetcher: if your data accesses follow a very regular pattern, the prefetcher is trained to bring the data you want into the cache before you actually read it. Therefore, to run a cache miss test on a server with a Xeon processor, you must reboot the system and turn hardware prefetching off. Otherwise you have to modify the program to access the data in a truly random order, but that does not guarantee a 100% cache miss rate; it only ensures a low cache hit rate.
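One possible way to write the random-access variant is to visit the cache lines of the buffer in a shuffled order, so that the addresses no longer form a regular stream. The sketch below is an assumption of mine, not the code used in the original test: `shuffle_lines` and `random_walk` are hypothetical names, and the shuffle uses the standard `rand()` with a Fisher-Yates loop. As noted above, this lowers the hit rate but does not guarantee that every access misses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Fill order[0..n-1] with a random permutation of 0..n-1 (Fisher-Yates).
   rand() % i has a slight bias, which does not matter for this purpose. */
void shuffle_lines(size_t *order, size_t n, unsigned seed)
{
    size_t i, j, tmp;

    assert(order != NULL);
    for (i = 0; i < n; i++)
        order[i] = i;
    srand(seed);
    for (i = n; i > 1; i--) {
        j = (size_t)rand() % i;
        tmp = order[i - 1];
        order[i - 1] = order[j];
        order[j] = tmp;
    }
}

/* Read the first int of each cache line of buf, in the order given by
   order[]. Returns the sum so the loads have a visible result. */
long random_walk(const int *buf, const size_t *order, size_t n_lines,
                 size_t line_bytes)
{
    size_t step = line_bytes / sizeof(int);   /* ints per cache line */
    size_t i;
    long sum = 0;

    assert(buf != NULL && order != NULL && step > 0);
    for (i = 0; i < n_lines; i++)
        sum += buf[order[i] * step];
    return sum;
}
```

For the 512 KB array, `n_lines` would be 8192 and `line_bytes` 64. Note that the `order` array itself is still read sequentially; only the accesses to `buf` are randomized.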
5. The MCF benchmark in SPEC CPU2006 has a very high cache miss rate, so you can also use it for testing.