Using the STREAM Benchmark

Source: Internet
Author: User

STREAM is one of the most widely used benchmarks for measuring memory bandwidth. As the number of processor cores grows, memory bandwidth becomes increasingly important to the performance of the whole system. If a system cannot move data from memory to the processor quickly enough, cores sit idle waiting for data, and that idle time not only lowers system efficiency but also cancels out the gains from multiple cores and high clock speeds. STREAM has good spatial locality, which makes it friendly to both the TLB and the cache. It supports four operations: Copy, Scale, Add, and Triad.

Compile and use stream:

Download the C source file stream.c (http://www.streambench.org/) and compile it in a terminal with gcc stream.c

Important parameters (set at compile time):

1. STREAM_ARRAY_SIZE: sets the array size, counted in elements. The example below sets it to 100M (100,000,000) elements. Choose a size that suits your system; 10M may be all you need:

gcc -O -DSTREAM_ARRAY_SIZE=100000000 stream.c -o stream.100M

2. NTIMES: sets how many times each kernel is run; the best result is reported. The example here uses 7 runs.

It can be set with -DNTIMES=7.

3. OFFSET: set with -DOFFSET; it may change the relative alignment of the arrays.

4. Multi-core OpenMP support is enabled with -O -fopenmp
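
To judge whether a given STREAM_ARRAY_SIZE is appropriate, it helps to work out how much memory the benchmark will occupy. The sketch below assumes the default element type in stream.c (double, 8 bytes) and three arrays of STREAM_ARRAY_SIZE + OFFSET elements each. The function names are hypothetical and not part of stream.c, and the factor of four in the cache check follows the sizing guidance in the stream.c source comments, so treat it as a rule of thumb rather than a hard requirement.

```c
#include <assert.h>
#include <stddef.h>

/* Total bytes occupied by the three STREAM arrays (a, b, c).
 * Assumes each array holds array_size + offset elements of elem_size bytes. */
double stream_total_bytes(size_t array_size, size_t offset, size_t elem_size)
{
    assert(elem_size > 0);
    return 3.0 * (double)elem_size * ((double)array_size + (double)offset);
}

/* The same figure expressed in MiB (1 MiB = 1048576 bytes). */
double stream_total_mib(size_t array_size, size_t offset, size_t elem_size)
{
    return stream_total_bytes(array_size, offset, elem_size) / (1024.0 * 1024.0);
}

/* Returns 1 if a single array is at least four times the given cache size,
 * so that the kernels exercise main memory rather than the cache.
 * The factor of four is an assumption taken from the stream.c guidance. */
int stream_array_exceeds_cache(size_t array_size, size_t elem_size,
                               size_t cache_bytes)
{
    return (double)array_size * (double)elem_size >= 4.0 * (double)cache_bytes;
}
```

For example, -DSTREAM_ARRAY_SIZE=100000000 with 8-byte elements gives stream_total_bytes(100000000, 0, 8) = 2.4e9 bytes, roughly 2289 MiB, so the machine needs well over 2 GiB of free memory for that setting.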

 

Complete example:

gcc -O -fopenmp -DSTREAM_ARRAY_SIZE=100000000 -DNTIMES=20 stream.c -o stream.o
For more information about other parameters, see http://www.cs.virginia.edu/stream/ref.html.

After compiling the benchmark with the command above, run it with ./stream.o. The program prints the results of the memory test:

In the Function column of the output, Copy, Scale, Add, and Triad denote the following four operations:

The copy operation is the easiest. It first accesses one memory unit to read the value and then writes the value to another memory unit.

The scale operation first reads the value from the memory unit, performs a multiplication operation, and then writes the result to another memory unit.

The add operation first reads two values from memory, adds them, and then writes the result to another memory unit.

Triad means a combination of three: in this test it combines the copy, scale, and add operations. Specifically, it reads two values a and b from memory, performs a combined multiply-add on them (a + factor * b), and writes the result to another memory unit.
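
The four operations described above can be sketched as plain C loops. This is a minimal illustration, not the code from stream.c: the function names are hypothetical, the array roles (a, b, c) follow the conventional STREAM kernel definitions, and the OpenMP pragmas that stream.c can use for multi-core runs are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Copy: one read and one write per element. */
void stream_copy(double *c, const double *a, size_t n)
{
    assert(n == 0 || (c && a));
    for (size_t j = 0; j < n; j++)
        c[j] = a[j];
}

/* Scale: one read, one multiplication, one write per element. */
void stream_scale(double *b, const double *c, double scalar, size_t n)
{
    assert(n == 0 || (b && c));
    for (size_t j = 0; j < n; j++)
        b[j] = scalar * c[j];
}

/* Add: two reads, one addition, one write per element. */
void stream_add(double *c, const double *a, const double *b, size_t n)
{
    assert(n == 0 || (c && a && b));
    for (size_t j = 0; j < n; j++)
        c[j] = a[j] + b[j];
}

/* Triad: two reads, one multiply-add, one write per element. */
void stream_triad(double *a, const double *b, const double *c,
                  double scalar, size_t n)
{
    assert(n == 0 || (a && b && c));
    for (size_t j = 0; j < n; j++)
        a[j] = b[j] + scalar * c[j];
}
```

Each loop touches every element exactly once with unit stride, which is where the good spatial locality of the benchmark comes from.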

The Best Rate column shows the bandwidth achieved between memory and the CPU for each operation:

Add > Triad > Copy > Scale. Why? Each add operation accesses memory three times (two reads and one write), as does each triad operation, while copy and scale each access memory twice. Within a single operation, the more memory accesses there are, the better memory access latency is hidden and the higher the bandwidth. Likewise, the more complex the computation, the longer the operation takes to complete and the lower the bandwidth. The add operation is simple and has many accesses, so it achieves the highest bandwidth. The scale operation is complex and has few accesses, so it achieves the lowest. The copy operation is simple but has few accesses; the triad operation is complex but has many. Triad's bandwidth comes out slightly higher than copy's, possibly because on a multi-core system memory access time is much longer than computation time.
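
The access counts above also determine how a rate is derived from a timing. The sketch below assumes the convention used by stream.c as commonly described: bytes moved = accesses per element × element size × array size, the first iteration is discarded, and the rate is reported in MB/s with 1 MB = 10^6 bytes. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes moved by one run of a kernel.
 * accesses_per_elem is 2 for copy and scale, 3 for add and triad. */
double bytes_moved(size_t array_size, size_t elem_size,
                   unsigned accesses_per_elem)
{
    return (double)accesses_per_elem * (double)elem_size * (double)array_size;
}

/* Smallest time among the runs, skipping the first one (index 0),
 * which is treated as a warm-up. Requires at least two runs. */
double best_time(const double *times, size_t ntimes)
{
    assert(times && ntimes >= 2);
    double best = times[1];
    for (size_t k = 2; k < ntimes; k++) {
        if (times[k] < best)
            best = times[k];
    }
    return best;
}

/* Rate in MB/s (1 MB = 10^6 bytes) for the best run. */
double best_rate_mbs(double bytes, double best_seconds)
{
    assert(best_seconds > 0.0);
    return 1.0e-6 * bytes / best_seconds;
}
```

With STREAM_ARRAY_SIZE=10000000, 8-byte elements, and a best add time of 0.125 s, bytes_moved returns 2.4e8 bytes and best_rate_mbs returns about 1920 MB/s. Because add and triad move 1.5 times as many bytes per element as copy and scale, they can report a higher rate even when a run takes longer.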

 

References:

http://www.cs.virginia.edu/stream/ref.html

Source code: http://www.cs.virginia.edu/stream/FTP/Code/

Official Website: http://www.streambench.org/

I would also like to thank the other three for their research.
