Testing the impact of multithreading on multi-core CPU branch prediction

Source: Internet
Author: User
Tags: Intel Core i5
Preface:

Modern CPUs are all pipelined and use branch prediction. Branch prediction accuracy can exceed 98%. However, when a prediction is wrong, the pipeline has to be flushed and the performance loss is severe.

For details about the branch prediction techniques that CPUs use, see:

The history and current state of research on processor branch prediction

Simultaneous multithreading processors

Using these features correctly makes it possible to write efficient programs.

For example, when writing an if/else statement, put the high-probability case in the if branch and the low-probability case in the else branch.
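As a minimal sketch of this ordering rule, the following hypothetical example puts the common case in the if branch and the rare case in the else branch. The class name, method name, and the 1-in-100 ratio are assumptions made for illustration; the JIT compiler may also lay out branches from its own profiling, so the source order is a hint rather than a guarantee.

```java
// Hypothetical illustration: names and the 1-in-100 ratio are assumptions.
final class LikelyBranchFirst {

    /** Returns {commonCount, rareCount} for the given values. */
    static long[] classify(int[] values) {
        long common = 0;
        long rare = 0;
        for (int v : values) {
            if (v >= 0) {
                common++;   // high-probability case goes in the if branch
            } else {
                rare++;     // low-probability case goes in the else branch
            }
        }
        return new long[] {common, rare};
    }

    public static void main(String[] args) {
        int[] values = new int[1000];
        for (int i = 0; i < values.length; i++) {
            // One value in every hundred is negative, the rest are non-negative.
            values[i] = (i % 100 == 99) ? -1 : i;
        }
        long[] counts = classify(values);
        System.out.println("common=" + counts[0] + ", rare=" + counts[1]);
    }
}
```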

However, this kind of reasoning usually assumes a single thread and may not hold with multiple threads, for example when several threads execute the same code at the same time.

Test:

The following is a multi-thread branch prediction test run on an Intel Core i5.

Test plan (the actual tests turned up more than these three cases; see the test results below):

The two threads execute the same code, and the if condition is always true.

The two threads execute the same code. In one thread the if condition is always true; in the other it alternates between true and false.

The two threads execute different code (the logic is the same, but the code is at a different location). In one thread the if condition is always true; in the other it alternates between true and false.

The code is as follows:

test1 tests the same-code case. With an even parameter the if condition is always true; with an odd parameter it alternates between true and false.

test2 tests the different-code case. With an even parameter the if condition is always true; with an odd parameter it alternates between true and false.
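To make the even and odd cases concrete, the sketch below replays the condition from the test loop (temp += x followed by temp % 2 == 0) for a few iterations and prints the sequence of outcomes. The class and method names are hypothetical and exist only for this illustration.

```java
// Hypothetical helper: replays the branch condition used in the test loop.
final class BranchPattern {

    /** Returns the first n outcomes of (temp % 2 == 0) when temp grows by step each iteration. */
    static boolean[] outcomes(int step, int n) {
        boolean[] result = new boolean[n];
        int temp = 0;
        for (int i = 0; i < n; i++) {
            temp += step;
            result[i] = (temp % 2 == 0);
        }
        return result;
    }

    /** Renders outcomes as a string of T (true) and F (false). */
    static String render(boolean[] outcomes) {
        StringBuilder sb = new StringBuilder();
        for (boolean b : outcomes) {
            sb.append(b ? 'T' : 'F');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println("step 2: " + render(outcomes(2, 8))); // TTTTTTTT
        System.out.println("step 3: " + render(outcomes(3, 8))); // FTFTFTFT
    }
}
```

With an even step the sum stays even, so the condition never changes; with an odd step the sum flips between odd and even on every iteration.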

import java.util.concurrent.CountDownLatch;

public class Test {
    public static int loop = 1000000000;
    public static int sum = 0;
    public static CountDownLatch startGate;
    public static CountDownLatch endGate;

    // Two threads run the same code (T1.run) with parameters x1 and x2.
    public static void test1(int x1, int x2) throws InterruptedException {
        startGate = new CountDownLatch(1);
        endGate = new CountDownLatch(2);
        new Thread(new T1(x1)).start();
        new Thread(new T1(x2)).start();
        Test.startGate.countDown();   // release both threads together
        Test.endGate.await();         // wait until both have finished
    }

    // Two threads run different code (T1.run and T2.run) with the same logic.
    public static void test2(int x1, int x2) throws InterruptedException {
        startGate = new CountDownLatch(1);
        endGate = new CountDownLatch(2);
        new Thread(new T1(x1)).start();
        new Thread(new T2(x2)).start();
        Test.startGate.countDown();
        Test.endGate.await();
    }
}

class T1 implements Runnable {
    int xxx = 0;

    public T1(int xxx) {
        this.xxx = xxx;
    }

    @Override
    public void run() {
        try {
            int sum = 0;
            int temp = 0;
            Test.startGate.await();
            long start = System.nanoTime();
            for (int i = 0; i < Test.loop; ++i) {
                temp += xxx;
                // Even xxx: always true. Odd xxx: alternates between false and true.
                if (temp % 2 == 0) {
                    sum += 100;
                } else {
                    sum += 200;
                }
            }
            // Keeps the loop result live. The update is unsynchronized, but the value is never read.
            Test.sum += sum;
            long end = System.nanoTime();
            System.out.format("%s, T1(%d): %d\n", Thread.currentThread().getName(), xxx, end - start);
        } catch (InterruptedException e) {
            e.printStackTrace();
        } finally {
            Test.endGate.countDown();
        }
    }
}

// Same logic as T1, duplicated so that the branch sits at a different code location.
class T2 implements Runnable {
    int xxx = 0;

    public T2(int xxx) {
        this.xxx = xxx;
    }

    @Override
    public void run() {
        try {
            int sum = 0;
            int temp = 0;
            Test.startGate.await();
            long start = System.nanoTime();
            for (int i = 0; i < Test.loop; ++i) {
                temp += xxx;
                if (temp % 2 == 0) {
                    sum += 100;
                } else {
                    sum += 200;
                }
            }
            Test.sum += sum;
            long end = System.nanoTime();
            System.out.format("%s, T2(%d): %d\n", Thread.currentThread().getName(), xxx, end - start);
        } catch (InterruptedException e) {
            e.printStackTrace();
        } finally {
            Test.endGate.countDown();
        }
    }
}

Because there are many test cases, the results are reported in an abbreviated form, described here:

Each test1 call yields two results. For example, test1(2, 3) with results 2.1 s and 2.2 s means that the two threads executed the same code, with the if condition always true in one thread (2 is even) and alternating between true and false in the other (3 is odd). The first thread took 2.1 s on average and the second thread took 2.2 s on average.

The main function under test has two for loops of 10 iterations each. The test results are written in abbreviated form, for example:

One row of data is the result of one run of the main function. The times after each call are rough averages.

test1(2, 3) 2.1 s 2.1 s   test1(2, 4) 2.1 s 2.1 s
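The rough averages can be derived from the nanosecond values that each thread prints. The sketch below shows one way to do it; the class name, method name, and sample values are made up for illustration and are not measurements from this test.

```java
import java.util.Locale;

// Hypothetical helper: averages per-run elapsed times for one thread.
final class AverageTime {

    /** Returns the mean of the elapsed times, converted from nanoseconds to seconds. */
    static double averageSeconds(long[] elapsedNanos) {
        long total = 0;
        for (long ns : elapsedNanos) {
            total += ns;
        }
        return (total / (double) elapsedNanos.length) / 1_000_000_000.0;
    }

    public static void main(String[] args) {
        // Made-up sample values, standing in for the numbers printed by one thread over several runs.
        long[] runs = {2_050_000_000L, 2_150_000_000L, 2_100_000_000L};
        System.out.println(String.format(Locale.ROOT, "%.1f s", averageSeconds(runs)));
    }
}
```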

Main function:

// Belongs inside class Test.
public static void main(String[] args) throws InterruptedException {
    for (int i = 0; i < 10; ++i) {
        test1(2, 3);
    }
    System.out.println("!!!!!!!!!!!!!!!!!!!!");
    for (int i = 0; i < 10; ++i) {
        test1(2, 4);
    }
}

The test results are as follows:

Row  First loop    Thread 1  Thread 2   Second loop   Thread 1  Thread 2
1    test1(2, 3)   2.0 s     2.0 s      test1(2, 4)   1.8 s     2.0 s
2    test1(2, 4)   1.3 s     1.3 s      test1(2, 3)   1.3 s     1.8 s
3
4    test2(2, 3)   1.3 s     1.7 s      test2(2, 4)   1.3 s     1.9 s
5    test2(2, 4)   1.3 s     1.3 s      test2(2, 3)   1.3 s     1.8 s

First, look at row 1. The test1(2, 3) result is the worst. The apparent reason is that the two threads execute the same code and their branch prediction results interfere with each other, so the predictions keep going wrong.
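The interference idea can be illustrated with a deliberately simple toy model. It is not a description of the i5's real predictor, whose design is not given here. The sketch assumes a one-bit "last outcome" predictor, a single predictor entry that both threads update, and a strict one-for-one interleaving of the two threads; all three are assumptions, and the names are hypothetical.

```java
// Toy model only: a one-bit predictor and strict interleaving are assumptions, not i5 behavior.
final class SharedPredictorToy {

    /** One-bit "last outcome" predictor: predicts that the branch repeats its previous outcome. */
    static final class OneBit {
        boolean last = true;

        /** Returns true if the prediction matched, then records the actual outcome. */
        boolean predictAndUpdate(boolean actual) {
            boolean correct = (last == actual);
            last = actual;
            return correct;
        }
    }

    /**
     * Counts mispredictions seen by the always-true thread (parameter 2) over n iterations.
     * When shared is true, the alternating thread (parameter 3) updates the same predictor entry.
     */
    static int missesOfAlwaysTrueThread(int n, boolean shared) {
        OneBit a = new OneBit();
        OneBit b = shared ? a : new OneBit();
        int misses = 0;
        int temp = 0;
        for (int i = 0; i < n; i++) {
            if (!a.predictAndUpdate(true)) {
                misses++;                           // branch of the thread with parameter 2
            }
            temp += 3;
            b.predictAndUpdate(temp % 2 == 0);      // branch of the thread with parameter 3
        }
        return misses;
    }

    public static void main(String[] args) {
        System.out.println("private entry: " + missesOfAlwaysTrueThread(1000, false));
        System.out.println("shared entry:  " + missesOfAlwaysTrueThread(1000, true));
    }
}
```

In this model the always-true thread is predicted perfectly when it has its own entry, and starts mispredicting once the alternating thread's outcomes are mixed into the same entry.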

But why is the test1(2, 4) result that follows it also poor? The two threads execute the same code and the if condition is always true in both, so why does it take so long?

Simple speculation 1: the test1(2, 3) runs may affect the branch prediction of the test1(2, 4) runs. The branch predictor keeps a history table, so the history of previously executed branches affects later predictions.

Next, look at row 2. test1(2, 4) clearly gives the best result: the two threads execute the same code and the if condition is always true. Now look at the test1(2, 3) result in row 2, which is better than the test1(2, 3) result in row 1. Why does it differ so much from the row 1 data?

Simple speculation 2: different cores each have their own branch predictor.

Now look at the results of the test2 function. In row 4, the test2(2, 3) results are as expected, but the test2(2, 4) results are clearly not ideal. Why do the two times differ when the two threads execute code at different locations and the if condition is always true?

Simple speculation 3: this is the influence of the preceding test2(2, 3) runs. Combined with speculations 1 and 2, this explains why the first test2(2, 4) time is 1.3 s and the second is 1.9 s.

Finally, the row 5 data is exactly as expected.

Summary:

Speculation: the i5's branch predictor is a hybrid predictor. Each core has its own history table and its own branch predictor.

Branch prediction under multithreading does not look promising. Where possible, avoid having multiple threads execute the same piece of code when the outcome of the branch condition keeps changing.
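One possible way to follow this advice, which was not measured in the tests above, is to remove the data-dependent branch from the hot loop altogether. The sketch below compares the branch used in the test loop with a branch-free arithmetic form that gives the same sum. The class and method names are hypothetical, and whether the branch-free form is actually faster depends on the JIT compiler and the CPU.

```java
// Hypothetical sketch: a branch-free equivalent of the if/else in the test loop.
final class BranchFreeSum {

    /** Same loop body as the test code: adds 100 when temp is even, 200 when it is odd. */
    static int withBranch(int step, int iterations) {
        int sum = 0;
        int temp = 0;
        for (int i = 0; i < iterations; ++i) {
            temp += step;
            if (temp % 2 == 0) {
                sum += 100;
            } else {
                sum += 200;
            }
        }
        return sum;
    }

    /** Branch-free form: (temp & 1) is 0 for even values and 1 for odd values, including negatives. */
    static int branchFree(int step, int iterations) {
        int sum = 0;
        int temp = 0;
        for (int i = 0; i < iterations; ++i) {
            temp += step;
            sum += 100 + 100 * (temp & 1);
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(withBranch(3, 1000) == branchFree(3, 1000));
    }
}
```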
