Branch prediction
Branch prediction is an advanced technique, in use since the P5 (Pentium) era, for dealing with pipeline stalls caused by conditional branches such as if-then-else. If the CPU can predict which direction a branch will take, it can speed up execution.
A pipelined processor runs into a problem when it handles branch instructions. Depending on whether the condition is true or false, control may transfer elsewhere, and this interrupts the flow of instructions through the pipeline, because the processor cannot know which instruction comes next until the branch has finished executing. The longer the pipeline, the longer the processor waits, because the next instruction cannot enter the pipeline until the branch instruction has been resolved.
Branch prediction technology emerged to solve this problem.
Branch prediction falls into two categories: static branch prediction, made at compile time, and dynamic branch prediction, made by the hardware at run time.
Static branch prediction
The simplest static branch prediction method is to always choose one of the two branches. With this approach, the average hit rate is about 50%. A more accurate method is to collect statistics from earlier runs of the program and use them to predict whether each branch will be taken.
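The statistics-based approach can be sketched roughly as follows. This is only an illustration: the function names and the sample profile are hypothetical, and it assumes the earlier run recorded one taken/not-taken outcome per execution of a single branch.

```python
def static_prediction_from_profile(outcomes):
    """Choose one fixed prediction for a branch from recorded outcomes.

    outcomes: list of booleans from an earlier run, True = branch taken.
    Returns True to predict "taken", False to predict "not taken".
    """
    taken = sum(outcomes)
    # Predict the direction seen at least half of the time.
    return taken * 2 >= len(outcomes)


def hit_rate(prediction, outcomes):
    """Fraction of recorded outcomes that match a fixed prediction."""
    hits = sum(1 for outcome in outcomes if outcome == prediction)
    return hits / len(outcomes)


# Hypothetical profile: a branch taken 9 times, then not taken once.
profile = [True] * 9 + [False]
prediction = static_prediction_from_profile(profile)
rate = hit_rate(prediction, profile)
```

The prediction is fixed once and never changes at run time, which is what distinguishes it from the hardware-based schemes.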
The effectiveness of any branch prediction policy depends on the accuracy of the policy and on how frequently conditional branches occur.
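A rough way to see how the two factors combine is sketched below. It assumes a simple cost model that the text does not state: every mispredicted branch costs a fixed number of stall cycles and nothing else is affected. The function name, parameters, and numbers are illustrative only.

```python
def branch_stall_cycles_per_instruction(branch_fraction, accuracy, penalty_cycles):
    """Average stall cycles added per instruction by mispredicted branches.

    Assumed model (not from the text): each misprediction costs a fixed
    penalty_cycles, and correctly predicted branches cost nothing extra.
    """
    mispredict_rate = 1.0 - accuracy
    return branch_fraction * mispredict_rate * penalty_cycles


# Hypothetical numbers: one branch in seven instructions, 3-cycle penalty.
coin_flip = branch_stall_cycles_per_instruction(1 / 7, 0.50, 3)
accurate = branch_stall_cycles_per_instruction(1 / 7, 0.98, 3)
```

Under this model the cost scales with both the branch frequency and the misprediction rate, so a more accurate policy matters most in branch-heavy code.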
Dynamic branch prediction
Dynamic branch prediction is a more recent technique adopted by processors. The simplest dynamic branch prediction strategy is the branch prediction buffer, also called the branch history table.
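A branch history table can be sketched roughly as follows. The details here are assumptions rather than anything stated in the text: the table is indexed by the low bits of the branch address, each entry holds a single bit recording the last outcome, and entries start as "not taken". The class and method names are hypothetical.

```python
class BranchHistoryTable:
    """One-bit branch history table (illustrative sketch).

    Assumptions: indexed by the low bits of the branch address, one bit
    per entry (the last outcome), all entries initially "not taken".
    """

    def __init__(self, entries=16):
        self.entries = entries
        self.table = [False] * entries

    def _index(self, branch_address):
        # Different branches can share an entry when their low bits match.
        return branch_address % self.entries

    def predict(self, branch_address):
        """Return True to predict "taken", False for "not taken"."""
        return self.table[self._index(branch_address)]

    def update(self, branch_address, taken):
        """Record the actual outcome after the branch resolves."""
        self.table[self._index(branch_address)] = taken


bht = BranchHistoryTable()
first = bht.predict(0x40)    # no history yet: predicts not taken
bht.update(0x40, True)       # the branch was actually taken
second = bht.predict(0x40)   # now predicts taken
```

Because the table is small and indexed by partial address bits, two branches that map to the same entry overwrite each other's history, which is one reason such a simple scheme can mispredict.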
1. Branch instruction prediction
Programs generally contain branch instructions. Statistically, about one instruction in seven is a branch, and a pipelined processor is quite sensitive to them. Suppose that in the 80486 instruction pipeline the first instruction has entered the decode stage and the second has entered the fetch stage (ready to enter the decoder). If the first instruction is a branch (for example, a jump to some address), the instructions that follow it in the prefetch queue are invalid. Once the branch target address has been formed, during execution of the first instruction, the processor must fetch the instruction at the target address and send it for execution. At the same time, the prefetch queue must be cleared immediately and then refilled with the instructions that follow the target address. In other words, each time such a branch is executed the whole instruction pipeline is disrupted, and it only returns to normal operation later. This clearly reduces the speed of the machine. For this reason, the Pentium processor uses a branch target buffer (BTB) to predict branch instructions.
The BTB is essentially a small address store that holds a number of entries (usually 256 or 512). When a branch instruction causes the program to branch, the BTB records the target address of that instruction. The next time the same instruction is about to branch, this information is used to predict the path, and prefetching starts from the recorded target in advance. Next, consider how the BTB behaves in a loop. Loops are widely used in programming, and at the machine-instruction level a loop must be built with transfer instructions (conditional or unconditional). See the following example:
        mov  cx, 100
Loop:   ......
        ......
        dec  cx
        jnz  Loop
        ......
When the jnz instruction is executed for the first time, the predicted target is an address left in the BTB by an earlier branch instruction, not Loop, so the prediction is wrong. After execution, however, the target address Loop is saved in the BTB. The next time jnz is executed, the prediction based on the BTB contents, a jump to Loop, is correct, and it stays correct until CX reaches 0. When CX becomes 0, jnz does not jump because the condition is false, but the prediction is still Loop and prefetching still follows the prediction. This is the second misprediction. In this example, then, 98 of the 100 predictions are correct; that is, prefetching guided by the prediction is right 98 times. Likewise, for a loop of 1000 iterations, 998 prefetches are correct. The more iterations a loop has, the greater the benefit of the BTB.
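The counting above can be checked with a small simulation. This is a sketch under simplifying assumptions: the BTB is reduced to a single flag for the one jnz in the loop, a missing entry is treated as a prediction of "fall through" (standing in for the wrong first prediction described above), and once an entry exists it always predicts a jump back to Loop. The function name is hypothetical.

```python
def simulate_loop_with_btb(iterations):
    """Count correct predictions for the jnz that closes a counted loop.

    Assumptions for this sketch: with no BTB entry the predictor expects
    fall-through; once an entry is written it always predicts a jump
    back to Loop.
    """
    btb_has_entry = False
    correct = 0
    cx = iterations                      # mov cx, iterations
    while True:
        cx -= 1                          # dec cx
        taken = cx != 0                  # jnz Loop jumps while cx != 0
        predicted_taken = btb_has_entry
        if predicted_taken == taken:
            correct += 1
        if not taken:
            break                        # loop exits, fall through
        btb_has_entry = True             # BTB records the target Loop
    return correct


hits_100 = simulate_loop_with_btb(100)
hits_1000 = simulate_loop_with_btb(1000)
```

In this model exactly two executions of jnz are mispredicted regardless of the loop count (for loops of two or more iterations), so the fraction of correct predictions grows with the number of iterations.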
2. Speculative execution and dynamic branch prediction
Speculative execution is also called predictive execution. The basic idea is that, during the instruction fetch stage, the most likely location of the next instruction is determined in advance within a local range. In other words, the fetch unit carries out a limited amount of execution in order to obtain a branch prediction, so that instructions are fetched in the order in which the code will execute rather than in the order in which they are stored in memory.
Dynamic branch prediction is one specific form of speculative execution, and it is defined in contrast to static branch prediction. In static branch prediction (as in the Pentium processor), the target address information in the BTB is used to predict the target of a branch instruction at the point where the instruction is sent to the decoder. In dynamic branch prediction, the prediction takes place before decoding. The instruction buffer plays much the same role as the instruction prefetch queue of the 8086 and 80386, although it is not identical. For the instructions in this buffer that have not yet entered the decoder, the start and end of each instruction are marked and predictions are made from the information in the BTB, so branch instructions are discovered earlier. As a result, when a dynamic prediction turns out to be wrong, fewer instructions have to be cleared from the pipeline than with static branch prediction, which improves CPU efficiency.