About machine instructions and micro-operations

Source: Internet
Author: User

Recently I have been reading "Computer Systems: A Programmer's Perspective", which gave me a much better understanding of the CPU.

As we all know, programs written in high-level languages are compiled into executables. The files that store these executables contain machine code that the hardware can execute directly, and each unit of that code is called a machine instruction. At this level we often think we have reached the so-called "bottom layer".

Some time ago I asked a question on a forum: code compiled by the Intel compiler is optimized for Intel processors, and its execution efficiency on AMD processors is only average. Why does this happen? After all, the machine instruction sequence is the same, and according to the figures published by the hardware vendors, the number of clock cycles per machine instruction differs only slightly.

Now I think I can try to answer this question. From the programmer's point of view, each machine instruction is an atomic operation. To achieve higher throughput, however, the hardware splits each instruction into independent steps called micro-operations (μ-ops). These micro-operations are executed in sequence by the different stages of a pipeline. As soon as the first stage of the pipeline is free, it can start on the micro-operations of the next machine instruction. In this way many machine instructions are in flight at the same time, which increases throughput.
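The overlap described above can be sketched as an occupancy table. This is a minimal illustration, assuming an idealized pipeline in which every stage takes exactly one clock and there are no stalls; the five stage names and the function name pipeline_diagram are hypothetical and do not describe any real Intel or AMD design.

```python
def pipeline_diagram(n_instructions, stage_names):
    """Return one row per clock cycle; each row lists which instruction
    occupies each stage (idealized: one clock per stage, no stalls)."""
    depth = len(stage_names)
    total_cycles = n_instructions + depth - 1
    rows = []
    for cycle in range(total_cycles):
        row = []
        for stage in range(depth):
            instr = cycle - stage  # instruction sitting in this stage now
            row.append(f"I{instr}" if 0 <= instr < n_instructions else "--")
        rows.append(row)
    return rows


if __name__ == "__main__":
    stages = ["F", "D", "E", "M", "W"]  # hypothetical stage names
    print("cycle  " + "  ".join(f"{s:>2}" for s in stages))
    for cycle, row in enumerate(pipeline_diagram(4, stages)):
        print(f"{cycle:>5}  " + "  ".join(f"{x:>2}" for x in row))
```

In cycle 3 of this sketch, four instructions occupy four different stages at once, which is the simultaneous execution the paragraph refers to.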

Of course, throughput cannot be raised simply by adding more pipeline stages. There is a latency between one stage and the next, and all stages share one clock signal, so the clock period is set by the stage with the longest execution time. In addition, many operations cannot be cut into arbitrarily many micro-operations. Below is the pipeline of a 4-core processor:

[Figure: pipeline of a 4-core processor]
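The clock constraint can be made concrete with a small calculation. This is a rough sketch, assuming the clock period equals the slowest stage delay plus a fixed per-stage register latency; all delay values in picoseconds are invented for illustration and are not measurements of any real processor.

```python
def clock_period_ps(stage_delays_ps, register_delay_ps):
    """The shared clock must be slow enough for the slowest stage,
    plus the latency of the register between stages."""
    return max(stage_delays_ps) + register_delay_ps


def throughput_gips(period_ps):
    """Instructions per nanosecond (GIPS) when one instruction
    completes every clock period."""
    return 1000.0 / period_ps


if __name__ == "__main__":
    register_delay = 20  # hypothetical per-stage latency, in ps
    designs = {
        "unpipelined (300)": [300],
        "3 even stages": [100, 100, 100],
        "3 uneven stages": [50, 150, 100],
        "6 even stages": [50] * 6,
    }
    for name, delays in designs.items():
        period = clock_period_ps(delays, register_delay)
        latency = period * len(delays)
        print(f"{name:<18} period={period:>3} ps  "
              f"latency={latency:>3} ps  "
              f"throughput={throughput_gips(period):.2f} GIPS")
```

With these assumed numbers, the uneven three-stage split is limited by its 150 ps stage, and doubling the stage count from three to six less than doubles throughput because the fixed 20 ps per stage becomes a larger share of each clock.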

So my earlier belief that an operation such as mov reg consumes only one clock was incorrect. The correct statement is that, on average, one such operation completes per clock. Each individual instruction still has to pass through every stage of the pipeline, and each stage takes at least one clock. The execution of a machine instruction includes fetch, decode, execute, memory access, write-back, and PC update. A micro-operation can be regarded as the result of decoding, and it can be cached in a buffer, which saves the decoding time the next time the same instruction is run.
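The difference between the time one instruction takes and the average time per instruction can be checked with simple arithmetic. This is a minimal sketch, assuming an ideal pipeline with one clock per stage and no stalls; the depth of 5 is only an example value.

```python
def total_cycles(n_instructions, depth):
    """Cycles to finish n instructions on an ideal pipeline: the first
    instruction needs `depth` cycles, each later one finishes 1 cycle after."""
    return depth + n_instructions - 1


def average_cpi(n_instructions, depth):
    """Average cycles per instruction over the whole run."""
    return total_cycles(n_instructions, depth) / n_instructions


if __name__ == "__main__":
    depth = 5  # example pipeline depth (an assumption)
    for n in (1, 10, 1000):
        print(f"n={n:>4}  total={total_cycles(n, depth):>4}  "
              f"average CPI={average_cpi(n, depth):.3f}")
```

A single instruction still takes all 5 cycles in this model, while the average over a long run approaches 1 cycle per instruction.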

Another problem arises when adjacent machine instructions depend on each other. If the next instruction enters the pipeline before the previous one has left it, the next instruction may read a wrong value. This can be solved by forwarding (bypassing) the intermediate value from a later pipeline stage to the stage that needs it, or by stalling. Stalling is wasteful, however, because part of the pipeline sits idle. Another method is out-of-order execution: the processor moves later instructions that have no dependency ahead of the waiting one, so that the pipeline stays as full as possible.
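The effect of stalling, forwarding, and reordering can be compared in a toy model. This is only a sketch under stated assumptions: a single-issue, in-order pipeline; a hypothetical penalty of 3 extra cycles between a producer and its consumer without forwarding, and 0 with forwarding; register names and the Instr type are invented for the example.

```python
from typing import NamedTuple, Tuple


class Instr(NamedTuple):
    dest: str                   # register written by the instruction
    srcs: Tuple[str, ...] = ()  # registers read by the instruction


def issue_cycles(program, penalty):
    """Cycle in which each instruction enters the pipeline.
    `penalty` = extra cycles a consumer must wait after its producer
    (assumed: 3 without forwarding, 0 with forwarding)."""
    last_write = {}  # register -> issue cycle of its most recent writer
    cycles = []
    next_free = 0
    for ins in program:
        ready = next_free
        for reg in ins.srcs:
            if reg in last_write:
                ready = max(ready, last_write[reg] + 1 + penalty)
        cycles.append(ready)
        last_write[ins.dest] = ready
        next_free = ready + 1
    return cycles


def stall_cycles(program, penalty):
    """Cycles lost compared with issuing one instruction per cycle."""
    return issue_cycles(program, penalty)[-1] - (len(program) - 1)


if __name__ == "__main__":
    original = [Instr("r1"), Instr("r2", ("r1",)),
                Instr("r3"), Instr("r4"), Instr("r5")]
    # Same work, but the independent instructions are moved in between.
    reordered = [original[0], original[2], original[3],
                 original[4], original[1]]
    print("stall only:     ", stall_cycles(original, 3))
    print("with forwarding:", stall_cycles(original, 0))
    print("reordered:      ", stall_cycles(reordered, 3))
```

In this model the reordered sequence loses no cycles even without forwarding, because the three independent instructions fill the slots that would otherwise be bubbles.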

Therefore we can imagine that the micro-operation layers of Intel and AMD processors are quite different. This affects how machine instructions should be ordered (the CPU does decide at run time whether instructions can be executed out of order, but by then it is late, and it is better if the compiler has already removed as many dependencies as possible) and how registers should be used (sometimes the same function can be implemented with different machine code). This is why the Intel compiler can produce more efficient code for Intel's own CPUs.

For example, AMD's documentation advises compiler developers that when a jump leads directly to a ret instruction, it is best to insert a rep in front of the ret, which improves CPU efficiency without introducing errors.

    ......
    je  .L33
    rep
    ret

Additional reading:

A journey through the CPU pipeline (translated version; English original: Journey Through the CPU Pipeline)

References

Computer Systems: A Programmer's Perspective

http://en.wikipedia.org/wiki/Pipeline_(computing)

http://en.wikipedia.org/wiki/Micro-operation
