[Compilation and C-language relations] 5. Volatile qualifier

Source: Internet
Author: User
Tags volatile

Now let us look at what compiler optimization does to the generated instructions, and then introduce C's volatile qualifier. First consider the following C program:

/* artificial device registers */
unsigned char recv;
unsigned char send;

/* memory buffer */
unsigned char buf[3];

int main(void)
{
    buf[0] = recv;
    buf[1] = recv;
    buf[2] = recv;
    send = ~buf[0];
    send = ~buf[1];
    send = ~buf[2];

    return 0;
}

We use two global variables, recv and send, to simulate device registers. Suppose a platform uses memory-mapped I/O, with the serial-port send register and receive register located at fixed memory addresses. The variables recv and send are global variables that also have fixed memory addresses, so in this example we treat them as the serial-port receive register and send register. The main function first receives three bytes from the serial port into buf, then inverts each of the three bytes and sends them out through the serial port in order. Here is the disassembly of this code:

  

The movz instruction stores a shorter value into a longer storage unit and pads the high-order bits with 0. The instruction takes the suffixes b (byte), w (word), and l (long), denoting one, two, and four bytes respectively; movzbl combines a byte-sized source with a four-byte destination. For example, movzbl 0x804a019,%eax loads the single byte at address 0x804a019 into the eax register; since eax is four bytes wide, the upper three bytes are filled with 0. In the next instruction, mov %al,0x804a01a, the al register is the low byte of eax, and that byte is stored into the byte at address 0x804a01a. The low 8 bits, the next 8 bits, the low 16 bits, and the full 32 bits of an x86 register can each be accessed under a different name. Taking eax as an example, al is bits 7 to 0, ah is bits 15 to 8, ax is bits 15 to 0, and eax is the full 32 bits.

  

But if we compile with the optimization option -O, the disassembly is different:

  

The first three statements are meant to receive three bytes from the serial port, but the compiled instructions clearly do not match our intent, which is that the contents of a device register are re-read from its memory address every time they are needed. Only the first statement reads a byte from address 0x804a019 into eax and then stores al to buf[0]. The next two statements no longer read from 0x804a019; they store the value already in al directly to buf[1] and buf[2]. The last three statements are meant to invert the three bytes in buf and send them to the serial port, and here too the generated instructions do not match our intent: only the last statement writes the value in eax to address 0x804a018, while the first two generate no instructions at all.

Why does the optimized code come out wrong? Because the compiler does not know that 0x804a018 and 0x804a019 are the addresses of device registers, and it treats them as ordinary memory. An ordinary memory location does not change unless the program writes to it, so the compiler can load its value into a register once and read the register each time the value is needed. This is more efficient, since reading a register is much faster than reading memory. Likewise, if an ordinary memory location is written three times in a row, only the last value remains in it, so the first two writes are redundant and can be optimized away. Applied to device registers, these optimizations produce wrong code, because device registers typically have the following characteristics:

    • The data in a device register can change by itself without the program writing to it, so each read may return a different value.
    • Writing to a device register several times in a row is not redundant work; each write has its own meaning.

Code compiled with optimization is significantly more efficient, but used improperly it produces wrong results. To keep the compiler from being too clever and optimizing what should not be optimized, the programmer must tell it explicitly which memory accesses are off limits. In C this is done by qualifying the variable with volatile. The qualifier tells the compiler that, even when an optimization option is specified, every read of the variable must load it from memory and every write must store it back to memory, with no step omitted. We change the first few lines of the code to:

  

/* artificial device registers */
volatile unsigned char recv;
volatile unsigned char send;

Then compile again with the optimization option -O and look at the disassembly:

  

GCC's optimization options include -O0, -O, -O1, -O2, -O3, and -Os. -O0 means no optimization and is the default. -O1, -O2, and -O3 each optimize more aggressively than the one before and take longer to compile. -O is the same as -O1. -Os optimizes to reduce the size of the generated code.

The volatile qualifier prevents the compiler from optimizing accesses to device registers, but on a platform with a cache it does nothing to stop the cache from optimizing those accesses. For ordinary memory the cache is transparent to the programmer. When an instruction such as movzbl 0x804a019,%eax executes, we do not know whether the value in eax really came from memory address 0x804a019 or from the cache: if the cache already holds the data for that address, it is read from the cache, and otherwise it is read from memory. The hardware does all of this automatically and no instructions are needed to control the cache. The instructions a programmer writes mention only registers and memory addresses, never the cache, and the programmer does not even need to know that the cache exists. Similarly, when an instruction such as mov %al,0x804a01a executes, we do not know whether the register value is really written back to memory or only written to the cache and written back to memory later. Even if it is only in the cache for now, the next read of address 0x804a01a still gets the last written data from the cache. When reading and writing device registers, however, the cache cannot be ignored. What goes wrong if the memory addresses of the serial-port send and receive registers are cached? The scenario is as follows.

  

If the address of the serial-port send register is cached, a write by the CPU execution unit to the send register only reaches the cache. The send register does not get the data in time, so nothing is sent in time. The three bytes 1, 2, and 3 issued by the CPU execution unit are all written to the same location in the cache, which ends up holding only the third byte. When the cache finally writes the data back to the send register, only the third byte can be sent, and the first two are lost. Similarly, if the address of the serial-port receive register is cached, then when the CPU execution unit reads the first byte, the cache fetches it from the receive register and keeps a copy. The cache knows nothing about bytes 2 and 3 that arrive in the receive register afterwards, because it treats the receive register as ordinary memory and assumes that its contents do not change by themselves. On every later read of the receive register, the cache hands the cached first byte to the CPU execution unit. A platform with a cache usually provides a way to disable caching for a given address range. This is typically configured in the page table, which marks which pages may be cached and which may not. The MMU therefore not only performs address translation and access permission checks but also works together with the cache.
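
In addition to device registers, a global variable also needs the volatile qualifier when it is accessed by more than one flow of control within the same process, for example by a signal handler and the main program, or by multiple threads. Note that volatile only forces each access to go to memory. It does not make accesses atomic or impose ordering between threads, so multithreaded code additionally needs locks or C11 atomics.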
