(Preamble: I recently came across the concept of memory alignment. I checked the alignment rules one by one in VC and found that the compiler does indeed align my data in memory. Memory alignment must have its advantages, but what are they? The materials I found say that memory alignment improves memory access efficiency and makes code easier to port to different platforms. But why? None of them gives a clear explanation. The following is the result of my efforts over the past few days!)
Why should the compiler align our data in memory? From studying computer organization I learned that the basic unit of memory is one byte and that memory can be addressed randomly. So I naively assumed that memory is a container of bytes whose basic unit is a single byte.
Figure 1. Memory space layout as I see it
The tragedy is that the CPU, the party that actually reads and writes memory, does not see it this way. The CPU reads memory according to its memory access granularity (abbreviated as mag). The mag is the amount of data the CPU transfers in one memory access operation. The specific value depends on the platform and is typically 2, 4, or 8 bytes.
Figure 2. Memory space layout as the CPU sees it
So programmers and CPUs see the memory layout differently. What can be done, given that we cannot demand too much of programmers and still want to keep the CPU comfortable? The compiler has to align the data in our code implicitly (of course, all it can do is align the data declared in the program; memory accessed directly through pointers is beyond its reach).
I would like to pay tribute to the compiler developers!
The benefits of memory alignment: a small example shows them. On a 32-bit machine, read four bytes into a CPU register from an aligned address (starting at address 0) and from an unaligned address (starting at address 1), then compare the two read processes.
Case 1: the memory access granularity is 1 byte (the memory model as the CPU sees it is the same as the memory model as the programmer sees it):
Figure 3. mag = 1
Result: reading 4 bytes takes 4 memory access operations in both cases. A tie: memory alignment makes no difference when mag = 1.
Case 2: the memory access granularity is 2 bytes:
Figure 4. mag = 2
Result: four bytes are read. On the left side (aligned address), only two memory access operations are required. On the right side (unaligned address), three memory access operations plus additional operations are required (these are described later). The aligned address wins!
Case 3: the memory access granularity is 4 bytes:
Figure 5. mag = 4
Result: four bytes are read. Only one memory access operation is required on the left side, while two memory access operations plus additional operations are required on the right side. The aligned address wins again!
Conclusion: aligned address vs. unaligned address. Across the three memory access granularities, the aligned address scored two wins and one tie. On 32-bit machines the actual memory access granularity is 4 bytes, for the following reasons:
- Each memory access operation has a constant overhead;
- For a fixed amount of data, fewer memory access operations mean better program performance;
- A larger memory access granularity (not exceeding the width of the data bus, of course) reduces the number of memory access operations (as the example above shows).
In a word, memory alignment can indeed improve program performance.
How does the CPU handle data access without memory alignment? Continuing the example above, with a memory access granularity of 2 and a four-byte read starting at address 1, the CPU (in hardware) proceeds as follows:
- Read the first memory unit that holds part of the data (addresses 0-1) and discard the excess byte (address 0);
- Read the second memory unit that holds part of the data (addresses 2-3);
- Read the third memory unit that holds part of the data (addresses 4-5) and discard the excess byte (address 5);
- Splice the three pieces (addresses 1-4) together into a register.
How much work memory alignment saves when accessing data of the same size! If the CPU does all of the above, unaligned access only hurts the performance of our program; at least it still runs. The tragedy is that older CPUs are not so "diligent". When a data access is not aligned, they simply raise an exception. The operating system may handle this exception in software, in which case performance is even worse, or the program may crash.
In a word, code with memory alignment is indeed more portable!
The end! For more information, refer to this article: Workshop