Data Alignment as I See It

Poster: hejiwen
http://bbs.pediy.com/showthread.php?threadid=14526

Author: mingchu

While reading a thread about memcpy on the Kanxue (pediy) forum (http://bbs.pediy.com/showthread.php?s=&threadid=14128), I came across a discussion of data alignment that got me thinking about the topic again (I used to understand it, but had gradually forgotten ^_^). What follows is my personal view; criticism and corrections are welcome!
 
I. What is data alignment? First, the official explanation:
Note that words need not be aligned at even-numbered addresses and doublewords need not be aligned at addresses evenly divisible by four. This allows maximum flexibility in data structures (e.g., records containing mixed byte, word, and doubleword items) and efficiency in memory utilization. When used in a configuration with a 32-bit bus, actual transfers of data between processor and memory take place in units of doublewords beginning at addresses evenly divisible by four; however, the processor converts requests for misaligned words or doublewords into the appropriate sequences of requests acceptable to the memory interface. Such misaligned data transfers reduce performance by requiring extra memory cycles. For maximum performance, data structures (including stacks) should be designed in such a way that, whenever possible, word operands are aligned at even addresses and doubleword operands are aligned at addresses evenly divisible by four.
In other words, data alignment means that a word variable should always be stored at an address that is a multiple of 2, a dword variable at an address that is a multiple of 4, and so on.
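The rule above can be written as a one-line test. This is only a minimal sketch in C; the function name is_aligned is made up for illustration, and it assumes the size is a power of two, as it is for byte, word, dword, and qword.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns true when addr is a multiple of size.
   Assumption: size is a power of two (1, 2, 4, 8, ...). */
bool is_aligned(uintptr_t addr, uintptr_t size)
{
    assert(size != 0 && (size & (size - 1)) == 0); /* reject non powers of two */
    return (addr & (size - 1)) == 0;               /* low bits must all be zero */
}
```

For instance, is_aligned(6, 2) holds for a word stored at address 6, while is_aligned(6, 4) does not hold for a dword at the same address.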
 
II. How data alignment is handled depends on the processor and the compiler. Handling it properly improves both the space and the time efficiency of a program (less wasted memory, faster access), which matters especially in assembly programs.

1. In terms of the processor:
Starting with the 486, bit 18 (AC) of the EFLAGS register and bit 18 (AM) of the CR0 register control alignment checking. The check is performed only when AC = 1 and AM = 1 and the processor is running at ring 3, in protected mode or virtual-8086 mode; a violation raises exception 0x11. Windows does not set the AC bit (AC = 0), so no exception is raised whether or not the data is aligned, and the correct data is always read. Access to misaligned data is nevertheless slower than access to aligned data, because the address lines of the 80386DX and later processors accept only addresses that are multiples of 4 (the data bus is 32 bits wide). If the requested address is not a multiple of 4, the processor converts the request into several accesses at multiples of 4, each transferring 4 bytes. For example:
(1) mov ax, word ptr [3]: the processor first fetches dword ptr [0], then fetches dword ptr [4], and assembles the required word from the two before placing it on the data lines.

(2) mov eax, dword ptr [2]: the same two fetches as above, but the bytes are assembled differently.
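To make the cost of the two examples concrete, here is a rough sketch in C that counts how many aligned 4-byte transfers an access needs on a bus that only moves whole dwords. The function name dword_transfers is hypothetical, and the model ignores caches and address wrap-around at the top of the address space.

```c
#include <stdint.h>

/* Number of aligned 4-byte bus transfers needed to read `size` bytes
   starting at `addr`, assuming the bus only transfers whole dwords. */
unsigned dword_transfers(uint32_t addr, uint32_t size)
{
    if (size == 0)
        return 0;
    uint32_t first = addr & ~3u;               /* aligned dword holding the first byte */
    uint32_t last  = (addr + size - 1) & ~3u;  /* aligned dword holding the last byte  */
    return (last - first) / 4 + 1;
}
```

Under this model, both the word at address 3 and the dword at address 2 span the dwords at 0 and 4 and so need two transfers, while a dword at address 4 needs only one.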

2. In terms of the compiler (taking Microsoft's ml and link as an example):

(1) Global variables:
Data in the .data, .data?, .const, and .code sections is packed tightly, with no gaps between variables. Misalignment therefore occurs easily, especially when arrays are defined, and misaligned data lowers access efficiency.
The VC compiler declares global variables with COMM (which seems to align them automatically; if you know the details of COMM, please share). After disassembly, the global variables of a .c file look like this:
_DATA SEGMENT
COMM _x:BYTE
COMM _y:DWORD
_DATA ENDS

Of course, we can use COMM in the same way in MASM.
 
Suggestion: order the variables by size so that they end up aligned. Where necessary, use align, even, org, $, and so on to force alignment, especially for struct variables and arrays.
For example:
.code
test1 db ?
test2 dw ?
test3 db ?
Changed to:
test1 db ?
test3 db ?
test2 dw ?
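The effect of the reordering can be checked with a small model. The sketch below, in C, packs items back to back the way the assembler does and counts how many land on a misaligned offset. The function name count_misaligned is made up, and it assumes the section itself starts at an aligned address and that every size is a nonzero power of two.

```c
#include <stddef.h>

/* Counts items that land on a misaligned offset when items of the given
   sizes are packed back to back, starting at offset 0 of an aligned base.
   Assumption: each size is nonzero and equals the item's natural alignment. */
size_t count_misaligned(const size_t *sizes, size_t n)
{
    size_t offset = 0, bad = 0;
    for (size_t i = 0; i < n; i++) {
        if (offset % sizes[i] != 0)   /* offset is not a multiple of the item size */
            bad++;
        offset += sizes[i];           /* no padding is inserted */
    }
    return bad;
}
```

With the sizes {1, 2, 1} (db, dw, db) the word falls on offset 1 and is misaligned; with {1, 1, 2} it falls on offset 2 and nothing is misaligned.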
 
(2) Function parameters and local variables:
Function parameters and local variables live on the stack and are addressed relative to EBP (which, after mov ebp, esp, holds the value of ESP). Their offsets from EBP are aligned automatically, and EBP itself is a multiple of 4, so these variables are all aligned automatically.
In my tests, dq, dd, real8, and real4 are dword-aligned; dw, df, dt, and real10 are word-aligned; and db is byte-aligned.
The test code:

testalign proc
    local pad:byte, test1:real4
    ret
testalign endp
 
Suggestion: before calling a function, avoid instructions such as push ax, which subtract 2 from ESP and disturb data alignment. A call such as:
push bx
call testalign
pop bx
leaves ESP a multiple of 2 rather than a multiple of 4, so the data is no longer aligned.
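The arithmetic behind this suggestion is simple enough to model. The following C sketch uses made-up helper names (esp_after_push, dword_aligned), and any starting ESP value used with it is hypothetical; it only tracks the stack pointer, not the stack contents.

```c
#include <stdbool.h>
#include <stdint.h>

/* ESP after pushing an operand of the given size: the stack grows downward. */
uint32_t esp_after_push(uint32_t esp, uint32_t operand_bytes)
{
    return esp - operand_bytes;
}

/* True when ESP is a multiple of 4. */
bool dword_aligned(uint32_t esp)
{
    return (esp & 3u) == 0;
}
```

Starting from a dword-aligned ESP, one 2-byte push (push bx) leaves it misaligned, the 4-byte return address pushed by call keeps it misaligned, and only a second 2-byte push or a 4-byte push instead would restore a multiple of 4.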
 
(3) Structure field alignment (this matters especially when porting programs and when writing network programs):
word_count struct
    lpLetter db ?
    dwCount dq ?
word_count ends
wordcount word_count <>
How are these fields laid out in memory? That depends on a compiler setting, namely ML's /Zp option:
ml /Zp[n], where n is 1, 2, 4, 8, or 16.
In VC, the corresponding settings are as follows:

<1>: Project -> Settings -> C/C++ tab -> Code Generation category -> Struct member alignment.
This actually sets cl's /Zp option.
<2>: Use #pragma pack(n) in the program, as many times as needed, to change the alignment setting of individual structs.
 
These options control only the alignment of struct fields (that is, of their offsets). ML cannot guarantee the alignment of the struct variable itself, whose address should be a multiple of the size of its largest field; wordcount, for example, belongs at an address that is a multiple of 8. The fields are truly aligned only when the struct variable itself sits at an aligned address. If the alignment modulus n is smaller than the size of the largest field, n prevails. With n = 1, for example, there are no gaps between fields (the structure is effectively packed).
For the struct above:
With n = 8 or 16 (7 bytes wasted): lpLetter is at offset 0 and dwCount at offset 8.

With n = 4 (3 bytes wasted): lpLetter is at offset 0 and dwCount at offset 4.

With n = 2? Draw it yourself ^_^.
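The layouts above follow from one rule: each field is aligned to the smaller of its own size and n. Here is a minimal sketch of that rule in C; place_field is a hypothetical helper, not part of ML or cl, and it assumes sizes and n are powers of two.

```c
#include <stddef.h>

/* Offset at which a field of `size` bytes is placed when the previous
   field ends at `offset` and the packing value is zp (as in /Zp[n]).
   The field is aligned to min(zp, size). */
size_t place_field(size_t offset, size_t size, size_t zp)
{
    size_t align = size < zp ? size : zp;
    return (offset + align - 1) / align * align;   /* round up to a multiple of align */
}
```

For the struct above, the 1-byte field ends at offset 1, so place_field(1, 8, 8) gives 8 and place_field(1, 8, 4) gives 4, matching the 7 and 3 wasted bytes; the same call with n = 2 can be used to check your own drawing.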
Start address of a struct variable: global variables are laid out without gaps, so global struct variables must be aligned manually; local variables are aligned automatically.
Suggestion: order the fields sensibly by the size of their types, so that they are aligned without gaps and waste less space, and use align to place global struct variables at an address that is a multiple of the size of the struct's largest field type.
 
(4) Instruction alignment:
In .code, align and even only align instructions (see how memcpy.asm uses instruction alignment). Inserting align or even between local variables has no effect, and using them between instructions may cause the assembler to insert filler instructions that do nothing, in order to fill the gap that alignment leaves.

Official explanation:
Due to instruction prefetching and queuing within the CPU, there is no requirement for instructions to be aligned on word or doubleword boundaries. (However, a slight increase in speed results if the target addresses of control transfers are evenly divisible by four.)
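How large the gap is depends only on the current address and the requested alignment. The C sketch below computes it; align_padding is an illustrative name, and what the assembler actually emits to fill the gap is not modeled here.

```c
#include <stdint.h>

/* Number of filler bytes needed to advance `addr` to the next multiple
   of `n`; 0 if addr is already aligned. Assumption: n is a power of two. */
uint32_t align_padding(uint32_t addr, uint32_t n)
{
    return (n - (addr & (n - 1))) & (n - 1);
}
```

For example, a branch target that would otherwise start at address 0x1003 needs one filler byte to reach a multiple of 4, and none if it already starts at 0x1004.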
 
(5) Segment alignment type:
In MASM, the .code, .data, .data?, and .const segments have the alignment type DWORD (the other alignment types are BYTE, WORD, PARA, and PAGE). (Does anyone know how to change the alignment of .code and the others? Thanks!) That is, these segments start at addresses that are multiples of 4, so align 8, align 16, and align 32 cannot be used inside them. To use another alignment type, define a custom segment with the segment directive.

(6) SectionAlignment and FileAlignment in PE files:
These seem unrelated to this article, so I will not pile them in here!
III. A tentative analysis of /vc98/CRT/src/platform/memcpy.asm (on the VC 6.0 installation disc; for VC 7.0, look under CRT/src/Intel on the installation disc)
For a C version, see CRT/src/memmove.c.
The general idea of memcpy.asm is to copy from front to back when the regions do not overlap, and from back to front when they do.
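1. Overlap: the regions overlap in the problematic way when dst > src && dst < src + len.

For example, the src address is 100, the dst address is 106, and 12 bytes are copied.

Without handling the overlap (copying from front to back): the bytes at addresses 106 to 111 are overwritten before they are read, so the second half of the destination receives corrupted data.

Handling the overlap (copying from back to front): every source byte is read before it is overwritten, so the copy is correct.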
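The overlap test and the choice of direction can be sketched in a few lines of C. This is a byte-wise illustration of the idea only, not the CRT's code; the names needs_backward_copy and copy_bytes are made up, and the address comparison assumes a flat address space.

```c
#include <stddef.h>
#include <stdint.h>

/* True when a front-to-back copy would overwrite source bytes before
   they are read, i.e. dst > src && dst < src + len. */
int needs_backward_copy(const void *dst, const void *src, size_t len)
{
    uintptr_t d = (uintptr_t)dst, s = (uintptr_t)src;
    return d > s && d - s < len;
}

/* Byte-wise copy that picks its direction from the overlap test. */
void *copy_bytes(void *dst_, const void *src_, size_t len)
{
    unsigned char *dst = dst_;
    const unsigned char *src = src_;

    if (needs_backward_copy(dst, src, len)) {
        while (len--)                      /* back to front */
            dst[len] = src[len];
    } else {
        for (size_t i = 0; i < len; i++)   /* front to back */
            dst[i] = src[i];
    }
    return dst_;
}
```

With a buffer whose source starts at offset 0 and whose destination starts at offset 6, copying 12 bytes takes the backward path, which mirrors the 100/106 example above.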

 
 
2. Analysis of the memcpy.asm code.
(1) The destination address is aligned before the bulk copy, to improve efficiency.
(2) It makes use of the processor's pipelines (U, V, N).
(3) align @WordSize is used in the code to align instructions.
Note: @WordSize: two for a 16-bit segment or four for a 32-bit segment (numeric equate).
(4) When fewer than 8 dwords remain after alignment, mov instructions are used; otherwise rep movsd is used. The main consideration here is clock cycles.
(5) Try drawing a flowchart.
Copying from back to front works the same way as copying from front to back.
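Point (1) can be illustrated with a forward copy that first brings the destination to a 4-byte boundary. This is a simplified sketch in C under stated assumptions, not a translation of memcpy.asm: the name copy_aligned_dst is made up, the buffers are assumed not to overlap, and the mov versus rep movsd threshold from point (4) is not modeled.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Forward copy for non-overlapping buffers:
   1. copy single bytes until dst is 4-byte aligned,
   2. copy whole dwords,
   3. copy the remaining 0..3 tail bytes. */
void *copy_aligned_dst(void *dst_, const void *src_, size_t len)
{
    unsigned char *dst = dst_;
    const unsigned char *src = src_;

    while (len > 0 && ((uintptr_t)dst & 3u) != 0) {   /* step 1 */
        *dst++ = *src++;
        len--;
    }
    while (len >= 4) {                                 /* step 2 */
        uint32_t w;
        memcpy(&w, src, 4);   /* src may still be misaligned; this load is portable */
        memcpy(dst, &w, 4);   /* dst is 4-byte aligned here */
        dst += 4;
        src += 4;
        len -= 4;
    }
    while (len > 0) {                                  /* step 3 */
        *dst++ = *src++;
        len--;
    }
    return dst_;
}
```

The two fixed-size memcpy calls in step 2 stand in for a dword load and store; aligning the destination rather than the source is the choice the list above describes.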

IV. Conclusion:
These notes are finally finished. I feel that my ability to express myself is still quite limited, so corrections are welcome. My e-mail: hejiwen2001@sohu.com. If this helps you, I will be very pleased; at least the effort will not have been wasted.
 
 
References:
1. The Art of Assembly Language, from http://asm.yeah.net/
2. Windows Core Programming, from http://www.infoxa.com/
3. Intel 80386 Programmer's Reference Manual (1986), from http://purec.binghua.com/
4. http://blog.dreambrook.com/soloist/archive/2004/12/12/388.aspx
5. http://wncj.vicp.net/course/hep/huibianyuyan/04-3.htm
6. http://msdn.microsoft.com/library
Thanks also to the other online resources I consulted.
