Big-endian format, little-endian format
Big-endian format:
In this format, the high byte of the word data is stored at the low address, while the low byte of the word data is stored at the high address, as shown in Figure 2.1:
Figure 2.1
Little-endian format:
In contrast to the big-endian storage format, in the little-endian storage format the low address holds the low byte of the word data, and the high address holds the high byte of the word data, as shown in Figure 2.2:
Figure 2.2
Please write a C function that returns 0 if the processor is big-endian and 1 if it is little-endian.
Answer:
int checkCPU(void)
{
    union w { int a; char b; } c;
    c.a = 1;
    return (c.b == 1); // 1 when the low byte sits at the low address
}
Analysis:
Embedded system developers should be thoroughly familiar with the little-endian and big-endian models. For example, the 16-bit number 0x1234 is stored in the memory of a little-endian CPU (assuming it is stored starting at address 0x4000) as follows:
Memory address | 0x4000 | 0x4001
Store content | 0x34 | 0x12
In big-endian mode, it is stored in CPU memory as follows:
Memory address | 0x4000 | 0x4001
Store content | 0x12 | 0x34
The 32-bit number 0x12345678 is stored in the memory of a little-endian CPU (assuming it is stored starting at address 0x4000) as follows:
Memory address | 0x4000 | 0x4001 | 0x4002 | 0x4003
Store content | 0x78 | 0x56 | 0x34 | 0x12
In big-endian mode, it is stored in CPU memory as follows:
Memory address | 0x4000 | 0x4001 | 0x4002 | 0x4003
Store content | 0x12 | 0x34 | 0x56 | 0x78
All members of a union are stored starting from the same low address, so reading the char member returns the byte at the lowest address of the int member.
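The memory layouts tabulated in the analysis above can be checked on a real machine by copying a value's bytes out in address order. The following is a minimal sketch, not part of the original answer; the helper names bytes_of_u16, bytes_of_u32 and print_bytes are hypothetical, and fixed-width types from <stdint.h> are assumed to be available.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Copy the in-memory bytes of a 16-bit value into out[0..1],
   lowest address first. memcpy sidesteps aliasing concerns. */
static void bytes_of_u16(uint16_t value, unsigned char out[2])
{
    memcpy(out, &value, sizeof value);
}

/* Same for a 32-bit value, into out[0..3]. */
static void bytes_of_u32(uint32_t value, unsigned char out[4])
{
    memcpy(out, &value, sizeof value);
}

/* Print n bytes in address order, separated by spaces. */
static void print_bytes(const unsigned char *p, size_t n)
{
    for (size_t i = 0; i < n; i++)
        printf("%02x%s", (unsigned)p[i], i + 1 < n ? " " : "\n");
}
```

Calling bytes_of_u32(0x12345678, b) and then print_bytes(b, 4) should list the bytes in the order of the little-endian table on a little-endian host, and in the order of the big-endian table on a big-endian host.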
Here is another piece of code, excerpted from an open-source project:
int big_endian(void)
{
    union { long l; char c[sizeof(long)]; } u;
    u.l = 1;
    return (u.c[sizeof(long) - 1] == 1); // 1 when the low byte sits at the highest address
}
Sometimes, when writing programs in C, you need to know whether the target is big-endian or little-endian. In big-endian mode, the low-order part of the data is kept at the high address of memory and the high-order part at the low address. In little-endian mode, the low-order part is kept at the low address and the high-order part at the high address.
Why are there two modes? In a computer system, memory is addressed in bytes: each address unit corresponds to one byte, and one byte is 8 bits. But besides the 8-bit char, C also has the 16-bit short and the 32-bit long (depending on the compiler). Moreover, on processors wider than 8 bits, such as 16-bit or 32-bit processors, a register is wider than one byte, so there has to be a rule for how multiple bytes are arranged. That is what gives rise to the big-endian and little-endian storage modes.
For example, take a 16-bit short x located at address 0x0010 with the value 0x1122: 0x11 is the high byte and 0x22 is the low byte. In big-endian mode, 0x11 is placed at the low address, 0x0010, and 0x22 at the high address, 0x0011. Little-endian mode is exactly the opposite. The x86 architecture we commonly use is little-endian, while Keil C51 is big-endian. Many ARM and DSP processors are little-endian, and some ARM processors let the hardware select between big-endian and little-endian mode. The following code can be used to test whether your compiler and target are big-endian or little-endian:
short int x;
char x0, x1;
x = 0x1122;
x0 = ((char *)&x)[0]; // low address unit
x1 = ((char *)&x)[1]; // high address unit
If x0 == 0x11, it is big-endian; if x0 == 0x22, it is little-endian.
Detailed description of big-endian mode and little-endian mode
I. The origin of the terms big-endian and little-endian
There is an interesting story behind the terms, from Jonathan Swift's "Gulliver's Travels": the two great powers Lilliput and Blefuscu had been at war for the past 36 months. The cause of the war: everyone knows that the original way to eat an egg is to break its larger end. But the emperor's grandfather, as a boy, cut his finger while breaking an egg this way, so his father, the emperor at the time, ordered all subjects to break the smaller end of their eggs, on pain of heavy punishment. The people deeply resented this decree, and several rebellions followed, in which one emperor lost his life and another lost his throne. The rebellions were stirred up by the monarchs of the rival country Blefuscu, and once they were put down, the rebels fled to that empire for refuge. It is estimated that, on several occasions, eleven thousand people chose to die rather than break their eggs at the smaller end. This was in fact a satire on the continuing conflict between Britain and France. Danny Cohen, an early pioneer of network protocols, was the first to use these terms to refer to byte order, and they were later widely accepted.
II. What big-endian and little-endian mean
Big-endian and little-endian are defined as follows:
1) Little-endian places the low-order byte at the low address of memory and the high-order byte at the high address of memory.
2) Big-endian places the high-order byte at the low address of memory and the low-order byte at the high address of memory.
For example, the number 0x12345678 is represented in memory as follows:
1) Big-endian mode:
Low address -----------------> High address
0x12 | 0x34 | 0x56 | 0x78
2) Little-endian mode:
Low address -----------------> High address
0x78 | 0x56 | 0x34 | 0x12
It can be seen that the big-endian layout resembles string storage: reading from the low address to the high address gives the digits in the order they are written.
3) Here are two specific examples:
The 16-bit number 0x1234 is stored in CPU memory in little-endian mode and in big-endian mode (assuming it is stored starting at address 0x4000) as follows:
Memory address | Little-endian content | Big-endian content
0x4000 | 0x34 | 0x12
0x4001 | 0x12 | 0x34
The 32-bit number 0x12345678 is stored in CPU memory in little-endian mode and in big-endian mode (assuming it is stored starting at address 0x4000) as follows:
Memory address | Little-endian content | Big-endian content
0x4000 | 0x78 | 0x12
0x4001 | 0x56 | 0x34
0x4002 | 0x34 | 0x56
0x4003 | 0x12 | 0x78
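4) Neither big-endian nor little-endian is superior; the advantage of each is the disadvantage of the other:
Little-endian mode: casting data to a narrower or wider type does not require adjusting the byte contents, because 1-, 2-, and 4-byte values are stored the same way, starting with the low byte at the lowest address.
Big-endian mode: the sign bit is always in the first byte, which makes it easy to tell whether a value is positive or negative.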
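Both advantages can be illustrated on explicit byte arrays, independent of the host's own byte order. This is a sketch under stated assumptions: load_le and be_is_negative are hypothetical helper names, and values are assumed to be two's-complement integers of at most 4 bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Interpret the first n bytes (n = 1, 2 or 4) at buf as a
   little-endian unsigned integer. */
static uint32_t load_le(const unsigned char *buf, size_t n)
{
    uint32_t v = 0;
    for (size_t i = 0; i < n; i++)
        v |= (uint32_t)buf[i] << (8 * i);
    return v;
}

/* For a two's-complement value stored big-endian, the sign bit is
   the top bit of the first byte, whatever the value's width. */
static int be_is_negative(const unsigned char *buf)
{
    return (buf[0] & 0x80) != 0;
}
```

With the little-endian bytes 2A 00 00 00, load_le returns 0x2A whether 1, 2 or 4 bytes are read, which is why a narrowing cast needs no address adjustment. With the big-endian bytes FF FF FF FB (the 32-bit value -5), be_is_negative only has to look at the first byte.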
III. Storage in an array under big-endian and little-endian
Take unsigned int value = 0x12345678 as an example and look at how it is stored in the two byte orders. We can use unsigned char buf[4] to represent value:
Big-endian: the low address holds the high-order byte, as follows:
High Address
---------------
buf[3] (0x78) -- low byte
buf[2] (0x56)
buf[1] (0x34)
buf[0] (0x12) -- high byte
---------------
Low Address
Little-endian: the low address holds the low-order byte, as follows:
High Address
---------------
buf[3] (0x12) -- high byte
buf[2] (0x34)
buf[1] (0x56)
buf[0] (0x78) -- low byte
---------------
Low Address
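The two buf[4] layouts above can be produced explicitly with shifts, which gives the same bytes on any host. A minimal sketch, assuming uint32_t from <stdint.h>; store_be32 and store_le32 are hypothetical names.

```c
#include <stdint.h>

/* Big-endian: buf[0] (lowest address) receives the most significant byte. */
static void store_be32(uint32_t value, unsigned char buf[4])
{
    buf[0] = (unsigned char)(value >> 24);
    buf[1] = (unsigned char)(value >> 16);
    buf[2] = (unsigned char)(value >> 8);
    buf[3] = (unsigned char)(value);
}

/* Little-endian: buf[0] (lowest address) receives the least significant byte. */
static void store_le32(uint32_t value, unsigned char buf[4])
{
    buf[0] = (unsigned char)(value);
    buf[1] = (unsigned char)(value >> 8);
    buf[2] = (unsigned char)(value >> 16);
    buf[3] = (unsigned char)(value >> 24);
}
```

For value = 0x12345678, store_be32 leaves 0x12 in buf[0] and 0x78 in buf[3], and store_le32 leaves 0x78 in buf[0] and 0x12 in buf[3], matching the two diagrams.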
IV. Why are there big-endian and little-endian modes?
In a computer system, memory is addressed in bytes: each address unit corresponds to one byte, and one byte is 8 bits. But besides the 8-bit char, C also has the 16-bit short and the 32-bit long (depending on the compiler). Moreover, on processors wider than 8 bits, such as 16-bit or 32-bit processors, a register is wider than one byte, so there has to be a rule for how multiple bytes are arranged. That is what gives rise to the big-endian and little-endian storage modes. For example, take a 16-bit short x located at address 0x0010 with the value 0x1122: 0x11 is the high byte and 0x22 is the low byte. In big-endian mode, 0x11 is placed at the low address, 0x0010, and 0x22 at the high address, 0x0011. Little-endian mode is exactly the opposite. The x86 architecture we commonly use is little-endian, while Keil C51 is big-endian. Many ARM and DSP processors are little-endian, and some ARM processors let the hardware select between big-endian and little-endian mode.
V. How to determine the byte order of a machine
You can write a small test program to determine the machine's byte order:
BOOL IsBigEndian()
{
    short a = 0x1234; // short is assumed to be 16 bits, so 0x12 is its high byte
    char b = *(char *)&a; // casting to char * reads the single byte at the low address of a
    if (b == 0x12)
    {
        return TRUE;
    }
    return FALSE;
}
All members of a union are stored starting from the low address, which makes it easy to find out whether the CPU reads and writes memory in little-endian or big-endian mode:
BOOL IsBigEndian()
{
    union NUM
    {
        short a; // short is assumed to be 16 bits, so 0x12 is its high byte
        char b;
    } num;
    num.a = 0x1234;
    if (num.b == 0x12)
    {
        return TRUE;
    }
    return FALSE;
}
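The two tests above rely on BOOL, TRUE and FALSE being defined by the platform headers. A more portable variant of the same idea is sketched below; the enum byte_order and the function detect_byte_order are hypothetical names introduced for illustration, and the code assumes <stdint.h> is available.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical result type for this sketch. */
enum byte_order { ORDER_LITTLE, ORDER_BIG, ORDER_UNKNOWN };

/* Store a known 32-bit pattern and inspect which of its bytes
   ended up at the lowest and highest addresses. */
static enum byte_order detect_byte_order(void)
{
    const uint32_t probe = 0x01020304u;
    unsigned char bytes[sizeof probe];

    memcpy(bytes, &probe, sizeof probe);
    if (bytes[0] == 0x04 && bytes[3] == 0x01)
        return ORDER_LITTLE;
    if (bytes[0] == 0x01 && bytes[3] == 0x04)
        return ORDER_BIG;
    return ORDER_UNKNOWN; /* neither pure layout, e.g. a mixed byte order */
}
```

Using a 4-byte pattern with four distinct bytes avoids depending on the width of int. GCC and Clang also predefine the macro __BYTE_ORDER__, which can answer the same question at compile time where it is available.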
VI. Common byte orders
Operating systems on common platforms are generally little-endian, while communication protocols are big-endian.
4.1 Byte order of common CPUs
Big-endian: PowerPC, IBM, Sun
Little-endian: x86, DEC
ARM can work in either big-endian or little-endian mode.
4.2 Byte order of common file formats
Adobe PS: big-endian
BMP: little-endian
DXF (AutoCAD): variable
GIF: little-endian
JPEG: big-endian
MacPaint: big-endian
RTF: little-endian
In addition, Java and most network communication protocols, including the TCP/IP family, use big-endian encoding.
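Because a file format fixes its own byte order, a portable reader decodes each field byte by byte instead of copying it straight into an integer. The sketch below assumes two well-known fields: the 32-bit little-endian file size at offset 2 of a BMP header, and the 16-bit big-endian length that follows a JPEG segment marker. The helper names read_le32 and read_be16 are hypothetical.

```c
#include <stdint.h>

/* Decode a 32-bit little-endian field (the order used by BMP). */
static uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Decode a 16-bit big-endian field (the order used by JPEG). */
static uint16_t read_be16(const unsigned char *p)
{
    return (uint16_t)(((unsigned)p[0] << 8) | p[1]);
}
```

For example, given hand-made sample bytes 'B' 'M' 36 04 00 00, read_le32 applied at offset 2 yields 1078, and given FF D8 FF E0 00 10, read_be16 applied at offset 4 yields 16. Both results are the same on a big-endian and a little-endian host.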
VII. How to convert
For word data (16-bit):
#define BIGTOLITTLE16(A) ((((UInt16)(A) & 0xff00) >> 8) | \
                          (((UInt16)(A) & 0x00ff) << 8))
For double-word data (32-bit):
#define BIGTOLITTLE32(A) ((((UInt32)(A) & 0xff000000) >> 24) | \
                          (((UInt32)(A) & 0x00ff0000) >> 8)  | \
                          (((UInt32)(A) & 0x0000ff00) << 8)  | \
                          (((UInt32)(A) & 0x000000ff) << 24))
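The same conversions can also be written as small inline functions, which evaluate their argument only once and are type-checked by the compiler. This is a sketch rather than a replacement for the macros; swap16 and swap32 are hypothetical names, and the standard uint16_t and uint32_t types stand in for UInt16 and UInt32.

```c
#include <stdint.h>

/* Exchange the two bytes of a 16-bit value. */
static inline uint16_t swap16(uint16_t v)
{
    return (uint16_t)((v >> 8) | (v << 8));
}

/* Reverse the four bytes of a 32-bit value. */
static inline uint32_t swap32(uint32_t v)
{
    return ((v & 0xff000000u) >> 24)
         | ((v & 0x00ff0000u) >> 8)
         | ((v & 0x0000ff00u) << 8)
         | ((v & 0x000000ffu) << 24);
}
```

swap16(0x1234) gives 0x3412 and swap32(0x12345678) gives 0x78563412. Applying either function twice returns the original value, so the same routine converts in both directions.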
VIII. Understanding endianness from the software perspective
From the software perspective, data transfer between processors of different endianness must take the difference into account. Network data transfer, for example, requires endianness conversion. In socket programming, the following functions are used to convert between byte orders.
#define ntohs(n) // 16-bit data type: network byte order to host byte order
#define htons(n) // 16-bit data type: host byte order to network byte order
#define ntohl(n) // 32-bit data type: network byte order to host byte order
#define htonl(n) // 32-bit data type: host byte order to network byte order
The network byte order used on the Internet is big-endian, while the host byte order depends on the processor: the PowerPC processor uses big-endian mode, for example, while the Pentium processor uses little-endian mode.
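A typical use of these functions is to convert every multi-byte field when building or parsing a message. The sketch below assumes a POSIX system, where htons, htonl, ntohs and ntohl are declared in <arpa/inet.h>; the 6-byte message layout and the names build_message and parse_message are hypothetical, chosen only for illustration.

```c
#include <arpa/inet.h> /* htons, htonl, ntohs, ntohl (POSIX) */
#include <stdint.h>
#include <string.h>

/* Write a 16-bit port followed by a 32-bit value into out,
   both in network (big-endian) byte order. */
static void build_message(uint16_t port, uint32_t value, unsigned char out[6])
{
    uint16_t net_port = htons(port);
    uint32_t net_value = htonl(value);

    memcpy(out, &net_port, sizeof net_port);
    memcpy(out + 2, &net_value, sizeof net_value);
}

/* Read the two fields back and convert them to host byte order. */
static void parse_message(const unsigned char in[6], uint16_t *port, uint32_t *value)
{
    uint16_t net_port;
    uint32_t net_value;

    memcpy(&net_port, in, sizeof net_port);
    memcpy(&net_value, in + 2, sizeof net_value);
    *port = ntohs(net_port);
    *value = ntohl(net_value);
}
```

build_message(0x1234, 0x12345678, msg) leaves the bytes 12 34 12 34 56 78 in msg on any host, because the conversion functions do nothing on a big-endian host and swap the bytes on a little-endian one.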
The byte order of a big-endian processor is the same as network byte order, so no conversion is needed: ntohs(n) = n and ntohl(n) = n. The byte order of a little-endian processor must be converted: ntohs(n) = __swab16(n) and ntohl(n) = __swab32(n). The underlying ___swab16 and ___swab32 macros are defined as follows.
// ({ ... }) is a GNU C statement expression; the value of the macro is its last expression
#define ___swab16(x) \
({ \
    __u16 __x = (x); \
    ((__u16)( \
        (((__u16)(__x) & (__u16)0x00ffU) << 8) | \
        (((__u16)(__x) & (__u16)0xff00U) >> 8))); \
})
#define ___swab32(x) \
({ \
    __u32 __x = (x); \
    ((__u32)( \
        (((__u32)(__x) & (__u32)0x000000ffUL) << 24) | \
        (((__u32)(__x) & (__u32)0x0000ff00UL) << 8) | \
        (((__u32)(__x) & (__u32)0x00ff0000UL) >> 8) | \
        (((__u32)(__x) & (__u32)0xff000000UL) >> 24))); \
})
The PowerPC processor provides four instructions for byte-order conversion, lwbrx, lhbrx, stwbrx and sthbrx, which can be used to optimize functions such as __swab16 and __swab32. In addition, the rlwimi instruction of the PowerPC processor can also be used to implement functions such as __swab16 and __swab32.
Endianness must also be considered when processing ordinary files. On a big-endian processor, 32-bit and 16-bit reads and writes of a file behave differently from those on a little-endian processor. Looking at it purely from the software side is far from enough to really understand the difference between big-endian and little-endian. A real understanding has to come from the system perspective, by examining the instruction set, the registers and the data bus in depth.
IX. Understanding endianness from the system perspective
First, two terms, MSB and LSB:
MSB: most significant bit
LSB: least significant bit
Because of endianness, processors differ in their hardware design. From the system point of view, endianness affects software design and hardware design differently, and when both big-endian and little-endian components exist in one processor system, accesses between them must be handled specially.
PowerPC processors dominate the network equipment market. It can be said that the vast majority of communications equipment uses PowerPC processors for protocol processing and other control information processing, which may also be why the vast majority of network protocols use big-endian ordering. Therefore, in the software design of network protocols, a little-endian processor has to handle the endianness conversion in software. Pentium processors dominate the personal computer market, so most peripherals designed for personal computers are little-endian, including some devices used in network equipment, such as PCI bus devices and flash. This requires attention to endianness conversion in the hardware design as well.
The little-endian peripherals mentioned in this article are those whose registers are stored in little-endian order, such as the configuration space of a PCI device or the registers in NOR flash. Some devices, such as DDR memory chips, have no registers stored in little-endian order, so logically no endianness conversion is needed. In the design, the two data buses only need to be connected one-to-one, with no conversion on the bus.
From a practical point of view, a little-endian processor only needs to handle endianness conversion in software, because it requires no conversion when connected to a little-endian peripheral. A big-endian processor needs to handle endianness conversion at hardware design time: its registers, instruction set, data bus, and the connection between the data bus and little-endian peripherals all have to be considered in order to solve the conversion problem when little-endian peripherals are attached. Big-endian and little-endian processors also differ in how they number the bits of registers and data buses.
A 32-bit big-endian processor, such as the MPC8541 based on the e500 core, numbers the most significant bit (MSB) of its registers as 0 and the least significant bit (LSB) as 31, while a 32-bit little-endian processor numbers the most significant bit of its registers as 31 and the least significant bit as 0. Correspondingly, on the data bus of a 32-bit big-endian processor the most significant bit is bit 0 and the least significant bit is bit 31, while on the data bus of a 32-bit little-endian processor the most significant bit is bit 31 and the least significant bit is bit 0.
The bit numbering of the external bus follows the same rule, and it differs between big-endian and little-endian processors depending on whether the data bus is 32, 16 or 8 bits wide. In the following, MSB and LSB are used for both the most or least significant bit and the most or least significant byte, so both are spelled out.
On a 32-bit big-endian data bus, the most significant bit is bit 0 and the most significant byte is bits 0 to 7; the least significant bit is bit 31 and the least significant byte is bits 24 to 31. On a 32-bit little-endian bus, the most significant bit is bit 31 and the most significant byte is bits 31 to 24; the least significant bit is bit 0 and the least significant byte is bits 7 to 0.
On a 16-bit big-endian data bus, the most significant bit is bit 0 and the most significant byte is bits 0 to 7; the least significant bit is bit 15 and the least significant byte is bits 8 to 15. On a 16-bit little-endian bus, the most significant bit is bit 15 and the most significant byte is bits 15 to 8; the least significant bit is bit 0 and the least significant byte is bits 7 to 0.
On an 8-bit big-endian data bus, the most significant bit is bit 0 and the least significant bit is bit 7; the single byte, bits 0 to 7, is both the most and the least significant byte. On an 8-bit little-endian bus, the most significant bit is bit 7 and the least significant bit is bit 0; the single byte is bits 7 to 0.
From the above analysis we can see that, for 8-bit, 16-bit and 32-bit data buses, the positions of the most significant bit and the most significant byte do not change in big-endian mode, and the positions of the least significant bit and the least significant byte do not change in little-endian mode.
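The two bit-numbering conventions can be related with a little arithmetic: bit n in one convention is bit (width - 1 - n) in the other. The sketch below assumes a 32-bit register; the helper names be_bit_mask, le_bit_mask, flip_bit_index and be_field are hypothetical and not taken from any processor's headers.

```c
#include <stdint.h>

/* Mask for bit n of a 32-bit register when bit 0 is the MSB
   (the big-endian numbering described above). */
static uint32_t be_bit_mask(unsigned n)
{
    return UINT32_C(1) << (31u - n);
}

/* Mask for bit n of a 32-bit register when bit 0 is the LSB
   (the little-endian numbering). */
static uint32_t le_bit_mask(unsigned n)
{
    return UINT32_C(1) << n;
}

/* Convert a bit index between the two conventions for a register
   or bus that is width bits wide. */
static unsigned flip_bit_index(unsigned n, unsigned width)
{
    return width - 1u - n;
}

/* Extract bits first..last (big-endian numbering, first <= last)
   from a 32-bit register. */
static uint32_t be_field(uint32_t reg, unsigned first, unsigned last)
{
    unsigned width = last - first + 1u;
    uint32_t mask = (width >= 32u) ? UINT32_MAX : ((UINT32_C(1) << width) - 1u);

    return (reg >> (31u - last)) & mask;
}
```

For the register value 0x12345678, be_field(reg, 0, 7) yields 0x12, the most significant byte, and be_field(reg, 24, 31) yields 0x78, the least significant byte, in line with the 32-bit big-endian bus description.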
For this reason, 8-bit, 16-bit and 32-bit memory accesses (including peripheral accesses) on a big-endian processor always include bits 0 to 7, the most significant byte. On a little-endian processor, 8-bit, 16-bit and 32-bit memory accesses always include bits 7 to 0, the least significant byte. Because big-endian and little-endian processors define their 8-bit, 16-bit and 32-bit data buses differently, how to handle endianness conversion at the system level has to be discussed separately for each. In a big-endian processor system, the processor's accesses to little-endian peripherals must be handled.
Examples in practice
Although the compiler takes care of byte order most of the time, some small details still need careful attention, especially in Ethernet communication, Modbus communication, and software portability. Here is a Modbus example. In Modbus, data must be organized into messages, and the data in a message is big-endian, that is, the low address stores the high byte and the high address stores the low byte. Suppose there is a 16-bit buffer m_regmw[256]. Because it is on the x86 platform, the data in memory is little-endian: m_regmw[0].low, m_regmw[0].high, m_regmw[1].low, m_regmw[1].high, ...
For ease of discussion, suppose m_regmw[0] = 0x3456. In memory this is 0x56, 0x34.
Now suppose you want to send this data. If it is sent directly without conversion, the bytes sent are 0x56, 0x34. Because Modbus is big-endian, the receiver interprets them as 0x5634 instead of the original 0x3456, which is a serious error. So before sending, the little-endian data must be converted to big-endian, that is, the high byte and low byte must be exchanged. You can call the BIGTOLITTLE16(m_regmw[0]) macro from the section on conversion and then send the result to get the correct data.
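An alternative to swapping each register in place is to serialize the registers into the send buffer high byte first, which produces the same big-endian message on any host. This is a sketch of that alternative, not the conversion described above; regs_to_modbus_bytes is a hypothetical name, and the caller is assumed to provide an output buffer of at least 2 * count bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Serialize count 16-bit registers into out, high byte first
   (big-endian). out must hold at least 2 * count bytes.
   Returns the number of bytes written. */
static size_t regs_to_modbus_bytes(const uint16_t *regs, size_t count,
                                   unsigned char *out)
{
    for (size_t i = 0; i < count; i++) {
        out[2 * i]     = (unsigned char)(regs[i] >> 8);   /* high byte */
        out[2 * i + 1] = (unsigned char)(regs[i] & 0xff); /* low byte */
    }
    return 2 * count;
}
```

With a register holding 0x3456, the first two bytes written are 0x34 and 0x56, which a big-endian receiver reads back as 0x3456.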
Example:
Using a pointer:
#include <stdio.h>
int main(void)
{
    unsigned int a = 0x12345678;
    unsigned char *p = (unsigned char *)(&a); // 78 is the low byte, 12 is the high byte
    printf("%s\n", (0x78 == *(p + 0)) ? "Little Endian" : "Big Endian");
    return 0;
}
Using a union:
#include <stdio.h>
int main(void)
{
    union test { unsigned int a; unsigned char b; } c;
    c.a = 0x12345678; // 78 is the low byte, 12 is the high byte
    printf("%s\n", (0x78 == c.b) ? "Little Endian" : "Big Endian");
    return 0;
}