Network byte order and host byte order (big-endian and little-endian)

Source: Internet
Author: User

The origins of the terms big-endian and little-endian

There is an interesting story behind the terms big-endian and little-endian, which come from Jonathan Swift's Gulliver's Travels. The two great powers Lilliput and Blefuscu had been at war for the past 36 moons. The cause of the war was this: everyone agreed that the original way to eat an egg was to break it at the larger end, but the emperor's grandfather, as a boy, cut his finger while breaking an egg that way. His father, the emperor at the time, therefore ordered all his subjects to break their eggs at the smaller end, with heavy penalties for anyone who disobeyed. The people resented the decree so much that several rebellions followed, in which one emperor lost his life and another lost his throne. The rebellions were stirred up by the rulers of the neighbouring country Blefuscu, and once they were put down, the exiles fled to that empire for refuge. It is estimated that eleven thousand people, at various times, chose to die rather than break their eggs at the smaller end. The story was in fact a satire on the continuing conflict between Britain and France. Danny Cohen, a pioneer of network protocols, was the first to use the two terms for byte order, and the usage was later widely adopted.

Big-endian and little-endian

Big-endian and little-endian are defined as follows:
1) Little-endian: the low-order byte is stored at the lower memory address, and the high-order byte at the higher memory address.
2) Big-endian: the high-order byte is stored at the lower memory address, and the low-order byte at the higher memory address.
For example, the number 0x12345678 is laid out in memory as follows:
1) Big-endian mode:

Low address ------> High address
0x12 | 0x34 | 0x56 | 0x78
2) Little-endian mode:

Low address ------> High address
0x78 | 0x56 | 0x34 | 0x12
As can be seen, big-endian storage resembles the way a string is stored: the bytes appear in memory in the same order in which the number is written.
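As a rough illustration of the two layouts above, the following C sketch copies the bytes of 0x12345678 out of memory in address order. The function name bytes_in_memory is made up for this example and is not part of any library, and the result depends on the machine the code runs on.

```c
#include <stdint.h>
#include <string.h>

/* Copy the four bytes of value into out[], lowest memory address first.
   bytes_in_memory is a hypothetical helper name used only for this sketch. */
void bytes_in_memory(uint32_t value, unsigned char out[4])
{
    memcpy(out, &value, sizeof value);
}
```

Calling bytes_in_memory(0x12345678, out) on a little-endian machine such as x86 fills out with 0x78, 0x56, 0x34, 0x12; on a big-endian machine it fills out with 0x12, 0x34, 0x56, 0x78.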

Specific examples

The 16-bit number 0x1234 is stored in memory as follows in little-endian mode and in big-endian mode (assuming storage starts at address 0x4000):

Memory address | Little-endian content | Big-endian content
0x4000         | 0x34                  | 0x12
0x4001         | 0x12                  | 0x34

The 32-bit number 0x12345678 is stored in memory as follows in little-endian mode and in big-endian mode (assuming storage starts at address 0x4000):

Memory address | Little-endian content | Big-endian content
0x4000         | 0x78                  | 0x12
0x4001         | 0x56                  | 0x34
0x4002         | 0x34                  | 0x56
0x4003         | 0x12                  | 0x78

Neither byte order is better than the other; the advantage of each is the disadvantage of the other:

Little-endian mode: casting data to a different width does not require adjusting the byte contents, because 1-, 2-, and 4-byte values are stored the same way starting from the same address.
Big-endian mode: the sign bit is always in the first byte, which makes it easy to tell whether a value is positive or negative.
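The two advantages can be sketched in C on explicit byte arrays, so that the result does not depend on the machine running the code. The helper names read_le and is_negative_be are invented for this illustration, and two's-complement representation is assumed for the sign test.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode the first n bytes (n = 1, 2 or 4) at buf as a little-endian
   unsigned integer. Hypothetical helper for this sketch. */
uint32_t read_le(const unsigned char *buf, size_t n)
{
    uint32_t v = 0;
    for (size_t i = 0; i < n; i++)
        v |= (uint32_t)buf[i] << (8 * i);
    return v;
}

/* For a big-endian two's-complement integer, the sign bit is the top bit
   of the first byte, whatever the width. Hypothetical helper. */
int is_negative_be(const unsigned char *buf)
{
    return (buf[0] & 0x80) != 0;
}
```

With the little-endian bytes 0x2A, 0x00, 0x00, 0x00, reading 1, 2, or 4 bytes from the same starting address gives 42 each time. With the big-endian bytes 0xFF, 0xFF, 0xFF, 0xD6 (the 32-bit value -42), looking at the first byte alone is enough to see that the value is negative.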

Why are there two byte orders?

Computer memory is addressed in bytes: each address corresponds to one byte, and one byte is 8 bits. A language such as C, however, has not only the 8-bit char but also the 16-bit short and the 32-bit long (the exact widths depend on the compiler). In addition, on processors wider than 8 bits, such as 16-bit or 32-bit processors, the registers are wider than one byte, so the question arises of how to arrange the bytes of a multi-byte value. This is what gives rise to the big-endian and little-endian storage modes. For example, take a 16-bit short x that is stored at address 0x0010 and has the value 0x1122; 0x11 is the high byte and 0x22 is the low byte. In big-endian mode, 0x11 is placed at the low address, 0x0010, and 0x22 at the high address, 0x0011. Little-endian mode is exactly the opposite.

Determining the byte order of the machine

A small test program can be written to determine the byte order of the machine:

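#include <windows.h>  /* BOOL, TRUE, FALSE; define them yourself on other platforms */

BOOL IsBigEndian()
{
    /* short is assumed to be 16 bits. With a 32-bit int, the lowest-address
       byte of 0x1234 on a big-endian machine would be 0x00, not 0x12. */
    short a = 0x1234;
    /* Cast to char to read the single byte at the lowest address of a. */
    char b = *(char *)&a;
    if (b == 0x12)
    {
        return TRUE;
    }
    return FALSE;
}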

All members of a union are stored starting from the same low address, which makes a union a convenient way to find out whether the CPU reads and writes memory in little-endian or big-endian mode:

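/* BOOL, TRUE and FALSE as in <windows.h> */
BOOL IsBigEndian()
{
    union NUM
    {
        short a;  /* assumed to be 16 bits */
        char b;   /* overlays the lowest-address byte of a */
    } num;
    num.a = 0x1234;
    if (num.b == 0x12)
    {
        return TRUE;
    }
    return FALSE;
}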

Where each byte order is used

Byte order in general: common operating systems are little-endian, and communication protocols are big-endian.
Byte order of common CPUs:
Big-endian: PowerPC, IBM, Sun
Little-endian: x86, DEC
ARM can work in either big-endian or little-endian mode.

Byte order of common file formats:
Adobe PS: big-endian
BMP: little-endian
DXF (AutoCAD): variable
GIF: little-endian
JPEG: big-endian
MacPaint: big-endian
RTF: little-endian
In addition, Java and all the TCP/UDP/IP network communication protocols use big-endian encoding.

Related issues

Are byte-order conversions required for network communication?

Platforms with the same byte order can exchange their own data over the network without byte-order conversion, but conversion is mandatory for network communication between platforms with different byte orders. (This applies to the application's own data; values handed to the protocol stack, such as port numbers and addresses, must always be in network byte order.)
The reasons are as follows. The network protocol specifies that the first byte received is the high byte and is stored at the low address, so the sender starts at the low address, expecting to find the high byte of the data there. On a little-endian platform, multi-byte data is stored with the low byte at the low address. The sender's protocol function fetches from the low address first (it wants the high byte but actually gets the low byte), and the receiver's protocol function stores the first byte received at the low address (it expects the high byte but actually receives the low byte). The two reversals cancel out, so between two little-endian platforms the data is still sent and received correctly. If both sides of such an exchange do convert, the data is also transferred correctly, but the conversion achieves nothing and wastes resources. Between platforms with different byte orders, however, skipping the conversion corrupts the data. The byte-order conversion functions act according to the storage mode of the current platform: on a big-endian platform they return the value unchanged, and on a little-endian platform they convert between host byte order and network byte order.
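On POSIX systems, the conversion functions described above are htons, htonl, ntohs, and ntohl from <arpa/inet.h>. The sketch below shows one way they might be used to place a 32-bit value into a send buffer and read it back; pack_u32 and unpack_u32 are hypothetical names chosen for this example.

```c
#include <arpa/inet.h>  /* htonl, ntohl (POSIX) */
#include <stdint.h>
#include <string.h>

/* Store a host-order value into buf in network (big-endian) byte order.
   pack_u32 is a hypothetical helper name. */
void pack_u32(uint32_t host_value, unsigned char buf[4])
{
    uint32_t net = htonl(host_value);  /* no-op on big-endian hosts */
    memcpy(buf, &net, sizeof net);
}

/* Read a network-order value from buf and return it in host order.
   unpack_u32 is a hypothetical helper name. */
uint32_t unpack_u32(const unsigned char buf[4])
{
    uint32_t net;
    memcpy(&net, buf, sizeof net);
    return ntohl(net);
}
```

Because the conversion is a no-op on a big-endian host, code written this way behaves the same on both kinds of platform: pack_u32(0x12345678, buf) always leaves 0x12, 0x34, 0x56, 0x78 in buf.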

Network byte order

Data is transmitted over the network as a byte stream. For a multi-byte value, which byte is transmitted first? In other words, when the receiver gets the first byte, should it treat it as the high byte or the low byte? This is an important question. The TCP/UDP/IP protocols specify that the first byte received is the high byte, which requires the sender to send the high byte first. When sending, the sender first transmits the byte at the starting address of the value in memory. That byte must therefore be the high byte (that is, the high byte is stored at the low address), which means the multi-byte value must be stored in memory in big-endian form before it is sent. Network byte order is therefore big-endian.
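The rule that the high byte goes first can also be written with shifts, which makes no assumption about the byte order of the host. This is only a sketch; put_be32 and get_be32 are names invented here.

```c
#include <stdint.h>

/* Write value to buf with the high byte first (network byte order).
   put_be32 is a hypothetical helper name. */
void put_be32(uint32_t value, unsigned char buf[4])
{
    buf[0] = (unsigned char)(value >> 24);  /* high byte, sent first */
    buf[1] = (unsigned char)(value >> 16);
    buf[2] = (unsigned char)(value >> 8);
    buf[3] = (unsigned char)(value);        /* low byte, sent last */
}

/* Rebuild the value from bytes received in network byte order.
   get_be32 is a hypothetical helper name. */
uint32_t get_be32(const unsigned char buf[4])
{
    return ((uint32_t)buf[0] << 24) |
           ((uint32_t)buf[1] << 16) |
           ((uint32_t)buf[2] << 8)  |
            (uint32_t)buf[3];
}
```

Since the shifts operate on the value rather than on its memory representation, the same code is correct on big-endian and little-endian hosts.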

Afterword

The PowerPC processor provides four instructions, LWBRX, LHBRX, STWBRX, and STHBRX, for byte-order conversion, which can be used to optimize functions such as __SWAB16 and __SWAP32. The RLWIMI instruction of the PowerPC processor can also be used to implement such functions.
Endianness must also be considered when handling ordinary files: 32-bit and 16-bit reads and writes of a file behave differently on a big-endian processor than on a little-endian processor. Looking at the problem purely from the software side is not enough for a real understanding of the difference between big-endian and little-endian. That requires a system-level perspective, looking in depth at the instruction set, the registers, and the data bus.
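For reference, byte-swapping functions of the kind mentioned above can be written in portable C roughly as follows. The names swap_bytes16 and swap_bytes32 are chosen for this sketch and are not the functions named in the text; a real implementation may map to dedicated instructions instead.

```c
#include <stdint.h>

/* Exchange the two bytes of a 16-bit value. Hypothetical helper name. */
uint16_t swap_bytes16(uint16_t x)
{
    return (uint16_t)((x << 8) | (x >> 8));
}

/* Reverse the four bytes of a 32-bit value. Hypothetical helper name. */
uint32_t swap_bytes32(uint32_t x)
{
    return ((x & 0x000000FFu) << 24) |
           ((x & 0x0000FF00u) << 8)  |
           ((x & 0x00FF0000u) >> 8)  |
           ((x & 0xFF000000u) >> 24);
}
```

For example, swap_bytes16(0x1234) returns 0x3412 and swap_bytes32(0x12345678) returns 0x78563412. Applying either function twice gives back the original value.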

Understanding endianness from a system perspective

First, two terms, MSB and LSB:
MSB: most significant bit
LSB: least significant bit

Processors differ in their hardware design because of endianness. From a system point of view, endianness affects software design and hardware design differently, and when both big-endian and little-endian components exist in one processor system, accesses between them must be handled specially.
PowerPC processors dominate the networking market; it is fair to say that the vast majority of communications equipment uses PowerPC processors for protocol processing and other control processing, which may be the reason why the vast majority of network protocols use big-endian addressing. Consequently, in the software design of network protocols, a little-endian processor has to handle the endianness conversion in software. Pentium processors dominate the personal computer market, so most peripherals designed for personal computers are little-endian, including some devices that are also used in network equipment, such as PCI bus devices and flash memory. This in turn means the hardware design must pay attention to endianness conversion.
The little-endian peripherals mentioned in this article are those whose registers are stored in little-endian mode, such as the configuration space of a PCI device or the registers in NOR flash. Some devices, such as DDR memory chips, have no registers stored in little-endian mode, so logically no endianness conversion is needed for them. In the design, the data bus lines of the processor and the device only need to be connected one-to-one, without any conversion of the bus.
From a practical point of view, a little-endian processor only needs to handle endianness conversion in software, because it requires no conversion when connected to a little-endian peripheral. A big-endian processor must deal with endianness conversion at hardware design time: its registers, instruction set, data bus, and connections to little-endian peripherals all have to take the conversion into account. Big-endian and little-endian processors also differ in how they number the bits of registers and data buses.
A 32-bit big-endian processor, such as the MPC8541 based on the e500 core, numbers the most significant bit (MSB) of a register as bit 0 and the least significant bit (LSB) as bit 31, while a 32-bit little-endian processor numbers the most significant bit of a register as bit 31 and the least significant bit as bit 0. Correspondingly, on the data bus of a 32-bit big-endian processor the most significant bit is bit 0 and the least significant bit is bit 31, while on the data bus of a 32-bit little-endian processor the most significant bit is bit 31 and the least significant bit is bit 0.
The bit numbering of the external bus follows the same rule, and it depends on whether the data bus is 32, 16, or 8 bits wide. In the table below, the most significant and least significant bit are single bit numbers, and the most significant and least significant byte are bit fields of the bus:

Bus width | Mode          | Most significant bit | Most significant byte | Least significant bit | Least significant byte
32-bit    | Big-endian    | 0                    | 0 to 7                | 31                    | 24 to 31
32-bit    | Little-endian | 31                   | 31 to 24              | 0                     | 7 to 0
16-bit    | Big-endian    | 0                    | 0 to 7                | 15                    | 8 to 15
16-bit    | Little-endian | 15                   | 15 to 8               | 0                     | 7 to 0
8-bit     | Big-endian    | 0                    | 0 to 7                | 7                     | 0 to 7
8-bit     | Little-endian | 7                    | 7 to 0                | 0                     | 7 to 0

From the above, for data buses 8, 16, and 32 bits wide, the positions of the most significant bit and the most significant byte never change in big-endian mode, and the positions of the least significant bit and the least significant byte never change in little-endian mode.
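The two bit-numbering conventions described above are related by a simple formula: bit n in one convention is bit (width - 1 - n) in the other. The following sketch, with function names invented for this example, converts a bit number between the two conventions and builds a 32-bit mask from a bit number given in the big-endian (bit 0 is most significant) convention.

```c
#include <stdint.h>

/* Convert a bit number between "bit 0 is most significant" and
   "bit 0 is least significant" numbering for a bus or register of the
   given width. The mapping is its own inverse. Hypothetical helper. */
unsigned flip_bit_index(unsigned bit, unsigned width)
{
    return width - 1u - bit;
}

/* Mask for a bit of a 32-bit register, numbered with bit 0 as the most
   significant bit. Hypothetical helper. */
uint32_t msb0_mask32(unsigned bit)
{
    return UINT32_C(1) << (31u - bit);
}
```

For instance, bits 0 to 7 in the big-endian numbering of a 32-bit bus correspond to bits 31 to 24 in the little-endian numbering, which is the most significant byte in both cases.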
For this reason, 8-bit, 16-bit, and 32-bit memory accesses (including peripheral accesses) by a big-endian processor generally include bits 0 to 7, the most significant byte, while 8-bit, 16-bit, and 32-bit memory accesses by a little-endian processor include bits 7 to 0, the least significant byte. Because big-endian and little-endian processors define their 8-bit, 16-bit, and 32-bit data buses differently, endianness conversion at the system level has to be discussed separately for each case. In a big-endian processor system, the accesses of the big-endian processor to little-endian peripherals must be handled.

Examples in practice

Although the compiler often takes care of byte order, some details still need careful thought, especially in Ethernet communication, Modbus communication, and software portability. Here is an example from Modbus communication. In Modbus, data has to be organized into messages, and the data in a message is big-endian, that is, the high byte is stored at the low address and the low byte at the high address. Suppose there is a buffer of 16-bit values, m_regmw[256]. Because it lives on an x86 platform, the data in memory is little-endian: m_regmw[0].low, m_regmw[0].high, m_regmw[1].low, m_regmw[1].high, ...
For ease of discussion, suppose m_regmw[0] = 0x3456; in memory it is stored as 0x56, 0x34.
Now suppose this data is to be sent. If it is sent directly without conversion, the bytes sent are 0x56, 0x34. Modbus is big-endian, so the receiver interprets the data as 0x5634 instead of the original 0x3456, which is a serious error. Before sending, therefore, the little-endian data must be converted to big-endian by swapping the high byte and the low byte, for example by calling BigtoLittle16(m_regmw[0]); the data sent is then correct.
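A minimal sketch of this conversion for a whole register buffer is shown below. It is not the BigtoLittle16 function mentioned above, whose implementation is not given here; regs_to_modbus is a hypothetical name, and the sketch writes the high byte of each register first, so it produces the same output on any host.

```c
#include <stddef.h>
#include <stdint.h>

/* Pack count 16-bit registers into out (2 * count bytes), high byte first,
   as the data part of a Modbus message expects.
   regs_to_modbus is a hypothetical helper name. */
void regs_to_modbus(const uint16_t *regs, size_t count, unsigned char *out)
{
    for (size_t i = 0; i < count; i++) {
        out[2 * i]     = (unsigned char)(regs[i] >> 8);    /* high byte */
        out[2 * i + 1] = (unsigned char)(regs[i] & 0xFF);  /* low byte */
    }
}
```

With a register holding 0x3456, the output bytes are 0x34, 0x56, which the receiver correctly interprets as 0x3456.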
