Network Programming: Host Byte Order and Network Byte Order
Recently I was using Python to develop a general-purpose tool for testing engine services. Here I will sort out a concept that is unavoidable in network programming: host byte order and network byte order.
First, let me introduce little-endian mode (LE) and big-endian mode (BE).
1. Little-endian mode
Little-endian mode is the byte order that best matches how people think about values: the low-order byte of a value is stored at the low address, and the high-order byte at the high address. It matches our intuition because the low-order part of a value carries the least weight, so it belongs where the memory address is small; conversely, the high-order part belongs where the memory address is large. For example, in little-endian mode the value 0x12345678 is stored in memory, from the lowest address to the highest, as the bytes 0x78, 0x56, 0x34, 0x12.
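The little-endian layout can be reproduced without depending on the machine the code runs on. The following is a minimal sketch; the helper name to_little_endian is hypothetical and chosen only for illustration. It extracts the bytes with shifts, so index 0 of the result stands for the lowest memory address.

```cpp
#include <array>
#include <cstdint>

// Hypothetical helper: returns the four bytes of `value` in the order a
// little-endian machine would store them (index 0 = lowest address).
std::array<std::uint8_t, 4> to_little_endian(std::uint32_t value) {
    return {
        static_cast<std::uint8_t>(value & 0xFF),          // lowest address: low-order byte
        static_cast<std::uint8_t>((value >> 8) & 0xFF),
        static_cast<std::uint8_t>((value >> 16) & 0xFF),
        static_cast<std::uint8_t>((value >> 24) & 0xFF),  // highest address: high-order byte
    };
}
```

Calling to_little_endian(0x12345678) yields the bytes 0x78, 0x56, 0x34, 0x12, in that order.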
2. Big-endian mode
Big-endian mode is the most intuitive byte order to read: the high-order byte of a value is stored at the low address, and the low-order byte at the high address. Why is it intuitive? You do not have to think about the mapping at all. Write the memory addresses from left to right in ascending order, then fill in the value one byte at a time, from its high-order byte to its low-order byte. For example, in big-endian mode the value 0x12345678 is stored in memory, from the lowest address to the highest, as the bytes 0x12, 0x34, 0x56, 0x78.
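The big-endian layout can be sketched the same way, independent of the machine running the code. The helper name to_big_endian is hypothetical; index 0 of the result stands for the lowest memory address.

```cpp
#include <array>
#include <cstdint>

// Hypothetical helper: returns the four bytes of `value` in the order a
// big-endian machine would store them (index 0 = lowest address).
std::array<std::uint8_t, 4> to_big_endian(std::uint32_t value) {
    return {
        static_cast<std::uint8_t>((value >> 24) & 0xFF),  // lowest address: high-order byte
        static_cast<std::uint8_t>((value >> 16) & 0xFF),
        static_cast<std::uint8_t>((value >> 8) & 0xFF),
        static_cast<std::uint8_t>(value & 0xFF),          // highest address: low-order byte
    };
}
```

Calling to_big_endian(0x12345678) yields the bytes 0x12, 0x34, 0x56, 0x78, which is the same order in which the value is written.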
3. Why are there both big-endian and little-endian modes?
This is because processor vendors adopted different data storage policies. On a processor wider than 8 bits, such as a 16-bit or 32-bit processor, the registers are wider than one byte, so the question of how to order multiple bytes necessarily arises. That is how the big-endian and little-endian storage modes came about.
For example, suppose a 16-bit short x is located at memory address 0x0010 and its value is 0x1122, so 0x11 is the high-order byte and 0x22 is the low-order byte. In big-endian mode, 0x11 is placed at the low address, 0x0010, and 0x22 at the high address, 0x0011. Little-endian mode is the opposite. The commonly used x86 architecture is little-endian, while KEIL C51 is big-endian. Many ARM processors and DSPs are little-endian, and some ARM processors can be switched between big-endian and little-endian mode in hardware.
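The 16-bit example can be modelled in a few lines. This is only a sketch: memory is represented as a map from address to byte, and both ByteOrder and store_short are hypothetical names introduced for illustration.

```cpp
#include <cstdint>
#include <map>

enum class ByteOrder { Big, Little };

// Hypothetical helper: models memory as a map from address to byte and
// stores a 16-bit value at `address` in the requested byte order.
std::map<std::uint16_t, std::uint8_t> store_short(std::uint16_t address,
                                                  std::uint16_t value,
                                                  ByteOrder order) {
    const std::uint8_t high = static_cast<std::uint8_t>(value >> 8);
    const std::uint8_t low = static_cast<std::uint8_t>(value & 0xFF);
    const std::uint16_t next = static_cast<std::uint16_t>(address + 1);
    std::map<std::uint16_t, std::uint8_t> memory;
    if (order == ByteOrder::Big) {
        memory[address] = high;  // high-order byte at the low address
        memory[next] = low;
    } else {
        memory[address] = low;   // low-order byte at the low address
        memory[next] = high;
    }
    return memory;
}
```

With store_short(0x0010, 0x1122, ByteOrder::Big), address 0x0010 holds 0x11 and address 0x0011 holds 0x22; with ByteOrder::Little the two bytes swap places.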
4. Determine whether your machine is big-endian or little-endian
Whether a machine's byte order is big-endian or little-endian depends on the processor type and the installed operating system. In general, a small piece of C++ code is enough to check which mode your machine uses. In the code I wrote here, the integer data is 0x12345678 and the char * pointer p points to the first (lowest) address of data. If the byte p points to is 0x12, the local machine is big-endian; if it is 0x78, the machine is little-endian. The code is as follows:
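    #include <iostream>

    int main()
    {
        // Print the size of int; the check below assumes a 4-byte int.
        int i = sizeof(int);
        std::cout << i << std::endl;
        int data = 0x12345678;
        // p points to the byte stored at the lowest address of data.
        char *p = reinterpret_cast<char *>(&data);
        if (*p == 0x12)
        {
            std::cout << "big end" << std::endl;
        }
        else if (*p == 0x78)
        {
            std::cout << "small end" << std::endl;
        }
        return 0; // exit status 0 signals success
    }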
5. Host byte order and network byte order
The host byte order is the order in which the host itself lays out data in memory, either big-endian or little-endian (you can find out which with the code in section 4). The network byte order is the big-endian byte order.
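To see that network byte order is big-endian regardless of the host, here is a minimal sketch using htonl. It assumes a POSIX system, where htonl is declared in <arpa/inet.h>; the helper name wire_bytes is hypothetical.

```cpp
#include <arpa/inet.h>  // POSIX header that declares htonl and ntohl
#include <array>
#include <cstdint>
#include <cstring>

// Hypothetical helper: converts a host-order value to network byte order
// and returns the bytes as they would be written to the wire.
std::array<std::uint8_t, 4> wire_bytes(std::uint32_t host_value) {
    const std::uint32_t net_value = htonl(host_value);
    std::array<std::uint8_t, 4> bytes{};
    // Copy the in-memory representation of the converted value.
    std::memcpy(bytes.data(), &net_value, bytes.size());
    return bytes;
}
```

On both little-endian and big-endian hosts, wire_bytes(0x12345678) returns 0x12, 0x34, 0x56, 0x78. On a big-endian host htonl simply returns its argument unchanged.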
During network transmission there has to be a standardization step. When host A communicates with host B, the data goes through: A's native storage format -> standardized format -> B's native storage format.
As described above, the native storage format of A or B is that host's byte order, and the standardized format is the network byte order (that is, big-endian): A's host byte order -> network byte order -> B's host byte order.
So when do we need to convert from host byte order to network byte order?
1) If hosts A and B have different byte orders, for example A is big-endian and B is little-endian, then before sending data A should convert from host byte order to network byte order (this step can be skipped, because A's byte order is already the same as the network byte order: both are big-endian). When B receives the data, it must convert from network byte order to host byte order, that is, from big-endian to little-endian.
2) If hosts A and B have the same byte order, say both are little-endian, the conversion between host and network byte order can be skipped during transmission. But if one day I port the program from host A to a host C (big-endian) and use C to communicate with B, the byte orders of C and B no longer match, and multi-byte values will be parsed incorrectly.
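The parsing error described in case 2 can be simulated on any machine. In this sketch the type alias Bytes4 and both function names are hypothetical; the two functions stand in for a little-endian and a big-endian receiver that interpret raw bytes in their own host order.

```cpp
#include <array>
#include <cstdint>

using Bytes4 = std::array<std::uint8_t, 4>;

// Interprets the bytes the way a little-endian host would.
std::uint32_t read_as_little_endian(const Bytes4& b) {
    return static_cast<std::uint32_t>(b[0]) |
           (static_cast<std::uint32_t>(b[1]) << 8) |
           (static_cast<std::uint32_t>(b[2]) << 16) |
           (static_cast<std::uint32_t>(b[3]) << 24);
}

// Interprets the same bytes the way a big-endian host would.
std::uint32_t read_as_big_endian(const Bytes4& b) {
    return (static_cast<std::uint32_t>(b[0]) << 24) |
           (static_cast<std::uint32_t>(b[1]) << 16) |
           (static_cast<std::uint32_t>(b[2]) << 8) |
           static_cast<std::uint32_t>(b[3]);
}
```

If a little-endian sender writes 0x12345678 to the wire without conversion, the bytes are 0x78, 0x56, 0x34, 0x12. A little-endian receiver reads them back as 0x12345678, but a big-endian receiver reads 0x78563412.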
Therefore, in real engineering work, regardless of whether hosts A and B share the same byte order, for the sake of portability and compatibility I recommend that the sender always convert from host byte order to network byte order, and that the receiver always convert from network byte order to host byte order.
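As a rough sketch of this recommendation, the sender can convert every multi-byte field with htons/htonl before writing it to a buffer, and the receiver can convert back with ntohs/ntohl. The Header struct, its fields, and the function names are hypothetical, and a POSIX system providing <arpa/inet.h> is assumed.

```cpp
#include <arpa/inet.h>  // POSIX header: htons, htonl, ntohs, ntohl
#include <array>
#include <cstdint>
#include <cstring>

// Hypothetical message header used only for this illustration.
struct Header {
    std::uint16_t type;
    std::uint32_t length;
};

// Sender side: host byte order -> network byte order.
std::array<std::uint8_t, 6> pack_header(const Header& h) {
    const std::uint16_t type_be = htons(h.type);
    const std::uint32_t length_be = htonl(h.length);
    std::array<std::uint8_t, 6> buf{};
    std::memcpy(buf.data(), &type_be, sizeof type_be);
    std::memcpy(buf.data() + sizeof type_be, &length_be, sizeof length_be);
    return buf;
}

// Receiver side: network byte order -> host byte order.
Header unpack_header(const std::array<std::uint8_t, 6>& buf) {
    std::uint16_t type_be = 0;
    std::uint32_t length_be = 0;
    std::memcpy(&type_be, buf.data(), sizeof type_be);
    std::memcpy(&length_be, buf.data() + sizeof type_be, sizeof length_be);
    Header h;
    h.type = ntohs(type_be);
    h.length = ntohl(length_be);
    return h;
}
```

Because both sides go through the network byte order, the same code works whether the two hosts are little-endian, big-endian, or mixed. Copying field by field with memcpy also avoids relying on struct padding or alignment.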