One, byte order
Byte order, also called endianness, refers to the order in which the bytes of multi-byte data are stored in memory.
On almost all machines, a multi-byte object is stored as a contiguous sequence of bytes. For example, if a C++ variable a of type int has the starting address &a = 0x100, then the four bytes of a are stored at memory locations 0x100, 0x101, 0x102, and 0x103.
Depending on the order in which the integer a is laid out in those 4 contiguous bytes, byte order falls into two categories: big-endian and little-endian. CPUs divide into two major camps accordingly:
- Motorola 6800, PowerPC 970, and SPARC (except V9) processors store data in big-endian order;
- x86 series, VAX, PDP-11, and other processors store data in little-endian order.
In addition, the byte order of some processors, such as ARM and DEC Alpha, is configurable.
Two, big-endian and little-endian
So what is big-endian, and what is little-endian? For example:
The diagram above should be intuitive enough. In other words:
- Big-endian: the high-order (most significant) byte is stored at the low address.
- Little-endian: the low-order (least significant) byte is stored at the low address.
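As a rough illustration of the two definitions, the sketch below lays out the four bytes of a 32-bit value by hand in each order. The helper names to_big_endian and to_little_endian are hypothetical, introduced only for this example, and index 0 of each array stands for the lowest address.

```cpp
#include <array>
#include <cstdint>

// Hypothetical helper: the bytes of v as a big-endian machine would store them.
// Index 0 is the lowest address and holds the most significant byte.
std::array<std::uint8_t, 4> to_big_endian(std::uint32_t v) {
    return {static_cast<std::uint8_t>(v >> 24),
            static_cast<std::uint8_t>(v >> 16),
            static_cast<std::uint8_t>(v >> 8),
            static_cast<std::uint8_t>(v)};
}

// Hypothetical helper: the bytes of v as a little-endian machine would store them.
// Index 0 is the lowest address and holds the least significant byte.
std::array<std::uint8_t, 4> to_little_endian(std::uint32_t v) {
    return {static_cast<std::uint8_t>(v),
            static_cast<std::uint8_t>(v >> 8),
            static_cast<std::uint8_t>(v >> 16),
            static_cast<std::uint8_t>(v >> 24)};
}
```

For the value 0x12345678, to_big_endian gives the bytes 12 34 56 78 from low address to high, while to_little_endian gives 78 56 34 12.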
Advantages of each:
- Big-endian: the sign bit is always in the first byte, so it is easy to tell whether a value is positive or negative.
- Little-endian: a value is laid out the same way whether it is read as 1, 2, or 4 bytes from the same starting address, so converting between data types is very convenient.
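Both advantages can be seen on explicit byte arrays. This is only a sketch: decode_little_endian and is_negative_big_endian are hypothetical helpers written for this example, and the second one assumes a two's-complement signed value.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: interpret the first n bytes at p (n <= 4)
// as a little-endian unsigned integer.
std::uint32_t decode_little_endian(const std::uint8_t* p, std::size_t n) {
    std::uint32_t v = 0;
    for (std::size_t i = 0; i < n; ++i)
        v |= static_cast<std::uint32_t>(p[i]) << (8 * i);
    return v;
}

// Hypothetical helper: for a big-endian two's-complement value, the sign bit
// is the top bit of the very first byte, whatever the total length is.
bool is_negative_big_endian(const std::uint8_t* p) {
    return (p[0] & 0x80) != 0;
}
```

With the little-endian bytes 42 00 00 00, reading 1, 2, or 4 bytes from the same address yields 0x42 each time, which is why narrowing or widening a small value needs no address adjustment. With big-endian data, checking the first byte is enough to decide the sign.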
Three, why pay attention to byte order
If you write a program that runs only on a single machine and never exchanges data with other people's programs, you can ignore byte order altogether.
But what if your program has to interact with other people's programs? For example, when a C++ program interacts with a Java program:
- The order in which a C++ program stores data depends on the CPU of the platform it is compiled for, and the now-common x86 processors are little-endian.
- Java programs always use big-endian order to store data.
Suppose your C++ program, running on x86, passes the bytes starting at the first address of the variable a = 0x12345678 to the Java program. Because Java treats the data as big-endian, it will interpret your data as 0x78563412. Obviously, a problem arises!
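The mismatch can be reproduced in C++ alone by treating a byte array as the data on the wire. In this sketch, read_big_endian plays the role of the big-endian reader and write_big_endian shows one possible fix, namely writing the bytes in an agreed order instead of copying raw memory. Both names are hypothetical.

```cpp
#include <cstdint>

// Hypothetical helper: interpret 4 bytes the way a big-endian reader would.
std::uint32_t read_big_endian(const std::uint8_t bytes[4]) {
    return (static_cast<std::uint32_t>(bytes[0]) << 24) |
           (static_cast<std::uint32_t>(bytes[1]) << 16) |
           (static_cast<std::uint32_t>(bytes[2]) << 8) |
            static_cast<std::uint32_t>(bytes[3]);
}

// Hypothetical helper: write v in big-endian order regardless of the host CPU,
// so that a big-endian reader recovers the original value.
void write_big_endian(std::uint32_t v, std::uint8_t bytes[4]) {
    bytes[0] = static_cast<std::uint8_t>(v >> 24);
    bytes[1] = static_cast<std::uint8_t>(v >> 16);
    bytes[2] = static_cast<std::uint8_t>(v >> 8);
    bytes[3] = static_cast<std::uint8_t>(v);
}
```

An x86 machine stores 0x12345678 as the bytes 78 56 34 12. Passing exactly those bytes to read_big_endian yields 0x78563412, whereas bytes produced by write_big_endian are read back as 0x12345678.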
In addition, network transmission generally uses big-endian order, which is also called network byte order or network order. When two hosts with different byte orders communicate, each must convert its data to network byte order before transmitting it.
Four, determining the machine's byte order
Because the order in which data is stored is determined by the CPU of the platform, we can determine the machine's endianness with a C++ program:
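#include <iostream>

// Prints the byte order of the machine this program runs on.
void Endianness()
{
    int a = 0x12345678;
    // Look at the byte stored at the lowest address of a.
    if (*reinterpret_cast<unsigned char*>(&a) == 0x12)
        std::cout << "Big Endian" << std::endl;
    else
        std::cout << "Little Endian" << std::endl;
}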
Five, network byte order and host byte order
Network byte order: the TCP/IP protocols define their byte order as big-endian, so the byte order used in TCP/IP is usually called network byte order.
Host byte order: the order in which a host stores integers in memory; little-endian is now the more prevalent. (Different CPUs have different byte orders.)
In network communication, you usually need to call the appropriate function to convert between host order and network order. The Berkeley socket API defines a set of conversion functions for 16-bit and 32-bit integers: htonl and htons convert from host order to network order; ntohl and ntohs convert from network order to host order.
Network programs on both Linux and Windows need htons and htonl to convert host byte order to network byte order.
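A minimal sketch of how these functions are typically used when packing a 32-bit integer into a send buffer and unpacking it again. It assumes a POSIX system, where htonl and ntohl are declared in <arpa/inet.h> (on Windows they come from <winsock2.h>). The helper names put_u32_network_order and get_u32_network_order are hypothetical.

```cpp
#include <arpa/inet.h>  // htonl, ntohl (POSIX; assumption: not building on Windows)
#include <cstdint>
#include <cstring>

// Hypothetical helper: store v into buf in network (big-endian) byte order.
void put_u32_network_order(std::uint32_t v, std::uint8_t buf[4]) {
    const std::uint32_t net = htonl(v);   // host order -> network order
    std::memcpy(buf, &net, sizeof net);   // the bytes in memory are now big-endian
}

// Hypothetical helper: read a value that was stored in network byte order.
std::uint32_t get_u32_network_order(const std::uint8_t buf[4]) {
    std::uint32_t net;
    std::memcpy(&net, buf, sizeof net);
    return ntohl(net);                    // network order -> host order
}
```

Whatever the host's byte order, the buffer filled by put_u32_network_order for 0x12345678 holds the bytes 12 34 56 78, and get_u32_network_order recovers the original value.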
On an Intel machine, run the following program:
#include <stdio.h>
#include <arpa/inet.h>  /* htons; on Windows use <winsock2.h> */
int main() { printf("%d\n", htons(16)); return 0; }
The result is 4096, which looks strange at first glance.
The explanation is as follows. The number 16 is 0x0010 in hexadecimal, and the number 4096 is 0x1000. Since Intel machines are little-endian, the number 16 is actually stored in memory as the bytes 10 00, and 4096 as the bytes 00 10. So for the data in the network packet to be sent as 00 10, it has to go through the htons conversion. On big-endian machines, such as some IBM machines, no such byte-order conversion takes place, but for the sake of portability it is still best to call this function.
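The arithmetic can be checked with a hand-written 16-bit byte swap. This only sketches what htons amounts to on a little-endian host, it is not the library's actual implementation, and swap_bytes16 is a hypothetical name.

```cpp
#include <cstdint>

// Hypothetical helper: exchange the two bytes of a 16-bit value, which is the
// effect htons has on a little-endian host (on a big-endian host htons changes nothing).
constexpr std::uint16_t swap_bytes16(std::uint16_t v) {
    return static_cast<std::uint16_t>((v << 8) | (v >> 8));
}
```

Swapping 0x0010 (16) gives 0x1000 (4096), matching the program's output, and swapping again restores 16.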
Also note that a value no wider than one byte (8 bits) should not be converted with htons, because the byte is the smallest unit that byte order applies to: a single byte has no order to swap.
"Network Programming series" One: big-endian and small-end representations of byte order