"Network Programming Series" Part One: big-endian and little-endian byte order

Source: Internet
Author: User
Tags: htons

One, byte order

Byte order, also called endianness, is the order in which the bytes of multi-byte data are stored in memory.

On almost all machines, a multi-byte object is stored as a contiguous sequence of bytes. For example, if a C++ variable a of type int (4 bytes here) has the starting address &a = 0x100, then the four bytes of a are stored at memory locations 0x100, 0x101, 0x102, and 0x103.
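The contiguous layout described above can be checked with a short sketch. The helper name byte_addresses is hypothetical and not part of the original article; it assumes a platform with a flat address space, where converting a pointer to std::uintptr_t preserves byte offsets.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper for illustration: returns the address of every byte
// that makes up `value`, from the lowest address to the highest.
template <typename T>
std::vector<std::uintptr_t> byte_addresses(const T& value) {
    const unsigned char* first = reinterpret_cast<const unsigned char*>(&value);
    std::vector<std::uintptr_t> addresses;
    for (std::size_t i = 0; i < sizeof(T); ++i) {
        addresses.push_back(reinterpret_cast<std::uintptr_t>(first + i));
    }
    return addresses;
}
```

Calling byte_addresses(a) for an int a returns sizeof(int) addresses, each one greater than the previous by exactly 1. The actual values depend on where the variable happens to live.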

Depending on the order in which the bytes of the integer a are stored in those 4 contiguous bytes of memory, byte order falls into two categories: big-endian and little-endian. CPUs correspondingly fall into two main camps:

    • Motorola 6800, PowerPC 970, and SPARC (except V9) processors store data in big-endian order;

    • x86 series, VAX, PDP-11, and other processors store data in little-endian order.

In addition, some processors, such as ARM and DEC Alpha, have a configurable byte order.

Two, big-endian and little-endian

So what is big-endian, and what is little-endian?

[Figure: diagram of big-endian and little-endian byte layout]

In other words:

    • Big-endian stores the most significant (high-order) byte at the lowest address.
    • Little-endian stores the least significant (low-order) byte at the lowest address.
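To make the two definitions concrete, here is a minimal sketch that computes both layouts with shifts, so the result does not depend on the machine running it. The function names are hypothetical and chosen only for illustration; index 0 of each array stands for the lowest address.

```cpp
#include <array>
#include <cstdint>

// Bytes of `value` as a big-endian machine would store them:
// most significant byte at index 0 (the lowest address).
std::array<std::uint8_t, 4> big_endian_layout(std::uint32_t value) {
    return {{static_cast<std::uint8_t>(value >> 24),
             static_cast<std::uint8_t>(value >> 16),
             static_cast<std::uint8_t>(value >> 8),
             static_cast<std::uint8_t>(value)}};
}

// Bytes of `value` as a little-endian machine would store them:
// least significant byte at index 0 (the lowest address).
std::array<std::uint8_t, 4> little_endian_layout(std::uint32_t value) {
    return {{static_cast<std::uint8_t>(value),
             static_cast<std::uint8_t>(value >> 8),
             static_cast<std::uint8_t>(value >> 16),
             static_cast<std::uint8_t>(value >> 24)}};
}
```

For 0x12345678 stored at address 0x100, big_endian_layout gives 12 34 56 78 (0x100 holds 0x12), while little_endian_layout gives 78 56 34 12 (0x100 holds 0x78).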

Advantages of each:

    1. Big-endian: the sign bit is always in the first byte, so it is easy to tell whether a value is positive or negative.
    2. Little-endian: a value is laid out the same way from its starting address whether it is 1, 2, or 4 bytes long, so converting between data types is very convenient.
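Both advantages can be sketched on plain byte arrays, independent of the host CPU. The two function names below are hypothetical, and the arrays stand in for memory, with index 0 as the lowest address.

```cpp
#include <cstdint>

// Big-endian: the sign bit of a signed integer is the top bit of the byte
// at the lowest address, so one byte is enough to test the sign.
bool is_negative_big_endian(const std::uint8_t* bytes) {
    return (bytes[0] & 0x80) != 0;
}

// Little-endian: reading `width` bytes (1, 2, or 4) from the same start
// address yields the low-order part of the value, so a small value reads
// the same at every width.
std::uint32_t read_little_endian(const std::uint8_t* bytes, int width) {
    std::uint32_t result = 0;
    for (int i = 0; i < width; ++i) {
        result |= static_cast<std::uint32_t>(bytes[i]) << (8 * i);
    }
    return result;
}
```

With the little-endian bytes 2A 00 00 00, read_little_endian returns 42 for widths 1, 2, and 4, which is why narrowing a type does not require moving the pointer.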

Three, why byte order matters

If your program runs only on a single machine and never exchanges data with other people's programs, you can ignore byte order altogether.

But what if your program has to interact with other people's programs? For example, when a C++ program interacts with a Java program:

    • In a program written in C++, the byte order of stored data depends on the CPU of the platform it is compiled for; the now common x86 processors are little-endian.

    • Programs written in Java handle data in big-endian order regardless of the platform (for example, DataOutputStream writes big-endian, and ByteBuffer is big-endian by default).

Suppose your C++ program, running on a little-endian x86 machine, sends the raw bytes of the variable a = 0x12345678 to the Java program. Because Java interprets the data as big-endian, it will read your value as 0x78563412. Clearly, that is a problem!
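The mismatch can be reproduced in a few lines. This is only a sketch: the sender and receiver are simulated by two hypothetical functions in one C++ program, rather than by a real C++ and Java pair.

```cpp
#include <cstdint>

// What a little-endian sender emits if it copies the variable's memory
// verbatim: the byte at the lowest address (least significant) goes first.
void raw_little_endian_bytes(std::uint32_t value, std::uint8_t out[4]) {
    for (int i = 0; i < 4; ++i) {
        out[i] = static_cast<std::uint8_t>(value >> (8 * i));
    }
}

// How a big-endian reader interprets four incoming bytes:
// the first byte is taken as the most significant one.
std::uint32_t read_as_big_endian(const std::uint8_t in[4]) {
    return (static_cast<std::uint32_t>(in[0]) << 24) |
           (static_cast<std::uint32_t>(in[1]) << 16) |
           (static_cast<std::uint32_t>(in[2]) << 8) |
            static_cast<std::uint32_t>(in[3]);
}
```

Passing 0x12345678 through raw_little_endian_bytes and then read_as_big_endian yields 0x78563412, the corrupted value described above.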

In addition, network transmission generally uses big-endian, which is also called network byte order, or network order. When two hosts with different byte orders communicate, each must convert its data to network byte order before transmitting it.

Four, determining the machine's byte order

Because the byte order in which data is stored is determined by the CPU of the platform, we can determine the machine's endianness with a C++ program:

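    #include <iostream>

    // Prints the byte order of the machine this code runs on.
    void Endianness()
    {
        int a = 0x12345678;
        // Inspect the byte stored at the lowest address of a.
        if (*reinterpret_cast<unsigned char*>(&a) == 0x12)
            std::cout << "Big Endian" << std::endl;
        else
            std::cout << "Little Endian" << std::endl;
    }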
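A variant of the check above returns a value instead of printing, which is easier to use from other code. This is a sketch, not the article's own method: the function name is hypothetical, and it uses std::memcpy to read the first byte rather than a pointer cast.

```cpp
#include <cstdint>
#include <cstring>

// Returns true when the byte at the lowest address of a multi-byte integer
// is its least significant byte, i.e. the host is little-endian.
bool is_little_endian() {
    const std::uint32_t probe = 1;   // bytes: 01 00 00 00 on little-endian
    unsigned char first_byte = 0;
    std::memcpy(&first_byte, &probe, 1);
    return first_byte == 1;
}
```

On an x86 machine this is expected to return true. On a big-endian machine the first byte of probe is 0, so it returns false.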

Five, network byte order and host byte order

Network byte order: the TCP/IP protocols define their byte order as big-endian, so the byte order used in TCP/IP is called network byte order.

Host byte order: the order in which a host stores integers in memory. Little-endian is now the more prevalent choice, but different CPUs use different byte orders.

In network communication, you usually need to call the appropriate functions to convert between host order and network order. The Berkeley socket API defines a set of functions for converting 16-bit and 32-bit integers between network byte order and host byte order: htonl and htons convert from host order to network order; ntohl and ntohs convert from network order to host order.
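A minimal sketch of how these functions are typically used when filling and reading a message buffer. The helper names pack_u32 and unpack_u32 are hypothetical; htonl and ntohl are the real socket API calls, taken here from the POSIX header <arpa/inet.h> (on Windows they come from <winsock2.h> instead).

```cpp
#include <arpa/inet.h>  // htonl, ntohl (POSIX)
#include <cstdint>
#include <cstring>

// Writes `value` into out[0..3] in network byte order (big-endian).
void pack_u32(std::uint32_t value, unsigned char* out) {
    const std::uint32_t net = htonl(value);   // host order -> network order
    std::memcpy(out, &net, sizeof net);
}

// Reads a 32-bit integer that was stored in network byte order.
std::uint32_t unpack_u32(const unsigned char* in) {
    std::uint32_t net = 0;
    std::memcpy(&net, in, sizeof net);
    return ntohl(net);                        // network order -> host order
}
```

After pack_u32(0x12345678, buf), the buffer holds 12 34 56 78 on any host, and unpack_u32(buf) returns 0x12345678 again. The 16-bit functions htons and ntohs are used the same way for values such as port numbers.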

In both Linux and Windows network programming, the htons and htonl functions are used to convert host byte order to network byte order.

On an Intel machine, run the following program:

    #include <stdio.h>
    #include <arpa/inet.h>  /* htons; on Windows include <winsock2.h> instead */

    int main() { printf("%d\n", htons(16)); return 0; }

The result is 4096, which looks strange at first glance.

The explanation is as follows. The number 16 is 0x0010 in hexadecimal, and the number 4096 is 0x1000. Because Intel machines are little-endian, the number 16 is actually stored in memory as the bytes 10 00, while 4096 is stored as 00 10. For the bytes in the network packet to be 00 10, the value must therefore be converted with htons; read back as a host integer on a little-endian machine, those bytes are 0x1000, that is, 4096. On a big-endian machine, such as some IBM systems, no byte-order conversion takes place, but for the portability of the program it is still best to call this function.
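On a little-endian host, the effect of htons amounts to swapping the two bytes of its argument. The sketch below spells that swap out by hand; swap_bytes16 is a hypothetical name for illustration and is not how any particular library implements htons.

```cpp
#include <cstdint>

// Exchanges the high and low bytes of a 16-bit value.
std::uint16_t swap_bytes16(std::uint16_t value) {
    return static_cast<std::uint16_t>((value << 8) | (value >> 8));
}
```

swap_bytes16(16) returns 4096 (0x0010 becomes 0x1000), matching the output of the program above, and applying the swap twice gives back the original value. On a big-endian host htons returns its argument unchanged, so no swap is needed.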

Also note that a value no larger than one byte (8 bits) should not be converted with htons, because the byte is the smallest unit to which byte order applies on a host.
