This article is from the CSDN blog. When reposting, please cite the source: http://blog.csdn.net/yu444/archive/2010/05/13/5587781.aspx
What are network byte order and host byte order?
In network programming, data must be converted so that both ends agree on a single byte "format".
Briefly:
Network byte order (NBO):
Multi-byte values are stored from the most significant byte to the least significant byte. Using one uniform byte order on the network avoids compatibility problems.
Host byte order (HBO):
Host byte order differs from machine to machine and depends on the CPU design.
Different computer architectures sometimes store data in different byte orders. For example, Intel-based computers store data in the opposite order from Motorola-based Macintosh computers. The Intel byte order is called "little-endian"; the order used by the Motorola-based Macintosh, which is also the Internet standard, is called "big-endian".
Big-endian: the high-order byte of a word is stored at the lowest address of the word's memory area.
Little-endian: the low-order byte of a word is stored at the lowest address of the word's memory area.
In detail:
Different CPUs use different byte orders, that is, different orders in which the bytes of an integer are stored in memory. This is called the host byte order.
The two most common are:
1. Little-endian: the low-order byte is stored at the starting address
2. Big-endian: the high-order byte is stored at the starting address
LE (little-endian)
The byte order that best matches the way people reason about values
The low address stores the low-order byte
The high address stores the high-order byte
Why does it match the way people reason? Because the first impression is that
the low-order part of a value is "small", so it belongs at the small memory address, that is, the low address.
Conversely, the high-order part belongs at the large memory address, that is, the high address.
BE (big-endian)
The most intuitive byte order
The low address stores the high-order byte
The high address stores the low-order byte
Why is it intuitive? There is no correspondence to work out.
Just write the memory addresses from left to right, from low to high,
write the value in the usual order, from the high-order digits to the low-order digits,
then line the two up and fill in the bytes one at a time.
Example: how the double word (DWORD) 0x01020304 is stored in memory
Memory address  4000  4001  4002  4003
LE              04    03    02    01
BE              01    02    03    04
Example: if we write 0x1234ABCD to memory starting at address 0x0000, the result is
Address  Big-endian  Little-endian
0x0000   0x12        0xCD
0x0001   0x34        0xAB
0x0002   0xAB        0x34
0x0003   0xCD        0x12
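The table above can be reproduced without depending on the byte order of the machine running the code. The following is a minimal sketch; the helper names store_be32, store_le32 and print_layout are hypothetical and are not part of any library. It builds each layout explicitly with shifts.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: write v into out[0..3], most significant byte first. */
void store_be32(uint32_t v, unsigned char out[4])
{
    out[0] = (unsigned char)(v >> 24);
    out[1] = (unsigned char)(v >> 16);
    out[2] = (unsigned char)(v >> 8);
    out[3] = (unsigned char)v;
}

/* Hypothetical helper: write v into out[0..3], least significant byte first. */
void store_le32(uint32_t v, unsigned char out[4])
{
    out[0] = (unsigned char)v;
    out[1] = (unsigned char)(v >> 8);
    out[2] = (unsigned char)(v >> 16);
    out[3] = (unsigned char)(v >> 24);
}

/* Print one row per offset: offset, big-endian byte, little-endian byte. */
void print_layout(uint32_t v)
{
    unsigned char be[4], le[4];
    int k;

    store_be32(v, be);
    store_le32(v, le);
    for (k = 0; k < 4; k++)
        printf("0x%04X  0x%02X  0x%02X\n", (unsigned)k, be[k], le[k]);
}
```

Calling print_layout(0x1234ABCD) prints the same four rows as the table. Because the bytes are produced by shifting, the result is the same on little-endian and big-endian hosts.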
x86-series CPUs use little-endian byte order.
Network byte order is the data representation format defined by TCP/IP. It is independent of the CPU type, operating system, and so on, so that data is interpreted correctly when it is transmitted between different hosts. Network byte order is big-endian.
To perform the conversion, BSD sockets provide the following four functions:
htons converts an unsigned short from host byte order to network byte order
htonl converts an unsigned long from host byte order to network byte order
ntohs converts an unsigned short from network byte order to host byte order
ntohl converts an unsigned long from network byte order to host byte order
On little-endian systems, these functions swap the byte order.
On big-endian systems, these functions are defined as no-op macros.
When developing network programs or cross-platform code, make sure both sides use the same byte order; if the two sides interpret the bytes differently, bugs will result.
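As a rough illustration of what these functions guarantee, the sketch below converts a 32-bit value with htonl and copies out the bytes as they sit in memory. It assumes a POSIX system, where the functions are declared in <arpa/inet.h>; the helper name wire_bytes32 is hypothetical.

```c
#include <arpa/inet.h>  /* htonl, ntohl, htons, ntohs (POSIX) */
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy the in-memory bytes of htonl(host_value)
   into out[0..3]. */
void wire_bytes32(uint32_t host_value, unsigned char out[4])
{
    uint32_t net = htonl(host_value);  /* now in network (big-endian) order */
    memcpy(out, &net, sizeof net);
}
```

For 0x01020304 the bytes come out as 01 02 03 04 on any host: a little-endian machine swaps them, a big-endian machine leaves them alone. Applying ntohl to the result of htonl returns the original value.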
Function reference: htonl()
Briefly:
Converts an unsigned long integer from host byte order to network byte order.
#include <winsock.h>
u_long PASCAL FAR htonl(u_long hostlong);
hostlong: a 32-bit number in host byte order.
Comments:
This function converts a 32-bit number from host byte order to network byte order.
Return value:
htonl() returns the value in network byte order.
inet_ntoa()
Briefly:
Converts a network address to a string in dotted ("." separated) format.
#include <winsock.h>
char FAR * PASCAL FAR inet_ntoa(struct in_addr in);
in: a structure that represents an Internet host address.
Comments:
This function converts the Internet address structure given by the in parameter to a "." separated string such as a.b.c.d. Note that the string returned by inet_ntoa() is stored in memory allocated by the Windows Sockets implementation. The application should not make any assumptions about how that memory is allocated. The data is guaranteed to remain valid only until the next Windows Sockets call on the same thread.
Return value:
If no error occurs, inet_ntoa() returns a character pointer. Otherwise, it returns NULL. The data should be copied before the next Windows Sockets call is made.
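The advice to copy the result can be sketched as follows. This example assumes the POSIX headers rather than <winsock.h>, and the helper name format_ipv4 is hypothetical; it takes the address in host byte order and copies the dotted string into a buffer owned by the caller.

```c
#include <arpa/inet.h>   /* inet_ntoa, htonl */
#include <netinet/in.h>  /* struct in_addr */
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: format an IPv4 address given in host byte order
   into dst. Returns 0 on success, -1 if dst is too small or the
   conversion fails. */
int format_ipv4(uint32_t host_order_addr, char *dst, size_t dst_size)
{
    struct in_addr in;
    const char *s;
    int n;

    in.s_addr = htonl(host_order_addr);  /* s_addr is in network byte order */
    s = inet_ntoa(in);                   /* points to storage owned by the library */
    if (s == NULL)
        return -1;
    n = snprintf(dst, dst_size, "%s", s);  /* copy before any further socket call */
    return (n >= 0 && (size_t)n < dst_size) ? 0 : -1;
}
```

With a 16-byte buffer, format_ipv4(0xC0A80001, buf, sizeof buf) leaves "192.168.0.1" in buf. The copy belongs to the caller, so later socket calls cannot overwrite it.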
On some machines the byte order of data transmitted over the network matches the local byte storage order, while on others it is completely different. To keep the data consistent, local data must be converted to the format used on the network before it is sent; the same applies when receiving, where the data must be converted before it is used. The basic library provides byte-conversion functions for this purpose: htons(), htonl(), ntohs() and ntohl(). Here n stands for network and h for host. htons() and htonl() convert from the local format to the network format; s stands for short, a 2-byte operation, and l stands for long, a 4-byte operation. Similarly, ntohs() and ntohl() convert from the network format to the local format.
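The convert-before-sending, convert-after-receiving rule can be sketched with a small message header. The 6-byte layout below (a 16-bit type followed by a 32-bit length) and the names HEADER_SIZE, pack_header and unpack_header are hypothetical and used only for illustration; the example assumes the POSIX header <arpa/inet.h>.

```c
#include <arpa/inet.h>  /* htons, htonl, ntohs, ntohl */
#include <stdint.h>
#include <string.h>

/* Hypothetical header layout: 16-bit type, then 32-bit length. */
#define HEADER_SIZE 6

/* Sender side: convert each field to network order, then copy it out. */
void pack_header(unsigned char buf[HEADER_SIZE], uint16_t type, uint32_t length)
{
    uint16_t net_type = htons(type);
    uint32_t net_length = htonl(length);

    memcpy(buf, &net_type, 2);
    memcpy(buf + 2, &net_length, 4);
}

/* Receiver side: copy each field in, then convert it to host order. */
void unpack_header(const unsigned char buf[HEADER_SIZE],
                   uint16_t *type, uint32_t *length)
{
    uint16_t net_type;
    uint32_t net_length;

    memcpy(&net_type, buf, 2);
    memcpy(&net_length, buf + 2, 4);
    *type = ntohs(net_type);
    *length = ntohl(net_length);
}
```

The fields are moved with memcpy rather than pointer casts, which avoids alignment problems when the buffer comes straight from a socket read.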
Note:
I. Definition of byte order
Byte order, as the name implies, is the order of bytes. More precisely, it is the order in which the bytes of a value larger than one byte are stored in memory (for single-byte data there is of course no order to discuss).
In practice, most developers rarely deal with byte order directly. It only has to be considered in cross-platform and network programs.
Every article that introduces byte order divides it into two categories: big-endian and little-endian. The standard definitions are as follows:
a) Little-endian places the low-order byte at the low-address end of memory and the high-order byte at the high-address end.
b) Big-endian places the high-order byte at the low-address end of memory and the low-order byte at the high-address end.
c) Network byte order: the bits of a 4-byte (32-bit) value are transmitted in the following order: bits 0~7 first, then bits 8~15, then bits 16~23, and finally bits 24~31 (in this numbering, bit 0 is the most significant bit). This transmission order is called big-endian byte order. Because all binary integers in TCP/IP headers must be transmitted across the network in this order, it is also called network byte order. For example, the 2-byte "frame type" field in the Ethernet header indicates the type of data that follows. For the frame type of an ARP request or reply, the bytes are transmitted on the network in the order 0x08, 0x06. The image in memory is shown in the following illustration:
Bottom of stack (high address)
---------------
0x06 -- low byte
0x08 -- high byte
---------------
Top of stack (low address)
The value of this field is 0x0806, stored in memory in big-endian order.
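Reading such a field back from received bytes can be sketched as below. The helper name read_be16 is hypothetical; it combines two bytes that arrived in network order into a 16-bit value using shifts, so it behaves the same on any host.

```c
#include <stdint.h>

/* Hypothetical helper: combine two bytes received in network
   (big-endian) order into a 16-bit value. p[0] is the high byte. */
uint16_t read_be16(const unsigned char *p)
{
    return (uint16_t)(((unsigned)p[0] << 8) | p[1]);
}
```

For the ARP frame type, the bytes 0x08, 0x06 yield 0x0806.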
II. High/low addresses and high/low bytes
First we need to know how a C program is laid out in memory. "Expert C Programming" and "Advanced Programming in the UNIX Environment" both describe the memory layout, roughly as in the following figure:
----------------------- Maximum memory address 0xFFFFFFFF
 Bottom of stack
        .
        .   Stack
        .
 Top of stack
-----------------------
        |
        |
       \|/
   (unused space)
       /|\
        |
        |
-----------------------
 Heap
-----------------------
 Uninitialized data
----------------------- (collectively, the data segment)
 Initialized data
-----------------------
 Text segment (code)
----------------------- Minimum memory address 0x00000000
For example, if we allocate unsigned char buf[4] on the stack, how is this array laid out on the stack [note 1]? See the figure below:
Bottom of stack (high address)
----------
buf[3]
buf[2]
buf[1]
buf[0]
----------
Top of stack (low address)
Now that high and low addresses are clear, let's work out high and low bytes. Given a 32-bit unsigned integer 0x12345678 (think of the 4 bytes of buf above as one integer), which bytes are high and which are low? It is actually very simple. In decimal we say the left side is the high-order end and the right side is the low-order end, and the same holds in other bases. For 0x12345678, the bytes from high to low are 0x12, 0x34, 0x56 and 0x78.
With high/low addresses and high/low bytes both clear, let's review the definitions of big-endian and little-endian and illustrate the two byte orders graphically.
Take unsigned int value = 0x12345678 as an example. To see how it is stored in each byte order, we can use unsigned char buf[4] to represent value:
Big-endian: the low address stores the high-order byte, as shown below:
Bottom of stack (high address)
---------------
buf[3] (0x78) -- low byte
buf[2] (0x56)
buf[1] (0x34)
buf[0] (0x12) -- high byte
---------------
Top of stack (low address)
Little-endian: the low address stores the low-order byte, as shown below:
Bottom of stack (high address)
---------------
buf[3] (0x12) -- high byte
buf[2] (0x34)
buf[1] (0x56)
buf[0] (0x78) -- low byte
---------------
Top of stack (low address)
Among existing platforms, Intel's x86 is little-endian, while Sun's SPARC is big-endian.
III. Examples
Embedded system developers should have a good understanding of the little-endian and big-endian modes. A CPU in little-endian mode stores an operand from the low-order byte to the high-order byte as addresses increase, whereas a CPU in big-endian mode stores it from the high-order byte to the low-order byte.
For example, the 16-bit number 0x1234 is stored in the memory of a little-endian CPU (assuming it starts at address 0x4000) as:
Memory address  Content
0x4001          0x12
0x4000          0x34
In the memory of a big-endian CPU it is stored as:
Memory address  Content
0x4001          0x34
0x4000          0x12
The 32-bit number 0x12345678 is stored in the memory of a little-endian CPU (assuming it starts at address 0x4000) as:
Memory address  Content
0x4003          0x12
0x4002          0x34
0x4001          0x56
0x4000          0x78
In the memory of a big-endian CPU it is stored as:
Memory address  Content
0x4003          0x78
0x4002          0x56
0x4001          0x34
0x4000          0x12
IV. Byte order of common platforms
Byte order depends on both the CPU and the operating system running on it; see the table below.
Processor           OS        Byte order
Alpha               All       Little-endian
HP-PA               NT        Little-endian
HP-PA               UNIX      Big-endian
Intel x86           All       Little-endian   <----- x86 systems are little-endian
Motorola 680x0      All       Big-endian
MIPS                NT        Little-endian
MIPS                UNIX      Big-endian
PowerPC             NT        Little-endian
PowerPC             Non-NT    Big-endian      <----- PPC systems are big-endian
RS/6000             UNIX      Big-endian
SPARC               UNIX      Big-endian
IXP1200 ARM core    All       Little-endian
V. Code examples
Let's look at the code below and see what happens.
This is complete C code run on an HP-UX 9000/800, that is, in big-endian mode.
#include <stdio.h>

int main(void)
{
    int i = 0x41424344;
    char *paddress = (char *)&i;   /* walk the int one byte at a time */
    int j;

    /* Addresses are cast to unsigned long so they print as plain hex;
       %p is the portable alternative. */
    printf("int address:%lx value:%x\n", (unsigned long)&i, (unsigned)i);
    printf("-------------------------------\n");
    for (j = 0; j <= 3; j++)
    {
        printf("char address:%lx value:%c\n", (unsigned long)paddress, *paddress);
        paddress++;
    }
    return 0;
}
Output when compiled with cc -g ...:
int address:7f7f08f0 value:41424344
-------------------------------
char address:7f7f08f0 value:A
char address:7f7f08f1 value:B
char address:7f7f08f2 value:C
char address:7f7f08f3 value:D
Now let's go back to Windows XP and look at the output of the same code in little-endian mode.
#include <stdio.h>

int main(void)
{
    int i = 0x41424344;
    char *paddress = (char *)&i;   /* walk the int one byte at a time */
    int j;

    /* Addresses are cast to unsigned long so they print as plain hex;
       %p is the portable alternative. */
    printf("int address:%lx value:%x\n", (unsigned long)&i, (unsigned)i);
    printf("-------------------------------\n");
    for (j = 0; j <= 3; j++)
    {
        printf("char address:%lx value:%c\n", (unsigned long)paddress, *paddress);
        paddress++;
    }
    return 0;
}
Output when compiled with VC 6.0:
int address:12ff7c value:41424344
-------------------------------
char address:12ff7c value:D
char address:12ff7d value:C
char address:12ff7e value:B
char address:12ff7f value:A
After reading the code above, it should be quite clear what byte order is. It could hardly be simpler. Take int i=0x41424344;
The value is written in hexadecimal. The ASCII code of 'A' is 65, which is 0x41 in hexadecimal, so this example verifies the byte order by printing
A, B, C and D. The memory contents are listed below for a deeper understanding.
The big-endian memory layout is as follows:
Address:  0x7f7f08f0  0x7f7f08f1  0x7f7f08f2  0x7f7f08f3
Content:  0x41        0x42        0x43        0x44
The little-endian memory layout is as follows:
Address:  0x0012ff7c  0x0012ff7d  0x0012ff7e  0x0012ff7f
Content:  0x44        0x43        0x42        0x41
VI. Using a function to determine whether the system is big-endian or little-endian
#include <windows.h>   /* BOOL, TRUE, FALSE */

/* Returns TRUE if the host byte order is big-endian,
   FALSE if it is little-endian. */
BOOL Isbig_endian()
{
    unsigned short test = 0x1122;
    /* Look at the byte stored at the lowest address of test. */
    if (*((unsigned char *)&test) == 0x11)
        return TRUE;
    else
        return FALSE;
} /* Isbig_endian() */
VII. Final notes
Host byte order (host)
Little-endian [Intel, VAX and Unisys processors, among others]
Network byte order (network)
Big-endian [IBM 370, Motorola and most RISC designs: IBM mainframes and most UNIX platforms]
Byte-order conversion is mostly needed in network programming or when porting code.
Related functions in the UNIX environment (the header file must be included: #include <netinet/in.h>):
htons() -- "host to network short"
htonl() -- "host to network long"
ntohs() -- "network to host short"
ntohl() -- "network to host long"
Related functions in Windows .NET:
HostToNetworkOrder
NetworkToHostOrder
Addendum: ACE uses ACE_InputCDR and ACE_OutputCDR to handle byte-order and byte-alignment issues.