Memory is addressed in units of one byte. When a logical value is larger than a byte, it has to be split across several physical bytes, and the question arises of which byte goes first. This is the endianness problem, and the two storage conventions are called big-endian and little-endian.
Byte order is either big-endian or little-endian, defined as follows:
Big-endian: the lowest address holds the most significant byte
Little-endian: the lowest address holds the least significant byte
Among today's mainstream CPUs, the Intel series stores data in little-endian format, the Motorola series uses big-endian, and ARM supports both. In network programming, TCP/IP always transmits data in big-endian order, which is why big-endian is also called network byte order.
Note in particular that the byte order of data stored by a program written in C/C++ depends on the CPU of the platform it is built for, while programs written in Java always use big-endian order. I will only discuss C/C++ here.
1. How big-endian and little-endian store data, and how to tell them apart
For example, my machine runs 32-bit Windows on an AMD processor. Take the int value 0x12345678, written in hexadecimal for convenience. CPUs with different byte orders store this number in a different order:
0x12345678 in hexadecimal; every two digits form one byte
Most significant byte --> least significant byte: 12 34 56 78
Address:        low --> high
Big-endian:     12 34 56 78
Little-endian:  78 56 34 12
The following program checks which byte order the local CPU uses:
#include <iostream>
using namespace std;

typedef unsigned int UINT;
typedef unsigned char UCHAR;

int main()
{
    UINT i = 0x12345678;
    cout << hex << i << endl;
    UCHAR *p = (UCHAR *)&i;
    // p points to the first byte of i in memory:
    // 0x12 on a big-endian machine, 0x78 on a little-endian machine
    if ((*p == 0x78) && (*(p + 1) == 0x56))
        cout << "little-endian" << endl;
    else if ((*p == 0x12) && (*(p + 1) == 0x34))
        cout << "big-endian" << endl;
    else
        cout << "What byte order is this?" << endl;
    return 0;
}
Running it prints little-endian, so the machine I am using stores bytes in little-endian order.
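The same check can also be written without casting the pointer, by first copying the bytes of the value into an array. This is only a minimal sketch, assuming a 4-byte unsigned int as on the machine described above; the names HostOrder and detectHostOrder are made up for this illustration.

```cpp
#include <cstring>  // std::memcpy

// Hypothetical names, not part of the original program.
enum class HostOrder { Little, Big, Unknown };

HostOrder detectHostOrder() {
    static_assert(sizeof(unsigned int) == 4,
                  "this sketch assumes a 4-byte unsigned int");
    const unsigned int value = 0x12345678;
    unsigned char bytes[sizeof value];
    // After the copy, bytes[0] is the byte stored at the lowest address.
    std::memcpy(bytes, &value, sizeof value);
    if (bytes[0] == 0x78 && bytes[1] == 0x56) return HostOrder::Little;
    if (bytes[0] == 0x12 && bytes[1] == 0x34) return HostOrder::Big;
    return HostOrder::Unknown;
}
```

Calling detectHostOrder() on an Intel or AMD machine should return HostOrder::Little.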
2. Converting between big-endian and little-endian
When two hosts with different byte orders communicate, the data must be converted to network byte order (that is, big-endian) before it is sent. Likewise, a program written in C/C++ on a little-endian machine has to convert between little-endian and big-endian when it exchanges data with a Java program.
Conversion here means changing the order of the bytes so that both sides interpret the exchanged data consistently. Take the hexadecimal number 0x12345678 again. On a little-endian machine its bytes sit in memory in the order 78 56 34 12. If those bytes are transmitted as they are, a big-endian receiver reads the value 0x78563412, which differs from the original. For the receiver to see the original value 0x12345678, the bytes must be rearranged before transmission so that they sit in memory in the order 12 34 56 78, that is, converted to big-endian.
Value to transfer:              12 34 56 78
Little-endian, not converted:   78 56 34 12
Converted to big-endian:        12 34 56 78
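As an illustration of the byte sequences listed above, here is a sketch that produces the big-endian sequence by shifting the value itself, which gives the same bytes on any host. It assumes a 32-bit value, and the function names storeBigEndian and loadBigEndian are hypothetical, chosen only for this example.

```cpp
#include <cstdint>

// Write a 32-bit value into out[0..3], most significant byte first.
void storeBigEndian(std::uint32_t value, unsigned char out[4]) {
    out[0] = static_cast<unsigned char>(value >> 24);
    out[1] = static_cast<unsigned char>(value >> 16);
    out[2] = static_cast<unsigned char>(value >> 8);
    out[3] = static_cast<unsigned char>(value);
}

// Rebuild the value from four bytes received in big-endian order.
std::uint32_t loadBigEndian(const unsigned char in[4]) {
    return (static_cast<std::uint32_t>(in[0]) << 24) |
           (static_cast<std::uint32_t>(in[1]) << 16) |
           (static_cast<std::uint32_t>(in[2]) << 8)  |
            static_cast<std::uint32_t>(in[3]);
}
```

Storing 0x12345678 this way fills the buffer with 12 34 56 78, and loading the buffer gives the value back.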
Given the big-endian and little-endian byte layouts above, the conversion is easy to do with shift operations. The code for converting from little-endian to big-endian is as follows:
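#include <iostream>
using namespace std;

typedef unsigned int UINT;
typedef unsigned char UCHAR;

int main()
{
    UINT i = 0x12345678;
    cout << hex << i << endl;
    UCHAR *p = (UCHAR *)&i;
    UINT num, num1, num2, num3, num4;
    // Move the byte at the lowest address into the most significant position, and so on
    num1 = (UINT)(*p) << 24;
    num2 = ((UINT)*(p + 1)) << 16;
    num3 = ((UINT)*(p + 2)) << 8;
    num4 = ((UINT)*(p + 3));
    num = num1 + num2 + num3 + num4;
    cout << "num1:" << hex << num1 << endl; // hexadecimal value of num1, likewise below
    cout << "num2:" << hex << num2 << endl;
    cout << "num3:" << hex << num3 << endl;
    cout << "num4:" << hex << num4 << endl;
    cout << "num:" << hex << num << endl;
    // Check the byte order in which the converted value sits in memory
    unsigned char *q = (unsigned char *)&num;
    if ((*q == 0x78) && (*(q + 1) == 0x56))
        cout << "little-endian" << endl;
    else if ((*q == 0x12) && (*(q + 1) == 0x34))
        cout << "big-endian" << endl;
    else
        cout << "What byte order is this?" << endl;
    return 0;
}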
As for why (UINT)(*p) is shifted by 24 bits, this is easy to see: turning a byte value such as 0x00000012 into 0x12000000 takes exactly a left shift of 24 bits.
Of course, it is more convenient to wrap the conversion above in a function, as follows:
UINT EndianConvertLToB(UINT InputNum) {
    UCHAR *p = (UCHAR *)&InputNum;
    return (((UINT)*p << 24) + ((UINT)*(p + 1) << 16) +
            ((UINT)*(p + 2) << 8) + (UINT)*(p + 3));
}
Converting from big-endian to little-endian works on the same principle; only the shift amounts differ. The function is as follows:
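UINT EndianConvertBToL(UINT InputNum) {
    UCHAR *p = (UCHAR *)&InputNum;
    // Each shift needs its own parentheses: + binds tighter than <<,
    // so an unparenthesized final term would shift the whole sum by 24
    return (((UINT)*p) + ((UINT)*(p + 1) << 8) +
            ((UINT)*(p + 2) << 16) + ((UINT)*(p + 3) << 24));
}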
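Both functions above are meant to reverse the four bytes when run on the host they were written for, so a single swap that operates on the value rather than on its memory layout can cover both directions. A minimal sketch, assuming a 32-bit value; swapBytes32 is a name chosen for this example and does not appear in the original code.

```cpp
#include <cstdint>

// Reverse the order of the four bytes of a 32-bit value.
// Works on the value itself, so the result is the same on any host.
std::uint32_t swapBytes32(std::uint32_t v) {
    return ((v & 0x000000FFu) << 24) |
           ((v & 0x0000FF00u) << 8)  |
           ((v & 0x00FF0000u) >> 8)  |
           ((v & 0xFF000000u) >> 24);
}
```

swapBytes32(0x12345678) gives 0x78563412, and applying the function twice returns the original value.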