Reference articles: http://www.cnblogs.com/kubixuesheng/p/4107060.html
http://blog.csdn.net/hackbuteer1/article/details/7722667
The linked articles by the original authors are very well written, so there is no need to redo the analysis; reading them is enough. Courses such as Computer Organization, Microcomputer Principles, or Assembly Language usually cover this topic too, but not in as much detail. The notes in red are my annotations.
--------------------------------------------------------------------------------------------------------------- -
First, where the endianness problem comes from. (The author appears to assume a 32-bit machine throughout.)
Computer architectures differ in how they store bytes, words, and other units, which raises an important question in communication: in what order should the information units (bits, bytes, words, double words, and so on) exchanged by the two parties be transmitted? Without an agreed rule, the two sides cannot encode and decode correctly, and communication fails. Two byte storage schemes are in common use across systems: big-endian and little-endian. We start with byte order.
I. What is byte order
Byte order, as the name implies, is the order of bytes; more precisely, it is the order in which data types longer than one byte are stored in memory (for a single byte there is no order to speak of). In practice, most developers rarely deal with byte order directly. It mainly needs to be considered in cross-platform and network programs.
Articles on byte order generally describe two categories, big-endian and little-endian, with the following standard definitions:
a) Little-endian: the low-order byte is stored at the lower memory address and the high-order byte at the higher memory address.
b) Big-endian: the high-order byte is stored at the lower memory address and the low-order byte at the higher memory address. (Understanding one of the two is enough; the other is simply the reverse.)
c) Network byte order: the TCP/IP protocols define their byte order as big-endian, so the byte order used in TCP/IP is usually called network byte order.
1.1 What are the high and low address ends
First we need to know the memory layout of a C program image. Expert C Programming and Advanced Programming in the UNIX Environment both describe this layout, roughly as follows:
-----------------------Maximum memory address 0xFFFFFFFF (32 binary ones, written 0xFFFFFFFF in hexadecimal)
Bottom of Stack
Stack
Top of Stack
-----------------------(contrary to how we usually draw it, the stack grows downward in memory; reading this diagram from top to bottom, the addresses go from high to low)
Hole (unused)
-----------------------
Heap
-----------------------
Uninitialized data
-----------------------(uninitialized and initialized data are collectively called the data segment)
Initialized data
-----------------------
Text segment (code)
-----------------------Minimum memory address 0x00000000
As the layout shows, the stack grows downward and the heap grows upward. For example, if we allocate an unsigned char buf[4] on the stack, how is this array laid out? See:
bottom of stack (high address)
----------
buf[3]
buf[2]
buf[1]
buf[0]
----------
Top of stack (low address)
We can verify this ourselves by creating an array and printing the address of each element.
1.2 What are the high and low bytes
Now that the high and low addresses are clear, consider the high and low bytes. Some articles call the low byte the least significant byte and the high byte the most significant byte. Given the 32-bit unsigned integer 0x12345678, which part is high and which is low? It is simple: in decimal the leftmost digits are the high-order ones and the rightmost the low-order ones, and the same holds in other bases. For 0x12345678, the bytes from high to low are 0x12, 0x34, 0x56, and 0x78.
With high/low addresses and high/low bytes both clear, let's revisit the definitions of big-endian and little-endian and illustrate the two byte orders:
Take unsigned int value = 0x12345678 as an example and look at how it is stored under each byte order, using unsigned char buf[4] to represent the bytes of value:
Big-endian: the low address holds the high-order byte:
Bottom of stack (high address)
---------------
buf[3] (0x78) -- low byte
buf[2] (0x56)
buf[1] (0x34)
buf[0] (0x12) -- high byte
---------------
Top of stack (low address)
Little-endian: the low address holds the low-order byte:
Bottom of stack (high address)
---------------
buf[3] (0x12) -- high byte
buf[2] (0x34)
buf[1] (0x56)
buf[0] (0x78) -- low byte
--------------
Top of stack (low address)
II. The two byte orders
2.1 Big-endian
A term in computer architecture describing a multi-byte storage order in which the most significant byte (MSB) is stored at the lowest address. Processors using this scheme include the IBM3700 series, the PDP-10, the Motorola microprocessor family, and many RISC processors.
MSB: the most significant byte
+----------+
| 0x34 |<--0x00000021
+----------+
| 0x12 |<--0x00000020
+----------+
Figure 1: Double-byte number 0x1234 in Big-endian mode with start address 0x00000020
In big-endian bit numbering, bit 0 is the most significant bit. Taking the two-byte value 0x8B8A = (1000 1011 1000 1010) in binary as an example:
Bit   0  1  2  3  4  5  6  7   8  9 10 11 12 13 14 15
    +-------------------------------------------------+
Val | 1  0  0  0  1  0  1  1 | 1  0  0  0  1  0  1  0 |
    +-------------------------------------------------+
Figure 2: Big-endian bit numbering
2.2 Little-endian
A term in computer architecture describing a multi-byte storage order in which the least significant byte (LSB) is stored at the lowest address. Processors using this scheme include the PDP-11, the VAX, Intel x86 microprocessors, and some network communication devices. Besides multi-byte storage order, the term is also often used to describe the order of the bits within a byte.
+----------+
| 0x12 |<--0x00000021
+----------+
| 0x34 |<--0x00000020
+----------+
Figure 3: Double-byte number 0x1234 in Little-endian mode with start address 0x00000020
In little-endian bit numbering, the numbers run the opposite way from big-endian: bit 0 is the least significant bit. Using the same two-byte value 0x8B8A = (1000 1011 1000 1010) in binary:
Bit  15 14 13 12 11 10  9  8   7  6  5  4  3  2  1  0
    +-------------------------------------------------+
Val | 1  0  0  0  1  0  1  1 | 1  0  0  0  1  0  1  0 |
    +-------------------------------------------------+
Figure 4: Little-endian bit numbering
Note: what we usually call host byte order follows the little-endian rule on common machines such as x86 PCs. Therefore, when two hosts communicate over TCP/IP, the corresponding conversion functions must be called to convert between host order (little-endian here) and network order (big-endian).
A CPU in little-endian mode stores operands from the low-order byte to the high-order byte, while a CPU in big-endian mode stores them from the high-order byte to the low-order byte. In the memory of a little-endian CPU, the 32-bit number 0x12345678 (assuming storage starts at address 0x4000) is laid out as:
Memory address    0x4000  0x4001  0x4002  0x4003
Stored content    0x78    0x56    0x34    0x12
In the memory of a big-endian CPU, the same number is laid out as:
Memory address    0x4000  0x4001  0x4002  0x4003
Stored content    0x12    0x34    0x56    0x78
III. Advantages and disadvantages of big-endian and little-endian
Big-endian Advantages:
First, you can always tell whether a number is positive or negative by looking at the byte at offset 0, because the low address holds the high-order byte. You do not need to know how long the number is, nor skip over any bytes to find the one containing the sign bit. The value is also stored in the order in which it is printed, so binary-to-decimal routines are particularly efficient. Machines designed for different requirements therefore choose different storage orders.
Little-endian Advantages:
The assembly instructions for fetching a one-, two-, four-byte, or longer value proceed in the same way for every format: first fetch the lowest-order byte at offset 0. Because the address offset and the byte number correspond one-to-one (offset 0 is byte 0), multiple-precision math routines are relatively easy to write.
When a value grows, digits may need to be added on the left (a larger number in non-exponential notation has more digits). Adding two numbers can therefore require shifting every byte of a big-endian number in memory to the right, which adds work for the computer. In a little-endian number, the least significant bytes stay where they are and the new digits are placed to the right, at a higher address. This means that some operations can be simpler and faster.
This took effort to write; if you repost it, please credit the source. Thank you.
Big-endian and little-endian modes in computer storage explained