Floating Point Number Encoding
(1) Floating point numbers
Data whose decimal point position can move is called a floating point number. It can be expressed as: n = M * R^E
where:
M - the mantissa;
R - the radix (base) of the exponent, a constant fixed by convention, generally 2, 8, or 16; on most machines R = 2;
E - the exponent (order code).
Once the radix is fixed by convention, only the mantissa and the exponent need to be encoded for a floating point number; in the machine, a floating point number is stored as an exponent field followed by a mantissa field.
The mantissa M is represented as a fixed-point fraction, and the exponent E as a fixed-point integer. Multiplying M by R^E moves the decimal point: as the value of the R^E factor changes, the position of the decimal point changes with it. This is why numbers in this representation are called floating point numbers.
(2) Floating point number encoding
The exponent E is generally represented in shifted (excess) code or two's complement; the mantissa is represented in sign-magnitude or two's complement.
Machine zero: when the mantissa M of a floating point number is 0, the value is treated as zero regardless of the exponent. This is called machine zero.
Overflow: the absolute value of the floating point number is too large for the machine to represent, i.e. its exponent is greater than the largest exponent the machine can represent.
Underflow: the absolute value of the floating point number is too small, i.e. its exponent is smaller than the smallest exponent the machine can represent. When a floating point number underflows, the mantissa is usually set to zero and the value is treated as zero.
(3) Normalized floating point numbers
To simplify computation and comparison between floating point numbers and to improve their precision, the mantissa of a floating point number in the computer is required to satisfy 1/R <= |M| < 1, that is, the first digit after the decimal point must be a significant (nonzero) digit. When the mantissa is in two's complement and R = 2, the normalized form is generally: the highest mantissa bit is opposite to the sign bit (0.1xx...x for positive numbers, 1.0xx...x for negative numbers).
In other words, a number is normalized when its highest mantissa bit differs from its sign bit. For M < 0, however, two special cases must be considered:
* M = -1/2. By the rule above this number would count as normalized, but [-0.5]complement = 1.10...0, which contradicts the general pattern. To simplify the hardware test, it is stipulated that -1/2 is not a normalized number (in two's complement).
* M = -1. A fractional two's complement can represent -1, and [-1]complement = 1.00...0, so -1 is treated as a normalized number (in two's complement).
(4) The IEEE 754 standard
In modern computers, floating point numbers generally follow the international standard set by the IEEE, in the following formats:

Format                            Sign  Exponent E  Mantissa  Total bits
Short real (single precision)       1       8          23         32
Long real (double precision)        1      11          52         64
Temporary real (extended)           1      15          64         80
In the IEEE 754 standard, the sign bit is likewise "0" for a positive number and "1" for a negative number. The exponent is again represented in shifted (biased) code, and the mantissa is again normalized, but in the form 1.FF---F; in the stored representation the integer 1 is omitted as a hidden bit (the temporary real format does not use the hidden-bit scheme). Because of this change in the mantissa's form, the exponent code also differs from the ordinary shifted code: for short reals, [x]shift = 2^7 + x - 1 = 127 + x, that is, this shifted code is 1 smaller than the usual one. For example, [6] shifts to 133 rather than 134. The biases of the short real, long real, and temporary real formats are therefore 7FH, 3FFH, and 3FFFH respectively. The value of a single-precision number is: (-1)^S * 1.FF---F * 2^(E-127).
Note: there are multiple ways to encode floating point numbers. In practice you must first identify which encoding is in use and be aware of the differences between encodings to avoid errors.
4. Text Encoding
(1) The most common encoding system for Western characters today is ASCII (the American Standard Code for Information Interchange).
ASCII code features:
* Each character is represented by a 7-bit binary code. In the computer, each character actually occupies eight bits; the highest bit is "0" or is used as a parity bit.
* There are 128 symbols in total: 95 printable characters (including the space) and the rest control characters.
* The digits 0-9 are encoded as 011 0000 through 011 1001, so the low four bits are exactly the digit's value in binary, which preserves the normal ordering. The correspondence between uppercase and lowercase letters is also simple: uppercase letters have high two bits 10 and low five bits 00001-11010 (1-26 in binary), while lowercase letters have high two bits 11 and the same low five bits.
(2) Chinese Encoding
Chinese character encoding is divided into three categories: input codes, machine inner codes, and font (glyph) codes.
Chinese character input codes mainly include numeric codes, pinyin codes, and shape-based codes. Each of these represents a Chinese character with letters and digits according to its own coding rules, so that Chinese characters can be entered on a standard Western keyboard.
The inner (machine) code of a Chinese character is used to store, exchange, and retrieve Chinese characters inside the machine; generally two or three bytes represent one character. To distinguish it from ASCII, the highest bit of each byte of the inner code is "1".
The Chinese character font (glyph) code encodes the character's glyph information; it is stored in the font library and used for Chinese character output.
(3) decimal number encoding
* Decimal digit strings: one byte stores one decimal digit or sign, and several consecutive bytes represent a complete decimal value. The digits themselves are usually represented by their ASCII codes. There are two ways to represent the sign:
# Separate-sign strings: the sign occupies one byte placed before the digits; the character "+" (2B)16 denotes a positive number and "-" (2D)16 a negative number.
# Embedded-sign strings: the sign is embedded in the lowest digit. The rule: for "-", (40)16 is added to the lowest digit's byte; "+" is implied and changes nothing.
These two representations are mainly used in non-numeric processing and are inconvenient for arithmetic.
* Packed decimal strings: one byte stores two decimal digits, and several consecutive bytes represent a complete decimal value. This format is widely used because it saves storage space and is convenient for processing.
In the packed format, each decimal digit is represented by the low four bits of its ASCII code, i.e. its BCD code. The sign is also represented by four bits, placed after the lowest digit: (C)16 = (1100)2 denotes positive, (D)16 = (1101)2 denotes negative.
In both string representations the length of the decimal data is variable, so the starting address and length of the string must be given.
A digression: converting a decimal fraction to binary.
Example: 0.25.
The conversion: 0.25 × 2 = 0.5, so the first digit after the binary point is 0;
0.5 × 2 = 1.0, so the second digit is 1. The remaining fraction is now 0, so the conversion ends. The result is 0.01.
Example: 0.65.
The conversion:
0.65 × 2 = 1.3, take 1;
0.3 × 2 = 0.6, take 0;
0.6 × 2 = 1.2, take 1;
0.2 × 2 = 0.4, take 0; ...
The result is 0.1010...
Okay, now a simple example (this was a headache before I understood the mechanism):
float f = 0.5, which should be 1 × 2^(-1).
See how f looks in memory: 0x3F000000,
in binary: 0011 1111 0000 0000 ...
Bit 31 is 0: the sign bit, meaning "+".
Bits 23-30 are the exponent, represented in shifted (biased) code: the exponent is -1, and -1 + 127 = 126 = 0111 1110. Good.
Bits 0-22 are the mantissa. How can it be all 0? It should be 1. Because IEEE 754 uses a hidden bit: adding the implicit leading 1 to the stored fraction gives the real mantissa.
... yes, there are a lot of rules.
Look at float f = 2.5 again:
0010.1 = 1.01 × 2^1.
Positive, so bit 31 is 0.
Exponent 1: 1 + 127 = 128 = 1000 0000.
So in memory 2.5 is 0100 0000 0010 0000 ... (0x40200000).
If float f = -2.5:
negative, so bit 31 is 1;
2.5 = 10.1 = 1.01 × 2^1;
1 + 127 = 128 = 1000 0000.
So the memory format of -2.5 is 1100 0000 0010 0000 ...,
hexadecimal 0xC0200000.
This article is from http://blog.csdn.net/cslie/article/details/2121355