Data Type Conversion: Sign Extension

Last Update:2013-12-08 Source: Internet

Author: User


======================= About sign extension ============================
I. Extending a shorter data type to a longer data type

1. The shorter data type being extended is signed

Sign extension: the sign bit of the shorter type is copied into all the high-order bits of the longer type (the bits that the longer type has beyond the shorter one), so that the value is unchanged after widening.

Example 1 (assuming char is a signed 8-bit type and short is 16 bits): char x = 10001001b; short y = x; then the value of y is 11111111 10001001b.

Example 2: char x = 00001001b; short y = x; then the value of y is 00000000 00001001b.

2. The shorter data type being extended is unsigned

Zero extension: the high-order bits of the longer type are filled with zeros.

Example 1: unsigned char x = 10001001b; short y = x; then the value of y is 00000000 10001001b.

Example 2: unsigned char x = 00001001b; short y = x; then the value of y is 00000000 00001001b.
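The two widening rules can be sketched in C as follows. This is only an illustration, assuming a platform that provides the fixed-width types int8_t, uint8_t and int16_t from <stdint.h>; the function names widen_signed, widen_unsigned and bits16 are hypothetical.

```c
#include <stdint.h>

/* Widen a signed 8-bit value. The compiler sign-extends, so the
   numeric value is preserved (for example, -119 stays -119). */
int16_t widen_signed(int8_t x)
{
    return x;
}

/* Widen an unsigned 8-bit value. The compiler zero-extends, so the
   high byte of the result is always 0. */
int16_t widen_unsigned(uint8_t x)
{
    return x;
}

/* View a 16-bit result as a raw bit pattern for inspection. */
uint16_t bits16(int16_t v)
{
    return (uint16_t)v;
}
```

With these helpers, bits16(widen_signed(-119)) gives 0xFF89 (11111111 10001001b), while bits16(widen_unsigned(0x89)) gives 0x0089 (00000000 10001001b), matching the examples above.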

II. Narrowing a longer data type to a shorter data type

If the high-order bytes of the longer type are all 1 bits or all 0 bits (and agree with the sign bit of the low-order part that is kept), the low-order bytes are simply kept and assigned to the shorter type. Otherwise the value does not fit in the shorter type and the conversion overflows: the truncated result no longer equals the original value.
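A minimal C sketch of this truncation rule, working on raw bit patterns. Unsigned types are used on purpose, because converting an out-of-range value to a signed type is implementation-defined in C. The names low_byte and round_trips are hypothetical helpers, not part of any library.

```c
#include <stdint.h>

/* Narrowing keeps only the low byte of the 16-bit pattern. */
uint8_t low_byte(uint16_t v)
{
    return (uint8_t)(v & 0xFFu);
}

/* Truncate to 8 bits, sign-extend the result back to 16 bits, and
   compare with the original. They match only if nothing was lost. */
int round_trips(uint16_t v)
{
    uint8_t lo = low_byte(v);
    uint16_t back = (lo & 0x80u) ? (uint16_t)(0xFF00u | lo) : lo;
    return back == v;
}
```

For example, 0xFF89 and 0x0009 survive the round trip, while 0x1234 is cut down to 0x34 and no longer represents the original value.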

III. Converting between signed and unsigned types of the same length

The bits in memory are handed to the target type unchanged. Only their interpretation changes, so the numeric value may change. When a shorter type is extended to a longer type and the two also differ in signedness, the value is first widened according to the extension rules above (sign extension or zero extension, depending on the source type), and the resulting bits are then reinterpreted in the target type according to this rule.
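Both cases can be sketched in C. The example assumes the fixed-width types from <stdint.h>; the function names are hypothetical. Converting a signed value to an unsigned type is well defined in C (the result is taken modulo 2^N), which is what keeps the bit pattern intact on two's complement machines.

```c
#include <stdint.h>

/* Same width: the bit pattern is kept and only its interpretation
   changes. */
uint16_t as_unsigned16(int16_t v)
{
    return (uint16_t)v;
}

/* Different width and different signedness: widen first (sign
   extension, because the source type is signed), then reinterpret
   the widened bits as unsigned. */
uint16_t signed8_to_unsigned16(int8_t v)
{
    int16_t widened = v;        /* step 1: sign extension */
    return (uint16_t)widened;   /* step 2: reinterpretation */
}
```

Here as_unsigned16(-1) is 65535, and signed8_to_unsigned16(-119) is 0xFF89 rather than 0x0089, because the extension step follows the signedness of the source type.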

Appendix: Conversions from signed types

From    To               Method
char    short            Sign-extend
char    long             Sign-extend
char    unsigned char    Keep the bit pattern; the high-order bit is no longer a sign bit and becomes a data bit
char    unsigned short   Sign-extend to short; convert short to unsigned short
char    unsigned long    Sign-extend to long; convert long to unsigned long
char    float            Sign-extend to long; convert long to float
char    double           Sign-extend to long; convert long to double
char    long double      Sign-extend to long; convert long to long double
short   char             Keep the low-order byte
short   long             Sign-extend
short   unsigned char    Keep the low-order byte
short   unsigned short   Keep the bit pattern; the high-order bit is no longer a sign bit and becomes a data bit
short   unsigned long    Sign-extend to long; convert long to unsigned long
short   float            Sign-extend to long; convert long to float
short   double           Sign-extend to long; convert long to double
short   long double      Sign-extend to long; convert long to double
long    char             Keep the low-order byte
long    short            Keep the low-order bytes
long    unsigned char    Keep the low-order byte
long    unsigned short   Keep the low-order bytes
long    unsigned long    Keep the bit pattern; the high-order bit is no longer a sign bit and becomes a data bit
long    float            Represent as a single-precision floating-point number; precision may be lost
long    double           Represent as a double-precision floating-point number; precision may be lost
long    long double      Represent as a double-precision floating-point number; precision may be lost
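The "precision may be lost" entry for long-to-float conversion can be illustrated with a small C sketch. It assumes that float is an IEEE 754 single-precision type with a 24-bit significand, which is common but not guaranteed by the C standard, and it uses int32_t to stand in for a 32-bit long. The function name survives_float is hypothetical.

```c
#include <stdint.h>

/* Returns 1 if v converts to float and back without changing.
   Only meaningful for values whose float form still fits in
   int32_t, so keep test values well inside the 32-bit range. */
int survives_float(int32_t v)
{
    /* volatile forces the value to be stored as a real float */
    volatile float f = (float)v;
    return (int32_t)f == v;
}
```

Under that assumption, 16777216 (2^24) survives the conversion, while 16777217 (2^24 + 1) does not, because it needs 25 significant bits.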

Conversions from unsigned types

From             To               Method
unsigned char    char             Keep the bit pattern; the high-order bit becomes the sign bit
unsigned char    short            Zero-extend
unsigned char    long             Zero-extend
unsigned char    unsigned short   Zero-extend
unsigned char    unsigned long    Zero-extend
unsigned char    float            Convert to long; convert long to float
unsigned char    double           Convert to long; convert long to double
unsigned char    long double      Convert to long; convert long to double
unsigned short   char             Keep the low-order byte
unsigned short   short            Keep the bit pattern; the high-order bit becomes the sign bit
unsigned short   long             Zero-extend
unsigned short   unsigned char    Keep the low-order byte
unsigned short   unsigned long    Zero-extend
unsigned short   float            Convert to long; convert long to float
unsigned short   double           Convert to long; convert long to double
unsigned short   long double      Convert to long; convert long to double
unsigned long    char             Keep the low-order byte
unsigned long    short            Keep the low-order bytes
unsigned long    long             Keep the bit pattern; the high-order bit becomes the sign bit
unsigned long    unsigned char    Keep the low-order byte
unsigned long    unsigned short   Keep the low-order bytes
unsigned long    float            Convert to long; convert long to float
unsigned long    double           Convert directly to double
unsigned long    long double      Convert to long; convert long to double

---------------------------------------------------------

Sign extension, zero extension, and contraction

Modern high-level programming languages let programmers write expressions that mix integer objects of different sizes. So what happens when the two operands of an expression have different sizes? Some languages report an error, while others automatically convert the operands to a common format. This conversion has a cost. If you do not want the compiler to insert conversions into otherwise tight code without your knowledge, you need to understand how the compiler handles these expressions.

In the two's complement system, the same negative number has a different representation at each size. You cannot simply use an 8-bit signed value in a 16-bit expression; a conversion is required. This conversion and its inverse (converting a 16-bit value to 8 bits) are the sign extension and sign contraction operations.

Take -64 as an example. Its 8-bit two's complement representation is $C0, while the equivalent 16-bit representation is $FFC0. The bit patterns are clearly different. Now consider +64. Its 8-bit and 16-bit representations are $40 and $0040. Clearly, negative numbers are extended differently from non-negative numbers.

Sign-extending a value from some number of bits to a larger number of bits is easy: copy the sign bit into all the new high-order bits. For example, to extend an 8-bit value to 16 bits, copy bit 7 of the 8-bit value into bits 8..15 of the 16-bit value. To extend a 16-bit value to a double word, copy bit 15 into bits 16..31 of the double word.
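The bit-copying description can be written out by hand in C. Compilers normally do this for you when you assign between signed types; the sketch below only makes the steps visible. It works on unsigned bit patterns, and the function names are hypothetical.

```c
#include <stdint.h>

/* Sign-extend an 8-bit pattern to 16 bits by copying bit 7 into
   bits 8..15. */
uint16_t sign_extend_8_to_16(uint8_t b)
{
    uint16_t high = (b & 0x80u) ? 0xFF00u : 0x0000u;
    return (uint16_t)(high | b);
}

/* Sign-extend a 16-bit pattern to 32 bits by copying bit 15 into
   bits 16..31. */
uint32_t sign_extend_16_to_32(uint16_t w)
{
    uint32_t high = (w & 0x8000u) ? 0xFFFF0000u : 0x00000000u;
    return high | w;
}
```

For instance, sign_extend_8_to_16(0xC0) gives 0xFFC0 and sign_extend_8_to_16(0x40) gives 0x0040, the two representations of -64 and +64 discussed earlier.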

Sign extension is required whenever signed values of different lengths are combined. For example, when adding a byte quantity to a word quantity, the byte must be sign-extended to 16 bits before the addition. Other operations may require sign extension to 32 bits.
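The following C sketch shows why the byte has to be sign-extended before the addition. It contrasts the correct widening with a deliberately wrong one that zero-extends a signed byte. Both function names are hypothetical.

```c
#include <stdint.h>

/* Correct: the signed byte is sign-extended when it is promoted,
   so -1 + 16 is 15. */
int16_t add_byte_to_word(int8_t b, int16_t w)
{
    return (int16_t)(b + w);
}

/* Deliberately wrong: the cast to uint8_t makes the byte
   zero-extend, so -1 is treated as 255 and 255 + 16 is 271. */
int16_t add_zero_extended(int8_t b, int16_t w)
{
    return (int16_t)((uint8_t)b + w);
}
```

The two functions agree for non-negative bytes and differ by 256 for negative ones, which is exactly the contribution of the missing sign bits.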

Table 2-5: Sign extension examples

8-bit   16-bit   32-bit        Binary (two's complement) representation
$80     $FF80    $FFFF_FF80    %1111_1111_1111_1111_1111_1111_1000_0000
$28     $0028    $0000_0028    %0000_0000_0000_0000_0000_0000_0010_1000
$9A     $FF9A    $FFFF_FF9A    %1111_1111_1111_1111_1111_1111_1001_1010
$7F     $007F    $0000_007F    %0000_0000_0000_0000_0000_0000_0111_1111
N/A     $1020    $0000_1020    %0000_0000_0000_0000_0001_0000_0010_0000
N/A     $8086    $FFFF_8086    %1111_1111_1111_1111_1000_0000_1000_0110

When working with unsigned binary numbers, use zero extension to convert a smaller unsigned value to a larger one. Zero extension is very simple: fill the high-order bytes of the larger operand with zeros. For example, to extend the 8-bit value $82 to 16 bits, put zero in the high-order byte to get $0082.

Table 2-6: Zero extension examples

8-bit   16-bit   32-bit        Binary representation
$80     $0080    $0000_0080    %0000_0000_0000_0000_0000_0000_1000_0000
$28     $0028    $0000_0028    %0000_0000_0000_0000_0000_0000_0010_1000
$9A     $009A    $0000_009A    %0000_0000_0000_0000_0000_0000_1001_1010
$7F     $007F    $0000_007F    %0000_0000_0000_0000_0000_0000_0111_1111
N/A     $1020    $0000_1020    %0000_0000_0000_0000_0001_0000_0010_0000
N/A     $8086    $0000_8086    %0000_0000_0000_0000_1000_0000_1000_0110

Most high-level language compilers handle sign extension and zero extension automatically. The following C example shows how this works:

signed char sbyte;   // a char in C is one byte
short int sword;     // a short int in C is usually 16 bits
long int sdword;     // a long int in C is at least 32 bits (often exactly 32)

...

sword = sbyte;       // automatically sign-extends the 8-bit value to 16 bits
sdword = sbyte;      // automatically sign-extends the 8-bit value to 32 bits
sdword = sword;      // automatically sign-extends the 16-bit value to 32 bits

Some languages (such as Ada) require an explicit cast to convert from a smaller data type to a larger one. Check the reference manual of your language to see whether this is required. The advantage of a language that requires explicit conversions is that the compiler never does anything behind the programmer's back. If you leave out a necessary conversion, the compiler issues a diagnostic message to tell you that the program still needs work.

One thing to be clear about is that sign extension and zero extension are not free. Assigning a smaller integer to a larger integer may require more machine instructions (and more execution time) than moving data between integer variables of the same size. So be careful when mixing variables of different sizes in an arithmetic expression or an assignment statement.

Sign contraction, converting a value with some number of bits to the same value with fewer bits, is more troublesome. Sign extension never fails: an m-bit signed value can always be converted to an n-bit value when n > m. Unfortunately, an n-bit value cannot always be converted to m bits when m < n. For example, the 16-bit hexadecimal representation of -448 is $FE40. The magnitude of this number is too large for 8 bits, so it cannot be sign-contracted to 8 bits.

To sign-contract a value correctly, you must examine the high-order bytes that will be discarded. First, these high-order bytes must all be zero or all be $FF. If they contain any other value, the number cannot be sign-contracted. Second, the high-order bit of the result must match every bit that was discarded. Here are some examples of converting 16-bit values to 8 bits:

$FF80 (%1111_1111_1000_0000) can be sign-contracted to $80 (%1000_0000).

$0040 (%0000_0000_0100_0000) can be sign-contracted to $40 (%0100_0000).

$FE40 (%1111_1110_0100_0000) cannot be sign-contracted to 8 bits.

$0100 (%0000_0001_0000_0000) cannot be sign-contracted to 8 bits.
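The two conditions above can be checked directly on the bit pattern. The following C function is a minimal sketch of such a check for the 16-bit to 8-bit case; the name can_contract_16_to_8 is hypothetical.

```c
#include <stdint.h>

/* Returns 1 if the 16-bit two's complement pattern w can be
   sign-contracted to 8 bits without changing its value. */
int can_contract_16_to_8(uint16_t w)
{
    uint8_t high = (uint8_t)(w >> 8);     /* byte to be discarded */
    uint8_t low  = (uint8_t)(w & 0xFFu);  /* byte to be kept */

    if (high == 0x00u)
        return (low & 0x80u) == 0;   /* result sign bit must be 0 */
    if (high == 0xFFu)
        return (low & 0x80u) != 0;   /* result sign bit must be 1 */
    return 0;                        /* mixed high bits: does not fit */
}
```

It accepts $FF80 and $0040 and rejects $FE40 and $0100, as in the examples above. It also rejects $0080: the high byte is all zero, but the sign bit of the remaining byte would not match the discarded bits.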

Sign contraction is awkward in high-level languages. C, for example, simply stores the low-order part of the expression in the smaller variable and discards the high-order part (at best, the C compiler may issue a warning at compile time about a possible loss of precision). You can take steps to silence the compiler, but it still does not check whether the value is valid. Below is typical C code for sign contraction:

signed char sbyte;   // a char in C is one byte
short int sword;     // a short int in C is usually 16 bits
long int sdword;     // a long int in C is at least 32 bits (often exactly 32)

...

sbyte = (signed char) sword;
sbyte = (signed char) sdword;
sword = (short int) sdword;

In C, the only safe solution is to compare the value of the expression against an upper and a lower bound before storing it in the smaller variable. Unfortunately, if you need to do this often, the code becomes clumsy. Here is the conversion code with these checks added:

if (sword >= -128 && sword <= 127)
{
    sbyte = (signed char) sword;
}
else
{
    // Report error
}

// Another approach, using assertions (requires <assert.h>):

assert(sword >= -128 && sword <= 127);
sbyte = (signed char) sword;

assert(sdword >= -32768 && sdword <= 32767);
sword = (short int) sdword;

This makes the code ugly. In C/C++, you may prefer to wrap these checks in macros (#define) or functions to keep the code readable.

Some high-level languages (such as Pascal and Delphi/Kylix) perform sign contraction automatically and check the result to make sure it fits in the destination operand. These languages raise some kind of exception (or stop the program) when a range violation occurs. Of course, if you want to handle the error yourself, you either need to write exception-handling code or use an if statement sequence like the one in the C example above.
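One possible way to package the range checks as functions is sketched below. The names narrow_to_schar and narrow_to_short are hypothetical, and reporting failure through a bool return value is just one design choice; the limits come from the standard header <limits.h>.

```c
#include <limits.h>
#include <stdbool.h>

/* Stores v in *out and returns true if v fits in a signed char.
   Otherwise leaves *out untouched and returns false. */
bool narrow_to_schar(long v, signed char *out)
{
    if (v < SCHAR_MIN || v > SCHAR_MAX)
        return false;
    *out = (signed char)v;
    return true;
}

/* Same check for narrowing to a short. */
bool narrow_to_short(long v, short *out)
{
    if (v < SHRT_MIN || v > SHRT_MAX)
        return false;
    *out = (short)v;
    return true;
}
```

A caller can then write if (!narrow_to_schar(sword, &sbyte)) { /* report error */ } instead of repeating the bounds at every conversion.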

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
