Porting Linux applications from 32-bit to 64-bit systems

Source: Internet
Author: User

A program that runs correctly on a 32-bit machine can fail on a 64-bit machine. After running into this problem, I found this article about porting and read it carefully.

Original post address: http://queniao.blog.51cto.com/10636/126564

There is also a related problem: a program that runs normally on a virtual machine can fail, intermittently, when it is moved to an ordinary multi-core machine. The reason is that a multithreaded program rarely achieves true concurrency on the virtual machine, so concurrency bugs stay hidden there. We should keep this in mind in future development.

 

As 64-bit architectures become widespread, preparing your Linux software for 64-bit systems matters more than ever. In this article, you will learn how to avoid portability defects in declarations, assignments, bit shifts, type conversions, string formatting, and other operations.

Linux is one of the cross-platform operating systems that can use 64-bit processors, and 64-bit systems are already very common on servers and desktops. Many developers now face the need to migrate their applications from a 32-bit environment to a 64-bit environment. With the introduction of Intel Itanium and other 64-bit processors, preparing software for a 64-bit environment has become increasingly important.

Like UNIX and other UNIX-like operating systems, Linux uses the LP64 standard, in which pointers and long integers are both 64 bits wide while ordinary integers remain 32 bits. Some high-level languages are not affected by these size differences, but others, such as C, are.

Porting an application from a 32-bit system to a 64-bit system may be very simple or very difficult, depending on how the application was written and maintained. Many subtle issues can cause problems, even in a well-written, highly portable application. This article therefore summarizes these issues and offers suggestions for solving them.

Advantages of 64-bit
The 32-bit platform has many limitations that hinder the development of large applications (such as databases), especially for developers who want to take full advantage of the computer hardware. Scientific computing usually relies on floating-point arithmetic, while some applications (such as financial computing) need a narrower range of numbers but higher precision than floating point provides. 64-bit arithmetic offers more precise fixed-point calculations with a sufficient numeric range. The limits of the 32-bit address space are also widely discussed: a 32-bit pointer can address only 4 GB of virtual address space. This restriction can be worked around, but application development becomes much more complicated and performance drops significantly.

In terms of language implementation, the current C language standard requires the "long long" data type to be at least 64 bits wide, although an implementation may define it as larger.

Dates are another area that improves. On 32-bit Linux, a date is represented by a signed 32-bit integer that counts the seconds elapsed since January 1, 1970. This counter overflows in 2038. On a 64-bit system, the date is represented by a signed 64-bit integer, which greatly extends its usable range.

In short, 64-bit has the following advantages:

  • 64-bit applications can directly access 4 EB of virtual memory, and the Intel Itanium processor provides a contiguous linear address space.
  • 64-bit Linux allows file sizes of up to 2^63 bytes (about 8 EB). An important benefit is support for servers that access very large databases.

 




Linux 64-bit architecture
Unfortunately, the C programming language does not provide a mechanism for adding new basic data types. Therefore, to provide 64-bit addressing and integer arithmetic, either the binding or mapping of the existing data types must change, or new data types must be added to the C language.

Table 1. 32-bit and 64-bit data models (sizes in bits)

             ILP32   LP64   LLP64   ILP64
char           8       8      8       8
short         16      16     16      16
int           32      32     32      64
long          32      64     32      64
long long     64      64     64      64
pointer       32      64     64      64

The three 64-bit models (LP64, LLP64, and ILP64) differ in the sizes of the non-floating-point data types. When the width of one or more C data types changes from one model to another, applications can be affected in many ways. These effects fall into two categories:

  • Data object size. The compiler aligns data types on their natural boundaries. In other words, 32-bit data types are aligned on 32-bit boundaries, and on a 64-bit system 64-bit data types are aligned on 64-bit boundaries. This means that the size of data objects such as structures and unions differs between 32-bit and 64-bit systems.
  • Size of the basic data types. Common assumptions about the relationships between the basic data types no longer hold in a 64-bit data model, and applications that depend on these relationships may fail on a 64-bit platform. For example, the assumption sizeof (int) = sizeof (long) = sizeof (pointer) is valid for the ILP32 data model, but not for the other data models.

In short, the compiler aligns data types on their natural boundaries, which means that it inserts padding to enforce the alignment, as it does in C structures and unions. The members of a structure or union are aligned based on its widest member. Listing 1 shows such a structure.

Listing 1. c Structure

struct test {
    int i1;
    double d;
    int i2;
    long l;
};

 

Table 2 shows the size of each member in this structure and the size of this structure on 32-bit and 64-bit systems.

Table 2. Size of structure and structure members

Structure member     Size on 32-bit system       Size on 64-bit system
struct test {
    int i1;          32 bits                     32 bits
    (padding)                                    32 bits of padding
    double d;        64 bits                     64 bits
    int i2;          32 bits                     32 bits
    (padding)                                    32 bits of padding
    long l;          32 bits                     64 bits
};                   Structure size: 20 bytes    Structure size: 32 bytes

 

Note: On a 32-bit system, the compiler may not align d on a 64-bit boundary, even though it is a 64-bit object, because the hardware treats it as two 32-bit objects. On a 64-bit system, however, both d and l are aligned, which adds two 4-byte paddings.

 




Porting from a 32-bit system to a 64-bit system

This section describes how to solve some common problems:

  • Declarations
  • Expression
  • Assignment
  • Numeric constant
  • Endianism
  • Type Definition
  • Shifting
  • String formatting
  • Function Parameters
Declarations


To make your code work on both 32-bit and 64-bit systems, keep the following in mind when writing declarations:

  • Use the L or U suffix as needed when declaring integer constants.
  • Use unsigned integers where appropriate to prevent sign-extension problems.
  • If a variable must be 32 bits wide on both platforms, declare it as int.
  • If a variable should be 32 bits wide on a 32-bit system and 64 bits wide on a 64-bit system, declare it as long.
  • For alignment and performance, declare numeric variables as int or long. Do not try to save bytes by using char or short.
  • Declare character pointers and bytes as unsigned to prevent sign-extension problems with 8-bit characters.
Expression


In C/C++, expressions are evaluated according to associativity, operator precedence, and a set of arithmetic conversion rules. To make expressions work correctly on both 32-bit and 64-bit systems, keep the following rules in mind:

  • Adding two signed integers yields a signed integer.
  • Adding an int and a long yields a long.
  • If one operand is an unsigned integer and the other is a signed integer, the expression yields an unsigned integer.
  • Adding an int and a double yields a double. The int value is converted to double before the addition is performed.
Assignment


Since pointers, int, and long are no longer the same size on a 64-bit system, problems can arise depending on how these variables are assigned and used in an application. The following are some tips for assignments:
  • Do not use int and long interchangeably, because this can truncate the high-order bits. For example, do not do the following:

    int i;
    long l;
    i = l;
  • Do not use the int type to store pointers. The following example works on a 32-bit system but fails on a 64-bit system, because a 32-bit integer cannot hold a 64-bit pointer. For example, do not do the following:
    unsigned int i, *ptr;
    i = (unsigned) ptr;
  • Do not use pointers to store int values. For example, do not do the following:
    int *ptr;
    int i;
    ptr = (int *) i;
  • If an expression mixes unsigned and signed 32-bit integers and its result is assigned to a signed long, cast one of the operands to its 64-bit type. This causes the other operand to be promoted to 64 bits as well, so no further conversion is needed when the expression is assigned. Another solution is to cast the entire expression so that sign extension takes place on assignment. For example, consider the problem with the following usage:

    long n;
    int i = -2;
    unsigned k = 1;
    n = i + k;

    Mathematically, the result of the expression shown above should be -1. However, because the expression is unsigned, no sign extension takes place. The solution is to cast one operand to a 64-bit type (the first line below) or to cast the entire expression (the second line below):

    n = (long) i + k;
    n = (int) (i + k);

Numeric constant


Hexadecimal constants are often used as masks or as special bit values. A hexadecimal constant without a suffix that fits in 32 bits and has its high-order bit set is treated as an unsigned int. For example, the constant 0xFFFFFFFFL has all bits set on a 32-bit system, but on a 64-bit system, where it is a signed long, only the low-order 32 bits are set, giving 0x00000000FFFFFFFF. If we want all bits to be set, a portable method is to define a signed constant with the value -1. This sets all bits because the value is stored in two's complement form.

long x = -1L;

Another possible problem is setting the most significant bit. On a 32-bit system, the constant 0x80000000 is used for this, but a more portable method is to use a shift expression:

1L << ((sizeof(long) * 8) - 1);

Endianism


Endianism refers to the order in which the bytes of a data item are stored in memory. It defines how the bytes within integer and floating-point data types are addressed. Little-endian systems store the low-order byte at the lowest memory address and the high-order byte at the highest address. Big-endian systems store the high-order byte at the lowest memory address and the low-order byte at the highest address. Table 3 shows the layout of a 64-bit long integer.

Table 3. 64-bit long int layout

                Low address                                            High address
Little endian   Byte 0   Byte 1   Byte 2   Byte 3   Byte 4   Byte 5   Byte 6   Byte 7
Big endian      Byte 7   Byte 6   Byte 5   Byte 4   Byte 3   Byte 2   Byte 1   Byte 0

 

For example, the 32-bit value 0x12345678 is laid out as follows on a big-endian machine:

Table 4. Layout of 0x12345678 on a big-endian system

Memory offset    0      1      2      3
Memory content   0x12   0x34   0x56   0x78

If we treat 0x12345678 as two half words, 0x1234 and 0x5678, we see the following on the big-endian machine:

Table 5. 0x12345678 viewed as two half words on a big-endian system

Memory offset    0        2
Memory content   0x1234   0x5678

On a little-endian machine, however, the word 0x12345678 is laid out as follows:

Table 6. Layout of 0x12345678 on a little-endian system

Memory offset    0      1      2      3
Memory content   0x78   0x56   0x34   0x12

Similarly, the two half words appear as follows on the little-endian machine, with the low-order half word 0x5678 at the lower address:

Table 7. 0x12345678 viewed as two half words on a little-endian system

Memory offset    0        2
Memory content   0x5678   0x1234

 

The following example illustrates the difference in byte order between big-endian and little-endian machines. The C program below prints "Big endian" when compiled and run on a big-endian machine, and "Little endian" when compiled and run on a little-endian machine.

Listing 2. Big endian and little endian

#include <stdio.h>

int main(void)
{
    int i = 0x12345678;

    /* Inspect the byte stored at the lowest address of i. */
    if (*(char *)&i == 0x12)
        printf("Big endian\n");
    else if (*(char *)&i == 0x78)
        printf("Little endian\n");
    return 0;
}

Endianism is important in the following situations:

  • When bit masks are used
  • When parts of an object are addressed indirectly through a pointer

C and C++ provide bit fields, which can help with endian problems. I recommend using bit fields rather than mask fields or hexadecimal constants. Several functions convert 16-bit and 32-bit data between host byte order and network byte order. For example, htonl(3) and ntohl(3) convert 32-bit integers. Similarly, htons(3) and ntohs(3) convert 16-bit integers. There is no standard set of functions for 64-bit integers, but Linux provides the following macros on both big-endian and little-endian systems:

  • bswap_16
  • bswap_32
  • bswap_64
Type Definition


We recommend that you avoid writing applications directly in terms of the C/C++ data types whose size changes on 64-bit systems. Instead, use type definitions or macros that state explicitly the size and kind of data a variable holds. The following definitions make code more portable:
  • ptrdiff_t: A signed integer type. It is the result of subtracting two pointers.
  • size_t: An unsigned integer type. It is the result of the sizeof operator. It is used when passing parameters to some functions (such as malloc(3)) and is also returned by some functions (such as fread(3)).
  • int32_t, uint32_t, and so on: Integer types with a predefined width.
  • intptr_t and uintptr_t: Integer types to which any valid pointer can be converted.
Example 1: In the following statement, the 64-bit value returned by sizeof is truncated to 32 bits when it is assigned to bufferSize:
    int bufferSize = (int) sizeof (something);
The solution is to declare bufferSize as size_t and to convert the return value to size_t, as follows:
    size_t bufferSize = (size_t) sizeof (something);
Example 2: On a 32-bit system, int and long have the same size, so some developers use the two types interchangeably. This can lead to pointers being assigned to int variables, or vice versa. On a 64-bit system, however, assigning a pointer to an int truncates the high-order 32 bits. The solution is to store pointers in pointer types or in the special types defined for this purpose, such as intptr_t and uintptr_t.
Shifting


An unsuffixed integer constant has type int, which can cause truncation when shifting. For example, in the following code the shift count a can be at most 31, because 1 << a has type int:
    long t = 1 << a;
To shift correctly on a 64-bit system, use 1L, as shown below:
    long t = 1L << a;
String formatting


The printf(3) function and its relatives can be a source of problems. For example, on a 32-bit system %d prints both int and long values, but on a 64-bit platform it truncates a long value to its low-order 32 bits. The correct conversion for long variables is %ld. Similarly, when a small integer (char, short, or int) is passed to printf(3), it is widened to 64 bits, with sign extension as appropriate. In the following example, the printf(3) call assumes that a pointer is 32 bits wide:
    char *ptr = &something;
    printf ("%x\n", ptr);
This code fails on a 64-bit system, where it displays only the low-order 4 bytes. The solution is to use %p, as shown below; this works on both 32-bit and 64-bit systems:
    char *ptr = &something;
    printf ("%p\n", ptr);
Function Parameters


When passing parameters to a function, remember the following:

  • When the data type of a parameter is defined by a function prototype, the argument is converted to that type according to the standard rules.
  • If the parameter type is not specified, the argument is promoted to a larger type.
  • On a 64-bit system, integer types are converted to 64-bit integer values, and single-precision floating-point values are converted to double precision.
  • If the return type is not specified, the function's return type defaults to int.
A problem occurs when the sum of a signed and an unsigned integer is passed as a long. Consider the following:

Listing 3. Passing the sum of signed and unsigned integers as the long type

long function (long l);
int main () {
int i = -2;
unsigned k = 1U;
long n = function (i + k);
}

 

This code fails on a 64-bit system because the expression (i + k) is an unsigned 32-bit expression, so it is not sign-extended when it is converted to long. The solution is to cast one of the operands to a 64-bit type.

Register-based systems raise another problem: they pass function parameters in registers rather than on the stack. Consider the following example:
    float f = 1.25;
    printf ("The hex value of %f is %x", f, f);
On a stack-based system, the corresponding hexadecimal value is printed. On a register-based system, however, the hexadecimal value is read from an integer register rather than from a floating-point register. The solution is to cast the address of the floating-point variable to a pointer to an integer and dereference it, as shown below:
    printf ("The hex value of %f is %x", f, *(int *)&f);