16-bit, 32-bit, 64-bit code segment

Source: Internet
Author: User

In x86 programming it is sometimes necessary to switch from real mode to protected mode, for example to access extended memory in the DOS era or to write boot code. (If you program under a 32-bit operating system, you will not run into this problem.) The switch always involves a jump between a 16-bit code segment and a 32-bit code segment, so it is necessary to tell the two apart.

The main difference between a 16-bit code segment and a 32-bit code segment is the default size of a jump offset: in a 16-bit code segment the offset of a jump target is encoded in 16 bits, and in a 32-bit code segment it is encoded in 32 bits.

In real mode, the CPU performs 16-bit jumps by default: when it decodes a jump, it reads a 16-bit value from the instruction as the offset, unless an operand-size prefix says otherwise. The assembler must therefore cooperate with the CPU and generate code that uses 16-bit offsets.

In protected mode the situation is a little more complicated, because how the CPU decodes a jump offset depends on an attribute of the code segment it is executing. In protected mode, each code segment is described by a segment descriptor, and one field of the code segment descriptor (the D bit) indicates whether the segment is a 16-bit or a 32-bit code segment. While executing in a 16-bit code segment, the CPU reads a 16-bit value as the jump offset by default; in a 32-bit code segment, it reads a 32-bit value. So in protected mode the offset is not always a 32-bit value. Under a 32-bit operating system, however, user programs are always built as 32-bit code, so the assembler must cooperate with the CPU and generate code with 32-bit offsets.
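To make the descriptor field concrete, here is a minimal C sketch that packs a code segment descriptor and sets or clears its D bit. The helper name make_code_descriptor and the macro names are assumptions made for illustration; the bit positions follow the standard 8-byte x86 descriptor layout.

```c
#include <stdint.h>

/* Bit positions inside the 8-byte x86 segment descriptor. */
#define DESC_FLAG_DB   (1ULL << 54)  /* D/B: 0 = 16-bit segment, 1 = 32-bit segment */
#define DESC_FLAG_G    (1ULL << 55)  /* G: limit is counted in 4 KB pages */
#define ACCESS_CODE_RX 0x9AULL       /* present, ring 0, code, readable */

/* Hypothetical helper: pack base, 20-bit limit and flags into a descriptor. */
uint64_t make_code_descriptor(uint32_t base, uint32_t limit,
                              int is_32bit, int page_granular)
{
    uint64_t d = 0;
    d |= (uint64_t)(limit & 0xFFFFu);               /* limit 15:0  */
    d |= (uint64_t)(base & 0xFFFFFFu) << 16;        /* base 23:0   */
    d |= ACCESS_CODE_RX << 40;                      /* access byte */
    d |= (uint64_t)((limit >> 16) & 0xFu) << 48;    /* limit 19:16 */
    if (is_32bit)
        d |= DESC_FLAG_DB;
    if (page_granular)
        d |= DESC_FLAG_G;
    d |= (uint64_t)(base >> 24) << 56;              /* base 31:24  */
    return d;
}
```

Called with base 0, limit 0xFFFFF and both flags set, the helper produces the familiar flat 32-bit code descriptor; clearing is_32bit yields a descriptor the CPU treats as a 16-bit code segment.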

Now the problem arises: when the CPU switches from real mode to protected mode, how do we generate code that the CPU will execute correctly?

There are two ways to solve this problem.

First, assemble both the code executed in real mode and the code executed in protected mode with 16-bit offsets, and mark the code segment as 16-bit in its descriptor. The advantage of this method is that we can write protected-mode programs even if the assembler cannot generate 32-bit offsets. The disadvantage is that, because offsets are 16 bits, the size of the code segment is limited to 64 KB.

Second, assemble the code executed in real mode with 16-bit offsets and the code executed in protected mode with 32-bit offsets, and mark the code segment as 32-bit in its descriptor. The advantage of this method is that the program can take full advantage of the 32-bit processor. However, it has a problem: at some point we must jump from the 16-bit code into the 32-bit code using a 32-bit offset. While writing the 16-bit code we want the assembler to generate 16-bit offsets, but for the jump into the 32-bit code segment we want it to generate a 32-bit offset. This is a contradiction. I know three ways to resolve it:

1. Bypass the assembler for this one instruction: the programmer writes the machine code of the jump into the code segment by hand, using data definitions. This method needs no help from the assembler.

2. Some assemblers, such as NASM, let the programmer specify the size of the jump offset and generate the correct code.

3. Switch the assembler from 16-bit mode to 32-bit mode. Both NASM and GAS support this. However, some assemblers only allow the mode to be specified when a segment is defined, and with those this method does not work.
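As an illustration of the first method, the bytes that would be written by hand can be sketched in C. The function name encode_far_jmp32 is hypothetical; the encoding itself (operand-size prefix 66h, opcode EAh, 32-bit offset, 16-bit selector) is the standard far-jump form for a CPU that is still decoding with 16-bit defaults.

```c
#include <stddef.h>
#include <stdint.h>

/* Writes "66 EA <offset32> <selector16>": a direct far jump with a
 * 32-bit offset, as decoded by a CPU running with 16-bit defaults.
 * Returns the number of bytes written (always 8). */
size_t encode_far_jmp32(uint8_t out[8], uint32_t offset, uint16_t selector)
{
    out[0] = 0x66;                       /* operand-size override prefix */
    out[1] = 0xEA;                       /* JMP ptr16:32 (far, direct)   */
    out[2] = (uint8_t)(offset);          /* offset, little-endian        */
    out[3] = (uint8_t)(offset >> 8);
    out[4] = (uint8_t)(offset >> 16);
    out[5] = (uint8_t)(offset >> 24);
    out[6] = (uint8_t)(selector);        /* selector, little-endian      */
    out[7] = (uint8_t)(selector >> 8);
    return 8;
}
```

In an assembler source the same eight bytes would be emitted with data definitions in place of the jump instruction.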

The following uses NASM as an example to show how to switch from 16-bit code to 32-bit code.

    section .text

    ; At this point the CPU is running in real mode.

    bits 16         ; assemble in 16-bit mode: code with 16-bit offsets is generated

    ; ... real-mode code ...

    ; Load the GDT. The GDT contains the descriptor of this segment, which is marked 32-bit.

    ; Switch to protected mode. Because this segment is described as 32-bit in the GDT,
    ; the CPU treats all offsets as 32-bit once the jump below has loaded CS.

    jmp dword cs_selector:PM    ; force a 32-bit offset; cs_selector is the selector of this code segment

    bits 32         ; from here on assemble in 32-bit mode: all offsets are 32 bits

    PM:
    ; ... protected-mode code ...

The following variant moves the bits 32 directive in front of the jump, so that the assembler emits the 32-bit offset without an explicit dword. It is often presented as equivalent to the code above, but be careful with it: until the far jump has loaded CS, the CPU is still decoding with 16-bit defaults, and a jump assembled in 32-bit mode carries no operand-size prefix. The first form is therefore the safer one.

    section .text

    ; At this point the CPU is running in real mode.

    bits 16         ; assemble in 16-bit mode: code with 16-bit offsets is generated

    ; ... real-mode code ...

    ; Load the GDT. The GDT contains the descriptor of this segment, which is marked 32-bit.

    ; Switch to protected mode. This segment is described as 32-bit in the GDT.

    bits 32         ; from here on assemble in 32-bit mode: all offsets are 32 bits

    jmp cs_selector:PM          ; cs_selector is the selector of this code segment

    PM:
    ; ... protected-mode code ...

The two examples above switch directly from real mode to 32-bit protected mode, with the real-mode code and the 32-bit protected-mode code in the same segment. There are other ways to make the switch, for example:

1. Put the real-mode code and the 32-bit protected-mode code in different segments.

2. Use a piece of 16-bit protected-mode code as a transition.

And so on. Once you understand the principle, the methods are easy to follow.



With the advent of low-cost 64-bit platforms and the falling price of memory and hard disks, porting 32-bit programs to 64-bit hardware has become increasingly attractive, and scientific computing, databases, and other programs that consume large amounts of memory or perform intensive floating-point work are taking advantage of it. This article discusses some of the small issues to watch for when porting existing 32-bit code to a 64-bit platform.

Current 64-bit platforms are binary compatible with 32-bit applications, which means existing programs can be carried over easily. Many programs that run well on 32-bit platforms may not need to be ported at all, unless the program:

· needs more than 4 GB of memory;

· routinely works with files larger than 2 GB;

· performs intensive floating-point operations that benefit from the 64-bit architecture;

· would benefit from the optimized math libraries of the 64-bit platform.

Otherwise, simply recompiling is enough. Most well-written programs can be ported to a 64-bit platform with little effort. This assumes that your program is well written and that you are familiar with the issues discussed in this article.

  ILP32 and LP64 Data Models

32-bit environments use the "ILP32" data model, so named because the C types int, long, and pointer are all 32 bits wide. 64-bit environments use a different data model in which long and pointer are 64 bits wide, called the "LP64" data model.

Currently, all 64-bit UNIX platforms use the LP64 data model, while 64-bit Windows uses the LLP64 data model, in which the basic types are unchanged except that pointers are 64 bits wide. Here we discuss porting from ILP32 to LP64. Table 1 shows the differences between the ILP32 and LP64 data models.
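The data model of a given build can be checked directly with sizeof. The sketch below is illustrative only; the function names data_model_name and current_data_model are assumptions, and it classifies only the three models named above.

```c
#include <stddef.h>

/* Returns a name for the data model implied by the given type sizes. */
const char *data_model_name(size_t int_size, size_t long_size, size_t ptr_size)
{
    if (int_size == 4 && long_size == 4 && ptr_size == 4)
        return "ILP32";
    if (int_size == 4 && long_size == 8 && ptr_size == 8)
        return "LP64";
    if (int_size == 4 && long_size == 4 && ptr_size == 8)
        return "LLP64";
    return "other";
}

/* Data model of the platform this file is compiled for. */
const char *current_data_model(void)
{
    return data_model_name(sizeof(int), sizeof(long), sizeof(void *));
}
```

Printing the result of current_data_model() from a small test program is a quick way to confirm which model a compiler and option set actually target.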

When porting code to 64-bit, one simple rule applies: never assume that int, long, and pointer have the same length. Any code that violates this rule can fail in various ways under the LP64 data model, and the cause is hard to track down. Example 1 violates the rule in several places and has to be rewritten when ported to a 64-bit platform.

Example 1:

     1 int *myfunc(int i)
     2 {
     3     return (&i);
     4 }
     5
     6 int main(void)
     7 {
     8     int myint;
     9     long mylong;
    10     int *myptr;
    11
    12     char *name = (char *) getlogin();
    13
    14     printf("Enter a number %s: ", name);
    15     (void) scanf("%d", &mylong);
    16     myint = mylong;
    17     myptr = myfunc(mylong);
    18     printf("mylong: %d pointer: %x\n", mylong, myptr);
    19     myint = (int) mylong;
    20     exit(0);
    21 }

The first step is to ask the compiler to catch porting problems. The options vary by compiler: for the IBM XL compiler family, the relevant options are -qwarn64 and -qinfo=pro, and a 64-bit executable is produced with -q64. With GCC the option is -m64; other useful GCC options are listed in Table 2. Figure 1 shows the result of compiling the code in Example 1.

Truncation caused by a missing prototype

If a function is called without a prototype in scope, the compiler assumes that it returns a 32-bit int. Code that calls functions without prototypes can therefore suffer unexpected data truncation, leading to a segmentation fault. The compiler caught this error on line 12 of Example 1:

    char *name = (char *) getlogin();

The compiler assumes that the function returns an int and truncates the resulting pointer. This line works in the ILP32 data model, because int and pointer have the same length there, but it is not correct under LP64. The cast cannot prevent the error, because the return value of getlogin() has already been truncated by the time the cast is applied.

To fix the problem, include the header <unistd.h>, which declares the prototype of getlogin().

Format specifiers

If a 32-bit format is specified for a long or a pointer, the program misbehaves. The compiler caught this error on line 15 of Example 1:

    (void) scanf("%d", &mylong);

Note that scanf stores a 32-bit value into mylong and leaves the remaining 4 bytes untouched. To fix the problem, use the %ld specifier in scanf.

Line 18 shows a similar problem in printf:

    printf("mylong: %d pointer: %x\n", mylong, myptr);

To correct the error, use %ld for mylong, and %p instead of %x for myptr.
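The corrected specifiers can be sketched as two small helpers. The function names are hypothetical; the point is only that %ld pairs with long and %p with a void pointer, in both the printf and scanf families.

```c
#include <stdio.h>

/* Formats a long and a pointer with matching conversion specifiers.
 * Returns the number of characters snprintf would have written. */
int format_long_and_pointer(char *buf, size_t size, long value, void *ptr)
{
    /* %ld matches long; %p expects a pointer to void. */
    return snprintf(buf, size, "mylong: %ld pointer: %p", value, ptr);
}

/* Parses a long with %ld; returns 1 on success, 0 otherwise. */
int parse_long(const char *text, long *out)
{
    return sscanf(text, "%ld", out) == 1;
}
```

Because the specifiers now match the argument types, the same source behaves identically under ILP32 and LP64.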

Assignment truncation

An example of assignment truncation found by the compiler is on line 16:

    myint = mylong;

This causes no problem in the ILP32 model, because int and long are both 32 bits wide. In LP64, however, when mylong is assigned to myint, the value is truncated if it is larger than the maximum value of a 32-bit integer.
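Where a long really must be stored in an int, one option is to check the range explicitly before narrowing. This is a sketch of that idea, not something the original article prescribes; the name long_to_int_checked is an assumption.

```c
#include <limits.h>

/* Stores value in *out and returns 1 if it fits in an int;
 * returns 0 and leaves *out untouched otherwise. */
int long_to_int_checked(long value, int *out)
{
    if (value < INT_MIN || value > INT_MAX)
        return 0;          /* the assignment would truncate */
    *out = (int)value;
    return 1;
}
```

Under ILP32 the range check can never fail, so the helper costs nothing there and only starts rejecting values once long becomes wider than int.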

Parameter truncation

The next error found by the compiler is on line 17. Although myfunc accepts a single int parameter, it is called with a long argument, and the argument is silently truncated when it is passed.

  Conversion Truncation

Conversion truncation occurs when a long is cast to an int, as on line 19 of Example 1:

    myint = (int) mylong;

The cause of conversion truncation is that int and long do not have the same length. Conversions of this kind typically appear in code such as:

    int length = (int) strlen(str);

strlen returns size_t (an unsigned long in LP64), so truncation is unavoidable when the result is assigned to an int. In practice, truncation only occurs when the length of str exceeds 2 GB. Even so, you should use the appropriate polymorphic types (such as size_t and uintptr_t) rather than worrying about the underlying base type.

Some other minor issues

The compiler can catch many porting problems, but you cannot always count on it to identify every error for you.

Constants written in hexadecimal are usually treated as 32-bit values. For example, the unsigned 32-bit constant 0xffffffff is often used to test for -1:

    #define invalid_pointer_value 0xffffffff

However, on a 64-bit system this value is not -1 but 4294967295. On a 64-bit system, the correct representation of -1 is 0xffffffffffffffff. To avoid the problem, declare the constant with const and an explicit signed or unsigned type:

    const signed int invalid_pointer_value = 0xffffffff;

On typical two's-complement implementations, this line of code yields -1 on both 32-bit and 64-bit systems.
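The difference between the two constants shows up as soon as they are widened. The sketch below uses int64_t and uintptr_t so that it behaves the same regardless of the data model; the function names are assumptions, and it assumes a 32-bit int, as in both ILP32 and LP64.

```c
#include <stdint.h>

/* 0xffffffff has type unsigned int, so widening it zero-extends. */
int64_t widen_hex_constant(void)
{
    return (int64_t)0xffffffff;      /* 4294967295, not -1 */
}

/* -1 has type int, so widening it sign-extends. */
int64_t widen_minus_one(void)
{
    return (int64_t)-1;
}

/* All-ones pointer-sized value built from -1, independent of pointer width. */
uintptr_t invalid_pointer_bits(void)
{
    return (uintptr_t)-1;            /* equals UINTPTR_MAX */
}
```

Writing the sentinel as a converted -1 rather than as a hexadecimal literal avoids hard-coding the number of f digits.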

Other problems with hard-coded constants stem from assumptions based on the ILP32 data model, as shown below:

    int **p; p = (int **) malloc(4 * no_elements);

This line of code assumes that a pointer is 4 bytes long, which is wrong in LP64, where it is 8 bytes. The correct approach is to use sizeof():

    int **p; p = (int **) malloc(sizeof(*p) * no_elements);

Also watch for code that relies on incorrect assumptions about sizeof(), for example:

    sizeof(int) == sizeof(int *);

This does not hold in LP64.
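A minimal sketch of the sizeof-based allocation follows. It swaps malloc for calloc, which takes the element count and element size separately and also zero-fills the table; that substitution and the name alloc_pointer_table are choices made for this example, not part of the original code.

```c
#include <stdlib.h>

/* Allocates a zero-filled table of count pointers to int.
 * Returns NULL on failure; release the table with free(). */
int **alloc_pointer_table(size_t count)
{
    /* sizeof *p follows the pointer width of the target data model. */
    int **p = calloc(count, sizeof *p);
    return p;
}
```

Because the element size is derived from the variable itself, the call stays correct even if the element type of the table is changed later.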

  Sign Extension

Avoid arithmetic that mixes signed and unsigned operands. When an int value is combined with a long value, the result differs between LP64 and ILP32. Because sign extension is involved, this problem is hard to find. It can only be prevented at the root by making the operands on both sides either all signed or all unsigned.

Example 2:

    long k;
    int i = -2;
    unsigned int j = 1;
    k = i + j;

    printf("Answer: %ld\n", k);

You would expect the answer in Example 2 to be -1. However, when the program is compiled for LP64, the answer is 4294967295. The reason is that the expression (i + j) is an unsigned int expression, so the sign bit is not extended when it is assigned to k. To solve the problem, make both operands signed or both unsigned, as shown below:

    k = i + (int) j;
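The two forms of the addition can be compared side by side. This sketch uses int64_t in place of long so that the 64-bit behaviour is visible on any platform; the function names are assumptions.

```c
#include <stdint.h>

/* Mixed signed/unsigned: i is converted to unsigned int before the addition. */
int64_t add_mixed(int i, unsigned int j)
{
    return i + j;          /* unsigned int result, zero-extended to 64 bits */
}

/* Both operands signed: the int result is sign-extended to 64 bits. */
int64_t add_signed(int i, unsigned int j)
{
    return i + (int)j;
}
```

With i = -2 and j = 1, add_mixed returns 4294967295 while add_signed returns -1, which is the difference Example 2 describes.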


If a union contains data types of different lengths, problems may arise. Example 3 comes from a common open-source package; it works in ILP32 but not in LP64. The code assumes that an array of two unsigned shorts occupies the same space as a long, which is not true on an LP64 platform.

Example 3:

    typedef struct {
        unsigned short bom;
        unsigned short cnt;
        union {
            unsigned long bytes;
            unsigned short len[2];
        } size;
    } _ucheader_t;

To make this work on LP64, replace unsigned long in the code with unsigned int. Check the unions in all of your code carefully to make sure that all members have the same length in LP64.
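One way to keep such a union honest is to use fixed-width types and let the compiler verify the size assumption. This is a sketch under the assumption that a C11 compiler is available; the type name ucheader_fixed_t is hypothetical and is not the name used by the original package.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint16_t bom;
    uint16_t cnt;
    union {
        uint32_t bytes;     /* 32 bits in every data model */
        uint16_t len[2];    /* overlays both halves of bytes */
    } size;
} ucheader_fixed_t;

/* Fail the build if the two union members ever differ in size (C11). */
static_assert(sizeof(uint32_t) == sizeof(uint16_t[2]),
              "union members must have the same size");
```

The static assertion turns a silent layout change into a compile-time error, so the check does not depend on anyone remembering to audit the union by hand.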


Because 64-bit platforms differ from one another, porting a 32-bit program may also fail because of differences in byte order between machines. Intel, IBM PC and other CISC chips use little-endian byte order, while Apple and other chips use big-endian byte order. Little-endian byte order usually hides truncation bugs during porting.

Example 4:

    long k;
    int *ptr;

    int main(void)
    {
        k = 2;
        ptr = &k;
        printf("k has the value %ld, value pointed to by ptr is %ld\n", k, *ptr);
        return 0;
    }

Example 4 is a clear instance of this problem. A pointer to int is declared but inadvertently made to point to a long. On ILP32, this code prints 2, because int and long have the same length. On LP64, the value read through the pointer is truncated, because int and long differ in length. Even so, on a little-endian system the code still gives the correct answer, 2, while on a big-endian system the value read is 0.


Table 3 illustrates why truncation produces different answers under different byte orders. In little-endian byte order, the truncated bytes at the high addresses are all 0, so the answer is still 2. In big-endian byte order, the truncated bytes at the high addresses hold the value 2, so the result is 0. Truncation is therefore a bug in both cases. Be aware, however, that little-endian byte order hides truncation errors for small values, and the error may only be discovered when the code is ported to a big-endian system.
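The effect can be reproduced without relying on an invalid pointer conversion. The sketch below copies only the first four bytes of a 64-bit value, which is what a truncating int pointer would read; memcpy is used here to stay within defined behaviour, and both function names are assumptions.

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 on a little-endian machine, 0 on a big-endian one. */
int is_little_endian(void)
{
    uint16_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Reads only the first 4 bytes of a 64-bit value, as a truncating
 * int pointer would. */
int32_t first_four_bytes(int64_t value)
{
    int32_t low_address_half;
    memcpy(&low_address_half, &value, sizeof low_address_half);
    return low_address_half;
}
```

For the value 2, first_four_bytes returns 2 on a little-endian machine and 0 on a big-endian one, which matches the behaviour described for Example 4.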


Performance degradation after porting to a 64-bit platform

After code has been ported to a 64-bit platform, you may find that performance has actually dropped. The cause lies in the pointer length and data sizes of LP64 and the problems that follow from them, such as a lower cache hit rate, larger data structures, and data alignment.

In a 64-bit environment, pointers occupy more bytes, so 32-bit code that ran well may suffer cache problems to varying degrees, which show up as reduced execution efficiency. You can use a profiling tool to analyze changes in the cache hit rate and confirm whether this is the cause.

After migration to LP64, the size of data structures may change, so the program may need more memory and disk space. For example, the structure in Figure 2 needs only 16 bytes in ILP32 but 32 bytes in LP64, an increase of 100%. This is because long is now 64 bits wide and the compiler adds padding to align it.

By changing the order of the members in the structure, we can minimize this effect and reduce the storage required. If the two 32-bit int values are placed next to each other, no padding is inserted between them and the whole structure needs only 24 bytes.

Before you rearrange a data structure, measure carefully how often its members are used, so that the change does not cost performance through a lower cache hit rate.
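Since Figure 2 is not reproduced here, the following sketch uses a hypothetical pair of structures with the same sizes as those quoted above. Fixed-width types stand in for int and long so that the layout can be observed on any platform where 64-bit integers are 8-byte aligned.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: each 32-bit member is followed by padding so that
 * the next 64-bit member starts on an 8-byte boundary. */
struct padded {
    int32_t a;
    int64_t b;
    int32_t c;
    int64_t d;
};

/* Same members, with the two 32-bit values placed next to each other. */
struct reordered {
    int32_t a;
    int32_t c;
    int64_t b;
    int64_t d;
};

size_t padded_size(void)    { return sizeof(struct padded); }
size_t reordered_size(void) { return sizeof(struct reordered); }
```

On a typical LP64 system padded_size() reports 32 and reordered_size() reports 24; printing offsetof for each member shows exactly where the padding went.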

  How to generate 64-bit code

In some cases, 32-bit and 64-bit programs are hard to tell apart at the source-code interface level. Many header files use test macros to distinguish them. Unfortunately, these macros depend on the specific platform, compiler, or compiler version. For example, GCC 3.4 and later define __LP64__ on all 64-bit platforms when the -m64 option is used to generate 64-bit code. In GCC versions earlier than 3.4, however, this is specific to the platform and operating system.

Your compiler may use a macro other than __LP64__. For example, the IBM XL compiler defines __64BIT__ when a program is compiled with -q64, and other platforms use _LP64. You can also test __WORDSIZE where it applies. Consult the relevant compiler documentation to find the most suitable macro. Example 5 works with multiple platforms and compilers:

Example 5:

    #if defined(__LP64__) || defined(__64BIT__) || defined(_LP64) || (__WORDSIZE == 64)
        printf("I am LP64\n");
    #else
        printf("I am ILP32\n");
    #endif

  Shared data

A typical problem when porting to a 64-bit platform is how to read and share data between 32-bit and 64-bit programs. For example, a 32-bit program may have stored struct objects on disk as binary files. When these files have to be read by 64-bit code, the different size of the structure in the LP64 environment can cause problems.

For new programs that must run on both 32-bit and 64-bit platforms, we recommend that you do not use data types (such as long) whose length differs between LP64 and ILP32. If you need to share data at the binary level between 32-bit and 64-bit programs, whether through files or over a network, use the fixed-width integer types from the header <inttypes.h>.

Example 6:

    #include <stdio.h>
    #include <inttypes.h>

    struct on_disk
    {
        /* int32_t */
        long foo;
    };

    int main()
    {
        FILE *file;
        struct on_disk data;
    #ifdef WRITE
        file = fopen("test", "w");
        data.foo = 65535;
        fwrite(&data, sizeof(struct on_disk), 1, file);
    #else
        file = fopen("test", "r");
        fread(&data, sizeof(struct on_disk), 1, file);
        printf("data: %ld\n", data.foo);
    #endif
        fclose(file);
    }

Consider Example 6. Ideally, this program would run correctly on both 32-bit and 64-bit platforms, and each build could read the data written by the other. In practice it does not work, because the length of long differs between ILP32 and LP64. The foo member of the on_disk structure should be declared as int32_t. This fixed-width type guarantees that data of the same size is produced under both the current ILP32 data model and the LP64 data model being migrated to.
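A version of the record that uses int32_t can be sketched as follows. To keep the example self-contained it writes to a temporary file created with tmpfile() rather than to a file named "test", and the names on_disk_fixed and roundtrip_record are assumptions. Note that a fixed width settles the size of the field only; byte order still has to match between writer and reader.

```c
#include <inttypes.h>
#include <stdio.h>

struct on_disk_fixed {
    int32_t foo;            /* 4 bytes under both ILP32 and LP64 */
};

/* Writes one record to a temporary file and reads it back.
 * Returns 1 on success and stores the value read in *out. */
int roundtrip_record(int32_t value, int32_t *out)
{
    struct on_disk_fixed data = { value };
    struct on_disk_fixed copy = { 0 };
    FILE *file = tmpfile();
    if (file == NULL)
        return 0;
    int ok = fwrite(&data, sizeof data, 1, file) == 1;
    rewind(file);
    ok = ok && fread(&copy, sizeof copy, 1, file) == 1;
    fclose(file);
    if (ok) {
        *out = copy.foo;
        /* PRId32 is the matching printf specifier for int32_t. */
        printf("data: %" PRId32 "\n", copy.foo);
    }
    return ok;
}
```

The PRId32 macro from <inttypes.h> plays the same role for printing that int32_t plays for storage: the format string no longer depends on which basic type happens to be 32 bits wide.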

  Mixing Fortran and C

Many scientific computing programs call Fortran routines from C/C++. Fortran code by itself has no problem being ported to a 64-bit platform, because Fortran data types have explicit bit sizes. When Fortran and C are mixed, however, problems appear: in Example 7, a C program calls the Fortran subroutine shown in Example 8.

Example 7:

    void foo(long *l);

    int main(void)
    {
        long l = 5000;
        foo(&l);
        return 0;
    }

Example 8:

    subroutine foo(i)
    integer i
    write(*,*) 'In Fortran'
    write(*,*) i
    end subroutine foo

Example 9:

    % gcc -m64 -c cfoo.c
    % /opt/absoft/bin/f90 -m64 cfoo.o foo.f90 -o out

After the two files are linked, an ILP32 build of the program prints the value of i as "5000". In LP64, the program prints "0" (on a big-endian machine), because in LP64 mode the subroutine foo is passed a 64-bit argument by address, while the Fortran subroutine expects a 32-bit one. To correct the error, declare the Fortran variable i as INTEGER*8, which has the same length as long in C.


The 64-bit platform offers hope for solving large, complex scientific and commercial problems. Most well-written programs can be ported to the new platform easily. Pay attention, however, to the differences between the ILP32 and LP64 data models to ensure a smooth migration.
