Six Compilation Modes

Last Update: 2018-12-04    Source: Internet

Author: User

Turbo C provides six compilation modes. A compilation mode is also called an addressing mode or memory model, because it determines how the program's code and data are laid out and addressed in memory. The six modes are tiny, small, compact, medium, large, and huge. The relationships between them are shown in the following table.

            │ Small program │ Large program
────────────┼───────────────┼───────────────
Small data  │ tiny, small   │ medium
Large data  │ compact       │ large, huge

A small program has only one code segment, which of course cannot exceed 64 KB, and its default code (function) pointer is near. A large program has multiple code segments. Each segment cannot exceed 64 KB, but the program as a whole can, and its default code pointer is far. The modes are discussed one by one below, and the output of the same program compiled in all six modes makes the differences concrete. It should be emphasized, however, that in no compilation mode can a single Turbo C source file generate more than 64 KB of code or more than 64 KB of static (including global) data.

For example, the following program:
int A[15000], B[20000];
void main()
{
}

This program cannot be compiled in any mode, because the two arrays together need 70,000 bytes of storage (a Turbo C int is two bytes), which is more than 64 KB (65,536 bytes). The compiler reports the error "Too much global data defined in file". To handle code or static data larger than 64 KB, the program must be divided into several source files. For example, the program above can be split into two files, a1.c and a2.c. Both source files are compiled in huge mode and finally linked into one executable file:

a1.c            │ a2.c            │ a.prj
int A[15000];   │ int B[20000];   │ a1
                │ void main() {}  │ a2
a1.obj (30 K)   │ a2.obj (40 K)   │ a.exe (71 K)

The six compilation modes differ in how they handle the code and data segments that come from different source files, how they handle dynamically allocated heap space, and which kinds of pointers they use by default. In addition, the information placed in the .obj file for the linker differs, so that the linker can arrange the code and data segments accordingly and record the corresponding descriptions in the header of the .exe file. This header tells DOS how to load the code and data segments when the program is executed and how to set the segment registers.

The program used to demonstrate the six compilation modes consists of two source files, x.c and y.c, shown below:
/* x.c */
#include <stdio.h>
void a()
{
    static int b;
    int c;
    printf("in function a\n");
    printf("CS: %x\n", _CS);    /* _CS, _DS, _SS: Turbo C pseudo-variables for the segment registers */
    printf("DS: %x\n", _DS);
    printf("SS: %x\n", _SS);
    printf("static b: %p\n", &b);
    printf("automatic c: %p\n", &c);
}

/* y.c */
#include <stdio.h>
#include <stdlib.h>             /* malloc */
void a();                       /* defined in x.c */
int d;
void main()
{
    int e;
    a();
    printf("in function main\n");
    printf("CS: %x\n", _CS);
    printf("DS: %x\n", _DS);
    printf("SS: %x\n", _SS);

    printf("global d: %p\n", &d);
    printf("automatic e: %p\n", &e);
    printf("heap address: %p\n", malloc(2));

    /* function pointers need an explicit near (%Np) or far (%Fp) size */
#if defined(__TINY__) || defined(__SMALL__) || defined(__COMPACT__)
    printf("function a: %Np\n", a);
    printf("function main: %Np\n", main);
#else
    printf("function a: %Fp\n", a);
    printf("function main: %Fp\n", main);
#endif
}

The first source file contains function a and a static (local) variable b. The second source file contains the main function and a global variable d. Each source file also has one automatic variable, c and e respectively. The main function calls function a in the first source file and calls the Turbo C library function malloc to allocate a block on the heap. The two source files are compiled separately and then linked together.

By compiling these two source files in each of the six modes, we can see how each mode allocates space for the code, data, and stack segments, where static variables, automatic variables, and heap variables are stored, and where functions are placed. As shown below, in some modes data pointers are near and function pointers are far; in other modes the opposite is true.

For data pointers, whether near or far, the %p format of printf prints the pointer correctly, because %p assumes the default data pointer size of the current mode. It does not do the same for function pointers, whose default size can differ from that of data pointers. Therefore the main function selects an explicit near or far pointer format with the conditional compilation lines #if, #else, and #endif.

Tiny Mode
In tiny mode the entire program has only one segment, which contains the code, the static and global data, the heap, and the stack. Because there is only one segment, DOS sets the CS, DS, and SS registers to the same value at execution time, all pointing to this segment. Within the segment, the code is loaded first at the lowest addresses, followed by the static and global variables, then the heap, and finally the stack. The heap and the stack are dynamic: the heap grows from low addresses toward high addresses, and the stack grows from high addresses toward low addresses. If the two meet, memory is exhausted. In tiny mode all pointers are near and are relative to CS, DS, and SS. An .exe file compiled and linked in tiny mode can be converted to a .com file with the DOS exe2bin utility. In the program output listed in the table after the mode descriptions, function a has a lower address than function main, and variable b has a lower address than variable d. This is because x.obj comes first in the link and y.obj second.

Small Mode
Small mode is the most commonly used mode; most examples in this book are compiled in it. Although small mode and tiny mode are both small-data, small-program modes, there are two important differences between them. First, the code segment is separate from the data/stack/heap segment, so CS is not equal to DS and SS. Second, in addition to the near heap that shares a segment with the data and stack, there is a far heap that is accessed through far pointers. The far heap extends from the end of the data/stack segment to the end of conventional memory. Because the code still occupies a single segment, and the static data, stack, and near heap share a single segment, the default data pointers and function pointers in small mode are all near. As a result, the Turbo C library functions, which take near data pointers in this mode, cannot be used directly on variables in the far heap. However, as long as the program provides its own functions for the purpose, it can access any location in the far heap, so all of conventional memory can be used.

Compact Mode
Compact mode is the simplest in concept. The code, the static data, the stack, and the heap each have their own segment. There is only a far heap and no near heap. Like the far heap in small and medium modes, it is accessed through far pointers, but here the Turbo C library functions can be used to process heap variables. All data pointers are far and function pointers are near. The output of the demo program shows that the three registers CS, DS, and SS all hold different values. Note that the total amount of static data still cannot exceed 64 KB.

Medium Mode
In its allocation of data, stack, and heap, medium mode is the same as small mode. The difference lies in the code segments. In medium mode, code from different source files is placed in different code segments. More precisely, the functions of one source file share one code segment, which cannot exceed 64 KB, while the total size of all code segments is limited only by the memory available on the machine. Because there are multiple code segments, Turbo C must use far function pointers. In the output of the demo program, the address of function a is 74f9:000e and the address of function main is 74fe:0004. Function a has the lower address because x.obj, which contains it, comes first in the link. In medium mode the heap still consists of a near heap and a far heap.

Large Mode
In its allocation of static data, stack, and heap, large mode is equivalent to compact mode. In its allocation of code, large mode is equivalent to medium mode. Both data pointers and function pointers are far. As in compact mode, the total amount of static data cannot exceed 64 KB.

Huge Mode
Huge mode removes the restriction that the total amount of static data cannot exceed 64 KB. Code from different source files is stored in different segments, and static data from different source files is also stored in different segments. Only the stack is shared. The earlier example with the two large arrays relies on this feature. The output of the demo program also shows that when function a is called from main, not only CS changes but DS changes as well. The two functions do share the same stack, of course; otherwise the call could not return correctly. Do not confuse huge mode with huge pointers: in huge mode the default pointer is still far, not huge.

Output of the demo program in the six modes:

Mode           │ tiny │ small │ compact   │ medium    │ large     │ huge
───────────────┼──────┼───────┼───────────┼───────────┼───────────┼──────────
In function a
CS             │ 74c8 │ 74b1  │ 74b1      │ 74f9      │ 74fd      │ 74fe
DS             │ 74c8 │ 75cc  │ 7629      │ 75ec      │ 764a      │ 7674
SS             │ 74c8 │ 75cc  │ 767a      │ 75ec      │ 76a0      │ 76bb
Static b       │ 1704 │ 048c  │ 7629:04c8 │ 049a      │ 764a:04d6 │ 7674:0002
Automatic c    │ ffd0 │ ffd0  │ 767a:0fd6 │ ffcc      │ 76a0:0fd4 │ 76bb:0fd0

In function main
CS             │ 74c8 │ 74b1  │ 74b1      │ 74fe      │ 7502      │ 7503
DS             │ 74c8 │ 75cc  │ 7629      │ 75ec      │ 764a      │ 767b
SS             │ 74c8 │ 75cc  │ 767a      │ 75ec      │ 76a0      │ 76bb
Global d       │ 1706 │ 048e  │ 7629:04ca │ 049c      │ 764a:04d8 │ 767b:0004
Automatic e    │ ffd6 │ ffd6  │ 767a:0fde │ ffd4      │ 76a0:0fdc │ 76bb:0fda
Heap address   │ 1792 │ 051a  │ 777c:000c │ 0568      │ 77a2:000c │ 77bd:000c
Function a     │ 0283 │ 01a5  │ 0167      │ 74f9:000e │ 74fd:000d │ 74fe:0003
Function main  │ 02c1 │ 01e3  │ 01ae      │ 74fe:0004 │ 7502:000c │ 7503:0009

Stack Organization
The Turbo C stack holds data whose lifetime is that of a function call: the function parameters and the automatic variables defined in the function body. To show how these items are laid out in a function's stack frame, consider the following function definition:
long f(char a, int b)
{
    int c;
    char d;
    ....
}

Whenever function f is called, the caller first pushes the arguments onto the stack in reverse order, that is, from right to left. In this example, b is pushed first and then a. Although parameter a is of type char, it is still pushed as 16 bits, because 80x86 machines have no 8-bit push instruction. After the parameters, the return address is pushed; it takes two or four bytes depending on whether the call instruction is near or far.

On entry, the called function f first pushes the current value of the BP register onto the stack and copies the value of the SP register into BP. It then creates the automatic variables of the function body on the stack in the reverse order of their declaration; in this example, d and then c. At this point the stack contents are as follows:
....                (higher addresses)
b
a
return address
saved BP            <- BP
d
c                   <- SP (lower addresses)

Handling BP and SP in this way serves three purposes. First, BP can be used as an address register to access both the parameters passed by the caller and the automatic variables of the called function, using addressing of the form [BP ± n]. On the 80x86, when BP is used as the address register, the default segment is SS rather than DS. Second, DS and the other address registers are left free to access data in the default data segment. Third, SP is left free for calling other functions from within the function body.

After function f completes its work, it places the return value in the appropriate location. A char return value is converted to int before being returned. A two-byte return value is returned in register AX, and a four-byte return value in the register pair DX:AX. A struct return value larger than 4 bytes is placed in a static variable, and what is actually returned is a pointer to that variable. A double return value is placed in the top-of-stack register of the math coprocessor, or in its equivalent in the coprocessor emulation package. Function f then copies BP to SP and pops the saved BP value from the stack back into the BP register. Finally, it executes a near or far return instruction to return to the caller. After the return, the caller must remove from the stack the parameters it pushed for the call.

The function call rules described above are known as the C calling convention. This process shows that the number of parameters passed by the caller and the number expected by the called function may differ. If the caller pushes too many parameters and the called function does not access the extra ones, the caller still removes all of them correctly after it regains control. If the caller pushes too few parameters, the called function may access stack contents that are not parameters at all, with unpredictable results. To avoid this difficulty when the number of parameters is variable, it is best to let the first parameter specify the number of parameters that follow.

A different set of function call rules is known as the Pascal calling convention. It differs from the C calling convention in two important ways. First, the parameters are pushed from left to right. Second, after the call, the parameters are removed from the stack by the called function rather than by the caller. The Pascal calling convention therefore requires that the caller and the called function agree exactly on the number of parameters. Incidentally, the Turbo Pascal language itself does not use the Pascal calling convention but a more elaborate stack format that allows the automatic variables of a function to be accessed from nested functions.

Heap Organization
As mentioned earlier, in small mode and medium mode the heap consists of a near heap and a far heap, which are handled differently. The near heap shares a segment with the stack; if the two meet, the default data segment is exhausted. The far heap uses all the space above the default data segment up to the end of conventional memory. To manage these two heaps, Turbo C provides two groups of functions:
Near heap   │ Far heap
coreleft    │ farcoreleft
realloc     │ farrealloc
malloc      │ farmalloc
free        │ farfree
calloc      │ farcalloc

The near heap functions on the left address each heap variable through a near pointer, and their size parameters are 16-bit unsigned values. The far heap functions on the right address each heap variable through a far pointer, and their size parameters are unsigned long.

In tiny mode there is no far heap. In compact, large, and huge modes there is only a single, undivided heap, organized like the far heap. In these three modes, however, both the near heap functions and the far heap functions can be used to access heap variables, because whichever function is used, all data pointers are far. If a near heap function is used, its size parameter must still be a 16-bit unsigned value. To handle memory blocks larger than 64 KB, the far heap functions must be used.

Allocation and release happen in arbitrary order, so heap variables are not necessarily contiguous in the heap. Turbo C manages them with a linked list. Each heap variable is preceded by a header that holds two items: the length of the variable and a pointer to the next heap variable. In the small data modes each header occupies 4 bytes; in the large data modes each header occupies 8 bytes. To see how allocation, release, and reallocation work in the heap, look at the output of the following demo program, heap.dem.

#include <stdio.h>
#include <alloc.h>              /* coreleft, malloc, realloc, free */
#define REPORT printf("coreleft = %u\n", coreleft())

void main()
{
    void *p, *q, *r;
    printf("");                            REPORT; p = malloc(1);
    printf("p = malloc(1) = %p; ", p);     REPORT; q = malloc(2);
    printf("q = malloc(2) = %p; ", q);     REPORT; p = realloc(p, 3);
    printf("p = realloc(p, 3) = %p; ", p); REPORT; r = malloc(1);
    printf("r = malloc(1) = %p; ", r);     REPORT; free(q);
    printf("free(q) ");                    REPORT; free(p);
    printf("free(p) ");                    REPORT;
}

The output of this program is as follows:
coreleft = 63952
p = malloc(1) = 0500; coreleft = 63946
q = malloc(2) = 0506; coreleft = 63940
p = realloc(p, 3) = 050c; coreleft = 63932
r = malloc(1) = 0500; coreleft = 63932
free(q) coreleft = 63932
free(p) coreleft = 63946

This demo program is compiled in small mode. First, coreleft reports the amount of memory available. Next, malloc creates the one-byte heap variable p and the two-byte variable q. Because space is always allocated in multiples of two bytes, the one-byte variable p actually occupies two bytes. Each heap variable also needs a 4-byte header, so each of these allocations reduces the available memory by 6 bytes. Then realloc expands p to three bytes, which requires a new allocation, and the returned pointer holds the new address 050c. The reallocated p occupies 8 bytes including its header. Although the 6 bytes it occupied before have now been released, coreleft still reports a decrease of 8 bytes rather than 2. This is because coreleft only reports the contiguous memory available after the last variable in the heap; in other words, once the heap is fragmented, the value reported by coreleft is no longer accurate. Next, the program allocates a one-byte variable r, which takes the 6-byte block that was first allocated to p and later released. The program then frees q, leaving a hole between r and p. Note that neither allocating r nor freeing q changes the value that coreleft reports. Finally, the program frees p. At this point the value reported by coreleft is accurate again, because only the variable r remains, at the start of the heap.

The following program, farheap.dem, demonstrates how to allocate an array a larger than 64 KB from the far heap. Array a consists of 9000 double elements, which require 72,000 bytes in total. The far pointer returned by farcalloc is cast to a huge pointer, which can then be used to access every element of the array.

#include <stdio.h>
#include <malloc.h>
void main()
{
    int i, n = 9000;
    double huge *a;
    double sum;
    a = (double huge *) farcalloc(n, sizeof(double));
    for (i = 0; i < n; i++)                 /* a[i] = i */
        a[i] = i;
    for (i = 0, sum = 0; i < n; i++)        /* add up all elements */
        sum += a[i];
    printf("a[i] = i for i = 0..n-1; n = %d\n", n);

    printf("sum of all a[i] = %8.0f\n", sum);
    printf("(n-1)n/2 = %ld\n", (long) n * (n - 1) / 2);
}

Other memory operation functions
Turbo C also provides many functions for copying, comparing, setting, and searching memory. These functions are declared in the mem.h header file. In general they are not tied to any data structure but operate directly on memory. They can work on plain bytes, or carry out operations on data structures that the C language does not support directly, such as assigning one array to another or comparing arrays or C structures. The following five Turbo C functions copy data from one memory area to another:
void *_Cdecl memccpy(void *dest, const void *src, int c, size_t n);
void *_Cdecl memcpy(void *dest, const void *src, size_t n);
void *_Cdecl memmove(void *dest, const void *src, size_t n);
void _Cdecl movmem(void *src, void *dest, unsigned length);
void _Cdecl movedata(unsigned srcseg, unsigned srcoff, unsigned dstseg, unsigned dstoff, size_t n);

The memcpy function copies n bytes from the source src to the destination dest. If the source and destination overlap, the result is not necessarily correct.

The memccpy function is similar to memcpy, except that if the bytes being copied include the character c, copying stops after that character has been copied. The returned pointer then points to the byte in dest immediately after the copied character. If all n bytes are copied without encountering c, the returned pointer is NULL.

The memmove and movmem functions also copy memory, but both handle overlapping source and destination correctly. Unlike the other functions, which follow the usual "destination, source" parameter order, movmem takes the source first and the destination second.

In small and medium modes, the source and destination pointers accepted by the first four copy functions can only be near pointers, so these functions cannot copy arrays that lie in a far data segment. The movedata function overcomes this limitation by letting you specify the segment address and offset of both source and destination. It does not handle overlapping source and destination, and it takes the source parameters first and the destination parameters second.

The utmovmem function in Turbo C Tools is similar to the Turbo C movedata function, but it handles overlap between source and destination automatically.
void utmovmem(const char far *psource, char far *ptarget, unsigned int length);

The Turbo C functions used for memory comparison are as follows:
int _Cdecl memcmp(const void *s1, const void *s2, size_t n);
int _Cdecl memicmp(const void *s1, const void *s2, size_t n);

Both functions compare the first n bytes of two byte arrays. The return value is less than 0, equal to 0, or greater than 0, depending on whether s1 is less than, equal to, or greater than s2. The function memcmp performs an exact comparison in which each byte is treated as an unsigned 8-bit number, while memicmp treats each byte as a character and ignores the difference between upper and lower case.

The Turbo C functions used for setting memory are as follows:
void *_Cdecl memset(void *s, int c, size_t n);
void _Cdecl setmem(void *dest, unsigned length, char value);
Both functions set a memory area to a given byte value. Their parameter order and return values differ, but their effect on memory is the same.

The Turbo C function used to find a character in the first n bytes of a memory block is memchr:
void *_Cdecl memchr(const void *s, int c, size_t n);
If the character c is found, the returned pointer points to its first occurrence. If it is not found, the returned pointer is NULL.

This article is an English version of an article originally published in Chinese on aliyun.com.
