Programmers who develop software in C/C++ often run into two problems. Sometimes a program compiles cleanly, but the linker reports that a function cannot be found (the classic LNK2001 error). At other times the program compiles and links without errors, but a stack exception occurs as soon as a function in a library is called. These symptoms usually appear when C and C++ code are mixed, or when a C++ program uses a third-party library that was not written in C++. In both cases the cause is a mismatch in the calling convention or in the decorated name rules. The calling convention determines the order in which function arguments are pushed onto the stack and whether the caller or the called function is responsible for removing them. The name decoration rules determine how the compiler alters function names so that different functions can be told apart. If the calling conventions or the decorated names of two pieces of code do not match, the problems above occur. This article explains the calling conventions and name decoration rules of C and C++, compares them, and shows how the problems above arise.
Calling Convention
The calling convention determines not only the order in which arguments are pushed onto the stack, but also whether the caller or the called function removes them and restores the stack. There are many calling conventions. Besides the common __cdecl, __fastcall, and __stdcall, C++ compilers also support thiscall, and many C/C++ compilers support the naked call as well. So many conventions tend to confuse programmers: when is each one used? The following sections describe them in turn.
1. __cdecl
The compiler command-line option is /Gd. __cdecl is the default calling convention of the C/C++ compiler: every function that is not a C++ member function and is not declared __stdcall or __fastcall uses __cdecl. It is the C calling convention: arguments are pushed onto the stack from right to left, and the caller removes them. Because the compiler must generate stack cleanup code at every call site, a program compiled with __cdecl is larger than the same program compiled with __stdcall. On the other hand, because the caller does the cleanup, __cdecl supports variable argument lists. For example, printf and the Windows API wsprintf both use __cdecl. For C functions, the __cdecl decoration is an underscore prefixed to the function name. C++ functions use a different decoration scheme unless they are declared extern "C".
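As a minimal sketch of why caller cleanup matters for variable argument lists, the following C++ function sums a caller-chosen number of int arguments. The names DEMO_CDECL and SumInts are hypothetical and do not come from any library; the macro spells out __cdecl only on Microsoft compilers, where that keyword exists, and expands to nothing elsewhere.

```cpp
#include <cassert>
#include <cstdarg>

// Hypothetical macro: write __cdecl explicitly only where the keyword exists
// (Microsoft compilers); other compilers fall back to their default convention.
#if defined(_MSC_VER)
#  define DEMO_CDECL __cdecl
#else
#  define DEMO_CDECL
#endif

// Sums `count` int arguments. The function cannot know at compile time how
// many arguments each call site pushes, so only the caller can clean them up.
int DEMO_CDECL SumInts(int count, ...)
{
    assert(count >= 0);

    va_list args;
    va_start(args, count);
    int total = 0;
    for (int i = 0; i < count; ++i) {
        total += va_arg(args, int);   // read the next int argument
    }
    va_end(args);
    return total;
}
```

A call such as SumInts(3, 1, 2, 3) passes four arguments, while SumInts(0) passes one; a callee-cleanup convention with a fixed cleanup size could not serve both call sites.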
2. __fastcall
The compiler command-line option is /Gr. The __fastcall convention passes arguments in registers whenever possible. Typically the first two arguments of DWORD size or smaller are passed in ECX and EDX, and the remaining arguments are pushed onto the stack from right to left. The called function removes the arguments from the stack before it returns. The compiler decorates the name with two @ characters: one before the function name and one after it, followed by the size of the parameter list in bytes in decimal, for example @function_name@number. Note that __fastcall may be implemented differently by different compilers, for example by 16-bit and 32-bit compilers. In addition, inline assembly code must not conflict with the registers the compiler uses.
3. __stdcall
The compiler command-line option is /Gz. __stdcall takes over the role of the PASCAL calling convention (like PASCAL, it has the called function clean up the stack), and most Windows API functions use it. Arguments are pushed onto the stack from right to left and are passed by value unless they have pointer or reference type. The called function removes the arguments from the stack. For C functions, the __stdcall decoration is an underscore before the function name, followed by @ and the size of the parameter list in bytes, for example _functionname@number.
4. thiscall
thiscall is used only for calling C++ member functions. Arguments are pushed onto the stack from right to left, and the this pointer of the object is passed in the ECX register. In the compilers this article describes, thiscall is not a C++ keyword and cannot be written in a function declaration; only the compiler applies it.
5. naked call
With the conventions above, the compiler automatically adds code, where necessary, that saves the ESI, EDI, EBX, and EBP registers on function entry and restores them on exit. A function declared naked gets no such code, which is why it is called naked. naked is not a type modifier, so it must be used together with __declspec.
By default, the VC build environment uses the __cdecl convention. You can change the default in the Project Settings dialog: select the "C/C++" tab and then the "Code Generation" category. You can also write the keyword __stdcall, __cdecl, or __fastcall in a function declaration, directly before the function name, to set the convention for that function alone. Windows development commonly uses the WINAPI macro, which expands to the appropriate calling convention for the build settings. In Win32 it is defined as __stdcall.
Decorated Name
The decorated name of a function is a string that the compiler creates during compilation to identify the definition or prototype of the function. LINK and other tools sometimes need the decorated name to locate the function. In most cases programmers do not need to know the decorated name, because LINK and the other tools resolve it automatically. In some cases, however, you have to specify it. In a C++ program, for example, you must give the decorated name of overloaded functions and of certain special functions (such as constructors and destructors) so that LINK or other tools can match the right function. You also need the decorated name when you call a C or C++ function from assembly code. If the function name, calling convention, return type, or parameters change, the old decorated name is no longer valid and a new one must be specified. C and C++ decorate names differently; the two schemes are described below.
1. Name decoration rules of the C compiler
For the __stdcall convention, the compiler prefixes the function name with an underscore and appends an "@" followed by the number of bytes occupied by its parameters, for example _functionname@number. For __cdecl, only the underscore prefix is added, for example _functionname. For __fastcall, an "@" is placed before the function name, and another "@" followed by the number of bytes occupied by its parameters is appended, for example @functionname@number.
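The three C rules above can be sketched as a small string-building function. This is only an illustration of the rules as stated in this article, not a compiler API; the names Convention and DecorateC are hypothetical, and the parameter size is supplied by the caller rather than computed from a real declaration.

```cpp
#include <cassert>
#include <string>

// The three conventions covered by the C decoration rules above.
enum class Convention { Cdecl, Stdcall, Fastcall };

// Builds the decorated C name for a 32-bit x86 target.
// `paramBytes` is the total size of the parameter list in bytes
// (it is ignored for Cdecl, which adds only the underscore).
std::string DecorateC(const std::string& name, Convention conv, unsigned paramBytes)
{
    switch (conv) {
    case Convention::Cdecl:
        return "_" + name;
    case Convention::Stdcall:
        return "_" + name + "@" + std::to_string(paramBytes);
    case Convention::Fastcall:
        return "@" + name + "@" + std::to_string(paramBytes);
    }
    assert(false && "unknown convention");
    return name;
}
```

Under these assumptions, DecorateC("MakeFun", Convention::Stdcall, 4) yields "_MakeFun@4" for a function that takes a single 4-byte long.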
2. Name decoration rules of the C++ compiler
The C++ decoration rules are more complicated, but the decorated name carries more information. By analyzing it you can determine the calling convention, the return type, the number of parameters, and even the parameter types. For __cdecl, __fastcall, and __stdcall alike, the decorated name starts with "?", followed by the function name, then the marker that starts the parameter table, and then the parameter table itself, spelled out in type codes. The marker that starts the parameter table is "@@YG" for __stdcall, "@@YA" for __cdecl, and "@@YI" for __fastcall. The type codes used in the parameter table are as follows:
X -- void
D -- char
E -- unsigned char
F -- short
H -- int
I -- unsigned int
J -- long
K -- unsigned long (DWORD)
M -- float
N -- double
_N -- bool
U -- struct
....
Pointers are handled specially: "PA" denotes a pointer and "PB" denotes a pointer to const, and the code that follows gives the pointed-to type. If a pointer of the same type appears again consecutively, it is replaced by "0"; each "0" stands for one repetition. "U" denotes a struct type and is usually followed by the name of the struct, with "@@" marking the end of the struct type name. The return type receives no special treatment. It is encoded in the same way as a parameter and placed immediately after the marker that starts the parameter table, so the first entry of the parameter table is actually the return type. After the parameter table the name ends with "@Z"; if the function takes no parameters, the parameter table is "X" (void) and the name ends with just "Z". Two examples follow, both assuming the __stdcall convention. For the function declaration:
int Function1(char *var1, unsigned long);
the decorated name is "?Function1@@YGHPADK@Z", and for the function declaration:
void Function2();
the decorated name is "?Function2@@YGXXZ".
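To make the rules for non-member C++ functions concrete, here is a simplified sketch that assembles a decorated name from type codes the caller has already looked up in the table above. The names CppConvention and DecorateFreeFunction are hypothetical. The sketch deliberately ignores the "0" repetition rule, namespaces, and every type not listed in the table, so it illustrates the scheme as this article states it and does not replace the compiler's real algorithm.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum class CppConvention { Cdecl, Stdcall, Fastcall };

// Assembles "?name" + marker + return code + parameter codes + terminator.
// `returnCode` and `paramCodes` use the type codes from the table above,
// for example "H" for int or "PAD" for char*.
std::string DecorateFreeFunction(const std::string& name, CppConvention conv,
                                 const std::string& returnCode,
                                 const std::vector<std::string>& paramCodes)
{
    std::string marker;
    switch (conv) {
    case CppConvention::Cdecl:    marker = "@@YA"; break;
    case CppConvention::Stdcall:  marker = "@@YG"; break;
    case CppConvention::Fastcall: marker = "@@YI"; break;
    }
    assert(!marker.empty());

    std::string result = "?" + name + marker + returnCode;
    if (paramCodes.empty()) {
        return result + "XZ";        // no parameters: "X" (void), then "Z"
    }
    for (const std::string& code : paramCodes) {
        result += code;              // append each parameter's type code
    }
    return result + "@Z";            // "@Z" closes a non-empty parameter table
}
```

With these assumptions, DecorateFreeFunction("Function1", CppConvention::Stdcall, "H", {"PAD", "K"}) returns "?Function1@@YGHPADK@Z", and calling it for Function2 with return code "X" and no parameters returns "?Function2@@YGXXZ".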
For C++ class member functions (whose calling convention is thiscall), decoration differs slightly from that of non-member C++ functions. First, the class name, introduced by an "@" character, is inserted between the function name and the parameter table. Second, the marker that starts the parameter table is different: it is "@@QAE" for public member functions, "@@IAE" for protected member functions, and "@@AAE" for private member functions. If the function is declared const, the markers become "@@QBE", "@@IBE", and "@@ABE" respectively. If a parameter is a reference to an instance of the class, it is encoded as "AAV1"; a const reference is encoded as "ABV1". The class CTest below illustrates the rules for C++ member functions:
class CTest
{
    ......
private:
    void Function(int);
protected:
    void CopyInfo(const CTest &src);
public:
    long DrawText(HDC hdc, long pos, const TCHAR *text, RGBQUAD color, BYTE bUnder, bool bSet);
    long InsightClass(DWORD dwClass) const;
    ......
};
For the member function Function, the decorated name is "?Function@CTest@@AAEXH@Z"; the string "@@AAE" shows that it is a private function. The member function CopyInfo has a single parameter, a const reference to class CTest, and its decorated name is "?CopyInfo@CTest@@IAEXABV1@@Z". DrawText is a more complicated declaration: it has not only a string parameter but also a struct parameter and an HDC handle parameter. Note that HDC is actually a pointer to the struct HDC__, so this parameter is encoded as "PAUHDC__@@". The complete decorated name is "?DrawText@CTest@@QAEJPAUHDC__@@JPBDUtagRGBQUAD@@E_N@Z". InsightClass is an ordinary const function, so its marker is "@@QBE" and its complete decorated name is "?InsightClass@CTest@@QBEJK@Z".
Neither C nor C++ decoration changes the case of the characters in the function name. This differs from the PASCAL calling convention, under which function names are exported undecorated and entirely in uppercase.
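The member-function markers can be sketched in the same spirit. The function below is a hypothetical helper (Access and DecorateMember are invented names) that covers only non-static, non-virtual thiscall members and expects the caller to supply ready-made type codes; it is a simplified illustration of the rules described in this article, not the compiler's full scheme.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum class Access { Private, Protected, Public };

// Builds "?name@Class@@" + access letter + const letter + "E" + type codes.
std::string DecorateMember(const std::string& name, const std::string& className,
                           Access access, bool isConst,
                           const std::string& returnCode,
                           const std::vector<std::string>& paramCodes)
{
    assert(!name.empty() && !className.empty());

    char accessCode = 'A';                       // private
    if (access == Access::Protected) accessCode = 'I';
    if (access == Access::Public)    accessCode = 'Q';

    std::string result = "?" + name + "@" + className + "@@";
    result += accessCode;
    result += isConst ? 'B' : 'A';               // "B" marks a const member
    result += 'E';                               // closes the marker, e.g. "@@QAE"
    result += returnCode;

    if (paramCodes.empty()) {
        return result + "XZ";                    // no parameters
    }
    for (const std::string& code : paramCodes) {
        result += code;
    }
    return result + "@Z";
}
```

For the CTest class, DecorateMember("Function", "CTest", Access::Private, false, "X", {"H"}) gives "?Function@CTest@@AAEXH@Z", and DecorateMember("InsightClass", "CTest", Access::Public, true, "J", {"K"}) gives "?InsightClass@CTest@@QBEJK@Z".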
3. Viewing decorated names
There are two ways to inspect the decorated names in your program: a compiler listing file or the dumpbin tool. Compiling with the /FAc, /FAs, or /FAcs command-line option makes the compiler produce a listing that shows the decorated function and variable names. Alternatively, the command dumpbin.exe /SYMBOLS lists the function and variable names in an .obj or .lib file. In addition, undname.exe converts a decorated name back to its undecorated form.
Common problems caused by mismatched calling conventions and name decoration rules
A stack exception during a function call is usually caused by mismatched calling conventions. For example, the dynamic link library a.dll exports the following function:
long MakeFun(long lFun);
The library is built with the __stdcall convention, so inside a.dll MakeFun uses __stdcall: arguments are pushed from right to left, and the function restores the stack itself when it returns. Now suppose program module B needs to call MakeFun in a.dll. Both B and a.dll are compiled as C++, but module B is built with __cdecl as its default convention. Because B includes the declaration of MakeFun from the header file supplied with a.dll, every function in B that calls MakeFun treats it as a __cdecl function and restores the stack after the call, even though MakeFun has already restored the stack before returning. The stack pointer in module B is therefore wrong after the call, which causes the stack exception. The visible symptom is that the call itself works (because the arguments are pushed in the same order) and MakeFun does its job, but an error is raised after the function returns. The fix is simple: build both modules with the same calling convention.
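One way to keep the two modules in agreement, sketched here as an assumption and not as something the article prescribes, is to write the convention into the shared header itself so that it no longer depends on each module's compiler settings. A_CALL, MakeFunPtr, and CallMakeFun are hypothetical names, and the body of MakeFun is a stand-in so that the sketch is complete; the real body would live in a.dll.

```cpp
#include <cassert>

// Hypothetical macro for the library's public header: __stdcall on 32-bit
// Microsoft builds, nothing elsewhere (other targets use their default).
#if defined(_MSC_VER) && defined(_M_IX86)
#  define A_CALL __stdcall
#else
#  define A_CALL
#endif

// Declaration as it would appear in the shared header. The convention is now
// part of the declaration, so the default chosen when B is built cannot change it.
long A_CALL MakeFun(long lFun);

// The convention is also part of the function pointer type, which matters
// when the function is reached through a pointer instead of a direct call.
typedef long (A_CALL *MakeFunPtr)(long);

long CallMakeFun(MakeFunPtr fn, long value)
{
    assert(fn != nullptr);
    return fn(value);   // caller and callee agree on who cleans the stack
}

// Stand-in definition (placeholder behavior) so the sketch is self-contained.
long A_CALL MakeFun(long lFun)
{
    return lFun + 1;
}
```

With the placeholder body above, MakeFun(41) and CallMakeFun(&MakeFun, 41) both return 42, whichever default convention the calling module was built with.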
With the calling conventions and name decoration rules in mind, it is easy to explain the LNK2001 error that often occurs when a C++ program uses a library compiled as C. Take the two modules from the previous example. This time both are compiled with the __stdcall convention, but a.dll is compiled as C, so the name of MakeFun recorded in a.lib, the import library of a.dll, is "_MakeFun@4". B includes the declaration of MakeFun from the header file supplied with a.dll, but because B is compiled as C++, it refers to MakeFun by the name "?MakeFun@@YGJJ@Z". Compilation succeeds. At link time, the linker searches a.lib for "?MakeFun@@YGJJ@Z", finds only "_MakeFun@4", and reports:
error LNK2001: unresolved external symbol ?MakeFun@@YGJJ@Z
The solution is simple: tell module B that the function was compiled as C, which is exactly what extern "C" does. A library written in C should anticipate that its clients may be C++ programs (built with a C++ compiler) and design its header files accordingly. The declaration in the header file should look like this:
#ifdef __cplusplus
extern "C" {
#endif
long MakeFun(long lFun);
#ifdef __cplusplus
}
#endif
With this declaration the C++ compiler knows that the decorated name of MakeFun is "_MakeFun@4", and the link error disappears.
Many people wonder why error LNK2001 still occurs when every module is built with the same VC compiler. The reason is that the VC compiler chooses the language from the source file extension: a file with the extension ".c" is compiled as C, and a file with the extension ".cpp" is compiled as C++. The most reliable approach is therefore to use extern "C".