Preface:
The importance of function overloading goes without saying, but do you know how it is implemented in C++? (This article describes the C++ implementation, though I expect other languages to be similar.) The question breaks down into the following two problems:
- 1. How are name conflicts resolved when overloaded functions are declared or defined? (Besides function overloading, using is another way to deal with naming conflicts. There are many other ways as well, which are not discussed here.)
- 2. How is a call to an overloaded function resolved? (That is, how do we know which function is called?)
These two problems must be solved in any language that supports function overloading! With these two questions in mind, let's begin. The main content of this article is as follows:
- 1. Introductory example
- What is function overloading (what)?
- Why is function overloading needed (why)?
- 2. How does the compiler resolve naming conflicts?
- Why does function overloading not consider the return type?
- 3. Call matching of overloaded functions
- 4. How does the compiler resolve calls to overloaded functions?
- Determine the candidate function set by function name
- Determine the viable functions
- Determine the best matching function
- 5. Summary
1. Introductory example
1.1. What is function overloading (what)?
Function overloading means that, within the same scope, a group of functions can share the same function name while having different parameter lists. Such a group is called overloaded functions. Overloading is usually used to name a group of functions that do similar things. This reduces the number of function names, avoids polluting the namespace, and greatly benefits the readability of the program.
When two or more different declarations are specified for a single name in the same scope, that name is said to be overloaded. By extension, two declarations in the same scope that declare the same name but with different types are called overloaded declarations. Only function declarations can be overloaded; object and type declarations cannot be overloaded. -- from the ANSI C++ standard, p. 290
Consider the following example: a print function that can print both an int and a string. In C++ we can write:
    #include <iostream>
    #include <string>
    using namespace std;

    void print(int i) {
        cout << "Print a integer: " << i << endl;
    }

    void print(string str) {
        cout << "Print a string: " << str << endl;
    }

    int main() {
        print(12);
        print("Hello world!");
        return 0;
    }
With the code above, print(int) or print(string) is called depending on the argument passed to print(). print(12) calls print(int), and print("Hello world!") calls print(string). To check this, compile the program with g++ test.c and run it.
1.2. Why is function overloading needed (why)?
- Imagine there were no overloading mechanism. In C, for example, you have to give each print function a different name, such as print_int and print_string. That is only two cases. With many more, such as printing long, char*, and arrays of various types, you need many names for functions that do the same thing. That is unfriendly!
- The constructors of a class all carry the class name, so they all have the same name. Without function overloading, constructing objects in different ways would be quite cumbersome!
- Operator overloading is essentially function overloading. It greatly enriches the meaning of existing operators and is convenient to use. For example, + can be used to concatenate strings!
The introduction above should have refreshed your memory of function overloading. Next we analyze how C++ implements the overloading mechanism.
2. How does the compiler resolve naming conflicts?
To understand how the compiler handles these overloaded functions, disassemble the executable generated above and read the assembly code. (All experiments in this article were done on Linux; Windows is similar. You can also refer to the article "thinking about a simple question", which disassembles on both Linux and Windows and points out the differences between them.) Run the command objdump -d a.out > log.txt to disassemble the executable and redirect the result to log.txt, then analyze that file.
Find the function void print(int i) in the compiled output: its symbol name has become _Z5printi.
Find the function void print(string str) in the compiled output: its symbol name has become _Z5printSs.
We can see that after compilation the overloaded functions are no longer named print! So there is no name conflict, but a new question arises: what is the renaming mechanism, that is, how is the signature of an overloaded function mapped to a new identifier? My first guess is function name + parameter list, because overloading depends on the types and number of the parameters and not on the return type. Let's check this against the following mapping:
void print(int i)      --> _Z5printi
void print(string str) --> _Z5printSs
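Reading the names more closely: _Z is a fixed prefix that marks a mangled name, 5 is the length of the identifier that follows, print is the function name, i stands for int, and Ss stands for std::string. It is tempting to read the digit as a code for the return type, but it is only the length of the name. So the mapping is function name + parameter list. The main function calls the corresponding functions through _Z5printi and _Z5printSs: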
80489bc: e8 73 ff ff ff    call 8048934 <_Z5printi>
...............
80489f0: e8 7a ff ff ff    call 804896f <_Z5printSs>
Let's write a few more overloaded functions to verify this, such as:
void print(long l) --> _Z5printl
void print(char c) --> _Z5printc
The pattern is int -> i, long -> l, char -> c, string -> Ss, and so on: built-in types are basically represented by their first letter. Now, does the return type of a function affect the mangled name? For example:
    #include <iostream>
    using namespace std;

    int max(int a, int b) {
        return a >= b ? a : b;
    }

    double max(double a, double b) {
        return a >= b ? a : b;
    }

    int main() {
        cout << "max int is: " << max(1, 3) << endl;           // example arguments
        cout << "max double is: " << max(1.2, 1.3) << endl;    // example arguments
        return 0;
    }
int max(int a, int b) maps to _Z3maxii, and double max(double a, double b) maps to _Z3maxdd. The number after _Z changed from 5 to 3 because the name max has three characters, not because of the return type: the two functions return different types, yet their mangled names differ only in the parameter codes ii and dd. For ordinary functions, then, g++ does not encode the return type at all. We will not study the mapping in more detail, because it is compiler-specific; everything above is based on g++. Other compilers, such as the one in Visual Studio, use a different scheme, and some schemes record more information, but the core rule is the same: the new name is built from the function name and the parameter list.
Why does function overloading not take the return type into account? -- This is to keep the resolution of an operator or function call independent of its context. See the following example.
    float sqrt(float);
    double sqrt(double);

    void f(double da, float fla) {
        float fl = sqrt(da);   // calls sqrt(double)
        double d = sqrt(da);   // calls sqrt(double)
        fl = sqrt(fla);        // calls sqrt(float)
        d = sqrt(fla);         // calls sqrt(float)
    }
If the return type were taken into account in overload resolution, it would no longer be possible to decide which function to call independently of the context.
At this point the analysis seems complete, but we have overlooked an important restriction on function overloading: scope. The overloads introduced above are global functions. Next let's look at overloading inside a class, calling print through a class object so that different functions are called depending on the arguments:
    #include <iostream>
    using namespace std;

    class test {
    public:
        void print(int i) { cout << "int" << endl; }
        void print(char c) { cout << "char" << endl; }
    };

    int main() {
        test t;
        t.print(1);
        t.print('a');
        return 0;
    }
Now let's look at the names the print functions are mapped to:
void print(int i)  --> _ZN4test5printEi
void print(char c) --> _ZN4test5printEc
Note the N4test...E part, which expresses the scope: N and E bracket a qualified name, and 4test is the class name test preceded by its length. A namespace would be encoded in the same way. This shows that the most accurate description of the mapping is: scope + function name + parameter list.
3. Call matching of overloaded functions
We have now solved the naming conflict problem of overloaded functions. Once the overloads are defined, how is a call by name resolved? To decide which overload fits best, the following rules are applied in order:
- Exact match: the arguments match without conversion, or with only trivial conversions such as array name to pointer, function name to pointer to function, and T to const T;
- Match using promotions: integral promotions (such as bool to int, char to int, short to int) and float to double;
- Match using standard conversions: int to double, double to int, double to long double, Derived* to Base*, T* to void*, int to unsigned int;
- Match using user-defined conversions;
- Match using the ellipsis, as with the ellipsis parameter of printf.
If more than one matching function is found at the highest level at which any match exists, the call is rejected as ambiguous. See the following example:
    void print(int);
    void print(const char*);
    void print(double);
    void print(long);
    void print(char);

    void h(char c, int i, short s, float f) {
        print(c);    // exact match: calls print(char)
        print(i);    // exact match: calls print(int)
        print(s);    // integral promotion: calls print(int)
        print(f);    // float to double promotion: calls print(double)
        print('a');  // exact match: calls print(char)
        print(49);   // exact match: calls print(int)
        print(0);    // exact match: calls print(int)
        print("a");  // exact match: calls print(const char*)
    }
Having too few or too many overloads can both lead to ambiguity. See the following example:
    void f1(char);
    void f1(long);
    void f2(char*);
    void f2(int*);

    void k(int i) {
        f1(i);   // ambiguous: f1(char) or f1(long)?
        f2(0);   // ambiguous: f2(char*) or f2(int*)?
    }
In this case the compiler reports an error and leaves it to the user to resolve, for example by calling with an explicit type conversion such as f2(static_cast<int*>(0)). This is ugly, of course, and the conversion has to be changed whenever you want to call a different overload. The examples above have a single parameter. Now let's look at one with two parameters:
    int pow(int, int);
    double pow(double, double);

    void g() {
        double d = pow(2.0, 2);  // ambiguous: pow(int(2.0), 2) or pow(2.0, double(2))?
    }
4. How does the compiler resolve calls to overloaded functions?
To resolve a call to an overloaded function, the compiler first finds the candidate functions with the same name and then picks the most suitable one among them. If none can be found, it reports an error. The following describes one way of implementing overload resolution: when the compiler processes a call to an overloaded function, the parser, the C++ grammar, the symbol table, and the abstract syntax tree interact with one another. The four steps are roughly as follows:
- The parser matches a function call in the grammar and obtains the function name;
- It obtains the type of each argument expression;
- The parser looks up the overloaded function, and the symbol table performs overload resolution and returns the best function;
- The parser builds the abstract syntax tree and binds the best function stored in the symbol table to it.
Next we focus on overload resolution, which must follow the matching order and rules described in "3. Call matching of overloaded functions". Overload resolution can be roughly divided into three steps:
- Determine the candidate function set by function name
- Select the set of viable functions from the candidate set
- Determine the best function from the viable set, or report an error because of ambiguity.
4.1 Determine the candidate function set by function name
Based on the function name, collect all functions with that name in the same scope. They must be visible at the call (consider private, protected, public, and friend). "Same scope" is also a restriction in the definition of function overloading: functions in different scopes do not overload one another. Consider the following code:
    void f(int);

    void g() {
        void f(double);
        f(1);   // calls f(double), not f(int)
    }
That is, a function in an inner scope hides functions with the same name in outer scopes! Likewise, a member function of a derived class hides base-class functions with the same name. This is easy to understand, because variable access works the same way: to access a global variable that is hidden by a local variable of the same name inside a function body, use the :: operator.
The candidate function set is generally searched with a depth-first search algorithm:
Step 1: Starting from the point of the function call, search outward scope by scope for visible candidate functions.
Step 2: If the candidates collected in the previous step are not in a user-defined namespace, also take candidate functions from the namespaces introduced by the using mechanism. Otherwise, the search ends.
When collecting candidates, if the arguments of the call are of non-class types, the candidates are only the functions visible at the call point. If the argument types include a class object, a pointer to a class type, a reference to a class type, or a pointer to a class member, the candidate set is the union of the following sets:
- (1) functions visible at the call point;
- (2) functions declared in the namespace in which the class type, or a base class of it, is defined;
- (3) friend functions of the class or of its base classes.
The following example makes this more concrete:
    void f();
    void f(int);
    void f(double, double = 314);

    namespace N {
        void f(char3, char3);   // char3: a type declared elsewhere
    }

    class A {
    public:
        operator double() { return 0.0; }
    };

    int main() {
        using namespace N;   // using directive
        A a;
        f(a);
        return 0;
    }
Following the method above, since the argument is a class object, the candidate functions are collected in three steps:
(1) Search for declarations of f starting from the scope of main, where the call is located. None is found there, so the search moves to the scope enclosing main. The three declarations of f in the global scope are found and put into the candidate set;
(2) collect f(char3, char3) from the namespace N named by the using directive;
(3) consider two further sets: first, the functions declared in the namespace in which the class, or a base class of it, is defined; second, the friend functions of the class or of its base classes. In this example these two sets contribute no additional functions.
The final candidate set consists of the four functions f listed above.
4.2 Determine the viable functions
A viable function is a candidate whose number of parameters is compatible with the number of arguments in the call, and for which every argument has an implicit conversion sequence to the corresponding parameter. For a call with M arguments, a candidate is compatible in number in the following cases:
- (1) the candidate has exactly M parameters;
- (2) the candidate has fewer than M parameters, which is acceptable only if its parameter list ends with an ellipsis;
- (3) the candidate has more than M parameters, which is acceptable only if every parameter from the (M+1)-th onward has a default value.
If the viable set is empty, the function call fails.
These rules are already reflected in "3. Call matching of overloaded functions" above.
4.3 Determine the best matching function
After the viable functions are determined, compute for each function in the viable set the priority of the conversions needed to turn the arguments of the call into its parameters, and select the function with the highest priority. For example, weights can be assigned in order to the matching rules described in "3. Call matching of overloaded functions", a total priority computed for each function, and the best function selected.
5. Summary
This article described what function overloading is, why it is needed, how the compiler resolves the conflict between identical function names, and how the compiler resolves calls to overloaded functions. After reading it, you should have a clearer picture of overloading in C++. Note: the name mapping mechanism described here is that of the g++ compiler, and other compilers map names somewhat differently. Likewise, the way of resolving calls to overloaded functions described here is only one of the approaches compilers use. If you are interested in a particular compiler, please study it yourself.
Finally, here are two questions for you:
- 1. In C++, + can add two ints, add floating-point numbers, and concatenate strings. That is operator overloading, isn't it? Now consider C, where + can add two ints and can also add floating-point numbers. Isn't that operator overloading too?
- 2. What about overloading of templates? How is a function template matched against an ordinary function during a call?
Appendix: a C++ function overloading mechanism
This mechanism was proposed and implemented by Zhang Suqin and others, who wrote a C++ compilation system called CoC++ (part of a set of C, C++, and FORTRAN compilation systems for UNIX operating systems, developed on Chinese-made machines and copyrighted in China; they conform to the ISO C90, AT&T C++ 85, and ISO Fortran 90 standards respectively). Function overload handling in CoC++ consists of two sub-processes:
- 1. While processing function declarations, the compilation system builds a linked list of function declaration prototypes and, following the renaming rules, records each function's new name in that list (the rules are similar to those described above; only the characters used for int, char, and so on differ).
Figure 1. Process 1: building the function linked list (Note: the encoded function name has the format <original function name>_<renamed scope name><encoded parameter list>, which is slightly different from g++)
- 2. While translating a function call statement, the system accesses the symbol table, finds the prototype linked list of the corresponding function declarations, finds the best matching function node according to the type matching rules, and outputs its new name. The algorithm flow for building the prototype linked list is shown in Figure 1, and the algorithm flow for translating a function call statement is shown in Figure 2.
Figure 2. Process 2: searching the linked list for an overloaded function call
Appendix: how is a function template matched against an ordinary function during a call?
Below is the answer from Bjarne Stroustrup, the creator of C++:
1) Find the set of function template specializations that will take part in overload resolution.
2) If two template functions can be called and one is more specialized than the other, consider only the most specialized template function in the following steps.
3) Do overload resolution for this set of functions, plus any ordinary functions as for ordinary functions.
4) If a function and a specialization are equally good matches, the function is preferred.
5) If no match is found, the call is an error.