The importance of function overloading hardly needs stating, but do you know how it is implemented in C++? (This article is about the implementation in C++, though I think other languages are similar.) The question can be broken down into the following two parts:
- 1. How are naming conflicts resolved when overloaded functions are declared or defined? (Besides function overloading, using is one way to resolve naming conflicts, and there are many others; they are not discussed here.)
- 2. When we call an overloaded function, how is the call resolved? (That is, how does the compiler know which function to call?)
Any language that supports function overloading must solve these two problems. With these two questions in mind, we begin the discussion. The main contents of this article are as follows:
- 1. Introduction of examples (phenomena)
- What is function overloading (what)?
- Why do you need function overloading (why)?
- 2. How does the compiler resolve naming conflicts?
- Why does function overloading not consider the return type?
- 3. Call matching for overloaded functions
- 4. How does the compiler parse an overloaded function call?
- Determine the set of candidate functions based on the function name
- Determining available functions
- Determining the best matching function
- 5. Summary
1. Introduction of examples (phenomena)
1.1. What is function overloading (what)?
Function overloading means that, within the same scope, there can be a set of functions with the same name but different parameter lists; these are called overloaded functions. Overloaded functions are typically used to name a set of functions that do similar things, which reduces the number of function names, avoids polluting the namespace, and greatly helps the readability of the program.
"When two or more different declarations are specified for a single name in the same scope, that name is said to be overloaded. By extension, two declarations in the same scope that declare the same name but with different types are called overloaded declarations. Only function declarations can be overloaded; object and type declarations cannot be overloaded." (Excerpt from the ANSI C++ standard, p. 290)
Take a look at the following example: implement a print function that can print either an int or a string. In C++, we can do this:
#include <iostream>
#include <string>
using namespace std;
// Overload for integers
void print (int i)
{
    cout << "print an integer:" << i << endl;
}
// Overload for strings
void print (string str)
{
    cout << "print a string:" << str << endl;
}
int main ()
{
    print (12);             // calls print (int)
    print ("hello world!"); // calls print (string)
    return 0;
}
With the code above, print (int) or print (string) is selected according to the argument passed to print (). Here print (12) calls print (int), and print ("hello world!") calls print (string). (First compile with g++ test.c, then run the resulting executable.)
1.2. Why do you need function overloading (why)?
Imagine there were no function overloading, as in C. You would have to give each print function a different name, such as print_int and print_string. There are only two cases here; with more (say, printing long, char *, arrays of various types, and so on) you would need a separate name for every function that does essentially the same job. This is very unfriendly!
All constructors of a class share the class name, so they necessarily have the same name. Without function overloading, it would be quite troublesome to construct objects in different ways!
Operator overloading is essentially function overloading. It greatly enriches the meaning of existing operators and is convenient to use; for example, + can be used to concatenate strings!
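To make the last point concrete, here is a minimal sketch showing that an overloaded operator is just a function named operator+. The type Vec2 and the helper names are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical 2-D vector type used only for this illustration.
struct Vec2 {
    int x;
    int y;
};

// An overloaded operator is an ordinary function whose name is "operator+".
Vec2 operator+(const Vec2& a, const Vec2& b) {
    return Vec2{a.x + b.x, a.y + b.y};
}

// The operator syntax and the explicit call syntax reach the same function.
Vec2 add_with_operator(const Vec2& a, const Vec2& b) { return a + b; }
Vec2 add_with_call(const Vec2& a, const Vec2& b) { return operator+(a, b); }

// std::string supplies its own operator+ overloads for concatenation.
std::string greet(const std::string& name) {
    return std::string("hello, ") + name;
}

// Checks the claims made in the comments above.
void check_operator_overloading() {
    Vec2 p = add_with_operator(Vec2{1, 2}, Vec2{3, 4});
    Vec2 q = add_with_call(Vec2{1, 2}, Vec2{3, 4});
    assert(p.x == 4 && p.y == 6);
    assert(q.x == p.x && q.y == p.y);
    assert(greet("world") == "hello, world");
}
```

Calling check_operator_overloading() from main runs the checks; the compiler picks the Vec2 or the std::string version of + from the operand types, exactly as it does for print.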
The introduction above should have refreshed our memory of function overloading. Next we analyze how C++ implements the function overloading mechanism.
2. How does the compiler resolve naming conflicts?
To understand how the compiler handles these overloaded functions, we disassemble the executable generated above and look at the assembly code. (All experiments in this article were done under Linux; Windows is similar. You can also refer to the article "Thinking caused by a simple question", which uses disassembly under both Linux and Windows and notes the differences between the two assembly syntaxes.) We run objdump -d a.out > log.txt to disassemble the program and redirect the result to log.txt, and then analyze that file.
We find that, after compilation, the symbol for the function void print (int i) has become _Z5printi.
We find that, after compilation, the symbol for the function void print (string str) has become _Z5printSs.
After compilation, the names of the overloaded functions have changed; neither is called print any more! That removes the naming conflict, but it raises a new question: what is the name-changing mechanism, that is, how is the signature of an overloaded function mapped to a new identifier? My first reaction is: function name + parameter list, because function overloading depends on the types and number of parameters and has nothing to do with the return type. But look at the following mappings:
void print (int i)-> _Z5printi
void print (string str)-> _Z5printSs
Conjecturing further: perhaps the leading Z5 represents the return type, print is the function name, i stands for the integer type int, and Ss stands for string, so that the mapping would be return type + function name + parameter list. Finally, in the main function, the corresponding functions are called through _Z5printi and _Z5printSs:
80489bc: e8 73 ff ff ff call 8048934 <_Z5printi>
……………
80489f0: e8 7a ff ff ff call 804896f <_Z5printSs>
We write a few overloaded functions to verify the conjecture, such as:
void print (long l)-> _Z5printl
void print (char str)-> _Z5printc
We find roughly int -> i, long -> l, char -> c, string -> Ss, and so on; the basic types are mostly represented by their first letter. Now let's see whether the return type of a function really influences the changed name, for example:
#include <iostream>
using namespace std;
// The two overloads differ in parameter types and in return type
int max (int a, int b)
{
    return a >= b ? a : b;
}
double max (double a, double b)
{
    return a >= b ? a : b;
}
int main ()
{
    cout << "max int is:" << max (1, 3) << endl;
    cout << "max double is:" << max (1.2, 1.3) << endl;
    return 0;
}
int max (int a, int b) maps to _Z3maxii and double max (double a, double b) maps to _Z3maxdd. The two functions have different return types, yet both names begin with _Z3, so the digits after Z cannot encode the return type. In the scheme g++ follows (the Itanium C++ ABI), _Z is a fixed prefix that marks a changed name, and the number is the length of the function name that follows: 5 for print, 3 for max. The return type of an ordinary (non-template) function is not encoded at all. I will not work through the full correspondence, such as which character represents which parameter type, because it is compiler-specific: the study above is based on g++, and a different compiler, such as the VS compiler, uses a different correspondence. For g++ the rule is: "function name + parameter list".
Since the return type takes no part in this mapping, two functions that differ only in return type would get the same name, which matches the language rule that the return type is not considered in function overloading. But why is the return type excluded? It is to keep the resolution of an operator or function call independent of its context, as the following example shows:
float sqrt (float);
double sqrt (double);
void f (double da, float fla)
{
float fl = sqrt (da); // Call sqrt (double)
double d = sqrt (da); // Call sqrt (double)
fl = sqrt (fla); // Call sqrt (float)
d = sqrt (fla); // Call sqrt (float)
}
If the return type were taken into account in function overloading, it would no longer be possible to decide which function to call independently of the context.
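The sqrt example can be turned into something you can run. The sketch below is only an illustration: the functions are named root rather than sqrt to stay clear of the standard library, and the global last_called is a hypothetical helper that records which overload ran.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: remembers which overload ran last.
std::string last_called;

float root(float x) {
    last_called = "root(float)";
    return x;
}

double root(double x) {
    last_called = "root(double)";
    return x;
}

void check_return_type_is_ignored() {
    double da = 4.0;
    float fla = 4.0f;

    // The destination is float, but the argument is double,
    // so root(double) is selected.
    float fl = root(da);
    assert(last_called == "root(double)");

    // The destination is double, but the argument is float,
    // so root(float) is selected.
    double d = root(fla);
    assert(last_called == "root(float)");

    (void)fl;  // silence unused-variable warnings
    (void)d;
}
```

In both calls the choice follows the argument type alone; the type of the variable receiving the result plays no part.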
At this point the analysis seems complete, but we have missed an important restriction on function overloading: scope. The overloaded functions introduced above are all global functions. Now let's look at function overloading inside a class: we call print through a class object, and different functions are called according to the actual arguments:
#include <iostream>
using namespace std;
class test {
public:
void print (int i)
{
cout << "int" << endl;
}
void print (char c)
{
cout << "char" << endl;
}
};
int main ()
{
test t;
t.print (1);
t.print ('a');
return 0;
}
Now let's look at the names the print functions are mapped to in this case:
void print (int i)-> _ZN4test5printEi
void print (char c)-> _ZN4test5printEc
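Note the N4test ... E that now wraps the name. It is easy to guess that it represents the scope: in the g++ scheme, N ... E brackets a qualified name, and 4test is the class name test preceded by its length, just as 5print is the function name preceded by its length. So the most accurate description of the mapping is: scope + function name + parameter list. The return type of an ordinary function is not encoded (compare _Z3maxii and _Z3maxdd above, which share the prefix _Z3 although the return types differ).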
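One way to check readings like these is to ask the toolchain to reverse the mapping. The sketch below assumes GCC or Clang, whose <cxxabi.h> header provides abi::__cxa_demangle; the wrapper name demangle is my own and not part of any library.

```cpp
#include <cassert>
#include <cstdlib>
#include <cxxabi.h>  // abi::__cxa_demangle (GCC and Clang)
#include <string>

// Turn a changed (mangled) symbol name back into a readable declaration.
// Returns an empty string if the name cannot be demangled.
std::string demangle(const char* mangled) {
    int status = 0;
    // With a null output buffer, the result is allocated with malloc
    // and must be released with free.
    char* text = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || text == nullptr) {
        std::free(text);
        return std::string();
    }
    std::string result(text);
    std::free(text);
    return result;
}

void check_demangle() {
    // No return type appears in any of the recovered declarations.
    assert(demangle("_Z5printi") == "print(int)");
    assert(demangle("_Z3maxii") == "max(int, int)");
    assert(demangle("_ZN4test5printEi") == "test::print(int)");
}
```

The same result can be had on the command line by piping a symbol name through c++filt, which ships with binutils.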
3. Call matching of overloaded functions
Now that the naming conflict between overloaded functions is resolved, how is a call through the shared function name resolved once the overloads are defined? To decide which overloaded function fits best, the following rules are applied in order:
Exact match: the argument matches without conversion, or with only a trivial conversion, such as array name to pointer, function name to pointer to function, or T to const T;
Match using promotions: integral promotions (such as bool to int, char to int, short to int) and float to double;
Match using standard conversions: such as int to double, double to int, double to long double, Derived * to Base *, T * to void *, int to unsigned int;
Match using user-defined conversions;
Match using the ellipsis: like the ellipsis parameter in printf.
If more than one matching function is found at the highest level where a match exists, the call is rejected as ambiguous. See the example below:
void print (int);
void print (const char *);
void print (double);
void print (long);
void print (char);
void h (char c, int i, short s, float f)
{
print (c); // Exact match, call print (char)
print (i); // Exact match, call print (int)
print (s); // Integer promotion, call print (int)
print (f); // Float to double promotion, call print (double)
print ('a'); // Exact match, call print (char)
print (49); // Exact match, call print (int)
print (0); // Exact match, call print (int)
print ("a"); // Exact match, call print (const char *)
}
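The declarations above can be made observable with a small sketch. The name which is hypothetical; each overload simply reports its own parameter type so that the selected function can be checked.

```cpp
#include <cassert>
#include <string>

// Each overload reports which parameter type was selected.
std::string which(int)         { return "int"; }
std::string which(const char*) { return "const char*"; }
std::string which(double)      { return "double"; }
std::string which(long)        { return "long"; }
std::string which(char)        { return "char"; }

void check_ranking() {
    char c = 'x';
    int i = 1;
    short s = 2;
    float f = 3.0f;

    assert(which(c) == "char");           // exact match
    assert(which(i) == "int");            // exact match
    assert(which(s) == "int");            // integral promotion short -> int
    assert(which(f) == "double");         // promotion float -> double
    assert(which('a') == "char");         // a character literal has type char
    assert(which(49) == "int");           // an integer literal has type int
    assert(which(0) == "int");            // exact match beats 0 -> pointer
    assert(which("a") == "const char*");  // array-to-pointer counts as exact
}
```

Note the short and float cases: a promotion ranks above an ordinary standard conversion, so short goes to int rather than long, and float goes to double rather than int.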
Declaring too few or too many overloaded functions can lead to ambiguity. See the example below:
void f1 (char);
void f1 (long);
void f2 (char *);
void f2 (int *);
void k (int i)
{
f1 (i); // Call f1 (char)? f1 (long)?
f2 (0); // Call f2 (char *)? f2 (int *)?
}
In this case the compiler reports an error and leaves it to the user to resolve, for example by calling through an explicit type conversion such as f2 (static_cast <int *> (0)). Of course this is ugly, and you have to spell out the conversion for the function you want to call. The example above has a single parameter; now let's look at a two-parameter case:
int pow (int, int);
double pow (double, double);
void g ()
{
double d = pow (2.0, 2); // Call pow (int (2.0), 2)? pow (2.0, double (2))?
}
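As a sketch of how such ambiguous calls can be fixed by the caller, the code below repeats the three cases with explicit conversions. The two-parameter function is renamed power (a hypothetical name) to avoid any clash with the library pow, and the ambiguous calls are kept as comments because they do not compile.

```cpp
#include <cassert>
#include <string>

std::string f1(char)  { return "f1(char)"; }
std::string f1(long)  { return "f1(long)"; }
std::string f2(char*) { return "f2(char*)"; }
std::string f2(int*)  { return "f2(int*)"; }
std::string power(int, int)       { return "power(int, int)"; }
std::string power(double, double) { return "power(double, double)"; }

void check_explicit_conversions() {
    int i = 7;

    // f1(i);  // error: int -> char and int -> long rank equally
    assert(f1(static_cast<long>(i)) == "f1(long)");
    assert(f1(static_cast<char>(i)) == "f1(char)");

    // f2(0);  // error: 0 converts equally well to char* and int*
    assert(f2(static_cast<int*>(nullptr)) == "f2(int*)");
    assert(f2(static_cast<char*>(nullptr)) == "f2(char*)");

    // power(2.0, 2);  // error: each overload wins on one argument only
    assert(power(2.0, static_cast<double>(2)) == "power(double, double)");
    assert(power(static_cast<int>(2.0), 2) == "power(int, int)");
}
```

In each case the cast turns one candidate into an exact match for every argument, so it beats the other candidate outright.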
4. How does the compiler parse overloaded function calls?
When the compiler resolves a call to an overloaded function, it must first collect the candidate functions with the same name, then pick the best match among them, and report an error if none is found. One way of resolving overloaded calls is described below: when the compiler processes a call to an overloaded function, the work is done through interaction among syntax analysis, the C++ grammar, the symbol table, and the abstract syntax tree. The interaction diagram is roughly as follows:
The four steps do roughly the following:
The function-call rule in the grammar is matched and the function name is obtained;
The expression type of each argument of the call is obtained;
The parser looks up the overloaded functions, and the symbol table returns the best function after overload resolution;
The parser creates the abstract syntax tree and binds the best function stored in the symbol table to it.
Below we focus on overload resolution, which must follow the matching order and rules described in "3. Call matching of overloaded functions". Overload resolution can be roughly divided into three steps:
Determine the candidate function set based on the function name
Select the available function set from the candidate function set
Determine the best function from the available function set, or report an error because of ambiguity
4.1. Determine the candidate function set according to the function name
The candidates are all functions with the same name in the same scope that are also visible at the call (taking into account private, protected, public, friend, and so on). "Same scope" is also a restriction in the definition of function overloading: functions that are not in the same scope do not overload one another, as in the following code:
void f (int);
void g ()
{
void f (double);
f (1); // The call here is f (double), not f (int)
}
That is, a function in an inner scope hides functions of the same name in outer scopes! Likewise, a member function of a derived class hides the base-class functions of the same name. This is easy to understand, because the same is true for variable access: for example, a function body that wants to access a global variable hidden by a local variable of the same name must qualify it with "::".
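The derived-class case can be sketched as follows. All class names here are hypothetical. The second derived class uses a using-declaration, which is one standard way to pull the base-class functions back into the same scope so that they overload with the new one.

```cpp
#include <cassert>
#include <string>

struct Base {
    std::string f(int) { return "Base::f(int)"; }
};

// Hiding::f hides Base::f, so f(double) is the only candidate.
struct Hiding : Base {
    std::string f(double) { return "Hiding::f(double)"; }
};

// The using-declaration brings Base::f into this scope,
// so f(int) and f(double) now overload each other.
struct Overloading : Base {
    using Base::f;
    std::string f(double) { return "Overloading::f(double)"; }
};

void check_hiding() {
    Hiding h;
    assert(h.f(1) == "Hiding::f(double)");   // 1 is converted to double
    assert(h.Base::f(1) == "Base::f(int)");  // explicit qualification

    Overloading o;
    assert(o.f(1) == "Base::f(int)");        // exact match wins
    assert(o.f(1.5) == "Overloading::f(double)");
}
```

The first call mirrors the f (1) example above: the better-matching f(int) exists, but it is never considered because it lives in a different scope.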
To find the candidate function set, a depth-first search is generally used:
step1: Starting from the function call point, search outward scope by scope for visible candidate functions
step2: If the functions collected in the previous step are not in a user-defined namespace, also take the candidate functions in the namespaces introduced by the using mechanism; otherwise stop
When collecting candidate functions, if the argument types of the call are all non-class types, the candidates are only the functions visible at the call point. If the argument types include a class-type object, a pointer to a class type, a reference to a class type, or a pointer to a class member, the candidate functions are the union of the following sets:
(1) The functions visible at the call point;
(2) The functions declared in the namespace that defines the class type or in the namespaces that define its base classes;
(3) The friend functions of the class or of its base classes.
An example makes this more intuitive:
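void f ();
void f (int);
void f (double, double = 314);
namespace N
{
    void f (char3, char3); // char3 is kept as written in the source; it must name a declared type
}
class A {
public:
    operator double () { return 0.0; } // conversion function; returns a placeholder value
};
int main ()
{
    using namespace N; // using directive
    A a;
    f (a);
    return 0;
}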
Following the method above, since the argument is an object of class type, the candidate functions are collected in three steps:
(1) Starting from the scope of main, where the call occurs, look for declarations of f. None is found there, so the search moves to the enclosing scope of main; in the global scope it finds the declarations of three functions f and puts them into the candidate set;
(2) Collect f (char3, char3) from the namespace N named by the using directive;
(3) Consider two more kinds of sets: the functions declared in the namespace that defines the class type or in the namespaces that define its base classes, and the friend functions of the class or of its base classes. In this example both sets are empty.
The final candidate set is the four functions f listed above.
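In the example above the two class-related sets are empty. The sketch below, using a hypothetical namespace shapes, shows a case where they are not: functions declared next to the class, and friends of the class, become candidates purely because of the argument type.

```cpp
#include <cassert>
#include <string>

namespace shapes {  // hypothetical namespace for this illustration

struct Circle {};

// Declared in the namespace that defines Circle: set (2).
std::string describe(const Circle&) { return "shapes::describe(Circle)"; }

struct Square {
    // Friend function of the class: set (3).
    friend std::string describe(const Square&) {
        return "friend describe(Square)";
    }
};

}  // namespace shapes

// Visible at the call point: set (1).
std::string describe(int) { return "::describe(int)"; }

void check_candidate_sets() {
    shapes::Circle c;
    shapes::Square s;

    // The calls are unqualified and there is no using directive, yet the
    // functions in shapes are found through the argument types.
    assert(describe(c) == "shapes::describe(Circle)");
    assert(describe(s) == "friend describe(Square)");

    // A non-class argument only sees what is visible at the call point.
    assert(describe(1) == "::describe(int)");
}
```

This is why an expression such as cout << str works without qualifying operator<<: the operator is found in the namespace of its operands.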
4.2. Determine available functions
Available functions are those whose number of parameters fits the call and for which every argument has an implicit conversion sequence to the corresponding parameter.
(1) If the call has m arguments, every candidate function with exactly m parameters qualifies;
(2) A candidate function with fewer than m parameters qualifies only if its parameter list ends with an ellipsis;
(3) A candidate function with more than m parameters qualifies only if every parameter from the (m + 1)-th onward has a default value.
If the set of available functions is empty, the function call fails.
These rules are reflected in "3. Call matching of overloaded functions" above.
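The three counting rules can be tried out with a small sketch. The overloads of the hypothetical function h below are chosen so that, for each call, exactly one of them is available.

```cpp
#include <cassert>
#include <string>

// Exactly one parameter.
std::string h(int) { return "h(int)"; }

// Three parameters, the last one with a default value:
// available for calls with two or three arguments.
std::string h(int, double, double = 1.0) { return "h(int, double, double)"; }

// One named parameter followed by an ellipsis:
// available for calls with one or more arguments.
std::string h(const char*, ...) { return "h(const char*, ...)"; }

void check_available_functions() {
    // One int argument: the second overload needs at least two arguments,
    // and an int variable does not convert to const char*.
    int n = 1;
    assert(h(n) == "h(int)");

    // Two arguments: the default value supplies the third parameter.
    assert(h(n, 2.0) == "h(int, double, double)");
    assert(h(n, 2.0, 3.0) == "h(int, double, double)");

    // More arguments than named parameters: only the ellipsis fits,
    // because a string literal does not convert to int.
    assert(h("x", 1, 2) == "h(const char*, ...)");
}
```

A call such as h (n, 2.0, 3.0, 4.0) has no available function at all and is rejected.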
4.3. Determine the best matching function
After the available functions are determined, a priority is computed for each function in the set from the conversions needed to call it with the actual arguments, and the function with the highest priority is selected. For example, weights can be assigned in order to the matching rules introduced in "3. Call matching of overloaded functions", the total priority computed for each function, and the best function selected.
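To tie the three steps together, here is a toy model of overload resolution. It is a sketch under heavy assumptions, not how any real compiler is written: it knows only six arithmetic types and three ranks, it has no default arguments or ellipsis, and every name in it (Type, Rank, Candidate, resolve) is hypothetical. One deliberate difference from the summed weights described above: candidates are compared argument by argument, which is how the C++ standard states the rule.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

enum class Type { Char, Short, Int, Long, Float, Double };

// Smaller is better, following the order of the matching rules.
enum Rank { Exact = 0, Promotion = 1, Conversion = 2 };

Rank rank_of(Type from, Type to) {
    if (from == to) return Exact;
    bool integral = (from == Type::Char || from == Type::Short) && to == Type::Int;
    bool floating = from == Type::Float && to == Type::Double;
    if (integral || floating) return Promotion;
    return Conversion;  // any other arithmetic conversion
}

// Step 1 is modelled by the caller: it passes the functions sharing a name.
struct Candidate {
    std::string name;
    std::vector<Type> params;
};

const int kNoneAvailable = -1;
const int kAmbiguous = -2;

// True if f is no worse than g for every argument and better for at least one.
bool better(const Candidate& f, const Candidate& g, const std::vector<Type>& args) {
    bool strictly = false;
    for (std::size_t i = 0; i < args.size(); ++i) {
        Rank rf = rank_of(args[i], f.params[i]);
        Rank rg = rank_of(args[i], g.params[i]);
        if (rf > rg) return false;
        if (rf < rg) strictly = true;
    }
    return strictly;
}

// Returns the index of the best candidate, kNoneAvailable, or kAmbiguous.
int resolve(const std::vector<Candidate>& candidates, const std::vector<Type>& args) {
    // Step 2: keep the candidates whose parameter count matches the call.
    std::vector<std::size_t> available;
    for (std::size_t i = 0; i < candidates.size(); ++i)
        if (candidates[i].params.size() == args.size()) available.push_back(i);
    if (available.empty()) return kNoneAvailable;

    // Step 3: the best function must beat every other available function.
    for (std::size_t a : available) {
        bool best = true;
        for (std::size_t b : available)
            if (a != b && !better(candidates[a], candidates[b], args)) { best = false; break; }
        if (best) return static_cast<int>(a);
    }
    return kAmbiguous;
}

void check_resolve() {
    std::vector<Candidate> pw = {{"pow(int, int)", {Type::Int, Type::Int}},
                                 {"pow(double, double)", {Type::Double, Type::Double}}};
    assert(resolve(pw, {Type::Int, Type::Int}) == 0);
    assert(resolve(pw, {Type::Double, Type::Int}) == kAmbiguous);
}
```

The model reproduces the pow (2.0, 2) ambiguity from section 3: each overload is better on one argument and worse on the other, so neither beats the other.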
5. Summary
This article described what function overloading is, why it is needed, how the compiler resolves the naming conflict between functions of the same name, and how the compiler resolves calls to overloaded functions. After reading it, you should have a fairly clear picture of overloading in C++. Note: the description of the name mapping mechanism is based on the g++ compiler, and different compilers use different mappings; likewise, the way of resolving calls to overloaded functions described here is only one of the approaches compilers take. If you are interested in a particular compiler, please study it in depth.
Finally, I leave you with two questions:
1. In C++, the plus sign + can be used for addition between two ints, for addition between floating-point numbers, and for concatenating strings. Is that operator overloading? In another scenario, the plus sign + in C can be used for addition between two ints and for addition between floating-point numbers. Is that operator overloading?
2. What happens when templates are overloaded? How are the overloads formed by template functions and ordinary functions matched when they are called?
Appendix: A C++ function overloading mechanism
This mechanism was proposed and implemented by Zhang Suqin and others. They wrote a C++ compilation system, COC++ (one of a family of C, C++ and FORTRAN compilation systems with Chinese-owned copyright, developed on domestically built machines under the UNIX operating system; these systems conform to the ISO C90, AT&T C++ 85 and ISO FORTRAN 90 standards respectively). Function overloading in COC++ is handled by two sub-processes:
1. While processing function declarations, the compilation system builds a linked list of function declaration prototypes, renames each function according to the name-changing rule, and records the new name in the list. (The name-changing rule is similar to the one described earlier in this article; the differences are only in which character stands for int, which for char, and so on.)
Figure attached 1, process 1: building the function linked list (note that the encoded function name has the form <original function name>_<scope name change><function parameter table encoding>, which is a bit different from g++)
2. While translating a function call statement, the compiler accesses the symbol table, finds the corresponding linked list of function declaration prototypes, finds the best-matching function node according to the type-matching rules, and outputs the changed name. The algorithm flows of the two sub-processes are given in the figures: building the function declaration prototype linked list is attached 1, and translating the function call statement is attached 2.
Figure attached 2, process 2: overloaded function call, searching the linked list
Attached: how are the overloads of template functions and ordinary functions matched when they are called?
Here is the answer from Bjarne Stroustrup, the creator of C++:
1) Find the set of function template specializations that will take part in overload resolution.
2) If two template functions can be called and one is more specialized than the other, consider only the most specialized template function in the following steps.
3) Do overload resolution for this set of functions, plus any ordinary functions as for ordinary functions.
4) If a function and a specialization are equally good matches, the function is preferred.
5) If no match is found, the call is an error.
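A minimal sketch of these rules in action follows. The function name pick is hypothetical; each overload reports which declaration was chosen.

```cpp
#include <cassert>
#include <string>

// General template.
template <typename T>
std::string pick(T) { return "pick<T>(T)"; }

// Template for pointers: more specialized than the one above.
template <typename T>
std::string pick(T*) { return "pick<T>(T*)"; }

// Ordinary (non-template) function.
std::string pick(int) { return "pick(int)"; }

void check_template_overloads() {
    int n = 0;

    // Rule 4: pick<int>(int) and pick(int) match equally well,
    // so the ordinary function is preferred.
    assert(pick(1) == "pick(int)");

    // Rule 3: with T = double the template is an exact match, which
    // beats the double -> int conversion needed by pick(int).
    assert(pick(1.5) == "pick<T>(T)");

    // Rule 2: both templates accept int*, and the T* version is the
    // more specialized of the two.
    assert(pick(&n) == "pick<T>(T*)");

    // An empty template argument list restricts the call to templates.
    assert(pick<>(1) == "pick<T>(T)");
}
```

The second case is worth remembering: an ordinary function is preferred only when it matches equally well, not when the template offers a strictly better match.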
Wu Qin
Source: http://www.cnblogs.com/skynet/
C++ function overloading (repost)