This article gives a comprehensive summary of C++ preprocessing, drawing on ISO/IEC 14882:2003 and the cppreference.com pages on the C++ preprocessor. Unless stated otherwise, the content follows the C++98 standard and is not tied to any specific platform (such as VC++); features new in C++11 are noted explicitly.
1. Introduction
We usually say that building a C++ program (the word "build" is used here rather than "compile" to avoid confusion) can be divided into 4 steps: preprocessing, compiling, assembling, and linking. Preprocessing performs macro substitution, header file inclusion, and so on. Compiling performs syntactic and semantic analysis of the preprocessed code and produces assembly code or some other intermediate code close to assembly. Assembling converts the assembly or intermediate code from the previous step into binary instructions for the target machine; typically one binary file is generated for each source file (.obj with VS, .o with GCC). Linking combines the binary files from the previous step into an executable or library file.
"Preprocessing" is used loosely here. The C++ standard divides translation into 9 phases (phases of translation), of which the 4th is the preprocessor proper; what we usually call "preprocessing" actually covers the first 4 phases. They are listed below (briefly; see the references for details):
- Character mapping (including trigraph replacement): maps system-dependent characters to the corresponding characters defined by the C++ standard while keeping the same semantics; for example, the different line-ending characters of different operating systems are converted to a single specified character (referred to as newline below);
- Line splicing: wherever "\" is immediately followed by newline, both the "\" and the newline are deleted (so the line continuation we use in #define and elsewhere is processed before the preprocessor runs). This is done in a single pass: if "\\" is followed by two newlines, only one "\" and one newline are deleted;
- Tokenization: the source text is decomposed into comments, whitespace, and preprocessing tokens (identifiers and the like are only preprocessing tokens at this point, because it is not yet known which of them really are identifiers; that is settled after the next phase, when the preprocessor proper has run);
- Preprocessor execution: the directives are executed; for a file brought in by #include, phases 1-4 are applied recursively. After this phase the source code no longer contains any preprocessing directives (the lines beginning with #).
It should be emphasized that preprocessing is completed before compilation: the input to the compiler proper contains no preprocessing directives. By then the parts whose conditional compilation tests failed have been removed, macros have been replaced, included files have been inserted, and so on.
In addition, preprocessing operates on a translation unit. A translation unit is a source file together with all the files it includes, directly or indirectly, through #include (see the C++ standard). In general the compiler generates one binary file per translation unit (.obj with VS, .o with GCC).
With this background, the rest of the article describes the preprocessor of phase 4 in detail.
2. General format and overview
The general format of the preprocessor directive is as follows:
# preprocessing_instruction [arguments] newline
Here preprocessing_instruction is one of the following: define, undef, include, if, ifdef, ifndef, else, elif, endif, line, error, pragma. arguments are optional, for example the file name after #include. A preprocessor directive occupies one line, which can be continued with a "\" immediately followed by newline. Line continuation is not exclusive to the preprocessor, however, and it is processed before the preprocessor runs.
Preprocessor directives fall into the following types:
- Null directive: a # followed by newline, which has no effect, similar to an empty statement;
- Conditional compilation: #if, #ifdef, #ifndef, #else, #elif, #endif;
- Source file inclusion: #include;
- Macro substitution: #define, #undef, and the operators # and ##;
- Redefining line numbers and file names: #line;
- Error messages: #error;
- Compiler-reserved directives: #pragma.
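The general format can be illustrated with a small sketch. It uses only standard directives; the macro names ANSWER and INDENTED_FLAG and the function answer() are made up for the illustration.

```cpp
#                           // null directive: a lone # has no effect
#define ANSWER 42           // the usual spelling of a directive
  #  define INDENTED_FLAG 1 // whitespace is allowed before # and between # and the name

int answer()
{
#if INDENTED_FLAG
    return ANSWER;          // this branch is compiled
#else
    return 0;               // this branch is discarded by the preprocessor
#endif
}
```

The comments after the directives are harmless because comments are turned into whitespace during tokenization, before the directives are executed.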
Note that directives other than those listed above are not part of the C++ standard, although some compilers implement their own preprocessing directives. Following the principle that "portability is more important than efficiency", stick to the preprocessor directives of the C++ standard as far as possible.
The next section describes each of the types above in detail, except for the null directive.
3. Detailed explanation
Conditional compilation
A conditional compilation block starts with #if, #ifdef, or #ifndef, followed by zero or more #elif, then at most one #else, and ends with #endif. #if and #elif take an expression, while #ifdef and #ifndef take an identifier. The control logic is the same as that of an if / else if / else statement (including the rule that each unpaired else pairs with the nearest unpaired if), except that it decides whether code is compiled rather than whether it is executed. The expression after #if or #elif must be a constant expression; it is true when it is nonzero. It may also contain the test defined(token), which is true when token is defined as a macro. #ifdef token is equivalent to #if defined(token), and #ifndef token is equivalent to #if !defined(token). Here is an example (excerpted from cppreference.com):
    #include <iostream>
    #define ABCD 2

    int main()
    {
    #ifdef ABCD
        std::cout << "1: yes\n";
    #else
        std::cout << "1: no\n";
    #endif

    #ifndef ABCD
        std::cout << "2: no1\n";
    #elif ABCD == 2
        std::cout << "2: yes\n";
    #else
        std::cout << "2: no2\n";
    #endif

    #if !defined(DCBA) && (ABCD < 2*4-3)
        std::cout << "3: yes\n";
    #endif
        std::cin.get();
        return 0;
    }
Conditional compilation is heavily used in system-dependent and cross-platform code: such code detects the operating system, processor architecture, or compiler by testing certain macro definitions, and then conditionally compiles different code for each system. That said, the greatest value of the C++ standard is that it makes all C++ implementations behave consistently. From this point of view, unless system functions are called, code should make no assumption about the system other than that it supports the C++ standard.
Source file inclusion
The include directive inserts the contents of a file at the position of the #include, and that file is itself preprocessed recursively (phases 1-4, see section 1). There are 3 forms: #include <filename> (1), #include "filename" (2), and #include pp-tokens (3). Form 1 looks for filename in the standard include directories (the C++ standard library headers are generally found this way). Form 2 first looks in the directory of the source file being processed and, if the file is not found there, falls back to the standard include directories. In form 3, pp-tokens must be a macro that expands to <filename> or "filename"; otherwise the behavior is undefined. Note that filename can be any text file; it does not need a .h or .hpp suffix and can, for example, be a .c or .cpp file (which is why this section is titled "source file inclusion" rather than "header file inclusion"). Example:
File: b.cpp

    // Guard name without a leading underscore: names that start with an
    // underscore followed by an uppercase letter are reserved.
    #ifndef B_CPP_
    #define B_CPP_
    int b = 999;
    #endif // #ifndef B_CPP_

File: a.cpp

    #include <iostream>   // searched for in the standard include directories
    #include "b.cpp"      // searched for in the source file's directory first, then in the standard include directories
    #define CMATH <cmath>
    #include CMATH

    int main()
    {
        std::cout << b << '\n';
        std::cout << std::log10(10.0) << '\n';
        std::cin.get();
        return 0;
    }

Note: for the example above, put a.cpp and b.cpp in the same folder and compile only a.cpp.
Macro Substitution
#define defines a macro. After the definition, every occurrence of the macro name is replaced with the macro's replacement list, until the macro is undefined with #undef. Macros are divided into object-like macros (macro constants, without parameters) and function-like macros (with parameters). The forms are as follows:
- #define identifier replacement-list (1)
- #define identifier(parameters) replacement-list (2)
- #define identifier(parameters, ...) replacement-list (3) (since C++11)
- #define identifier(...) replacement-list (4) (since C++11)
- #undef identifier (5)
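A minimal sketch of forms (1), (2) and (5), using the hypothetical names BUFFER_SIZE and CLAMP_TO_BUFFER. It also shows that a replacement list is not expanded when the macro is defined, only when it is used.

```cpp
#define BUFFER_SIZE 16                                              // form (1)
#define CLAMP_TO_BUFFER(n) ((n) < BUFFER_SIZE ? (n) : BUFFER_SIZE)  // form (2)

int clamp_small() { return CLAMP_TO_BUFFER(20); }   // BUFFER_SIZE is 16 here

#undef BUFFER_SIZE              // form (5): the name is no longer a macro
#define BUFFER_SIZE 32          // so it can be defined again with a new value

int clamp_large() { return CLAMP_TO_BUFFER(20); }   // BUFFER_SIZE is 32 here
```

CLAMP_TO_BUFFER is defined once, yet the two functions see different limits, because BUFFER_SIZE inside its replacement list is looked up at each point of use.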
In the replacement-list of a function-like macro, "#" placed before a parameter turns the corresponding argument into a string literal, and "##" joins the tokens on either side of it into a single token. The following example is from cppreference.com:
    #include <iostream>

    // make function factory and use it
    #define FUNCTION(name, a) int fun_##name() { return a; }

    FUNCTION(abcd, 12)
    FUNCTION(fff, 2)
    FUNCTION(kkk, 23)

    #undef FUNCTION
    #define FUNCTION 34
    #define OUTPUT(a) std::cout << #a << '\n'

    int main()
    {
        std::cout << "abcd: " << fun_abcd() << '\n';
        std::cout << "fff: " << fun_fff() << '\n';
        std::cout << "kkk: " << fun_kkk() << '\n';
        std::cout << FUNCTION << '\n';
        OUTPUT(million);   // note the lack of quotes
        std::cin.get();
        return 0;
    }
Variadic macros are new in C++11 (taken from C99). __VA_ARGS__ refers to the arguments matched by "...". The following example is adapted from the C++ 2011 standard (it differs slightly from the example in the standard):

    #define debug(...) fprintf(stderr, __VA_ARGS__)
    #define showlist(...) puts(#__VA_ARGS__)
    #define report(test, ...) ((test) ? puts(#test) : printf(__VA_ARGS__))
    debug("Flag");
    debug("x = %d\n", x);
    showlist(The first, second, and third items.);
    report(x>y, "x is %d but y is %d", x, y);
This code, after preprocessing, produces the following code:
    fprintf(stderr, "Flag");
    fprintf(stderr, "x = %d\n", x);
    puts("The first, second, and third items.");
    ((x>y) ? puts("x>y") : printf("x is %d but y is %d", x, y));
Conditional compilation sometimes uses #ifdef macro_name to test for certain information. The C++ standard specifies a number of predefined macros, listed in the following table (macros new in C++11 are marked):
Predefined macro | Meaning | Remark
__cplusplus | Defined as 199711L in C++98 and as 201103L in C++11 |
__LINE__ | The current source line number (starting from 1), a decimal constant |
__FILE__ | The source file name, a string literal |
__DATE__ | The processing date, a string literal in the format "Mmm dd yyyy" |
__TIME__ | The processing time, a string literal in the format "hh:mm:ss" |
__STDC__ | Indicates conformance to Standard C; may not be defined | See the Wikipedia article
__STDC_HOSTED__ | Defined as 1 for a hosted implementation, otherwise 0 | C++11
__STDC_MB_MIGHT_NEQ_WC__ | See ISO/IEC 14882:2011 | C++11
__STDC_VERSION__ | See ISO/IEC 14882:2011 | C++11
__STDC_ISO_10646__ | See ISO/IEC 14882:2011 | C++11
__STDCPP_STRICT_POINTER_SAFETY__ | See ISO/IEC 14882:2011 | C++11
__STDCPP_THREADS__ | See ISO/IEC 14882:2011 | C++11
Of these, the first 5 macros are always defined; the macros from __STDC__ onward are not necessarily defined (the exception is __STDC_HOSTED__, which C++11 always defines). None of these predefined macros may be the subject of #undef. Here is an example that uses them (adjacent string literals are concatenated automatically, so "AB" "CDE" is equivalent to "ABCDE"):
    #include <iostream>
    int main()
    {
    #define PRINT(arg) std::cout << #arg ": " << arg << '\n'
        PRINT(__cplusplus);
        PRINT(__LINE__);
        PRINT(__FILE__);
        PRINT(__DATE__);
        PRINT(__TIME__);
    #ifdef __STDC__
        PRINT(__STDC__);
    #endif
        std::cin.get();
        return 0;
    }
These macros are often used when printing debugging information. Predefined macros generally begin with "__", so user-defined macros should avoid names that begin with "__".
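As a sketch of that debugging use, the hypothetical macro TRACE_MESSAGE below prefixes a message with the file and line of the place where it is written. It has to be a macro: inside an ordinary function, __FILE__ and __LINE__ would describe the function's own location rather than the caller's.

```cpp
#include <sstream>
#include <string>

std::string format_location(const char* file, int line, const char* message)
{
    std::ostringstream out;
    out << file << ':' << line << ": " << message;
    return out.str();
}

// __FILE__ and __LINE__ are expanded where TRACE_MESSAGE is used.
#define TRACE_MESSAGE(message) format_location(__FILE__, __LINE__, message)

// __LINE__ changes from one source line to the next.
int lines_between_reads()
{
    int first = __LINE__;
    int second = __LINE__;
    return second - first;
}

std::string example_trace() { return TRACE_MESSAGE("started"); }
```

example_trace() returns text of the form file:line: started, where the file name and line number depend on where the code is stored and compiled.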
Note that modern C++ practice discourages macro-defined constants and function-like macros: use #define sparingly and, where possible, use a const variable or an inline function instead.
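One reason for this advice can be seen in a short sketch (all names are illustrative): a function-like macro substitutes text, so operator precedence can change the meaning of its argument, while an inline function receives a value.

```cpp
#define SQUARE_MACRO(x) x * x                   // textual substitution, no parentheses

inline int square(int x) { return x * x; }     // the argument is evaluated once, as a value
const int max_items = 10;                       // typed constant instead of a macro constant

int macro_result()    { return SQUARE_MACRO(1 + 2); }  // 1 + 2 * 1 + 2 == 5
int function_result() { return square(1 + 2); }        // 3 * 3 == 9
```

Parenthesizing the macro body would fix this particular case, but the inline function also gives type checking and respects scope, which a macro cannot.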
Redefine line numbers and file names
#line number ["filename"] takes effect from the source line that follows it: __LINE__ is renumbered starting at number, and __FILE__ is redefined as "filename" (the file name is optional). An example:
    #include <iostream>
    int main()
    {
    #define PRINT(arg) std::cout << #arg ": " << arg << '\n'
    #line 999 "WO"

        PRINT(__LINE__);
        PRINT(__FILE__);
        std::cin.get();
        return 0;
    }
Error message
#error [message] makes the compiler report an error. It is commonly used in system-dependent code, for example inside conditional compilation that detects the operating system type, to report an unsupported configuration. An example:
    int main()
    {
    #error "W"
        return 0;
    #error
    }

The 2nd #error may never be reached, because the compiler may stop as soon as it encounters #error "W".
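A more typical arrangement places #error inside a conditional block, so that it fires only for a configuration the code does not support. The sketch below uses CHAR_BIT from <climits>; the 8-bit requirement and the function bits_in_int() are assumptions made up for the illustration.

```cpp
#include <climits>   // CHAR_BIT

// Stop the build early on a platform this (hypothetical) code was not written for.
#if CHAR_BIT != 8
#error "This code assumes 8-bit bytes"
#endif

int bits_in_int()
{
    return static_cast<int>(sizeof(int)) * CHAR_BIT;
}
```

On a platform with 8-bit bytes the #error line is discarded like any other skipped text and the file compiles normally.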
Compiler-reserved directives
The #pragma directive is reserved by the C++ standard for use by specific C++ implementations, so the arguments and meaning of #pragma may differ from compiler to compiler. For example, VC++ 2010 provides #pragma once to indicate that a source file is processed only once. OpenMP, a shared-memory parallel programming model, uses #pragma omp directives; see "OpenMP shared memory parallel programming".
For the #pragma directives of VC++, see the related MSDN articles.
For the #pragma directives of GCC, see the related entries in the GCC documentation.
4. Typical applications of preprocessing
Common uses for preprocessing are:
- Include guards (see the Wikipedia entry), which ensure that a header file is included only once in the same file (more precisely, that the header's content appears only once in a translation unit), to avoid violating C++'s One Definition Rule;
- Using #ifdef and special macros to identify the operating system, processor architecture, and compiler, and conditionally compiling platform-specific code; this is used mostly for portable code;
- Defining function-like macros to simplify code or to make certain configuration values easy to change;
- Using #pragma for implementation-specific settings (see the pointers at the end of the previous section).
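The include guard can be sketched in a single file by writing out the content of a hypothetical header point.h twice, which is what the preprocessor would see if the header were included twice. The names POINT_H_INCLUDED, Point and guarded_sum are illustrative.

```cpp
// ---- first copy of the hypothetical header point.h ----
#ifndef POINT_H_INCLUDED
#define POINT_H_INCLUDED
struct Point { int x; int y; };
#endif

// ---- second copy, as if point.h had been included again ----
#ifndef POINT_H_INCLUDED          // already defined, so this copy is skipped
#define POINT_H_INCLUDED
struct Point { int x; int y; };   // would be a redefinition error without the guard
#endif

int guarded_sum()
{
    Point p = {1, 2};
    return p.x + p.y;
}
```

Removing the two #ifndef / #endif pairs makes the compiler reject the second definition of Point, which is exactly the error the guard exists to prevent.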
There is a project on SourceForge.net about using macros to detect the operating system, processor architecture, and compiler (see References). Here is an example:
    #ifdef _WIN64
        // define something for Windows (64-bit)
    #elif _WIN32
        // define something for Windows (32-bit)
    #elif __APPLE__
        #include "TargetConditionals.h"
        #if TARGET_OS_IPHONE && TARGET_IPHONE_SIMULATOR
            // define something for simulator
        #elif TARGET_OS_IPHONE
            // define something for iPhone
        #else
            #define TARGET_OS_OSX 1
            // define something for OSX
        #endif
    #elif __linux
        // Linux
    #elif __unix   // all unices not caught above
        // Unix
    #elif __posix
        // POSIX
    #endif