The main function and the startup routine
When the kernel runs a C program through one of the exec functions, a special startup routine is called before main. The executable file designates this routine as the program's starting address. The startup routine obtains the command-line arguments and environment variables from the kernel and then sets things up so that main can be called.
We usually compile a program with the single command gcc main.c -o main. In fact the work can be done in three steps: first generate the assembly code, then generate the object file, and finally generate the executable file:
    $ gcc -S main.c
    $ gcc -c main.s
    $ gcc main.o
The -S option generates assembly code, the -c option generates an object file, and the -E option only runs the preprocessor without compiling. With none of these options, gcc performs every step through the final link and produces an executable file.
(Figure: options of the gcc command)
Each of these options can be combined with -o to rename the output file instead of using gcc's default file names (xxx.c, xxx.s, xxx.o and a.out). For example, gcc main.o -o main links main.o into an executable file named main.
If we link with gcc, gcc actually calls ld to link the object file crt1.o together with our hello.o. crt1.o already provides the _start entry point, so when our assembly program also implements _start, the symbol is defined more than once; the linker does not know which definition to use and has to report an error. In addition, the _start provided by crt1.o needs to call the main function, and our assembly program does not implement main, which is a second error. If the object file was compiled from C code, linking with gcc is correct: the entry point of the whole program is the _start provided by crt1.o, which first does some initialization work (called the startup routine from here on) and then calls the main function provided by the C code. So the common statement that main is the entry point of the program is not quite accurate: _start is the real entry point, and main is called by _start.
The most standard prototype of main is int main(int argc, char *argv[]), which means that the startup routine passes two arguments to main; the meaning of these two parameters is explained later, once pointers have been covered. The prototype we have used so far is int main(void), which the C standard also allows. If you analyzed the exercise in the previous section carefully, you know that passing more arguments than a function uses causes no problem, whereas passing fewer arguments than it uses does. Because main is called by the startup routine, a return from main still returns to the startup routine, which receives main's return value. If the startup routine were written as equivalent C code (in practice it is generally written directly in assembly), its call to main would have the form:
    exit(main(argc, argv));
That is, as soon as the startup routine gets main's return value, it calls the exit function with that value as the argument. exit is also a function in libc: it first does some cleanup work and then calls the _exit system call described in the previous chapter to terminate the process. The return value of main is thus eventually passed to the _exit system call and becomes the exit status of the process. We can also call exit directly inside main to terminate the process without returning to the startup routine, for example:
    #include <stdlib.h>

    int main(void)
    {
        exit(4);
    }
This has the same effect as int main(void) { return 4; }. Run the program in the shell and check its exit status:
    $ ./a.out
    $ echo $?
    4
By convention, an exit status of 0 indicates that the program executed successfully and a non-zero exit status indicates an error. Note that the exit status is only 8 bits wide and the shell interprets it as an unsigned number, so if the code above is changed to exit(-1); or return -1;, the result is:
    $ ./a.out
    $ echo $?
    255
Note that if a function's return type is declared as int, every control-flow branch in the function must contain a return statement that specifies the return value. If a return is missing, the return value is indeterminate (think about why), and the compiler will usually issue a warning. However, if a branch calls exit or _exit without writing a return, the compiler accepts it: that branch never gets a chance to return, so it does not matter that no return value is specified. Using the exit function requires including the header file stdlib.h, and using the _exit function requires including the header file unistd.h.
Process termination
There are 8 ways to terminate a process; the first 5 are normal termination and the last 3 are abnormal termination:
1 Return from the main function;
2 Call the exit function;
3 Call _exit or _Exit;
4 The last thread returns from its start routine;
5 The last thread calls pthread_exit;
6 Call the abort function;
7 Receive a signal and terminate;
8 The last thread responds to a cancellation request.
(1) exit functions
    #include <stdlib.h>
    void exit(int status);
    void _Exit(int status);
    #include <unistd.h>
    void _exit(int status);
These three functions terminate a program normally: _exit and _Exit enter the kernel immediately, whereas exit first does some cleanup (it runs each termination handler and closes all standard I/O streams) and then enters the kernel. The integer argument taken by all three functions is called the termination status or exit status. The termination status of the process is undefined if (a) any of these functions is called without a status, (b) main executes a return statement without a return value, or (c) main is not declared to return an integer. Returning an integer value from main is equivalent to calling exit with that value.
Function name: exit()
Header file: stdlib.h (for VC6.0 the header file is given as windows.h)
Function: closes all files and terminates the running process. exit(0) indicates a normal exit and exit(x) with x not 0 indicates an abnormal exit; x is returned to the operating system (including UNIX, Linux and MS-DOS) for use by other programs.
Prototype in stdlib.h: void exit(int status); the parameter status is the value the program exits with.
The difference between _exit() and exit():
Header files: exit needs #include <stdlib.h>; _exit needs #include <unistd.h>.
The _exit() function simply stops the process, releases the memory it uses and destroys its data structures in the kernel. The exit() function is a wrapper around this that performs a number of operations before the process exits. The most important difference is that exit() checks which files are open and writes the contents of the file buffers back to the files before it makes the _exit system call.
The exit() procedure:
1. Call the functions registered with atexit(), in the reverse order of their registration. This lets us perform our own cleanup when the program terminates, for example saving program state to a file or releasing a lock on a shared database.
2. cleanup(): close all open streams, which causes all buffered output to be written, and delete all temporary files created with the tmpfile function.
3. Finally, call the _exit() function to terminate the process.
_exit does 3 things (see the man page):
1. Any open file descriptors belonging to the process are closed;
2. Any children of the process are inherited by process 1, init;
3. The parent of the process is sent a SIGCHLD signal.
After the cleanup is done, exit calls _exit to terminate the process. Sample program:
    #include <stdlib.h>
    #include <conio.h>   /* DOS/Windows only: declares getch() */
    #include <stdio.h>

    int main(int argc, char *argv[])
    {
        int status;

        printf("Enter either 1 or 2\n");
        status = getch();
        /* sets DOS error level */
        exit(status - '0');
        /* Note: this line is never reached */
        return 0;
    }
The difference between exit() and return: according to ANSI C, using return and calling exit() have the same effect in the initial invocation of main(). Note the words "initial invocation": if main() is called recursively, exit() still terminates the program, but return hands control back to the previous level of recursion, and only the return at the first level terminates the program. Another difference is that exit() terminates the program even when it is called from a function other than main().
(2) atexit function
Function name: atexit
Header file: #include <stdlib.h>
Function: registers a termination function (a function called after main finishes)
Usage: int atexit(void (*func)(void));
A process can register several functions that exit calls automatically. These functions are called termination handlers, and they are registered with the atexit function. exit calls the termination handlers in the reverse order of their registration, and a function that is registered more than once is called once per registration. ISO C specifies that a process can register at least 32 functions. A function registered with atexit() must take no arguments and return void. Here is an example program that registers three functions:
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>   /* sleep() */

    void func1(void)
    {
        printf("In func1\n");
    }

    void func2(void)
    {
        printf("In func2\n");
    }

    void func3(void)
    {
        printf("In func3\n");
    }

    int main(void)
    {
        /* Handlers run in reverse order of registration: func1, func2, func3. */
        atexit(func3);
        atexit(func2);
        atexit(func1);
        sleep(5);
        printf("In main\n");
        exit(0);
    }
Process analysis: the program registers three functions with atexit(), waits 5 seconds and then prints "In main". (If the string printed by main() did not end in "\n", it would first sit in the standard output buffer and be written out later, at the latest when exit() flushes the streams during cleanup.) When execution reaches exit(0), exit() automatically calls the registered functions. The handlers are kept in last-in, first-out order, like items pushed onto a stack, so the function registered first is executed last. Here the output after "In main" is "In func1", "In func2", "In func3".
The atexit function is thus the registration routine for termination handlers. Its argument is the address of a function; that function is called with no arguments and cannot return a value. A process can register at least 32 such functions. When the program terminates through exit(), the registered functions are called in the reverse order of their registration, which matches the last-in, first-out behaviour of a stack, and a function that was registered several times is executed once for each registration.
The exit function first runs the functions registered with atexit() and then does its own cleanup: it flushes all output streams, closes all open streams and removes the temporary files created with the standard I/O function tmpfile().
The exit() function can be used to end the program at any point while it is running. The status argument of exit is returned to the operating system: 0 indicates that the program ended normally, and a non-zero value indicates that it did not.
Environment table
Each program receives an environment table, which is an array of character pointers; each pointer holds the address of an environment string terminated by '\0'. The global variable environ, called the environment pointer, contains the address of this array of pointers. Specific environment variables are normally accessed with the getenv and putenv functions rather than through environ, but to go through the entire environment you must use the environ global variable.
Storage layout of a C program
Text segment: the machine instructions that the CPU executes; it is shareable and read-only.
Initialized data segment: usually just called the data segment; it contains the variables that are explicitly given an initial value in the program.
Uninitialized data segment: the kernel initializes the data in this segment to 0 or null pointers before the program starts executing.
Stack: automatic variables and the information that has to be saved on each function call are stored in this segment.
Heap: used for dynamic storage allocation. The heap lies between the uninitialized data segment and the stack.
Memory allocation
#include <stdlib.h>
void *malloc (size_t size);
void *calloc (size_t nobj, size_t size);
void *realloc (void *ptr, size_t newsize);
void free (void *ptr);
The malloc function allocates the specified number of bytes of storage; the initial contents of that storage are indeterminate. The calloc function allocates storage for the specified number of objects of the specified size, and every bit in that space is initialized to 0. The realloc function changes the size of a previously allocated area (larger or smaller); when the size increases, the initial contents of the added part are indeterminate, and if ptr is NULL, realloc behaves like malloc.
Most implementations of these functions allocate slightly more storage than requested and use the extra space for bookkeeping information. Writing past the end of an allocated area can overwrite the bookkeeping record of the next area, and writing before the start of an allocated area can overwrite that area's own record. Errors of this kind are disastrous, yet hard to track down because they usually do not show up right away.
Environment variables: environment strings have the form name=value. Their interpretation is entirely up to the individual applications, not the kernel.
#include <stdlib.h>
char *getenv (const char *name);
int putenv (char *str);
int setenv (const char *name, const char *value, int rewrite);
int unsetenv (const char *name);
The getenv function returns a pointer to the value part of the name=value string, or NULL if name is not found. The putenv function takes a string of the form name=value and places it in the environment table; if name already exists, its old definition is removed first.
The setenv function sets name to value. If name already exists and rewrite is non-zero, the existing definition is removed first; if rewrite is 0, the existing definition is kept and name is not changed. The unsetenv function removes any definition of name; it is not an error if no such definition exists.
setjmp and longjmp
#include <setjmp.h>
int setjmp (jmp_buf env);
void longjmp (jmp_buf env, int val);
The setjmp and longjmp functions are used to handle errors that occur in deeply nested function calls: longjmp can skip over several call frames on the stack and return to a function earlier on the current call path. setjmp is called at the location you want to return to. The data type jmp_buf is some form of array that holds all the information needed to restore the stack state when longjmp is called. Because the env variable has to be referenced from another function, it is normally defined as a global variable. When an error is detected, longjmp is called: its first argument, env, is the same env that was passed to setjmp, and its second argument, val, is a non-zero value that becomes the return value of setjmp. The second argument exists because one setjmp can correspond to several longjmp calls, so the value returned from setjmp tells which longjmp was taken and therefore where the error occurred.
getrlimit and setrlimit functions
#include <sys/resource.h>
int getrlimit (int resource, struct rlimit *rlptr);
int setrlimit (int resource, const struct rlimit *rlptr);
The getrlimit and setrlimit functions are used to query and change the resource limits of a process. Resource limits are normally established by process 0 and then inherited by each successive process. Three rules govern changes to resource limits:
1 A process can change its soft limit only to a value less than or equal to its hard limit;
2 Any process can lower its hard limit, but the new value must be greater than or equal to its soft limit, and for ordinary users this lowering cannot be undone;
3 Only a superuser process can raise a hard limit.
Resource limits affect the calling process and are inherited by its children, which means that a limit has to be set in the shell in order to affect all of a user's subsequent processes.
Process Environment of linux-process description (5)