Linux Kernel -- load executable binary files 1. copy_strings, notepad binary copy
From here on we analyze the last core module, exec.c. Once this file has been analyzed, it closes the loop with all the previous analyses: process creation, program loading, process scheduling, and memory management.
The core function of exec.c, do_execve, is very long and calls many other functions. copy_strings is one of them, so we analyze it first.
First, let's look at where it is called, in main.c:
static char * argv_rc[] = { "/bin/sh", NULL };   // argument string array used when the program is invoked
static char * envp_rc[] = { "HOME=/", NULL };    // environment string array used when the program is invoked

void init(void)
{
    ...
    execve("/bin/sh", argv_rc, envp_rc);   // replace the current program with /bin/sh and run it
    ...
}
Now look at exec.c:
/*
 * MAX_ARG_PAGES defines the maximum number of memory pages allocated to a new program
 * for its arguments and environment variables. 32 pages should be enough; this makes
 * the total environment and argument (env + arg) space 128kB!
 */
#define MAX_ARG_PAGES 32

int do_execve(unsigned long * eip, long tmp, char * filename,
    char ** argv, char ** envp)
{
    // Page pointer array for the argument and environment string space.
    unsigned long page[MAX_ARG_PAGES];
    int i, argc, envc;
    // Offset pointer into the argument and environment string space,
    // initialized to point to the last long word of that space.
    unsigned long p = PAGE_SIZE * MAX_ARG_PAGES - 4;
    ...
    // Count the arguments and the environment variables.
    argc = count(argv);
    envc = count(envp);
    // If the sh_bang flag is not yet set, set it and copy the given number of
    // environment strings and arguments into the argument and environment space.
    if (sh_bang++ == 0) {
        p = copy_strings(envc, envp, page, p, 0);
        p = copy_strings(--argc, argv + 1, page, p, 0);
    }
    ...
}
mm.h:
#define PAGE_SIZE 4096   // memory page size, in bytes
Now read exec.c and segment.h together:
/*
 * count() counts the number of command-line arguments/environment variables.
 */
// Count the arguments.
// Parameter: argv - argument pointer array; the last pointer is NULL.
// Returns: the number of arguments.
static int count(char ** argv)
{
    int i = 0;
    char ** tmp;

    if (tmp = argv)
        while (get_fs_long((unsigned long *) (tmp++)))
            i++;

    return i;
}

// Read the long word (4 bytes) at the given address in the fs segment.
// Parameter: addr - the memory address.
// %0 - (return value _v); %1 - (memory address addr).
// Returns: the long word at fs:[addr].
extern inline unsigned long get_fs_long(const unsigned long * addr)
{
    unsigned long _v;

    __asm__ ("movl %%fs:%1,%0" : "=r" (_v) : "m" (*addr));
    return _v;
}
First, consider how the number of arguments and environment variables is obtained. main.c declares two pointer arrays, argv_rc and envp_rc, and passes them to execve.
As a reminder, int *a[4] declares an array of pointers:
every element of the array a is a pointer to int.
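As a small illustration of this kind of declaration, here is a minimal sketch that walks an array of int pointers. The function name sum_through is hypothetical and is not part of the kernel; it only shows that each element is itself a pointer and that the array decays to int ** when passed to a function, much as argv_rc is later seen as char **.

```c
#include <stddef.h>

/* Sum the ints reachable through an array of int pointers.
 * When passed to a function, `a` decays to int **. */
int sum_through(int *a[], size_t n)
{
    int total = 0;

    for (size_t i = 0; i < n; i++)
        total += *a[i];   /* a[i] is a pointer; *a[i] is the int it points to */
    return total;
}
```

A caller would declare something like int *a[4] = { &x, &y, &z, &w }; and pass a together with its element count.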
Note that the formal parameters of do_execve are char **argv and char **envp, that is, pointers to pointers. In count, tmp++ therefore steps through the addresses of the elements of the array argv_rc, and in get_fs_long, *addr refers to the value of an argv_rc element (that is, the char pointer to "/bin/sh"). Because the instruction is written with fs:%1 rather than fs:[%1], _v ends up holding the complete address stored in that element, which is the char pointer itself. count therefore determines the number of entries by checking whether each slot holds a non-zero address.
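The counting logic can be tried outside the kernel with a minimal sketch. It assumes that a plain pointer dereference can stand in for get_fs_long(), which is only true in ordinary user-space code with no segment switching; the name count_strings is hypothetical.

```c
#include <stddef.h>

/* User-space analogue of count(): walk the pointer array until the NULL slot.
 * Assumption: a plain dereference replaces get_fs_long(). */
int count_strings(char **argv)
{
    int i = 0;
    char **tmp = argv;

    if (tmp)
        while (*tmp++)   /* read the stored char pointer; stop at NULL */
            i++;
    return i;
}
```

Called on an array such as { "/bin/sh", NULL }, the loop reads one non-NULL pointer, then the NULL terminator, and stops.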
/*
 * The 'copy_strings()' function copies argument and environment strings from user
 * memory to free pages in kernel memory. They are in a format ready to be placed
 * directly into the new user memory.
 *
 * Modified by TYT (Tytso), 1991.12.24: added the from_kmem parameter, which indicates
 * whether the strings or the string array come from the user segment or the kernel segment.
 *
 * from_kmem     argv *        argv **
 *    0          user space    user space
 *    1          kernel space  user space
 *    2          kernel space  kernel space
 *
 * This is done through clever handling of the fs segment register. Because loading a
 * segment register is costly, we avoid calling set_fs() unless it is necessary.
 */
// Copy the given number of argument strings into the argument and environment space.
// Parameters: argc - number of arguments to add; argv - argument pointer array;
//             page - page pointer array of the argument and environment space;
//             p - offset pointer into the argument table space, always pointing to the
//                 head of the strings copied so far; from_kmem - string source flag.
// In do_execve(), p is initialized to point to the last long word of the argument table
// (128kB) space. Strings are copied into it backward, the way a stack grows, so p always
// points to the head of the argument strings.
// Returns: the current head pointer of the argument and environment space.
static unsigned long copy_strings(int argc, char ** argv, unsigned long * page,
    unsigned long p, int from_kmem)
{
    char *tmp, *pag;
    int len, offset = 0;
    unsigned long old_fs, new_fs;

    if (!p)
        return 0;    /* bullet-proofing */    /* offset pointer check */
    // Load the ds register value into new_fs and save the original fs value in old_fs.
    new_fs = get_ds();
    old_fs = get_fs();
    // If both the strings and the string array are in kernel space, point the fs
    // segment register at the kernel data segment (ds).
    if (from_kmem == 2)
        set_fs(new_fs);
    // Loop over the arguments, copying from the last one toward the given offset.
    while (argc-- > 0) {
        // If the strings are in user space and the string array is in kernel space,
        // point the fs segment register at the kernel data segment (ds).
        if (from_kmem == 1)
            set_fs(new_fs);
        // Working backward from the last argument, fetch its pointer through the fs
        // segment into tmp. A null pointer is an error.
        if (!(tmp = (char *) get_fs_long(((unsigned long *) argv) + argc)))
            panic("argc is wrong");
        // If the strings are in user space and the string array is in kernel space,
        // restore the original value of the fs segment register.
        if (from_kmem == 1)
            set_fs(old_fs);
        // Compute the string length len and leave tmp pointing at the end of the string.
        len = 0;    /* remember zero-padding */
        do {        /* we know that the string ends with a NULL byte */
            len++;
        } while (get_fs_byte(tmp++));
        // If the string is longer than the free space left in the argument and
        // environment space, restore the fs segment register and return 0.
        if (p - len < 0) {    /* this shouldn't happen - 128kB */
            set_fs(old_fs);   /* will not happen, because there is 128kB of space */
            return 0;
        }
        // Copy the current argument string from the fs segment, byte by byte,
        // starting from the end of the string.
        while (len) {
            --p; --tmp; --len;
            // offset is initialized to 0 on entry, so offset - 1 < 0 means either that
            // this is the first byte copied or that a page boundary has just been crossed.
            // offset is then set to the offset of p within its page, and a free page is
            // requested if needed.
            if (--offset < 0) {
                offset = p % PAGE_SIZE;
                // If the strings and the string array are in kernel space, restore the
                // original value of the fs segment register.
                if (from_kmem == 2)
                    set_fs(old_fs);
                // If page[p/PAGE_SIZE] == 0, the page for the current offset p does not
                // exist yet: request a free page, store its pointer in the array, and
                // point pag at it. If no free page can be obtained, return 0.
                if (!(pag = (char *) page[p/PAGE_SIZE]) &&
                    !(pag = (char *) page[p/PAGE_SIZE] =
                      (unsigned long *) get_free_page()))
                    return 0;
                // If the strings and the string array are in kernel space, point the fs
                // segment register at the kernel data segment (ds) again.
                if (from_kmem == 2)
                    set_fs(new_fs);
            }
            // Copy one byte of the argument string from the fs segment to pag + offset.
            *(pag + offset) = get_fs_byte(tmp);
        }
    }
    // If the strings and the string array are in kernel space, restore the original
    // value of the fs segment register.
    if (from_kmem == 2)
        set_fs(old_fs);
    // Finally, return the offset of the head of the copied argument information
    // within the argument and environment space.
    return p;
}
To begin with, p holds the logical offset of the last long word in the argument and environment space, that is, PAGE_SIZE * MAX_ARG_PAGES - 4 = 131068 within the 128kB region.
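The arithmetic behind p can be checked with a short sketch. The helper names initial_p, page_index and page_offset are hypothetical; the constants are the ones quoted from mm.h and exec.c above.

```c
#define PAGE_SIZE     4096
#define MAX_ARG_PAGES 32

/* Initial value of p in do_execve: the last long word of the 128kB space. */
unsigned long initial_p(void)
{
    return PAGE_SIZE * MAX_ARG_PAGES - 4;
}

/* Index into the page[] array for offset p. */
unsigned long page_index(unsigned long p)
{
    return p / PAGE_SIZE;
}

/* Byte offset of p within its page. */
unsigned long page_offset(unsigned long p)
{
    return p % PAGE_SIZE;
}
```

With these values the initial p is 131068, which falls in the last page (index 31) at byte offset 4092, so the first strings copied land at the very top of the space.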
Next, the loop works backward from the last argument: it fetches the last argument pointer through the fs segment into tmp.
Then it measures the string length. Note that in get_fs_byte, *addr is the value the character pointer points to, so _v receives a single one-byte character value. Because the loop is a do/while, the terminating NUL byte is counted as well, and tmp ends up one byte past it.
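The length loop can be reproduced in user space as a minimal sketch, assuming a plain byte read in place of get_fs_byte(). The function name len_with_nul and its end output parameter are hypothetical additions for illustration.

```c
/* Mirror of the do/while length loop: the terminating NUL is counted too.
 * Assumption: a plain read replaces get_fs_byte(tmp++). */
int len_with_nul(const char *s, const char **end)
{
    int len = 0;
    const char *tmp = s;

    do {
        len++;
    } while (*tmp++);

    if (end)
        *end = tmp;   /* one past the NUL, which is where copy_strings leaves tmp */
    return len;
}
```

For "/bin/sh" this yields a length of 8 (seven characters plus the NUL), which is why the copy loop that follows also stores the terminator.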
After that, the copy runs backward from the end of the string. Note that the page array is not used for mapping; it only stores the addresses of the memory pages. offset changes on every iteration of the loop.
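The backward copy with on-demand page allocation can be sketched in user space as follows. This is an approximation under stated assumptions, not the kernel code: calloc() stands in for get_free_page(), plain reads stand in for get_fs_byte(), the fs switching and from_kmem are left out, and page is declared as an array of char pointers. The names copy_strings_sketch and byte_at are hypothetical.

```c
#include <stdlib.h>

#define PAGE_SIZE     4096
#define MAX_ARG_PAGES 32

/* Copy argc strings backward into a lazily allocated set of pages.
 * Returns the new head offset, or 0 on failure. */
unsigned long copy_strings_sketch(int argc, char **argv,
                                  char *page[MAX_ARG_PAGES], unsigned long p)
{
    char *tmp, *pag = NULL;
    int len, offset = 0;

    while (argc-- > 0) {
        tmp = argv[argc];                 /* last argument first */
        if (!tmp)
            return 0;
        len = 0;
        do { len++; } while (*tmp++);     /* length including the NUL */
        if (p < (unsigned long) len)      /* not enough room left */
            return 0;
        while (len) {
            --p; --tmp; --len;
            if (--offset < 0) {           /* first byte, or page boundary crossed */
                offset = p % PAGE_SIZE;
                pag = page[p / PAGE_SIZE];
                if (!pag) {               /* page not allocated yet */
                    pag = page[p / PAGE_SIZE] = calloc(1, PAGE_SIZE);
                    if (!pag)
                        return 0;
                }
            }
            pag[offset] = *tmp;           /* store one byte at pag + offset */
        }
    }
    return p;
}

/* Read back the byte stored at offset p (the page must already exist). */
char byte_at(char *page[MAX_ARG_PAGES], unsigned long p)
{
    return page[p / PAGE_SIZE][p % PAGE_SIZE];
}
```

Starting from p = PAGE_SIZE * MAX_ARG_PAGES - 4 and copying the environment first and the arguments second, as do_execve does, leaves the argument strings below the environment strings, with p pointing at the first byte of the first argument copied.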
Finally, p is returned; it now holds the offset of the start of the last string copied.