Two-point analysis in uClinux

Source: Internet
Author: User

 
Introduction
Some time ago I ported the uClinux-2.0.x and uClinux-2.4.x kernels to a new target machine.
I started almost from scratch: Linux had no code for this target, so the port
amounted to adding support for a new machine.

During this work I learned a lot about programming beyond the operating system itself.
Here I will introduce compiling, debugging, assembling, and linking. The linker will
probably get the most coverage, because it is the most closely tied to the operating system.
I hope to share my experience with you, and corrections of any mistakes are welcome.
Making progress together with fellow netizens is what motivates me to write these original posts.

"Programming is not a zero-sum game. Teaching something to a fellow programmer
doesn't take it away from you. I'm happy to share what I can, because I'm in it
for the love of programming."
-- John Carmack

Execution of user programs in uClinux

I start with user programs because applications are what we deal with most often,
and moving from the application down to the operating system feels natural.
The following simple example shows how a program runs under the operating system.

Suppose there is a C program:
#include <stdio.h>

int main(int argc, char **argv)
{
    printf("Hello world!\n");
    return 0;
}

This is about the simplest program there is. A C program normally starts running from main.
So is main different from other functions, with some special status?
No. main has the same status as any other function, and a C program can in fact be made
to start executing anywhere. The Linux kernel, for example, has no main function; as we all know,
once the early boot code has finished, the system jumps to start_kernel.

So why do all user programs start from main? Because of the user-space C library.
C programs usually call library functions. After the sources are compiled to object files,
the linker links in the binary code of those library functions to form an executable file.
During linking, the linker also places some initialization code in front of the user program.
In uClinux this code lives in crt0.S (I ported the uClibc library). Whatever form crt0.S takes
on a given platform, its last few lines must contain a jmp (or call, or br, or whichever
transfer instruction the platform uses) to main (or to __uClibc_main). That is why all your
programs start from main. If you changed this jump target to another label, say foo, and your
program contained both main and foo, it would start executing from foo. So main is the same
as any other function and has no special status.

Next, how the argc and argv parameters of main are passed in uClinux. I take the flat
format as the example. uClinux supports an executable file format called flat; the format is
quite simple, essentially laid out flat, hence the name. The uClinux-2.4.x kernel apparently
already supports executing ELF files, but to keep the example simple I still use flat.
I will not analyze the flat format itself, only the parameter passing. To develop a user
program for uClinux, you write the code and compile it; the compiler produces ELF files,
so the elf2flt tool is used to convert the ELF file to flat.
Assume that has all been done.

Run foo x y in the uClinux shell. foo is the program name, and x and y are its arguments.
Anyone who has learned C knows that x and y are passed to main as parameters, with argc = 3,
argv[0] = "foo", argv[1] = "x", argv[2] = "y". How do these parameters get there?
When you execute a program, the operating system calls

do_execve(char *filename, char **argv, char **envp, struct pt_regs *regs),

which opens the file at the given path and loads it into memory. argv holds the
command-line arguments and envp holds the environment variables.

When the file is loaded, the system calls the load handler for the file's format.
For the flat format this is load_flat_binary() in fs/binfmt_flat.c. The argv and envp
passed down this path are first processed to count the arguments (argc) and the
environment variables (envc). The parameter tables are then built in the function
create_flat_tables. The whole function is as follows:
static unsigned long create_flat_tables(unsigned long pp, struct linux_binprm *bprm)
{
(1)     unsigned long *argv, *envp;
(2)     unsigned long *sp;
(3)     char *p = (char *)pp;
(4)     int argc = bprm->argc;
(5)     int envc = bprm->envc;
(6)     char dummy;

(7)     sp = (unsigned long *)
            ((-(unsigned long)sizeof(char *)) & (unsigned long)p);

(8)     sp -= envc + 1;
(9)     envp = sp;
(10)    sp -= argc + 1;
(11)    argv = sp;

(12)    flat_stack_align(sp);
(13)    if (flat_argvp_envp_on_stack()) {
(14)        --sp; put_user((unsigned long)envp, sp);
(15)        --sp; put_user((unsigned long)argv, sp);
(16)    }

(17)    put_user(argc, --sp);
(18)    current->mm->arg_start = (unsigned long)p;
(19)    while (argc-- > 0) {
(20)        put_user((unsigned long)p, argv++);
(21)        do {
(22)            get_user(dummy, p); p++;
(23)        } while (dummy);
(24)    }
(25)    put_user((unsigned long)NULL, argv);
(26)    current->mm->arg_end = current->mm->env_start = (unsigned long)p;
(27)    while (envc-- > 0) {
(28)        put_user((unsigned long)p, envp); envp++;
(29)        do {
(30)            get_user(dummy, p); p++;
(31)        } while (dummy);
(32)    }
(33)    put_user((unsigned long)NULL, envp);
(34)    current->mm->env_end = (unsigned long)p;
(35)    return (unsigned long)sp;
}
Lines (1)-(6) are variable declarations. argc and envc hold the previously computed numbers
of arguments and environment variables. p = pp points to the packed argument and environment
strings. sp is the start address of the user-space stack for the program about to run, that is,
the stack foo will have when it executes. Lines (8)-(11) adjust the stack. First sp moves down
by envc + 1 units; these envc + 1 units hold the addresses of the envc elements envp[0] to
envp[envc-1], plus one extra entry set to 0 that marks the end of the envp array. Then sp moves
down by another argc + 1 units, which hold the addresses of the argc elements argv[0] to
argv[argc-1]; the extra entry is again 0 and marks the end of the argv array. After this
adjustment, argv and envp each point to their own area in the stack. If the initial stack
address is called init_sp, then envp = init_sp - (envc + 1) and
argv = envp - (argc + 1).
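
Line (7), which the walkthrough above passes over, clears the low bits of p so that sp starts
on a pointer-sized boundary. A minimal sketch of that masking in ordinary C, assuming pointer
sizes are powers of two; the helper name align_down_to_pointer is made up for illustration:

```c
#include <stdint.h>

/*
 * Round an address down to a multiple of the pointer size.
 * Negating sizeof(char *) in unsigned arithmetic gives a mask with the
 * low bits clear (for 4-byte pointers, ...11111100), which is the same
 * trick line (7) uses.
 */
static uintptr_t align_down_to_pointer(uintptr_t addr)
{
    uintptr_t mask = (uintptr_t)0 - sizeof(char *);

    return addr & mask;
}
```

An address that is already aligned is returned unchanged; any other address is moved down to
the previous boundary, never up, so the tables cannot overlap whatever lies above p.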

Line (12) can be ignored here. Lines (13)-(17) adjust the stack again. Line (14) moves sp down
one more unit and stores envp at that address (envp = init_sp - (envc + 1)); line (15) moves sp
down another unit and stores argv there. Line (17) moves sp once more and stores argc.

Lines (18)-(35) write the addresses of argv[0] to argv[argc-1] (the strings p points to), in
order, into the stack area that argv points to, and then the addresses of envp[0] to
envp[envc-1] (also reached through p) into the stack area that envp points to.
Along the way the fields of the process control block, such as arg_start, arg_end, env_start,
and env_end, are set.
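
The table layout described above can be tried out in user space. The sketch below follows the
same order of operations as create_flat_tables but uses plain memory writes instead of
put_user/get_user, skips flat_stack_align, and always stores the argv and envp start addresses.
build_arg_tables and its parameters are illustrative names, not kernel code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * User-space sketch of the tables built by create_flat_tables().
 * Assumptions: stack_top points just past an ordinary array of words
 * standing in for the user stack, and strings holds the argument and
 * environment strings packed back to back, each ending in '\0'.
 * Returns the final stack pointer, which points at argc.
 */
static uintptr_t *build_arg_tables(uintptr_t *stack_top, const char *strings,
                                   int argc, int envc)
{
    uintptr_t *sp = stack_top;
    uintptr_t *argv, *envp;
    const char *p = strings;

    assert(stack_top != NULL && strings != NULL);

    sp -= envc + 1;            /* envc pointers plus the 0 terminator */
    envp = sp;
    sp -= argc + 1;            /* argc pointers plus the 0 terminator */
    argv = sp;

    *--sp = (uintptr_t)envp;   /* start address of the envp array */
    *--sp = (uintptr_t)argv;   /* start address of the argv array */
    *--sp = (uintptr_t)argc;   /* argc ends up at the lowest address */

    while (argc-- > 0) {       /* record where each argument string starts */
        *argv++ = (uintptr_t)p;
        p += strlen(p) + 1;
    }
    *argv = 0;

    while (envc-- > 0) {       /* same walk for the environment strings */
        *envp++ = (uintptr_t)p;
        p += strlen(p) + 1;
    }
    *envp = 0;

    return sp;
}
```

Called with the top of a 32-word array, the strings "foo", "x", "y" and "PATH=/bin" packed back
to back, argc = 3 and envc = 1, the returned pointer sits 9 words below the top: sp[0] is argc,
sp[1] is the address of the argv array, and sp[2] is the address of the envp array.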

The following example illustrates the process. If foo x y is executed, then argc = 3,
argv[0] = "foo", argv[1] = "x", argv[2] = "y", envc = 1, and envp[0] = "PATH=/bin".
Assume the user-space stack starts at sp = 0x1f0000 and pp = 0x1c0000. After processing,
the user-space stack just before foo executes looks like this:

         ----------------------------------
0x1f0000 | 0000                           |
         ----------------------------------
0x1efffc | envp[0] = 0x1c0008             | ----> points to "PATH=/bin"
         ----------------------------------
0x1efff8 | 0000                           |
         ----------------------------------
0x1efff4 | argv[2] = 0x1c0006             | ----> points to "y"
         ----------------------------------
0x1efff0 | argv[1] = 0x1c0004             | ----> points to "x"
         ----------------------------------
0x1effec | argv[0] = 0x1c0000             | ----> points to "foo"
         ----------------------------------
0x1effe8 | start addr of envp = 0x1efffc  |
         ----------------------------------
0x1effe4 | start addr of argv = 0x1effec  |
         ----------------------------------
0x1effe0 | argc = 3                       |
         ----------------------------------

On this target machine, function arguments are passed in registers r2 to r6. Of course, if
there are more than five arguments, the rest have to go on the stack.

Since main takes parameters, argc must be placed in r2, argv in r3, and envp in r4 before
main is called. As mentioned earlier, sp is the start address of the user-space stack, so
when the code of foo begins to execute, r0 = sp; in the example above, r0 equals 0x1effe0.
The following pseudo-assembly code loads the parameters into the correct registers.

load r2, (r0)       /* r2 = argc */
load r3, (r0, 4)    /* r3 = argv */
load r4, (r0, 8)    /* r4 = envp */
call main           /* jump to the main function */

call _exit

The code above is the simplest possible preparation before entering main. Of course, the
logic differs between file formats and between systems; the examples here are just some of
the situations and solutions I have run into. When I first learned C, main seemed quite
mysterious to me. Once I had worked on the system I understood that main is really no
different from any other function :)

Printf and standard output

Last time I wrote about how the parameters of main are passed; now let's continue. I have been
busy in the lab all week, so the article made no progress. Writing a technical post at the
weekend is a way to relax :-)

After entering main, the program calls printf("Hello world!\n"). A word, by the way, on how
this argument is passed in C. The compiler treats the string "Hello world!\n" as a string
constant. Although printf is called inside main, "Hello world!\n" is not placed on main's
stack. A string constant goes at least into the .data section; in practice it goes into the
read-only data section .rodata, which I verified on a workstation. If the source file is
named hello.c, compile it into an ELF binary with gcc hello.c -o hello and then run
objdump -s hello. The -s option dumps the contents of all sections, and you will see that
"Hello world!\n" is located in the .rodata section.
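
A small sketch of the consequence: because the literal lives in static storage rather than in
the caller's stack frame, a pointer to it remains valid after the function returns. The
function names are illustrative.

```c
#include <stddef.h>
#include <string.h>

/*
 * Returns a pointer to a string literal. The literal has static storage
 * duration (typical ELF toolchains place it in .rodata), so the pointer
 * stays valid after the function returns. Only the pointer variable msg
 * lives on the stack.
 */
static const char *greeting(void)
{
    const char *msg = "Hello world!\n";

    return msg;
}

/* Uses the pointer after greeting() has returned, which is safe here. */
static size_t greeting_length(void)
{
    return strlen(greeting());
}
```

Had greeting() copied the text into a local char array and returned that array's address, the
pointer would dangle as soon as the function returned.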

printf() is a standard C library function. It looks simple to use, but implementing it is not
easy, and it is a platform-dependent function. On a PC, printf() outputs to the terminal
screen; on an embedded device, printf() typically outputs to the serial port. The call is the
same printf(), yet the final output devices differ, so intuitively the lower layers of
printf() must be platform dependent. So how is printf() implemented?

Let's take a look at the C library code, using uClibc as the example.
int printf(const char * __restrict format, ...)
{
    va_list arg;
    int rv;

    va_start(arg, format);
    rv = vfprintf(stdout, format, arg);
    va_end(arg);

    return rv;
}
printf supports formatted output; the details of the argument handling are not covered here.
As you can see, printf() calls vfprintf(). The first argument of vfprintf() is stdout, the
standard output stream. It is a structure whose most important member is its file descriptor,
and for standard output that descriptor is 1.

Follow into vfprintf() and you find fairly complicated argument processing: because the format
of printf()'s arguments is so flexible, vfprintf() has to parse what it is given to produce
the final output. If you are interested, take a look at it, since it lets you implement your
own printf() on a platform with no operating system. That makes it much easier to print
debugging information when you bring up a program on bare metal (uClinux's printk is in fact
implemented this way).
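
As a rough idea of what such a home-grown printf() can look like, here is a minimal sketch. It
assumes a single output hook, put_char, which on real hardware would write one byte to the
UART; here it appends to a buffer so the result can be checked. Only %c, %s, %d, %u, %x and %%
are handled, and all names are invented for this example rather than taken from uClibc or the
kernel.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Capture buffer standing in for the output device. */
static char out_buf[128];
static size_t out_len;

/* Hypothetical output hook: replace the body with a UART write on hardware. */
static void put_char(char c)
{
    if (out_len + 1 < sizeof out_buf) {
        out_buf[out_len++] = c;
        out_buf[out_len] = '\0';
    }
}

static void put_string(const char *s)
{
    while (*s)
        put_char(*s++);
}

/* Print an unsigned value in the given base (2 to 16). */
static void put_unsigned(unsigned long value, unsigned base)
{
    char digits[sizeof(unsigned long) * 8];
    size_t n = 0;

    assert(base >= 2 && base <= 16);
    do {
        digits[n++] = "0123456789abcdef"[value % base];
        value /= base;
    } while (value != 0);
    while (n > 0)
        put_char(digits[--n]);
}

/* Minimal printf: supports %c, %s, %d, %u, %x and %% only. */
static void mini_printf(const char *fmt, ...)
{
    va_list ap;

    va_start(ap, fmt);
    for (; *fmt; fmt++) {
        if (*fmt != '%') {
            put_char(*fmt);
            continue;
        }
        switch (*++fmt) {
        case 'c': put_char((char)va_arg(ap, int)); break;
        case 's': put_string(va_arg(ap, const char *)); break;
        case 'u': put_unsigned(va_arg(ap, unsigned), 10); break;
        case 'x': put_unsigned(va_arg(ap, unsigned), 16); break;
        case 'd': {
            int v = va_arg(ap, int);
            unsigned long mag = (unsigned long)v;

            if (v < 0) {
                put_char('-');
                mag = 0UL - mag;      /* magnitude without signed overflow */
            }
            put_unsigned(mag, 10);
            break;
        }
        case '%': put_char('%'); break;
        case '\0': va_end(ap); return;            /* lone '%' at the end */
        default: put_char('%'); put_char(*fmt); break;
        }
    }
    va_end(ap);
}
```

Keeping all device access inside put_char is the design point: the formatting code stays
platform independent, and porting it to a new board means rewriting one small function.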

After processing its arguments, vfprintf() produces output by calling putc(). Following
putc() down through a few more layers of functions, we find that it ends in the Linux system
call write(). So the output itself is done by operating system code. Everything before
write() is C library code and is platform independent; when it comes to the actual output,
the interface provided by the operating system has to be called. The principle of a system
call is to enter the operating system's kernel space by some mechanism (usually a trap) and
have operating system code complete the task.
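
To see that boundary from the application side, the string can be handed to write() directly,
bypassing the C library's formatting and buffering. A minimal POSIX sketch; write_all is a
hypothetical helper name:

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>

/*
 * Write a whole string to a file descriptor with the write() system
 * call, retrying after short writes and EINTR.
 * Returns 0 on success and -1 on error.
 */
static int write_all(int fd, const char *s)
{
    size_t left;

    assert(s != NULL);
    left = strlen(s);
    while (left > 0) {
        ssize_t n = write(fd, s, left);

        if (n < 0) {
            if (errno == EINTR)
                continue;             /* interrupted: try again */
            return -1;
        }
        s += n;
        left -= (size_t)n;
    }
    return 0;
}
```

Calling write_all(STDOUT_FILENO, "Hello world!\n") sends the text to descriptor 1, the same
descriptor stdout wraps.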

Linux implements system calls differently on different platforms; I will come back to that
later. After write() is called and kernel space is entered, the first function reached is
sys_write(), whose code is in fs/read_write.c. sys_write() has to find the corresponding file
structure from the descriptor fd passed in; for standard output, fd = 1. The process control
block of every process holds an array of open files, and fd is the index used to find the
file structure in that array. Once the structure is found, file->write() is called to send
the data out. Where the output actually goes depends on which device driver the file
structure corresponds to. Embedded systems generally send this output to the serial port, so
at the bottom file->write() ends up calling the serial driver's transmit_char function.
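
The lookup and dispatch can be modeled in a few lines of user-space C. This is a toy model
under stated assumptions, not kernel code: struct task, struct file and model_sys_write are
simplified stand-ins, and in the real kernel the write pointer is reached through the file's
operations table (file->f_op->write).

```c
#include <stddef.h>
#include <string.h>

#define MAX_FILES 8

struct file;

/* Per-driver operations, modeled on the kernel's file_operations idea. */
struct file_ops {
    long (*write)(struct file *f, const char *buf, size_t count);
};

struct file {
    const struct file_ops *ops;
    char captured[64];     /* stands in for the device, e.g. a serial port */
    size_t used;
};

/* Simplified process control block: only the open-file array. */
struct task {
    struct file *fd_table[MAX_FILES];
};

/* Toy "serial driver" write: copies bytes into the capture buffer. */
static long serial_write(struct file *f, const char *buf, size_t count)
{
    size_t room = sizeof f->captured - 1 - f->used;

    if (count > room)
        count = room;
    memcpy(f->captured + f->used, buf, count);
    f->used += count;
    f->captured[f->used] = '\0';
    return (long)count;
}

static const struct file_ops serial_ops = { serial_write };

/* Model of sys_write(): look up the file by descriptor, then dispatch. */
static long model_sys_write(struct task *t, int fd, const char *buf, size_t count)
{
    struct file *f;

    if (fd < 0 || fd >= MAX_FILES || (f = t->fd_table[fd]) == NULL)
        return -1;                    /* the kernel would return -EBADF */
    if (f->ops == NULL || f->ops->write == NULL)
        return -1;                    /* the kernel would return -EINVAL */
    return f->ops->write(f, buf, count);
}
```

Pointing fd_table[1] at a file whose ops is serial_ops plays the role of standard output being
attached to the serial console; swapping in a different ops table changes the output device
without touching model_sys_write.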

There are many books about Linux device drivers. The overall driver structure is very
complicated and I will not go into it here. As for how the terminal device is registered in
the driver list, and how the driver structure is found from the standard output descriptor,
please consult those references.
