Basic Linux concepts

Last Update:2014-01-04 Source: Internet

Author: User

While learning Linux, a few basic concepts kept confusing me. For example, what is the kernel? How do the console, the shell, and the kernel relate to one another? I sorted out these unclear concepts with the help of Baidu and Google; the summary follows.

1 Linux architecture (the original figure was too large and was split into two columns)

2 Linux kernel

2.1 Basic functions

· Memory Management

· Process Management

Process scheduling, IPC

The four elements of a process: program, PCB, address space, and system stack space.

PCB: a core data structure that the kernel allocates when a process is created. The process itself cannot access it directly.

System stack space: the stack the process uses while running in kernel mode. It shares one 8 KB region with the PCB: the PCB occupies about 1000 bytes and the system stack occupies about 7200 bytes.

In the Linux 2.2 kernel, the maximum number of processes is 4092. From 2.4 onward, the number of processes is limited by the amount of physical memory: the PCBs and system stacks (8 KB per process) of all processes together may occupy at most 1/2 of total physical memory. Example with 64 MB of memory: processes ≤ 64 MB / 2 / 8 KB = 4 K.

· Hardware Management

Device drivers: ttyS (character device), sda (block device), network

· File System Management

Virtual File System

2.2 Boot process

1) From BIOS to kernel

MBR -> kernel self-extraction -> kernel initialization -> kernel startup (the start_kernel function, in /usr/src/linux/init/main.c of the Linux kernel source tree)

2) Kernel startup: the kernel creates process 1 and runs it. Process 1 creates several kernel threads and then loads and executes the program /sbin/init, becoming a user process. init then runs the scripts named in the /etc/inittab configuration file to initialize the system, for example setting up the keyboard and fonts, loading modules, and configuring the network.

For Red Hat, the execution sequence is:

/etc/rc.d/rc.sysinit # the first script executed by init
/etc/rc.d/rc $RUNLEVEL # $RUNLEVEL is the default runlevel
/etc/rc.d/rc.local # run in runlevels 2, 3, and 5
/sbin/mingetty (or getty) # waits for the user to log in

/etc/inittab specifies the system's runlevel. init starts the services (background processes) that belong to that runlevel, so different runlevels provide different functionality.

Runlevels 0-6:

0: halt
1: single user
2: multi-user
3: multi-user, with the NFS service started
4: reserved
5: run xdm (X Window) for a graphical login
6: reboot

2.3 User login process

2.4 Linux kernel space and user space

Linux simplifies the segmentation mechanism so that virtual addresses and linear addresses are always identical. On 32-bit x86 the Linux virtual address space is therefore 0 to 4 GB. The kernel divides these 4 GB into two parts. The top 1 GB (virtual addresses 0xC0000000 to 0xFFFFFFFF) is used by the kernel and is called "kernel space". The lower 3 GB (virtual addresses 0x00000000 to 0xBFFFFFFF) is used by each process and is called "user space". Because every process can enter the kernel through a system call, the kernel is shared by all processes in the system. From the point of view of a single process, therefore, each process has 4 GB of virtual space.

Linux uses two protection levels: level 0 for the kernel and level 3 for user programs. Each process has its own private user space (0 to 3 GB), which is invisible to the other processes in the system (the figure that illustrated this cannot be shown here). The top 1 GB of virtual kernel space is shared by all processes and the kernel.

Kernel space holds the kernel's code and data, while a process's user space holds the code and data of the user program. Both kernel space and user space are part of the virtual address space. Although kernel space occupies the top 1 GB of every virtual address space, its mapping to physical memory always starts from the lowest physical address (0x00000000). For kernel space the address mapping is a very simple linear mapping: 0xC0000000 is the offset between physical addresses and linear addresses, which the Linux code calls PAGE_OFFSET.

Code running in kernel space can execute all CPU instructions and access all memory and I/O space.

User space can access only limited resources. When a program needs privileged access, it obtains the resource through a system call.

User space is allowed to take page faults, whereas kernel space generally is not.

Kernel space and user space are defined in terms of the linear address space.

All kernel threads share one address space, whereas each user process has its own address space.

Data movement between user space and kernel space:

access_ok	Check the validity of a user-space memory pointer
get_user	Read a simple variable from user space
put_user	Write a simple variable to user space
clear_user	Zero a block of user-space memory
copy_to_user	Copy a block of data from the kernel to user space
copy_from_user	Copy a block of data from user space to the kernel
strnlen_user	Get the size of a string buffer in user space
strncpy_from_user	Copy a string from user space to the kernel

2.5 Linux kernel mode and user mode

When a task (process) makes a system call and is executing kernel code, the process is said to be in kernel mode. The processor is then executing kernel code at the highest privilege level (level 0). In kernel mode, the kernel code runs on the kernel stack of the current process; each process has its own kernel stack. When a process is executing its own user code, it is said to be in user mode, and the processor is running user code at the lowest privilege level (level 3). When a user program is interrupted while it is running, it can also loosely be described as being in kernel mode, because the interrupt handler uses the kernel stack of the current process. This is similar to the state of a process in kernel mode.

2.6 Linux kernel thread, lightweight process, user thread

Kernel thread

A kernel thread is a "clone" of the kernel; each clone can handle one specific job. This is particularly useful for handling asynchronous events such as asynchronous I/O. Kernel threads are cheap to use: the only resources they consume are the kernel stack and the space for saving registers during a context switch. A kernel that supports multiple threads is called a multi-threaded kernel.

The kernel often needs to do work in the background, and kernel threads carry out this work. A kernel thread differs from an ordinary process in that it has no independent address space (its mm pointer is set to NULL); it runs only in kernel space and never switches to user space. Like an ordinary process, a kernel thread can be scheduled and preempted. A kernel thread can be created only by another kernel thread. To create a new kernel thread from an existing one, use:

int kernel_thread(int (*fn)(void *), void *arg, unsigned long flags)

The new task is created by passing specific flags to the general clone() system call. When the function returns, the parent kernel thread receives a reference to the child; with the prototype above, that is an int (the PID of the new thread) rather than a pointer to its task_struct. The child starts running the function pointed to by fn, with arg as its argument. A kernel thread typically keeps executing the function it was given at creation forever (until the system shuts down). The function usually consists of a loop: the kernel thread wakes up when it is needed, does its work, and then goes back to sleep.

From the kernel's point of view there is no concept of a thread. Linux implements all threads as processes. The kernel provides no special scheduling algorithm and defines no special data structure to represent threads. Instead, a thread is regarded simply as a process that shares certain resources with other processes. Each thread has its own task_struct, so to the kernel it looks like an ordinary process (one that happens to share some resources, such as the address space, with other processes).

After Linux has booted, especially after X Window has started, run "ps -ef" to list the processes in the system. You will find many processes whose names end in "d"; those whose names are shown in square brackets "[]" are kernel threads. The system starts in the order hardware -> kernel -> user-mode processes, and PIDs are allocated in increasing order, so kernel threads started at boot usually have small PIDs.

  PID TTY      STAT   TIME COMMAND
    1 ?        S      0:01 init [3]
    3 ?        SN     0:00 [ksoftirqd/0]
    5 ?        SN     0:00 [ksoftirqd/1]
    6 ?        S<     0:00 [events/0]
    7 ?        S<     0:00 [events/1]
    8 ?        S<     0:00 [khelper]
    9 ?        S<     0:00 [kblockd/0]
   10 ?        S<     0:00 [kblockd/1]
   11 ?        S      0:00 [khubd]
   35 ?        S      0:42 [pdflush]
   36 ?        S      0:02 [pdflush]
   38 ?        S<     0:00 [aio/0]
   39 ?        S<     0:00 [aio/1]
   37 ?        S      0:19 [kswapd0]
  112 ?        S      0:00 [kseriod]
  177 ?        S      0:00 [scsi_eh_0]
  178 ?        S      0:00 [ahc_dv_0]
  188 ?        S      0:00 [scsi_eh_1]
  189 ?        S      0:00 [ahc_dv_1]
  196 ?        S      …:31 [kjournald]
 1277 ?        S      0:00 [kjournald]
 1745 ?        Ss     0:02 syslogd -m 0
 1749 ?        Ss     0:00 klogd -x
 1958 ?        Ss     0:13 /usr/sbin/sshd
 2060 ?        Ss     0:00 crond
    … tty2     Ss+    0:00 /sbin/mingetty tty2
    … tty3     Ss+    0:00 /sbin/mingetty tty3
    … tty4     Ss+    0:00 /sbin/mingetty tty4
    … tty5     Ss+    0:00 /sbin/mingetty tty5
    … tty6     Ss+       … /sbin/mingetty tty6
23564 ?        S      0:00 bash
25605 ?        Ss     0:00 sshd: peter [priv]
25607 ?        S      0:00 sshd: peter@pts/2

Lightweight Process (LWP)

A lightweight process (LWP) is a user thread supported by the kernel. It is a higher-level abstraction built on kernel threads, so LWPs are available only where kernel threads are supported. Each process has one or more LWPs, and each LWP is backed by one kernel thread. This is the one-to-one thread model described in the "dinosaur book" (Operating System Concepts). In such an operating system, the LWP is the user thread.

Because each LWP is associated with a specific kernel thread, each LWP is an independent scheduling unit. Even if one LWP blocks in a system call, the rest of the process keeps running.

Lightweight processes have limitations. First, most LWP operations, such as creation, destruction, and synchronization, require system calls. System calls are relatively expensive because they require switching between user mode and kernel mode. Second, each LWP needs a kernel thread behind it, so LWPs consume kernel resources (the stack space of the kernel thread). A system therefore cannot support a very large number of LWPs.

User thread

User threads are implemented entirely by a thread library in user space. Creating, synchronizing, and destroying user threads is done completely in user space without help from the kernel, so these operations are extremely fast and cheap.

In the earliest user thread model, a process contains threads that are implemented in user space. The kernel does not schedule user threads directly; it schedules the process, exactly as it does a traditional process, and is unaware that user threads exist. Scheduling among user threads is done by the thread library in user space.

This model corresponds to the many-to-one thread model described in the dinosaur book. Its drawback is that if one user thread blocks in a system call, the whole process blocks.

Enhanced user threads: user threads + LWP

3 GNU utilities

3.1 Core utilities

GNU coreutils

3.2 How the shell works

The basic function of the shell is to interpret and execute the commands that users type, providing the interface between users and the Linux kernel. After the system starts, the kernel creates a process for each terminal user to run the shell interpreter. Execution proceeds roughly as follows:

(1) Read the command line typed by the user at the keyboard.

(2) Parse the command: take the command name as a file name and convert the other arguments into the form required by the execve() system call.

(3) The terminal process calls fork() to create a child process.

(4) The terminal process itself calls wait4() to wait for the child to finish (for a background command it does not wait). When the child runs, it calls execve(): it looks up the file named by the command in the directories, loads it into memory, and executes the program (that is, carries out the command).

(5) If the command ends with "&" (the background command symbol), the terminal process does not call wait4(); it immediately prints a prompt for the next command and goes back to step (1). If there is no "&" at the end of the command, the terminal process waits until the child (the process running the command) finishes and terminates, which is reported to the parent (the terminal process). The terminal process then wakes up and, after the necessary checks, prints a prompt for a new command and repeats the process above.

4 Terminals and the console

Because early computers were expensive, one computer was generally shared by several people at the same time, so many keyboards and monitors had to be connected to it. The devices built for this purpose had only a display, a keyboard, and a simple processing circuit, and could not process information themselves. Such a device was connected to an ordinary computer (usually through a serial port) and was used to log in to that computer and operate it. The operating systems of the time were, of course, multitasking, multi-user systems. A device of this kind, with only a display and a keyboard and connected to a computer through a serial port, is called a terminal.

What is the console? The keyboard and display that are connected directly to the computer are called the console. Note how it differs from a terminal: a terminal is attached through a serial port and is not part of the computer itself, whereas the console is part of the computer, and a computer has only one console. When the computer starts, all messages are displayed on the console, not on the terminals. In other words, the console is a basic device of the computer and a terminal is an add-on device. Because the console can also do everything a terminal does, the console is sometimes loosely called a terminal as well. Messages that are not tied to a particular terminal, such as kernel messages and messages from background services, can be shown on the console but not on a terminal.

Today computer hardware is so cheap that a computer is usually used by one person alone, and the old terminal devices are no longer attached. The concepts of terminal and console have therefore evolved from hardware into software. The terminals we talk about now, such as the virtual terminals in Linux, are software concepts: software emulates the old hardware. For example, in Linux you can press Alt+F1 to Alt+F6 to switch among six virtual terminals, much like six terminal devices attached to a shared computer in the past. This is why they are called "virtual terminals". Linux can, of course, still connect to a real terminal over a serial line.

Simply put, a terminal that can directly display system messages is called the console; all the others are called terminals.

