Process
What is a process? A process is an executing program: an instance of a program running on a computer, and an entity that can be assigned to and executed by a processor. A process typically comprises an instruction set and a set of system resources, where the instruction set is the program code and the system resources include I/O, CPU time, memory, and so on. Putting these together, a process can be understood as one execution of a program, with some independent function, over a data set; it is the operating system's independent unit of resource allocation and scheduling.
As a process executes, it can be uniquely represented by the following elements:
- Process identifier: a unique identifier that distinguishes the process from all other processes. In Linux this is the process ID, assigned during the fork system call; note that getpid() actually returns the thread group number (the tgid field) rather than the pid field itself.
- Process state: the current state of the process, such as suspended or running.
- Priority: the process's scheduling precedence relative to other processes.
- Program counter: the address of the next instruction to be executed, either in kernel space or in the user's memory space.
- Memory pointers: pointers to the program code and process-related data, as well as to memory blocks shared with other processes.
- Context data: the contents of the processor's registers while the process executes.
- I/O status information: outstanding I/O requests, I/O devices assigned to the process, and so on.
- Accounting information: may include total processor time used, clock counts, time limits, and so on.
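The distinction between the process identifier and per-thread identifiers can be observed from user space. Below is a small illustrative sketch (Python is used in place of C for brevity; `threading.get_native_id()` requires Python 3.8+): every thread reports the same process ID, while each has its own kernel thread ID.

```python
import os
import threading

results = {}

def report(name):
    # Every thread in a process shares the same process ID (the thread
    # group ID, tgid, on Linux), but each has its own kernel thread ID.
    results[name] = (os.getpid(), threading.get_native_id())

t1 = threading.Thread(target=report, args=("t1",))
t2 = threading.Thread(target=report, args=("t2",))
t1.start(); t2.start(); t1.join(); t2.join()
report("main")

pids = {pid for pid, _tid in results.values()}
tids = {tid for _pid, tid in results.values()}
print(len(pids), len(tids))
```

All three entries share one process ID but have three distinct thread IDs.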
All of these elements are stored in a data structure called the process control block (PCB). With process control blocks, the operating system can support multiple processes and provide multiprocessing. When the operating system switches processes, it performs two steps: it interrupts the process currently on the processor, and it executes the next process. In both steps the program counter, context data, and process state in the process control block change. When a process is interrupted, the operating system saves its program counter and processor registers (the context data) into the appropriate fields of its process control block and updates its state, which may become blocked or ready. When the next process is executed, the operating system selects it according to its scheduling policy and loads that process's context data and program counter.
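As a rough illustration, the fields just described can be sketched as a data structure. This is a toy model only, not any real kernel's layout (Linux's actual equivalent is `struct task_struct`):

```python
from dataclasses import dataclass, field
from enum import Enum

class State(Enum):
    READY = "ready"
    RUNNING = "running"
    BLOCKED = "blocked"

@dataclass
class ProcessControlBlock:
    # Toy model of the PCB fields described above; real kernels
    # (e.g. Linux's struct task_struct) are far more elaborate.
    pid: int
    state: State = State.READY
    priority: int = 0
    program_counter: int = 0                        # next instruction address
    registers: dict = field(default_factory=dict)   # saved context data
    open_files: list = field(default_factory=list)  # I/O status information
    cpu_time_used: float = 0.0                      # accounting information

def context_switch(current, next_, pc, regs):
    """Save the outgoing process's context and mark the next one running."""
    current.program_counter = pc        # save where the process stopped
    current.registers = regs
    current.state = State.READY         # or State.BLOCKED if it was waiting
    next_.state = State.RUNNING

a = ProcessControlBlock(pid=100, state=State.RUNNING)
b = ProcessControlBlock(pid=101)
context_switch(a, b, pc=0x4004D6, regs={"rax": 1})
print(a.state.value, b.state.value)  # ready running
```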
Thread
A process has two characteristic parts: resource ownership and dispatch/execution. Resource ownership means that a running process holds resources such as memory space and I/O devices. Dispatch/execution refers to the execution path through the program, that is, the program's instruction stream. These two parts can be separated: after separation, the owner of the resources is still called a process, while the dispatchable unit of executing code is called a thread, or lightweight process.
A thread carries the meaning of a "thread of execution", while in a multithreaded environment the process is defined as the resource owner and keeps the process control block. A thread's structure differs from a process's; each thread includes:
- Thread state: the current state of the thread.
- An execution stack.
- Private storage area: static storage for the thread's local variables.
- Register set: saved processor state.
Each process has a process control block and a user address space, while each thread has its own stack, its own control block, and its own independent execution context. The structure is shown in Figure 8.1.
Figure 8.1 Process Model diagram
Threads differ somewhat from processes during execution. Each thread has its own program entry point, sequential execution order, and exit point. However, a thread cannot execute on its own; it must live inside a process, which provides execution control for its threads. Logically, the meaning of multithreading is that multiple parts of one process can execute concurrently. At that point, the process itself is no longer the basic unit of execution but a container for threads.
The advantage of threads over processes is speed: creating a thread, terminating a thread, switching between threads, and sharing data or communicating between threads are all markedly faster than the corresponding operations on processes.
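The sharing claim is easy to see in practice: threads in one process read and write the same objects directly, needing only a lock for coordination. Here is an illustrative Python sketch (in C, the same idea would use pthreads and a mutex):

```python
import threading

counter = 0
lock = threading.Lock()

def worker(n):
    global counter
    for _ in range(n):
        # All threads share the process's memory, so they can update
        # the same variable directly; the lock prevents lost updates.
        with lock:
            counter += 1

threads = [threading.Thread(target=worker, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # 40000
```

No serialization or message passing is needed, in contrast with inter-process communication.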
Concurrency and parallelism
Concurrency refers to the ability to handle multiple activities at once; concurrent events do not necessarily occur at the same instant. For example, a modern computer can load several programs into memory at the same time and, through time-division multiplexing on a single processor, give the impression that they are running simultaneously.
Parallelism refers to two events actually occurring at the same time. Parallelism implies concurrency, but concurrency does not necessarily imply parallelism.
The difference between concurrency and parallelism is that the former is a single processor handling multiple tasks "at once", while the latter is multiple processors or cores handling different tasks at literally the same time. The former is logically simultaneous; the latter is physically simultaneous.
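The "logically simultaneous" case can be demonstrated with cooperative concurrency on a single OS thread. In this illustrative asyncio sketch, two tasks make progress in alternation without ever running in parallel:

```python
import asyncio

events = []

async def task(name, steps):
    for i in range(steps):
        events.append(f"{name}{i}")
        # Yield to the event loop so the other task can run:
        # concurrency by interleaving on a single thread.
        await asyncio.sleep(0)

async def main():
    await asyncio.gather(task("a", 2), task("b", 2))

asyncio.run(main())
print(events)
```

The recorded order alternates between the two tasks, even though only one of them executes at any instant.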
Various concurrency models for PHP
Since there are two models, which one does PHP use? The answer is that both are supported; that is, PHP also supports the multithreaded model, and in the multithreaded case the usual problems of resource sharing and isolation must be solved. PHP itself is thread-safe (when built with ZTS).
Which model is actually used depends on the SAPI. Under Apache, for example, PHP may run in either a multithreaded or a multi-process model, while PHP-FPM uses a multi-process model.
The most commonly recommended approach is PHP-FPM, because this model has several advantages for PHP:
- Simple memory release: with the multi-process model, memory can be reclaimed simply by ending the process. PHP has many extensions, and a little carelessness can cause memory leaks; FPM sidesteps this problem by brute force, periodically exiting and restarting its worker processes.
- Fault tolerance: for the same reason, an extension or PHP itself may hit a segmentation fault. In a single-process multithreaded model, that would kill the whole PHP service; with multiple processes, the death of one process does not affect the service as a whole.
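The fault-isolation point can be sketched with fork: a worker process that dies abnormally does not take the parent with it. This is an illustrative, POSIX-only Python sketch (it relies on os.fork):

```python
import os
import signal

pid = os.fork()
if pid == 0:
    # Child worker: simulate a crash such as a segfault in an extension.
    os.kill(os.getpid(), signal.SIGSEGV)
    os._exit(0)                      # never reached

# Parent: reap the dead worker and keep serving.
_, status = os.waitpid(pid, 0)
crashed = os.WIFSIGNALED(status) and os.WTERMSIG(status) == signal.SIGSEGV
print("worker crashed:", crashed, "- parent still running")
```

A process manager like PHP-FPM does exactly this in a loop: it notices the dead worker and forks a replacement.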
Multi-process has its advantages, but multithreading has advantages of its own; HHVM, for example, chose a multithreaded model. The biggest benefit of the multithreaded model is the ease of information sharing and communication, because within the same process address space pointers can be used directly.
Consider opcode caches, for example: in PHP, tools such as APC and OPcache use shared memory to share opcodes between processes, whereas HHVM does not need to go through shared memory at all. Shared memory also has the drawback that complex data structures are inconvenient to store, because of the pointer problem, while in the multithreaded case C/C++ data structures can be shared directly. This also helps efficiency.
There is also a clear model difference between multi-process and multithreaded servers: the logic for handling requests.
In the multi-process case, the connection file descriptor is not passed around, because passing descriptors across processes is awkward. So multi-process servers typically have the parent process listen() and each child process accept(), achieving load balancing among the children. This model can suffer from the thundering-herd problem.
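The parent-listen()/child-accept() pattern can be sketched directly. This is an illustrative, POSIX-only Python sketch; a real prefork server such as PHP-FPM would fork a pool of workers that each loop on accept():

```python
import os
import socket

# Parent creates the listening socket before forking, so every child
# inherits the same file descriptor -- the prefork model (simplified).
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
listener.bind(("127.0.0.1", 0))          # any free port
listener.listen(8)
port = listener.getsockname()[1]

children = []
for _ in range(2):                        # two worker processes
    pid = os.fork()
    if pid == 0:                          # child: accept one request, reply, exit
        conn, _addr = listener.accept()
        conn.sendall(b"pid=%d" % os.getpid())
        conn.close()
        os._exit(0)
    children.append(pid)

replies = []
for _ in range(2):                        # parent doubles as the client here
    c = socket.create_connection(("127.0.0.1", port))
    replies.append(c.recv(64))
    c.close()

for pid in children:
    os.waitpid(pid, 0)
print(replies)
```

The kernel distributes incoming connections among the children blocked in accept(), so each request is answered by a different worker process.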
In a multithreaded model, a separate thread can be used to accept requests and distribute them to individual worker threads.
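The threaded counterpart is an acceptor handing work to a pool over an in-process queue. This sketch uses plain threads and queue.Queue, with strings standing in for accepted connections:

```python
import queue
import threading

tasks = queue.Queue()
handled = []
handled_lock = threading.Lock()
N_WORKERS = 3

def worker():
    while True:
        req = tasks.get()          # block until the acceptor hands us work
        if req is None:            # sentinel: shut down
            break
        with handled_lock:
            handled.append(req)
        tasks.task_done()

def acceptor(n_requests):
    # Stand-in for the thread that accept()s connections and dispatches
    # them to workers; here the "requests" are just strings.
    for i in range(n_requests):
        tasks.put(f"req-{i}")

workers = [threading.Thread(target=worker) for _ in range(N_WORKERS)]
for w in workers:
    w.start()
acceptor(10)
tasks.join()                        # wait until every request is processed
for _ in workers:
    tasks.put(None)                 # stop the workers
for w in workers:
    w.join()
print(len(handled))  # 10
```

Because all threads share one address space, the queue hands over references directly; no descriptor passing or shared-memory segment is needed.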