PHP High-Performance Development: Multi-Process Development

Source: Internet
Author: User
Tags: message queue, posix, semaphore, syslog

The software industry in the multi-core hardware era
Growth in computing power used to follow Moore's law along the path of ever-higher CPU clock frequencies, from the initial tens of MHz to today's GHz range. Since around 2002, however, raising the clock frequency has become much harder, because higher frequencies bring sharply higher heat dissipation and power consumption. A few years ago, Intel and AMD both changed research direction and began placing multiple execution cores on the same CPU.
Although multi-core is a hardware technology, hardware and software depend on each other: hardware is only the material basis, and it becomes useful only with software support. The advantages of multi-core are now widely agreed on. The first is lower power consumption, which addresses the side effects of chasing higher clock frequencies. The second is stronger computing performance, which meets the needs of multitasking workloads. Exploiting these advantages requires software support, and the most important requirement is that programs be able to run in parallel.

Concurrency and parallelism
Concurrency means a single processor handling multiple tasks over the same period; parallelism means multiple processors, or a multi-core processor, handling different tasks at the same instant. The former is logically simultaneous, while the latter is physically simultaneous. Concurrency is the ability to deal with several activities that are in progress at once, and concurrent events do not necessarily happen at the same moment. Parallelism means that two concurrent events actually happen at the same moment. Parallelism therefore implies concurrency, but concurrency does not imply parallelism.
A metaphor: the difference between concurrency and parallelism is the difference between one person eating three steamed buns and three people eating three steamed buns at the same time.

Processes and Threads

What is a process? The common explanation is that a process is one execution of a program. And what is a thread? A thread can be understood as a fragment of the program executing inside a process. The following points about multitasking environments help clarify the difference between the two:

A process is independent: it has its own memory space and context, and its threads run within that process space. In general (without special techniques), a process cannot cross the process boundary to access memory belonging to another process, whereas threads created by the same process share one memory space because they all live inside that process.

Two pieces of code in the same process cannot execute at the same time unless threads are introduced. Threads belong to a process, and when the process exits, the threads it created are forced to exit and are cleaned up. A thread consumes fewer resources than a process. Both processes and threads can have priorities. In a threaded system, a process can be understood as the first thread of a program.

Almost all modern operating systems are multitasking systems. The computers we typically use often have only one CPU, that is, only one heart, so running multiple processes at the same time requires concurrency techniques. Implementing concurrency is quite complex; the easiest approach to understand is the time-slice round-robin scheduling algorithm. Its idea is briefly as follows: under the management of the operating system, all running processes take turns using the CPU, and each process may occupy the CPU only for a very short time (for example, 10 milliseconds). The user cannot perceive that the CPU is serving the processes in turn, so all of them appear to run without interruption. In fact, at any given moment only one process occupies the CPU.
If a computer has multiple CPUs, the situation is different. When the number of processes is less than the number of CPUs, different processes can be assigned to different CPUs, so multiple processes really do run at the same time, which is parallelism. When the number of processes is greater than the number of CPUs, concurrency techniques are still needed.
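
The round-robin idea described above can be illustrated with a small simulation. This is only a sketch under simplifying assumptions: the function name roundRobin, the process names, and the run times are invented for the example, context-switch cost is ignored, and a real operating system scheduler is far more elaborate.

```php
<?php
// Simulate time-slice round-robin scheduling on a single CPU.
// $remaining maps a process name to the CPU time it still needs (in ms).
// $quantum is the length of one time slice in ms and is assumed to be positive.
// Returns a map of process name => the clock time at which it finished.
function roundRobin(array $remaining, int $quantum): array
{
    $clock = 0;
    $finishedAt = [];
    $queue = array_keys($remaining);      // ready queue, in arrival order

    while ($queue !== []) {
        $name = array_shift($queue);      // the next process gets the CPU
        $slice = min($quantum, $remaining[$name]);
        $clock += $slice;                 // only this process runs during the slice
        $remaining[$name] -= $slice;

        if ($remaining[$name] > 0) {
            $queue[] = $name;             // not done yet: back to the end of the queue
        } else {
            $finishedAt[$name] = $clock;  // done: record the completion time
        }
    }
    return $finishedAt;
}

// Three processes share one CPU with a 10 ms time slice.
print_r(roundRobin(['A' => 25, 'B' => 10, 'C' => 15], 10));
```

With these inputs, B finishes at 20 ms, C at 45 ms, and A at 50 ms, even though only one process holds the CPU at any moment.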

If only one process (with a single thread) is running on a computer that has multiple CPUs, will that process run on more than one CPU?

The way to understand this is that the process is the unit of resource allocation and scheduling in the system, while the thread is the basic unit to which the operating system allocates processor (CPU) time and is the smallest unit of execution. A process is assigned its own resource space, but what actually executes is a thread, and the CPU runs that thread within the process's space. So a process can run on multiple CPUs: even when only a single program is running on a multi-CPU machine, you will see it constantly switching between CPUs.

The status quo of dynamic languages

In Python and Ruby, we can use threads anywhere we like, as most C, C++, and Java developers do. The problem is that the standard implementations of Ruby and Python use a global interpreter lock (GIL). The GIL is a locking mechanism that protects data integrity: it allows only one thread at a time to execute interpreter code, so threads cannot corrupt interpreter data, but they also cannot run in parallel. That is why some people say that Ruby and Python do not have real concurrency (more precisely, real parallelism).

Using multiple processes, that is, forking, is the most common way to get concurrency in Ruby, Python, and PHP. Because the language by default is not capable of real parallelism, or because you want to avoid the challenges of threaded programming, you may prefer to start more processes. This is easy as long as you do not need to share state between processes.

Concurrency is not easy to implement in most programming languages, whereas functional languages such as Erlang, Scala, Lisp, and Clojure simplify parallel development. This is because functions in these languages tend not to share memory or have side effects.

PHP multi-process methods

1. exec or system

2. popen creates a pipe connected to a process; you then use fread, fgets, or stream_get_contents to read the output the process returns. Functions such as exec and system wait for the command to finish before running the code that follows, but popen does not. proc_open is more powerful still: it supports stdin and stdout pipes, a working-directory setting, and so on.

3. PHP has a set of process control functions (PHP must be compiled with --enable-pcntl, and the POSIX extension is also needed). On *nix systems they let PHP create child processes as C does, execute programs with pcntl_exec, handle signals, and so on. pcntl uses ticks as its signal handling mechanism (a signal handler callback mechanism), which minimizes the load when handling asynchronous events. What is a tick? A tick is an event that occurs each time the interpreter executes N low-level statements within a code block, and that block must be designated with declare.
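
Methods 2 and 3 might look roughly like the following. This is a minimal sketch, not a definitive implementation: the helper names readFromCommand and runInChildren are made up for illustration, the second function assumes the pcntl extension is loaded, and error handling is reduced to exceptions.

```php
<?php
// Method 2: start a command with popen and read its output afterwards.
function readFromCommand(string $command): string
{
    $handle = popen($command, 'r'); // returns as soon as the command is started
    if ($handle === false) {
        throw new RuntimeException("popen failed for: $command");
    }
    // The parent is free to do other work here while the command runs.
    $output = stream_get_contents($handle); // blocks until the command closes its stdout
    pclose($handle);                        // waits for the command to terminate
    return $output === false ? '' : $output;
}

// Method 3: fork $count children with pcntl, run $job in each, collect exit codes.
// $job receives the child index and should return an int between 0 and 255.
function runInChildren(int $count, callable $job): array
{
    $pids = [];
    for ($i = 0; $i < $count; $i++) {
        $pid = pcntl_fork();
        if ($pid === -1) {
            throw new RuntimeException('pcntl_fork failed');
        }
        if ($pid === 0) {
            exit((int) $job($i)); // child: do the work, then exit with the result
        }
        $pids[$i] = $pid;         // parent: remember the child and keep forking
    }

    $codes = [];
    foreach ($pids as $i => $pid) {
        pcntl_waitpid($pid, $status);            // blocks until this child ends
        $codes[$i] = pcntl_wexitstatus($status); // exit code reported by the child
    }
    return $codes;
}
```

A call such as runInChildren(3, fn (int $i): int => $i + 10) forks three children and returns their exit codes indexed by child number. Passing results through exit codes only works for small integers; anything larger needs one of the IPC mechanisms discussed later.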

Implementing a daemon in PHP

Writing a daemon involves a few basic rules that help avoid unnecessary trouble.

1. First, call fork once the program starts, and let the parent process exit. The child process gets a new process ID but inherits the parent's process group ID.

2. Call setsid to create a new session. The process becomes the leader of the new session and of a new process group, and it no longer has a controlling terminal (TTY).

3. Change the current working directory to the root directory, so that the daemon does not keep a mounted file system busy. Alternatively, change to some other specific directory.

4. Set the file creation mask to 0, so that an inherited mask does not restrict the permissions of files the daemon creates.

5. Close unneeded open file descriptors. Because a daemon runs in the background and does not interact with a terminal, it usually closes stdin, stdout, and stderr. Handle other descriptors according to the actual situation.
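
The five steps above can be sketched in PHP as follows. The function name daemonize and the global key daemonStdio are hypothetical, the code assumes the pcntl and posix extensions on a *nix system, and it follows the single-fork sequence listed here rather than any particular production recipe.

```php
<?php
// Turn the calling process into a daemon, following the five steps above.
// Call it once, early in the script; the original process exits inside it.
function daemonize(string $workDir = '/'): void
{
    // Step 1: fork and let the parent exit.
    $pid = pcntl_fork();
    if ($pid === -1) {
        throw new RuntimeException('pcntl_fork failed');
    }
    if ($pid > 0) {
        exit(0);
    }

    // Step 2: become the leader of a new session with no controlling terminal.
    if (posix_setsid() === -1) {
        throw new RuntimeException('posix_setsid failed');
    }

    // Step 3: leave the directory the script was started from.
    chdir($workDir);

    // Step 4: clear the inherited file creation mask.
    umask(0);

    // Step 5: close the standard streams inherited from the terminal.
    foreach (['STDIN', 'STDOUT', 'STDERR'] as $name) {
        if (defined($name) && is_resource(constant($name))) {
            fclose(constant($name));
        }
    }
    // Reopen on /dev/null; the next three opens normally reuse descriptors 0 to 2,
    // so stray reads and writes stay harmless. The handles are stored in a global
    // so that they are not closed again when this function returns.
    $GLOBALS['daemonStdio'] = [
        fopen('/dev/null', 'r'),
        fopen('/dev/null', 'w'),
        fopen('/dev/null', 'w'),
    ];
}
```

A script would call daemonize() before entering its main loop. After the call, only the background child is still running, so anything that must be reported to the user has to be printed before it.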

Another problem is that a daemon cannot interact with a terminal, so it cannot print information with printf-style output. We can use the syslog mechanism instead to output information, which also makes the program easier to debug. Before using syslog, the syslogd program must be running; for how to use syslogd, refer to its man page or related documentation, which we do not discuss here.
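
In PHP, the syslog mechanism is reachable through the built-in openlog, syslog, and closelog functions. The following is a small sketch; the identifier my-php-daemon, the wrapper name daemonLog, and the messages are assumptions made for the example, and where the messages end up depends on how the system logger is configured.

```php
<?php
// Open the connection to the system logger once, early in the daemon.
// 'my-php-daemon' is a hypothetical identifier that prefixes each message;
// LOG_PID adds the process ID, LOG_DAEMON marks the message as coming from a daemon.
openlog('my-php-daemon', LOG_PID | LOG_NDELAY, LOG_DAEMON);

// Send one message to syslog instead of printing it.
function daemonLog(string $message, int $priority = LOG_INFO): bool
{
    return syslog($priority, $message);
}

daemonLog('worker started');
daemonLog('could not open the job file', LOG_ERR);

// Close the connection when the daemon shuts down.
closelog();
```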

Inter-process communication (IPC)

The main means of inter-process communication under Linux are introduced below:

    1. Pipe and named pipe: An unnamed pipe can be used for communication between related processes. A named pipe overcomes the limitation that a pipe has no name, so in addition to the functions of a pipe it allows communication between unrelated processes.
    2. Signal: A signal is a relatively complex means of communication, used to notify the receiving process that some event has occurred. Besides being sent between processes, a signal can also be sent by a process to itself. In addition to the early UNIX signal function signal, Linux supports sigaction, the signal function with POSIX.1 semantics. (In fact, this function is based on BSD: to provide a reliable signal mechanism while keeping a unified external interface, BSD reimplemented the signal function using sigaction.)
    3. Message queue: A message queue is a linked list of messages; there are POSIX message queues and System V message queues. A process with sufficient permissions can add messages to a queue, and a process granted read permission can read messages from it. Message queues overcome the drawbacks that signals carry little information and that pipes carry only unformatted byte streams with a limited buffer size.
    4. Shared memory: Multiple processes access the same region of memory. It is the fastest form of IPC available and was designed to address the inefficiency of the other communication mechanisms. It is often combined with other mechanisms, such as semaphores, to achieve synchronization and mutual exclusion between processes.
    5. Semaphore: Mainly used as a means of synchronization between processes and between threads of the same process.
    6. Socket: A more general inter-process communication mechanism that can also be used between processes on different machines. It was originally developed by the BSD branch of UNIX, but it has since been ported to other Unix-like systems: both Linux and System V variants support sockets.
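
As one illustration of the mechanisms listed above, a parent and a forked child can talk over a pair of connected Unix sockets, which behaves much like a two-way pipe. This is a sketch only: the function name askChild and the uppercase-echo protocol are invented for the example, and it assumes the pcntl extension on a *nix system.

```php
<?php
// Send one line to a forked child over a socket pair and return the child's reply.
// The child simply answers with the uppercase version of what it received.
function askChild(string $request): string
{
    // Two connected stream sockets: one end for the parent, one for the child.
    $pair = stream_socket_pair(STREAM_PF_UNIX, STREAM_SOCK_STREAM, STREAM_IPPROTO_IP);
    if ($pair === false) {
        throw new RuntimeException('stream_socket_pair failed');
    }
    [$parentEnd, $childEnd] = $pair;

    $pid = pcntl_fork();
    if ($pid === -1) {
        throw new RuntimeException('pcntl_fork failed');
    }

    if ($pid === 0) {
        // Child: keep only its own end, read one line, reply, and exit.
        fclose($parentEnd);
        $line = (string) fgets($childEnd);
        fwrite($childEnd, strtoupper($line));
        fclose($childEnd);
        exit(0);
    }

    // Parent: keep only its own end, send the request, wait for the reply.
    fclose($childEnd);
    fwrite($parentEnd, $request . "\n");
    $reply = fgets($parentEnd);           // blocks until the child answers
    fclose($parentEnd);
    pcntl_waitpid($pid, $status);         // reap the child so it does not become a zombie
    return rtrim((string) $reply, "\n");
}
```

Calling askChild('ping') returns 'PING'. The same pattern extends to a longer-lived worker if both sides agree on a message format, which is where the complexity of IPC usually starts to show.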

Precautions

    • PHP's memory management mechanism
    • Process I/O exceptions
    • Resource issues when forking
    • The complexity of communication
    • The zombie process reaping mechanism
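
The last point, reaping zombie processes, is commonly handled with a SIGCHLD handler. The sketch below assumes the pcntl extension and PHP 7.1 or later, where pcntl_async_signals can be used in place of the tick-based dispatch described earlier; the names installChildReaper and spawnChild are hypothetical.

```php
<?php
// Install a SIGCHLD handler that reaps finished children.
// Exit codes are collected in $reaped as pid => exit code.
function installChildReaper(array &$reaped): void
{
    // Deliver signals without declare(ticks=1); available since PHP 7.1.
    pcntl_async_signals(true);

    pcntl_signal(SIGCHLD, function () use (&$reaped): void {
        // Several children may exit before the handler runs once,
        // so keep reaping until no exited child is left.
        while (($pid = pcntl_waitpid(-1, $status, WNOHANG)) > 0) {
            $reaped[$pid] = pcntl_wexitstatus($status);
        }
    });
}

// Fork a child that exits immediately with $exitCode.
// Returns the child's pid in the parent.
function spawnChild(int $exitCode): int
{
    $pid = pcntl_fork();
    if ($pid === -1) {
        throw new RuntimeException('pcntl_fork failed');
    }
    if ($pid === 0) {
        exit($exitCode);
    }
    return $pid;
}
```

The parent calls installChildReaper($reaped) once before forking. Because the handler uses WNOHANG, it never blocks the parent's own work, and children that have exited do not linger as zombies.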

Concrete recommendations:

1. It is best to combine the script with a cron job that runs it periodically. Then, even if your code does not manage memory carefully, it does not matter, because the memory is released each time a run finishes.

2. For scripts that must stay resident, be sure to run the code in an infinite loop such as while (1) {}. That way the script does not stop as long as the code does not hit an error.

3. Do not use echo; write to a log instead. echo outputs characters to the screen, and if there is no output target, an error may be reported.

4. If you use MySQL, reconnect each time, or at least check that the connection is still alive before using it. The MySQL server may be restarted while your script is running; once it restarts, the previous connection resource is invalid and you get the error: MySQL server has gone away.

5. Release newly created variables immediately once they are no longer needed.

6. Before checking a file's status, call clearstatcache first; otherwise the results are very likely to be inaccurate. Also, if you open files frequently without closing them, the number of open handles keeps growing, and once the limit is reached the program can no longer open files.
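
Several of these recommendations (the resident loop, logging instead of echo, releasing variables, and clearstatcache) can be combined into a small worker skeleton. This is a sketch under stated assumptions: the function names logLine, freshSize, and runWorker are hypothetical, the log goes to a plain file chosen by the caller, and the stop condition is supplied as a callback so that the loop can end cleanly.

```php
<?php
// Append a timestamped line to a log file instead of echoing it.
function logLine(string $logFile, string $message): void
{
    // Message type 3 appends to the given file; the newline is not added automatically.
    error_log(date('c') . ' ' . $message . "\n", 3, $logFile);
}

// Return the current size of a file, bypassing PHP's stat cache.
function freshSize(string $path): int
{
    clearstatcache(true, $path); // drop any cached stat result for this path
    $size = filesize($path);
    if ($size === false) {
        throw new RuntimeException("cannot stat $path");
    }
    return $size;
}

// Run $task repeatedly until $shouldStop returns true.
// Returns the number of iterations that were executed.
function runWorker(callable $task, callable $shouldStop, string $logFile): int
{
    $iterations = 0;
    while (!$shouldStop()) {
        try {
            $result = $task();
            logLine($logFile, 'task finished: ' . $result);
        } catch (Throwable $e) {
            // Log the failure and keep going, so one error does not end the process.
            logLine($logFile, 'task failed: ' . $e->getMessage());
        }
        unset($result); // release per-iteration data promptly
        $iterations++;
    }
    return $iterations;
}
```

In a real resident script the task would typically sleep or block on input when there is nothing to do, so that the loop does not spin at full speed.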
