Inter-process communication in Linux


By Vamei. Source: http://www.cnblogs.com/vamei. You are welcome to reprint this article, but please keep this statement. Thank you!

Thanks to Nonoob for the correction.

As explained in Linux Signal Basics, a signal can be seen as a crude form of inter-process communication (IPC): it delivers a small piece of information into a process's otherwise closed memory space. To pass richer information between processes, we need other IPC mechanisms. These mechanisms fall into two categories:

    • The pipe mechanism. In Linux Text Streams we saw that a pipe can connect the output of one process to the input of another, so that inter-process communication can be managed through the file-manipulation API. In the shell we often use pipes to chain several processes together, letting them cooperate to accomplish complex tasks.
    • Traditional IPC. By this we mainly mean the message queue, the semaphore, and shared memory. These IPC facilities let multiple processes share resources, much as multiple threads share the heap and global data. Because multi-process tasks are concurrent (each process contains at least one thread, so several processes mean several threads), synchronization must also be handled when sharing resources (see Linux Multi-threading and Synchronization).

Pipes and FIFO files

A primitive IPC approach is to let all processes communicate through a file. For example, I write my name and age on a piece of paper (a file). Another person who reads the paper learns my name and age; he can also write his own information on the same paper, and when I read it I learn his information too. However, hard-disk reads and writes are slow, so this method is very inefficient. Can we instead put this piece of paper in memory to speed up reading and writing?

In Linux Text Streams we explained how to use pipes to connect multiple processes in the shell. Many programming languages offer similar facilities, for example Popen and PIPE in Python's subprocess module, or the popen library function in C (the shell's pipes are built on the same underlying mechanism). A pipe is a buffer managed by the kernel; it is the equivalent of the note we put into memory. One end of the pipe is connected to the output of one process, which puts information into the pipe. The other end is connected to the input of another process, which takes out the information that was put in. The buffer does not need to be large; it is designed as a circular data structure so that it can be reused. When there is no information in the pipe, the reading process waits until the process at the other end puts something in. When the pipe is full, the writing process waits until the process at the other end takes something out. When both processes terminate, the pipe disappears automatically.
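
As a minimal sketch (not from the original article), the following C program uses the popen library function mentioned above to run a command and read its output through a pipe; the command string "ls -l" is just an illustrative choice:

    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        char line[256];

        /* popen forks a shell, runs the command, and connects its
         * standard output to a pipe that we read as a FILE stream. */
        FILE *fp = popen("ls -l", "r");
        if (fp == NULL) {
            perror("popen");
            return EXIT_FAILURE;
        }

        /* Read the child's output line by line from the pipe. */
        while (fgets(line, sizeof(line), fp) != NULL)
            fputs(line, stdout);

        /* pclose waits for the command to finish and closes the pipe. */
        pclose(fp);
        return EXIT_SUCCESS;
    }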

In principle, a pipe is built using the fork mechanism (see Linux Process Basics and Linux: From Program to Process), which allows two processes to connect to the same pipe. At first, both ends of the pipe are connected to the same process (process 1). When fork creates a new process (process 2), both connections are copied to it. Each process then closes the connection it does not need: process 1 closes its read end of the pipe, process 2 closes its write end, and the remaining connections form a one-way pipe from process 1 to process 2.
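
A minimal sketch of this fork-based construction, assuming the parent writes and the child reads (matching the description above):

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>
    #include <sys/wait.h>

    int main(void)
    {
        int fd[2];                 /* fd[0]: read end, fd[1]: write end */
        char buf[64];

        if (pipe(fd) == -1) {      /* create the kernel-managed buffer */
            perror("pipe");
            return EXIT_FAILURE;
        }

        pid_t pid = fork();        /* both ends are copied to the child */
        if (pid == -1) {
            perror("fork");
            return EXIT_FAILURE;
        }

        if (pid == 0) {            /* child: reader */
            close(fd[1]);          /* close the unused write end */
            ssize_t n = read(fd[0], buf, sizeof(buf) - 1);
            if (n > 0) {
                buf[n] = '\0';
                printf("child read: %s\n", buf);
            }
            close(fd[0]);
        } else {                   /* parent: writer */
            close(fd[0]);          /* close the unused read end */
            const char *msg = "hello from parent";
            write(fd[1], msg, strlen(msg));
            close(fd[1]);
            wait(NULL);            /* reap the child */
        }
        return EXIT_SUCCESS;
    }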

Because pipes rely on the fork mechanism, they can only be used between a parent and its child, or between processes that share a common ancestor (that is, between related processes). To remove this restriction, Linux provides the FIFO as another way to connect processes. A FIFO is also called a named pipe.

A FIFO (first in, first out) is a special type of file that has a path in the file system. When one process opens the file for reading (r) and another opens it for writing (w), the kernel establishes a pipe between the two processes, so the FIFO is in fact managed by the kernel and does not touch the hard disk. It is called a FIFO because the pipe is essentially a first-in, first-out queue: what is put in first is read out first (like a conveyor belt, loaded at one end and unloaded at the other), which guarantees the order of information exchange. The FIFO merely borrows the file system (see Linux File Management Background) to give the pipe a name. The writing process writes to the FIFO file, and the reading process reads from it. When the FIFO file is deleted, the pipe connection disappears as well. The advantage of a FIFO is that a pipe can be identified by a file path, so processes with no family relationship can still establish a connection.
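
A minimal sketch of FIFO usage (the path /tmp/my_fifo and the writer/reader roles are assumptions for illustration). Run the program with the argument "write" in one terminal and with no argument in another:

    #include <errno.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/stat.h>
    #include <unistd.h>

    #define FIFO_PATH "/tmp/my_fifo"   /* illustrative path */

    int main(int argc, char *argv[])
    {
        char buf[64];

        /* Create the named pipe in the file system (ignore "already exists"). */
        if (mkfifo(FIFO_PATH, 0666) == -1 && errno != EEXIST) {
            perror("mkfifo");
            return EXIT_FAILURE;
        }

        if (argc > 1 && strcmp(argv[1], "write") == 0) {
            /* Writer: open blocks until a reader opens the other end. */
            int fd = open(FIFO_PATH, O_WRONLY);
            const char *msg = "hello through the FIFO";
            write(fd, msg, strlen(msg));
            close(fd);
        } else {
            /* Reader: open blocks until a writer opens the other end. */
            int fd = open(FIFO_PATH, O_RDONLY);
            ssize_t n = read(fd, buf, sizeof(buf) - 1);
            if (n > 0) {
                buf[n] = '\0';
                printf("read: %s\n", buf);
            }
            close(fd);
        }
        return EXIT_SUCCESS;
    }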

Traditional IPC

These traditional IPC mechanisms have a long history, and their implementations are not perfect (for example, a process must explicitly delete an IPC object once it is no longer needed). A common feature is that they do not use the file-manipulation API. For any of these IPC mechanisms, you can create multiple instances and use a key value to identify each one. A process can use the key value to obtain a particular instance (for example, when several message queues exist and we choose to use one of them). Key values can be passed between processes through some other IPC channel (for example a pipe, a FIFO, or a file), or they can be hard-coded into the programs when they are written.
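
A minimal sketch of identifying a System V IPC object by key (the path and project id passed to ftok are illustrative assumptions; msgget is shown here, but semget and shmget take a key in the same way):

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/ipc.h>
    #include <sys/msg.h>

    int main(void)
    {
        /* Derive a key from an existing path and a project id; any two
         * processes using the same arguments obtain the same key. */
        key_t key = ftok("/tmp", 'A');
        if (key == -1) {
            perror("ftok");
            return EXIT_FAILURE;
        }

        /* Get (or create) the message queue identified by this key. */
        int qid = msgget(key, IPC_CREAT | 0666);
        if (qid == -1) {
            perror("msgget");
            return EXIT_FAILURE;
        }

        printf("key = %#lx, queue id = %d\n", (unsigned long)key, qid);
        return EXIT_SUCCESS;
    }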

Once several processes share a key value, using these traditional IPC mechanisms is very similar to the way multiple threads share resources (see Linux Multi-threading and Synchronization):

  • The semaphore is similar to the mutex in that both handle synchronization. We said that a mutex is like a bathroom that can hold only one person; a semaphore, then, is like a bathroom with room for n people. In essence a semaphore is a counting lock (I think the usual Chinese translation of semaphore, 信号量, is easily confused with signal), which can be acquired by up to n processes. When more processes try to acquire the semaphore, they must wait until one of the earlier processes releases it. When n equals 1, a semaphore provides exactly the same functionality as a mutex. Many programming languages also use semaphores to handle multi-thread synchronization. A semaphore persists in the kernel until some process deletes it (see the combined sketch after this list).
  • Shared memory is similar to the global data and heap shared by multiple threads. A process can expose a portion of its memory space and let other processes read and write it. When using shared memory, we must pay attention to synchronization: we can synchronize with a semaphore, or place a mutex or another thread-synchronization variable inside the shared memory itself. Because shared memory lets multiple processes operate directly on the same memory region, it is the most efficient IPC method (see the sketch after this list).
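
The following minimal sketch (not from the original article) combines the two items above: a System V shared-memory segment holds a counter, and a System V semaphore with n = 1 (so it behaves like a mutex) protects it. The keys 0x1234 and 0x5678 are arbitrary illustrative values:

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/ipc.h>
    #include <sys/sem.h>
    #include <sys/shm.h>
    #include <sys/wait.h>
    #include <unistd.h>

    union semun { int val; };      /* required by the semctl SETVAL call */

    /* Acquire (delta = -1) or release (delta = +1) the single semaphore. */
    static void sem_op(int semid, int delta)
    {
        struct sembuf op = { .sem_num = 0, .sem_op = delta, .sem_flg = 0 };
        semop(semid, &op, 1);
    }

    int main(void)
    {
        /* Arbitrary keys; real programs often derive them with ftok. */
        int shmid = shmget(0x1234, sizeof(int), IPC_CREAT | 0666);
        int semid = semget(0x5678, 1, IPC_CREAT | 0666);
        if (shmid == -1 || semid == -1) {
            perror("shmget/semget");
            return EXIT_FAILURE;
        }

        union semun arg = { .val = 1 };
        semctl(semid, 0, SETVAL, arg);        /* n = 1: behaves like a mutex */

        int *counter = shmat(shmid, NULL, 0); /* map the shared segment */
        *counter = 0;

        if (fork() == 0) {                    /* child increments */
            for (int i = 0; i < 100000; i++) {
                sem_op(semid, -1);            /* P: acquire */
                (*counter)++;
                sem_op(semid, +1);            /* V: release */
            }
            shmdt(counter);
            return EXIT_SUCCESS;
        }

        for (int i = 0; i < 100000; i++) {    /* parent increments too */
            sem_op(semid, -1);
            (*counter)++;
            sem_op(semid, +1);
        }
        wait(NULL);

        printf("counter = %d\n", *counter);   /* expect 200000 */

        /* Traditional IPC objects persist in the kernel until deleted. */
        shmdt(counter);
        shmctl(shmid, IPC_RMID, NULL);
        semctl(semid, 0, IPC_RMID);
        return EXIT_SUCCESS;
    }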

A message queue is similar to a pipe: it is also a queue, and messages put in first are taken out first. The difference is that a message queue allows multiple processes to put messages in and multiple processes to take messages out. Each message can carry an integer identifier (message_type), so messages can be classified by identifier (in the extreme case, every message gets a different identifier). When a process takes a message from the queue, it can take them in FIFO order, or take only messages that match a particular identifier (if there are several such messages, they are again taken in FIFO order). Another difference from a pipe is that a message queue does not use the file API. Finally, a message queue does not disappear automatically; it persists in the kernel until some process deletes it.
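
A minimal sketch of a System V message queue (the key 0x4242 and the message text are illustrative assumptions). The parent sends messages with different types, and the child receives only messages of type 2:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/ipc.h>
    #include <sys/msg.h>
    #include <sys/wait.h>
    #include <unistd.h>

    struct msgbuf {
        long mtype;          /* the integer identifier (message_type) */
        char mtext[64];
    };

    int main(void)
    {
        int qid = msgget(0x4242, IPC_CREAT | 0666);   /* arbitrary key */
        if (qid == -1) {
            perror("msgget");
            return EXIT_FAILURE;
        }

        if (fork() == 0) {
            /* Child: receive only messages whose type is 2. */
            struct msgbuf in;
            if (msgrcv(qid, &in, sizeof(in.mtext), 2, 0) != -1)
                printf("received (type %ld): %s\n", in.mtype, in.mtext);
            return EXIT_SUCCESS;
        }

        /* Parent: send one message of type 1 and one of type 2. */
        struct msgbuf out = { .mtype = 1 };
        strcpy(out.mtext, "ignored by the child");
        msgsnd(qid, &out, strlen(out.mtext) + 1, 0);

        out.mtype = 2;
        strcpy(out.mtext, "hello via the message queue");
        msgsnd(qid, &out, strlen(out.mtext) + 1, 0);

        wait(NULL);
        msgctl(qid, IPC_RMID, NULL);    /* the queue persists until deleted */
        return EXIT_SUCCESS;
    }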

Multi-process collaboration helps us take advantage of the multi-core and networked era: multiple processes can effectively work around computing bottlenecks. Network communication is in fact also an inter-process communication problem, except that the processes are distributed across different computers; the connection is made through sockets. Because sockets are a large topic in themselves, we will not go into them here. One small note: sockets can also be used for communication between processes on the same computer.

Summary

Pipe, FIFO

Semaphore, message queue, shared memory; key values

You are welcome to read the other articles in the Linux Concepts and System series.

