The basic principles of communication between Linux processes, the communication methods, and the synchronization mechanisms

Source: Internet
Author: User
Tags: mutex, semaphore
Basic Principles * * *:
Normally, a program can only access its own data; processes do not communicate with one another, each process is an independent entity, and a process can finish its task without any collaboration. But as problems grow more complex, a single process cannot do all the work, and multiple processes must cooperate to solve the problem faster and better, just as collaboration between people can accomplish much bigger things.

However, for security reasons, the OS confines each process to its own data and does not let it reach its "hand" into the inside of other processes. So how is the problem of communication between processes solved?

Where there is a shared problem, there must be communication, and where there is communication, there must be a communication medium. For the sake of fairness, control of the communication medium should not belong to either side of the communication. It follows that, in a computer system, only the OS itself can take on the task of controlling the communication medium.

So the basic principle of inter-process communication is that the OS provides a communication medium for "dialogue" between processes. Just as communication in human society costs time and money, communication inside a computer also has a price that must be paid. Because problems differ in nature, the OS offers several ways to communicate, each with a different cost and a different efficiency. The pipes, message queues, and shared memory that we often hear about are the ways the OS provides for dialogue between processes.

Since this is communication, the two sides must speak in turn; otherwise it degenerates into a quarrel in which neither side can hear what the other is saying. Just as a judge in court controls who may speak and for how long, the OS must also provide such controls so that communication proceeds in an orderly and harmonious way. The mutexes, condition variables, record locks, file locks, and semaphores we often hear about belong to this category.

What is the medium of communication?

As mentioned above, pipes, message queues, and shared memory are the ways the OS provides; the "words" spoken by one process need to be kept somewhere, at least for a while, before they can be taken away (heard) by another process. Depending on how the dialogue is implemented, the medium that holds this intermediate information can logically be divided into three parts: the file system, the kernel, and memory. In fact, what is stored in the kernel is also stored in memory; it is just that this part of memory can only be accessed by the OS, and ordinary processes cannot read or write it directly.

Communication Methods * * *:

The most frequently used inter-process communication methods are pipes, message queues, and shared memory.

>>> Pipes
The earliest pipes were used to share data between related processes. The parent process first creates a pipe and then fork()s a child process, which automatically shares access to the pipe. Such a pipe has no name (it is identified only by descriptors inside the processes) and is therefore also called an anonymous pipe.
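
As a rough illustration of this parent/child pattern (a minimal sketch, not code from the original article; the message text is arbitrary), the following program creates an anonymous pipe, forks, and passes one message from the parent to the child:

/* Minimal sketch: parent writes a message, child reads it through an anonymous pipe. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void) {
    int fds[2];                      /* fds[0]: read end, fds[1]: write end */
    if (pipe(fds) == -1) { perror("pipe"); exit(1); }

    pid_t pid = fork();
    if (pid == -1) { perror("fork"); exit(1); }

    if (pid == 0) {                  /* child: reads from the pipe */
        char buf[128];
        close(fds[1]);               /* close the unused write end */
        ssize_t n = read(fds[0], buf, sizeof(buf) - 1);
        if (n > 0) { buf[n] = '\0'; printf("child read: %s\n", buf); }
        close(fds[0]);
        _exit(0);
    } else {                         /* parent: writes into the pipe */
        const char *msg = "hello through the pipe";
        close(fds[0]);               /* close the unused read end */
        write(fds[1], msg, strlen(msg));
        close(fds[1]);
        wait(NULL);
    }
    return 0;
}
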
If we run ls * | grep foo in the shell, the shell creates an anonymous pipe, redirects the stdout of ls to the write end of the pipe and the stdin of grep to the read end, and the output of ls automatically becomes the input of grep. All of this is transparent to ls and grep; they do not even know the pipe exists.
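
A hedged sketch of roughly what the shell does behind the scenes (simplified, with error checks omitted): each child redirects one of its standard streams into the pipe with dup2() before exec()ing the real program.

/* Rough sketch of what the shell does for "ls | grep foo" (simplified, no error checks). */
#include <unistd.h>
#include <sys/wait.h>

int main(void) {
    int fds[2];
    pipe(fds);

    if (fork() == 0) {                  /* first child runs ls */
        dup2(fds[1], STDOUT_FILENO);    /* its stdout now points into the pipe */
        close(fds[0]); close(fds[1]);
        execlp("ls", "ls", (char *)NULL);
        _exit(1);
    }
    if (fork() == 0) {                  /* second child runs grep */
        dup2(fds[0], STDIN_FILENO);     /* its stdin now comes from the pipe */
        close(fds[0]); close(fds[1]);
        execlp("grep", "grep", "foo", (char *)NULL);
        _exit(1);
    }
    close(fds[0]); close(fds[1]);       /* the parent must close both ends, too */
    wait(NULL); wait(NULL);
    return 0;
}
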
The FIFO (first in, first out), also known as the named pipe, appeared later. "Named" and "nameless" are relative to the OS: if the OS can manage the pipe directly through a name, it is a named pipe; otherwise it is anonymous. With the arrival of the FIFO, pipes were no longer limited to communication between related processes and could be used for communication between any processes.
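
A minimal writer sketch, assuming an illustrative path /tmp/demo_fifo; any unrelated process that opens the same path for reading (for example, cat /tmp/demo_fifo) receives the data:

/* Minimal FIFO writer sketch: any unrelated process can open the same path to read. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/stat.h>

int main(void) {
    const char *path = "/tmp/demo_fifo";              /* illustrative path */
    if (mkfifo(path, 0666) == -1) perror("mkfifo");   /* may already exist */

    int fd = open(path, O_WRONLY);        /* blocks until some reader opens the FIFO */
    if (fd == -1) { perror("open"); exit(1); }

    const char *msg = "hello through the FIFO\n";
    write(fd, msg, strlen(msg));
    close(fd);
    return 0;
}
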
The life cycle of a pipe follows the process. When all processes that use the pipe have exited, or all of them have explicitly called close(), the pipe is discarded (in fact, a process exit is equivalent to calling close(), because the OS closes all open resource descriptors when a process exits). If there is still unread data in the pipe, that data is discarded. And when writing data to a pipe, there must be a process that reads the data, otherwise the writing is meaningless.

>>> Message Queues
A message queue works much like a pipe: processes with sufficient write permission can place messages into the queue, and processes with sufficient read permission can read messages from it. Unlike a pipe, a process writing a message does not need another process to already be waiting for that message to arrive on the queue. This is because the life cycle of a message queue follows the kernel: as long as the kernel does not delete the queue, it continues to exist even after the processes exit. (However, some existing OS implementations track how many processes currently have the queue open by reference counting, and when all processes have exited and the reference count drops to 0, the OS deletes the message queue automatically.)
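
A minimal sketch using the System V message queue API (the key 0x1234 and the message text are arbitrary examples); the sender and receiver here happen to be the same process, but the receiving half could just as well run in a different process started later:

/* Minimal System V message queue sketch. */
#include <stdio.h>
#include <stdlib.h>
#include <sys/ipc.h>
#include <sys/msg.h>

struct msgbuf {
    long mtype;                        /* message type, must be > 0 */
    char mtext[64];
};

int main(void) {
    int qid = msgget(0x1234, IPC_CREAT | 0666);    /* create or open the queue */
    if (qid == -1) { perror("msgget"); exit(1); }

    struct msgbuf out = { 1, "hello via message queue" };
    if (msgsnd(qid, &out, sizeof(out.mtext), 0) == -1) perror("msgsnd");

    /* A receiver (possibly a different process started later) can read it: */
    struct msgbuf in;
    if (msgrcv(qid, &in, sizeof(in.mtext), 1, 0) != -1)
        printf("received: %s\n", in.mtext);

    /* The queue persists in the kernel until it is explicitly removed: */
    msgctl(qid, IPC_RMID, NULL);
    return 0;
}
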

>>> Shared Memory
There are two kinds of shared memory: anonymous shared memory and named shared memory. They differ in the same way as the pipes and FIFOs mentioned above: anonymous shared memory can only be used for communication between related processes, while named shared memory has no such restriction.
The life cycle of shared memory also follows the kernel. The biggest difference from the other approaches is that data exchange between processes through shared memory requires no system call and no data copy through the kernel; all data exchange happens directly in memory. For this reason, shared memory is the fastest of the three methods.
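
A minimal sketch of anonymous shared memory between related processes, using mmap() with MAP_SHARED | MAP_ANONYMOUS (the shared integer and the value 42 are illustrative); named shared memory would instead be created with shm_open() and then mmap()ed by unrelated processes:

/* Minimal anonymous shared memory sketch: parent and child share one int via mmap. */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/wait.h>

int main(void) {
    /* MAP_SHARED | MAP_ANONYMOUS: the mapping is shared with children created by fork() */
    int *counter = mmap(NULL, sizeof(int), PROT_READ | PROT_WRITE,
                        MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (counter == MAP_FAILED) { perror("mmap"); exit(1); }
    *counter = 0;

    if (fork() == 0) {            /* child writes directly into the shared memory */
        *counter = 42;
        _exit(0);
    }
    wait(NULL);
    printf("parent sees: %d\n", *counter);   /* prints 42; no copy through the kernel */
    munmap(counter, sizeof(int));
    return 0;
}
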

Synchronization Methods * * *:

In order to control the communication between multiple processes effectively and keep it orderly and harmonious, the OS must provide certain synchronization mechanisms that ensure the processes do not all talk over each other but cooperate effectively. For example, with shared memory, if two or more processes all write data into the shared memory, how do we guarantee that a process is not interrupted by other processes while it is writing, so that the data stays intact? And how do we guarantee that the data does not change while a process is reading it, so that what is read is complete and valid?


The commonly used synchronization mechanisms are: mutexes, condition variables, read-write locks, record locks (file locks), and semaphores.


>>> Mutual Exclusion Lock
As the name suggests, a lock is used to lock something up, and only the holder of the key has control over the locked object (thieves who pick locks are outside the scope of this discussion). "Mutual exclusion", taken literally, means excluding each other. So a mutex, taken literally, means that once a process holds the lock, it excludes all other processes from accessing the locked object; any other process that needs the lock can only wait until the process holding it releases the lock before it can proceed.
In implementations, a lock is not bound to any particular variable; it is an independent object. A process (thread) acquires the lock object when it needs it and releases it when it no longer does.
The main characteristic of a mutex is that it must be released by the process (thread) that locked it. If the process (thread) holding the lock does not release it, no other process (thread) will ever get the chance to acquire the mutex it needs. Mutexes are used primarily for synchronization between threads.
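
A minimal pthread mutex sketch (the counter and the number of iterations are illustrative): two threads increment a shared counter, and the lock guarantees that the increments do not interleave:

/* Minimal pthread mutex sketch: two threads increment a shared counter safely. */
#include <stdio.h>
#include <pthread.h>

static long counter = 0;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static void *worker(void *arg) {
    (void)arg;
    for (int i = 0; i < 100000; i++) {
        pthread_mutex_lock(&lock);     /* only one thread at a time past this point */
        counter++;
        pthread_mutex_unlock(&lock);   /* the thread that locked must also unlock */
    }
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, worker, NULL);
    pthread_create(&t2, NULL, worker, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("counter = %ld\n", counter);   /* always 200000 with the lock in place */
    return 0;
}
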
>>> Condition Variables
As mentioned above, with a mutex, if the thread holding the lock never releases it, the other threads will never obtain the lock and will never get the chance to run their subsequent logic. In a real-world scenario, suppose thread A needs to change the value of a shared variable x, and, to make sure x is not modified by another thread during the change, thread A must first acquire the lock on x. Now suppose A has acquired the lock but, because of the business logic, thread A may only carry out its subsequent logic when the value of x is greater than 0. Thread A then has to release the mutex and keep "busy-waiting", as shown in the following pseudocode:

lock x;
while (x <= 0) {
    unlock x;
    wait for some time;
    lock x;
}

unlock x;


This approach wastes system resources, because the thread must actively acquire the lock, check the condition on x, release the lock, then acquire the lock again, check again, release again, and so on until the condition is finally met. So we need a different kind of synchronization: when a thread finds that the locked variable does not satisfy the condition, it automatically releases the lock, puts itself into a waiting state, and hands the CPU over to other threads. The other threads then have the chance to modify the value of x, and afterwards they notify the threads that have been waiting because the condition was not satisfied. This notification-based model of synchronization saves a great deal of CPU time, reduces competition between threads, and improves the overall efficiency of the system. This kind of synchronization is the condition variable.

Frankly, the term "condition variable" is not easy to understand literally. We can think of a condition variable as an object: a bell that can be rung. When a thread has acquired the mutex but finds that the locked variable does not satisfy the condition it needs in order to continue, the thread releases the mutex and hangs itself on the "bell". After another thread modifies the variable, it rings the "bell" and tells the hanging threads: "What you have been waiting for has changed; wake up and check whether it now meets your needs." The hanging threads then wake up and check whether they can continue running.
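
A minimal pthread sketch of this "bell" pattern (the variable x and the value 5 are illustrative): the waiter sleeps inside pthread_cond_wait(), which releases the mutex atomically, and the other thread rings the bell after modifying x:

/* Minimal condition variable sketch: a waiter sleeps until x becomes positive. */
#include <stdio.h>
#include <pthread.h>

static int x = 0;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  bell = PTHREAD_COND_INITIALIZER;   /* the "bell" */

static void *waiter(void *arg) {
    (void)arg;
    pthread_mutex_lock(&lock);
    while (x <= 0)                         /* re-check the condition after every wakeup */
        pthread_cond_wait(&bell, &lock);   /* atomically releases the lock and sleeps */
    printf("waiter: x is now %d\n", x);
    pthread_mutex_unlock(&lock);
    return NULL;
}

int main(void) {
    pthread_t t;
    pthread_create(&t, NULL, waiter, NULL);

    pthread_mutex_lock(&lock);
    x = 5;                                 /* modify the shared variable ...        */
    pthread_cond_signal(&bell);            /* ... then "ring the bell" to notify it */
    pthread_mutex_unlock(&lock);

    pthread_join(t, NULL);
    return 0;
}
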


>>> Read-Write Locks
A mutex is an exclusive lock; the condition variable, working together with the mutex, can save system resources and improve cooperation between threads. The purpose of a mutex is exclusivity, and the purpose of a condition variable is waiting and notification. But the real world is complex, and the problems we have to solve come in many varieties. Functionally, mutexes and condition variables can solve virtually every problem, but their performance is not always satisfying. People's endless desire to do better drives the creation of more specialized, better-performing synchronization mechanisms, and the read-write lock is one of them.
Consider a file that several processes need to read but only one process needs to write. Reading the contents of a file does not change the file, so even if several processes read the same file at the same time there is no problem, and everyone can coexist harmoniously. But when the writing process needs to write data, in order to keep the data consistent, none of the reading processes may read during the write; otherwise a reader is likely to get data that is half old and half new, and the logic falls into chaos.
To prevent the data it is reading from being overwritten with new data, a reading process must lock the file. Now suppose two processes want to read at the same time. With the mutex and condition variable described above, while one process is reading the data, the other can only wait because it does not hold the lock. From a performance point of view, the time that process spends waiting is pure waste: it could read the file without affecting the first reader at all, but because it does not hold the lock it can do nothing but wait, possibly for a very long time.
So we need yet another kind of synchronization to meet this requirement, and that is the read-write lock.
The read-write lock effectively solves the problem of many processes reading in parallel. Every process that needs to read requests a read lock, so the readers do not interfere with one another. When a process needs to write data, it first requests a write lock; if it finds that a read (or write) lock is already held, the writer must wait until all read (and write) locks have been released. A reading process likewise requests a read lock before reading, and if the data is held by a write lock, the reader must wait until the write lock is released.

Naturally, multiple read locks can coexist, but write locks are completely mutually exclusive.
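
A minimal pthread read-write lock sketch (the thread count and the shared value are illustrative): the two readers may hold the lock at the same time, while the writer gets exclusive access:

/* Minimal pthread read-write lock sketch: many readers may hold the lock at once. */
#include <stdio.h>
#include <pthread.h>

static int shared_value = 0;
static pthread_rwlock_t rwlock = PTHREAD_RWLOCK_INITIALIZER;

static void *reader(void *arg) {
    (void)arg;
    pthread_rwlock_rdlock(&rwlock);        /* several readers can be here together */
    printf("reader sees %d\n", shared_value);
    pthread_rwlock_unlock(&rwlock);
    return NULL;
}

static void *writer(void *arg) {
    (void)arg;
    pthread_rwlock_wrlock(&rwlock);        /* exclusive: waits for all readers and writers */
    shared_value++;
    pthread_rwlock_unlock(&rwlock);
    return NULL;
}

int main(void) {
    pthread_t r1, r2, w;
    pthread_create(&r1, NULL, reader, NULL);
    pthread_create(&r2, NULL, reader, NULL);
    pthread_create(&w,  NULL, writer, NULL);
    pthread_join(r1, NULL);
    pthread_join(r2, NULL);
    pthread_join(w,  NULL);
    return 0;
}
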


>>> Record Locks (File Locks)
To increase parallelism even further, we can subdivide the granularity of the locked object on top of the read-write lock. For example, in one file a reading process may only need to read the first 1 KB while a writing process needs to write the last 1 KB. We can place a read lock on the first 1 KB and a write lock on the last 1 KB, so that the two processes can work concurrently. The "record" in "record lock" really means a region of content: a read or write lock is applied to a part of the file rather than to the whole file.

A file lock can be regarded as a special case of a record lock: when a record lock covers the entire contents of a file, it can be called a file lock.
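
A minimal sketch using fcntl() record locks (the file name data.bin is illustrative): it write-locks only the first 1 KB of the file, leaving other ranges free for other processes; setting l_len to 0 would lock the whole file, which is the file-lock special case mentioned above:

/* Minimal record lock sketch: lock only the first 1 KB of a file for writing. */
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>

int main(void) {
    int fd = open("data.bin", O_RDWR | O_CREAT, 0666);   /* illustrative file name */
    if (fd == -1) { perror("open"); exit(1); }

    struct flock fl = {0};
    fl.l_type   = F_WRLCK;        /* write (exclusive) lock; F_RDLCK for a read lock */
    fl.l_whence = SEEK_SET;
    fl.l_start  = 0;              /* the lock starts at byte 0 ...          */
    fl.l_len    = 1024;           /* ... and covers the first 1 KB only     */

    if (fcntl(fd, F_SETLKW, &fl) == -1) { perror("fcntl"); exit(1); }   /* wait for the lock */

    /* ... write to the first 1 KB here; other processes may still lock other ranges ... */

    fl.l_type = F_UNLCK;          /* release the record lock */
    fcntl(fd, F_SETLK, &fl);
    close(fd);
    return 0;
}
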


>>> Semaphores
A semaphore can be seen as an upgraded version of the condition variable. With a condition variable, after the "bell" rings each pending process still has to acquire the mutex and check whether the condition it needs is satisfied; a semaphore combines these two steps into one.
The POSIX.1 Rationale explains why semaphores are provided even though mutexes and condition variables already exist: "The main purpose of this standard is to provide a means of synchronization between processes; these processes may or may not share memory." Mutexes and condition variables are described there as synchronization mechanisms between threads, which always share (some) memory. Both are synchronization methods that have been in wide use for many years, and each group of primitives is particularly suited to particular problems. Although semaphores are intended for synchronization between processes while mutexes and condition variables are intended for synchronization between threads, semaphores can also be used between threads, and mutexes and condition variables can also be used between processes. The decision should be based on the actual situation.

The most useful scenario for a semaphore is to indicate the quantity of an available resource. For example, for an array that holds 10 elements, we can create a semaphore with an initial value of 0. Each time a process needs to read an element from the array (assuming only one element can be read at a time), it waits on the semaphore (the semaphore's value is decreased by 1); each time a process writes an element, it posts the semaphore (the semaphore's value is increased by 1). In this way the semaphore's value reflects the amount of available resource. If we restrict the semaphore's value to only 0 and 1, it has much the same meaning as a mutex.
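
A minimal POSIX semaphore sketch of that idea, using an unnamed semaphore between two threads (the 10-element buffer mirrors the example above; a named semaphore created with sem_open() would be used between unrelated processes): the semaphore counts how many elements are available to read.

/* Minimal POSIX semaphore sketch: the semaphore counts items available to read. */
#include <stdio.h>
#include <pthread.h>
#include <semaphore.h>

static int buffer[10];
static int write_pos = 0, read_pos = 0;
static sem_t items;                        /* number of readable elements, starts at 0 */

static void *producer(void *arg) {
    (void)arg;
    for (int i = 0; i < 10; i++) {
        buffer[write_pos++] = i;           /* write one element ...        */
        sem_post(&items);                  /* ... then raise the count by 1 */
    }
    return NULL;
}

static void *consumer(void *arg) {
    (void)arg;
    for (int i = 0; i < 10; i++) {
        sem_wait(&items);                  /* blocks until at least one element exists */
        printf("read %d\n", buffer[read_pos++]);
    }
    return NULL;
}

int main(void) {
    sem_init(&items, 0, 0);                /* 0 = shared between threads, initial value 0 */
    pthread_t p, c;
    pthread_create(&p, NULL, producer, NULL);
    pthread_create(&c, NULL, consumer, NULL);
    pthread_join(p, NULL);
    pthread_join(c, NULL);
    sem_destroy(&items);
    return 0;
}

Because there is a single producer and a single consumer, and each post happens only after the element has been written, the semaphore alone is enough here; with several readers or writers a mutex would also be needed to protect the buffer indices.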

