Various Linux IPC mechanisms (2011-07-08 16:58:35)
Original address: Various Linux IPC mechanisms (RPM)
Jianpengliu
The original articles were published as a series on the IBM developerWorks website by Zheng Yanxin. They explain several Linux IPC mechanisms and work through examples of their use. I found them very good and am saving the links here; after reading them, the basics of Linux IPC should be no problem.
I) Interprocess communication in the Linux environment (1): pipes and named pipes
http://www.ibm.com/developerworks/cn/linux/l-ipc/part1/
II) Interprocess communication in the Linux environment (2): signals
Part 1: http://www.ibm.com/developerworks/cn/linux/l-ipc/part2/index1.html
Part 2: http://www.ibm.com/developerworks/cn/linux/l-ipc/part2/index2.html
III) Interprocess communication in the Linux environment (3): message queues
http://www.ibm.com/developerworks/cn/linux/l-ipc/part3/
IV) Interprocess communication in the Linux environment (4): semaphores
http://www.ibm.com/developerworks/cn/linux/l-ipc/part4/
V) Interprocess communication in the Linux environment (5): shared memory
Part 1: http://www.ibm.com/developerworks/cn/linux/l-ipc/part5/index1.html
Part 2: http://www.ibm.com/developerworks/cn/linux/l-ipc/part5/index2.html
VI) Interprocess communication in the Linux environment (6): sockets
http://www.ibm.com/developerworks/cn/linux/l-ipc/
==============================================================================================
An introduction to the main means of interprocess communication under Linux:
Pipes and named pipes: A pipe can be used for communication between related processes. A named pipe overcomes the pipe's lack of a name, so in addition to everything a pipe can do, it also allows communication between unrelated processes.
Signal: Signals are a more complex means of communication, used to notify the receiving process that some event has occurred. Besides sending signals to other processes, a process can also send signals to itself. In addition to the early UNIX signal function signal, Linux supports the POSIX.1 signal function sigaction. (In fact, that function is based on BSD: to provide a reliable signal mechanism while keeping a uniform external interface, BSD reimplemented the signal function on top of sigaction.)
Message queue: A message queue is a linked list of messages; there are POSIX message queues and System V message queues. A process with sufficient permissions can add messages to the queue, and a process with read permission can read messages from it. Message queues overcome the shortcomings that signals carry little information and that pipes carry only unformatted byte streams through a buffer of limited size.
Shared memory: Allows multiple processes to access the same region of memory and is the fastest form of IPC available. It was designed to address the inefficiency of the other communication mechanisms. It is often used together with other mechanisms, such as semaphores, to achieve synchronization and mutual exclusion between processes.
Semaphore: Used mainly as a means of synchronization between processes and between different threads of the same process.
Socket: A more general interprocess communication mechanism that can also be used between processes on different machines. It was originally developed by the BSD branch of UNIX but has since been ported to other Unix-like systems: both Linux and the System V variants support sockets.
The two ends of a pipe are described by the descriptors fd[0] and fd[1]. Note that each end has a fixed role: one end can only be used for reading, is represented by the descriptor fd[0], and is called the read end of the pipe; the other end can only be used for writing, is represented by fd[1], and is called the write end. Attempting to read from the write end or to write to the read end causes an error. The I/O functions for ordinary files, such as close, read, and write, can be used on pipes.
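The fixed roles of the two ends can be sketched in Python, whose os.pipe() returns the read end and the write end in that order (the counterparts of fd[0] and fd[1]). This is a minimal illustration assuming Python 3 on Linux; the function names pipe_roundtrip and write_to_read_end_fails are hypothetical and do not come from the original articles.

```python
import os


def pipe_roundtrip(message: bytes) -> bytes:
    """Send `message` from a child process to its parent through a pipe."""
    read_fd, write_fd = os.pipe()      # read_fd ~ fd[0], write_fd ~ fd[1]
    pid = os.fork()
    if pid == 0:                       # child: acts as the writer
        try:
            os.close(read_fd)          # the child never reads
            os.write(write_fd, message)
            os.close(write_fd)
        finally:
            os._exit(0)
    os.close(write_fd)                 # parent: close the unused write end so EOF is seen
    chunks = []
    while True:
        chunk = os.read(read_fd, 4096)
        if not chunk:                  # empty read means every writer has closed
            break
        chunks.append(chunk)
    os.close(read_fd)
    os.waitpid(pid, 0)
    return b"".join(chunks)


def write_to_read_end_fails() -> bool:
    """Return True if writing to the read end of a pipe raises an error."""
    read_fd, write_fd = os.pipe()
    try:
        os.write(read_fd, b"x")        # wrong direction on purpose
    except OSError:
        return True
    finally:
        os.close(read_fd)
        os.close(write_fd)
    return False
```

The parent closes its copy of the write end before reading; otherwise the read loop would never see end-of-file, because a writer would still be open.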
The main limitations of pipes follow from their characteristics:
Only a one-way data stream is supported;
They can only be used between related processes;
They have no name;
The pipe buffer is finite (a pipe exists in memory, and one page is allocated for its buffer when the pipe is created);
A pipe carries an unformatted byte stream, so the reader and the writer must agree on the data format in advance, for example how many bytes make up a message (or command, or record).
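One way the reader and writer might agree on a format is to prefix every message with its length. The sketch below assumes a 4-byte big-endian length header; that convention and the names send_msg, read_exact, and recv_msg are illustrative assumptions, not something the original articles prescribe.

```python
import os
import struct

HEADER = struct.Struct("!I")  # assumed convention: 4-byte big-endian payload length


def send_msg(fd: int, payload: bytes) -> None:
    """Write one length-prefixed message to the descriptor."""
    view = memoryview(HEADER.pack(len(payload)) + payload)
    while view:                        # os.write may write fewer bytes than asked
        written = os.write(fd, view)
        view = view[written:]


def read_exact(fd: int, count: int) -> bytes:
    """Read exactly `count` bytes, or raise EOFError if the stream ends early."""
    buf = bytearray()
    while len(buf) < count:
        chunk = os.read(fd, count - len(buf))
        if not chunk:
            raise EOFError("stream closed in the middle of a message")
        buf += chunk
    return bytes(buf)


def recv_msg(fd: int) -> bytes:
    """Read one length-prefixed message from the descriptor."""
    (length,) = HEADER.unpack(read_exact(fd, HEADER.size))
    return read_exact(fd, length)
```

With this framing, two messages written back to back come out as two separate records on the reading side, even though the pipe itself only sees a stream of bytes.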
A significant limitation of pipes is that they have no name, so they can only be used between related processes. Named pipes (also called FIFOs) overcome this. A FIFO differs from a pipe in that it has a path name associated with it and exists in the file system as a FIFO file. Thus even processes unrelated to the process that created the FIFO can communicate through it, as long as they can access the path, so unrelated processes can also exchange data. Note that a FIFO strictly follows first-in, first-out order: reads always return data from the beginning, and writes append data to the end. FIFOs do not support file positioning operations such as lseek().
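A FIFO can be tried out with os.mkfifo(). In the sketch below the two sides find each other only through the path name; fork() is used merely to keep the example in one file, and two unrelated programs could open the same path instead. It assumes Python 3 on Linux, and the function name fifo_roundtrip and the file name demo_fifo are hypothetical.

```python
import os
import tempfile


def fifo_roundtrip(message: bytes):
    """Pass `message` through a FIFO; return (data_read, lseek_failed)."""
    tmpdir = tempfile.mkdtemp()
    path = os.path.join(tmpdir, "demo_fifo")   # hypothetical FIFO path
    os.mkfifo(path)                            # creates the FIFO file in the file system
    pid = os.fork()
    if pid == 0:                               # writer side
        try:
            with open(path, "wb") as writer:   # blocks until a reader opens the FIFO
                writer.write(message)
        finally:
            os._exit(0)
    try:
        with open(path, "rb") as reader:       # blocks until a writer opens the FIFO
            try:
                os.lseek(reader.fileno(), 0, os.SEEK_SET)
                lseek_failed = False
            except OSError:                    # positioning is not supported on a FIFO
                lseek_failed = True
            data = reader.read()               # reads until the writer closes its end
    finally:
        os.waitpid(pid, 0)
        os.unlink(path)
        os.rmdir(tmpdir)
    return data, lseek_failed
```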
Pipes are commonly used in two ways: (1) in the shell (for input/output redirection), where the pipe is created transparently to the user; (2) for communication between related processes, where the user creates the pipe and performs the reads and writes.
A FIFO can be seen as a generalization of the pipe: it overcomes the pipe's lack of a name, so that unrelated processes can also use the first-in, first-out mechanism to communicate.
The data in pipes and FIFOs is a byte stream, so applications must agree on a specific transfer "protocol" in advance in order to exchange messages with a particular meaning.
To use pipes and FIFOs flexibly, it is essential to understand their read and write rules.
I. Signals and signal sources
The nature of signals
A signal is a software-level simulation of the interrupt mechanism; in principle, a process receiving a signal is much like a processor receiving an interrupt request. Signals are asynchronous: a process does not have to wait for a signal by performing any operation, and in fact it does not know when a signal will arrive.
Signals are the only asynchronous mechanism among the interprocess communication mechanisms. They can be regarded as asynchronous notifications that tell the receiving process what has happened. After the POSIX real-time extensions, the signal mechanism became more powerful: besides basic notification, a signal can also carry additional information.
Signal sources
Signal events come from two sources: hardware sources (for example, a key press or a hardware fault) and software sources. The most common system functions for sending signals are kill, raise, alarm, setitimer, and sigqueue; software sources also include events such as illegal operations.
III. How a process responds to signals
A process can respond to a signal in three ways: (1) ignore the signal, that is, do nothing with it; two signals cannot be ignored, SIGKILL and SIGSTOP; (2) catch the signal: define a signal handler that is executed when the signal occurs; (3) perform the default action: Linux defines a default action for each signal; for details see [2] and other references. Note that the default response of a process to a real-time signal is to terminate.
Which of the three ways a process uses to respond to a signal depends on the parameters passed to the corresponding API function.
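The three responses can be tried out with Python's signal module, where the second argument of signal.signal() selects the response: a handler function, signal.SIG_IGN, or signal.SIG_DFL. This is a rough sketch assuming Python 3.8 or later on Linux, called from the main thread; the function name try_three_dispositions and the choice of SIGUSR1 are assumptions made for illustration.

```python
import os
import signal
import time


def try_three_dispositions() -> dict:
    """Exercise catch / ignore / default for SIGUSR1 and report what happened."""
    results = {}
    caught = []
    previous = signal.signal(signal.SIGUSR1,
                             lambda signum, frame: caught.append(signum))
    try:
        # (2) catch: the handler records the signal number
        signal.raise_signal(signal.SIGUSR1)
        deadline = time.monotonic() + 1.0
        while not caught and time.monotonic() < deadline:
            time.sleep(0.01)           # give the interpreter a chance to run the handler
        results["caught"] = caught == [signal.SIGUSR1]

        # (1) ignore: the signal is discarded and the process keeps running
        signal.signal(signal.SIGUSR1, signal.SIG_IGN)
        signal.raise_signal(signal.SIGUSR1)
        results["survived_ignore"] = True

        # (3) default action: for SIGUSR1 this terminates the process,
        # so it is tried in a child and observed from the parent
        pid = os.fork()
        if pid == 0:
            signal.signal(signal.SIGUSR1, signal.SIG_DFL)
            os.kill(os.getpid(), signal.SIGUSR1)
            os._exit(0)                # only reached if the signal did not terminate us
        _, status = os.waitpid(pid, 0)
        results["default_terminated"] = (
            os.WIFSIGNALED(status) and os.WTERMSIG(status) == signal.SIGUSR1
        )
    finally:
        signal.signal(signal.SIGUSR1, previous)

    # SIGKILL cannot be ignored: the request itself is rejected
    try:
        signal.signal(signal.SIGKILL, signal.SIG_IGN)
        results["sigkill_can_be_ignored"] = True
    except OSError:
        results["sigkill_can_be_ignored"] = False
    return results
```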
I. The signal life cycle
From signal generation to completion of the signal handler
A complete signal life cycle (from the moment the signal is sent until the corresponding handler has finished) can be divided into three stages, delimited by four events: the signal is generated; the signal is registered in the process; the signal is deregistered from the process; the signal handler finishes executing. The interval between two adjacent events constitutes one stage of the life cycle.
Message queues can overcome some of the drawbacks of the early UNIX communication mechanisms. Signals, one of those early mechanisms, can carry only a limited amount of information. The POSIX 1003.1b real-time signal extensions improved this considerably, but signals are still more of an "immediate" form of communication: the receiving process must react to the signal within a certain time, so a signal is meaningful at most for the lifetime of the receiving process, and the information it carries is close to process-persistent (see Appendix 1). Pipes and named pipes are typical process-persistent IPC; the fact that they carry only an unformatted byte stream is inconvenient for application development, and their buffer size is also limited.
A message queue is a linked list of messages. A message can be thought of as a record with a specific format and a specific priority. A process with write permission on a message queue can add new messages to it according to certain rules; a process with read permission can read messages from it. Message queues are kernel-persistent (see Appendix 1).
III. Limits on message queues
The capacity of each message queue (the number of bytes it can hold) is limited, and the value varies from system to system.
Summary:
Message queues are more flexible than pipes and named pipes. First, they provide formatted byte streams, which reduces the developer's workload; second, messages have types, which can be used as priorities in real applications. Pipes and named pipes offer neither. Like a named pipe, a message queue can be shared by several processes whether or not they are related; but a message queue is kernel-persistent, so compared with a named pipe (which is process-persistent) it lives longer and is more widely applicable.
Semaphores differ from the other interprocess communication mechanisms: they mainly provide access control for resources shared between processes. A semaphore is like a flag in memory that a process can consult to decide whether it may access a shared resource, and that the process can also modify. Besides access control, semaphores can be used for process synchronization.
I. Overview of semaphores
There are two types of semaphores:
Binary semaphore: the simplest form; its value can only be 0 or 1, similar to a mutex.
Note: A binary semaphore can do the job of a mutex, but the two differ in emphasis. A semaphore emphasizes the shared resource: as long as the resource is available, other processes can also modify the semaphore's value. A mutex emphasizes the process: the process that holds the resource must itself unlock it once it has finished with the resource.
Counting semaphore: its value can be any non-negative integer (limited, of course, by the kernel).
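The difference in emphasis between a binary semaphore and a mutex, and the behaviour of a counting semaphore, can be illustrated at thread level with Python's threading module. This is only an analogy: it does not use the System V semaphore calls discussed in the articles, threading.RLock stands in for an owner-checked mutex, and the function names below are hypothetical.

```python
import threading


def released_by_other_thread(primitive) -> bool:
    """Acquire in this thread, then report whether another thread may release."""
    primitive.acquire()
    outcome = []

    def releaser():
        try:
            primitive.release()
            outcome.append(True)       # release by a non-owner was accepted
        except RuntimeError:
            outcome.append(False)      # only the owner may release

    worker = threading.Thread(target=releaser)
    worker.start()
    worker.join()
    if not outcome[0]:
        primitive.release()            # the owner cleans up after a refused release
    return outcome[0]


def counting_semaphore_admits(limit: int, attempts: int) -> int:
    """Return how many of `attempts` non-blocking acquires succeed."""
    sem = threading.Semaphore(limit)
    return sum(1 for _ in range(attempts) if sem.acquire(blocking=False))
```

A semaphore initialised to 1 can be released by any thread, whereas the owner-checked lock refuses a release from a thread that does not hold it; a semaphore initialised to 3 admits three acquirers before it would block.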
V. Limits on semaphores
1. The maximum number of semaphores that one semop call can operate on is SEMOPM; if the nsops parameter of semop exceeds this number, an E2BIG error is returned. The value of SEMOPM is system-specific; in Red Hat 8.0 it is 32.
2. The maximum semaphore value is SEMVMX; setting a semaphore to a value above this limit returns an ERANGE error. In Red Hat 8.0 the value is 32767.
3. The maximum number of semaphore sets system-wide is SEMMNI, and the maximum number of semaphores system-wide is SEMMNS. Exceeding either limit returns an ENOSPC error. The value in Red Hat 8.0 is 32000.
4. The maximum number of semaphores in each semaphore set is SEMMSL; in Red Hat 8.0 it is 250. Keep SEMOPM and SEMVMX in mind when calling semop, and SEMMNI and SEMMNS when calling semget. SEMVMX also matters for semctl calls.
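On a current Linux system, four of these limits can be read from /proc/sys/kernel/sem, which holds SEMMSL, SEMMNS, SEMOPM, and SEMMNI in that order (SEMVMX is not listed there). The small parser below is a sketch assuming that file layout; the function names parse_sem_limits and read_sem_limits are hypothetical.

```python
from typing import Dict

# Field order of /proc/sys/kernel/sem on Linux
SEM_FIELDS = ("SEMMSL", "SEMMNS", "SEMOPM", "SEMMNI")


def parse_sem_limits(text: str) -> Dict[str, int]:
    """Turn the whitespace-separated contents of the sem file into a dict."""
    values = [int(v) for v in text.split()]
    if len(values) != len(SEM_FIELDS):
        raise ValueError("expected %d fields, got %d" % (len(SEM_FIELDS), len(values)))
    return dict(zip(SEM_FIELDS, values))


def read_sem_limits(path: str = "/proc/sys/kernel/sem") -> Dict[str, int]:
    """Read and parse the semaphore limits of the running kernel."""
    with open(path) as sem_file:
        return parse_sem_limits(sem_file.read())
```

Calling read_sem_limits() shows the values on the machine at hand, which may differ from the Red Hat 8.0 figures quoted above.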
Shared memory is arguably the most useful means of interprocess communication and the fastest form of IPC. When two processes A and B share memory, the same piece of physical memory is mapped into the address space of each. Process A immediately sees process B's updates to the shared data, and vice versa. Because several processes share the same memory, some synchronization mechanism is needed, such as a mutex or a semaphore.
One obvious benefit of shared memory is efficiency: processes read and write the memory directly, with no copying of data. With pipes and message queues, data must be copied four times between kernel and user space, whereas shared memory needs only two copies [1]: one from the input file into the shared memory area, and one from the shared memory area to the output file. In practice, processes that share memory do not unmap the area after every small read or write and set it up again for the next exchange. Instead, the shared area is kept until communication is complete, so the data stays in shared memory and is not written back to the file; the contents are typically written back to the file only when the mapping is removed. Shared memory is therefore a very efficient way to communicate.
The Linux 2.2.x kernel supports several kinds of shared memory, such as the mmap() system call, POSIX shared memory, and System V shared memory. Linux distributions such as Red Hat 8.0 support the mmap() system call and System V shared memory but do not yet implement POSIX shared memory. This article mainly covers the principles and use of the mmap() system call and the System V shared memory API.
II. mmap() and related system calls
The mmap() system call lets processes share memory by mapping the same ordinary file. Once the file is mapped into a process's address space, the process can access it like ordinary memory, without calling read(), write(), and so on.
Note: mmap() was not designed solely for shared memory. It provides a different way of accessing ordinary files, letting a process operate on a file as if reading and writing memory. POSIX and System V shared memory IPC exist purely for sharing; shared memory is nevertheless one of the main uses of mmap().
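Sharing through a mapped ordinary file can be sketched with Python's mmap module. In the example below the child opens the file by name and maps it on its own, and the parent sees the child's update through its own mapping without calling read(). It assumes Python 3 on Linux; the function name share_via_mapped_file is hypothetical, and waitpid() stands in for a real synchronization mechanism such as a semaphore.

```python
import mmap
import os
import tempfile


def share_via_mapped_file(message: bytes) -> bytes:
    """Let a child write `message` into a mapped file and read it in the parent."""
    if len(message) > mmap.PAGESIZE:
        raise ValueError("message does not fit in the one-page mapping")
    with tempfile.NamedTemporaryFile() as backing:
        backing.truncate(mmap.PAGESIZE)        # the file must be as large as the mapping
        backing.flush()
        pid = os.fork()
        if pid == 0:                           # child: maps the file by its path name
            status = 1
            try:
                with open(backing.name, "r+b") as same_file:
                    with mmap.mmap(same_file.fileno(), mmap.PAGESIZE,
                                   flags=mmap.MAP_SHARED) as child_map:
                        child_map[:len(message)] = message   # plain memory write
                status = 0
            finally:
                os._exit(status)
        with mmap.mmap(backing.fileno(), mmap.PAGESIZE,
                       flags=mmap.MAP_SHARED) as parent_map:
            os.waitpid(pid, 0)                 # crude synchronization: wait for the writer
            return bytes(parent_map[:len(message)])
```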
Conclusion:
Shared memory allows two or more processes to share a given region of storage. Because the data does not need to be copied back and forth, it is the fastest interprocess communication mechanism. Shared memory can be implemented by mapping an ordinary file with mmap() (or, in special cases, by an anonymous mapping) or through the System V shared memory mechanism. The application interface and the principle are simple; the internal mechanism is complex. For safe communication, shared memory is usually combined with a synchronization mechanism such as semaphores.
Shared memory touches on both memory management and the file system, so its internals are hard to understand; the key is to keep a firm grasp of the important data structures the kernel uses. System V shared memory is organized as files in the special file system shm. A shared memory identifier is created or obtained with shmget; once the identifier is available, shmat maps the memory area into the virtual address space of the process.
A socket can be regarded as an endpoint of interprocess communication. Each socket has a unique name, so that other processes can find it, connect to it, and communicate with it. The communication domain describes the protocol used by the socket; different domains have different protocols and socket address structures, so the domain must be specified when a socket is created. The most common domains are the UNIX domain (which uses the socket mechanism for interprocess communication within a single machine) and the Internet domain.
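As an illustration of naming a socket and choosing its domain at creation time, the sketch below binds a UNIX domain stream socket to a path and lets another process connect to that name. It assumes Python 3 on Linux; the function name unix_socket_transfer and the file name demo.sock are hypothetical.

```python
import os
import socket
import tempfile


def unix_socket_transfer(message: bytes) -> bytes:
    """Send `message` from a client process to a server over a UNIX domain socket."""
    tmpdir = tempfile.mkdtemp()
    path = os.path.join(tmpdir, "demo.sock")   # hypothetical socket name
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)  # domain chosen here
    server.bind(path)                          # gives the endpoint its name
    server.listen(1)
    pid = os.fork()
    if pid == 0:                               # client: finds the server by name
        try:
            server.close()
            with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
                client.connect(path)
                client.sendall(message)
        finally:
            os._exit(0)
    try:
        conn, _ = server.accept()
        chunks = []
        with conn:
            while True:
                chunk = conn.recv(4096)
                if not chunk:                  # the client closed the connection
                    break
                chunks.append(chunk)
        return b"".join(chunks)
    finally:
        server.close()
        os.waitpid(pid, 0)
        os.unlink(path)
        os.rmdir(tmpdir)
```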
5. Other important concepts in network programming
The following lists other important concepts in network programming and briefly states what each can do; readers who need one of these features while programming can look up the details.
(1) I/O multiplexing
I/O multiplexing lets a process be told promptly when some I/O condition is satisfied. It is typically used when a process needs to handle several descriptors. One advantage is that the process blocks not in an actual I/O call but in the select() call, and select() can watch several descriptors at once: if none of the descriptors is ready for I/O, select() blocks; if one or more are ready, select() returns without blocking, and the process performs the appropriate I/O on the descriptors that are ready.
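The behaviour of select() described above can be sketched with Python's select module, here watching plain pipe descriptors instead of sockets to keep the example short. The function name read_ready is hypothetical, and the timeout argument is added only so the example cannot block forever.

```python
import os
import select


def read_ready(fds, timeout):
    """Wait in select() and read from every descriptor it reports as readable."""
    readable, _, _ = select.select(fds, [], [], timeout)   # blocks here, not in read()
    return {fd: os.read(fd, 4096) for fd in readable}
```

With two pipes of which only one has data written to it, read_ready returns an entry for that pipe alone; with no data anywhere and a timeout of 0 it returns an empty dict.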
(2) UNIX communication domain
The discussion so far has mainly concerned the PF_INET communication domain, which provides interprocess communication across the Internet. A socket in the UNIX communication domain (created by specifying PF_LOCAL as the domain in the socket call) provides interprocess communication within a single machine. UNIX domain sockets have several advantages: they are typically about twice as fast as TCP sockets, and they can be used to pass descriptors between processes. Any object referred to by a descriptor, such as a file, pipe, named pipe, or socket, can be passed over a UNIX domain socket once we have obtained a descriptor for it in some way. The descriptor value seen by the receiving process is not necessarily the same as the value sent (descriptors are process-specific), but both point to the same entry in the kernel file table.
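Descriptor passing can be tried with socket.send_fds() and socket.recv_fds(), which wrap the underlying ancillary-data mechanism. The sketch assumes Python 3.9 or later on Linux and, for brevity, sends the descriptor across a socketpair inside a single process rather than between two processes; the function name pass_descriptor is hypothetical.

```python
import os
import socket


def pass_descriptor(fd: int) -> int:
    """Send `fd` over a UNIX domain socket pair and return the descriptor received."""
    sender, receiver = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    with sender, receiver:
        # At least one byte of ordinary data must accompany the descriptor.
        socket.send_fds(sender, [b"fd"], [fd])
        _data, received_fds, _flags, _addr = socket.recv_fds(receiver, 16, 1)
    return received_fds[0]
```

The number that comes back differs from the one that was sent, yet both refer to the same open object: data written to a pipe can be read through the received descriptor.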
(3) Raw sockets
Raw sockets provide functionality that ordinary sockets do not:
Raw sockets can read and write packets of control protocols such as ICMPv4, which makes some special functions possible.
Raw sockets can read and write special IPv4 packets. The kernel normally handles only packets with a few specific protocol fields; packets with other protocol fields must be read and written through raw sockets.
Raw sockets can also be used to construct your own IPv4 header, which is an interesting exercise.
Root permission is required to create a raw socket.
(4) Access to the data link layer
Access to the data link layer lets a user listen to all packets on the local cable without any special hardware. Under Linux, reading data link layer packets requires creating a socket of type SOCK_PACKET, which requires root permission.
(5) Out-of-band data
If some important information must be sent through a socket immediately (without queuing), consult the literature on out-of-band data.
(6) Multicast
The Linux kernel supports multicast, but most Linux systems ship with multicast support turned off by default. To use multicast you may therefore need to reconfigure and recompile the kernel. See [4] and [2] for details.
Conclusion: Linux socket programming is a very rich subject that draws on a good deal of networking background; interested readers can find a systematic and comprehensive treatment in [2].
This concludes the series on interprocess communication in the Linux environment. Strictly speaking, interprocess communication in the usual sense refers to message queues, semaphores, and shared memory, whether POSIX or System V. This series also covered pipes, named pipes, signals, and sockets, which are interprocess communication mechanisms in a broader sense.