Differences between threads and processes and process communication methods

Source: Internet
Author: User
Tags: message queue, POSIX, semaphore

A process is the smallest unit of resource allocation; a thread is the smallest unit of CPU scheduling.
Comparison of multi-process vs. multi-thread programming:

Data sharing and synchronization: processes keep their data separate, so sharing requires IPC but synchronization is simple; threads share the process's data, so sharing is simple but, for the same reason, synchronization is complex. Each has its advantage.
Memory and CPU: processes are memory-intensive, switch at higher cost, and use the CPU less efficiently; threads consume little memory, switch cheaply, and use the CPU efficiently. Threads have the advantage.
Creation, destruction, switching: complex and slow for processes; simple and fast for threads. Threads have the advantage.
Programming and debugging: simple with processes; complex with threads. Processes have the advantage.
Reliability: processes do not affect one another; one crashing thread brings down the entire process. Processes have the advantage.
Distribution: processes adapt to both multi-core and multi-machine deployment, and if one machine is not enough it is easier to scale out to several; threads adapt only to multi-core. Processes have the advantage.
Several inter-process communication methods:

# Pipe: A pipe is a half-duplex communication mechanism: data can flow in only one direction, and it can only be used between processes with an affinity, which usually means a parent-child relationship. A pipe is created with the pipe system call:

#include <unistd.h>
int pipe(int fd[2]);

The parameter is an array of two ints. On success the function returns 0 and fills the array it points to with a pair of open file descriptors; on failure it returns -1 and sets errno. The two descriptors fd[0] and fd[1] form the two ends of the pipe: data written to fd[1] can be read from fd[0]. fd[0] can only be used to read from the pipe and fd[1] only to write to it, never the reverse; to transfer data in both directions you must use two pipes. By default this pair of file descriptors is blocking: reading an empty pipe with the read system call blocks until data becomes available, and writing to a full pipe with the write system call blocks until enough free space appears. A pipe has a capacity limit, typically 65536 bytes, which can be changed with the fcntl function. A pipe can pass data between parent and child because the two pipe file descriptors (fd[0] and fd[1]) remain open across the fork call.
A pair of such file descriptors can only guarantee data transfer in one direction between parent and child, so one process should close fd[0] and the other should close fd[1].

# Named pipe (FIFO): A named pipe is also a half-duplex communication method. It overcomes the pipe's lack of a name: in addition to the functionality of a pipe, it allows communication between processes that have no affinity.

# Semaphore: A semaphore is a counter that can be used to control access by multiple processes to a shared resource. It is often used as a locking mechanism that prevents one process from accessing a shared resource while another process is using it, so it serves mainly as a means of synchronization between processes and between threads within the same process. When multiple processes access a resource on the system at the same time, for example writing the same database record or modifying the same file, synchronization is needed to ensure that only one process has exclusive access to the resource at any moment. Typically the code through which a program accesses a shared resource is only a short section, but that code creates a race condition between processes; we call such code a critical section. Synchronizing processes means ensuring that only one process can be inside the critical section at any time. A semaphore is a special variable that takes only natural-number values and supports only two operations: wait and signal, historically written P (from Dutch passeren, to pass, i.e. enter the critical section) and V (from vrijgeven, to release, i.e. exit the critical section).
Assume a semaphore SV. The P and V operations on it have the following meanings. P(SV): if the value of SV is greater than 0, decrement it by 1; if the value of SV is 0, suspend the execution of the process. V(SV): if another process is suspended waiting on SV, wake it; otherwise increment SV by 1. The value of a semaphore can be any natural number, but the most common and simplest semaphore is the binary semaphore, which can take only the values 0 and 1. The Linux System V semaphore API is declared in the sys/sem.h header file and consists mainly of three system calls:
1. semget: create a new semaphore set, or obtain an existing one. On success it returns a nonnegative integer, the identifier of the semaphore set; on failure it returns -1 and sets errno.
2. semop: change the values of semaphores in the set, i.e. perform the P and V operations.
3. semctl: allows the caller to control the semaphore directly.
When the critical section is free, the binary semaphore SV has the value 1, and both process A and process B have the chance to enter. If process A performs P(SV) at that point, SV is reduced by 1, and process B will then be suspended if it performs P(SV). Only when process A leaves the critical section and performs V(SV), adding 1 back to SV, does the critical section become available again; if process B was suspended waiting on SV, it is woken and enters the critical section. Likewise, if process A then performs P(SV) again, it can only be suspended by the operating system to wait for process B to exit the critical section.

# Message queue: A message queue is a linked list of messages, stored in the kernel and identified by a message queue identifier.
A message queue overcomes the drawbacks that signals carry little information, that pipes carry only unformatted byte streams, and that buffer sizes are limited. It is a simple and efficient way to pass blocks of binary data between two processes. Each data block has a specific type, and the receiver can receive selectively by type rather than having to take data strictly first-in, first-out as with pipes and named pipes.
1. msgget: create a message queue, or obtain an existing one.
2. msgsnd: add a message to the message queue. A blocking msgsnd call can be interrupted by two exceptional conditions: (1) the message queue is removed, in which case msgsnd returns immediately and sets errno to EIDRM; (2) the program receives a signal, in which case msgsnd returns immediately and sets errno to EINTR. msgsnd returns 0 on success and -1 on failure, setting errno. On success, msgsnd modifies some fields of the kernel data structure msqid_ds: (1) msg_qnum is incremented by 1; (2) msg_lspid is set to the PID of the calling process; (3) msg_stime is set to the current time.
3. msgrcv: take a message from the message queue. A blocking msgrcv call can be interrupted by the same two exceptional conditions: (1) the message queue is removed, in which case msgrcv returns immediately and sets errno to EIDRM; (2) the program receives a signal, in which case msgrcv returns immediately and sets errno to EINTR. msgrcv returns the number of bytes copied on success and -1 on failure, setting errno. On success, msgrcv modifies some fields of msqid_ds: (1) msg_qnum is decremented by 1; (2) msg_lrpid is set to the PID of the calling process; (3) msg_rtime is set to the current time.
4. msgctl: control properties of the message queue.

# Signal: A signal is a relatively sophisticated communication mechanism used to notify the receiving process that some event has occurred.

# Shared memory: Shared memory is a region of memory created by one process and mapped so that it can be accessed by multiple processes. It is the fastest IPC method and was designed specifically to address the low efficiency of the other inter-process communication mechanisms. It is often used together with other mechanisms, such as semaphores, to achieve synchronization as well as communication between processes. The System V interface:
1. shmget: create a new shared memory segment, or obtain an existing one. On success it returns a nonnegative integer, the identifier of the segment; on failure it returns -1 and sets errno.
2. shmat: attach the newly created/obtained shared memory to the address space of the process.
3. shmdt: after using the shared memory, detach it from the process's address space.
4. shmctl: control properties of the shared memory segment.

The POSIX approach to shared memory requires no backing file; a POSIX shared memory object is created or opened with the following function:

#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
int shm_open(const char *name, int oflag, mode_t mode);

On success, shm_open returns a file descriptor, which can be used in a subsequent mmap call to associate the shared memory with the calling process; on failure it returns -1 and sets errno. A shared memory object created with shm_open should be deleted once it is no longer needed, just as open files are closed:

#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
int shm_unlink(const char *name);

This function marks the shared memory object specified by the name parameter for deletion.
Once all processes that use the shared memory object have detached from it with munmap, the system destroys the resources the object occupies. If you use these POSIX shared memory functions in your code, you need to specify the linker option -lrt when compiling.

# Socket: A socket is also an inter-process communication mechanism; unlike the other mechanisms, it can be used between processes on different machines.

Shared memory is highly efficient because processes read and write the memory directly, without any copying of the data. Communication methods such as pipes and message queues require four copies of the data between kernel and user space, while shared memory needs only two: one from the input file into the shared memory region, and one from the shared memory region to the output file. In practice, processes sharing memory do not read and write a small amount of data and then re-establish the shared region for each new exchange. Instead, the shared region is kept until the communication is finished, so the data stays in shared memory and is not written back to a file; content in shared memory is usually written back to the file only when the mapping is removed. This makes shared memory a very efficient communication method.

Disadvantages of the various communication methods:
1. Pipe: slow, limited capacity, and only parent-child processes can communicate.
2. Named pipe (FIFO): any processes can communicate, but it is slow.
3. Message queue: capacity is limited by the system, and when reading you must watch out for data left over from the previous read.
4. Semaphore: cannot carry complex messages; it can only be used for synchronization.
5. Shared memory: capacity is easy to control and it is fast, but synchronization must be maintained: when one process is writing, another must be careful about read/write conflicts, much like thread safety between threads. Shared memory can of course also be used for communication between threads, but there is no need: memory within a single process is already shared among its threads.

