1. Ring buffer
The benefit of a buffer is that it trades space for time and coordinates slow and fast threads. Buffers appear in many designs; here I discuss a few ways to design a ring buffer, which can be seen as several variants of the ring-buffer pattern. Designing a ring buffer involves several questions: first, how to handle an index that runs past the buffer size; second, how to represent "buffer full" and "buffer empty"; third, how to enqueue and dequeue; and fourth, how to compute the length of the data in the buffer.
P.S. In all of the scenarios below, assume that no data can be written when the buffer is full and no data can be read when the buffer is empty.
1.1. Regular array ring buffer
Let the buffer size be N, with out as the queue-head (dequeue) index and in as the queue-tail (enqueue) index; both are array subscripts:
- Initially, in = out = 0
- Head and tail are updated with a modulo operation: out = (out + 1) % N, in = (in + 1) % N
- out == in means the buffer is empty; (in + 1) % N == out means the buffer is full (one slot is sacrificed to distinguish the two states)
- Enqueue: que[in] = value; in = (in + 1) % N;
- Dequeue: ret = que[out]; out = (out + 1) % N;
- Data length: len = (in - out + N) % N
1.2. Improved array ring buffer
Again assume the buffer size is N, with out as the queue head and in as the queue tail, both array subscripts, but now of type unsigned int.
- Initially, in = out = 0
- Round the buffer size N up to a power of 2, call it M
- Head and tail updates no longer take the modulus; just ++out, ++in
- out == in means the buffer is empty; (in - out) == M means the buffer is full
- Enqueue: que[in & (M - 1)] = value; ++in;
- Dequeue: ret = que[out & (M - 1)]; ++out;
- in - out is the data length
This improvement comes from kfifo, the circular queue in the Linux kernel. Let me explain the meaning and rationale of these choices.
⑴ Rounding the buffer size up to a power of 2
This makes the modulus cheap: when M is a power of 2, x % M == x & (M - 1), and the bitwise AND is faster than a division. An example shows why the identity holds:
Assume M = 8 = 2³, so M - 1 = 7, binary 0000 0111.
① If x < 8, then x & 7 = x and x % 8 = x, so the identity holds.
② If x ≥ 8, decompose x into powers of 2: x = 2^a + 2^b + 2^c + .... For example, 51 = 1 + 2 + 16 + 32 = 2^0 + 2^1 + 2^4 + 2^5. When computing 51 & 7, since 7 is binary 0000 0111, any power of 2 greater than or equal to 2³ ANDs with 7 to give 0: 2^4 & 7 = 0 and 2^5 & 7 = 0, so (2^0 + 2^1 + 2^4 + 2^5) & 7 = 2^0 + 2^1 = 3. By ①, (2^0 + 2^1) & 7 = (2^0 + 2^1) % 8, hence 51 & 7 = 51 % 8.
Together, ① and ② prove the identity.
⑵ Declaring out and in as unsigned int
After an unsigned integer overflows, it counts again from 0: UINT_MAX + 1 = 0, UINT_MAX + 2 = 1, and so on.
Before in and out overflow, the & maps them to the correct positions. What about after overflow? An example: suppose in = UINT_MAX. Then in & (M - 1) = M - 1, the last slot, so the next enqueue should go to slot 0. And indeed in + 1 overflows to 0, so (in + 1) & (M - 1) still maps to the correct slot. That is why an enqueue only needs the mask and a ++ to stay correct, even across overflow.
The elements in the queue always live in the interval [out, in), which moves with each enqueue and dequeue. Because of overflow there are three cases:
- Neither out nor in has overflowed: in - out is the length of the data in the buffer.
- out has not overflowed but in has: the length should be (UINT_MAX - out + 1) + in = in - out + UINT_MAX + 1, which modulo 2^32 is exactly the unsigned subtraction in - out.
- Both out and in have overflowed: the length is again in - out.
In all three cases, in - out gives the length of the data in the ring queue.
I have to marvel at how subtle the kfifo implementation in the Linux kernel is. Compared with the previous version, every modulo is replaced by a bitwise AND, and enqueue, dequeue, and the length computation all become very simple.
1.3. Ring buffer implemented with a linked list
A linked-list implementation of the ring buffer is simpler than the array one. One possible design, for a ring buffer of capacity N:
- Queue length: maintain a size member and read it in O(1), or traverse the queue in O(n) to compute it
- Queue empty: head->next == NULL
- Queue full: size == N
- Dequeue core:
ret = out;
out = out->next;
head->next = out;
--size;
- Enqueue core (new_node is the newly allocated node):
new_node->next = in->next;
in->next = new_node;
in = new_node;
++size;
Of course, the design of the linked-list node is up to you; a node can itself contain an array, a list, a hash table, and so on. For example, with an array inside each node, add two extra variables, out_pos and in_pos. Suppose the array inside each node holds n_elems elements and the list has node_nums nodes.
- Queue length: (node_nums - 2) * n_elems + (n_elems - out_pos) + in_pos
- Queue empty: head->next == NULL
- Queue full: queue length == N
- Dequeue core:
if (out_pos == n_elems) {   /* current node exhausted, drop it */
    delete_node = out;
    out = out->next;
    free(delete_node);
    out_pos = 0;
    head->next = out;
}
ret = out->elems[out_pos++];
- Enqueue core (new_node is the newly allocated node):
if (in_pos == n_elems) {    /* current node full, link in a new one */
    new_node->next = in->next;
    in->next = new_node;
    in = new_node;
    in_pos = 0;
}
in->elems[in_pos++] = value;
1.4. Improve the ring buffer of the linked list
The above linked list ring queue out of the queue may free memory, into the queue may request memory, so, can use a free list to manage the freed memory, into the queue, if you want to increase the node, first from the idle list of nodes, not to go to request memory, so you can avoid multiple allocations to free memory, As for the other operations are the same.
So far we have only covered the enqueue and dequeue operations themselves. In practice a buffer is usually shared by reader and writer threads. Threads of the same kind may need mutually exclusive access to each slot, or may be able to share it, while threads of different kinds (readers vs. writers) usually need to synchronize: when the buffer is full the writer cannot write, and when the buffer is empty the reader cannot read.

These are in fact the classic PV-operation patterns from an operating systems course. If readers must exclude each other and writers must exclude each other, but readers and writers need no mutual exclusion beyond the full/empty synchronization, we have the producer-consumer problem. If readers can share each slot (a slot may be read by several reader threads at once), writers exclude each other (each slot is written by only one writer), and readers and writers also exclude each other (no writing while reading, no reading while writing), we have the readers-writers problem.

Below, taking the producer-consumer model and the improved ring buffer of section 1.2 as the example, I describe lock-based implementations of a concurrent ring queue; a follow-up post will cover the lock-free implementation. The readers-writers problem can be discussed some other time.
2. Producer-consumer
Let's start with the advantages of the producer-consumer model.
- Concurrency: if the data in the buffer are all processed the same way, multiple threads or processes can be started to consume or produce the data
- Asynchrony: producers do not wait for consumers to take the data, and consumers do not wait for producers to produce it; each side just reacts to the state of the buffer. Combined with I/O multiplexing this is known as the reactor pattern and supports well-designed asynchronous communication architectures; the inter-thread communication at the bottom of ZeroMQ uses this kind of scheme.
- Decoupling: arguably a side effect. Producers and consumers are not directly related; a producer never calls a consumer's methods or vice versa, so a change on either side does not affect the other.
- Buffering: each side maintains its own pace; for example, if the producer is fast and the consumer cannot keep up, the data can simply wait in the buffer.
Now let's get started. By the number of producers and consumers, the producer-consumer problem comes in four variants: 1:1, 1:n, m:1, and m:n.

One convention: the ring buffer size is M, M is a power of 2, and the positions indexed by in and out are uniformly called slots.
2.1. Single producer, single consumer
One producer, one consumer; the buffer has M available slots.
In this case the producer and the consumer only need to synchronize. Use two semaphores, available_in_slots and available_out_slots, to count the resources available to the producer (free slots) and to the consumer (filled slots): each time a product is made, the producer's available resources decrease by 1 and the consumer's increase by 1. This maps directly onto PV operations: a P operation consumes one resource (decrementing the count, blocking at 0), and a V operation produces one resource (incrementing the count). Initially available_in_slots = M, meaning the producer has M slots to fill, and available_out_slots = 0, meaning the consumer has nothing to consume:
available_in_slots = M; available_out_slots = 0; in = out = 0;

void producer() {
    while (true) {
        P(available_in_slots);
        queue[(in++) & (M - 1)] = data;
        V(available_out_slots);
    }
}

void consumer() {
    while (true) {
        P(available_out_slots);
        data = queue[(out++) & (M - 1)];
        V(available_in_slots);
    }
}
2.2. Single producer, multiple consumers
One producer, multiple consumers; the buffer has M available slots.
Now there are several consumers, which must access the out slot mutually exclusively. Use out_slot_mutex for the mutual exclusion among consumers: the consumer thread that acquires out_slot_mutex proceeds, and the others block. The producer and the consumers still synchronize through available_in_slots and available_out_slots.
available_in_slots = M; available_out_slots = 0; out_slot_mutex = 1; in = out = 0;

void producer() {
    while (true) {
        P(available_in_slots);
        queue[(in++) & (M - 1)] = data;
        V(available_out_slots);
    }
}

void consumer() {
    while (true) {
        P(available_out_slots);
        P(out_slot_mutex);
        data = queue[(out++) & (M - 1)];
        V(out_slot_mutex);
        V(available_in_slots);
    }
}
2.3. Multiple producers, single consumer
This is symmetric to 2.2 and uses the same approach: with multiple producers, the in slot must be accessed mutually exclusively. Use in_slot_mutex for the mutual exclusion among producers: the producer thread that acquires in_slot_mutex proceeds, and the others block. The producers and the consumer synchronize through available_in_slots and available_out_slots.
available_in_slots = M; available_out_slots = 0; in_slot_mutex = 1; in = out = 0;

void producer() {
    while (true) {
        P(available_in_slots);
        P(in_slot_mutex);
        queue[(in++) & (M - 1)] = data;
        V(in_slot_mutex);
        V(available_out_slots);
    }
}

void consumer() {
    while (true) {
        P(available_out_slots);
        data = queue[(out++) & (M - 1)];
        V(available_in_slots);
    }
}
2.4. Multiple producers, multiple consumers
Multiple producers, multiple consumers; the buffer has M available slots.
With multiple producers, the in slot must be accessed mutually exclusively, using in_slot_mutex for the mutual exclusion among producers; with multiple consumers, the out slot must also be accessed mutually exclusively, using out_slot_mutex for the mutual exclusion among consumers. Producers and consumers synchronize through available_in_slots and available_out_slots.
available_in_slots = M; available_out_slots = 0; in_slot_mutex = 1; out_slot_mutex = 1; in = out = 0;

void producer() {
    while (true) {
        P(available_in_slots);
        P(in_slot_mutex);
        queue[(in++) & (M - 1)] = data;
        V(in_slot_mutex);
        V(available_out_slots);
    }
}

void consumer() {
    while (true) {
        P(available_out_slots);
        P(out_slot_mutex);
        data = queue[(out++) & (M - 1)];
        V(out_slot_mutex);
        V(available_in_slots);
    }
}
The above are lock-based implementations of a concurrent ring queue, with producer-consumer as the usage scenario. Locks are clearly useful, but they have a big problem: if the owner of a lock dies for some reason while holding it, the result may be deadlock, so this approach carries some risk. While studying the ZeroMQ source code recently I learned a lock-free queue implementation, so next time I will look at the lock-free ring queue, again with producer-consumer as the usage scenario.
Design of ring buffers and their use in the producer-consumer pattern (part 1: lock-based concurrent ring queue)