Lock-free Algorithm in Linux Kernel

Source: Internet
Author: User

Linux Device Drivers (third edition) spends several pages analyzing the implementation of a lock-free algorithm. I want to examine that analysis in detail here, for two reasons. First, the book's analysis of the circular buffer is wrong for the case where the buffer is full. Second, the book does not explain the implementation techniques in detail. To address both points, this article analyzes the kfifo.h and kfifo.c code from kernel 2.6.11 (2.6.10 and 2.6.11 are the same) more thoroughly.

The usual way to access a critical section is to take a lock before entering it and release the lock on leaving. Acquiring the lock may involve a long wait, which hurts efficiency. If lock-free access can be implemented safely, efficiency will undoubtedly improve. Of course, everything is subject to certain environments and conditions, and lock-free algorithms are no exception. So under what circumstances can a lock-free algorithm be implemented and used? Stelian, the author of the code, tells us clearly in the comment at lines 115-117 of kfifo.c: if only one reader and one writer access the critical section concurrently, the locks can be avoided. The following is the circular buffer data structure declared in kfifo.h:

    struct kfifo {
            unsigned char *buffer;  /* the buffer holding the data */
            unsigned int size;      /* the size of the allocated buffer */
            unsigned int in;        /* data is added at offset (in % size) */
            unsigned int out;       /* data is extracted from off. (out % size) */
            spinlock_t *lock;       /* protects concurrent modifications */
    };

buffer is the circular buffer; its length is size bytes.

size is the size of the buffer.

in is used when writing data: the write index into buffer is obtained with (in % size).

out is used when reading data: the read index into buffer is obtained with (out % size).

buffer, size, in, and out are tied together in the code by a set of logical relationships. Their properties and relationships, illustrated in Figure A, are as follows:

1. When the buffer is full, no more data may be written, so data already in the buffer is never overwritten. When the buffer is empty, nothing is read, so useless data is never returned.

2. Semantically, in is always greater than or equal to out, so (out + len) <= in holds for any len in the range 0 <= len <= in - out, and in - out never exceeds size. Note that this is a semantic relationship between in and out. As long as neither value has exceeded the range of an unsigned integer, the relationship also holds numerically.

3. When (in == out), the buffer is empty. When ((in != out) && ((in % size) == (out % size))), that is, the read and write indices are equal while in and out are not, the buffer is full. Full here means the data occupies the entire buffer; no byte is wasted, contrary to the description in Linux Device Drivers (third edition).

4. Lines 72-79 of kfifo.c ensure that size is a power of 2.

5. The number of data bytes in the circular buffer is (in - out), which we call buf_data_bytes. Correspondingly, (size - (in - out)) is the number of free bytes in the circular buffer, which we call buf_free_space.

6. (size - (in % size)) is the number of bytes from the write index to the end of the circular buffer, which we call write_index_to_buf_end_bytes. Correspondingly, (size - (out % size)) is the number of bytes from the read index to the end of the circular buffer, which we call read_index_to_buf_end_bytes.
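The quantities in the list above can be checked with a few lines of userspace C. This is only a sketch for experimenting, not kernel code: struct fifo_idx and all function names are hypothetical, chosen to mirror the names buf_data_bytes, buf_free_space, write_index_to_buf_end_bytes and read_index_to_buf_end_bytes used above.

```c
#include <assert.h>

/* Hypothetical stand-in for the index fields of struct kfifo. */
struct fifo_idx {
	unsigned int size;	/* buffer size, must be a power of 2 */
	unsigned int in;	/* free-running write counter */
	unsigned int out;	/* free-running read counter */
};

/* Check the invariants from points 2 and 4 of the list. */
void idx_check(const struct fifo_idx *f)
{
	assert(f->size != 0 && (f->size & (f->size - 1)) == 0);
	assert(f->in - f->out <= f->size);
}

/* Point 5: bytes currently stored in the buffer. */
unsigned int buf_data_bytes(const struct fifo_idx *f)
{
	return f->in - f->out;
}

/* Point 5: bytes that can still be written. */
unsigned int buf_free_space(const struct fifo_idx *f)
{
	return f->size - (f->in - f->out);
}

/* Point 6: bytes from the write index to the end of the buffer. */
unsigned int write_index_to_buf_end_bytes(const struct fifo_idx *f)
{
	return f->size - (f->in % f->size);
}

/* Point 6: bytes from the read index to the end of the buffer. */
unsigned int read_index_to_buf_end_bytes(const struct fifo_idx *f)
{
	return f->size - (f->out % f->size);
}

/* Point 3: empty and full tests. */
int buf_is_empty(const struct fifo_idx *f)
{
	return f->in == f->out;
}

int buf_is_full(const struct fifo_idx *f)
{
	return f->in != f->out && (f->in % f->size) == (f->out % f->size);
}
```

With size = 8, in = 5 and out = 0, these helpers report 5 data bytes, 3 free bytes, and 3 bytes from the write index to the end of the buffer. Raising in to 8 makes buf_is_full true with all 8 bytes in use.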

kfifo.c is short, and the part analyzed here is shorter still: mainly the read and write interfaces of the buffer, get and put:

 

...............         // the dots stand for omitted code
 72     /*
 73      * round up to the next power of 2, since our 'let the indices
 74      * wrap' technique works only in this case.
 75      */
        // This makes sure that size is a power of 2.
 76     if (size & (size - 1)) {
 77             BUG_ON(size > 0x80000000);
 78             size = roundup_pow_of_two(size);
 79     }
...............
105 /*
106  * __kfifo_put - puts some data into the FIFO, no locking version
107  * @fifo: the fifo to be used.
108  * @buffer: the data to be added.
109  * @len: the length of the data to be added.
110  *
111  * This function copies at most 'len' bytes from the 'buffer' into
112  * the FIFO depending on the free space, and returns the number of
113  * bytes copied.
114  *
115  * Note that with only one concurrent reader and one concurrent
116  * writer, you don't need extra locking to use these functions.
117  */
118 unsigned int __kfifo_put(struct kfifo *fifo,
119                          unsigned char *buffer, unsigned int len)
120 {
121         unsigned int l;
122
        // Read this function together with Figure B.
        // The second argument of min, (fifo->size - (fifo->in - fifo->out)),
        // is clearly the buf_free_space defined above. The smaller of
        // buf_free_space and the len parameter of __kfifo_put is the number
        // of bytes written to the circular buffer this time; that is, after
        // the call to min, len is the number of bytes actually written.
        // Note: fifo->size may be smaller than fifo->in here, but fifo->size,
        // fifo->in and fifo->out are all unsigned integers, so the arithmetic
        // still gives the right result. This technique is explained later.
123         len = min(len, fifo->size - fifo->in + fifo->out);
124
125         /* first put the data starting from fifo->in to buffer end */
        // (fifo->in & (fifo->size - 1)) is equivalent to
        // (fifo->in % fifo->size); this technique is explained later.
        // Since they are equivalent,
        // fifo->size - (fifo->in & (fifo->size - 1)) is
        // write_index_to_buf_end_bytes. The purpose of min is then obvious:
        // it yields the number of bytes to write in the range from the
        // write index to the end of the circular buffer.
126         l = min(len, fifo->size - (fifo->in & (fifo->size - 1)));
127         memcpy(fifo->buffer + (fifo->in & (fifo->size - 1)), buffer, l);
128
129         /* then put the rest (if any) at the beginning of the buffer */
        // If some data is still unwritten, write the remaining len - l
        // bytes starting at the beginning of the buffer.
130         memcpy(fifo->buffer, buffer + l, len - l);
131
        // Note that fifo->in is only ever increased, never decreased. When
        // it passes the largest value it can represent, it wraps around
        // according to the rules of unsigned integers. This is also an
        // unsigned integer technique and is explained later.
132         fifo->in += len;
133
134         return len;
135 }
136 EXPORT_SYMBOL(__kfifo_put);
137
138 /*
139  * __kfifo_get - gets some data from the FIFO, no locking version
140  * @fifo: the fifo to be used.
141  * @buffer: where the data must be copied.
142  * @len: the size of the destination buffer.
143  *
144  * This function copies at most 'len' bytes from the FIFO into the
145  * 'buffer' and returns the number of copied bytes.
146  *
147  * Note that with only one concurrent reader and one concurrent
148  * writer, you don't need extra locking to use these functions.
149  */
150 unsigned int __kfifo_get(struct kfifo *fifo,
151                          unsigned char *buffer, unsigned int len)
152 {
153         unsigned int l;
154
        // Read this function together with Figure B.
        // After the call to min, len is the number of bytes actually read.
155         len = min(len, fifo->in - fifo->out);
156
157         /* first get the data from fifo->out until the end of the buffer */
158         l = min(len, fifo->size - (fifo->out & (fifo->size - 1)));
159         memcpy(buffer, fifo->buffer + (fifo->out & (fifo->size - 1)), l);
160
161         /* then get the rest (if any) from the beginning of the buffer */
        // If fewer bytes than requested have been read so far, read the
        // remaining len - l bytes from the beginning of the buffer.
162         memcpy(buffer + l, fifo->buffer, len - l);
163
        // fifo->out is also only ever increased, just like fifo->in.
164         fifo->out += len;
165
166         return len;
167 }
168 EXPORT_SYMBOL(__kfifo_get);
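To experiment with the two functions outside the kernel, they can be ported to plain C almost line for line. The sketch below is such a port under stated assumptions: struct ufifo, ufifo_init, ufifo_put, ufifo_get and the MIN macro are hypothetical names, the spinlock field is dropped, and the buffer is a fixed 8-byte array. It is meant for single-threaded experiments only.

```c
#include <assert.h>
#include <string.h>

#define UFIFO_SIZE 8u			/* must be a power of 2 */
#define MIN(a, b) ((a) < (b) ? (a) : (b))

/* Hypothetical userspace counterpart of struct kfifo (no lock). */
struct ufifo {
	unsigned char buffer[UFIFO_SIZE];
	unsigned int size;
	unsigned int in;		/* free-running write counter */
	unsigned int out;		/* free-running read counter */
};

void ufifo_init(struct ufifo *f)
{
	/* the index-wrapping trick only works for power-of-2 sizes */
	assert((UFIFO_SIZE & (UFIFO_SIZE - 1)) == 0);
	memset(f, 0, sizeof(*f));
	f->size = UFIFO_SIZE;
}

/* Same steps as __kfifo_put: clamp, copy to end, copy rest, update in. */
unsigned int ufifo_put(struct ufifo *f, const unsigned char *src,
		       unsigned int len)
{
	unsigned int l;

	len = MIN(len, f->size - f->in + f->out);
	l = MIN(len, f->size - (f->in & (f->size - 1)));
	memcpy(f->buffer + (f->in & (f->size - 1)), src, l);
	memcpy(f->buffer, src + l, len - l);
	f->in += len;
	return len;
}

/* Same steps as __kfifo_get: clamp, copy to end, copy rest, update out. */
unsigned int ufifo_get(struct ufifo *f, unsigned char *dst,
		       unsigned int len)
{
	unsigned int l;

	len = MIN(len, f->in - f->out);
	l = MIN(len, f->size - (f->out & (f->size - 1)));
	memcpy(dst, f->buffer + (f->out & (f->size - 1)), l);
	memcpy(dst + l, f->buffer, len - l);
	f->out += len;
	return len;
}
```

For example, after putting "abcde" and getting 3 bytes, a request to put 10 more bytes stores only 6 of them, because that is all the free space there is, and the copy wraps around the end of the array. A following get of up to 16 bytes then returns the 8 bytes "de012345".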

Now we can analyze a scenario in which one reader and one writer access the circular buffer concurrently (Fig C) to see what lock-free access looks like. Assume that size is 8, out is 0, and in is 5. Process A needs to write 10 bytes of data to the circular buffer, and process B needs to read 15 bytes of data from it. For brevity, process A is called A and process B is called B. The following is a hypothetical analysis of seven time periods of A and B:

1. By the end of period 1, A has copied 3 bytes of data into the circular buffer. However, A is preempted in favor of B before the write index update at line 132 of kfifo.c is executed. So although the data is already in the circular buffer, the write index has not been updated, and logically the buffer still holds in - out = 5 bytes.

2. In period 2, B reads all the data in the buffer and updates the read index. At the end of period 2, in and out are both 5, which means the buffer is empty.

3. In period 3, the system schedules B again. Although A has copied data into the buffer, the write index has still not been updated. in and out are both 5, so the buffer still looks empty, and B can call the scheduler directly and give up the CPU.

4. In period 4, A updates the write index, and then the scheduler schedules B again.

5. At the end of period 4, in is 8 and out is 5, so there are 3 bytes of data in the buffer. In period 5, B reads those 3 bytes and then calls the scheduler to give up the CPU.

6. In period 6, A writes all of its remaining data to the buffer, and in period 7, B reads the rest of the bytes it wants.

In period 3 we can see that the reading process is in effect waiting (the writing process waits in a similar way), but this is not the sleep and wake-up waiting of locking and unlocking: the process can voluntarily give up the CPU, or it can poll. Whenever the reading process is scheduled, it can access the critical section of the buffer immediately, without being restricted by a lock. The lock-free goal has therefore been achieved.

But how does kfifo.c ensure that the data is correct? Ensuring the correctness of reads is similar to ensuring the correctness of writes, so only writes are discussed here. Two things guarantee the correctness of a write: the order of writing the data and updating the write index, and computing the correct number of writable bytes.

(1) In __kfifo_put, the data is written first and the write index is updated afterwards. This ensures that the data seen by the reading process is correct. If the order were reversed, with the index updated before the data is written, then under concurrent access the writing process might be put to sleep right after updating the index, handing the CPU to the reading process. The reading process would then very likely read incorrect data, because the updated write index declares that the previously free space now contains data and may be read, while in fact the writing process has not written that data yet.

(2) At line 123 of the code, the correct number of writable bytes is obtained by comparing the requested length with the remaining free space, which ensures that existing data in the buffer is not overwritten during the write.
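The seven periods above can be replayed deterministically in a single thread by splitting the write into its two halves: copy the data first, then publish the new write index. The sketch below does that under stated assumptions: struct sfifo, put_copy, put_publish, sfifo_get and replay are hypothetical names, and the initial 5 bytes in the buffer are arbitrarily chosen as "abcde". Because it runs in one thread, it only illustrates the ordering argument and says nothing about how a real multiprocessor orders memory accesses.

```c
#include <assert.h>
#include <string.h>

#define SIZE 8u				/* power of 2, as in the scenario */

struct sfifo {
	unsigned char buffer[SIZE];
	unsigned int in;
	unsigned int out;
};

/* First half of a put: copy the data but leave f->in untouched. */
unsigned int put_copy(struct sfifo *f, const unsigned char *src,
		      unsigned int len)
{
	unsigned int free_space = SIZE - (f->in - f->out);
	unsigned int l;

	if (len > free_space)
		len = free_space;
	l = SIZE - (f->in & (SIZE - 1));
	if (l > len)
		l = len;
	memcpy(f->buffer + (f->in & (SIZE - 1)), src, l);
	memcpy(f->buffer, src + l, len - l);
	return len;
}

/* Second half of a put: make the copied bytes visible to the reader. */
void put_publish(struct sfifo *f, unsigned int len)
{
	f->in += len;
}

unsigned int sfifo_get(struct sfifo *f, unsigned char *dst, unsigned int len)
{
	unsigned int avail = f->in - f->out;
	unsigned int l;

	if (len > avail)
		len = avail;
	l = SIZE - (f->out & (SIZE - 1));
	if (l > len)
		l = len;
	memcpy(dst, f->buffer + (f->out & (SIZE - 1)), l);
	memcpy(dst + l, f->buffer, len - l);
	f->out += len;
	return len;
}

/* Replays periods 1-7; dst receives the 15 bytes B asked for. */
unsigned int replay(unsigned char dst[15], unsigned int *period3_bytes)
{
	struct sfifo f = { {0}, 5, 0 };		/* in = 5, out = 0 */
	const unsigned char *src = (const unsigned char *)"0123456789";
	unsigned int a_done = 0, b_done = 0, pending;

	memcpy(f.buffer, "abcde", 5);		/* the 5 bytes already stored */

	pending = put_copy(&f, src, 10);	/* 1: A copies, is preempted */
	b_done += sfifo_get(&f, dst + b_done, 15 - b_done);	/* 2 */
	*period3_bytes = sfifo_get(&f, dst + b_done, 15 - b_done); /* 3 */
	b_done += *period3_bytes;
	put_publish(&f, pending);		/* 4: A updates the index */
	a_done += pending;
	b_done += sfifo_get(&f, dst + b_done, 15 - b_done);	/* 5 */
	pending = put_copy(&f, src + a_done, 10 - a_done);	/* 6 */
	put_publish(&f, pending);
	a_done += pending;
	b_done += sfifo_get(&f, dst + b_done, 15 - b_done);	/* 7 */

	assert(a_done == 10);			/* A wrote everything */
	return b_done;
}
```

In this replay, B gets 5 bytes in period 2, nothing in period 3 (the 3 bytes A has copied are not yet published), 3 bytes in period 5 and the final 7 bytes in period 7, in the order in which they were written.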

Finally, I will explain some of the techniques used in this lock-free algorithm:

1. Binary technique: If a number is 2 to the Nth power minus 1, its low N bits are all 1 and all other bits are 0. If a number is divisible by 2 to the Nth power, its low N bits must all be 0; conversely, if the low N bits are not all 0, the number is not divisible by 2 to the Nth power. The low N bits of a number are its remainder modulo 2 to the Nth power. For example, if a number is divisible by 8 (2 to the 3rd power), its low 3 bits must be 0. In 16 bits, the binary values of 16 and 32 are 0000 0000 0001 0000 and 0000 0000 0010 0000 respectively, and their low 3 bits are all 0. The binary values of 19 and 33 are 0000 0000 0001 0011 and 0000 0000 0010 0001 respectively; their low 3 bits are not all 0, and those low 3 bits are the remainder modulo 8 (3 and 1). Being divisible by 2 to the Nth power is also what is meant by being aligned to XX bytes, or lying on an XX-byte boundary, where XX is generally a power of 2, as in 8-byte alignment and page alignment. Knowing this, you can use the '&' operation to test whether a variable is aligned to a given number of bytes. For example, to test whether variable x is 8-byte aligned, use if ((x & 7) == 0), which tests whether x is divisible by 8, that is, whether x lies on an 8-byte boundary.

So why are (fifo->in & (fifo->size - 1)) and (fifo->in % fifo->size) equivalent? Assume fifo->size is 8, which is 2 to the 3rd power. Then (8 - 1) is 7, a binary value whose low 3 bits are 1 and whose higher bits are 0. ANDing 7 with an integer extracts the low 3 bits of that integer, which is the remainder of dividing it by 8.

2. Unsigned integer technique: On a 32-bit machine, the maximum value of an unsigned integer is 4294967295 and the minimum value is 0. Arithmetic therefore wraps around: (0 - 1) == 4294967295, (0 - 2) == 4294967294, (4294967295 + 1) == 0, and (4294967295 + 2) == 1. This explains lines 132 and 164 of kfifo.c, where fifo->in and fifo->out are only ever increased and never decreased. When fifo->in or fifo->out passes the maximum value of an unsigned integer, it starts again from the minimum value 0. Suppose fifo->in has already passed the maximum and is now 2, while fifo->out is 4294967294; then (fifo->out + 4) == fifo->in. Therefore, for any len in the range 0 <= len <= fifo->in - fifo->out, which never exceeds fifo->size, (fifo->out + len) <= fifo->in always holds semantically.

The kfifo.c code uses unsigned integers consistently and does not mix them with signed values. According to the author, this is more efficient than mixing several types, because mixed types have to be converted before they can be combined. Note also that when an unsigned integer and a signed integer of the same width are combined in an operation, the signed value is first converted to unsigned, and the result is unsigned. Pay special attention to the value range of unsigned integers in such operations, particularly the boundary at 0. I have tracked down bugs of this kind twice. They look like this: if ((x - y) > 0), where x is an unsigned integer. When x is smaller than y, this condition goes wrong: because the subtraction is unsigned, the result cannot go below the unsigned minimum of 0, so it wraps around to a value far greater than 0 and the test gives the wrong answer. We recommend not using unsigned integer arithmetic in this kind of test.

3. Circular buffer technique: This is implemented through the operations on in and out; see the code analysis above.
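The binary and unsigned integer techniques above can be tried out with a few small functions. This is a minimal sketch: all function names are hypothetical, and uint32_t is used in place of unsigned int only to make the 32-bit width explicit.

```c
#include <assert.h>
#include <stdint.h>

/* x % size computed with a mask; valid only when size is a power of 2. */
uint32_t mod_pow2(uint32_t x, uint32_t size)
{
	assert(size != 0 && (size & (size - 1)) == 0);
	return x & (size - 1);
}

/* Nonzero if x lies on an align-byte boundary (align a power of 2). */
int is_aligned(uint32_t x, uint32_t align)
{
	return (x & (align - 1)) == 0;
}

/* Bytes stored in the FIFO; still correct after in has wrapped past 0. */
uint32_t bytes_between(uint32_t in, uint32_t out)
{
	return (uint32_t)(in - out);
}

/* The bug described above: the difference wraps, so this is x != y. */
int buggy_is_greater(uint32_t x, uint32_t y)
{
	return (uint32_t)(x - y) > 0;
}

/* Comparing the operands directly avoids the wraparound. */
int safe_is_greater(uint32_t x, uint32_t y)
{
	return x > y;
}
```

With these definitions, mod_pow2(19, 8) gives 3 and mod_pow2(33, 8) gives 1, matching the low 3 bits shown above. bytes_between(2, 4294967294) gives 4, which is the wrapped case from the text, and buggy_is_greater(1, 2) wrongly reports true while safe_is_greater(1, 2) reports false.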
