Problem Description: Rotate a one-dimensional array arr[num_elem] containing num_elem elements to the left by rot_dist positions. Can we complete the rotation in time proportional to num_elem, using only a few dozen extra bytes of storage?
One: Bentley's juggling algorithm
Move arr[0] to a temporary variable tmp, move arr[rot_dist] to arr[0], arr[2*rot_dist] to arr[rot_dist], and so on, until we come back around to reading arr[0]; at that point the program finishes by assigning the value saved in tmp instead.
This method must guarantee two things: 1. every array element is visited; 2. in the last step, arr[0] (that is, the value saved in tmp) is assigned to the appropriate array element.
When num_elem and rot_dist are coprime, the conditions above are naturally satisfied; otherwise they are not. From an algebraic point of view, when num_elem and rot_dist are coprime, the traversal rule above forms a single cycle that rotates all num_elem elements; when num_elem and rot_dist are not coprime (with greatest common divisor common_divisor), the traversal rule forms common_divisor disjoint cycles, each rotating num_elem/common_divisor elements.
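As a small illustration of the cycle structure (my own sketch, not from the original post; `juggling_cycles` is a name introduced here, and `std::gcd` requires C++17):

```cpp
#include <cassert>
#include <numeric>  // std::gcd (C++17)

// Number of disjoint cycles formed by the juggling traversal,
// and the length of each cycle.
struct CycleInfo { int cycles; int cycle_len; };

CycleInfo juggling_cycles(int num_elem, int rot_dist)
{
    int g = std::gcd(num_elem, rot_dist);
    return { g, num_elem / g };
}
```

For example, with num_elem = 6 and rot_dist = 4, gcd(6, 4) = 2, so the traversal forms 2 disjoint cycles of 3 elements each; with num_elem = 7 and rot_dist = 2 (coprime) there is a single cycle of all 7 elements.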
```cpp
unsigned gcd(unsigned m, unsigned n)
{
    unsigned remainder;
    while (n > 0) {
        remainder = m % n;
        m = n;
        n = remainder;
    }
    return m;
}

// Left rotate @arr containing @num_elem elements by @rot_dist positions
// Bentley's juggling algorithm from Programming Pearls
template <typename _type>
void array_left_rotation_juggling(_type *arr, int num_elem, int rot_dist)
{
    if (rot_dist == 0 || rot_dist == num_elem)
        return;

    _type tmp;
    int i, j, k;

    int common_divisor = gcd(num_elem, rot_dist);
    for (i = 0; i < common_divisor; ++i) {
        tmp = arr[i];
        j = i;
        k = (j + rot_dist) % num_elem;
        while (k != i) {
            arr[j] = arr[k];
            j = k;
            k = (k + rot_dist) % num_elem;
        }
        arr[j] = tmp;
    }
}
```
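To sanity-check the juggling rotation, here is a compact int-only restatement using std::vector and C++17's std::gcd (`juggle_rotate` is a name introduced here, not the post's code):

```cpp
#include <cassert>
#include <numeric>  // std::gcd (C++17)
#include <vector>

// Int-only restatement of the juggling rotation: follow each cycle,
// shifting elements rot_dist positions to the left.
void juggle_rotate(std::vector<int>& a, int rot_dist)
{
    int n = (int)a.size();
    if (n == 0 || rot_dist % n == 0) return;
    rot_dist %= n;
    int g = std::gcd(n, rot_dist);
    for (int i = 0; i < g; ++i) {           // one pass per disjoint cycle
        int tmp = a[i];
        int j = i, k = (j + rot_dist) % n;
        while (k != i) {
            a[j] = a[k];
            j = k;
            k = (k + rot_dist) % n;
        }
        a[j] = tmp;                         // close the cycle
    }
}
```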
Two: Gries and Mills block swapping
Rotating an array of num_elem elements left by rot_dist positions is equivalent to swapping the two array blocks arr[0 : rot_dist-1] and arr[rot_dist : num_elem-1] (that is, the block swapping problem). Let x and y denote these two blocks.
- When x and y are of equal length, swap the two blocks directly: xy → yx.
- When x contains more elements, split x into two parts x1 and x2, where the length of x1 equals the length of y; swap x1 and y: x1x2y → yx2x1. Now y is where it should be after the left rotation (after the block swap).
- When y contains more elements, split y into two parts y1 and y2, where the length of y2 equals the length of x; swap x and y2: xy1y2 → y2y1x. Now x is where it should be after the left rotation (it has reached its final position).
After the operation in the 2nd case (3rd case) completes, the problem has been reduced; it is solved by continuing to exchange the blocks x2 and x1 (in the 3rd case, y1 and y2). This is a recursive idea. The description of this method in Programming Pearls is brief, and I was confused here until I read Section 18.1 of The Science of Programming and this reference; this post organizes that understanding.
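The recursive reduction above can be sketched directly (my own int-only sketch, not the post's implementation; `swap_blocks`, `lo`, `hi`, and `m` are names introduced here):

```cpp
#include <cassert>
#include <utility>  // std::swap
#include <vector>

// Rearrange a[lo..hi], viewed as x (its first m elements) followed by y,
// into y followed by x -- the block swapping problem described above.
void swap_blocks(std::vector<int>& a, int lo, int hi, int m)
{
    int len_x = m, len_y = (hi - lo + 1) - m;
    if (len_x <= 0 || len_y <= 0) return;
    if (len_x == len_y) {                       // case 1: xy -> yx directly
        for (int t = 0; t < len_x; ++t)
            std::swap(a[lo + t], a[lo + m + t]);
    } else if (len_x > len_y) {                 // case 2: x1 x2 y -> y x2 x1
        for (int t = 0; t < len_y; ++t)         // swap x1 (length |y|) with y
            std::swap(a[lo + t], a[lo + len_x + t]);
        swap_blocks(a, lo + len_y, hi, len_x - len_y);  // reduce: x2 x1 -> x1 x2
    } else {                                    // case 3: x y1 y2 -> y2 y1 x
        for (int t = 0; t < len_x; ++t)         // swap x with y2 (length |x|)
            std::swap(a[lo + t], a[hi - len_x + 1 + t]);
        swap_blocks(a, lo, hi - len_x, len_x);  // reduce: y2 y1 -> y1 y2
    }
}
```

Calling `swap_blocks(a, 0, n-1, rot_dist)` then performs the left rotation.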
Let's look at a simple example: using the idea above, rotate an array of 7 elements left by 2 positions, where the element values are 0, 1, 2, 3, 4, 5, 6. The image is from here.
The red in the figure marks the shorter block. Observing the execution process reveals several points:
- After the array is divided into left and right blocks, the shorter block is swapped with a sub-block of the longer block; the elements of the shorter block are then in their final positions, and subsequent operations no longer modify that part.
- The sub-block exchanged with the shorter block is the part of the longer block farthest from the shorter block (after each swap, the longer block shrinks accordingly).
- When the two blocks are of equal length, the program finishes after swapping the two equal blocks.
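Since the figure is an image, the step-by-step states of this example can also be reproduced in code. A sketch of my own (`trace_block_swapping` is a name introduced here; it requires 0 < r < n) that runs the iterative block-swapping rotation and records the array after every swap:

```cpp
#include <cassert>
#include <utility>  // std::swap
#include <vector>

using State = std::vector<int>;

// Run the iterative block-swapping rotation (left rotate by r, 0 < r < n)
// on a copy of @a, recording the array contents after every block swap.
std::vector<State> trace_block_swapping(State a, int r)
{
    std::vector<State> states;
    int n = (int)a.size();
    int i = r, j = n - r;                  // unprocessed lengths, left and right
    while (i != j) {
        if (i > j) {                       // swap j elements at r-i and at r
            for (int t = 0; t < j; ++t) std::swap(a[r - i + t], a[r + t]);
            i -= j;
        } else {                           // swap i elements at r-i and at r+j-i
            for (int t = 0; t < i; ++t) std::swap(a[r - i + t], a[r + j - i + t]);
            j -= i;
        }
        states.push_back(a);
    }
    for (int t = 0; t < i; ++t)            // final swap of two equal blocks
        std::swap(a[r - i + t], a[r + t]);
    states.push_back(a);
    return states;
}
```

On the example above (7 elements, left rotate 2), the recorded states are {5,6,2,3,4,0,1}, {3,4,2,5,6,0,1}, {2,4,3,5,6,0,1}, and finally {2,3,4,5,6,0,1}.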
First, we need the ability to swap two equal-length array blocks; the function is implemented as follows.
```cpp
// swap blocks of equal length
// there must be no overlap between blocks
template <typename _type>
void swap_equal_blocks(_type *arr, int beg1, int beg2, int num)
{
    while (num-- > 0)
        std::swap(arr[beg1++], arr[beg2++]);
}
```
Because we need to track the lengths of the two blocks, let i and j be the lengths of the left and right blocks still to be processed: there are i unprocessed elements on the left and j on the right, so i + j is the number of elements not yet in their final positions. The figure below draws the initial state, and the state after the first exchange, for the two cases i > j and i < j. In the figure, r refers to rot_dist, n refers to num_elem, and the gray parts mark elements already in their final positions.
From the figure, the following relationships can be summarized:
- When i > j, the two blocks to exchange are the j elements starting at subscript r-i and the j elements starting at subscript r.
- When i < j, the two blocks to exchange are the i elements starting at subscript r-i and the i elements starting at subscript r+j-i.
- arr[0 : r-i-1] and arr[r+j : n-1] are already in their final positions.
- The i pending elements on the left are always arr[r-i : r-1], and the j pending elements on the right are always arr[r : r+j-1].
Accordingly, the implementation is as follows.
```cpp
template <typename _type>
void array_left_rotation_blockswapping(_type *arr, int num_elem, int rot_dist)
{
    if (rot_dist == 0 || rot_dist == num_elem)
        return;

    int i = rot_dist, j = num_elem - rot_dist;
    while (i != j) {  // could be a dead loop when rot_dist equals 0 or num_elem
        // invariant:
        // arr[0 : rot_dist-i-1] is in final position
        // arr[rot_dist-i : rot_dist-1] is the left part, length i
        // arr[rot_dist : rot_dist+j-1] is the right part, length j
        // arr[rot_dist+j : num_elem-1] is in final position
        if (i > j) {
            swap_equal_blocks(arr, rot_dist - i, rot_dist, j);
            i -= j;
        } else {
            swap_equal_blocks(arr, rot_dist - i, rot_dist + j - i, i);
            j -= i;
        }
    }
    swap_equal_blocks(arr, rot_dist - i, rot_dist, i);
}
```
Three: Reversal algorithm
The best explanation of this method is Doug McIlroy's illustration with a pair of hands; see here. The figure rotates the array left by 5 positions.
Expressed in code, a circular left shift is achieved with three reversals: reverse the left block, reverse the right block, then reverse the whole array. The implementation is as follows:
```cpp
// reverse the elements arr[low : high]
template <typename _type>
void reverse(_type *arr, int low, int high)
{
    while (low < high)
        std::swap(arr[low++], arr[high--]);
}

template <typename _type>
void array_left_rotation_reversal(_type *arr, int num_elem, int rot_dist)
{
    if (rot_dist == 0 || rot_dist == num_elem)
        return;
    reverse(arr, 0, rot_dist - 1);
    reverse(arr, rot_dist, num_elem - 1);
    reverse(arr, 0, num_elem - 1);
}
```
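As a cross-check of the three-reversal identity, here is a minimal int-only sketch using the standard library's std::reverse (`reversal_rotate` is a name introduced here, not the post's code):

```cpp
#include <algorithm>  // std::reverse
#include <cassert>
#include <vector>

// Left rotate by r via three reversals:
// reverse the left block, reverse the right block, reverse everything.
void reversal_rotate(std::vector<int>& a, int r)
{
    int n = (int)a.size();
    if (n == 0 || r % n == 0) return;
    r %= n;
    std::reverse(a.begin(), a.begin() + r);  // reverse arr[0 : r-1]
    std::reverse(a.begin() + r, a.end());    // reverse arr[r : n-1]
    std::reverse(a.begin(), a.end());        // reverse arr[0 : n-1]
}
```

On McIlroy's example (left rotate by 5), {0,1,2,3,4,5,6} becomes {4,3,2,1,0,5,6}, then {4,3,2,1,0,6,5}, then finally {5,6,0,1,2,3,4}.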
About performance: all three methods have O(n) time complexity. Victor J. Duvanenko tested the actual performance of the three algorithms with Intel C++ Composer XE 2011 on an Intel i7 860 (2.8 GHz, with Turbo Boost up to 3.46 GHz). The results show that the Gries-Mills algorithm has the shortest running time, with the reversal algorithm second, though the difference between the two is small. The reversal algorithm, however, has one advantage across the tests: its running time is very stable (that is, the standard deviation over many measurements is small), while the juggling algorithm is worst in both running time and stability. Duvanenko analyzes the cause of this result: the Gries-Mills and reversal algorithms perform well thanks to their cache-friendly memory access patterns, while the juggling algorithm's access pattern is not cache-friendly ("The Gries-Mills and Reversal algorithms performed well due to their cache-friendly memory access patterns. The Juggling algorithm performed the fewest memory accesses, but came in 5X slower due to its cache-unfriendly memory access pattern."). In short, the juggling algorithm is less efficient than the other two; the reversal and Gries-Mills algorithms have comparable running times, but the reversal algorithm is more stable, its principle is easier to understand, and its implementation is more concise.
Notes on Programming Pearls: rotating an array to the left