Shell sort
Shell sort is named after the computer scientist Donald L. Shell, who discovered the algorithm in 1959. It is based on insertion sort, but adds a new feature that greatly improves insertion sort's efficiency.
Insertion sort: too many copies
Because Shell sort is based on insertion sort, it is worth reviewing insertion sort first. Partway through an insertion sort, the data items to the left of the marker are sorted among themselves, and the items to the right are not. The algorithm removes the item at the marker and stores it in a temporary variable. Then, starting with the first cell to the left of the removed item, it shifts the sorted items one cell to the right, one at a time, until the item in the temporary variable can be inserted in its correct position.
Here is the problem with insertion sort. Suppose a very small data item sits near the right end of the array, where the larger items should be. To move this small item to its proper place on the left, all the intermediate items (those between where the item is and where it should go) must each be shifted one cell to the right. This step takes close to N copies for just one item. Not every item has to move a full N positions, but if the average item moves N/2 positions, that is N times N/2 shifts, for a total of N^2/2 copies. The running time of insertion sort is therefore O(N^2).
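The copy count described above can be observed with a small instrumented insertion sort. This is only an illustrative sketch: the class name `InsertionShiftCount` and the method `sortAndCountShifts` are hypothetical names chosen here, not part of the article's code.

```java
import java.util.Arrays;

// Sketch: insertion sort instrumented to count right-shifts (copies).
// Class and method names are hypothetical, chosen for this illustration.
class InsertionShiftCount {

    // Sorts the array in place and returns how many items were shifted right.
    static long sortAndCountShifts(long[] a) {
        long shifts = 0;
        for (int outer = 1; outer < a.length; outer++) {
            long temp = a[outer];            // remove the marked item
            int inner = outer;
            while (inner > 0 && a[inner - 1] > temp) {
                a[inner] = a[inner - 1];     // shift a sorted item one cell right
                inner--;
                shifts++;
            }
            a[inner] = temp;                 // insert the marked item
        }
        return shifts;
    }

    public static void main(String[] args) {
        // One very small item at the far right: it passes all 9 other items.
        long[] smallAtRight = {2, 3, 4, 5, 6, 7, 8, 9, 10, 1};
        System.out.println(sortAndCountShifts(smallAtRight)); // 9

        // Reversed input: 1 + 2 + ... + 9 = 45 shifts, close to N^2/2 for N = 10.
        long[] reversed = {10, 9, 8, 7, 6, 5, 4, 3, 2, 1};
        System.out.println(sortAndCountShifts(reversed));     // 45
        System.out.println(Arrays.toString(reversed));
    }
}
```

The reversed array is the extreme case in which every item has to pass every item before it, which is where the N^2/2 figure is reached.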
If the smaller data items could be moved to the left without shifting all the intermediate items one by one, the algorithm would run much faster.
N-increment sorting
Shell sort widens the interval between the elements compared in insertion sort and runs insertion sort on these widely spaced elements, so data items can move across large spans. Once the items are sorted at one interval, the algorithm reduces the interval and sorts again, and so on. The interval between data items in these sorts is called the increment and is customarily denoted by the letter h. Figure 7.1 shows the first step of sorting a 10-item array with an increment of 4: the data items at positions 0, 4, and 8 are now in order.
Figure 7.1 4-increment sort of the data items at positions 0, 4, and 8
After sorting the data items at positions 0, 4, and 8, the algorithm moves one cell to the right and sorts the items at positions 1, 5, and 9. This continues until all data items have been 4-increment sorted, which means that all items 4 cells apart are in order relative to each other. The process is shown in Figure 7.2 (using a more compact diagram).
After the complete 4-increment sort, the array can be thought of as four sub-arrays: (0, 4, 8), (1, 5, 9), (2, 6), and (3, 7), each of which is fully sorted. These sub-arrays are interleaved but independent of each other.
Figure 7.2 A complete 4-increment sort
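A single 4-increment pass can be sketched as follows. The sample data, the class name `HSortPass`, and the helper `isHSorted` are assumptions made for this illustration. Note also that this loop works through the four sub-arrays in an interleaved fashion rather than finishing one sub-array before starting the next; the resulting array is the same.

```java
import java.util.Arrays;

// Sketch: one h-increment pass of insertion sort (names are hypothetical).
class HSortPass {

    // Insertion-sorts each chain of items that lie h cells apart.
    static void hSort(long[] a, int h) {
        for (int outer = h; outer < a.length; outer++) {
            long temp = a[outer];
            int inner = outer;
            while (inner > h - 1 && a[inner - h] >= temp) {
                a[inner] = a[inner - h];   // move the larger item h cells right
                inner -= h;
            }
            a[inner] = temp;
        }
    }

    // True if a[i] <= a[i + h] for every valid i, i.e. the array is h-sorted.
    static boolean isHSorted(long[] a, int h) {
        for (int i = 0; i + h < a.length; i++) {
            if (a[i] > a[i + h]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        long[] a = {7, 10, 1, 9, 2, 5, 8, 6, 4, 3};   // made-up sample data
        hSort(a, 4);
        System.out.println(Arrays.toString(a));        // [2, 3, 1, 6, 4, 5, 8, 9, 7, 10]
        System.out.println(isHSorted(a, 4));           // true
        System.out.println(isHSorted(a, 1));           // false: not fully sorted yet
    }
}
```

In the result, the sub-arrays at positions (0, 4, 8), (1, 5, 9), (2, 6), and (3, 7) hold 2 4 7, 3 5 10, 1 8, and 6 9 respectively: each is sorted internally even though the whole array is not.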
Note that, in this example, after the 4-increment sort no element is more than two cells from its final sorted position. This is what it means for an array to be "almost sorted", and it is the secret of Shell sort. By creating these interleaved, internally sorted sets of data items, the work needed to finish the sort is kept to a minimum.
Insertion sort is very efficient on an almost-sorted array. If it needs to move each item only one or two cells, it runs in roughly O(N) time. So once the array has been 4-increment sorted, it can be finished with an ordinary insertion sort, which is a 1-increment sort. The combination of the 4-increment sort and the 1-increment sort is much faster than applying an ordinary insertion sort without the preliminary 4-increment sort.
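To make the saving concrete, the sketch below counts element shifts for an ordinary insertion sort and for a 4-increment sort followed by a 1-increment sort on the same made-up 10-item array. The class name `ShiftComparison` and the method `hSortCountingShifts` are hypothetical, and a single small example is only an illustration, not a measurement.

```java
// Sketch: compare shift counts with and without a preliminary 4-increment sort.
// Names and sample data are assumptions made for this illustration.
class ShiftComparison {

    // h-sorts the array in place and returns the number of item shifts.
    static int hSortCountingShifts(long[] a, int h) {
        int shifts = 0;
        for (int outer = h; outer < a.length; outer++) {
            long temp = a[outer];
            int inner = outer;
            while (inner > h - 1 && a[inner - h] > temp) {
                a[inner] = a[inner - h];
                inner -= h;
                shifts++;
            }
            a[inner] = temp;
        }
        return shifts;
    }

    public static void main(String[] args) {
        long[] data = {7, 10, 1, 9, 2, 5, 8, 6, 4, 3};

        // Ordinary insertion sort is simply a 1-increment sort.
        int plain = hSortCountingShifts(data.clone(), 1);

        // 4-increment sort first, then finish with a 1-increment sort.
        long[] copy = data.clone();
        int fourSort = hSortCountingShifts(copy, 4);
        int oneSort = hSortCountingShifts(copy, 1);

        System.out.println("insertion sort only: " + plain);                  // 28
        System.out.println("4-sort then 1-sort:  " + (fourSort + oneSort));   // 6 + 6 = 12
    }
}
```

For this array the combined passes need 12 shifts against 28, because each shift in the 4-increment pass carries an item four cells at once.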
Diminishing intervals
The example above sorted a 10-item array with an initial interval of 4. For larger arrays, the initial interval should be larger too. The interval is then repeatedly reduced until it becomes 1.
For example, an array of 1,000 data items might be sorted with an increment of 364, then 121, then 40, 13, 4, and finally 1. The sequence of numbers used as intervals (here 364, 121, 40, 13, 4, 1) is called the interval sequence. The interval sequence shown here is due to Knuth and is widely used. Read in reverse, it starts from 1 and is generated by the recursive expression
h = 3*h + 1
with an initial value of 1. The first two columns of Table 7.1 show how this formula generates the sequence.
Table 7.1 Knuth interval sequence
There are other ways to generate interval sequences, which are discussed later. First, consider Shell sort using the Knuth sequence.
In the sorting algorithm, the initial interval is computed in a short loop using the sequence-generating formula. h is first set to 1, and the formula h = 3*h + 1 then generates the sequence 1, 4, 13, 40, 121, 364, and so on. The process stops when the interval is larger than the array. For an array of 1,000 data items, the seventh number in the sequence, 1093, is too large, so the sort begins with the sixth number, 364, as the largest interval. Then, each time the outer loop of the sort routine completes, the interval is reduced using the inverse of the formula given earlier:
h = (h-1)/3
This is shown in the third column of Table 7.1. The inverse formula generates the reverse sequence 364, 121, 40, 13, 4, 1: starting with 364, each number in turn is used as the increment for a sort. When the array has been sorted with an increment of 1, the algorithm is done.
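The two formulas can be tried out in isolation. The sketch below is a hypothetical helper (`KnuthGaps.gapsFor` is not part of the article's code) that lists the intervals for a given array size, largest first. It uses the loop test `h <= n / 3` from the Java listing in this article, which for 1,000 items gives the same starting interval of 364 as the rule described above.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: generate the Knuth interval sequence for an array of n items.
// The class and method names are hypothetical.
class KnuthGaps {

    // Returns the intervals to use, largest first, ending with 1.
    static List<Integer> gapsFor(int n) {
        int h = 1;
        while (h <= n / 3) {
            h = 3 * h + 1;            // 1, 4, 13, 40, 121, 364, ...
        }
        List<Integer> gaps = new ArrayList<>();
        while (h > 0) {
            gaps.add(h);
            h = (h - 1) / 3;          // inverse of h = 3*h + 1
        }
        return gaps;
    }

    public static void main(String[] args) {
        System.out.println(gapsFor(1000));   // [364, 121, 40, 13, 4, 1]
        System.out.println(gapsFor(10));     // [4, 1]
    }
}
```

Because (1 - 1) / 3 is 0, the second loop stops right after the interval of 1 has been used, which is the same test that ends the sort.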
Java code for Shell sort
package com.goaji.shellsort;

public class ArraySh {
    private long[] theArray;   // holds the data items
    private int nElems;        // number of data items

    public ArraySh(int max) {
        theArray = new long[max];
        nElems = 0;
    }

    public void insert(long value) {
        theArray[nElems] = value;
        nElems++;
    }

    public void display() {
        System.out.print("a=");
        for (int i = 0; i < nElems; i++) {
            System.out.print(theArray[i] + " ");
        }
        System.out.println("");
    }

    public void shellSort() {
        int inner, outer;
        long temp;
        int h = 1;
        while (h <= nElems / 3) {
            h = h * 3 + 1;                 // initial interval: 1, 4, 13, 40, ...
        }
        while (h > 0) {                    // reduce h until it reaches 1
            for (outer = h; outer < nElems; outer++) {   // h-increment sort
                temp = theArray[outer];
                inner = outer;
                while (inner > h - 1 && theArray[inner - h] >= temp) {
                    theArray[inner] = theArray[inner - h];
                    inner -= h;
                }
                theArray[inner] = temp;
            }
            h = (h - 1) / 3;               // next smaller interval
        }
    }

    public static void main(String[] args) {
        int maxSize = 10;
        ArraySh arr = new ArraySh(maxSize);
        for (int i = 0; i < maxSize; i++) {
            long n = (int) (Math.random() * 99);   // random value from 0 to 98
            arr.insert(n);
        }
        arr.display();
        arr.shellSort();
        arr.display();
    }
}

Output: a=26 9 14 14 26 37 50 50 53 83 87
maxSize can be set to a larger value, but not too large: sorting 10,000 data items takes approximately 1 minute.
Although Shell sort takes only a few lines of code to implement, tracing the algorithm is not straightforward.
Other interval sequences
Choosing an interval sequence is something of a black art. Only the sequence generated by the formula h = h*3 + 1 has been discussed here, but other interval sequences have been used with varying degrees of success. There is only one absolute requirement: the diminishing sequence must end with an interval of 1, so that the last pass is an ordinary insertion sort.
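As one example of a different sequence, the sketch below starts at half the array length and halves the interval on each pass, which is the scheme commonly attributed to Shell's original version of the algorithm. It is shown only to illustrate how an interval sequence can be swapped in, not as a recommendation; the class name `HalvingShellSort` is hypothetical.

```java
import java.util.Arrays;

// Sketch: Shell sort with the halving sequence N/2, N/4, ..., 1.
// The class name is hypothetical; the sequence is an alternative to h = 3*h + 1.
class HalvingShellSort {

    static void sort(long[] a) {
        // Integer halving always passes through 1 before reaching 0,
        // so the last pass is an ordinary insertion sort.
        for (int h = a.length / 2; h > 0; h /= 2) {
            for (int outer = h; outer < a.length; outer++) {
                long temp = a[outer];
                int inner = outer;
                while (inner >= h && a[inner - h] > temp) {
                    a[inner] = a[inner - h];
                    inner -= h;
                }
                a[inner] = temp;
            }
        }
    }

    public static void main(String[] args) {
        long[] a = {7, 10, 1, 9, 2, 5, 8, 6, 4, 3};   // made-up sample data
        sort(a);                                       // intervals used: 5, 2, 1
        System.out.println(Arrays.toString(a));
    }
}
```

Only the loop header that produces h differs from the Knuth version; the h-increment pass inside it is unchanged.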
The efficiency of Shell sort
So far, no one has been able to analyze the efficiency of Shell sort theoretically, except in some special cases. Various experiment-based estimates put its running time between O(N^(3/2)) and O(N^(7/6)).
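Since the estimates are experimental, a simple experiment is easy to reproduce. The sketch below counts item shifts for insertion sort and for Shell sort with the Knuth sequence on the same pseudo-random arrays. The class name, the method names, the array sizes, and the fixed seed are all assumptions made for this illustration, and shift counts are only a rough proxy for running time.

```java
import java.util.Random;

// Sketch: count shifts for insertion sort vs. Shell sort (Knuth intervals).
// All names, sizes, and the seed are assumptions made for this illustration.
class ShellVsInsertion {

    // One h-increment pass; returns the number of item shifts.
    static long hSortShifts(long[] a, int h) {
        long shifts = 0;
        for (int outer = h; outer < a.length; outer++) {
            long temp = a[outer];
            int inner = outer;
            while (inner > h - 1 && a[inner - h] > temp) {
                a[inner] = a[inner - h];
                inner -= h;
                shifts++;
            }
            a[inner] = temp;
        }
        return shifts;
    }

    // Full Shell sort with intervals ..., 40, 13, 4, 1; returns total shifts.
    static long shellSortShifts(long[] a) {
        int h = 1;
        while (h <= a.length / 3) {
            h = 3 * h + 1;
        }
        long shifts = 0;
        while (h > 0) {
            shifts += hSortShifts(a, h);
            h = (h - 1) / 3;
        }
        return shifts;
    }

    static long[] randomArray(int n, long seed) {
        Random rnd = new Random(seed);
        long[] a = new long[n];
        for (int i = 0; i < n; i++) {
            a[i] = rnd.nextInt(1_000_000);
        }
        return a;
    }

    public static void main(String[] args) {
        for (int n = 1000; n <= 8000; n *= 2) {
            long[] data = randomArray(n, 42L);
            long insertion = hSortShifts(data.clone(), 1);   // plain insertion sort
            long shell = shellSortShifts(data.clone());
            System.out.println("N=" + n + "  insertion=" + insertion + "  shell=" + shell);
        }
    }
}
```

Doubling N should roughly quadruple the insertion-sort count, while the Shell sort count should grow much more slowly; the exact numbers depend on the data.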
Java Data Structures and Algorithms (Chapter 7: Advanced Sorting, Part 1)