Basics
If two arrays are each sorted, they can be merged into a single, larger sorted array. Merge sort is based on this fact: to sort an array, split it into two subarrays, sort each separately, then merge the results so that the whole is ordered. The subarrays are themselves sorted by the same method, recursively.
Here is the sample code:
#include "timsort.h" #include <stdlib.h> #include <string.h>//L1 two lengths, L2 sorted array P1, P2 merged into one//sorted target array.
void merge (int target[], int p1[], int L1, int p2[], int L2);
void Integer_timsort (int array[], int size) {if (size <= 1) return;
int partition = SIZE/2;
Integer_timsort (array, partition);
Integer_timsort (array + partition, size-partition);
Merge (array, array, partition, array + partition, size-partition);
} void Merge (int target[], int p1[], int L1, int p2[], int l2) {int *merge_to = malloc (sizeof (int) * (L1 + L2));
Current scan two array position int i1, i2;
I1 = i2 = 0;
The position of the next element to be placed in the merge process int *next_merge_element = merge_to; Scans two arrays, writes smaller elements to//merge_to. When two numbers are equal we choose//to the left, because we want to guarantee the stability of the sort//Of course for integers it doesn't matter, but this idea is important while (I1 < L1 && I2 < L2) {if p1[
I1] <= P2[i2]) {*next_merge_element = P1[i1];
i1++;
else {*next_merge_element = P2[i2];
i2++;
} next_merge_element++; }
// If an array is not scanned, we directly copy the remainder of the memcpy (next_merge_element, p1 + i1, sizeof (int) * (L1-I1));
memcpy (next_merge_element, p2 + i2, sizeof (int) * (L2-I2));
Now we have merged them into our extra storage space//It is time to dump to target memcpy (target, merge_to, sizeof (int) * (L1 + L2));
Free (merge_to);
}
#include "timsort.h" #include <stdlib.h> #include <string.h>//L1 two lengths, L2 sorted array P1, P2 merged into one//sorted target array
。
void merge (int target[], int p1[], int L1, int p2[], int L2);
void Integer_timsort (int array[], int size) {if (size <= 1) return;
int partition = SIZE/2;
Integer_timsort (array, partition);
Integer_timsort (array + partition, size-partition);
Merge (array, array, partition, array + partition, size-partition);
} void Merge (int target[], int p1[], int L1, int p2[], int l2) {int *merge_to = malloc (sizeof (int) * (L1 + L2));
Current scan two array position int i1, i2;
I1 = i2 = 0;
The position of the next element to be placed in the merge process int *next_merge_element = merge_to; Scans two arrays, writes smaller elements to//merge_to. When two numbers are equal we choose//to the left, because we want to guarantee the stability of the sort//Of course for integers it doesn't matter, but this idea is important while (I1 < L1 && I2 < L2) {if p1[
I1] <= P2[i2]) {*next_merge_element = P1[i1];
i1++;
else {*next_merge_element = P2[i2];
i2++;
} next_merge_element++; }
If an array is not scanned, we directly copy the remainder of the memcpy (next_merge_element, p1 + i1, sizeof (int) * (L1-I1));
memcpy (next_merge_element, p2 + i2, sizeof (int) * (L2-I2));
Now we have merged them into our extra storage space//It is time to dump to target memcpy (target, merge_to, sizeof (int) * (L1 + L2));
Free (merge_to);
}
I won't always post the full code from here on.
Optimization
Now, if you're a C programmer, you've probably already spotted the problem: we allocate and free an extra block of storage on every single merge. (You might also be offended that I never check whether malloc returns NULL; please ignore that, if it makes you feel better.)
This can be fixed with a small change:
void merge(int target[], int p1[], int L1, int p2[], int L2, int storage[]);

void integer_timsort_with_storage(int array[], int size, int storage[]);

void integer_timsort(int array[], int size) {
    int *storage = malloc(sizeof(int) * size);
    integer_timsort_with_storage(array, size, storage);
    free(storage);
}
Now the top-level sort function just does the setup work of allocating memory and passes it down into the calls. This is the template we're going to start optimizing from; of course, the final, genuinely usable version involves more than hoisting a single allocation.
Now that we have the basic merge sort, we need to think: how can we optimize it?
Generally speaking, we can't expect to do best in every situation: merge sort's performance is already close to the lower bound for comparison sorts. The key characteristic of Timsort is that it exploits regularities in the data. If the data has some structure to it, we should use that as much as possible; if it doesn't, our algorithm should guarantee that it is not much worse than an ordinary merge sort.
If you look back at the merge sort implementation, you'll see that all the real work happens in the merge step, so that is where the optimization effort belongs. This suggests three possible lines of attack:
1. Can you make the merge process run faster?
2. Is it possible to perform fewer merge processes?
3. Are there some situations where you might want to use a different sort instead of a merge sort?
The answer to all three questions is yes, and these are the most common ways of optimizing merge sort. For example, the recursive implementation makes it very easy to switch sorting algorithms based on the size of the array. Merge sort is a good general-purpose algorithm (with good asymptotic complexity), but for small arrays the constant factor matters more: below a certain size (usually somewhere around 7 or 8 elements), merge sort is often slower than insertion sort.
That isn't the key idea behind Timsort, but we will use insertion sort later, so let's take a quick detour through it.
Start with the simplest case: suppose we have a sorted array of n elements, with room for an (n+1)th element at the end, and we want to add a new element while keeping the array sorted. We need to find the right position for the new element and move everything larger than it one place back. One obvious approach is to put the new element in position n+1 and then swap it forward, one pair at a time, until it reaches the right place. (For large arrays this is not the best approach: you'd rather binary-search for the position and then shift the remaining elements back without comparing them. But for small arrays that doesn't pay off, due to cache effects.)
This is how insertion sort works: once the first k elements are sorted, insert the (k+1)th element among them and the first k+1 elements are sorted. Repeat until the whole array is ordered.
Here's the code:
void insertion_sort(int xs[], int length) {
    if (length <= 1) return;
    int i;
    for (i = 1; i < length; i++) {
        // The array before i is already sorted; now insert xs[i] into it.
        int x = xs[i];
        int j = i - 1;
        // Move j backwards until we hit the start of the array or
        // something <= x; everything to its right has already been
        // shifted up by one.
        while (j >= 0 && xs[j] > x) {
            xs[j + 1] = xs[j];
            j--;
        }
        xs[j + 1] = x;
    }
}
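As an aside, the binary-search variant mentioned above might look like the following sketch (my illustration, not code from the original):

// Insert xs[i] by binary-searching for its position, then shifting the
// tail with memmove instead of pairwise swaps. Searching for the first
// position *after* any equal elements keeps the sort stable.
// (Needs <string.h> for memmove, already included above.)
void binary_insertion_sort(int xs[], int length) {
    int i;
    for (i = 1; i < length; i++) {
        int x = xs[i];
        int lo = 0, hi = i;
        while (lo < hi) {
            int mid = lo + (hi - lo) / 2;
            if (xs[mid] <= x) lo = mid + 1;
            else hi = mid;
        }
        memmove(xs + lo + 1, xs + lo, sizeof(int) * (i - lo));
        xs[lo] = x;
    }
}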
The sorting code is now modified as follows:
void integer_timsort_with_storage(int array[], int size, int storage[]) {
    if (size <= insertion_sort_size) {
        insertion_sort(array, size);
        return;
    }
    // ... (the rest of the function is unchanged)
You can view this version here
OK, let's get back to the point: optimizing merge sort.
Can we perform fewer merges?
In the general case, no. But let's consider some common special cases.
Suppose the array is already sorted: how many merges do we need to perform?
In principle, none at all: the array is already in order, so no extra work is needed. One option, then, is to add an initial check that tests whether the array is sorted and exits immediately if it is.
But that adds extra work to the sorting algorithm. It pays off handsomely when the check succeeds (O(n log(n)) drops to O(n)), but when it fails, all of that comparison work is wasted. Let's look at how to implement the check in a way that makes good use of its result whether it succeeds or fails.
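For reference, the naive up-front check under discussion would be a single linear scan, something like this sketch (mine, not the article's):

// Returns 1 if xs is already in nondecreasing order. On failure, all
// of the comparison work done so far is simply thrown away, which is
// exactly what we want to avoid.
int is_sorted(int xs[], int length) {
    int i;
    for (i = 1; i < length; i++) {
        if (xs[i] < xs[i - 1]) return 0;
    }
    return 1;
}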
Let's say we've encountered the following array:
{5, 6, 7, 8, 9, 10, 1, 2, 3}
(For now, let's ignore the fact that we'd use a different sort for arrays below the cutoff.)
To get the best merging strategy, where should we split it?
Obviously there are two sorted runs here, 5 through 10 and 1 through 3, and choosing those two as our segments gives a very good result.
This suggests a first, naive approach:
Find the longest ascending run at the start of the array and use it as the first segment (partition), with the remainder as the second segment.
This behaves well when the data consists of a few sorted runs, but it has a very bad worst case. Consider a completely reversed array: the first segment at every level of recursion contains just one element, leaving the remaining n-1 elements to be sorted recursively as the second segment. The result is a decidedly unsatisfactory O(n^2).
We could patch this by hand, capping the shorter segment at half the total length, but that is also unsatisfying: our extra checking work would then buy us essentially nothing.
Still, the basic idea is now clear: use already-sorted subsequences as the units of partitioning.
The difficulty lies in the second segment: to avoid the bad worst case, we need to keep our segments as balanced as possible.
Let's take a step back and see if there's a way to fix this. Consider the following slightly strange, inside-out way of thinking about ordinary merge sort:
1. Cut the entire array into partitions of length 1.
2. While more than one partition remains, merge adjacent (even/odd alternating) pairs of partitions, replacing each original pair with its merged result.
For example, given the array {1, 2, 3, 4} we would do this:
{{1}, {2}, {3}, {4}}
{{1, 2}, {3, 4}}
{{1, 2, 3, 4}}
It's easy to see that this is the same as ordinary merge sort: we've just made the recursion explicit and replaced the call stack with extra storage. But this formulation makes it much more obvious how to exploit sorted subsequences that already exist: in the first step, instead of dividing the array into segments of length 1, divide it into the sorted runs it already contains, and then perform the merge steps on those runs exactly as before.
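To make this inside-out formulation concrete, here is a minimal sketch of it, assuming the six-argument merge() declared earlier and starting, as ordinary merge sort does, from segments of length 1 (my illustration, not the article's code):

void bottom_up_mergesort(int array[], int size, int storage[]) {
    // Widths 1, 2, 4, ... play the role of the partition lists above:
    // at each level, adjacent pairs of segments are merged in place.
    int width;
    for (width = 1; width < size; width *= 2) {
        int i;
        for (i = 0; i + width < size; i += 2 * width) {
            int second = size - (i + width);     // elements after the first segment
            if (second > width) second = width;  // clamp to the segment width
            merge(array + i, array + i, width,
                  array + i + width, second, storage);
        }
    }
}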
There is just one small problem with this approach: it uses extra space it doesn't need to. Ordinary merge sort uses O(log(n)) stack space; this version uses O(n) space to store the initial partitioning.
So why does our "equivalent" algorithm have such different space consumption?
The answer is that I lied about the "equivalence". The important difference is that ordinary merge sort is lazy about the partitioning: it only generates the segments the next merge actually requires, and discards them as soon as that merge is done.
In other words, we should generate segments on the fly, as the merges need them, instead of generating them all up front.
Now, let's see if we can convert this idea into an algorithm.
At each step we generate a new lowest-level segment (a single element in ordinary merge sort; a sorted run in the version described above), push it onto a stack of stored segments, and occasionally merge the two segments on top of the stack to keep the stack small. We keep repeating this until no new segments can be generated, then merge the whole stack together.
There is one thing the algorithm above leaves unspecified: we haven't said when to perform the merges at all.
So far there have been too many words and too little code, so I'll give a provisional answer: whenever you like. (Yes, that's a cop-out.)
Now, let's write some code.
// We use a fixed stack size, much larger than any reasonable stack
// height. Of course, we still need to check for overflow.
#define STACK_SIZE 1024

typedef struct {
    int *index;
    int length;
} run;

typedef struct {
    int *storage;
    // The stored segments (the original author calls each segment a "run").
    run runs[STACK_SIZE];
    // Top-of-stack pointer, pointing at the next slot to insert into.
    int stack_height;
    // Tracks how far we have partitioned, so we know where the next
    // segment starts. Elements at array indices < partitioned_up_to are
    // already partitioned and stored on the stack; elements at indices
    // >= partitioned_up_to are not yet on the stack. When
    // partitioned_up_to == array + length, every element is on the stack.
    int *partitioned_up_to;
    int *array;
    int length;
} sort_state_struct;

typedef sort_state_struct *sort_state;
We'll pass a sort_state pointer into every function that needs it.
The core logic of the sort is then as follows:
while (next_partition(&state)) {
    while (should_collapse(&state)) merge_collapse(&state);
}

while (state.stack_height > 1) merge_collapse(&state);
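Before this loop can run, the state has to be set up. The article doesn't show that step; it would presumably look something like this:

sort_state_struct state;
state.storage = storage;
state.array = array;
state.length = size;
state.stack_height = 0;
state.partitioned_up_to = array;  // nothing partitioned yet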
The next_partition function pushes a new segment onto the stack and returns 1 if any elements remain outside the stack; otherwise it returns 0. After each new segment, the stack is collapsed as appropriate; finally, once the whole array has been partitioned, the entire stack is collapsed.
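The article doesn't show next_partition at this point. A minimal version, reconstructed from the description above (run detection is refined later, and the stack overflow check is omitted for brevity), might be:

int next_partition(sort_state state) {
    int *end = state->array + state->length;
    if (state->partitioned_up_to >= end) return 0;
    int *start_index = state->partitioned_up_to;
    int *next_start_index = start_index + 1;
    // Take the longest nondecreasing run starting here.
    while (next_start_index < end &&
           *next_start_index >= *(next_start_index - 1)) {
        next_start_index++;
    }
    run run_to_add;
    run_to_add.index = start_index;
    run_to_add.length = next_start_index - start_index;
    state->runs[state->stack_height++] = run_to_add;
    state->partitioned_up_to = next_start_index;
    return 1;
}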
We now have our first adaptive merge sort: if the array contains many ordered subsequences, we get a nice shortcut. If not, the algorithm still runs in (hopefully) O(n log(n)) time.
That "hopefully" is a bit shaky: with merges happening at arbitrary times, we need a good policy for deciding when to merge.
Let's think about what a better constraint would look like. A natural approach is to maintain an invariant on the stack, and keep merging until the invariant is restored.
Further, we want the invariant to guarantee that the stack holds at most O(log(n)) runs.
Consider the following invariant: each run on the stack must be at least twice the length of the run above it. The run on top of the stack is then the smallest, and the one just below it is at least twice the length of the top run, and so on down.
This invariant does guarantee at most O(log(n)) runs on the stack, but it creates a tendency to collapse the entire stack at once. Consider what happens with the following run lengths on the stack:
Suppose the stack holds runs of lengths 64, 32, 16, 8, 4, 2, 1 and we push another run of length 1. The merges cascade:

{64, 32, 16, 8, 4, 2, 1, 1}
{64, 32, 16, 8, 4, 2, 2}
{64, 32, 16, 8, 4, 4}
{64, 32, 16, 8, 8}
{64, 32, 16, 16}
{64, 32, 32}
{64, 64}
{128}
This gets worse once the merge step itself is more heavily optimized (basically because a full collapse stomps on certain structure that might be present in the array). But our merge is still very simple, so we won't worry about that yet; first make it work.
One thing to note: we can now bound the size of our stack. If the run on top of the stack has length 1, the run below it must have length >= 2, the next >= 4, and so on; a stack of n runs therefore holds at least 2^n - 1 elements in total. On a 64-bit machine an array can contain at most 2^64 elements (which would be a pretty stunning array), so the stack never needs more than 65 runs. Leaving one extra slot for an incoming run, allocating space for 66 guarantees it can never overflow.
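Expressed as a constant (my naming; the snippet above still used the generous STACK_SIZE of 1024):

// 1 + 2 + 4 + ... across 65 runs already exceeds 2^64 elements,
// plus one slot for an incoming run before the stack is collapsed.
#define MAX_STACK_HEIGHT 66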
It's also worth noting that we only need to check the invariant between the top run and the one below it: the invariant held before each push, and merging only ever touches the two runs on top of the stack.
To maintain the invariant, we now modify the should_collapse function as follows:
int should_collapse(sort_state state) {
    if (state->stack_height <= 1) return 0;
    int h = state->stack_height - 1;
    int head_length = state->runs[h].length;
    int next_length = state->runs[h - 1].length;
    return 2 * head_length > next_length;
}
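merge_collapse isn't shown in this excerpt. Given the state struct above and the six-argument merge(), a sketch might look like this; the two runs on top of the stack are adjacent in the array, and state->storage serves as the scratch buffer:

void merge_collapse(sort_state state) {
    run *top  = &state->runs[state->stack_height - 1];
    run *next = &state->runs[state->stack_height - 2];
    // The runs are adjacent, with *next first, so the merged run starts
    // at next->index. merge() is assumed to copy into storage and back,
    // so the target may alias p1, as in the earlier top-level call.
    merge(next->index, next->index, next->length,
          top->index, top->length, state->storage);
    next->length += top->length;
    state->stack_height--;
}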
Now our adaptive merge sort is done. Hooray!
Let's look back at an example of the problem we ran into earlier.
Consider the following array in reverse order:
{5, 4, 3, 2, 1}
What happens when we use our adaptive merge sort?
The stack evolves as follows:

{5}
{5}, {4}
{4, 5}
{4, 5}, {3}
{4, 5}, {3}, {2}
{4, 5}, {2, 3}
{2, 3, 4, 5}
{2, 3, 4, 5}, {1}
{1, 2, 3, 4, 5}

That is a perfectly sensible merging strategy.
But there is an even better way to sort a reversed array: just reverse it in place.
It's easy to modify our algorithm to take advantage of this. We've been looking for ascending runs; when the run isn't ascending, we can just as easily look for a descending run instead, then reverse it in place to turn it into an ascending run before pushing it onto the stack.
Following this strategy, we modify the run-detection code as follows:
if (next_start_index < state->array + state->length) {
    if (*next_start_index < *start_index) {
        // We have a decreasing sequence starting here.
        while (next_start_index < state->array + state->length) {
            if (*next_start_index < *(next_start_index - 1)) next_start_index++;
            else break;
        }
        // Now reverse it in place.
        reverse(start_index, next_start_index - start_index);
    } else {
        // We have an increasing sequence starting here.
        while (next_start_index < state->array + state->length) {
            if (*next_start_index >= *(next_start_index - 1)) next_start_index++;
            else break;
        }
    }
}
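The reverse() helper used above isn't defined in the excerpt; a minimal in-place version would be:

void reverse(int xs[], int length) {
    int i = 0, j = length - 1;
    while (i < j) {
        int tmp = xs[i];
        xs[i] = xs[j];
        xs[j] = tmp;
        i++;
        j--;
    }
}

Note that the descending scan uses a strict <, so a run containing equal elements is never treated as descending and reversed; this is what preserves the stability of the sort.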
As with the purely reversed array, our sort now handles mixed cases well too. For example, consider the following array:
{1, 2, 3, 4, 5, 4, 3, 2, 1}
The sorting process is performed as follows:
{1, 2, 3, 4, 5}
{1, 2, 3, 4, 5}, {1, 2, 3, 4}
{1, 1, 2, 2, 3, 3, 4, 4, 5}
That is much better than what our previous implementation would have done!
Finally, there is one more small optimization to add.
Our earlier merge sort had a cutoff below which small arrays were handled by insertion sort, but our adaptive version currently has nothing like it. That means on arrays without much exploitable structure, we may perform worse than an ordinary merge sort.
To see how the cutoff fits in, think back to the inside-out view of merge sort: instead of starting from runs of length 1, we start from segments of length insertion_sort_size, using insertion sort to put each of them in order.
This suggests a natural improvement to our adaptive version: whenever we find a run shorter than some chosen length, use insertion sort to grow it to that length.
So we change the code at the end of the next_partition function as follows:
if (run_to_add.length < min_run_size) {
    boost_run_length(state, &run_to_add);
}
state->partitioned_up_to = start_index + run_to_add.length;

The boost_run_length function is as follows:

void boost_run_length(sort_state state, run *run) {
    // Need to make sure we don't overshoot the end of the array.
    int length = state->length - (run->index - state->array);
    if (length > min_run_size) length = min_run_size;
    insertion_sort(run->index, length);
    run->length = length;
}
This brings the algorithm's performance on random data up to a level quite competitive with an ordinary merge sort.
And with that, we finally have an adaptive merge sort, which can fairly be called the core of Timsort.