Unlike internal sorting algorithms, external sorting algorithms are meant for very large amounts of data. When the data set is too large to fit in memory, an internal sorting algorithm cannot be applied to it directly. Instead, the data is split into chunks (segments), each chunk is loaded into memory and sorted in turn, and the sorted chunks are then merged to produce the total order.
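The chunk-and-sort step can be sketched as follows. This is only an illustration that works on an in-memory array; the function name make_sorted_runs and the chunk parameter (standing in for "how many keys fit in memory") are assumptions, and a real external sort would read each chunk from disk and write the sorted run back to a temporary file.

```c
#include <assert.h>
#include <stdlib.h>

/* Comparison callback for qsort: ascending order of int keys. */
static int cmp_int(const void *a, const void *b) {
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

/*
 * Sorts data chunk by chunk, each chunk holding at most `chunk` keys
 * (a hypothetical stand-in for the amount of data that fits in memory).
 * Returns the number of sorted runs produced. The runs still have to be
 * merged afterwards to obtain the total order.
 */
size_t make_sorted_runs(int *data, size_t n, size_t chunk) {
    assert(chunk > 0);
    size_t runs = 0;
    for (size_t start = 0; start < n; start += chunk) {
        size_t len = (n - start < chunk) ? n - start : chunk;
        qsort(data + start, len, sizeof data[0], cmp_int);
        runs++;
    }
    return runs;
}
```

For example, calling make_sorted_runs on 7 keys with chunk set to 3 produces three runs of lengths 3, 3 and 1, each sorted on its own.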
Therefore, external sorting essentially relies on a single method: merge sort. The cost of an external merge is not only the time spent merging in memory but also the time spent reading and writing external storage, and accessing external storage is known to take at least ten times as long as accessing main memory. So, to make external sorting more efficient, you need to reduce the number of accesses to external storage, which can be done with a balanced N-way merge instead of a plain 2-way merge.
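To see why a wider merge helps, it is enough to count merge passes, since each pass reads and writes all of the data once. The small function below is a sketch under that assumption; the name merge_passes is made up for this illustration.

```c
#include <assert.h>

/*
 * Number of merge passes needed to combine `runs` sorted runs into one
 * when every pass merges up to k runs at a time. This is the ceiling of
 * log base k of runs, computed with integer arithmetic.
 */
int merge_passes(long runs, int k) {
    assert(runs >= 1 && k >= 2);
    int passes = 0;
    while (runs > 1) {
        runs = (runs + k - 1) / k;   /* runs left after one pass */
        passes++;
    }
    return passes;
}
```

With 100 initial runs, merge_passes(100, 2) gives 7 passes while merge_passes(100, 5) gives 3, so a 5-way merge moves the whole data set through external storage fewer than half as many times.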
A loser tree can be used to carry out this multi-way merge while keeping the total merging time down. The loser tree is the counterpart of the winner tree. In a winner tree, the two children of each node of a binary tree are compared, the winner is stored in the parent node, and it is then compared again at the next level up. In a loser tree the winner also moves up to be compared at the next level, but it is the loser that is stored in the parent node.
To build a loser tree, first create a one-dimensional array for the internal nodes, each of which stores the index of a leaf. The leaves can also be kept in a one-dimensional array; each leaf holds the current key read from one of the runs being merged. When a leaf is selected as the overall winner, the next value from the same run takes its place as that leaf. The process is shown in the following figure.
Here the array ls stores the loser tree: ls[0] holds the overall winner, that is, the leaf with the smallest key, and the array b holds the current keys of the 5-way merge. The process is as follows: b1 is compared with b2; b1 is larger, so its index 1 is stored in ls[3], and b2 moves up to meet the winner of the other subtree. b3 is compared with b4; b4 is larger, so 4 is stored in ls[4]. b3 is then compared with b0; b3 is larger, so 3 is stored in ls[2]. Finally b0 is compared with b2; b0 is larger, so 0 is stored in ls[1], and the winner is b2, whose index 2 is stored in ls[0].
The specific algorithm implementation is as follows:
Create the loser tree. Every entry of the array ls is initialized to k, where k is the number of runs being merged (5 here). b[5] is set to -1, a value smaller than any real key, so that it wins the comparison against any other leaf.
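void Adjust(int ls[], int s);   /* defined below */

void CreateLoserTree(void) {
    int i;
    for (i = 0; i < k; i++) {
        ls[i] = k;              /* every node starts out pointing at the sentinel b[k] */
    }
    for (i = k - 1; i >= 0; i--) {
        Adjust(ls, i);          /* insert the leaves one by one, last to first */
    }
}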
Adjust the tree: the loser of each comparison is recorded in the parent node, and the winner moves up one level to be compared again.
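void Adjust(int ls[], int s) {
    int tmp = (s + k) / 2;          /* parent node of leaf s */
    while (tmp > 0) {
        if (b[s] > b[ls[tmp]]) {    /* s loses: it stays here, the old loser moves up */
            int tt = s;
            s = ls[tmp];
            ls[tmp] = tt;
        }
        tmp /= 2;                   /* go up to the next level */
    }
    ls[0] = s;                      /* s is now the overall winner */
}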
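The 5-way walkthrough above can be checked with a small self-contained variant of the two functions. This is a sketch, not the code from the repository: it passes b and k as parameters instead of using globals, the names adjust and build_loser_tree are chosen for this example, and the keys in the usage note are hypothetical values picked only so that the comparisons come out as described (the figure's actual keys are not given in the text).

```c
#include <assert.h>

/* Replays the path from leaf s up to the root, parking each loser on the way. */
static void adjust(const int b[], int ls[], int k, int s) {
    for (int t = (s + k) / 2; t > 0; t /= 2) {
        if (b[s] > b[ls[t]]) {      /* s loses: store it, let the old loser go up */
            int tmp = s;
            s = ls[t];
            ls[t] = tmp;
        }
    }
    ls[0] = s;                      /* index of the overall winner */
}

/*
 * Builds the loser tree for k leaves. b must have k + 1 entries, where
 * b[k] is a sentinel smaller than every real key (-1 in the text).
 */
void build_loser_tree(const int b[], int ls[], int k) {
    assert(k >= 1);
    for (int i = 0; i < k; i++) {
        assert(b[i] > b[k]);        /* the sentinel must beat every real key */
        ls[i] = k;
    }
    for (int i = k - 1; i >= 0; i--) {
        adjust(b, ls, k, i);
    }
}
```

With the hypothetical keys b = {10, 9, 6, 20, 25, -1} and k = 5, build_loser_tree leaves ls = {2, 0, 3, 1, 4}: the winner b2 is in ls[0], and the losers 0, 3, 1 and 4 sit in ls[1] to ls[4], matching the walkthrough.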
For the complete code, please refer to my repository: https://github.com/clarkzhang56/the-method-of-sort/blob/master/externalsorting/externalsort.c
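The winner is output, its leaf is refilled with the next value from the same run, and the tree is adjusted again; this loop repeats until all the data is sorted.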
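The whole loop might look like the sketch below, which merges sorted runs held in memory rather than in files. Several details are assumptions made for this illustration and are not taken from the linked code: the function name kway_merge, the bound MAX_K, the use of INT_MIN as the "always wins" sentinel in place of -1, and the use of INT_MAX to mark a run that has been used up.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define MAX_K 16                    /* hypothetical upper bound on the merge order */

/* Replays the path from leaf s up to the root, parking each loser on the way. */
static void adjust(const int b[], int ls[], int k, int s) {
    for (int t = (s + k) / 2; t > 0; t /= 2) {
        if (b[s] > b[ls[t]]) {
            int tmp = s;
            s = ls[t];
            ls[t] = tmp;
        }
    }
    ls[0] = s;
}

/*
 * Merges k sorted runs into out and returns the number of keys written.
 * Keys are assumed to lie strictly between INT_MIN and INT_MAX, because
 * both extremes are used as sentinels here.
 */
size_t kway_merge(const int *const runs[], const size_t lens[], int k, int out[]) {
    assert(k >= 1 && k <= MAX_K);
    int b[MAX_K + 1];               /* current key of each run, plus b[k] */
    int ls[MAX_K];                  /* the loser tree */
    size_t pos[MAX_K];              /* next unread position in each run */
    size_t n = 0;

    for (int i = 0; i < k; i++) {
        pos[i] = 0;
        b[i] = (lens[i] > 0) ? runs[i][pos[i]++] : INT_MAX;
        ls[i] = k;
    }
    b[k] = INT_MIN;                 /* beats every real key while building */
    for (int i = k - 1; i >= 0; i--) {
        adjust(b, ls, k, i);
    }

    while (b[ls[0]] != INT_MAX) {   /* stop once every run is exhausted */
        int w = ls[0];
        out[n++] = b[w];            /* output the current winner */
        b[w] = (pos[w] < lens[w]) ? runs[w][pos[w]++] : INT_MAX;
        adjust(b, ls, k, w);        /* only the winner's path is replayed */
    }
    return n;
}
```

Merging the three runs {1, 4, 9}, {2, 3, 10, 11} and {5} this way yields 1, 2, 3, 4, 5, 9, 10, 11. Each output key costs one walk from a leaf to the root, which is what keeps the in-memory comparison work low even when k is large.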