In-Place Merge Sort

Source: Internet
Author: User

I was defeated by this problem during my internship with Microsoft last time. I couldn't figure it out.

http://blog.ibread.net/345/in-place-merge-sort/

When talking about merge sort, we naturally think of divide and conquer, O(n lg n) time complexity, and O(n) extra space. The O(n) extra space seems to be the most obvious disadvantage of merge sort, but in fact it is entirely surmountable. That is, we can implement a merge sort that runs in O(n lg n) time and uses only O(1) space. Algorithms that need no extra space (more precisely, only a constant amount of extra space) are commonly called in-place algorithms, so we call this algorithm in-place merge sort.

Before entering the specific details, let's look at a conclusion that will be used later:

Given two sorted arrays X and Y of lengths n and m respectively, and an auxiliary array A of length n, there is an O(m + n) time algorithm, auxsort, that merges X and Y in place while leaving the set of elements in A unchanged (their order may be scrambled).

Because an auxiliary array A is used, the space complexity of auxsort by itself is O(n). So let me state up front: auxsort really does use an auxiliary array A of length n. In the final algorithm, however, auxsort serves as a subroutine, and we use part of the array being sorted as the auxiliary array A rather than allocating new space. The final space complexity is therefore still O(1). So please read on...

The specific steps of the auxsort algorithm are as follows:

  1. Exchange X and A, so that A now holds the sorted elements of X.
  2. Maintain two integer subscripts, px and py, both initialized to 0, which refer to A[px] and Y[py] respectively. Also maintain a pointer dest that initially points to X[0]; after passing X[n-1] it continues at Y[0].
  3. While neither A nor Y is exhausted, compare A[px] and Y[py]:
    If A[px] <= Y[py], swap(A[px++], *dest++);
    If A[px] > Y[py], swap(Y[py++], *dest++);
  4. If elements remain in A or Y, swap them with *dest one by one, incrementing dest after each swap.
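
The four steps above can be sketched in Python as follows. This is a minimal sketch rather than the original author's code: it assumes X and Y are adjacent runs inside one list `arr`, and the parameters `x`, `n`, `m`, `a` (start of X, the two lengths, start of A) are names chosen here for illustration.

```python
def auxsort(arr, x, n, m, a):
    """Merge the sorted runs X = arr[x:x+n] and Y = arr[x+n:x+n+m] in place.

    arr[a:a+n] is the auxiliary area A and must not overlap X or Y.
    A keeps its elements, though possibly in a different order.
    """
    y = x + n
    # Step 1: exchange X and A; A now holds the sorted X run.
    for i in range(n):
        arr[x + i], arr[a + i] = arr[a + i], arr[x + i]
    # Step 2: read positions in A and Y, write position dest.
    px, py, dest = 0, 0, x
    # Step 3: swap the smaller head into dest.
    while px < n and py < m:
        if arr[a + px] <= arr[y + py]:
            arr[a + px], arr[dest] = arr[dest], arr[a + px]
            px += 1
        else:
            arr[y + py], arr[dest] = arr[dest], arr[y + py]
            py += 1
        dest += 1
    # Step 4: swap back whatever is left in A. Leftovers of Y need no
    # work: once A is exhausted, dest has caught up with Y[py].
    while px < n:
        arr[a + px], arr[dest] = arr[dest], arr[a + px]
        px += 1
        dest += 1


data = [1, 3, 5, 2, 4, 6, 9, 8, 7]   # X = 1 3 5, Y = 2 4 6, A = 9 8 7
auxsort(data, 0, 3, 3, 6)
print(data[:6], sorted(data[6:]))    # [1, 2, 3, 4, 5, 6] [7, 8, 9]
```

The write position never overtakes the unread part of Y, because dest has advanced exactly px + py places while at most n of them were taken from A.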

From the steps above it is easy to see that the basic idea of this algorithm is exactly the same as that of the traditional two-way merge, and its time complexity is likewise linear, O(m + n). In the traditional merge, however, we allocate an additional destination array to hold the merged data. Here the destination is X + Y itself, which is why we must first exchange X and A. In addition, the traditional merge simply assigns the data in X or Y to the newly allocated destination array Z, whereas here the data in the destination must not be destroyed, so only swap operations can be used.

For better understanding, let's look at an example. Suppose n = 3 and X1 < Y1 < X2 < Y2 < X3 < Y3; the execution of auxsort is then shown in Figure 1.

Figure 1
In-place merging with an auxiliary array
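
A concrete run can also be traced step by step. The sketch below is illustrative only: the function name `auxsort_trace`, the layout X | Y | A in a single list, and the sample values (10, 30, 50 for X, 20, 40, 60 for Y, negative numbers as placeholders in A) are assumptions made here. It records the array after the initial exchange and after every swap.

```python
def auxsort_trace(x, y, a):
    """Run auxsort on the layout X | Y | A and return every intermediate state."""
    n, m = len(x), len(y)
    arr = x + y + a
    y0, a0 = n, n + m                  # start offsets of Y and A
    for i in range(n):                 # step 1: exchange X and A
        arr[i], arr[a0 + i] = arr[a0 + i], arr[i]
    snapshots = [arr[:]]
    px = py = dest = 0
    while px < n:                      # steps 3 and 4 folded into one loop
        if py < m and arr[y0 + py] < arr[a0 + px]:
            src = y0 + py              # head of Y is strictly smaller
            py += 1
        else:
            src = a0 + px              # head of A is smaller or equal
            px += 1
        arr[src], arr[dest] = arr[dest], arr[src]
        dest += 1
        snapshots.append(arr[:])
    return snapshots


for state in auxsort_trace([10, 30, 50], [20, 40, 60], [-1, -2, -3]):
    print(state)
# [-1, -2, -3, 20, 40, 60, 10, 30, 50]
# [10, -2, -3, 20, 40, 60, -1, 30, 50]
# [10, 20, -3, -2, 40, 60, -1, 30, 50]
# [10, 20, 30, -2, 40, 60, -1, -3, 50]
# [10, 20, 30, 40, -2, 60, -1, -3, 50]
# [10, 20, 30, 40, 50, 60, -1, -3, -2]
```

The placeholders travel through the X and Y regions and all end up back in A, in a different order than they started.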

The focus of this article is the in-place merge sort algorithm itself. For details, refer to the exercise in TAOCP vol. 3, 5.2.5. According to Knuth, this algorithm comes from Doklady Akad. Nauk SSSR 186 (1969). These guys are really amazing... Enough chatter, on to the topic.

Suppose we have two arrays X and Y, each already sorted, with total length N, laid out as in Figure 2. The algorithm introduced below merges them in place in O(N) time using O(1) extra space. Overall, the algorithm has three parts: partitioning, pairwise merging, and scanning.

Figure 2
Original Array

Partitioning: Treat X and Y as one array and divide it into m + 2 blocks of length n = floor(sqrt(N)): Z1, Z2, ..., Zm+1, Zm+2, as shown in Figure 3. Every block except Zm+2 has exactly n elements; Zm+2 has the remaining N mod n elements.

Figure 3
Partitioning into blocks

In addition, as shown in Figure 3, suppose X[-1] (the last element of X) lies in block Zk; we then swap Zk with Zm+1. After the swap, the elements within each of Z1, Z2, ..., Zm are sorted, and each of these blocks has n elements. Zm+1 and Zm+2 together are called A, whose size s lies in [n, 2n). Both steps can be completed in O(N) time.
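
The partitioning step can be sketched as below. This is a sketch under stated assumptions: the function name `split_into_blocks` is hypothetical, blocks are numbered from 0 in the code (so Zm+1 is block m), the array is non-empty with a non-empty X, and the block swap is skipped when the last element of X already lies inside A, a case the text does not spell out.

```python
from math import isqrt


def split_into_blocks(arr, mid):
    """arr[:mid] is the sorted run X, arr[mid:] the sorted run Y.

    Chooses the block length n, moves the block holding the last element
    of X into the area A, and returns (n, m, s): block length, number of
    ordinary blocks Z1..Zm, and the size of A.
    """
    total = len(arr)
    n = isqrt(total)              # block length, floor(sqrt(N))
    m = total // n - 1            # number of ordinary blocks
    a = m * n                     # A = Zm+1 + Zm+2 starts here
    s = total - a                 # n <= s < 2n
    k = (mid - 1) // n            # 0-based block holding X[-1]
    if k < m:                     # assumption: nothing to do if it is already in A
        for t in range(n):
            arr[k * n + t], arr[a + t] = arr[a + t], arr[k * n + t]
    return n, m, s


data = [1, 4, 7, 10, 13, 16, 2, 3, 5, 6, 8, 9, 11, 12, 14, 15]
print(split_into_blocks(data, 6))   # (4, 3, 4)
print(data)   # [1, 4, 7, 10, 11, 12, 14, 15, 5, 6, 8, 9, 13, 16, 2, 3]
```

In the example the mixed block 13 16 2 3 ends up in A, and each of the three remaining blocks is sorted on its own.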

Pairwise merging: First rearrange the blocks Z1, Z2, ..., Zm by their first elements so that Z1[0] <= Z2[0] <= ... <= Zm[0]; if two first elements are equal, compare the last elements instead. Because no extra space may be used, we use selection sort: each round selects, among the remaining blocks, the one with the smallest first element and swaps it with the current block. In the worst case there are m rounds, each requiring O(m) comparisons and O(n) swaps, so the overall time complexity is O(m(m + n)) = O(N).
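
The block rearrangement described above can be sketched like this. The name `sort_blocks` is hypothetical, and the sketch assumes the m blocks of length n sit at the front of the list, numbered from 0.

```python
def sort_blocks(arr, n, m):
    """Selection-sort the m leading blocks of length n by (first, last) element."""

    def key(b):
        # First element decides; the last element breaks ties.
        return (arr[b * n], arr[b * n + n - 1])

    for i in range(m - 1):
        j = min(range(i, m), key=key)        # O(m) comparisons per round
        if j != i:
            for t in range(n):               # O(n) swaps per round
                arr[i * n + t], arr[j * n + t] = arr[j * n + t], arr[i * n + t]


data = [1, 4, 7, 10, 11, 12, 14, 15, 5, 6, 8, 9, 13, 16, 2, 3]
sort_blocks(data, 4, 3)
print(data)   # [1, 4, 7, 10, 5, 6, 8, 9, 11, 12, 14, 15, 13, 16, 2, 3]
```

Selection sort is a natural fit here because it performs at most one block swap per round, and a block swap is the expensive operation.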

After this rearrangement we can merge the blocks pairwise, starting with Z1 and Z2. Since the length of A is s >= n, we can use the auxsort method described above with A as the auxiliary array. Its only drawback is that the order of the elements in A is scrambled, but that does not matter because the elements of A are unordered anyway. Concretely, we run m-1 rounds of auxsort, where round i merges Zi and Zi+1. The induction below shows that after these m-1 rounds, all elements in Z1 ~ Zm are sorted.
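
The rounds of pairwise merging can be sketched as follows, assuming the blocks have already been rearranged by their first elements. The function names and the layout (m blocks of length n at the front, A directly behind them) are assumptions made for this illustration.

```python
def auxsort(arr, x, n, m, a):
    """Merge arr[x:x+n] and arr[x+n:x+n+m] in place, using arr[a:a+n] as buffer."""
    for i in range(n):                       # exchange X and A
        arr[x + i], arr[a + i] = arr[a + i], arr[x + i]
    px, py, dest, y = 0, 0, x, x + n
    while px < n:                            # Y leftovers are already in place
        if py < m and arr[y + py] < arr[a + px]:
            src = y + py
            py += 1
        else:
            src = a + px
            px += 1
        arr[src], arr[dest] = arr[dest], arr[src]
        dest += 1


def merge_adjacent_blocks(arr, n, m):
    """Round i merges block i with block i+1; m-1 rounds in total."""
    a = m * n                                # A starts right after the m blocks
    for i in range(m - 1):
        auxsort(arr, i * n, n, n, a)


data = [1, 4, 7, 10, 5, 6, 8, 9, 11, 12, 14, 15, 13, 16, 2, 3]
merge_adjacent_blocks(data, 4, 3)
print(data[:12])   # [1, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 15]
```

Each round leaves the smaller half of its two blocks in final position and carries the larger half forward into the next round.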

Before that, some notation: let Ri denote the i-th block of the final sorted result, and let Z'i denote Zi after auxsort has processed it. For convenience of description, among Z1 ~ Zm we call the first block that came from X X1, the first block that came from Y Y1, and so on.

We will prove that after round i, Ri = Z'i.

  1. i = 1. If Z1 is X1 (the case Z1 = Y1 is symmetric), then Z1, Z2, Z3 may be {X1, X2, X3}, {X1, X2, Y1}, {X1, Y1, X2}, or {X1, Y1, Y2}.

    1. {X1, X2, X3}. Then X1[n-1] <= X2[0] = Z2[0] <= Zi[0] <= Zi[k] (i >= 2), that is, X1 contains the smallest n elements, so R1 = X1 = Z'1. In addition, Z'2[0] = Z2[0] = X2[0] <= X3[0] = Z3[0].
    2. {X1, X2, Y1}. As above, X1[n-1] <= Z2[0] <= Zi[k] (i >= 2), so R1 = X1 = Z'1. In addition, Z'2[0] = Z2[0] <= Z3[0].
    3. {X1, Y1, X2}. Clearly the smallest n elements must lie in X1 or Y1, so R1 = Z'1. In addition, Z'2[0] <= max(X1[n-1], Y1[0]) <= X2[0] = Z3[0].
    4. {X1, Y1, Y2}. Likewise, the smallest n elements must lie in X1 or Y1, so R1 = Z'1. In addition, since X1[0] <= Y1[0], at least n+1 elements of X1 and Y1 are no larger than Y1[n-1], so Z'2[0] <= Y1[n-1] <= Y2[0] = Z3[0].

    Therefore, after the merge, the property Z'2[0] <= Z3[0] still holds. In the subsequent analysis, if Z'2[n-1] = X1[n-1], Z'2 can be regarded as X1; similarly, if Z'2[n-1] = Y1[n-1], it can be regarded as Y1. This is a very important inference: it is what lets the four-case analysis above carry over to the later rounds.

  2. Assume the conclusion holds for i = k, that is, after k rounds of auxsort, R1 ~ Rk are in place. By the inference in the previous step, Z'k+1 can be regarded as some Xi or Yj, so the four-case discussion of the previous step still applies. This completes the inductive step.
  3. By induction, the conclusion holds for every round.

Each round of merging takes O(n) time, so the overall time complexity of step 2 is O(mn) = O(N).
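
For reference, the cost of step 2 can be tallied as follows. This is a sketch of the arithmetic, assuming n = floor(sqrt(N)); the names T_reorder and T_merge are introduced here only for the block rearrangement and the merge rounds.

```latex
\begin{aligned}
n &= \lfloor \sqrt{N} \rfloor, \qquad m = \lfloor N/n \rfloor - 1 \le N/n = O(\sqrt{N}), \qquad mn \le N, \\
T_{\text{reorder}} &= m \cdot \bigl(O(m) + O(n)\bigr) = O(m^2 + mn) = O(N), \\
T_{\text{merge}} &= (m-1) \cdot O(n) = O(mn) = O(N).
\end{aligned}
```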

Scanning: After step 2, R[0 ... mn-1] is sorted. Since A has s elements in total, the largest s elements of the original array must all lie in A and R[mn-s ... mn-1]. We can therefore sort A together with R[mn-s ... mn-1] by selection sort in O(s^2) = O(N) time, which moves the largest s elements into A. In other words, R[0 ... mn-1] now stores the smallest N-s elements of the original array.

Using auxsort with A as the auxiliary array, we now merge R[0 ... mn-s-1] (length N-2s) with R[mn-s ... mn-1] (length s). Because A holds only s elements, the length-s run is the one exchanged into A, and the merge proceeds from the right end, with the roles of left and right (and of smaller and larger) reversed. After this, the smallest N-s elements of the original array are sorted, and the time complexity is O(N). Since this merge scrambles the order of the elements in A, we finally re-sort A by selection sort in O(s^2) = O(N) time.
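
The scanning step can be sketched as below. This is a minimal sketch under assumptions: the names `selection_sort`, `auxsort_mirrored` and `scan_step` are hypothetical, the sorted region R occupies `arr[:r]` with A directly behind it, and r >= s. Because A has only s slots, the merge is run from the right end with the shorter right-hand run parked in A.

```python
def selection_sort(arr, lo, hi):
    """Quadratic, O(1)-space sort of arr[lo:hi]."""
    for i in range(lo, hi - 1):
        j = min(range(i, hi), key=arr.__getitem__)
        arr[i], arr[j] = arr[j], arr[i]


def auxsort_mirrored(arr, n, m, a):
    """Merge arr[0:n] with arr[n:n+m] from the right, buffering the right run in arr[a:a+m]."""
    for i in range(m):                       # exchange the right run with A
        arr[n + i], arr[a + i] = arr[a + i], arr[n + i]
    px, py, dest = n - 1, m - 1, n + m - 1
    while py >= 0:                           # left-run leftovers are already in place
        if px >= 0 and arr[px] > arr[a + py]:
            src = px
            px -= 1
        else:
            src = a + py
            py -= 1
        arr[src], arr[dest] = arr[dest], arr[src]
        dest -= 1


def scan_step(arr, r):
    """arr[:r] is sorted, arr[r:] is the scrambled area A of size s <= r."""
    s = len(arr) - r
    selection_sort(arr, r - s, len(arr))     # largest s elements move into A
    auxsort_mirrored(arr, r - s, s, r)       # merge the two sorted runs of R
    selection_sort(arr, r, len(arr))         # restore order inside A


data = [1, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 15, 16, 2, 13, 3]
scan_step(data, 12)
print(data == list(range(1, 17)))   # True
```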


After these three steps (partitioning, pairwise merging, and scanning), we have completed one in-place merge of X and Y, with O(N) time complexity and O(1) space complexity. Building on this merge, a complete merge sort can be carried out in O(N log N) time, and the space complexity is still O(1).
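
Putting the three steps together, a complete sort might look like the sketch below. It is an illustration under stated assumptions, not the original author's or Knuth's code: all function names are hypothetical, runs shorter than 16 elements are simply selection-sorted (a constant-size shortcut), and the driver is written bottom-up so that no recursion stack is needed.

```python
from math import isqrt


def swap_blocks(arr, i, j, length):
    for t in range(length):
        arr[i + t], arr[j + t] = arr[j + t], arr[i + t]


def selection_sort(arr, lo, hi):
    for i in range(lo, hi - 1):
        j = min(range(i, hi), key=arr.__getitem__)
        arr[i], arr[j] = arr[j], arr[i]


def auxsort(arr, x, n, m, a):
    """Merge arr[x:x+n] with the run after it; left run buffered in arr[a:a+n]."""
    swap_blocks(arr, x, a, n)
    px, py, dest, y = 0, 0, x, x + n
    while px < n:
        if py < m and arr[y + py] < arr[a + px]:
            src, py = y + py, py + 1
        else:
            src, px = a + px, px + 1
        arr[src], arr[dest] = arr[dest], arr[src]
        dest += 1


def auxsort_mirrored(arr, x, n, m, a):
    """Same merge run right to left; right run buffered in arr[a:a+m]."""
    swap_blocks(arr, x + n, a, m)
    px, py, dest = n - 1, m - 1, x + n + m - 1
    while py >= 0:
        if px >= 0 and arr[x + px] > arr[a + py]:
            src, px = x + px, px - 1
        else:
            src, py = a + py, py - 1
        arr[src], arr[dest] = arr[dest], arr[src]
        dest -= 1


def inplace_merge(arr, lo, mid, hi):
    """Merge sorted runs arr[lo:mid] and arr[mid:hi] with O(1) extra space."""
    total = hi - lo
    if mid == lo or mid == hi:
        return
    if total < 16:                       # assumption: tiny inputs are just sorted
        selection_sort(arr, lo, hi)
        return
    n = isqrt(total)
    m = total // n - 1
    a = lo + m * n                       # start of the area A
    s = hi - a
    # Partitioning: move the block holding the last element of X into A.
    k = (mid - 1 - lo) // n
    if k < m:
        swap_blocks(arr, lo + k * n, a, n)
    # Pairwise merging: order blocks by (first, last), then merge neighbours.
    for i in range(m - 1):
        j = min(range(i, m),
                key=lambda b: (arr[lo + b * n], arr[lo + b * n + n - 1]))
        if j != i:
            swap_blocks(arr, lo + i * n, lo + j * n, n)
    for i in range(m - 1):
        auxsort(arr, lo + i * n, n, n, a)
    # Scanning: largest s elements into A, merge the rest, re-sort A.
    selection_sort(arr, a - s, hi)
    auxsort_mirrored(arr, lo, a - s - lo, s, a)
    selection_sort(arr, a, hi)


def merge_sort_inplace(arr):
    width = 1
    while width < len(arr):
        for lo in range(0, len(arr), 2 * width):
            mid = min(lo + width, len(arr))
            hi = min(lo + 2 * width, len(arr))
            inplace_merge(arr, lo, mid, hi)
        width *= 2


data = [(i * 7919 + 13) % 101 for i in range(200)]
merge_sort_inplace(data)
print(data == sorted(data))   # True
```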

Note that because auxsort is based on swap operations and scrambles the order of the elements in A, this sorting algorithm is unstable.


Postscript:

I can't remember when I first heard of this concept. My most vivid recent memory of it is from an interview at STC last year. The interviewer asked what the disadvantages of merge sort are. I answered that there don't seem to be any particular ones: the commonly cited O(n) extra space can be avoided, because in-place merge sort exists. The interviewer, greatly interested, asked me to describe the general procedure. :( Unfortunately I didn't actually know the specific steps at the time, so I just rambled. Fortunately, the interviewer didn't press the point.

Later I went back and looked at inplace_merge in the STL, and found that it is not a pure in-place merge: it may use an additional buffer. Disappointed, I did not look into it further. Some time ago I discovered that this is an exercise in TAOCP, so I finally worked it out from the answer. Knuth's answers are always terse, and I did not understand the correctness of the pairwise merging step, so I had to derive it again myself; as a result this write-up is much more long-winded than the original. If you are short on time, going back to the original text may serve you much better. :D

In addition, stable in-place merge sort is also one of the TAOCP exercises (so my slow progress through TAOCP doesn't mean I'm stupid...). I'll write it up when I have time ~ hopefully it won't take too long. Haha.
