Find the median of two ordered arrays

Source: Internet
Author: User

This was my second LeetCode problem. The first one was very easy and I expected the same here, but once I started I found this one very hard; it felt like stepping on a mine the moment I reached the battlefield. Later I looked up a LeetCode difficulty table (LeetCode difficulty and interview frequency) and found that this problem is rated difficulty 5, so I really had underestimated it! There are many answers on the Internet, but few are both concise and correct. The only one I found really good is the method based on finding the k-th smallest value, so I am writing it up here.

First, a complaint about LeetCode's judge: it does not seem to enforce a time limit, and the small and large data sets differ very little. I started with the dumbest approach: copy both arrays into one array, sort it, and return the median:

    #include <algorithm>
    #include <cstring>

    class Solution {
    public:
        double findMedianSortedArrays(int A[], int m, int B[], int n) {
            // Copy both inputs into one buffer, sort it, then read the middle.
            int total = m + n;
            int *merged = new int[total];
            std::memcpy(merged, A, sizeof(int) * m);
            std::memcpy(merged + m, B, sizeof(int) * n);
            std::sort(merged, merged + total);
            double median = (total % 2)
                ? merged[total >> 1]
                : (merged[(total - 1) >> 1] + merged[total >> 1]) / 2.0;
            delete[] merged;  // array form of delete, matching new[]
            return median;
        }
    };

This solution actually passed the tests, even though the sort makes its complexity O((m+n) log(m+n)). That shows LeetCode only checks the correctness of the algorithm here and is not strict about running time.

Another approach is to find the median with a merge-like pass: keep one pointer at the head of A and one at the head of B, advance them as in the merge step of merge sort, and count elements until the median position is reached. The complexity is then O(m+n). After that I tried to extend the method from Introduction to Algorithms (exercise 9.3-8), but that method runs into endless boundary details, and the extension is not necessarily correct, as the comments on various web pages show. I strongly advise against going down that path.
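The merge-like pass described above can be sketched as follows. This is only an illustration, not the code the author submitted: the function name medianByMerge and the use of std::vector are assumptions made for this example, and it assumes at least one input is non-empty.

```cpp
#include <cstddef>
#include <vector>

// Median of two sorted vectors by walking both with two indices,
// as in the merge step of merge sort, without building the merged array.
// Assumption: a.size() + b.size() >= 1.
double medianByMerge(const std::vector<int>& a, const std::vector<int>& b) {
    const std::size_t total = a.size() + b.size();
    std::size_t i = 0, j = 0;
    int prev = 0, curr = 0;
    // After the loop, curr is the element at 0-based position total / 2
    // of the merged order and prev is the element just before it.
    for (std::size_t step = 0; step <= total / 2; ++step) {
        prev = curr;
        if (j >= b.size() || (i < a.size() && a[i] <= b[j])) {
            curr = a[i++];
        } else {
            curr = b[j++];
        }
    }
    // Odd total: the middle element. Even total: average of the two middle ones.
    return (total % 2 == 1) ? curr
                            : (static_cast<double>(prev) + curr) / 2.0;
}
```

The loop runs total / 2 + 1 times, so the cost is linear in m + n, but unlike the sorting version it needs no extra array.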

Finally, I found a very good method in "median of two sorted arrays". The original explanation is in English, and I retell it here. The core of the approach is to turn the original problem into finding the k-th smallest number (assuming both sequences are in ascending order), so that the median is simply the ((m+n)/2)-th smallest number. Once the k-th smallest problem is solved, so is the original one.

First, assume that A and B both contain more than k/2 elements. We compare A[k/2-1] and B[k/2-1], which are the (k/2)-th smallest element of A and the (k/2)-th smallest element of B respectively. There are three possible outcomes: >, < and =. If A[k/2-1] < B[k/2-1], then the elements A[0] through A[k/2-1] all lie among the k smallest elements of the merged array. In other words, A[k/2-1] cannot be larger than the k-th smallest value of the merged array, so we can discard these elements.

The proof is simple, by contradiction. Suppose A[k/2-1] is larger than the k-th smallest value after merging; without loss of generality, say it is the (k+1)-th smallest. Because A[k/2-1] < B[k/2-1], B[k/2-1] is at least the (k+2)-th smallest. But at most k/2-1 elements of A are smaller than A[k/2-1], and at most k/2-1 elements of B are smaller than A[k/2-1], so at most k/2 + k/2 - 2 elements are smaller than A[k/2-1]. That is fewer than k, which contradicts A[k/2-1] being the (k+1)-th smallest.

When A[k/2-1] > B[k/2-1], the symmetric conclusion holds.

When A[k/2-1] = B[k/2-1], we have already found the k-th smallest number: it is this common value, which we call x. A and B each contain k/2-1 elements before x, so x is the k-th smallest number. (There may be some doubt here: if k is odd, this count is off by one. The argument above is idealized; the actual code differs slightly: it first takes k/2 elements from one array and then k - k/2 from the other.)
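The discard argument can be checked on small inputs. The sketch below is an illustration only, under the same idealized assumption as the text (both arrays hold at least k/2 elements, and k >= 2). The helper names kthByMerging and kthAfterOneDiscard are hypothetical, and the reference answer is computed by a plain merge rather than by the method under discussion.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <vector>

// Reference answer: k-th smallest (1-based) of two sorted vectors,
// obtained by actually merging them. O(m + n), used only for checking.
int kthByMerging(const std::vector<int>& a, const std::vector<int>& b,
                 std::size_t k) {
    std::vector<int> merged;
    merged.reserve(a.size() + b.size());
    std::merge(a.begin(), a.end(), b.begin(), b.end(),
               std::back_inserter(merged));
    return merged[k - 1];
}

// Performs one elimination step: drops the first k / 2 elements of the
// vector whose (k / 2)-th element is smaller (a's prefix on a tie, which
// is equally safe), then returns the (k - k / 2)-th smallest of what is left.
// Assumption: k >= 2 and both vectors hold at least k / 2 elements.
int kthAfterOneDiscard(std::vector<int> a, std::vector<int> b, std::size_t k) {
    const std::size_t half = k / 2;
    const std::ptrdiff_t drop = static_cast<std::ptrdiff_t>(half);
    if (a[half - 1] <= b[half - 1]) {
        a.erase(a.begin(), a.begin() + drop);
    } else {
        b.erase(b.begin(), b.begin() + drop);
    }
    return kthByMerging(a, b, k - half);
}
```

If the argument holds, kthAfterOneDiscard(a, b, k) and kthByMerging(a, b, k) agree for every valid k, which is exactly what lets the recursion shrink k by k/2 per step.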

With this analysis we can find the k-th smallest number recursively. We also need to handle a few boundary conditions:

    • If A or B is empty, return B[k-1] or A[k-1] directly;
    • If k is 1, return the smaller of A[0] and B[0];
    • If A[k/2-1] = B[k/2-1], return either of them.

The final implementation code is:

    #include <algorithm>

    double findKth(int A[], int m, int B[], int n, int k)
    {
        // Always keep the shorter array first, so that m <= n.
        if (m > n)
            return findKth(B, n, A, m, k);
        if (m == 0)
            return B[k - 1];
        if (k == 1)
            return std::min(A[0], B[0]);
        // Split k into two parts: pa elements from A, pb elements from B.
        int pa = std::min(k / 2, m), pb = k - pa;
        if (A[pa - 1] < B[pb - 1])
            return findKth(A + pa, m - pa, B, n, k - pa);  // discard A[0..pa-1]
        else if (A[pa - 1] > B[pb - 1])
            return findKth(A, m, B + pb, n - pb, k - pb);  // discard B[0..pb-1]
        else
            return A[pa - 1];
    }

    class Solution
    {
    public:
        double findMedianSortedArrays(int A[], int m, int B[], int n)
        {
            int total = m + n;
            if (total & 0x1)
                return findKth(A, m, B, n, total / 2 + 1);
            else
                return (findKth(A, m, B, n, total / 2)
                        + findKth(A, m, B, n, total / 2 + 1)) / 2;
        }
    };

As we can see, the code is concise and efficient. In the best case each call discards k/2 elements, so the complexity is O(log k); since k is about (m+n)/2 when we look for the median, the overall complexity is O(log(m+n)).
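To make the logarithmic behaviour visible, here is an iterative sketch of the same idea that also counts elimination rounds. It is a variant written for illustration, not the code above: it looks k/2 elements ahead in both arrays (clamped to what is left) instead of keeping the shorter array first, and the names KthResult and kthSmallest are assumptions of this example.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct KthResult {
    int value;          // the k-th smallest element (1-based k)
    std::size_t steps;  // number of elimination rounds performed
};

// Iterative k-th smallest search over two sorted vectors.
// Assumption: 1 <= k <= a.size() + b.size().
KthResult kthSmallest(const std::vector<int>& a, const std::vector<int>& b,
                      std::size_t k) {
    std::size_t ia = 0, ib = 0;  // first index still under consideration
    std::size_t steps = 0;
    while (true) {
        if (ia == a.size()) return {b[ib + k - 1], steps};
        if (ib == b.size()) return {a[ia + k - 1], steps};
        if (k == 1) return {std::min(a[ia], b[ib]), steps};
        // Look k / 2 elements ahead in each vector, clamped to what is left.
        const std::size_t half = k / 2;
        const std::size_t na = std::min(half, a.size() - ia);
        const std::size_t nb = std::min(half, b.size() - ib);
        if (a[ia + na - 1] <= b[ib + nb - 1]) {
            ia += na;  // these na elements are all among the k smallest
            k -= na;
        } else {
            ib += nb;
            k -= nb;
        }
        ++steps;
    }
}
```

Each round either removes about half of the remaining k or uses up what is left of one array, so the steps field stays small even for long inputs.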

Suppose we have two arrays that are already sorted. How do we find their median? Sorting the two arrays together costs O(n log n) at best. However, using the ideas of medians and order statistics, the problem can be solved in O(log n) time.

We first take the median of each array; since the arrays are sorted, this takes O(1) time. Then we compare the two medians. If the median of A is greater than the median of B, we continue in the first half of A and the second half of B; otherwise we continue in the first half of B and the second half of A. Each step halves the problem, so the recurrence T(n) = T(n/2) + O(1) gives a time complexity of O(log n).

Median: the value at the middle position of a sorted data set; if the number of values is even, it is the average of the two middle values.

In this problem both arrays contain n elements, so the total count 2n is even and the median must be the average of the two middle elements. The required time complexity is O(log n), so we must make full use of the fact that the arrays are sorted.

I read many versions on the web, and none of them handles the odd and even cases completely. Writing it myself, I found many details that are easy to overlook. The test data are integers; you could simply declare the arrays as double, but with an integer type you must take care to cast the result. There is also the termination condition of the recursion: besides the case where each array has a single element left and the case where the two medians are equal, the case where each array has two elements left needs separate handling. Try a few more examples to test it. This code was tested with VC++ 6.0. Feel free to test it yourself, and ask if you have any questions. The code is as follows:

    #include <algorithm>
    #include <iostream>
    using namespace std;

    // The result must be a double, because the average of the two middle
    // values is not necessarily an integer.
    double MidNum(int *a, int l1, int r1, int *b, int l2, int r2)
    {
        // Choose the median positions by parity so that both subarrays
        // keep the same number of elements.
        int mid1, mid2;
        if ((r1 - l1 + 1) % 2 == 0) {  // even length
            mid1 = (l1 + r1) / 2 + 1;  // a takes its upper median
            mid2 = (l2 + r2) / 2;      // b takes its lower median
        } else {                       // odd length
            mid1 = (l1 + r1) / 2;
            mid2 = (l2 + r2) / 2;
        }
        // One element left in each array.
        if (l1 == r1 && l2 == r2)
            return (double)(a[l1] + b[l2]) / 2;
        // Two elements left in each array. The recursion below cannot shrink
        // this case and would recurse forever, e.g. a[6]={1,3,5,6,8,10},
        // b[6]={2,4,7,9,11,15} ends with a{6,8}, b{4,7}. The two middle values
        // of the four are max(a[l1], b[l2]) and min(a[r1], b[r2]).
        if (r1 - l1 == 1 && r2 - l2 == 1)
            return (double)(max(a[l1], b[l2]) + min(a[r1], b[r2])) / 2;
        if (a[mid1] == b[mid2])
            return a[mid1];
        else if (a[mid1] > b[mid2])
            return MidNum(a, l1, mid1, b, mid2, r2);
        else
            return MidNum(a, mid1, r1, b, l2, mid2);
    }

    int main()
    {
        int a[6] = {1, 3, 5, 6, 8, 10};
        int b[6] = {2, 4, 6, 9, 11, 15};
        double m2 = MidNum(a, 0, 5, b, 0, 5);
        cout << m2 << endl;
        int a2[10] = {17, 18, 28, 37, 42, 54, 63, 72, 89, 96};
        int b2[10] = {3, 51, 71, 72, 91, 111, 121, 131, 141, 1000};
        double m = MidNum(a2, 0, 9, b2, 0, 9);
        cout << m << endl;
        return 0;
    }

Because the required complexity is O(log N), consider binary search.

First, assume that the length N of the two arrays is odd and greater than 1. Let mid = (1 + N) / 2, which is the index of the middle element (indices start at 1). Consider how X[mid] and Y[mid] compare:

(1) X[mid] > Y[mid]
In this case, when the two arrays are merged and sorted, the rank of X[mid] (ranks start at 1) is definitely greater than N, because these N elements are all less than or equal to X[mid]: X[1...mid-1], Y[1...mid-1], Y[mid].
By the same analysis, the rank of Y[mid] is definitely less than N + 1. This gives a lemma: if we simultaneously remove any k elements (k > 0) after X[mid] and any k elements before Y[mid], the median of the two new arrays is still the same as that of the original arrays. The lemma is easy to prove by drawing a picture. The original problem is thus reduced to a smaller subproblem.

(2) X[mid] == Y[mid]
I thought for a long time about how to handle this case, and later realized I had been foolish. If X[mid] equals Y[mid], consider how the merge proceeds. First, we can merge X[1...mid-1] and Y[1...mid-1] into a new sorted array P of length 2(mid-1). Then we merge X[mid+1...N] and Y[mid+1...N] into a new sorted array R, also of length 2(mid-1). Finally, we put X[mid] and Y[mid] in the middle and get the final sorted array: P, X[mid], Y[mid], R.
In other words, when X[mid] == Y[mid], we can immediately conclude that X[mid] and Y[mid] are the two middle elements we are looking for!

(3) X[mid] < Y[mid]
This case is symmetric to case (1), so I will not repeat it.

Next, assume that the length N of the two arrays is even and greater than 1. Let mid = (1 + N) / 2 with integer division, which is the index of the left one of the two middle elements. Consider how X[mid] and Y[mid + 1] compare:
(1) X[mid] > Y[mid + 1]
By an argument similar to the odd case, X[mid] ranks in the back half and Y[mid + 1] ranks in the front half, so the same idea shrinks the problem.
(2) X[mid] == Y[mid + 1]
The same reasoning gives the same conclusion: when they are equal, they are the two middle elements we are looking for.
(3) X[mid] < Y[mid + 1]
Symmetric, handled the same way.

Finding the median of two sorted arrays after merging

Step one: assuming the two sorted arrays have the same length, write a function that finds the median of the two arrays combined. Step two: assuming the two sorted arrays have different lengths, find the median again.

Analysis: the problem looks simple. For the first question, if each array has length n, I can just merge array 1 and array 2 and then pick the middle element directly. With this scheme the first and second questions are no different, and the time complexity is O(n). At this point the interviewer will usually ask: "Do you have a better way? :)" Something more efficient than linear naturally suggests logarithmic time, O(log n). Is that achievable here? Of course it is. Read on.

First, a picture (drawn by myself, so it is a bit crude).

Let us analyze. Logarithmic efficiency immediately suggests binary search. What does binary search mean for this problem?

We take A[N/2] and B[N/2] and compare them.

If they are equal, the search is over: the answer has been found, because A[N/2] is certainly the median of the merged order.

If B[N/2] > A[N/2], what does that tell us? The median must lie in A[N/2]..A[N] or in B[0]..B[N/2]. That "or" is very important: it means we have successfully reduced the problem to finding the median of the merge of the sorted ranges A[N/2]..A[N] and B[0]..B[N/2], and recursion is obviously a good choice.

Similarly, what if B[N/2] < A[N/2]? Then we search in A[0]..A[N/2] and B[N/2]..B[N].

Thinking further: when does this recursion end? One case is when the two values are equal; otherwise it continues until N == 1.

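Following this idea we can easily write the code below. The boundary values still need some thought, since the description above is only the idea: when the length is even, the middle index is taken on opposite sides in the two arrays, and the recursion also stops when each array has two elements left.

    #include <algorithm>

    // Median of two sorted arrays of the same length (length >= 1).
    double find_median_equal_length(int a[], int b[], int length)
    {
        if (length == 1)
            return (a[0] + b[0]) / 2.0;
        if (length == 2)
            return (std::max(a[0], b[0]) + std::min(a[1], b[1])) / 2.0;
        int i = length / 2;        // upper middle of a
        int j = (length - 1) / 2;  // lower middle of b (equals i for odd length)
        if (a[i] == b[j])
            return a[i];
        else if (a[i] < b[j])
            return find_median_equal_length(&a[i], &b[0], length - i);
        else
            return find_median_equal_length(&a[0], &b[j], length - j);
    }

What if someone says the lengths are not equal? The approach is the same. Let us draw a picture again (my drawing has certainly improved).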
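For the unequal-length case the text stops at the picture, so here is one possible sketch. It is not the author's method: it uses a different, commonly used technique, a binary search over how many elements of the shorter array fall into the lower half of the merged order. The function name medianUnequal and the use of std::vector are assumptions of this example, INT_MIN and INT_MAX act as sentinels (so inputs that contain those exact values are not handled), and at least one input must be non-empty.

```cpp
#include <algorithm>
#include <climits>
#include <cstddef>
#include <vector>

// Median of two sorted vectors of possibly different lengths.
// Assumption: a.size() + b.size() >= 1, values strictly between
// INT_MIN and INT_MAX.
double medianUnequal(const std::vector<int>& a, const std::vector<int>& b) {
    if (a.size() > b.size()) return medianUnequal(b, a);  // a is the shorter
    const std::size_t m = a.size(), n = b.size();
    const std::size_t leftSize = (m + n + 1) / 2;  // size of the lower half
    std::size_t lo = 0, hi = m;
    while (lo <= hi) {
        const std::size_t i = lo + (hi - lo) / 2;  // elements taken from a
        const std::size_t j = leftSize - i;        // elements taken from b
        // Sentinels stand in for "no element on this side".
        const int aLeft  = (i == 0) ? INT_MIN : a[i - 1];
        const int aRight = (i == m) ? INT_MAX : a[i];
        const int bLeft  = (j == 0) ? INT_MIN : b[j - 1];
        const int bRight = (j == n) ? INT_MAX : b[j];
        if (aLeft > bRight) {
            hi = i - 1;  // took too many from a (i > 0 here, so no wrap-around)
        } else if (bLeft > aRight) {
            lo = i + 1;  // took too few from a
        } else {
            const int lowMax = std::max(aLeft, bLeft);
            if ((m + n) % 2 == 1) return lowMax;
            return (static_cast<double>(lowMax) + std::min(aRight, bRight)) / 2.0;
        }
    }
    return 0.0;  // not reached for sorted, non-empty input
}
```

The search only runs over the shorter array, so the number of iterations is logarithmic in min(m, n); when both arrays have the same length it gives the same answers as the equal-length recursion.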

