Median of 0.5 billion numbers

Source: Internet
Author: User

The simplest way to find the median is to sort the sequence and take the middle element. However, reading all 0.5 billion numbers into memory takes nearly 2 GB (assuming 4 bytes per number), which may exceed the memory available.

One idea is external sorting: keep a count of the elements during the sort, then read off the median. First, use hash() % 100 to split the data into 100 files. Then sort each file in memory, merge the 100 sorted files, and stop at the median during the merge. The time complexity is O(n log n).
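The external-sort idea can be sketched roughly as follows. This is a minimal illustration, not a tuned implementation: the function names, the one-number-per-line text format, and the choice of the lower median for an even count are all assumptions made here, not details from the original.

```python
import heapq
import os
import tempfile


def _read_ints(path):
    """Yield the integers stored one per line in `path`."""
    with open(path) as f:
        for line in f:
            yield int(line)


def external_median(numbers, num_files=100):
    """Lower median of `numbers` via partition, per-file sort, and k-way merge."""
    with tempfile.TemporaryDirectory() as tmp:
        paths = [os.path.join(tmp, f"part_{i}.txt") for i in range(num_files)]
        files = [open(p, "w") for p in paths]
        total = 0
        # Split the input into small files and count the elements on the way.
        for x in numbers:
            files[hash(x) % num_files].write(f"{x}\n")
            total += 1
        for f in files:
            f.close()
        if total == 0:
            raise ValueError("median of empty input")

        # Sort each small file in memory and write it back.
        for p in paths:
            chunk = sorted(_read_ints(p))
            with open(p, "w") as f:
                f.writelines(f"{x}\n" for x in chunk)

        # Merge the sorted files lazily and stop at the middle position.
        target = (total - 1) // 2  # assumption: lower median for even counts
        merged = heapq.merge(*(_read_ints(p) for p in paths))
        for i, x in enumerate(merged):
            if i == target:
                return x
```

Only one small file is held in memory at a time during the sort phase, and the merge reads each file as a stream, so peak memory depends on the largest small file rather than on the full input.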

 

Another method is to partition the data by value range: 0-9999999, 10000000-19999999, and so on, about 50 parts in total. Each part is saved to a small file, and the number of elements in each file is counted. Because the files are ordered relative to one another, the counts show which file contains the median and what rank the median has within that file. That file is then processed in the same way. Once a file is small enough, the median can be found directly in memory. Finding the k-th smallest of n random numbers takes O(n) time, so the total time complexity is O(n).
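A small sketch of the range-partitioning approach, under some simplifying assumptions: Python lists stand in for the small files, all values are assumed to lie in a known range [lo, hi), and the names `kth_smallest`, `bucket_median`, and `in_memory_limit` are hypothetical. The base case uses `sorted` for brevity; a linear-time selection such as quickselect would be needed there to match the O(n) bound.

```python
def kth_smallest(values, k, lo, hi, num_buckets=50, in_memory_limit=1000):
    """Return the k-th smallest (0-based) of `values`, all assumed in [lo, hi)."""
    if not 0 <= k < len(values):
        raise IndexError("k out of range")
    # Small enough (or a single possible value): finish in memory.
    if len(values) <= in_memory_limit or hi - lo <= 1:
        return sorted(values)[k]

    width = -(-(hi - lo) // num_buckets)  # ceiling division
    buckets = [[] for _ in range(num_buckets)]  # each list stands in for a small file
    for v in values:
        buckets[(v - lo) // width].append(v)

    # Walk the buckets in value order until the one holding rank k is found.
    for i, bucket in enumerate(buckets):
        if k < len(bucket):
            start = lo + i * width
            end = min(start + width, hi)
            # k is now the rank inside this bucket; repeat on the bucket alone.
            return kth_smallest(bucket, k, start, end, num_buckets, in_memory_limit)
        k -= len(bucket)
    raise AssertionError("unreachable when all values lie in [lo, hi)")


def bucket_median(values, lo, hi, **options):
    """Lower median of `values` (assumption: lower middle for even counts)."""
    return kth_smallest(values, (len(values) - 1) // 2, lo, hi, **options)
```

Each level of recursion touches only the bucket that contains the target rank, which is why the work shrinks quickly after the first pass over the full data.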

 

0.5 billion numbers: finding the missing elements

The idea is: divide the 0.5 billion numbers into 50 parts by value range (0-9999999, 10000000-19999999, ...) and store each part in a separate file. Then, for each file, you only need to find which elements of its range are not present.
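Reading this section as the task of finding which values in a known range are absent from the data (an assumption, since the original wording is ambiguous), one possible sketch is shown here. Lists stand in for the files, one byte per candidate value serves as the presence flag, and the names `missing_in_range` and `missing_by_parts` are hypothetical.

```python
def missing_in_range(values, lo, hi):
    """Return the integers in [lo, hi) that do not occur in `values`."""
    seen = bytearray(hi - lo)  # one flag per candidate value in this range
    for v in values:
        if lo <= v < hi:
            seen[v - lo] = 1
    return [lo + i for i, flag in enumerate(seen) if not flag]


def missing_by_parts(values, lo, hi, num_parts=50):
    """Split `values` (all assumed in [lo, hi)) by range, then scan each part."""
    width = -(-(hi - lo) // num_parts)  # ceiling division
    parts = [[] for _ in range(num_parts)]  # each list stands in for a file
    for v in values:
        parts[(v - lo) // width].append(v)

    missing = []
    for i, part in enumerate(parts):
        start = lo + i * width
        if start >= hi:
            break  # more parts than values in the range; nothing left to check
        end = min(start + width, hi)
        missing.extend(missing_in_range(part, start, end))
    return missing
```

Because each part covers only a narrow range, the flag array for one part stays small even when the overall range is large, and the parts can be processed one at a time.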
