Classification of algorithm problems: top-K problems and multi-way merge sorting

Source: Internet
Author: User

Pro1: find the K largest numbers

Method 1: keep a min-heap of K elements; if a later value is greater than the current root, replace the root with it and adjust the heap

Most people recommend using a heap. A detailed explanation follows:

If K = 1, no heap is needed. A single pass over the data finds the maximum, and the time complexity is O(N).

Next, consider the case where K is large, such as 10,000.

Build a min-heap (small-root heap) from the first K numbers; its root is the smallest of the K largest numbers seen so far. Then read the remaining N-K numbers one at a time and compare each with the current root. If the number is greater than the root, replace the root with it and adjust the heap. Otherwise, skip it and read the next number.

Time complexity: O(N log K).
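The min-heap procedure above can be sketched in Python as follows. This is only a minimal illustration using the standard heapq module; the function name top_k_largest is made up for this example, and the final sort exists only to return the result in a readable order.

```python
import heapq
from itertools import islice

def top_k_largest(numbers, k):
    """Return the k largest values of `numbers`, in descending order."""
    if k <= 0:
        return []
    it = iter(numbers)
    heap = list(islice(it, k))   # the first K numbers
    heapq.heapify(heap)          # min-heap: heap[0] is the smallest of them
    for x in it:                 # the remaining N-K numbers
        if x > heap[0]:
            # Replace the root with x and restore the heap in O(log K).
            heapq.heapreplace(heap, x)
    return sorted(heap, reverse=True)

print(top_k_largest([5, 1, 9, 3, 7, 8, 2], 3))  # prints [9, 8, 7]
```

Each of the N-K later numbers costs at most one heap adjustment of O(log K), which matches the O(N log K) bound.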

 

Method 2: use the partitioning idea from quicksort:

This problem also has a method with better time complexity. The algorithm is described in the book The Beauty of Programming.

First, find the K-th largest number. This can be done in O(N) time on average. The method is as follows (using the quicksort partitioning idea):

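Select a random number n (the pivot) from the N numbers and scan the array once. Place the numbers greater than n on the left and the numbers smaller than n on the right. Let L be the number of elements on the left and R the number of elements on the right.

If:

A: L = K-1, return n. It is the K-th largest number, and the L elements on its left together with n are the K largest.

B: L > K-1, repeat the operation on the L elements on the left.

C: L < K-1, repeat the operation on the R elements on the right, now looking for the (K-L-1)-th largest among them.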

Only one of B and C is executed in each round. Therefore the average complexity is approximately O(N + N/2 + N/4 + ...) = O(2N) = O(N).
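The partitioning steps above can be sketched as follows. The sketch departs from the description in two ways, both of them assumptions made for simplicity: it builds new lists instead of swapping elements in place, and it counts elements equal to the pivot as a separate group so that duplicates are handled. The names kth_largest and top_k_by_partition are invented for this example.

```python
import random

def kth_largest(numbers, k):
    """Return the k-th largest value (k is 1-based) by random partitioning."""
    if not 1 <= k <= len(numbers):
        raise ValueError("k must be between 1 and len(numbers)")
    items = list(numbers)
    while True:
        pivot = random.choice(items)
        larger = [x for x in items if x > pivot]
        smaller = [x for x in items if x < pivot]
        equal_count = len(items) - len(larger) - len(smaller)
        if k <= len(larger):
            items = larger                      # answer is among the larger ones
        elif k <= len(larger) + equal_count:
            return pivot                        # the pivot itself is the answer
        else:
            k -= len(larger) + equal_count      # skip everything >= pivot
            items = smaller

def top_k_by_partition(numbers, k):
    """Return the k largest values in descending order."""
    if k <= 0:
        return []
    if k >= len(numbers):
        return sorted(numbers, reverse=True)
    threshold = kth_largest(numbers, k)
    result = [x for x in numbers if x > threshold]
    result += [threshold] * (k - len(result))   # fill up with copies of the k-th value
    return sorted(result, reverse=True)

print(top_k_by_partition([3, 1, 4, 1, 5, 9, 2, 6], 3))  # prints [9, 6, 5]
```

Only one side is kept after each scan, so the expected work shrinks geometrically, which is where the O(N) average comes from. The worst case is still O(N^2) if the random pivots are consistently bad.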

 

Pro2: 20 sorted arrays, merged to find the top 500

There are 20 arrays, each with 500 elements and each already sorted. Find the 500 largest of the 20*500 numbers.

A:

Take the largest number from each of the 20 arrays (the first element, if the arrays are sorted in descending order), record the source array of each number, and build a max-heap (large-root heap) of these 20 elements. The heap top is then the maximum number. Remove the heap top and add the next number from the heap top's source array to the heap, then take the maximum again. Repeat this 500 times.
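A small sketch of this procedure is shown below. It assumes each array is sorted in descending order, which the problem statement does not specify. Python's heapq is a min-heap, so the values are stored negated to get max-heap behaviour. The function name top_m_from_sorted is illustrative only.

```python
import heapq

def top_m_from_sorted(arrays, m):
    """arrays: lists sorted in descending order. Return the m largest values overall."""
    # Each heap entry is (negated value, source array index, position in that array).
    heap = [(-arr[0], src, 0) for src, arr in enumerate(arrays) if arr]
    heapq.heapify(heap)
    result = []
    while heap and len(result) < m:
        neg_value, src, pos = heapq.heappop(heap)   # current maximum
        result.append(-neg_value)
        if pos + 1 < len(arrays[src]):
            # Feed the next number from the same source array into the heap.
            heapq.heappush(heap, (-arrays[src][pos + 1], src, pos + 1))
    return result

print(top_m_from_sorted([[9, 5, 1], [8, 7, 2], [6, 4, 3]], 4))  # prints [9, 8, 7, 6]
```

With 20 arrays and m = 500, the heap never holds more than 20 entries, so the whole run costs about 500 heap operations of O(log 20) each.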

 

Pro3: K-way merge with a min-heap

Give an algorithm that merges K sorted linked lists into one sorted linked list in O(N lg K) time, where N is the total number of elements in all input lists.

Algorithm idea:

1. Take the first element of each of the K linked lists to form an array arr of size K, then convert arr into a min-heap, so that arr[0] is the smallest element;

2. Take out arr[0] and append it to the new linked list. Then replace arr[0] with its successor in its original linked list, that is, arr[0].next, and restore the heap. If arr[0].next is null, meaning every element of that linked list has already been taken, move the last element of the heap into arr[0], reduce the heap size by 1, and restore the heap. Repeat until the heap is empty.
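The two steps above might look like this in Python. The Node class and the helpers from_list and to_list are assumptions added so the example is self-contained; they are not part of the original description. heapq.heapreplace corresponds to overwriting arr[0] with arr[0].next, and heapq.heappop corresponds to moving the last heap element to the root and shrinking the heap.

```python
import heapq

class Node:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

def merge_k_lists(heads):
    """Merge sorted linked lists (given by their head nodes) into one sorted list."""
    # The list index i breaks ties, so Node objects are never compared directly.
    heap = [(node.value, i, node) for i, node in enumerate(heads) if node is not None]
    heapq.heapify(heap)
    dummy = tail = Node(None)
    while heap:
        _, i, node = heap[0]            # smallest remaining element
        tail.next = node                # append it to the merged list
        tail = node
        if node.next is not None:
            heapq.heapreplace(heap, (node.next.value, i, node.next))
        else:
            heapq.heappop(heap)         # this source list is exhausted
    return dummy.next

def from_list(values):
    """Build a linked list from a Python list; returns None for an empty list."""
    head = None
    for v in reversed(values):
        head = Node(v, head)
    return head

def to_list(head):
    out = []
    while head is not None:
        out.append(head.value)
        head = head.next
    return out

lists = [from_list([1, 4, 7]), from_list([2, 5]), from_list([3, 6, 8, 9])]
print(to_list(merge_k_lists(lists)))  # prints [1, 2, 3, 4, 5, 6, 7, 8, 9]
```

Every one of the N elements passes through the heap once, and each heap operation costs O(lg K), giving O(N lg K) overall.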

Http://www.programlife.net/stl-priority-queue.html

 

Pro4: globally ordered, locally disordered

An integer array has 10 billion elements, all in sorted order. The array is divided into several segments; each segment has no more than 20 elements, and the segment lengths vary. The elements within each segment are then shuffled. Now sort the whole array of 10 billion elements again. What is the algorithm with the lowest time complexity? Give its time complexity as well.

Http://bbs.csdn.net/topics/390252481

Http://blog.csdn.net/burningsheep/article/details/8104493

Analysis:

If all segments had the same length, the K-way merge method above could be used, but here the lengths are not equal, so another method has to be considered.

Solution: (straight insertion sort)

Take a maximum segment length of 5 as an example. Assume that elements 1 through 5N have already been sorted, written sort(5N). To add the next 5 elements, 5N+1 through 5N+5, to the sorted array, simply insert them one by one with insertion sort. Because no segment is longer than 5, inserting element 5N+1 only requires looking back at most 5 positions, so it takes no more than 5 comparisons. The same holds for elements 5N+2 through 5N+5: each is compared at most 5 times before it reaches its insertion position. Therefore, each time five numbers are added to the sorted array, the time complexity is O(5*5).

In general, let the array length be N and the maximum segment length be K. Inserting one group of K elements then costs O(K * K).

Inserting group by group takes N/K rounds, so the total time complexity is O(K * K) * O(N/K) = O(NK).
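To make the O(NK) bound concrete, the sketch below runs a plain insertion sort on an array that was sorted and then shuffled inside segments of at most K elements, and counts how many element shifts were needed. The helper shuffle_in_segments and its seed parameter are assumptions used only to build test data.

```python
import random

def insertion_sort(a):
    """Sort list `a` in place and return the number of element shifts performed."""
    shifts = 0
    for i in range(1, len(a)):
        x = a[i]
        j = i - 1
        # Only elements of the same shuffled segment can be larger than x,
        # so this loop runs fewer than K times.
        while j >= 0 and a[j] > x:
            a[j + 1] = a[j]
            j -= 1
            shifts += 1
        a[j + 1] = x
    return shifts

def shuffle_in_segments(sorted_values, max_len, seed=0):
    """Cut a sorted list into segments of random length <= max_len and shuffle each."""
    rng = random.Random(seed)
    out, i = [], 0
    while i < len(sorted_values):
        seg_len = rng.randint(1, max_len)
        segment = sorted_values[i:i + seg_len]
        rng.shuffle(segment)
        out.extend(segment)
        i += seg_len
    return out

n, k = 1000, 20
data = shuffle_in_segments(list(range(n)), k)
shifts = insertion_sort(data)
print(data == list(range(n)), shifts <= n * (k - 1))  # prints True True
```

The insertion sort does not need to know where the segment boundaries are, which is why unequal segment lengths cause no difficulty.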

 

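Pro5: find the 10,000 largest of 10 billion numbers, and describe the time complexity of the algorithm.

Build a min-heap from the first 10,000 numbers. For each later number, push it into the heap and push out the smallest one, so the heap always holds the 10,000 largest numbers seen so far. As in Pro1, the time complexity is O(N log K), here with N = 10 billion and K = 10,000.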
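Because 10 billion numbers will not usually fit in memory, the same idea can be written as a single pass over a stream. The sketch below is a hedged illustration: the function name largest_from_stream and the small random test stream are made up, and a real data source would be read from files or a network instead.

```python
import heapq
import random

def largest_from_stream(stream, k):
    """Consume `stream` once and return the k largest values (unordered)."""
    heap = []
    for x in stream:
        if len(heap) < k:
            heapq.heappush(heap, x)        # still filling the first k slots
        else:
            # Push x, then pop the smallest; the heap size stays at k.
            heapq.heappushpop(heap, x)
    return heap

# A generator stands in for the huge input; only k values are kept in memory.
rng = random.Random(42)
stream = (rng.randrange(10**9) for _ in range(100_000))
top = largest_from_stream(stream, 100)
print(len(top))  # prints 100
```

In Python the standard library call heapq.nlargest(k, iterable) gives the same result in sorted order.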


Tools that can be used to implement the algorithms above: a hand-written heap sort function, a priority queue, or the STL set / multiset.
