Summary of Python solutions to the top-K (K largest numbers) problem

Source: Internet
Author: User

The top-K problem is to find the K largest numbers in a collection. This problem is very common: for example, finding the top 10 keywords from 10 million search records.
Method 1: Sorting
Sort the numbers in descending order, then take the first K.
Time complexity: O(n log n) + O(k) = O(n log n).
This method is simple and crude.
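
A minimal sketch of Method 1, assuming the input is a list of numbers. The function name top_k_by_sorting is hypothetical and does not come from the original article:

```python
def top_k_by_sorting(numbers, k):
    """Return the k largest values in descending order by sorting a copy."""
    if k <= 0:
        return []
    # sorted() leaves the input untouched; reverse=True puts the largest first.
    return sorted(numbers, reverse=True)[:k]


if __name__ == "__main__":
    print(top_k_by_sorting([1, 9, 2, 4, 7, 6, 3], 3))  # [9, 7, 6]
```

If k is larger than the list, the slice simply returns every element, so no extra length check is needed in this sketch.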

Method 2: Max heap

We can keep a container of size K that holds the K smallest numbers seen so far, traverse the entire array, and compare each number with the largest number in the container. If the number is greater than or equal to that largest value, skip it; otherwise, replace the largest value in the container with this number. The idea is simple to understand. For the container, most people first think of a max heap, but how do we implement a max heap in Python? The heapq library only implements a min heap. However, negating every number reverses the order, so the largest number becomes the smallest. We can therefore negate each number in the array, use the min heap on the negated values, and negate the results again before returning them. Each heap operation costs O(log K), so the time complexity is O(n log K). The code is as follows:

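import heapq


def get_least_numbers_big_data(alist, k):
    # Return the k smallest numbers of alist, in no particular order.
    max_heap = []
    length = len(alist)
    if not alist or k <= 0 or k > length:
        return []
    for ele in alist:
        # heapq is a min heap, so store negated values to simulate a max heap.
        ele = -ele
        if len(max_heap) < k:
            heapq.heappush(max_heap, ele)
        else:
            # Push the new value, then pop the smallest negated value,
            # i.e. discard the largest original number.
            heapq.heappushpop(max_heap, ele)
    # Undo the negation before returning.
    return [-x for x in max_heap]


if __name__ == "__main__":
    l = [1, 9, 2, 4, 7, 6, 3]
    min_k = get_least_numbers_big_data(l, 3)
    print(sorted(min_k))  # [1, 2, 3]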
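
The code above returns the K smallest numbers. For the K largest numbers that the top-K problem asks for, a min heap of size K should work without any negation: the heap root is the smallest of the K largest values seen so far, and any larger value replaces it. A rough sketch under that assumption follows. The name get_top_k is hypothetical, and heapq.nlargest from the standard library is used only as a cross-check:

```python
import heapq


def get_top_k(numbers, k):
    """Return the k largest values (in heap order, not sorted)."""
    if k <= 0:
        return []
    min_heap = []
    for value in numbers:
        if len(min_heap) < k:
            heapq.heappush(min_heap, value)
        elif value > min_heap[0]:
            # value beats the smallest kept value, so swap it in.
            heapq.heapreplace(min_heap, value)
    return min_heap


if __name__ == "__main__":
    data = [1, 9, 2, 4, 7, 6, 3]
    print(sorted(get_top_k(data, 3), reverse=True))  # [9, 7, 6]
    print(heapq.nlargest(3, data))                   # [9, 7, 6]
```

Because the heap never holds more than K values, this variant only needs O(K) extra memory, which is what makes the heap approach attractive for inputs too large to sort at once.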

Method 3: quick select

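The quick select algorithm is similar to the quick sort algorithm. The difference is that after partitioning around a pivot, quick select only needs to recurse into one side instead of both.
Time complexity: O(n) on average; O(n^2) in the worst case when the pivots are chosen badly.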

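def qselect(A, k):
    # Return the k largest elements of A, in no particular order.
    if k <= 0:
        return []
    if len(A) < k:
        return A
    pivot = A[-1]
    # Elements strictly greater than the pivot; using > instead of >=
    # keeps the recursion from looping forever on duplicate values.
    right = [x for x in A[:-1] if x > pivot]
    if len(right) >= k:
        return qselect(right, k)
    right = [pivot] + right
    rlen = len(right)
    if rlen == k:
        return right
    # Still short of k: take the remainder from the smaller elements.
    left = [x for x in A[:-1] if x <= pivot]
    return qselect(left, k - rlen) + right


for i in range(1, 10):
    print(qselect([11, 8, 4, 1, 5, 2, 7, 9], i))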
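
The version above builds new lists at every step. As an illustration of how quick select narrows down to one side only, here is a hedged in-place sketch that uses a Lomuto-style partition and a random pivot. Both choices, and the name quickselect_top_k, are assumptions for this example and are not taken from the original article:

```python
import random


def quickselect_top_k(numbers, k):
    """Return the k largest values of numbers, in no particular order."""
    items = list(numbers)  # work on a copy so the caller's list is unchanged
    if k <= 0:
        return []
    if k >= len(items):
        return items
    lo, hi = 0, len(items) - 1
    while lo < hi:
        # A random pivot makes the quadratic worst case unlikely.
        p = random.randint(lo, hi)
        items[p], items[hi] = items[hi], items[p]
        pivot = items[hi]
        store = lo
        # Partition in descending order: larger values move to the front.
        for i in range(lo, hi):
            if items[i] > pivot:
                items[i], items[store] = items[store], items[i]
                store += 1
        items[store], items[hi] = items[hi], items[store]
        # The pivot now sits at its final index in descending order.
        if store == k - 1:
            break
        if store < k - 1:
            lo = store + 1   # only the right part still needs work
        else:
            hi = store - 1   # only the left part still needs work
    return items[:k]


if __name__ == "__main__":
    print(sorted(quickselect_top_k([11, 8, 4, 1, 5, 2, 7, 9], 3)))  # [8, 9, 11]
```

Only one of the two sides is kept after each partition, which is why the average cost stays linear while a full quick sort costs O(n log n).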
