# Finding the k smallest or largest values in massive data (Java)

Source: Internet
Author: User

The problem: given a very large set of n numbers, find the k smallest (or k largest) values.

Analysis: There are several ways to solve this. First, the most obvious approach is to sort all the data (for example with quicksort) and then output the first k values.
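A minimal sketch of this first approach, assuming the whole data set fits in an `int[]` and that k is not negative. The class and method names (`TopKBySorting`, `kSmallest`) are made up for illustration and do not come from the original article.

```java
import java.util.Arrays;

final class TopKBySorting {

    /** Returns the k smallest values of data in ascending order (assumes k >= 0). */
    static int[] kSmallest(int[] data, int k) {
        int[] sorted = data.clone();   // leave the caller's array untouched
        Arrays.sort(sorted);           // O(n lg n) sort of all n values
        // Keep only the first k values (or all of them if there are fewer than k)
        return Arrays.copyOf(sorted, Math.min(k, sorted.length));
    }

    public static void main(String[] args) {
        int[] data = {9, 4, 7, 1, 8, 2, 6};
        System.out.println(Arrays.toString(kSmallest(data, 3))); // prints [1, 2, 4]
    }
}
```

The sketch is short, but it has to hold all n values in memory at once, which is exactly what becomes impossible for very large inputs.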

Second, allocate an array of size k, fill it with the first k values from the source data, and move the array's maximum value, maxValue, to the first position. Then iterate over the remaining n - k values: for each value x, if x < maxValue, replace maxValue with x and again move the array's maximum to the first position.
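The second approach might look like the sketch below. It assumes 1 <= k <= data.length and, for brevity, takes the data as an in-memory `int[]`. The names `TopKByArray`, `moveMaxToFront`, and `kSmallest` are illustrative only.

```java
import java.util.Arrays;

final class TopKByArray {

    /** Moves the largest element of buf to index 0 with one linear scan: O(k). */
    static void moveMaxToFront(int[] buf) {
        int largest = 0;
        for (int i = 1; i < buf.length; i++) {
            if (buf[i] > buf[largest]) {
                largest = i;
            }
        }
        int tmp = buf[0];
        buf[0] = buf[largest];
        buf[largest] = tmp;
    }

    /** Returns the k smallest values, unordered except that the largest of them is at index 0. */
    static int[] kSmallest(int[] data, int k) {
        int[] buf = Arrays.copyOf(data, k);   // the first k values
        moveMaxToFront(buf);
        for (int i = k; i < data.length; i++) {
            if (data[i] < buf[0]) {           // smaller than the current maximum: keep it
                buf[0] = data[i];
                moveMaxToFront(buf);          // O(k) rescan after every replacement
            }
        }
        return buf;
    }

    public static void main(String[] args) {
        int[] result = kSmallest(new int[]{9, 4, 7, 1, 8, 2, 6}, 3);
        Arrays.sort(result);                  // sorted only to make the output easy to read
        System.out.println(Arrays.toString(result)); // prints [1, 2, 4]
    }
}
```

Each replacement triggers a full rescan of the k-element array, which is where the (n - k) * k term in this approach's running time comes from.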

Third, building on the second approach, maintain a max-heap of size k. Fill the heap with the first k values from the source data so that its maximum value, maxValue, sits at the top of the heap. Then iterate over the remaining n - k values: for each value x, if x < maxValue, replace maxValue with x and sift the new value down to restore the heap, which brings the new maximum back to the top.
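As a rough illustration of the third approach, the sketch below swaps in `java.util.PriorityQueue` with a reversed comparator as the max-heap instead of a hand-written one. The names `TopKByHeap` and `kSmallest` are hypothetical. Taking an `Iterable<Integer>` is also an assumption, made so that the source could be read lazily rather than held in memory. Note that filling the queue with k separate `offer` calls costs O(k lg k) rather than the O(k) of a bottom-up heap build.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.PriorityQueue;

final class TopKByHeap {

    /** Returns the k smallest values in ascending order, keeping at most k values in memory. */
    static int[] kSmallest(Iterable<Integer> data, int k) {
        if (k <= 0) {
            return new int[0];
        }
        // Reversed natural order turns the min-heap PriorityQueue into a max-heap
        PriorityQueue<Integer> heap = new PriorityQueue<>(k, Collections.reverseOrder());
        for (int x : data) {
            if (heap.size() < k) {
                heap.offer(x);              // still filling the first k slots
            } else if (x < heap.peek()) {
                heap.poll();                // drop the current maximum: O(lg k)
                heap.offer(x);              // insert the smaller value: O(lg k)
            }
        }
        int[] result = new int[heap.size()];
        for (int i = result.length - 1; i >= 0; i--) {
            result[i] = heap.poll();        // largest comes out first, so fill from the back
        }
        return result;
    }

    public static void main(String[] args) {
        int[] result = kSmallest(Arrays.asList(9, 4, 7, 1, 8, 2, 6), 3);
        System.out.println(Arrays.toString(result)); // prints [1, 2, 4]
    }
}
```

Only the heap is kept in memory, so the input can be arbitrarily long as long as it can be iterated once.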

Other approaches exist but are omitted here.

The time and space complexity of each approach are listed below.

| Approach | Time complexity | Space complexity |
| --- | --- | --- |
| One | O(n lg n + k) | O(n): the whole data set must sit in one array; no heap structure is needed |
| Two | O(k + (n - k) * k) | O(k): one array of size k; no heap structure is needed |
| Three | O(k + (n - k) * lg k) | O(k) |
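To get a feel for these formulas, here is a rough worked example. It assumes, purely for illustration, n = 1,300,000 values and k = 4, ignores constant factors, and counts worst-case steps:

$$
\begin{aligned}
\text{Approach one:}\quad & n \lg n + k \approx 1.3\times10^{6} \times 20.3 \approx 2.6\times10^{7} \\
\text{Approach two:}\quad & k + (n-k)\,k = 4 + 1{,}299{,}996 \times 4 \approx 5.2\times10^{6} \\
\text{Approach three:}\quad & k + (n-k)\lg k = 4 + 1{,}299{,}996 \times 2 \approx 2.6\times10^{6}
\end{aligned}
$$

For such a small k, approaches two and three differ only by a factor of k / lg k = 2. The gap widens as k grows: with k = 1024 the same ratio is 1024 / 10, roughly 100.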

As n tends to infinity, approach three is clearly the best choice. When the amount of data is very large, approach one does not work at all, because a single array cannot hold that much data; in practice almost nobody writes it that way, and its time complexity is O(n lg n) in any case. With a heap, each heap operation costs O(lg k), where k is the capacity of the heap. I wrote the Java code for approach three today and share it below:

package findminnumincludedtopn;

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.Arrays;

/**
 * Finds the k smallest values in a massive data set with a max-heap of size k.
 * Time complexity: O(k + (n - k) * lg k); space complexity: O(k).
 *
 * @author Tongxueqiang
 * @date 2016/03/08
 * @since JDK 1.7
 */
public class FindMinNumIncluedTopN {

    // Assumption: the original listing does not show where the data is read from.
    // This placeholder stands for a text file with one integer per line.
    private static final String DATA_FILE = "data.txt";

    /**
     * Finds the k smallest values in the data file.
     *
     * @param k how many values to keep
     * @return the k smallest values in heap order (the largest of them at index 0)
     * @throws IOException if the file cannot be read
     */
    public int[] findMinNumIncluedTopN(int k) throws IOException {
        if (k <= 0) {
            return new int[0];
        }
        long start = System.nanoTime();

        int[] heap = new int[k];
        int index = 0;
        try (BufferedReader reader = new BufferedReader(new FileReader(DATA_FILE))) {
            String text;
            // Read the first k values to fill the heap array
            while (index < k && (text = reader.readLine()) != null) {
                if (!text.trim().isEmpty()) {
                    heap[index++] = Integer.parseInt(text.trim());
                }
            }
            if (index < k) {
                heap = Arrays.copyOf(heap, index); // the file held fewer than k values
            }

            buildHeap(heap); // build the max-heap; the maximum ends up at heap[0]

            // Scan the remaining n - k values. If a value is smaller than heap[0],
            // it replaces heap[0] and the heap is repaired.
            while ((text = reader.readLine()) != null) {
                if (!text.trim().isEmpty()) {
                    int x = Integer.parseInt(text.trim());
                    if (x < heap[0]) {
                        heap[0] = x;
                        maxHeap(heap);
                    }
                }
            }
        }
        long end = System.nanoTime();
        long time = end - start;
        System.out.println("Time spent: " + time + " ns");
        return heap;
    }

    /**
     * Builds a max-heap in place, bottom-up, in O(k).
     *
     * @param heap the array to arrange as a heap
     */
    public void buildHeap(int[] heap) {
        for (int i = heap.length / 2 - 1; i >= 0; i--) {
            siftDown(heap, i);
        }
    }

    /**
     * Restores the heap after heap[0] has been replaced, in O(lg k).
     *
     * @param heap a max-heap whose root may be out of place
     */
    public void maxHeap(int[] heap) {
        siftDown(heap, 0);
    }

    /**
     * Moves heap[i] down until both of its children are no larger than it.
     */
    private void siftDown(int[] heap, int i) {
        int n = heap.length;
        while (true) {
            int left = 2 * i + 1;
            int right = left + 1;
            int largest = i;
            if (left < n && heap[left] > heap[largest]) {
                largest = left;
            }
            if (right < n && heap[right] > heap[largest]) {
                largest = right;
            }
            if (largest == i) {
                return;
            }
            swap(heap, i, largest);
            i = largest;
        }
    }

    /**
     * Swaps two elements of the heap array.
     */
    private void swap(int[] heap, int i, int j) {
        int temp = heap[i];
        heap[i] = heap[j];
        heap[j] = temp;
    }
}

Test class:

public class Test {

    public static void main(String[] args) throws Exception {
        FindMinNumIncluedTopN fmnt4 = new FindMinNumIncluedTopN();
        int[] heap = fmnt4.findMinNumIncluedTopN(4);
    }
}

Reading a 14.2 MB file (more than 1.3 million records) and finding the 4 smallest values takes 0.6 seconds on average. That is a very good result, considering that my computer's hardware is quite poor: an aging, unbranded dual-core CPU.
