Input:
The input file contains at most n positive integers, each less than n, where n = 10^7. It is a fatal error if any integer appears twice in the input. No other data is associated with the integers.
Output:
A list of the input integers, sorted in ascending order.
Basic idea: Represent the file with a string of 10 million bits in which the i-th bit is on (set to 1) if and only if the integer i is in the file. The solution divides naturally into three phases. In the first phase, all bits are turned off, which initializes the set to empty. In the second phase, each integer in the file is read and the corresponding bit is turned on, which builds the set. In the third phase, each bit is inspected; if a bit is 1, the corresponding integer is written, which produces the sorted output file. If n is the number of bits in the bit string (10,000,000 in this example), the program can be expressed in pseudocode as follows:
/* Phase 1: initialize set to empty */
for i = [0, n)
    bit[i] = 0
/* Phase 2: insert present elements into the set */
for each i in the input file
    bit[i] = 1
/* Phase 3: write sorted output */
for i = [0, n)
    if bit[i] == 1
        write i on the output file
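The three phases above can be sketched in Python. This is a minimal illustration, not the original program: the function name `bitmap_sort`, the choice of a `bytearray` packed at 8 bits per byte, and the use of `ValueError` for the "fatal error" cases are all assumptions made for this example.

```python
def bitmap_sort(values, n):
    """Sort distinct integers in the range [0, n) using one bit per possible value."""
    # Phase 1: initialize the set to empty (all bits off), packing 8 bits per byte.
    bits = bytearray((n + 7) // 8)

    # Phase 2: insert present elements into the set.
    for i in values:
        if not 0 <= i < n:
            raise ValueError(f"{i} is outside the range [0, {n})")
        byte, mask = i >> 3, 1 << (i & 7)
        if bits[byte] & mask:
            # The problem statement treats a repeated integer as a fatal error.
            raise ValueError(f"{i} appears more than once")
        bits[byte] |= mask

    # Phase 3: write sorted output by scanning the bits in increasing order.
    return [i for i in range(n) if bits[i >> 3] & (1 << (i & 7))]


print(bitmap_sort([8, 3, 13, 1, 5, 2], 20))  # prints [1, 2, 3, 5, 8, 13]
```

With n = 10^7 the bit string occupies 10,000,000 / 8 = 1,250,000 bytes, about 1.25 MB, regardless of how many integers the file actually contains.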
Principles:
Bitmap data structure: This data structure represents a dense set over a finite domain when each element appears at most once and no other data is associated with the element. Even when these conditions are not met (for example, when elements repeat or carry extra data), keys from a finite domain can still be used as indexes into a table with more complex entries.
Multiple-pass algorithms: These algorithms make several passes over the input data, accomplishing a little more on each pass.
Time-space trade-offs: Neither time nor space can be ignored.
Simple design: Compared with complex programs, simple programs are generally more reliable, secure, robust, and efficient, and they are easier to build and maintain.
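Two of the principles above can be illustrated with short sketches. The first replaces each bit with a count, one possible form of a "table with more complex entries", so repeated integers are allowed. The second makes several passes over the input, each pass handling only one slice of the value range, so less memory is needed at a time. The names `counting_sort`, `multipass_sort`, `read_input`, and `values_per_pass` are hypothetical and chosen only for this example; the multi-pass version uses one byte per value for readability rather than packed bits.

```python
def counting_sort(values, n):
    """Table variant: counts[i] records how many times i occurs, so duplicates are allowed."""
    counts = [0] * n
    for i in values:
        counts[i] += 1
    out = []
    for i in range(n):
        out.extend([i] * counts[i])
    return out


def multipass_sort(read_input, n, values_per_pass):
    """Multiple-pass variant: each pass re-reads the input and handles one slice of [0, n)."""
    out = []
    for lo in range(0, n, values_per_pass):
        hi = min(lo + values_per_pass, n)
        present = bytearray(hi - lo)  # one byte per value in this slice, all zero
        for i in read_input():        # read_input() must return a fresh iterator each pass
            if lo <= i < hi:
                present[i - lo] = 1
        out.extend(lo + k for k in range(hi - lo) if present[k])
    return out


data = [8, 3, 13, 1, 5, 2]
print(counting_sort(data + [3, 8], 20))            # prints [1, 2, 3, 3, 5, 8, 8, 13]
print(multipass_sort(lambda: iter(data), 20, 6))   # prints [1, 2, 3, 5, 8, 13]
```

The multi-pass version trades time for space: with a slice of 6 values it reads the input four times to cover [0, 20), but it never holds more than 6 flags in memory.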