Programming Pearls Reading Notes: Sorting a Disk File

Source: Internet
Author: User

Input:

The input file contains at most n positive integers, each less than n, where n = 10^7. It is a fatal error if any integer appears twice in the input. No other data is associated with these integers.

Output:

The input integers as a list sorted in ascending order.

Basic idea: Represent the file as a string of 10 million bits. Bit i of the string is set to 1 if and only if integer i is in the file. The solution then falls into three natural phases. The first phase turns every bit off, which initializes the set to empty. The second phase reads each integer in the file and turns on the corresponding bit, which builds the set. The third phase inspects each bit in order and, if a bit is 1, writes the corresponding integer, which produces the sorted output file. If n is the number of bits in the string (10,000,000 in this example), the program can be expressed in pseudocode as follows:

/* Phase 1: initialize set to empty */
for i = [0, n)
    bit[i] = 0
/* Phase 2: insert present elements into the set */
for each i in the input file
    bit[i] = 1
/* Phase 3: write sorted output */
for i = [0, n)
    if bit[i] == 1
        write i on the output file
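
The pseudocode above can be sketched in runnable form. The version below is a minimal Python illustration, not the book's own program: the function name bitmap_sort is hypothetical, the bit string is packed eight bits to a byte in a bytearray, and the range and duplicate checks are additions based on the input rules stated earlier.

```python
def bitmap_sort(numbers, n):
    """Sort distinct integers in the range [0, n) using one bit per possible value."""
    # Phase 1: initialize the set to empty (a new bytearray is all zero bits).
    bits = bytearray((n + 7) // 8)

    # Phase 2: insert each present element by turning on its bit.
    for i in numbers:
        if not 0 <= i < n:
            raise ValueError(f"{i} is outside the range [0, {n})")
        byte, mask = i >> 3, 1 << (i & 7)
        if bits[byte] & mask:
            # The problem statement treats a repeated integer as a fatal error.
            raise ValueError(f"{i} appears more than once")
        bits[byte] |= mask

    # Phase 3: scan the bits in order and emit every integer whose bit is on.
    return [i for i in range(n) if bits[i >> 3] & (1 << (i & 7))]


print(bitmap_sort([9, 2, 7, 0, 4], 10))  # prints [0, 2, 4, 7, 9]
```

With n = 10^7 the bytearray occupies 1,250,000 bytes, which is why the bit string fits in memory when a list of full-width integers might not.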

Principles:

Bitmap data structure: This data structure represents a dense set over a finite domain when each element occurs at most once and no other data is associated with the element. Even when these conditions do not hold (for example, when elements repeat or carry extra data), a key from a finite domain can still be used as an index into a table whose entries are more complex.
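
One way to picture the "table with more complex entries" is to replace each bit with a counter, so that repeated integers are allowed. This is only a sketch of that idea; the name counting_sort and the choice of a plain Python list for the table are assumptions made for illustration.

```python
def counting_sort(numbers, n):
    """Sort integers in [0, n) that may repeat, using a table of counts."""
    counts = [0] * n              # counts[i] = how many times i occurred
    for i in numbers:
        counts[i] += 1            # the key itself is the table index

    result = []
    for i, c in enumerate(counts):
        result.extend([i] * c)    # emit i once per occurrence
    return result


print(counting_sort([3, 1, 3, 0, 1, 3], 4))  # prints [0, 1, 1, 3, 3, 3]
```

The structure is the same as the bitmap program; only the table entry has grown from a single bit to an integer.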

Multiple-pass algorithms: These algorithms make several passes over the input data, accomplishing a little more on each pass.
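
As an illustration of the multiple-pass idea applied to this problem, the sketch below sorts the file while holding flags for only part of the value range at a time, rereading the input once per range. The names multipass_sort, read_input, and window are hypothetical, and one byte per flag is used instead of one bit to keep the example short.

```python
def multipass_sort(read_input, n, window):
    """Yield the distinct integers in [0, n) in order, holding only `window` flags at once."""
    for low in range(0, n, window):
        high = min(low + window, n)
        present = bytearray(high - low)   # one flag per value in [low, high)
        for i in read_input():            # one full pass over the input
            if low <= i < high:
                present[i - low] = 1
        for offset, flag in enumerate(present):
            if flag:
                yield low + offset


data = [9, 2, 7, 0, 4]
# read_input must return a fresh iterator each time, like reopening the file.
print(list(multipass_sort(lambda: iter(data), 10, 4)))  # prints [0, 2, 4, 7, 9]
```

A smaller window means less memory but more passes over the input, so the design trades running time for space.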

Time-space trade-offs: Running time and memory use are often traded against each other, so neither can be ignored when designing a program.

Simple design: Compared with complex programs, simple programs are generally more reliable, secure, robust, and efficient, and they are easier to build and maintain.

