Reading Notes: "Programming Pearls", Chapter 1: Bit Vectors & Bitmaps

Source: Internet
Author: User

The book is organized as a series of algorithms, data structures and related techniques, each one derived from a concrete problem. It has three parts: fundamentals, performance, and applications. Each part contains several chapters; the cases are practical and challenging, and the explanations are lively and engaging.

This is the first book on programming technique I have picked up, and my background in algorithms and data structures is weak, so reading it feels like pure self-inflicted punishment, Orz. Shutting myself in to get beaten up like this is really just a product of boredom. You could also say that since graduating I treat reading as going into battle: I see myself living in "peacetime", idle for too long, with little sense of urgency and a tendency to shrink back. That said, this book is not as brutal an opponent as CLRS (my head hurts just thinking about it). "Programming Pearls" is a good opponent, just right for chasing away boredom and practicing skills!

Enough rambling. Let's see what the first chapter says.

Problem

How do I sort a disk file that contains at most 10 million records, each of which is a 7-digit integer?

Precise statement

The book gives a precise statement of the problem:

Input: a file containing at most n positive integers, each less than n, where n = 10^7. It is a fatal error if any integer occurs twice in the input. No other data is associated with the integers.

Output: a list of the input integers sorted in ascending order.

Constraints: at most (roughly) 1 MB of main memory is available, and disk storage is plentiful. The run time may be at most a few minutes; a run time of 10 seconds needs no further optimization.

Solutions
    1. Merge sort. Merge sort needs O(n) auxiliary space, which does not fit in 1 MB of memory, so it has to spill into one or more extra work files. Advantage: the input file is read only once. Disadvantage: the work files are written and read many times.

    2. Read the data in batches and quick sort each batch. Each integer is stored as a 32-bit (4-byte) int, whose signed maximum 2^31-1 is larger than 9 999 999. 1 MB of memory holds about 10^6/4 = 250 000 such integers, so 10^7/(10^6/4) = 40 passes are needed: each pass reads the whole file, keeps only the integers that fall in the current range of 250 000 values, sorts them, and writes them out. Advantage: each in-memory sort is fast and no work files are needed. Disadvantage: the input is read 40 times (a multi-pass algorithm).

    3. Use a bitmap. Picture a large table with (10^7/32 + 1) rows and 32 columns, where each row is one 32-bit int and the columns are bit positions 31 down to 0:

      31 30 ... 1 0
      31 30 ... 1 0
      ......
      31 30 ... 1 0
      31 30 ... 1 0

If the data is {2,3,5}, the first row becomes 0000 0000 0000 0000 0000 0000 0010 1100 (bits 2, 3 and 5 are set). Every integer in the file maps to exactly one position in this table. As you have probably guessed, the table is simply an int[] of length 10^7/32 + 1. The algorithm is as follows:

    package chpt1;

    import java.util.ArrayList;
    import java.util.List;

    /**
     * Created by WQI on 2016/10/22.
     */
    public class Bitsort {
        // A bit vector, also called a bitmap; the name bitMap is used here because it is easier to picture.
        private int[] bitMap;

        // i >> 5 is i / 32: every 32 numbers share one int, so this is the array index.
        // i & 0x1f is i modulo 32: the bit for this number inside that int. |= sets it to 1.
        private void set(int i) {
            this.bitMap[i >> 5] |= (1 << (i & 0x1f));
        }

        // Same addressing as set(); &= with the inverted mask clears the bit to 0.
        private void clr(int i) {
            this.bitMap[i >> 5] &= ~(1 << (i & 0x1f));
        }

        // Returns true if the bit for i is 1. The masked value is 1 << (i & 0x1f),
        // which equals 1 only for bit 0, so compare with != 0 rather than == 1.
        private boolean test(int i) {
            return (this.bitMap[i >> 5] & (1 << (i & 0x1f))) != 0;
        }

        // Expects distinct, non-negative integers and returns them in ascending order.
        public List<Integer> sort(List<Integer> list) {
            // The bitmap is sized by the largest value, not by the number of elements.
            int max = 0;
            for (int value : list) {
                max = Math.max(max, value);
            }
            this.bitMap = new int[max / 32 + 1];

            // Phase 1: reset every bit to 0 (Java zero-fills new arrays; kept to mirror the book).
            for (int i = 0; i <= max; i++) {
                clr(i);
            }
            // Phase 2: read each integer and set its bit to 1.
            for (int value : list) {
                set(value);
            }
            // Phase 3: scan every bit; if the bit is 1, output the number.
            List<Integer> sorted = new ArrayList<>(list.size());
            for (int i = 0; i <= max; i++) {
                if (test(i)) {
                    sorted.add(i);
                }
            }
            return sorted;
        }
    }
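To make the index arithmetic concrete, here is a small worked example of where a number lands in the table. It is only a sketch: the class and method names (BitAddressDemo, wordIndex, bitMask, build, toBinary) are invented for illustration and are not from the book.

```java
class BitAddressDemo {
    // Row of the table (index into the int[]) that holds the bit for i: i / 32.
    static int wordIndex(int i) {
        return i >> 5;
    }

    // Mask with a single 1 in the column for i: bit position i % 32.
    static int bitMask(int i) {
        return 1 << (i & 0x1f);
    }

    // Builds the table for the given non-negative values, all assumed <= maxValue.
    static int[] build(int maxValue, int... values) {
        int[] words = new int[maxValue / 32 + 1];
        for (int v : values) {
            words[wordIndex(v)] |= bitMask(v);
        }
        return words;
    }

    // Renders one row as 32 binary digits (bit 31 first), grouped in fours.
    static String toBinary(int word) {
        StringBuilder sb = new StringBuilder();
        for (int bit = 31; bit >= 0; bit--) {
            sb.append((word >>> bit) & 1);
            if (bit % 4 == 0 && bit != 0) {
                sb.append(' ');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        int[] words = build(63, 2, 3, 5, 37);
        // Row 0 holds 2, 3 and 5: 0000 0000 0000 0000 0000 0000 0010 1100
        System.out.println(toBinary(words[0]));
        // 37 = 32 + 5, so it lands in row 1, bit 5: 0000 0000 0000 0000 0000 0000 0010 0000
        System.out.println(toBinary(words[1]));
    }
}
```

The value 37 shows why two steps are needed: the shift picks the row, the mask picks the column inside it.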

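Advantage: the file is read only once and no extra work files are needed, and because it uses only basic bit operations it is fast. Disadvantage: none for this problem, which makes the algorithm very friendly here. It does, however, lean on the special properties in the problem statement: a bounded range of values, no duplicates, and no other data attached to each integer.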
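For contrast with the bitmap, solution 2 (the multi-pass approach) might look roughly like the sketch below. It is not the book's code. The names MultiPassSort, passes, sort and chunk are made up here, an int[] stands in for the disk file, and the input is assumed to hold distinct integers in [0, n) as the problem statement requires. Arrays.sort on an int[] is the JDK's dual-pivot quicksort, which plays the role of the quick sort in that solution.

```java
import java.util.Arrays;

class MultiPassSort {
    // Number of passes needed when each pass can hold `chunk` integers in memory.
    static int passes(int n, int chunk) {
        return (n + chunk - 1) / chunk;
    }

    // Sorts distinct integers from [0, n) using a buffer of only `chunk` ints.
    // `input` plays the role of the file and is rescanned once per pass.
    static int[] sort(int[] input, int n, int chunk) {
        int[] output = new int[input.length];
        int[] buffer = new int[chunk];
        int written = 0;
        int low = 0;
        while (low < n) {
            int high = low + Math.min(chunk, n - low); // this pass handles [low, high)
            int count = 0;
            for (int value : input) {                  // one full read of the "file"
                if (value >= low && value < high) {
                    buffer[count++] = value;           // cannot overflow: values are distinct
                }
            }
            Arrays.sort(buffer, 0, count);
            System.arraycopy(buffer, 0, output, written, count);
            written += count;
            low = high;
        }
        return output;
    }

    public static void main(String[] args) {
        System.out.println(passes(10_000_000, 250_000)); // 40
        System.out.println(Arrays.toString(sort(new int[] {7, 1, 9, 3, 0, 5}, 10, 4)));
    }
}
```

The inner loop runs once per pass over the whole input, which is exactly the cost the bitmap avoids by reading the file a single time.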

Summary

The English title of this chapter is Cracking the Oyster, which can be read as "the wonderful (and also cracked-open) oyster". As the opening overview says, thinking hard about a case study is not only interesting but also has practical benefits. The title is very apt: the oyster is delicious, and the beauty of the pearl is plain to see. Another digression: the word "oyster" seems to carry a very positive meaning in Western countries. Shakespeare has the line "the world's mine oyster", which can be loosely translated as "I can do whatever I like".

The bitmap is a really interesting data abstraction! It cleverly solves a problem that at first looks like it involves a lot of data. I have also read online that this data structure is worth considering in many situations where large volumes of data are processed. I still need to learn more about those uses and hope to write up a few of them here, so that I can generalize from this one example. In addition, Java already ships a very simple implementation of the bitmap concept, java.util.BitSet. A set created with new BitSet() initially covers the range 0~63 (JDK version: 1.8.0_101). If you then call set(64) on it, it grows automatically by doubling its current capacity, that is, to 0~127.
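A short sketch of the same sort written with java.util.BitSet follows. The class and method names (BitSetSortDemo, sortDistinct) are invented for illustration, and the input is assumed to be distinct non-negative integers. The size() calls are there to observe the growth described above; the javadoc describes size() as the number of bits of space actually in use, so the exact values depend on the JDK implementation.

```java
import java.util.Arrays;
import java.util.BitSet;

class BitSetSortDemo {
    // Sorts distinct non-negative integers by setting one bit per value
    // and then walking the set bits in ascending order.
    static int[] sortDistinct(int[] values) {
        BitSet bits = new BitSet();
        for (int v : values) {
            bits.set(v);                      // grows automatically when v is out of range
        }
        int[] sorted = new int[bits.cardinality()];
        int k = 0;
        for (int i = bits.nextSetBit(0); i >= 0; i = bits.nextSetBit(i + 1)) {
            sorted[k++] = i;
        }
        return sorted;
    }

    public static void main(String[] args) {
        BitSet bits = new BitSet();
        System.out.println(bits.size());      // capacity in bits before any growth
        bits.set(64);
        System.out.println(bits.size());      // capacity in bits after setting bit 64
        System.out.println(Arrays.toString(sortDistinct(new int[] {5, 2, 3})));
    }
}
```

Compared with the hand-written int[] version, BitSet takes care of sizing and of the shift-and-mask arithmetic, and nextSetBit skips over runs of zero bits.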

That's all for now: October 22, 2016 16:56:01. The end-of-chapter exercises depend heavily on understanding the material above and do not vary much from it. The answer code in the book is written in C++, and since I only know Java I could not be bothered to port it (seeing qsort makes me envy the C-family programmers). Tomorrow I will see whether there are any interesting exercises worth adding. #TomorrowsDebtsGetPaidTomorrow#

;-)

