9.10 Scalability and Memory Limits (II): Given an input file containing 4 billion non-negative integers, produce an integer that is not in the file. Memory limit: 1 GB

Source: Internet
Author: User

/**
 * Problem: Given an input file containing 4 billion non-negative integers, produce an integer that is not in the file. Memory limit: 1 GB.
 * Advanced: the memory limit is 10 MB.
 */


/**
 * Idea:
 *
 * 1) Create a bit vector containing 4 billion bits. A bit vector (BV) is an array of
 *    integers (or another data type) that stores Boolean values compactly; each
 *    32-bit int holds 32 Boolean values.
 * 2) Initialize every bit of the BV to 0.
 * 3) Scan all the numbers (num) in the file and call BV.set(num, 1).
 * 4) Scan the BV again, starting from index 0.
 * 5) Return the first index whose value is 0.
 *
 * Sizing:
 * 1) 2^32 --> about 4 billion distinct integers; each integer takes 4 bytes.
 * 2) 2^30 bytes --> 1 GB --> 2^33 bits --> about 8 billion bits. Each bit maps to
 *    one distinct integer, so 1 GB can track more than 8 billion distinct integers.
 */
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.util.Scanner;

public class FindOpenNumber {
    // Non-negative ints range over 0..Integer.MAX_VALUE, i.e. 2^31 values.
    long numberOfInts = ((long) Integer.MAX_VALUE) + 1;
    // One bit per value: 2^31 bits = 2^28 bytes = 256 MB.
    byte[] bitfield = new byte[(int) (numberOfInts / 8)];

    public void findOpenNumber() throws FileNotFoundException {
        Scanner in = new Scanner(new FileReader("F:\\file.txt"));
        while (in.hasNextInt()) {
            int n = in.nextInt();
            /* Use the OR operator to set the bit for n without clearing the other
             * bits of the same byte (for example, decimal 10 corresponds to bit 2
             * of the byte at index 1). */
            bitfield[n / 8] |= 1 << (n % 8);
        }
        in.close();

        for (int i = 0; i < bitfield.length; i++) {
            for (int j = 0; j < 8; j++) {
                /* Check each bit of each byte. The first bit that is 0 gives
                 * the missing value. */
                if ((bitfield[i] & (1 << j)) == 0) {
                    System.out.println(i * 8 + j);
                    return;
                }
            }
        }
    }
}
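The bit-vector idea above can be tried on a small in-memory input before running it against a huge file. The following is only a minimal sketch: it swaps the hand-rolled byte array for the standard java.util.BitSet, and the class name MissingIntSketch and the method firstMissing are hypothetical names chosen for illustration. It assumes every input value is non-negative.

```java
import java.util.BitSet;

class MissingIntSketch {
    // Returns the smallest non-negative integer that does not occur in values.
    // Assumes every element of values is non-negative.
    static int firstMissing(int[] values) {
        BitSet seen = new BitSet();
        for (int v : values) {
            seen.set(v);              // mark v as present
        }
        return seen.nextClearBit(0);  // first index whose bit is still 0
    }

    public static void main(String[] args) {
        System.out.println(firstMissing(new int[] {0, 1, 2, 4, 5})); // prints 3
    }
}
```

BitSet grows on demand, so this version only allocates as many bits as the largest value seen; the fixed-size byte array in the article instead reserves the full range up front.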

Advanced: 10MB

/**
 * Idea: With two passes over the data set we can find an integer that is not in
 * the file. Divide the integers into blocks of equal size.
 * First pass (array): count how many values fall into each block.
 * Second pass (bit vector): find which number is missing within the chosen block.
 * (The counting test assumes the values in the file are distinct.)
 *
 * bitsize:  the size of each block's range in the first pass.
 * blockNum: the number of blocks in the first pass.
 *
 * Sizing:
 * 1) 10 MB --> about 2^23 bytes. An int takes 4 bytes, so the counts array can
 *    hold at most about 2^21 elements.
 * 2) blockNum = 2^32 / bitsize <= 2^21, so bitsize >= 2^11. The bit vector must
 *    also fit in 2^23 bytes = 2^26 bits, so 2^11 <= bitsize <= 2^26.
 * Within this range, the closer to the middle the chosen value is, the less
 * memory is in use at any one time.
 */
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.util.Scanner;

public class FindOpenNumber2 {
    int bitsize = 1048576;                       // 2^20 values per block
    int blockNum = 4096;                         // 2^12 blocks
    byte[] bitfield2 = new byte[bitsize / 8];
    int[] blocks = new int[blockNum];

    public void findOpenNumber2() throws FileNotFoundException {
        int starting = -1;
        Scanner in = new Scanner(new FileReader("F:\\file.txt"));
        while (in.hasNextInt()) {
            int n = in.nextInt();
            blocks[n / (bitfield2.length * 8)]++;
        }
        in.close();

        for (int i = 0; i < blocks.length; i++) {
            /* If the count is less than the block's range size, at least one
             * number is missing in that block. */
            if (blocks[i] < bitfield2.length * 8) {
                starting = i * bitfield2.length * 8;
                break;
            }
        }
        if (starting < 0) {
            return; // every block is full, so no missing number was found
        }

        in = new Scanner(new FileReader("F:\\file.txt"));
        while (in.hasNextInt()) {
            int n = in.nextInt();
            /* Only record numbers inside the chosen block. Comparing n - starting
             * avoids int overflow for the last block. */
            if (n >= starting && n - starting < bitfield2.length * 8) {
                bitfield2[(n - starting) / 8] |= 1 << ((n - starting) % 8);
            }
        }
        in.close();

        for (int i = 0; i < bitfield2.length; i++) {
            for (int j = 0; j < 8; j++) {
                /* Check each bit of each byte. The first bit that is 0 gives
                 * the missing value. */
                if ((bitfield2[i] & (1 << j)) == 0) {
                    System.out.println(i * 8 + j + starting);
                    return;
                }
            }
        }
    }
}
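The two-pass idea above is easier to follow at toy scale. The sketch below is an illustration under stated assumptions, not the article's implementation: it works on an in-memory array instead of a file, uses a boolean array in place of the bit vector for readability, and the names TwoPassSketch, findMissing, rangeSize and blockCount are hypothetical. It assumes the values are distinct, non-negative, and smaller than rangeSize * blockCount.

```java
class TwoPassSketch {
    // Returns a value in [0, rangeSize * blockCount) that is absent from values,
    // or -1 if every block is full.
    // Assumes values are distinct, non-negative and below rangeSize * blockCount.
    static int findMissing(int[] values, int rangeSize, int blockCount) {
        // Pass 1: count how many values fall into each block.
        int[] counts = new int[blockCount];
        for (int v : values) {
            counts[v / rangeSize]++;
        }

        // Pick the first block that holds fewer than rangeSize values.
        int start = -1;
        for (int i = 0; i < blockCount; i++) {
            if (counts[i] < rangeSize) {
                start = i * rangeSize;
                break;
            }
        }
        if (start < 0) {
            return -1;
        }

        // Pass 2: mark which values of the chosen block are present.
        boolean[] seen = new boolean[rangeSize];
        for (int v : values) {
            if (v >= start && v - start < rangeSize) {
                seen[v - start] = true;
            }
        }
        for (int i = 0; i < rangeSize; i++) {
            if (!seen[i]) {
                return start + i;
            }
        }
        return -1; // unreachable when the values are distinct
    }

    public static void main(String[] args) {
        int[] values = {0, 1, 2, 3, 4, 5, 6, 7, 9, 10, 11};
        System.out.println(findMissing(values, 4, 3)); // prints 8
    }
}
```

With a range size of 4, the blocks [0..3] and [4..7] each hold 4 values, while [8..11] holds only 3, so the second pass looks only at that block and reports 8.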


Copyright notice: This is an original article by the blogger and may not be reproduced without the blogger's permission.
