[Algorithm series 18] Bitmaps for massive data processing

Source: Internet
Author: User

One: Introduction

A bitmap uses a single bit to mark whether an element is present. The element's value serves as the key, that is, as the position of its bit. Because each element takes one bit rather than a whole word, the storage space can be greatly reduced.

Two: Basic ideas

Let's take a concrete example. Suppose we want to sort 5 elements (4, 7, 2, 5, 3) that all lie within 0-7 (assuming no element is repeated). We can use a bitmap to do the sorting. To represent 8 numbers, we need only 8 bits (1 byte).
(1) First we allocate 1 byte (8 bits) of space and set every bit to 0:

    position:  0 1 2 3 4 5 6 7
    bit:       0 0 0 0 0 0 0 0

(2) Then we traverse the 5 elements. The 1st element is 4, so we set the bit at position 4 to 1. Because positions are zero-based, this is the 5th bit:

    position:  0 1 2 3 4 5 6 7
    bit:       0 0 0 0 1 0 0 0

The 2nd element is 7, so the 8th bit is set to 1, and so on for the remaining elements. Once all elements have been processed, the memory bits are in the following state:

    position:  0 1 2 3 4 5 6 7
    bit:       0 0 1 1 1 1 0 1

(3) Finally we scan the bits in order and output the position of every bit that is 1 (2, 3, 4, 5, 7). The output is in sorted order.
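
The three steps above can be sketched in C++ as follows. This is a minimal illustration rather than the author's code: the function name bitmapSort8 is hypothetical, and it assumes every input value lies in 0-7 and no value repeats.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: sorts distinct values in the range 0-7 using one byte.
// Assumption: every value is in [0, 7] and no value appears twice.
std::vector<int> bitmapSort8(const std::vector<int>& values) {
    std::uint8_t bits = 0;  // step (1): one byte, all 8 bits cleared

    for (int v : values) {
        // step (2): set the bit whose position equals the value
        bits |= static_cast<std::uint8_t>(1u << v);
    }

    std::vector<int> sorted;
    for (int pos = 0; pos < 8; ++pos) {
        // step (3): scan positions in ascending order, emit those that are set
        if (bits & (1u << pos)) {
            sorted.push_back(pos);
        }
    }
    return sorted;
}
```

Calling bitmapSort8({4, 7, 2, 5, 3}) returns 2, 3, 4, 5, 7. The scan always visits all 8 positions, so the cost depends on the size of the value range, not only on the number of elements.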

The algorithm is relatively simple. The key question is how to map a decimal number to its bit in the bitmap.

Three: Mapping

Suppose we need to sort or search N = 10000000 numbers.
The bitmap uses 1 bit to represent one number.
1 int = 4 bytes = 4*8 bits = 32 bits, so N numbers require N/32 ints of space. We therefore request an array int a[1 + N/32], where a[0] occupies 32 bits in memory and covers the decimal numbers 0-31, and so on:

The bitmap table is:

    a[0]  --------->  0-31
    a[1]  --------->  32-63
    a[2]  --------->  64-95
    a[3]  --------->  96-127
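
The sizing rule and the table above can be checked with a small sketch. The names kBitsPerWord, wordsNeeded and rangeOfWord are hypothetical, and the code assumes a 32-bit word as in the text.

```cpp
#include <cstddef>
#include <utility>

// Assumption: one array element holds 32 bits, as in the text.
constexpr std::size_t kBitsPerWord = 32;

// Number of 32-bit words requested for N numbers, following int a[1 + N/32].
constexpr std::size_t wordsNeeded(std::size_t n) {
    return 1 + n / kBitsPerWord;
}

// Inclusive range of decimal numbers covered by word a[i].
std::pair<std::size_t, std::size_t> rangeOfWord(std::size_t i) {
    return {i * kBitsPerWord, i * kBitsPerWord + (kBitsPerWord - 1)};
}
```

For N = 10000000 this gives 312501 words, about 1.25 MB at 4 bytes per word, compared with roughly 40 MB for storing each number as a 4-byte int.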

Next, how is a decimal number converted to its bit? The following describes how shift operations do the conversion.

A one-dimensional int array can be treated as a two-dimensional array in which every row has 32 columns, one per bit.

    a[0]   | 32 bits |
    a[1]   | 32 bits |
    a[2]   | 32 bits |
    a[3]   | 32 bits |
    ...
    a[i]   | 32 bits |
    ...
    a[n]   | 32 bits |

For example:
Decimal 1 is in a[0], at bit position 1.
Decimal 31 is in a[0], at bit position 31.
Decimal 32 is in a[1], at bit position 0.
Decimal 33 is in a[1], at bit position 1.

The analysis above shows that a decimal number is converted to its bit in the following steps:

(1) Find the subscript in array a for the decimal number

Decimal numbers 0-31 correspond to a[0], 32-63 correspond to a[1], 64-95 correspond to a[2], ...
Analysis: a decimal number n corresponds to a[n/32] (integer division).
For example, if n = 11 then n/32 = 0, so 11 falls in a[0]; if n = 32 then n/32 = 1, so 32 falls in a[1]; if n = 106 then n/32 = 3, so 106 falls in a[3].

(2) Find the bit subscript of the decimal number within a[i]

For example, decimal 1 is at subscript 1 of a[0], decimal 31 is at subscript 31 of a[0], and decimal 32 is at subscript 0 of a[1].
Decimal numbers 0-31 map to subscripts 0-31, and 32-63 map to 0-31 again. That is, the subscript of a number n within a[i] is n modulo 32.
Analysis: a decimal number n corresponds to a[n/32][n%32].
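
Steps (1) and (2) amount to one division and one modulo. A minimal sketch, in which the function name locate is hypothetical and n is assumed to be non-negative:

```cpp
#include <utility>

// Hypothetical helper: returns {word subscript, bit subscript} for n,
// i.e. the pair (n/32, n%32) described in steps (1) and (2).
std::pair<unsigned, unsigned> locate(unsigned n) {
    unsigned word = n / 32;  // which array element a[word] holds n
    unsigned bit = n % 32;   // which bit inside that element
    return {word, bit};
}
```

With this helper, locate(11) is (0, 11), locate(32) is (1, 0), and locate(106) is (3, 10), which agrees with the examples in the text.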

(3) Shift

A decimal number n corresponds to a[n/32][n%32], but a is not actually a two-dimensional array, so we reach the bit by shifting 1 to the left:
a[n/32] |= 1 << (n % 32)
With shift and mask operations:
a[n>>5] |= 1 << (n & 0x1F)

n & 0x1F keeps the low five bits of n, which equals n % 32, the subscript of the decimal number within a[i].
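
The shift form works because 32 = 2^5: shifting right by 5 divides by 32, and masking with 0x1F (binary 11111) keeps the remainder. The sketch below checks this for a range of values. The function name shiftMatchesDivision is hypothetical, and the check uses unsigned values, because for negative signed integers n % 32 and n & 0x1F can differ.

```cpp
// Hypothetical check: returns true if, for every n in [0, limit),
// n >> 5 equals n / 32 and n & 0x1F equals n % 32.
bool shiftMatchesDivision(unsigned limit) {
    for (unsigned n = 0; n < limit; ++n) {
        if ((n >> 5) != n / 32) {
            return false;
        }
        if ((n & 0x1F) != n % 32) {
            return false;
        }
    }
    return true;
}
```

Modern compilers usually perform this rewrite themselves for unsigned operands, so the shift form is mainly a matter of making the intent explicit.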

    /*---------------------------------
    *   Date: 2015-02-07
    *   Author: sjf0115
    *   Title: bitmap
    *   Blog:
    ---------------------------------*/
    #include <iostream>
    #include <vector>
    using namespace std;

    #define N 1000000000
    // Request the memory: one bit per number, 32 numbers per element.
    // unsigned int is used so that setting bit 31 is well defined.
    unsigned int a[1 + N / 32];

    // Set the bit for n to 1
    void BitMap(int n) {
        // row = n / 32: the subscript of the decimal number in array a
        int row = n >> 5;
        // n & 0x1F keeps the low five bits of n,
        // equivalent to n % 32: the subscript of the decimal number in a[row]
        a[row] |= 1u << (n & 0x1F);
    }

    // Determine whether the bit for n is 1
    bool Exits(int n) {
        int row = n >> 5;
        return (a[row] & (1u << (n & 0x1F))) != 0;
    }

    // Print the first `row` elements of a, highest bit first
    void Show(int row) {
        cout << "Bitmap display:" << endl;
        for (int i = 0; i < row; ++i) {
            vector<int> vec;
            unsigned int tmp = a[i];
            for (int j = 0; j < 32; ++j) {
                vec.push_back(tmp & 1);
                tmp >>= 1;
            }
            cout << "a[" << i << "]" << " -> ";
            for (int j = static_cast<int>(vec.size()) - 1; j >= 0; --j) {
                cout << vec[j] << " ";
            }
            cout << endl;
        }
    }

    int main() {
        // The original list had 12 values; only 1, 5 and 159 survived
        // extraction, so only those are used here.
        int num[] = {1, 5, 159};
        int count = sizeof(num) / sizeof(num[0]);
        for (int i = 0; i < count; ++i) {
            BitMap(num[i]);
        }
        int row = 5;
        Show(row);
        /*
        if (Exits(n)) {
            cout << "This number already exists" << endl;
        } else {
            cout << "This number does not exist" << endl;
        }
        */
        return 0;
    }

Application scope

Bitmaps can be used for fast lookup, deduplication, sorting, data compression, and so on.
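
As one illustration of these uses, the sketch below removes duplicates with a bitmap. It is a hypothetical example rather than code from the article: the function name removeDuplicates and the parameter maxValue are assumptions, and every input value is assumed to lie in [0, maxValue].

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: keeps the first occurrence of each value.
// Assumption: every value in input is in [0, maxValue].
std::vector<std::uint32_t> removeDuplicates(
        const std::vector<std::uint32_t>& input, std::uint32_t maxValue) {
    // One bit per possible value, 32 values per word.
    std::vector<std::uint32_t> seen(1 + maxValue / 32, 0u);
    std::vector<std::uint32_t> unique;

    for (std::uint32_t v : input) {
        std::uint32_t mask = std::uint32_t{1} << (v & 0x1F);
        if ((seen[v >> 5] & mask) == 0) {  // lookup: bit not set yet
            seen[v >> 5] |= mask;          // mark the value as seen
            unique.push_back(v);
        }
    }
    return unique;
}
```

The same test on seen[v >> 5] is the fast lookup mentioned above, and scanning seen from the lowest bit upward would yield the values in sorted order.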

Extension

A Bloom filter can be seen as an extension of the bitmap.
For details on Bloom filters, see: [Algorithm series 10] A powerful tool for processing large volumes of data: Bloom filter

Specific applications

To be completed ....

