Bitmap Sorting (bitmap technology application)

Source: Internet
Author: User

1. Description of the problem

Sort k integers (k < n), each of which is distinct and less than n. The material discussed in this article comes from the first chapter of Programming Pearls (second edition).

2. Problem Analysis

Many sorting methods are available: insertion sort, merge sort, quicksort, Shell sort, and so on, each with its own niche. Why do we need bitmap sorting? All of the internal sorts mentioned above require every element to be loaded into memory at once. With 1,000,000 integers of 4 bytes each, that means at least 4,000,000 bytes, about 4 MB of memory. If only 1 MB of memory is available, what should be done?
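The arithmetic behind this memory question can be checked with a short sketch. It compares an int array against one bit per possible value, which is the idea the following paragraphs develop. The class and method names here are illustrative and do not come from the original article.

```java
class MemoryEstimate {
    // Bytes needed to hold `count` 32-bit integers in an int array.
    static long bytesForIntArray(long count) {
        return count * Integer.BYTES;
    }

    // Bytes needed for a bitmap with one bit per possible value in [0, n).
    static long bytesForBitmap(long n) {
        return (n + 7) / 8; // round up to whole bytes
    }

    public static void main(String[] args) {
        long n = 1_000_000L;
        System.out.println("int array: " + bytesForIntArray(n) + " bytes");
        System.out.println("bitmap:    " + bytesForBitmap(n) + " bytes");
    }
}
```

Note that the bitmap size depends on the range of possible values, not on how many values are actually present.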

Many problems share a general solution strategy, but beyond the general approach it often pays to look for a targeted solution based on the actual requirements and characteristics of the problem. The characteristic here is that all integers are less than n and no integer is repeated. How can this be exploited?

Bitmap technology can be used. Bitmap technology maps the problem onto a bit string, processes the bit string, and then maps the bit string back to the problem space. For example, to sort an array whose elements are all less than 20, [5, 2, 12, 18, 7, 9, 13, 19, 16, 4, 6], map it to the bit string 11010011001011110100, where a 1 marks a value that appears in the array (the highest bit is on the left, the lowest bit is on the right, and bit positions are numbered from 0). Scanning from the low bit to the high bit then gives {2, 4, 5, 6, 7, 9, 12, 13, 16, 18, 19}, which is sorted. With the bitmap technique, sorting distinct integers that are all less than 1,000,000 requires only about 1,000,000 bits = 125,000 bytes, roughly 0.125 MB of memory.
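The mapping in this example can be tried directly. Below is a minimal sketch that assumes every value fits in a single 32-bit int, which holds here because all values are less than 20. The class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class BitStringExample {
    // Builds a mask in which bit v is 1 for every value v in the input.
    // Assumes all values lie in [0, 31] so that one int can hold the mask.
    static int toMask(int[] values) {
        int mask = 0;
        for (int v : values) {
            mask |= 1 << v;
        }
        return mask;
    }

    // Reads the set bits back from low to high, which yields ascending order.
    static List<Integer> fromMask(int mask, int width) {
        List<Integer> sorted = new ArrayList<>();
        for (int i = 0; i < width; i++) {
            if ((mask & (1 << i)) != 0) {
                sorted.add(i);
            }
        }
        return sorted;
    }

    // Renders the lowest `width` bits as text, highest bit first.
    static String toBitString(int mask, int width) {
        StringBuilder sb = new StringBuilder();
        for (int i = width - 1; i >= 0; i--) {
            sb.append((mask >>> i) & 1);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        int[] input = {5, 2, 12, 18, 7, 9, 13, 19, 16, 4, 6};
        int mask = toMask(input);
        System.out.println(toBitString(mask, 20));
        System.out.println(fromMask(mask, 20));
    }
}
```

Because the mask records only whether a value is present, a duplicated input value would silently collapse into one bit. This is why the problem statement requires distinct integers.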

3. Detailed Design

[1] Input: an unsorted array in which every number is a distinct integer less than n, densely distributed in the interval [0, n-1]

[2] Output: A sorted array

[3] Data structure: bit vector. The key to bitmap sorting is the implementation of the bit vector, which supports three operations: set a bit to 1, clear a bit to 0, and test whether a bit is 1. For the implementation, an integer array can be used (in Java, shifts and bitwise operations work on int values), so every 32 bits form one group. The bit vector length is best taken as a multiple of 32 to simplify programming. Suppose there are 64 bits and bit 59 must be set to 1: 59/32 = 1 and 59%32 = 27, so bit 27 of group 1, that is of a[1], must be set. Division by 32 can be done with a right shift by 5 bits (i >> 5), the remainder modulo 32 with i & 0x1f, and the mask for the bit with 1 << (i & 0x1f). The rest is a matter of detail, such as making sure the boundaries are right. The bit string direction is defined as a[p]a[p-1]...a[1]a[0], where p = n/32 - 1 and n is the smallest multiple of 32 that is not less than N. a[p] holds the highest 32 bits and a[0] holds the lowest 32 bits.
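The index arithmetic for bit 59 can be verified with a few lines. This is only a sketch; the helper names wordIndex, bitOffset and bitMask are chosen here for illustration.

```java
class BitIndexDemo {
    // Index of the 32-bit group that holds bit i (same as i / 32 for i >= 0).
    static int wordIndex(int i) {
        return i >> 5;
    }

    // Position of bit i inside its group (same as i % 32 for i >= 0).
    static int bitOffset(int i) {
        return i & 0x1f;
    }

    // Mask with only that position set.
    static int bitMask(int i) {
        return 1 << (i & 0x1f);
    }

    public static void main(String[] args) {
        int[] a = new int[2]; // 64 bits in two groups of 32
        int i = 59;
        a[wordIndex(i)] |= bitMask(i); // set bit 59
        System.out.println("group " + wordIndex(i) + ", bit " + bitOffset(i));
        System.out.println(Integer.toBinaryString(a[1]));
    }
}
```

The shift and mask forms match division and remainder only for non-negative indices, so callers should reject negative positions before using them.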

4. Algorithm Description

STEP1: Determine the number of bits in the bit vector from the problem description, and initialize the bit vector bv;

STEP2: For each element of the array, use its value as a position and set the corresponding bit of the bit vector to 1;

STEP3: Scan each bit of the bit vector from low to high; if a bit is 1, output its position as the next element of the sorted array.
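Before looking at a hand-written bit vector, the three steps can be sketched compactly with the standard java.util.BitSet class. This substitutes a library class for a custom bit vector and is not the article's own implementation. The class name BitSetSortSketch and the choice of IllegalArgumentException are assumptions made for this example.

```java
import java.util.Arrays;
import java.util.BitSet;

class BitSetSortSketch {
    // Sorts distinct integers that all lie in [0, n).
    static int[] sort(int[] arr, int n) {
        BitSet bits = new BitSet(n);              // STEP1: all bits start at 0
        for (int v : arr) {                       // STEP2: mark each value
            if (v < 0 || v >= n) {
                throw new IllegalArgumentException("out of range: " + v);
            }
            if (bits.get(v)) {
                throw new IllegalArgumentException("duplicate: " + v);
            }
            bits.set(v);
        }
        int[] sorted = new int[arr.length];       // STEP3: scan low to high
        int count = 0;
        for (int i = bits.nextSetBit(0); i >= 0; i = bits.nextSetBit(i + 1)) {
            sorted[count++] = i;
        }
        return sorted;
    }

    public static void main(String[] args) {
        int[] unsorted = {7, 9, 5, 10, 15, 37, 13};
        System.out.println(Arrays.toString(sort(unsorted, 64)));
    }
}
```

The run time is proportional to the input length plus n, so the approach pays off when the values are dense in the range.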

5. Java Code Implementation

package datastructure.vector;

/**
 * Implements an n-bit vector.
 */
public class NBitsVector {

    private static final int BITS_PER_INT = 32;
    private static final int SHIFT = 5;

    // Concatenates the bits of all integers in an int array into one bit vector
    private int[] bitsVector;

    // The total number of bits in the vector
    private int bitsLength;

    public NBitsVector(int n) {
        // Round the length up to a multiple of 32
        int i = 1;
        while (i * BITS_PER_INT < n) {
            i++;
        }
        this.bitsLength = i * BITS_PER_INT;
        if (bitsVector == null) {
            bitsVector = new int[i];
        }
    }

    /**
     * setBit: sets bit i of the vector to 1
     * @param i the position to set
     */
    public void setBit(int i) {
        bitsVector[i >> SHIFT] |= 1 << (i & 0x1f);
    }

    /**
     * clrBit: clears bit i of the vector to 0
     * @param i the position to clear
     */
    public void clrBit(int i) {
        bitsVector[i >> SHIFT] &= ~(1 << (i & 0x1f));
    }

    /**
     * testBit: tests whether bit i of the vector is 1
     * @param i the position to test
     * @return true if bit i of the vector is 1, otherwise false
     */
    public boolean testBit(int i) {
        return (bitsVector[i >> SHIFT] & (1 << (i & 0x1f))) != 0;
    }

    /**
     * clr: clears every bit of the vector to 0
     */
    public void clr() {
        int vecLen = bitsVector.length;
        for (int i = 0; i < vecLen; i++) {
            bitsVector[i] = 0;
        }
    }

    /**
     * getBitsLength: gets the total number of bits in the vector
     */
    public int getBitsLength() {
        return bitsLength;
    }

    /**
     * Gets the 32-character binary representation of the given integer i,
     * padding the high positions with 0.
     * @param i the given integer
     */
    public String intToBinaryStringWithHighZero(int i) {
        String basicResult = Integer.toBinaryString(i);
        int bitsForZero = BITS_PER_INT - basicResult.length();
        StringBuilder sb = new StringBuilder("");
        while (bitsForZero-- > 0) {
            sb.append('0');
        }
        sb.append(basicResult);
        return sb.toString();
    }

    // Prints the groups from a[p] down to a[0], highest bits first
    public String toString() {
        StringBuilder sb = new StringBuilder("Bits Vector: ");
        for (int i = bitsVector.length - 1; i >= 0; i--) {
            sb.append(intToBinaryStringWithHighZero(bitsVector[i]));
            sb.append(" ");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        NBitsVector nbitsVector = new NBitsVector(64);
        nbitsVector.setBit(2);
        System.out.println(nbitsVector);
        nbitsVector.setBit(7);
        nbitsVector.setBit(18);
        nbitsVector.setBit(25);
        nbitsVector.setBit(36);
        nbitsVector.setBit(49);
        nbitsVector.setBit(52);
        nbitsVector.setBit(63);
        System.out.println(nbitsVector);
        nbitsVector.clrBit(36);
        nbitsVector.clrBit(35);
        System.out.println(nbitsVector);
        System.out.println("testBit(52): " + nbitsVector.testBit(52));
        System.out.println("testBit(42): " + nbitsVector.testBit(42));
        nbitsVector.clr();
        System.out.println(nbitsVector);
    }
}

package algorithm.sort;

import java.util.Arrays;

import datastructure.vector.NBitsVector;

/**
 * Bitmap sort
 */
public class BitsMapSort {

    private NBitsVector nbitsVector;

    public BitsMapSort(int n) {
        if (nbitsVector == null) {
            nbitsVector = new NBitsVector(n);
        }
    }

    public int[] sort(int[] arr) throws Exception {
        if (arr == null || arr.length == 0) {
            return null;
        }
        nbitsVector.clr();
        int arrLen = arr.length;
        for (int i = 0; i < arrLen; i++) {
            if (arr[i] < 0 || arr[i] > nbitsVector.getBitsLength() - 1) {
                throw new Exception("given integer " + arr[i] + " exceeds range, please check input");
            }
            if (nbitsVector.testBit(arr[i])) {
                throw new Exception("There is a repeating integer: " + arr[i] + ", check the input!");
            }
            nbitsVector.setBit(arr[i]);
        }
        // Scan from low to high and write the positions of set bits back into arr
        int bitsLength = nbitsVector.getBitsLength();
        int count = 0;
        for (int i = 0; i < bitsLength; i++) {
            if (nbitsVector.testBit(i)) {
                arr[count++] = i;
            }
        }
        return arr;
    }

    public static int maxOfArray(int[] arr) {
        int max = arr[0];
        for (int i = 1; i < arr.length; i++) {
            if (arr[i] > max) {
                max = arr[i];
            }
        }
        return max;
    }

    public static void test(int[] arr) {
        try {
            // 64 can be replaced by maxOfArray(arr) + 1
            BitsMapSort bms = new BitsMapSort(64);
            System.out.println("Before sorting: " + Arrays.toString(arr));
            int[] sorted = bms.sort(arr);
            System.out.println("After sorting: " + Arrays.toString(sorted));
        } catch (Exception e) {
            System.out.println(e.getMessage());
        }
    }

    public static void main(String[] args) {
        int[] empty = null;
        test(empty);
        empty = new int[0];
        test(empty);
        int[] unsorted = new int[] {7, 9, 5, 10, 15, 37, 13};
        test(unsorted);
        // Some elements of the next two arrays were lost in the source text;
        // only the recoverable values are kept here.
        int[] unsorted2 = new int[] {34, 46, 52, 7, 9, 5, 7, 37, 13};
        test(unsorted2);
        int[] unsorted3 = new int[] {7, 9, 5, 37, 13};
        test(unsorted3);
    }
}

6. C Source Program

/*
 * bitvec.c: n-bit vector implementation
 * author: shuqin1984 2011-08-31
 */
#include <stdio.h>
#include <assert.h>

#define N 10000000
#define M ((N % 32 == 0) ? (N / 32) : (N / 32 + 1))
#define SHIFT 5
#define MOD32(n) ((n) - (((n) >> SHIFT) << SHIFT))

int bitvec[M];               /* an N-bit vector implemented with an array of M integers */

int test(int i);             /* tests whether bit i of the bit vector is 1 */
void set(int i);             /* sets bit i of the bit vector to 1 */
void clear(int i);           /* clears bit i of the bit vector to 0 */
void clearAll();             /* clears every bit of the bit vector to 0 */
void show();                 /* displays the current value of the bit vector */
void printb(int x, int i);   /* prints bit i of the binary form of x */
void printbz(int x, int n);  /* prints the low n bits of x in binary, padding with leading 0s */

int test(int i)
{
    assert(i >= 0);
    return (bitvec[i >> SHIFT] & (1 << MOD32(i))) != 0;
}

void set(int i)
{
    assert(i >= 0);
    bitvec[i >> SHIFT] |= (1 << MOD32(i));
}

void clear(int i)
{
    assert(i >= 0);
    bitvec[i >> SHIFT] &= ~(1 << MOD32(i));
}

void clearAll()
{
    int i;
    for (i = 0; i < M; i++) {
        bitvec[i] = 0;
    }
}

void show()
{
    int i = 0;
    if (M == 1) {
        printbz(bitvec[i], N);
    } else {
        /* the highest group may hold fewer than 32 valid bits */
        int bits = (N % 32 == 0) ? 32 : (N % 32);
        printbz(bitvec[M - 1], bits);
        for (i = M - 2; i >= 0; i--) {
            printbz(bitvec[i], 32);
        }
    }
    printf("\n");
}

void printb(int x, int i)
{
    printf("%c", '0' + (int) ((((unsigned) x) >> i) & 1u));
}

void printbz(int x, int n)
{
    int i;
    for (i = n - 1; i >= 0; i--) {
        printb(x, i);
    }
}

/*
 * bitsort.c: implements bitmap sorting and measures the run time
 * author: shuqin1984 2011-8-31
 */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <assert.h>
#include <limits.h>
#include "bitvec.c"

#define MAX_LEN 10

void bitsort(char* filename);
void bitsortf();
void runtime(void (*f)());
void testdata(int, int);
int randrange(int low, int high);

int main()
{
    srand((unsigned) time(0));
    printf("sizeof(int) = %d\n", (int) sizeof(int));
    printf("RAND_MAX = %d\n", RAND_MAX);
    printf("INT_MAX = %d\n", INT_MAX);

    runtime(bitsortf);

    getchar();
    return 0;
}

/*
 * Reads the data from the specified file, sorts it,
 * and writes the sorted data to output.txt
 */
void bitsort(char* filename)
{
    int i;
    char buf[MAX_LEN];

    clearAll();   /* start from an empty bit vector so earlier runs do not leak in */

    FILE* fin = fopen(filename, "r");
    if (fin == NULL) {
        fprintf(stderr, "can't open file: %s", filename);
        exit(1);
    }
    while (fgets(buf, MAX_LEN, fin)) {
        set(atoi(buf));
    }
    fclose(fin);

    show();

    FILE* fout = fopen("output.txt", "w");
    if (fout == NULL) {
        fprintf(stderr, "can't open file: %s", "output.txt");
        exit(1);
    }
    for (i = 0; i < N; i++) {
        if (test(i)) {
            fprintf(fout, "%d\n", i);
        }
    }
    fclose(fout);

    printf("---------- sort successfully ---------------");
    printf("\n");
}

void bitsortf()
{
    bitsort("data.txt");
}

void runtime(void (*f)())
{
    printf("runtime ...\n");
    int scale = 10;
    while (scale <= N) {
        testdata(scale, N);
        clock_t start = clock();
        (*f)();
        clock_t end = clock();
        printf("scale: %d\t cost: %8.4f\n", scale, (double) (end - start) / CLOCKS_PER_SEC);
        printf("---------------------------------------------------");
        printf("\n");
        scale *= 10;
    }
}

/*
 * Creates test data: picks num non-negative integers less than max
 * and writes them to data.txt
 */
void testdata(int num, int max)
{
    int i;
    assert(num <= max);

    FILE* fout = fopen("data.txt", "w");
    if (fout == NULL) {
        fprintf(stderr, "can't open file: %s", "data.txt");
        exit(1);
    }
    for (i = 0; i < num; i++) {
        /* multiply as unsigned so the product cannot overflow into a negative int */
        fprintf(fout, "%d\n", (int) (((unsigned) rand() * (unsigned) rand()) % (unsigned) max));
    }
    fclose(fout);
    printf("---------- testdata successfully ---------------");
    printf("\n");
}

/*
 * randrange: generates a random integer in the range [low, high)
 */
int randrange(int low, int high)
{
    assert(low < high);
    return rand() % (high - low) + low;
}

7. Additional Instructions

Bitmap technology is a very effective problem-solving technique that is already used in file management, and its role as a general-purpose technique is comparable to that of binary search. The bitmap technique can also detect repeated or missing integers, for example finding a repeated integer among more than 4.3 billion random integers that are all less than 2^32 (a repeat must exist by the pigeonhole principle). When reading, it is worth not only taking away the solution to the specific problem but also understanding the general technique behind it.
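Detecting repeated and missing integers can be sketched on a small range with java.util.BitSet. This is an illustration only: the names below are invented for the example, and a real 2^32 range would need a different bit store because BitSet indices are limited to int.

```java
import java.util.BitSet;

class DuplicateAndMissing {
    // Returns the first value that appears a second time, or -1 if none does.
    // Assumes every value lies in [0, n).
    static int firstDuplicate(int[] values, int n) {
        BitSet seen = new BitSet(n);
        for (int v : values) {
            if (seen.get(v)) {
                return v;
            }
            seen.set(v);
        }
        return -1;
    }

    // Returns the smallest value in [0, n) absent from the input, or -1 if none is.
    static int firstMissing(int[] values, int n) {
        BitSet seen = new BitSet(n);
        for (int v : values) {
            seen.set(v);
        }
        int missing = seen.nextClearBit(0);
        return missing < n ? missing : -1;
    }

    public static void main(String[] args) {
        System.out.println(firstDuplicate(new int[] {3, 1, 4, 1, 5}, 8));
        System.out.println(firstMissing(new int[] {0, 1, 2, 4}, 5));
    }
}
```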

If the problem is to sort a series of records rather than an array of integers, how can the existing algorithm be used? Apply a function to each record's key to obtain a distinct integer (a process similar to hashing), and then sort those integers with the bitmap technique.
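One possible reading of this idea is sketched below. It assumes a key function that returns a distinct integer in [0, n) for every record, and it keeps a slot array indexed by key so that records can be emitted in key order. The slot array, the class name and the sample key function are assumptions made for this example and are not taken from the article.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;
import java.util.function.ToIntFunction;

class RecordBitmapSort {
    // Sorts records by an integer key.
    // Assumes keyOf yields a distinct value in [0, n) for every record.
    static <T> List<T> sortByKey(List<T> records, ToIntFunction<? super T> keyOf, int n) {
        BitSet present = new BitSet(n);
        Object[] slots = new Object[n];           // slots[key] holds the record with that key
        for (T r : records) {
            int key = keyOf.applyAsInt(r);
            if (key < 0 || key >= n || present.get(key)) {
                throw new IllegalArgumentException("key out of range or duplicated: " + key);
            }
            present.set(key);
            slots[key] = r;
        }
        List<T> sorted = new ArrayList<>(records.size());
        for (int k = present.nextSetBit(0); k >= 0; k = present.nextSetBit(k + 1)) {
            @SuppressWarnings("unchecked")
            T r = (T) slots[k];
            sorted.add(r);
        }
        return sorted;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("pear", "banana", "fig", "apple");
        // Hypothetical key function: word length, which is distinct for this sample.
        System.out.println(sortByKey(words, String::length, 16));
    }
}
```

The slot array costs one reference per possible key, so this variant gives up most of the memory advantage of the plain bitmap. It is meant only to show how keys and records can be tied together.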
