A daily walkthrough of classic algorithms: No. 11, the bitmap algorithm

Of all the data structures used for performance optimization, I think the hash table is the most widely used. It finds a key at a fixed position in O(1) constant time, which is concise and elegant.

But consider two specific scenarios:

①: Sort 1 billion non-repeating integers.

②: Find the numbers that repeat among 1 billion integers.

Of course, I only have an ordinary server with, say, 2 GB of memory. In this situation, how do we choose a better data structure and algorithm?

One: Problem analysis

There are only a handful of sorting algorithms in common use. First, let's calculate how much memory the data needs: (1 billion * 32) / (1024*1024*1024*8) ≈ 3.7 GB. The poor 2 GB of memory is blown straight away, so none of the usual data structures can be used. External sorting would still solve the problem, but it has to go through disk I/O, so we set it aside for now because we want high performance. With no other way out, can we do something at the bit level?
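As a rough check of the arithmetic above, the following sketch computes both sizes. The class and method names are illustrative only, and it assumes the 1 billion values all lie in the range 0 to 999,999,999; a wider value range would need a proportionally larger bitmap.

```csharp
using System;

public static class MemoryEstimate
{
    // Bytes needed to store `count` values as 32-bit integers.
    public static long IntArrayBytes(long count) => count * sizeof(int);

    // Bytes needed for a bitmap covering the values 0..maxValue, one bit per value.
    public static long BitmapBytes(long maxValue) => (maxValue >> 3) + 1;

    public static void Main()
    {
        const long count = 1_000_000_000;
        const double GiB = 1024.0 * 1024.0 * 1024.0;

        Console.WriteLine("int array: {0:F2} GiB", IntArrayBytes(count) / GiB);
        Console.WriteLine("bitmap:    {0:F2} GiB", BitmapBytes(count - 1) / GiB);
    }
}
```

Under that assumption the integer array needs about 3.7 GiB, while the bitmap needs about 0.12 GiB.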

For example, suppose I want to sort the four byte values {1,5,7,2}. What should I do? A byte has 8 bits, so we can use each value in the array as the index (the key) of a bit, and use the bit's value, 0 or 1, to mark whether that key has occurred. Counting bit 0 as the lowest bit, setting bits 1, 2, 5 and 7 turns a single byte into 10100110.

As we can see, each array value has become a bit position in the byte. In the end I only have to walk the bits in order and output the positions whose bit is 1, which naturally gives a sorted array.

Someone might ask, "What if I add a 13?" Very simple: one byte can hold 8 numbers, so two bytes are enough to solve the problem.
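The idea can be sketched as a tiny bitmap sort. This is a minimal illustration, not the author's code: the name BitmapSort is made up for the example, and it assumes the input values are distinct bytes.

```csharp
using System;
using System.Collections.Generic;

public static class BitmapSort
{
    // Sorts distinct byte values by marking one bit per value and then
    // scanning the bits from the lowest position to the highest.
    public static List<byte> Sort(byte[] values)
    {
        // 256 possible byte values / 8 bits per byte = 32 bytes.
        var bits = new byte[32];

        foreach (var v in values)
            bits[v >> 3] |= (byte)(1 << (v & 0x07));

        var sorted = new List<byte>();
        for (int v = 0; v < 256; v++)
        {
            if ((bits[v >> 3] & (1 << (v & 0x07))) != 0)
                sorted.Add((byte)v);
        }
        return sorted;
    }

    public static void Main()
    {
        var sorted = Sort(new byte[] { 1, 5, 7, 2, 13 });
        Console.WriteLine(string.Join(", ", sorted)); // prints "1, 2, 5, 7, 13"
    }
}
```

Here 13 lands in the second byte (13 >> 3 = 1) at bit 5 (13 & 7 = 5), which is why two bytes are enough for that input.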

It can be seen that I have turned a linear array into a two-dimensional matrix of bits. Each number now costs one bit instead of 32, so the space we finally need is about 1/32 of the original, roughly 0.12 GB. Note that the running time of bitmap sort is not simply O(n): it depends on the maximum value in the array to be sorted and has little to do with how many elements are actually present. For example, if I open 10 threads to scan the byte array, each thread covers about MAX/10 of the bits.
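The same one-bit-per-value idea also covers the second problem from the introduction, finding repeated numbers. The sketch below is an assumption about how that could look, not something the original article shows; FindRepeats and its maxValue parameter are hypothetical names, and every input value is assumed to lie in 0..maxValue.

```csharp
using System;
using System.Collections.Generic;

public static class BitmapDuplicates
{
    // Returns the values whose bit was already set when they were read.
    // A value that occurs k times is therefore reported k - 1 times.
    // Assumes every value lies in 0..maxValue.
    public static List<int> FindRepeats(int[] values, int maxValue)
    {
        var seen = new byte[(maxValue >> 3) + 1];
        var repeats = new List<int>();

        foreach (var v in values)
        {
            int index = v >> 3;          // which byte holds the bit
            int mask = 1 << (v & 0x07);  // which bit inside that byte

            if ((seen[index] & mask) != 0)
                repeats.Add(v);          // bit already set: v was seen before
            else
                seen[index] |= (byte)mask;
        }
        return repeats;
    }

    public static void Main()
    {
        var repeats = FindRepeats(new[] { 3, 9, 3, 20, 9, 1 }, 20);
        Console.WriteLine(string.Join(", ", repeats)); // prints "3, 9"
    }
}
```

The memory used depends only on maxValue, not on how many numbers are read, which matches the observation about the maximum value above.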

Two: Code

I think everyone now understands the idea behind the bitmap. This time, let us witness the charm of binary. Of course, the shifts below are all plain bit operations; once you are familiar with them, they are easy to play with.

1: Clear method (sets the specified bit in the array to 0)

For example, to set the bit for 4 to 0, shift 1 left by 4 bits, invert the result, and AND (&) it with a[0].

    #region Set the specified bit to 0
    /// <summary>
    /// Sets the specified bit to 0.
    /// </summary>
    /// <param name="i"></param>
    static void Clear(byte i)
    {
        // Equivalent to i % 8: the bit position inside the byte
        var shift = i & 0x07;

        // Index of the byte in the array that holds this bit
        var arrindex = i >> 3;

        // Mask with only the target bit set to 0. After the AND,
        // every other bit in the byte stays unchanged.
        var bitpos = ~(1 << shift);

        // Clear the target bit in the array with an & operation
        a[arrindex] &= (byte)(bitpos);
    }
    #endregion

2: Add method (sets a bit to 1)

This is also very simple: to set the bit for 4 to 1, shift 1 left by 4 bits and OR (|) it with a[0].

    #region Set the corresponding bit to 1
    /// <summary>
    /// Sets the corresponding bit to 1.
    /// </summary>
    /// <param name="i"></param>
    static void Add(byte i)
    {
        // Equivalent to i % 8: the bit position inside the byte
        var shift = i & 0x07;

        // Index of the byte in the array that holds this bit
        var arrindex = i >> 3;

        // Shift 1 left to the target bit position inside the byte
        var bitpos = 1 << shift;

        // Set the target bit in the array with an | operation
        a[arrindex] |= (byte)bitpos;
    }
    #endregion

3: Contain method (determines whether the current bit is 1)

If you understand Clear and Add, I believe this last method poses no problem.

    #region Determine whether the current value exists in the bits of the array
    /// <summary>
    /// Determines whether the current value exists in the bits of the array.
    /// </summary>
    /// <param name="i"></param>
    /// <returns></returns>
    static bool Contain(byte i)
    {
        var j = a[i >> 3] & (1 << (i & 0x07));

        if (j == 0)
            return false;

        return true;
    }
    #endregion
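To make the shift and mask arithmetic shared by the three methods concrete, here is a small trace for the values 4 and 13. The helper names (ByteIndex, SetMask, ClearMask) are invented for this illustration and do not appear in the original code.

```csharp
using System;

public static class BitTrace
{
    // Index of the byte that holds the bit for value i (same as i / 8).
    public static int ByteIndex(byte i) => i >> 3;

    // Mask with only the bit for i set; this is what Add and Contain use.
    public static byte SetMask(byte i) => (byte)(1 << (i & 0x07));

    // Mask with every bit set except the one for i; this is what Clear uses.
    // The & 0xFF keeps the inverted value inside the byte range.
    public static byte ClearMask(byte i) => (byte)(~SetMask(i) & 0xFF);

    static string Bits(byte b) => Convert.ToString(b, 2).PadLeft(8, '0');

    public static void Main()
    {
        foreach (byte i in new byte[] { 4, 13 })
        {
            Console.WriteLine("value {0}: byte {1}, set mask {2}, clear mask {3}",
                i, ByteIndex(i), Bits(SetMask(i)), Bits(ClearMask(i)));
        }
        // value 4:  byte 0, set mask 00010000, clear mask 11101111
        // value 13: byte 1, set mask 00100000, clear mask 11011111
    }
}
```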

The complete code:

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text;
    using System.Diagnostics;
    using System.Threading;
    using System.IO;

    namespace ConsoleApplication2
    {
        public class Program
        {
            static byte n = 7;

            static byte[] a;

            public static void Main()
            {
                // Space-saving approach: one bit per value
                a = new byte[(n >> 3) + 1];

                for (byte i = 0; i < n; i++)
                    Clear(i);

                Add(4);
                Console.WriteLine("Insert 4 success!");

                var s = Contain(4);

                Console.WriteLine("Currently contains 4: {0}", s);

                s = Contain(5);

                Console.WriteLine("Currently contains 5: {0}", s);

                Console.Read();
            }

            #region Set the specified bit to 0
            /// <summary>
            /// Sets the specified bit to 0.
            /// </summary>
            /// <param name="i"></param>
            static void Clear(byte i)
            {
                // Equivalent to i % 8: the bit position inside the byte
                var shift = i & 0x07;

                // Index of the byte in the array that holds this bit
                var arrindex = i >> 3;

                // Mask with only the target bit set to 0. After the AND,
                // every other bit in the byte stays unchanged.
                var bitpos = ~(1 << shift);

                // Clear the target bit in the array with an & operation
                a[arrindex] &= (byte)(bitpos);
            }
            #endregion

            #region Set the corresponding bit to 1
            /// <summary>
            /// Sets the corresponding bit to 1.
            /// </summary>
            /// <param name="i"></param>
            static void Add(byte i)
            {
                // Equivalent to i % 8: the bit position inside the byte
                var shift = i & 0x07;

                // Index of the byte in the array that holds this bit
                var arrindex = i >> 3;

                // Shift 1 left to the target bit position inside the byte
                var bitpos = 1 << shift;

                // Set the target bit in the array with an | operation
                a[arrindex] |= (byte)bitpos;
            }
            #endregion

            #region Determine whether the current value exists in the bits of the array
            /// <summary>
            /// Determines whether the current value exists in the bits of the array.
            /// </summary>
            /// <param name="i"></param>
            /// <returns></returns>
            static bool Contain(byte i)
            {
                var j = a[i >> 3] & (1 << (i & 0x07));

                if (j == 0)
                    return false;

                return true;
            }
            #endregion
        }
    }
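The listing above works on byte keys, so it only covers the values 0 to 255. As a hedged sketch of how the same three operations might be carried over to int keys, which the 1-billion-number scenario would need, consider the class below. IntBitmap and its maxValue parameter are hypothetical names, and the sketch assumes non-negative values no larger than maxValue.

```csharp
using System;

public sealed class IntBitmap
{
    private readonly byte[] bits;

    // Covers the values 0..maxValue inclusive (assumed non-negative).
    public IntBitmap(int maxValue)
    {
        bits = new byte[(maxValue >> 3) + 1];
    }

    // Sets the bit for i to 1.
    public void Add(int i) => bits[i >> 3] |= (byte)(1 << (i & 0x07));

    // Sets the bit for i to 0; & 0xFF keeps the inverted mask in byte range.
    public void Clear(int i) => bits[i >> 3] &= (byte)(~(1 << (i & 0x07)) & 0xFF);

    // Returns true when the bit for i is 1.
    public bool Contain(int i) => (bits[i >> 3] & (1 << (i & 0x07))) != 0;

    public static void Main()
    {
        var bitmap = new IntBitmap(1000);
        bitmap.Add(4);
        bitmap.Add(999);

        Console.WriteLine(bitmap.Contain(4));   // True
        Console.WriteLine(bitmap.Contain(5));   // False

        bitmap.Clear(4);
        Console.WriteLine(bitmap.Contain(4));   // False
    }
}
```

A new byte array starts out zeroed in C#, so this version does not need an initial clearing loop.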
