Counting bits set (naive)

unsigned int v; // count the number of bits set in v
unsigned int c; // c accumulates the total bits set in v

for (c = 0; v; v >>= 1)
{
  c += v & 1;
}

The naive approach requires one iteration per bit, until no more bits are set. So on a 32-bit word with only the high bit set, it will go through 32 iterations.
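As a minimal sketch, the naive loop can be wrapped in a function that also reports how many passes it made, which makes the one-iteration-per-bit cost visible. The names `count_bits_naive` and `iterations` are illustrative and not part of the original text; the sketch assumes a 32-bit `unsigned int`.

```c
#include <stddef.h>

/* Naive bit count: inspect one bit per pass until v becomes zero.
 * If iterations is non-NULL, it receives the number of loop passes. */
unsigned int count_bits_naive(unsigned int v, unsigned int *iterations)
{
    unsigned int c;
    unsigned int n = 0;

    for (c = 0; v; v >>= 1)
    {
        c += v & 1; /* add the lowest bit to the running total */
        n++;
    }
    if (iterations != NULL)
    {
        *iterations = n;
    }
    return c;
}
```

Calling it with 0x80000000 returns a count of 1 but reports 32 passes, since the loop has to shift the single set bit all the way down.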

Counting bits set by lookup table

static const unsigned char bitssettable256[256] =
{
#   define B2(n) n,     n+1,     n+1,     n+2
#   define B4(n) B2(n), B2(n+1), B2(n+1), B2(n+2)
#   define B6(n) B4(n), B4(n+1), B4(n+1), B4(n+2)
    B6(0), B6(1), B6(1), B6(2)
};

unsigned int v; // count the number of bits set in 32-bit value v
unsigned int c; // c is the total bits set in v

// Option 1:
c = bitssettable256[v & 0xff] +
    bitssettable256[(v >> 8) & 0xff] +
    bitssettable256[(v >> 16) & 0xff] +
    bitssettable256[v >> 24];

// Option 2:
unsigned char * p = (unsigned char *) &v;
c = bitssettable256[p[0]] +
    bitssettable256[p[1]] +
    bitssettable256[p[2]] +
    bitssettable256[p[3]];

// To initially generate the table algorithmically
// (the table must then be declared without const):
bitssettable256[0] = 0;
for (int i = 0; i < 256; i++)
{
  bitssettable256[i] = (i & 1) + bitssettable256[i / 2];
}

In July, Hallvard Furuseth suggested the macro-compacted table.
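The following is a small sketch of the runtime-generated variant combined with the Option 1 style of lookup. The names `bits_set_table`, `init_bits_set_table`, and `count_bits_table` are hypothetical, and the table is deliberately non-const so that it can be filled in at startup.

```c
#include <stdint.h>

/* Hypothetical runtime-built version of the 256-entry table. */
static unsigned char bits_set_table[256];

/* Fill the table: the count for i is its low bit plus the count for i / 2,
 * which has already been computed because i / 2 < i. */
void init_bits_set_table(void)
{
    bits_set_table[0] = 0;
    for (int i = 1; i < 256; i++)
    {
        bits_set_table[i] = (unsigned char)((i & 1) + bits_set_table[i / 2]);
    }
}

/* Count the bits in a 32-bit value with one lookup per byte. */
unsigned int count_bits_table(uint32_t v)
{
    return bits_set_table[v & 0xff] +
           bits_set_table[(v >> 8) & 0xff] +
           bits_set_table[(v >> 16) & 0xff] +
           bits_set_table[v >> 24];
}
```

`init_bits_set_table()` must run once before the first call to `count_bits_table()`; the macro-built const table avoids that step at the cost of a less obvious initializer.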

Counting bits set, Brian Kernighan's way

unsigned int v; // count the number of bits set in v
unsigned int c; // c accumulates the total bits set in v

for (c = 0; v; c++)
{
  v &= v - 1; // clear the least significant bit set
}

Brian Kernighan's method goes through as many iterations as there are set bits. So if we have a 32-bit word with only the high bit set, then it will go through the loop only once.
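To illustrate the iteration count, here is a hedged sketch that wraps the loop in a function and records the number of passes. The names `count_bits_kernighan` and `iterations` are assumptions made for this example, and a 32-bit `unsigned int` is assumed.

```c
#include <stddef.h>

/* Kernighan-style bit count: each pass clears the lowest set bit,
 * so the loop runs once per set bit rather than once per bit position.
 * If iterations is non-NULL, it receives the number of loop passes. */
unsigned int count_bits_kernighan(unsigned int v, unsigned int *iterations)
{
    unsigned int c;

    for (c = 0; v; c++)
    {
        v &= v - 1; /* clear the least significant bit set */
    }
    if (iterations != NULL)
    {
        *iterations = c; /* passes and set bits are the same number here */
    }
    return c;
}
```

With 0x80000000 as input, both the count and the number of passes are 1.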

Published in 1988, The C Programming Language, 2nd Ed. (by Brian W. Kernighan and Dennis M. Ritchie) mentions this in exercise 2-9. In April 2006 Don Knuth pointed out to me that this method "was first published by Peter Wegner in CACM 3 (1960), 322. (Also discovered independently by Derrick Lehmer and published in 1964 in a book edited by Beckenbach.)"

Counting bits set in 14, 24, or 32-bit words using 64-bit instructions

unsigned int v; // count the number of bits set in v
unsigned int c; // c accumulates the total bits set in v

// Option 1, for at most 14-bit values in v:
c = (v * 0x200040008001ULL & 0x111111111111111ULL) % 0xf;

// Option 2, for at most 24-bit values in v:
c =  ((v & 0xfff) * 0x1001001001001ULL & 0x84210842108421ULL) % 0x1f;
c += (((v & 0xfff000) >> 12) * 0x1001001001001ULL & 0x84210842108421ULL)
     % 0x1f;

// Option 3, for at most 32-bit values in v:
c =  ((v & 0xfff) * 0x1001001001001ULL & 0x84210842108421ULL) % 0x1f;
c += (((v & 0xfff000) >> 12) * 0x1001001001001ULL & 0x84210842108421ULL) %
     0x1f;
c += ((v >> 24) * 0x1001001001001ULL & 0x84210842108421ULL) % 0x1f;

This method requires a 64-bit CPU with fast modulus division to be efficient. The first option takes only 3 operations; the second option takes 10; and the third option takes 15.
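The three options can be sketched as standalone functions, which makes the input-width limits explicit. The function names are illustrative only. The idea in each case is that the multiply places non-overlapping copies of the input in a 64-bit word, the mask keeps each input bit exactly once at a position whose weight is congruent to 1 modulo the divisor, and the modulus then adds those bits up.

```c
#include <stdint.h>

/* Option 1: valid only when v fits in 14 bits. */
unsigned int count_bits_14(uint32_t v)
{
    return (unsigned int)((v * 0x200040008001ULL & 0x111111111111111ULL) % 0xf);
}

/* Count the bits of a value that fits in 12 bits (helper for options 2 and 3). */
static unsigned int count_bits_12(uint32_t v)
{
    return (unsigned int)((v * 0x1001001001001ULL & 0x84210842108421ULL) % 0x1f);
}

/* Option 2: valid only when v fits in 24 bits; two 12-bit halves. */
unsigned int count_bits_24(uint32_t v)
{
    return count_bits_12(v & 0xfff) + count_bits_12((v & 0xfff000) >> 12);
}

/* Option 3: any 32-bit v; two 12-bit pieces plus the top 8 bits. */
unsigned int count_bits_32(uint32_t v)
{
    return count_bits_12(v & 0xfff) +
           count_bits_12((v & 0xfff000) >> 12) +
           count_bits_12(v >> 24);
}
```

Passing a value wider than the stated limit to `count_bits_14` or `count_bits_24` gives a wrong answer rather than an error, so callers need to mask or check the input themselves.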

Rich Schroeppel originally created a 9-bit version, similar to option 1; see the Programming Hacks section of Beeler, M., Gosper, R. W., and Schroeppel, R. HAKMEM. MIT AI Memo 239, Feb. 29, 1972. His method was the inspiration for the variants above, devised by Sean Anderson. Randal E. Bryant offered a couple of bug fixes on May 3, 2005. Bruce Dawson tweaked what had been a 12-bit version and made it suitable for 14 bits using the same number of operations on February 1, 2007.

Counting bits set, in parallel

unsigned int v; // count bits set in this (32-bit value)
unsigned int c; // store the total here
static const int S[] = {1, 2, 4, 8, 16}; // Magic Binary Numbers
static const int B[] = {0x55555555, 0x33333333, 0x0F0F0F0F, 0x00FF00FF, 0x0000FFFF};

c = v - ((v >> 1) & B[0]);
c = ((c >> S[1]) & B[1]) + (c & B[1]);
c = ((c >> S[2]) + c) & B[2];
c = ((c >> S[3]) + c) & B[3];
c = ((c >> S[4]) + c) & B[4];

The B array, expressed as binary, is:

B[0] = 0x55555555 = 01010101 01010101 01010101 01010101

B[1] = 0x33333333 = 00110011 00110011 00110011 00110011

B[2] = 0x0f0f0f0f = 00001111 00001111 00001111 00001111

B[3] = 0x00ff00ff = 00000000 11111111 00000000 11111111

B[4] = 0x0000ffff = 00000000 00000000 11111111 11111111

We can adjust the method for larger integer sizes by continuing with the patterns for the Binary Magic Numbers, B and S. If there are k bits, then we need the arrays S and B to be ceil(lg(k)) elements long, and we must compute the same number of expressions for c as S or B are long. For a 32-bit v, 16 operations are used.
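As an illustration of continuing the pattern, here is a sketch for k = 64, where ceil(lg(64)) = 6 shift and mask pairs are needed. The function name `count_bits_parallel64` is an assumption for this example; the first two steps follow the 32-bit code and the remaining four share one expression in a loop.

```c
#include <stdint.h>

/* Parallel bit count extended to 64 bits: six steps, one per S/B pair. */
unsigned int count_bits_parallel64(uint64_t v)
{
    static const unsigned int S[] = {1, 2, 4, 8, 16, 32};
    static const uint64_t B[] = {
        0x5555555555555555ULL, 0x3333333333333333ULL,
        0x0F0F0F0F0F0F0F0FULL, 0x00FF00FF00FF00FFULL,
        0x0000FFFF0000FFFFULL, 0x00000000FFFFFFFFULL
    };
    uint64_t c;

    c = v - ((v >> S[0]) & B[0]);          /* counts per 2-bit field */
    c = ((c >> S[1]) & B[1]) + (c & B[1]); /* counts per 4-bit field */
    for (int i = 2; i < 6; i++)
    {
        /* From here on a field cannot overflow, so add first, mask once. */
        c = ((c >> S[i]) + c) & B[i];
    }
    return (unsigned int)c;
}
```

Each step doubles the field width, so the final value of `c` is the count for the whole 64-bit word.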

The best method for counting bits in a 32-bit integer v is the following:

v = v - ((v >> 1) & 0x55555555);                    // reuse input as temporary
v = (v & 0x33333333) + ((v >> 2) & 0x33333333);     // temp
c = ((v + (v >> 4) & 0xF0F0F0F) * 0x1010101) >> 24; // count

The best bit counting method takes only 12 operations, which is the same as the lookup-table method, but avoids the memory and potential cache misses of a table. It is a hybrid between the purely parallel method above and the earlier methods using multiplies (in the section on counting bits with 64-bit instructions), though it doesn't use 64-bit instructions. The counts of bits set in the bytes are computed in parallel, and the sum total of the bits set in the bytes is computed by multiplying by 0x1010101 and shifting right 24 bits.
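A minimal sketch of this method as a function, with the byte-summing multiply split onto its own line. The name `count_bits_best32` is illustrative, and the extra parentheses around the add only make the existing operator precedence explicit.

```c
#include <stdint.h>

/* 32-bit bit count: parallel per-byte counts, then a multiply to sum them. */
unsigned int count_bits_best32(uint32_t v)
{
    v = v - ((v >> 1) & 0x55555555u);                 /* counts per 2-bit field */
    v = (v & 0x33333333u) + ((v >> 2) & 0x33333333u); /* counts per 4-bit field */
    v = (v + (v >> 4)) & 0x0F0F0F0Fu;                 /* counts per byte (0..8) */
    /* Multiplying by 0x01010101 adds all four byte counts into the top byte. */
    return (unsigned int)((v * 0x01010101u) >> 24);
}
```

For an all-ones input every byte count is 8, so the intermediate value is 0x08080808 and the top byte of the product is 8 + 8 + 8 + 8 = 32.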

A generalization of the best bit counting method to integers of bit-widths up to 128 (parameterized by type T) is this:

v = v - ((v >> 1) & (T)~(T)0/3);                           // temp
v = (v & (T)~(T)0/15*3) + ((v >> 2) & (T)~(T)0/15*3);      // temp
v = (v + (v >> 4)) & (T)~(T)0/255*15;                      // temp
c = (T)(v * ((T)~(T)0/255)) >> (sizeof(v) - 1) * CHAR_BIT; // count
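Since C has no type parameters, one way to try the generalization is a macro that stamps out a function per unsigned type T. This is only a sketch: `DEFINE_COUNT_BITS` and the generated function names are hypothetical, and T is assumed to be an unsigned integer type without padding bits.

```c
#include <limits.h>
#include <stdint.h>

/* Generate a bit-count function NAME for the unsigned integer type T.
 * (T)~(T)0 is the all-ones value of T; dividing it by 3, 15, or 255
 * yields the repeating 01.., 0001.., and 00000001.. bit patterns. */
#define DEFINE_COUNT_BITS(NAME, T)                                      \
    static unsigned int NAME(T v)                                       \
    {                                                                   \
        v = v - ((v >> 1) & (T)~(T)0/3);                                \
        v = (v & (T)~(T)0/15*3) + ((v >> 2) & (T)~(T)0/15*3);           \
        v = (v + (v >> 4)) & (T)~(T)0/255*15;                           \
        return (unsigned int)((T)(v * ((T)~(T)0/255)) >>                \
                              (sizeof(T) - 1) * CHAR_BIT);              \
    }

DEFINE_COUNT_BITS(count_bits_u16, uint16_t)
DEFINE_COUNT_BITS(count_bits_u32, uint32_t)
DEFINE_COUNT_BITS(count_bits_u64, uint64_t)
```

The final shift by `(sizeof(T) - 1) * CHAR_BIT` moves the top byte, which holds the sum of all the per-byte counts, down to the bottom.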

See Ian Ashdown's nice newsgroup post for more information on counting the number of bits set (also known as sideways addition). The best bit counting method was brought to my attention on October 5, 2005 by Andrew Shapira; he found it in pages 187-188 of Software Optimization Guide for AMD Athlon™ 64 and Opteron™ Processors. Charlie Gordon suggested a way to shave off one operation from the purely parallel version in December 2005, and Don Clugston trimmed three more from it on December 30, 2005. I made a typo with Don's suggestion that Eric Cole spotted on January 8, 2006. Eric later suggested the arbitrary bit-width generalization to the best method on November 17, 2006. On April 5, Al Williams observed that I had a line of dead code at the top of the first method.
