Python implementation bitmap data structure

Source: Internet
Author: User
Bitmap is a very common data structure, such as for Bloom filter, for sorting without repeating integers, and so on. Bitmap are typically based on arrays, where each element of an array can be considered a series of binary numbers, and all elements make up a larger binary set. For Python, integer types are signed by default, so the number of available bits for an integer is 31 bits.

Bitmap Realization Idea

Bitmap is used to operate on each one. For example, if a Python array contains 4 32-bit signed integers, the total available bit is 4 * 31 = 124 bits. If you want to operate on a 90th bits, get to the first element of the array of operations, get the appropriate bit index, and then perform the operation.



Shown as a 32-bit integer, the default is a signed type in Python, the highest bit is the sign bit, and bitmap cannot use it. The left side is high, the right side is low, the lowest bit is the No. 0 position.

Bitmap is used to operate on each one. For example, if a Python array contains 4 32-bit signed integers, the total available bit is 4 * 31 = 124 bits. If you want to operate on a 90th bits, get to the first element of the array of operations, get the appropriate bit index, and then perform the operation.

Initialize bitmap

First you need to initialize the bitmap. Take 90 for this integer, because a single integer can only use 31 bits, so 90 is divided by 31 and rounded up to know that several array elements are required. The code is as follows:

Copy the Code code as follows:


#!/usr/bin/env python
#coding: UTF8

Class Bitmap (object):
def __init__ (self, max):
self.size = Int ((max + 31-1)/+) #向上取整

if __name__ = = ' __main__ ':
Bitmap = Bitmap (90)
print ' requires%d elements. '% bitmap.size

Copy the Code code as follows:


$ python bitmap.py
Requires 3 elements.


Computes the index in the array

The index in the array is actually the same as the size of the previously computed array. Just before the maximum number calculation, now replace any integer that needs to be stored. But a little different, the index in the array is rounded down, so the implementation of the Calcelemindex method needs to be modified. Change the code to read as follows:

Copy the Code code as follows:


#!/usr/bin/env python
#coding: UTF8

Class Bitmap (object):
def __init__ (self, max):
Self.size = Self.calcelemindex (max, True)
Self.array = [0 for I in range (self.size)]

def calcelemindex (self, num, up=false):
"Up is true for rounding up, otherwise rounding down"
If up:
return int (num + 31-1) #向上取整
Return NUM/31

if __name__ = = ' __main__ ':
Bitmap = Bitmap (90)
The print ' array requires%d elements. '% bitmap.size
print ' 47 should be stored on the array of%d elements. '% Bitmap.calcelemindex (47)

Copy the Code code as follows:


$ python bitmap.py
The array requires 3 elements.
47 should be stored on the 1th element of the array.

So it's important to get the largest integer, otherwise it's possible to create an array that doesn't fit some data.

Computes the bit index in the array element

The bit index in the array element can be obtained by the modulo operation. A bit index is obtained by making the integer that needs to be stored and 31 modulo. Change the code to read as follows:

Copy the Code code as follows:


#!/usr/bin/env python
#coding: UTF8

Class Bitmap (object):
def __init__ (self, max):
Self.size = Self.calcelemindex (max, True)
Self.array = [0 for I in range (self.size)]

def calcelemindex (self, num, up=false):
"Up is true for rounding up, otherwise rounding down"
If up:
return int (num + 31-1) #向上取整
Return NUM/31

def calcbitindex (self, num):
Return num% 31

if __name__ = = ' __main__ ':
Bitmap = Bitmap (90)
The print ' array requires%d elements. '% bitmap.size
print ' 47 should be stored on the array of%d elements. '% Bitmap.calcelemindex (47)
print ' 47 should be stored on the%d bit of the array element%d. '% (Bitmap.calcelemindex (+), Bitmap.calcbitindex (47),)

Don't forget to count from the No. 0 place Oh.

1 operation

The bits default is 0, and a position of 1 indicates that the data is stored in this bit. Change the code to read as follows:

Copy the Code code as follows:


#!/usr/bin/env python
#coding: UTF8

Class Bitmap (object):
def __init__ (self, max):
Self.size = Self.calcelemindex (max, True)
Self.array = [0 for I in range (self.size)]

def calcelemindex (self, num, up=false):
"Up is true for rounding up, otherwise rounding down"
If up:
return int (num + 31-1) #向上取整
Return NUM/31

def calcbitindex (self, num):
Return num% 31

def set (self, num):
Elemindex = Self.calcelemindex (num)
Byteindex = Self.calcbitindex (num)
Elem = Self.array[elemindex]
Self.array[elemindex] = Elem | (1 << byteindex)

if __name__ = = ' __main__ ':
Bitmap = Bitmap (90)
Bitmap.set (0)
Print Bitmap.array

Since the No. 0 position is counted, if you need to store 0, you need to place the No. 0 position 1.

Clear 0 operation

A location of 0, which discards the stored data. The code is as follows:

Copy the Code code as follows:


#!/usr/bin/env python
#coding: UTF8

Class Bitmap (object):
def __init__ (self, max):
Self.size = Self.calcelemindex (max, True)
Self.array = [0 for I in range (self.size)]

def calcelemindex (self, num, up=false):
"Up is true for rounding up, otherwise rounding down"
If up:
return int (num + 31-1) #向上取整
Return NUM/31

def calcbitindex (self, num):
Return num% 31

def set (self, num):
Elemindex = Self.calcelemindex (num)
Byteindex = Self.calcbitindex (num)
Elem = Self.array[elemindex]
Self.array[elemindex] = Elem | (1 << byteindex)

def clean (self, i):
Elemindex = Self.calcelemindex (i)
Byteindex = Self.calcbitindex (i)
Elem = Self.array[elemindex]
Self.array[elemindex] = Elem & (~ (1 << byteindex))

if __name__ = = ' __main__ ':
Bitmap = Bitmap (87)
Bitmap.set (0)
Bitmap.set (34)
Print Bitmap.array
Bitmap.clean (0)
Print Bitmap.array
Bitmap.clean (34)
Print Bitmap.array

Clear 0 and set 1 are reciprocal operations.

Test whether a bit is 1

Determines whether a bit is 1 in order to take out previously stored data. The code is as follows:

Copy the Code code as follows:


#!/usr/bin/env python
#coding: UTF8

Class Bitmap (object):
def __init__ (self, max):
Self.size = Self.calcelemindex (max, True)
Self.array = [0 for I in range (self.size)]

def calcelemindex (self, num, up=false):
"Up is true for rounding up, otherwise rounding down"
If up:
return int (num + 31-1) #向上取整
Return NUM/31

def calcbitindex (self, num):
Return num% 31

def set (self, num):
Elemindex = Self.calcelemindex (num)
Byteindex = Self.calcbitindex (num)
Elem = Self.array[elemindex]
Self.array[elemindex] = Elem | (1 << byteindex)

def clean (self, i):
Elemindex = Self.calcelemindex (i)
Byteindex = Self.calcbitindex (i)
Elem = Self.array[elemindex]
Self.array[elemindex] = Elem & (~ (1 << byteindex))

def test (self, i):
Elemindex = Self.calcelemindex (i)
Byteindex = Self.calcbitindex (i)
If Self.array[elemindex] & (1 << byteindex):
Return True
Return False

if __name__ = = ' __main__ ':
Bitmap = Bitmap (90)
Bitmap.set (0)
Print Bitmap.array
Print Bitmap.test (0)
Bitmap.set (1)
Print Bitmap.test (1)
Print Bitmap.test (2)
Bitmap.clean (1)
Print Bitmap.test (1)

Copy the Code code as follows:


$ python bitmap.py
[1, 0, 0]
True
True
False
False


Next, you implement a sort of non-repeating array. It is known that the largest element of an unordered, non-negative integer array is 879, which is ordered naturally. The code is as follows:
Copy the Code code as follows:


#!/usr/bin/env python
#coding: UTF8

Class Bitmap (object):
def __init__ (self, max):
Self.size = Self.calcelemindex (max, True)
Self.array = [0 for I in range (self.size)]

def calcelemindex (self, num, up=false):
"Up is true for rounding up, otherwise rounding down"
If up:
return int (num + 31-1) #向上取整
Return NUM/31

def calcbitindex (self, num):
Return num% 31

def set (self, num):
Elemindex = Self.calcelemindex (num)
Byteindex = Self.calcbitindex (num)
Elem = Self.array[elemindex]
Self.array[elemindex] = Elem | (1 << byteindex)

def clean (self, i):
Elemindex = Self.calcelemindex (i)
Byteindex = Self.calcbitindex (i)
Elem = Self.array[elemindex]
Self.array[elemindex] = Elem & (~ (1 << byteindex))

def test (self, i):
Elemindex = Self.calcelemindex (i)
Byteindex = Self.calcbitindex (i)
If Self.array[elemindex] & (1 << byteindex):
Return True
Return False

if __name__ = = ' __main__ ':
MAX = 879
Suffle_array = [45, 2, 78, 35, 67, 90, 879, 0, 340, 123, 46]
result = []
Bitmap = Bitmap (MAX)
For num in Suffle_array:
Bitmap.set (num)

For I in range (MAX + 1):
If Bitmap.test (i):
Result.append (i)

print ' original array:%s '% Suffle_array
print ' sorted array is:%s '% result

Bitmap is implemented, it is very simple to use it for sorting. Other languages can also implement bitmap, but for statically typed languages, such as C/golang, because you can directly declare unsigned integers, the available bits become 32 bits, just change the above code 31 to 32, please note.

  • Related Article

    Contact Us

    The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

    If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

    A Free Trial That Lets You Build Big!

    Start building with 50+ products and up to 12 months usage for Elastic Compute Service

    • Sales Support

      1 on 1 presale consultation

    • After-Sales Support

      24/7 Technical Support 6 Free Tickets per Quarter Faster Response

    • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.