Detailed Description of a Bitmap Data Structure Implemented in Python

Source: Internet
Author: User

A bitmap is a common data structure, used for example in Bloom filters and for sorting non-repeated integers. A bitmap is usually built on an array: each element is treated as a run of binary digits, and all the elements together form one larger set of bits. This article treats each array element as a 32-bit signed integer and leaves the sign bit unused, so 31 bits are available per element. (Python's own integers are not limited to 32 bits; the 31-bit width is a convention adopted here.)

A bitmap operates on individual bits. For example, if a Python array holds four 32-bit signed integers, the total number of available bits is 4 * 31 = 124. To operate on bit 90, first find the array element that contains it, then find the bit's index within that element, and then perform the operation.
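The lookup described above can be sketched in a few lines. This is only an illustration: BITS_PER_ELEM and locate are hypothetical names that do not appear in the article's class, and the 31-bit width is the convention assumed in this article.

```python
# Illustrative sketch; BITS_PER_ELEM and locate are hypothetical names.
BITS_PER_ELEM = 31  # usable bits per array element, as assumed in this article


def locate(n):
    """Return (element index, bit index) for bit number n, both counted from 0."""
    return divmod(n, BITS_PER_ELEM)


elem_index, bit_index = locate(90)
print(elem_index, bit_index)  # 2 28
```

Bit 90 therefore lives in the third element (index 2), at bit 28 of that element.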

Initialize bitmap

First, initialize the bitmap. Take the integer 90 as an example: because a single element provides only 31 usable bits, the number of array elements needed is 90 divided by 31, rounded up. The code is as follows:

#!/usr/bin/env python
# coding: utf8

class Bitmap(object):
    def __init__(self, max):
        self.size = (max + 31 - 1) // 31  # round up

if __name__ == '__main__':
    bitmap = Bitmap(90)
    print('requires %d elements.' % bitmap.size)

Output:

$ python bitmap.py
requires 3 elements.
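The rounding-up expression can be checked against the standard library. The following is a minimal sketch, assuming Python 3 (where / is true division); elements_needed is a hypothetical helper name, not part of the article's class.

```python
import math

BITS_PER_ELEM = 31


def elements_needed(n):
    """Ceiling of n / 31 using integer arithmetic only."""
    return (n + BITS_PER_ELEM - 1) // BITS_PER_ELEM


# The integer formula agrees with math.ceil for these small inputs.
for n in (1, 31, 32, 90):
    assert elements_needed(n) == math.ceil(n / BITS_PER_ELEM)

print(elements_needed(90))  # 3
```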


Calculate the index in the array

Calculating an element's index in the array works the same way as calculating the array size. The difference is that the size was computed from the maximum number and rounded up, whereas the index is computed for whichever integer is being stored and is rounded down. The calculation is therefore moved into a calcElemIndex method that supports both kinds of rounding. The code becomes:

#!/usr/bin/env python
# coding: utf8

class Bitmap(object):
    def __init__(self, max):
        self.size = self.calcElemIndex(max, True)
        self.array = [0 for i in range(self.size)]

    def calcElemIndex(self, num, up=False):
        '''If up is True, round up; otherwise round down.'''
        if up:
            return (num + 31 - 1) // 31  # round up
        return num // 31

if __name__ == '__main__':
    bitmap = Bitmap(90)
    print('The array requires %d elements.' % bitmap.size)
    print('47 should be stored in array element %d.' % bitmap.calcElemIndex(47))

Output:

$ python bitmap.py
The array requires 3 elements.
47 should be stored in array element 1.

Therefore, it is important to know the maximum integer in advance. Otherwise, the array that is created may be too small to hold some of the data.
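One boundary case is worth checking by hand. With the rounding used above, a maximum that is an exact multiple of 31 yields an array whose highest bit number is one less than that maximum. The sketch below illustrates this; elements_rounded_up and elements_for_max_value are hypothetical helpers, and sizing by max_value + 1 is a suggested adjustment, not something the article's code does.

```python
BITS_PER_ELEM = 31


def elements_rounded_up(max_value):
    """Array size as computed in the article: ceil(max_value / 31)."""
    return (max_value + BITS_PER_ELEM - 1) // BITS_PER_ELEM


def elements_for_max_value(max_value):
    """Hypothetical adjustment: reserve room for the values 0..max_value."""
    return (max_value + 1 + BITS_PER_ELEM - 1) // BITS_PER_ELEM


# 90 needs element index 90 // 31 == 2, which a 3-element array has.
print(elements_rounded_up(90), 90 // BITS_PER_ELEM)   # 3 2
# 93 needs element index 3, but a 3-element array only has indexes 0..2.
print(elements_rounded_up(93), 93 // BITS_PER_ELEM)   # 3 3
print(elements_for_max_value(93))                     # 4
```

The examples in this article (90, 87 and 879) are not multiples of 31, so they are unaffected.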

Calculate the bit index in the array element

The bit index within an array element is obtained with a modulo operation: take the integer to be stored modulo 31. The code becomes:

#!/usr/bin/env python
# coding: utf8

class Bitmap(object):
    def __init__(self, max):
        self.size = self.calcElemIndex(max, True)
        self.array = [0 for i in range(self.size)]

    def calcElemIndex(self, num, up=False):
        '''If up is True, round up; otherwise round down.'''
        if up:
            return (num + 31 - 1) // 31  # round up
        return num // 31

    def calcBitIndex(self, num):
        return num % 31

if __name__ == '__main__':
    bitmap = Bitmap(90)
    print('The array requires %d elements.' % bitmap.size)
    print('47 should be stored in array element %d.' % bitmap.calcElemIndex(47))
    print('47 should be stored in bit %d of array element %d.' % (bitmap.calcBitIndex(47), bitmap.calcElemIndex(47)))

Remember that bits are counted from bit 0.
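The element index and the bit index together form a reversible mapping, which is what later allows stored numbers to be read back. A small sketch, with to_position and from_position as hypothetical names:

```python
BITS_PER_ELEM = 31


def to_position(num):
    """Split num into (element index, bit index)."""
    return num // BITS_PER_ELEM, num % BITS_PER_ELEM


def from_position(elem_index, bit_index):
    """Rebuild the number from its element index and bit index."""
    return elem_index * BITS_PER_ELEM + bit_index


print(to_position(47))        # (1, 16)
print(from_position(1, 16))   # 47
```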

Set a bit to 1

Every bit defaults to 0. Setting a bit to 1 records that the corresponding number is stored. The code becomes:

#!/usr/bin/env python
# coding: utf8

class Bitmap(object):
    def __init__(self, max):
        self.size = self.calcElemIndex(max, True)
        self.array = [0 for i in range(self.size)]

    def calcElemIndex(self, num, up=False):
        '''If up is True, round up; otherwise round down.'''
        if up:
            return (num + 31 - 1) // 31  # round up
        return num // 31

    def calcBitIndex(self, num):
        return num % 31

    def set(self, num):
        elemIndex = self.calcElemIndex(num)
        byteIndex = self.calcBitIndex(num)
        elem = self.array[elemIndex]
        self.array[elemIndex] = elem | (1 << byteIndex)

if __name__ == '__main__':
    bitmap = Bitmap(90)
    bitmap.set(0)
    print(bitmap.array)

Because counting starts at bit 0, storing the value 0 means setting bit 0 to 1.
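To see what the OR with a shifted 1 does, the sketch below applies it to a single integer and prints the result in binary. set_bit is a hypothetical stand-alone helper that mirrors the expression used in the set method.

```python
def set_bit(elem, bit_index):
    """Return elem with the given bit forced to 1; other bits are unchanged."""
    return elem | (1 << bit_index)


elem = 0
elem = set_bit(elem, 0)
elem = set_bit(elem, 3)
print(bin(elem))                  # 0b1001
# Setting a bit that is already 1 changes nothing.
print(set_bit(elem, 3) == elem)   # True
```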

Clear a bit to 0

Clearing a bit to 0 discards the data stored at that position. The code is as follows:

#!/usr/bin/env python
# coding: utf8

class Bitmap(object):
    def __init__(self, max):
        self.size = self.calcElemIndex(max, True)
        self.array = [0 for i in range(self.size)]

    def calcElemIndex(self, num, up=False):
        '''If up is True, round up; otherwise round down.'''
        if up:
            return (num + 31 - 1) // 31  # round up
        return num // 31

    def calcBitIndex(self, num):
        return num % 31

    def set(self, num):
        elemIndex = self.calcElemIndex(num)
        byteIndex = self.calcBitIndex(num)
        elem = self.array[elemIndex]
        self.array[elemIndex] = elem | (1 << byteIndex)

    def clean(self, i):
        elemIndex = self.calcElemIndex(i)
        byteIndex = self.calcBitIndex(i)
        elem = self.array[elemIndex]
        self.array[elemIndex] = elem & (~(1 << byteIndex))

if __name__ == '__main__':
    bitmap = Bitmap(87)
    bitmap.set(0)
    bitmap.set(34)
    print(bitmap.array)
    bitmap.clean(0)
    print(bitmap.array)
    bitmap.clean(34)
    print(bitmap.array)

Clearing a bit to 0 and setting it to 1 are inverse operations.
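The masks involved can be inspected on a single integer. In this sketch, set_bit and clear_bit are hypothetical stand-alone helpers that mirror the expressions used in the set and clean methods.

```python
def set_bit(elem, bit_index):
    """Return elem with the given bit forced to 1."""
    return elem | (1 << bit_index)


def clear_bit(elem, bit_index):
    """Return elem with the given bit forced to 0; other bits are unchanged."""
    return elem & ~(1 << bit_index)


elem = set_bit(set_bit(0, 0), 3)     # bits 0 and 3 set
print(bin(elem))                     # 0b1001
print(bin(clear_bit(elem, 0)))       # 0b1000
# Setting and then clearing a bit that started at 0 gives back the original value.
print(clear_bit(set_bit(8, 0), 0))   # 8
```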

Test whether a bit is 1

Testing whether a bit is 1 is how previously stored data is retrieved. The code is as follows:

#!/usr/bin/env python
# coding: utf8

class Bitmap(object):
    def __init__(self, max):
        self.size = self.calcElemIndex(max, True)
        self.array = [0 for i in range(self.size)]

    def calcElemIndex(self, num, up=False):
        '''If up is True, round up; otherwise round down.'''
        if up:
            return (num + 31 - 1) // 31  # round up
        return num // 31

    def calcBitIndex(self, num):
        return num % 31

    def set(self, num):
        elemIndex = self.calcElemIndex(num)
        byteIndex = self.calcBitIndex(num)
        elem = self.array[elemIndex]
        self.array[elemIndex] = elem | (1 << byteIndex)

    def clean(self, i):
        elemIndex = self.calcElemIndex(i)
        byteIndex = self.calcBitIndex(i)
        elem = self.array[elemIndex]
        self.array[elemIndex] = elem & (~(1 << byteIndex))

    def test(self, i):
        elemIndex = self.calcElemIndex(i)
        byteIndex = self.calcBitIndex(i)
        if self.array[elemIndex] & (1 << byteIndex):
            return True
        return False

if __name__ == '__main__':
    bitmap = Bitmap(90)
    bitmap.set(0)
    print(bitmap.array)
    print(bitmap.test(0))
    bitmap.set(1)
    print(bitmap.test(1))
    print(bitmap.test(2))
    bitmap.clean(1)
    print(bitmap.test(1))

Output:

$ python bitmap.py
[1, 0, 0]
True
True
False
False


Next, we use the bitmap to sort an array of non-repeated integers. Given an unordered array of non-negative integers whose maximum element is 879, sort it in ascending order. The code is as follows:

#!/usr/bin/env python
# coding: utf8

class Bitmap(object):
    def __init__(self, max):
        self.size = self.calcElemIndex(max, True)
        self.array = [0 for i in range(self.size)]

    def calcElemIndex(self, num, up=False):
        '''If up is True, round up; otherwise round down.'''
        if up:
            return (num + 31 - 1) // 31  # round up
        return num // 31

    def calcBitIndex(self, num):
        return num % 31

    def set(self, num):
        elemIndex = self.calcElemIndex(num)
        byteIndex = self.calcBitIndex(num)
        elem = self.array[elemIndex]
        self.array[elemIndex] = elem | (1 << byteIndex)

    def clean(self, i):
        elemIndex = self.calcElemIndex(i)
        byteIndex = self.calcBitIndex(i)
        elem = self.array[elemIndex]
        self.array[elemIndex] = elem & (~(1 << byteIndex))

    def test(self, i):
        elemIndex = self.calcElemIndex(i)
        byteIndex = self.calcBitIndex(i)
        if self.array[elemIndex] & (1 << byteIndex):
            return True
        return False

if __name__ == '__main__':
    MAX = 879
    shuffle_array = [45, 2, 78, 35, 67, 90, 879, 0, 340, 123, 46]
    result = []
    bitmap = Bitmap(MAX)
    for num in shuffle_array:
        bitmap.set(num)

    # Walk the bits in ascending order; every set bit is a stored number.
    for i in range(MAX + 1):
        if bitmap.test(i):
            result.append(i)

    print('original array: %s' % shuffle_array)
    print('sorted array: %s' % result)
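The restriction to non-repeated integers matters: a bit can only record that a value is present, not how many times it occurs. The sketch below shows this with a simplified variant that uses a single Python integer as the bitmap (Python integers grow as needed) instead of the array-based class above; bitmap_sort is a hypothetical name.

```python
def bitmap_sort(values):
    """Sort distinct non-negative integers by marking one bit per value."""
    bits = 0
    for v in values:
        bits |= 1 << v
    # Walk the bit positions from low to high and collect the set ones.
    return [i for i in range(bits.bit_length()) if bits & (1 << i)]


print(bitmap_sort([45, 2, 78, 35, 67, 90, 879, 0, 340, 123, 46]))
# [0, 2, 35, 45, 46, 67, 78, 90, 123, 340, 879]
print(bitmap_sort([5, 3, 5, 1]))  # [1, 3, 5]
```

In the second call the repeated 5 appears only once in the result, so inputs with duplicates lose information.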

Once the bitmap is implemented, sorting with it is straightforward. A bitmap can be implemented in other languages as well. In statically typed languages such as C or Go, unsigned integers can be declared directly, so all 32 bits of an element are available; in that case, change 31 to 32 in the code above.
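Python can approximate the unsigned 32-bit case with the standard array module. The following is an illustrative sketch under stated assumptions, not the article's code: Bitmap32 is a hypothetical class, it stores elements with type code 'L' (unsigned long, at least 32 bits wide), it sizes the array for the values 0..max_value, and the printed output assumes Python 3.

```python
from array import array

BITS = 32  # all 32 bits of an unsigned element are usable


class Bitmap32(object):
    """Hypothetical 32-bits-per-element variant of the bitmap above."""

    def __init__(self, max_value):
        size = (max_value + 1 + BITS - 1) // BITS  # room for 0..max_value
        self.array = array('L', [0] * size)

    def set(self, num):
        self.array[num // BITS] |= 1 << (num % BITS)

    def clean(self, num):
        self.array[num // BITS] &= ~(1 << (num % BITS))

    def test(self, num):
        return bool(self.array[num // BITS] & (1 << (num % BITS)))


bitmap = Bitmap32(63)
bitmap.set(31)   # the top bit of element 0, unused with 31-bit elements
bitmap.set(63)
print(list(bitmap.array))                # [2147483648, 2147483648]
bitmap.clean(31)
print(bitmap.test(31), bitmap.test(63))  # False True
```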
