Tips for Applying Bloom Filters


Jomeng January 29, 2007

 

Below are some tips based on the standard Bloom filter:

 

1. Compute the union of two sets. Suppose two Bloom filters represent the sets S1 and S2, their bit arrays are the same size, and they use the same group of hash functions. To obtain the Bloom filter that represents the union of S1 and S2, you only need to OR the two bit arrays together, bit by bit. The result is identical to the filter you would get by inserting every element of both sets into an empty filter.
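The union trick can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names (`indexes`, `build`, `might_contain`, `union`) are hypothetical, the bit array is packed into a Python int, and the k hash functions are derived from SHA-256 by double hashing, which the article does not prescribe.

```python
import hashlib


def indexes(item, m, k):
    """Derive k bit positions in [0, m) for item (double hashing; an assumption)."""
    digest = hashlib.sha256(item.encode("utf-8")).digest()
    h1 = int.from_bytes(digest[:8], "big")
    h2 = int.from_bytes(digest[8:16], "big") | 1  # force an odd, non-zero step
    return [(h1 + i * h2) % m for i in range(k)]


def build(items, m, k):
    """Return the bit array (packed into an int) of a filter holding items."""
    bits = 0
    for item in items:
        for idx in indexes(item, m, k):
            bits |= 1 << idx
    return bits


def might_contain(bits, item, m, k):
    """True if every bit position of item is set (false positives are possible)."""
    return all((bits >> idx) & 1 for idx in indexes(item, m, k))


def union(bits1, bits2):
    """Bitwise OR of two filters that share m, k, and the hash functions."""
    return bits1 | bits2


M, K = 256, 3
s1 = {"apple", "banana"}
s2 = {"banana", "cherry"}
merged = union(build(s1, M, K), build(s2, M, K))
print(all(might_contain(merged, x, M, K) for x in s1 | s2))
```

Because both filters use the same size and hash functions, `merged` has exactly the bits that inserting every element of both sets into one filter would set.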

 

2. Fold the Bloom filter. If you want to halve the size of a Bloom filter, split its bit array into two halves and OR them together; the result is the filter you want. When looking up an element, mask off the highest bit of each hash index, which for a power-of-two array size is the same as reducing the index modulo the new size.
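A minimal sketch of folding, assuming the original size m is a power of two so that masking the highest index bit is well defined. The helper names are hypothetical, the bit array is packed into a Python int, and the SHA-256 double hashing is an assumption made only for this example.

```python
import hashlib


def indexes(item, m, k):
    """Derive k bit positions in [0, m) for item (double hashing; an assumption)."""
    digest = hashlib.sha256(item.encode("utf-8")).digest()
    h1 = int.from_bytes(digest[:8], "big")
    h2 = int.from_bytes(digest[8:16], "big") | 1  # force an odd, non-zero step
    return [(h1 + i * h2) % m for i in range(k)]


def build(items, m, k):
    """Return the bit array (packed into an int) of a filter holding items."""
    bits = 0
    for item in items:
        for idx in indexes(item, m, k):
            bits |= 1 << idx
    return bits


def fold(bits, m):
    """Halve a filter of even size m by ORing its upper half onto its lower half."""
    half = m // 2
    low = bits & ((1 << half) - 1)
    high = bits >> half
    return low | high


def might_contain_folded(folded, item, m, k):
    """Look up item in the folded filter; m is the ORIGINAL power-of-two size."""
    half = m // 2
    # idx & (half - 1) drops the highest bit of each original index.
    return all((folded >> (idx & (half - 1))) & 1 for idx in indexes(item, m, k))


M, K = 256, 3
items = ["apple", "banana", "cherry"]
folded = fold(build(items, M, K), M)
print(all(might_contain_folded(folded, x, M, K) for x in items))
```

Folding keeps the no-false-negative property but raises the false positive rate, since the same elements now share half as many bits.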

 

3. Estimate the number of elements in the set from the number of 0 bits. In the first article, on the concept and principle of the Bloom filter, we mentioned that the number of 0 bits in the bit array is tightly concentrated around its expected value $m(1 - 1/m)^{kn}$, where m is the size of the bit array, k is the number of hash functions, and n is the number of elements in the set represented by the Bloom filter. Given this formula, knowing the number of 0 bits makes it easy to infer n.
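Solving the expected-zeros relation z ≈ m(1 - 1/m)^(kn) for n gives n ≈ ln(z/m) / (k ln(1 - 1/m)). The sketch below applies that inversion directly; the function names are hypothetical, and the bit array is assumed to be packed into a Python int.

```python
import math


def count_zero_bits(bits, m):
    """Number of 0 bits in an m-bit array packed into an int."""
    return m - bin(bits).count("1")


def estimate_count(zero_bits, m, k):
    """Estimate n by inverting zero_bits ~= m * (1 - 1/m) ** (k * n)."""
    if zero_bits <= 0:
        raise ValueError("a saturated filter gives no usable estimate")
    return math.log(zero_bits / m) / (k * math.log(1 - 1 / m))


# Round trip: start from the expected number of zeros for n = 100.
m, k, n = 1024, 3, 100
expected_zeros = m * (1 - 1 / m) ** (k * n)
print(round(estimate_count(expected_zeros, m, k)))
```

On a real filter the zero count is an integer that fluctuates around its expectation, so the estimate is approximate rather than exact.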

 

4. Estimate the size of the intersection of two sets from the inner product. Suppose two Bloom filters represent the sets S1 and S2, with the same array size and the same group of hash functions. Consider the probability that the i-th bit is set to 1 in both Bloom filters. There are only two ways this can happen: either the bit is set by an element of S1 ∩ S2, or it is set both by an element of S1 − (S1 ∩ S2) and by an element of S2 − (S1 ∩ S2). Therefore, the probability that the i-th bit is set to 1 in both Bloom filters is:

$$\left(1 - \left(1 - \frac{1}{m}\right)^{k|S_1 \cap S_2|}\right) + \left(1 - \frac{1}{m}\right)^{k|S_1 \cap S_2|} \left(1 - \left(1 - \frac{1}{m}\right)^{k|S_1 - (S_1 \cap S_2)|}\right) \left(1 - \left(1 - \frac{1}{m}\right)^{k|S_2 - (S_1 \cap S_2)|}\right)$$

Here |S| denotes the number of elements in S, k the number of hash functions, and m the size of the bit array. After simplifying and multiplying by m, the expected value of the inner product of the two bit arrays is:

$$m\left(1 - \left(1 - \frac{1}{m}\right)^{k|S_1|} - \left(1 - \frac{1}{m}\right)^{k|S_2|} + \left(1 - \frac{1}{m}\right)^{k(|S_1| + |S_2| - |S_1 \cap S_2|)}\right)$$

If you do not know |S1| and |S2|, you can estimate them from the number of 0 bits using the method in tip 3. Then, given the inner product, the formula above makes it easy to infer |S1 ∩ S2|.
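As a rough sketch, the intersection size can be recovered by inverting the expected inner product, taken here to be m(1 - a^(k|S1|) - a^(k|S2|) + a^(k(|S1| + |S2| - |S1 ∩ S2|))) with a = 1 - 1/m. That expression follows from inclusion-exclusion on the events "bit i is 0 in filter 1" and "bit i is 0 in filter 2". The function names are hypothetical, and the bit arrays are assumed to be packed into Python ints.

```python
import math


def inner_product(bits1, bits2):
    """Number of positions that are 1 in both bit arrays."""
    return bin(bits1 & bits2).count("1")


def expected_inner_product(n1, n2, n_common, m, k):
    """Expected inner product for sets of size n1 and n2 sharing n_common elements."""
    a = 1 - 1 / m
    return m * (1 - a ** (k * n1) - a ** (k * n2) + a ** (k * (n1 + n2 - n_common)))


def estimate_intersection(product, n1, n2, m, k):
    """Solve the expected-inner-product formula for the intersection size."""
    a = 1 - 1 / m
    term = product / m - 1 + a ** (k * n1) + a ** (k * n2)
    if term <= 0:
        raise ValueError("inner product is inconsistent with the given sizes")
    return n1 + n2 - math.log(term) / (k * math.log(a))


# Round trip: the expected inner product for 50 shared elements maps back to 50.
m, k = 4096, 3
z = expected_inner_product(300, 200, 50, m, k)
print(round(estimate_intersection(z, 300, 200, m, k)))
```

With measured data, `product` would come from `inner_product` on two real filters, and the result is only an estimate because the observed value fluctuates around its expectation.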

 

5. Represent the universal set. Representing the universal set is very easy: set every bit of the array to 1, because a lookup for any element then returns a positive result.
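A small sketch of the all-ones filter, with hypothetical helper names, a bit array packed into a Python int, and SHA-256 double hashing assumed purely for illustration:

```python
import hashlib


def indexes(item, m, k):
    """Derive k bit positions in [0, m) for item (double hashing; an assumption)."""
    digest = hashlib.sha256(item.encode("utf-8")).digest()
    h1 = int.from_bytes(digest[:8], "big")
    h2 = int.from_bytes(digest[8:16], "big") | 1  # force an odd, non-zero step
    return [(h1 + i * h2) % m for i in range(k)]


def might_contain(bits, item, m, k):
    """True if every bit position of item is set."""
    return all((bits >> idx) & 1 for idx in indexes(item, m, k))


def universal_filter(m):
    """An m-bit array with every bit set to 1."""
    return (1 << m) - 1


M, K = 64, 3
everything = universal_filter(M)
print(all(might_contain(everything, x, M, K) for x in ["apple", "zebra", ""]))
```

Whatever positions the hash functions pick, they all land on 1 bits, so every lookup succeeds.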
