Bloom Filter

Source: Internet
Author: User

A Bloom filter tests whether an element belongs to a given set. It is a space-efficient probabilistic data structure with a nonzero false positive rate: the filter may report that an element is in the set when in fact it is not. It never produces a false negative, however. If an element is in the set, the filter always reports it as present, so recall is 100%.

Algorithm Description

A Bloom filter is a bit array of m bits, all initially 0, together with k different hash functions. Each hash function maps a given element to one of the m array positions (1 to m) and should distribute elements uniformly at random across those positions.

Adding an element does not store the element's data. Instead, the k hash functions map the element to k positions in the bit array, and the bit at each of those positions is set to 1 to record that the element is in the set.

To check whether an element is in the set, map it to k positions with the same k hash functions. If the bit at any of those positions is 0, the element is definitely not in the set. If all k bits are 1, there are two possibilities: (1) the element is in the set, or (2) the result is a false positive, meaning the element is not in the set but other elements happened to set those bits. A simple Bloom filter has no way to distinguish these two cases.
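The add and check steps described above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the class name `BloomFilter`, the method names, and the choice of salted SHA-256 to stand in for k independent hash functions are all assumptions made for this example. Positions here run from 0 to m-1 rather than 1 to m.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter sketch: m bits and k hash functions."""

    def __init__(self, m: int, k: int) -> None:
        self.m = m
        self.k = k
        # One byte per bit for readability; a real filter would pack bits.
        self.bits = bytearray(m)

    def _positions(self, item: str):
        # Assumption: simulate k hash functions by salting SHA-256 with i.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item: str) -> None:
        # Set the bit at each of the k positions; the item itself is not stored.
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item: str) -> bool:
        # Any 0 bit means "definitely absent"; all 1s means "possibly present".
        return all(self.bits[pos] for pos in self._positions(item))
```

A filter built as `BloomFilter(18, 3)` and fed `"x"`, `"y"`, and `"z"` will always answer `True` for those three elements. For any other element the answer is usually `False`, but it can be `True` when the element's positions collide with bits that are already set.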
In the example, the Bloom filter has a bit array of size 18 and 3 hash functions. The set contains three elements, {x, y, z}, and each element is mapped by the hash functions to three positions in the bit array, whose bits are set to 1. To determine whether an element w is in the set, first apply the 3 hash functions to w to obtain three positions in the bit array, then check whether the bits at all three positions are 1. In this example, one of those bits is not 1, so w is not in the set.

A simple Bloom filter, in which each bit has only two states (0 and 1), does not support removing elements from the set. Deleting an element would mean setting the bits at its hashed positions back to 0, but there is no way to tell whether those positions are also used by other elements in the set. If a bit shared with another element is cleared, a false negative results: that element really is in the set, but the Bloom filter no longer reports it.

Key Implementation

A simple Bloom filter implementation involves three important attributes: the bit array, the hash functions, and the false positive rate. Let m be the size of the bit array, n the number of elements in the set, k the number of hash functions, and p the false positive rate. The standard approximations relating them are:

p ≈ (1 - e^(-kn/m))^k

For given m and n, the value of k that minimizes p is k = (m/n) ln 2, and the bit array size needed for a target p at that k is m = -n ln p / (ln 2)^2.

This introduction is limited to the simple Bloom filter. For more complex implementations and the derivation of the formulas, refer to http://en.wikipedia.org/wiki/Bloom_filter.
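The relationship between m, n, k, and p can be made concrete with a small calculation. The sketch below uses the standard approximations p ≈ (1 - e^(-kn/m))^k, k = (m/n) ln 2, and m = -n ln p / (ln 2)^2. The function names `false_positive_rate` and `optimal_parameters` are hypothetical, chosen only for this illustration.

```python
import math
from typing import Tuple


def false_positive_rate(m: int, n: int, k: int) -> float:
    """Approximate false positive rate for m bits, n elements, k hashes."""
    return (1.0 - math.exp(-k * n / m)) ** k


def optimal_parameters(n: int, p: float) -> Tuple[int, int]:
    """Return (m, k) sized for n elements at a target false positive rate p."""
    # m = -n ln p / (ln 2)^2, rounded up to a whole number of bits.
    m = math.ceil(-n * math.log(p) / (math.log(2) ** 2))
    # k = (m / n) ln 2, rounded to the nearest integer and at least 1.
    k = max(1, round(m / n * math.log(2)))
    return m, k
```

For n = 1000 elements and a target rate of p = 0.01, `optimal_parameters(1000, 0.01)` gives m = 9586 bits and k = 7 hash functions, or roughly 9.6 bits per element. For the 18-bit, 3-hash, 3-element example, `false_positive_rate(18, 3, 3)` comes to about 0.06.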
