I don't know why, but this afternoon Bloom filters suddenly popped into my head. I have been writing crawlers for a long time without ever finding a use case for one, so I decided to implement one myself for fun. The principle is simple. First define a bit array of length n with every bit set to 0. To add a record, hash it k times, take each hash as an integer modulo n to get an index, and set the bit at that index to 1. To check a record, do the same hashing and test whether every corresponding bit is 1. If any one of them is not 1, the record definitely does not exist. If all of them are 1, the record is still not necessarily there, because other records may have set those same bits.
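The add and check steps described above can be sketched in a few lines of Python. This is only a minimal in-memory illustration, not the code from the project linked below: the class name `BloomFilter`, the parameters `n_bits` and `k_hashes`, and the choice of salted SHA-256 to get k different hashes are all assumptions made for the example.

```python
import hashlib


class BloomFilter:
    """Minimal in-memory Bloom filter (illustrative sketch only)."""

    def __init__(self, n_bits=1024, k_hashes=3):
        self.n_bits = n_bits        # n: length of the bit array
        self.k_hashes = k_hashes    # k: number of hashes per record
        # One byte holds 8 bits; every bit starts at 0.
        self.bits = bytearray((n_bits + 7) // 8)

    def _indexes(self, record):
        # Assumption: derive k hashes by prefixing the record with the
        # hash number, then reduce each hash modulo n to get a bit index.
        for i in range(self.k_hashes):
            digest = hashlib.sha256(f"{i}:{record}".encode("utf-8")).hexdigest()
            yield int(digest, 16) % self.n_bits

    def add(self, record):
        # Set the bit at each of the k indexes to 1.
        for idx in self._indexes(record):
            self.bits[idx // 8] |= 1 << (idx % 8)

    def __contains__(self, record):
        # Any 0 bit means the record was never added.
        # All 1 bits only means the record was probably added.
        return all(
            self.bits[idx // 8] & (1 << (idx % 8))
            for idx in self._indexes(record)
        )


bf = BloomFilter()
bf.add("https://example.com/page1")
print("https://example.com/page1" in bf)  # True: added records are always found
print("https://example.com/page2" in bf)  # almost certainly False, but a false positive is possible
```

For a crawler, the records would typically be URLs that have already been fetched. A larger n and a sensible k reduce the false positive rate at the cost of more memory and more hashing.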
There are already plenty of explanations of the Bloom filter principle online, so I will just give a link: https://blog.csdn.net/hguisu/article/details/7866173
Following that principle, I implemented a simple Bloom filter that supports three working modes: Redis, memory, and file.
https://github.com/lujinda/simplebloom