1. Given two integer sets A and B, each containing 2 billion distinct integers, give an algorithm that quickly computes the intersection A ∩ B. The algorithm may use external storage, but its memory usage must not exceed 4 GB.
A:
The basic idea is to use one bitmap per set.
Reasoning: the largest integer is 2^32 - 1 (treating the values as unsigned 32-bit integers). If each bit records whether one number is present, a bitmap covering the whole range needs 2^32 / 32 = 2^27 ints, about 0.13 billion. Each bitmap therefore occupies 4 bytes * 2^27 = 512 MB, about 0.5 GB, and two bitmaps occupy about 1 GB, which is within the 4 GB limit in the question.
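The size estimate can be checked with a few lines of Python. This is only a worked version of the arithmetic, assuming unsigned 32-bit values and 4-byte ints; the variable names are made up for illustration.

```python
# Worked arithmetic for the bitmap size (assumes unsigned 32-bit values).
UNIVERSE = 2 ** 32        # number of distinct values a bitmap must cover
BITS_PER_INT = 32         # one 4-byte int stores 32 flag bits
BYTES_PER_INT = 4

ints_per_bitmap = UNIVERSE // BITS_PER_INT          # 2**27 ints
bytes_per_bitmap = ints_per_bitmap * BYTES_PER_INT  # 2**29 bytes = 512 MiB
total_bytes = 2 * bytes_per_bitmap                  # two bitmaps = 1 GiB

print(ints_per_bitmap, bytes_per_bitmap, total_bytes)
```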
Therefore, the solution is:
1) Allocate two int arrays of 2^32 / 32 elements each, one per set.
2) Scan sets A and B in turn. For each integer in a set, set the corresponding bit in that set's array to 1.
3) Bitwise AND the two arrays, which serve as flag bits. Every bit that is still 1 marks an integer in the intersection.
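The three steps above can be sketched in Python as follows. This is a minimal illustration, not a tuned implementation: it uses a bytearray (8 flags per byte) in place of a 32-bit int array, which holds the same number of bits, and the function names and the universe_bits parameter are assumptions added so the sketch can be tried on a small range.

```python
def build_bitmap(values, universe_bits=32):
    """Steps 1 and 2: allocate a bitmap and set bit v for every v in values."""
    bitmap = bytearray(((1 << universe_bits) + 7) // 8)
    for v in values:
        bitmap[v >> 3] |= 1 << (v & 7)   # byte index v // 8, bit index v % 8
    return bitmap


def intersect(values_a, values_b, universe_bits=32):
    """Step 3: AND the two bitmaps and list the integers whose bit is set."""
    bm_a = build_bitmap(values_a, universe_bits)
    bm_b = build_bitmap(values_b, universe_bits)
    result = []
    for byte_index, (x, y) in enumerate(zip(bm_a, bm_b)):
        common = x & y
        if common:                        # skip bytes with no shared members
            for bit in range(8):
                if common & (1 << bit):
                    result.append((byte_index << 3) | bit)
    return result


# Small demonstration over an 8-bit universe (values 0..255).
print(intersect([1, 5, 9, 200], [5, 200, 7], universe_bits=8))
```

For the real problem the two sets would be streamed from external storage rather than passed as lists, and the result would be written back out instead of collected in memory.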
Follow-up: if the two sets each contain 2 billion URLs instead of integers, how can we find their intersection? (Hint: use a Bloom filter, which turns the problem into a lookup operation.) How would this be implemented?
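One possible shape for the Bloom filter approach, as a hedged sketch: build a filter from the URLs of A, then stream the URLs of B and keep those the filter reports as present. The class and function names, the SHA-256-based double hashing, and the default sizes are all assumptions made for illustration. A Bloom filter never misses a real member but can report false positives, so the output is a candidate set that still needs an exact check if an exact intersection is required.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter: num_hashes bit positions per item."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive several positions from one digest (double hashing).
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1   # odd step, never zero
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos >> 3] |= 1 << (pos & 7)

    def __contains__(self, item):
        return all(self.bits[pos >> 3] & (1 << (pos & 7))
                   for pos in self._positions(item))


def candidate_intersection(urls_a, urls_b, num_bits=1 << 20, num_hashes=7):
    """Return the URLs of B that the filter built from A reports as present."""
    bloom = BloomFilter(num_bits, num_hashes)
    for url in urls_a:
        bloom.add(url)
    return [url for url in urls_b if url in bloom]


a = ["http://example.com/1", "http://example.com/2", "http://example.com/3"]
b = ["http://example.com/2", "http://example.com/3", "http://example.com/9"]
print(candidate_intersection(a, b))
```

If the whole 4 GB budget went to a single filter for 2 billion URLs, that would be roughly 16 bits per URL, which in the standard analysis keeps the false-positive rate well under 1%.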