Chapter 2 of Programming Pearls
Question 1: Given a file containing 4 billion 32-bit integers in random order, find a 32-bit integer that does not appear in the file.
A: A 32-bit integer can take 2^32 (0x100000000) distinct values. If one bit is used to represent each value, about 512 MB of memory is required. If that much memory is available, you can build a bitmap of this size, set the bit for every integer in the file, and then scan the bitmap for an unset bit to find a missing integer quickly.
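The bitmap idea can be sketched roughly as follows. This is a minimal illustration, not a tuned implementation: the function name and the `bits` parameter are hypothetical, and `bits` exists only so the sketch can be tried on a small value range instead of allocating the full 512 MB.

```python
def find_missing_with_bitmap(numbers, bits=32):
    """Return one integer in [0, 2**bits) that is absent from `numbers`,
    or None if every value is present. `bits` is an illustrative knob."""
    size = 1 << bits
    # One bit per possible value: 2**32 bits = 512 MiB when bits=32.
    bitmap = bytearray((size + 7) // 8)
    for x in numbers:
        bitmap[x >> 3] |= 1 << (x & 7)
    # Scan for the first byte that still has an unset bit.
    for byte_index, byte in enumerate(bitmap):
        if byte != 0xFF:
            for bit in range(8):
                if not byte & (1 << bit):
                    value = (byte_index << 3) | bit
                    # Padding bits beyond `size` do not count as missing values.
                    return value if value < size else None
    return None
```

For example, `find_missing_with_bitmap([0, 1, 2, 4, 5], bits=4)` returns 3.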
Question 2: In Question 1, if memory is limited to 100 MB, how can the problem be solved?
Answer: Use a multi-pass algorithm and read the file six times. On pass N (N = 1 to 6), process only the integers in the range [(0xffffffff/6) * (N-1), (0xffffffff/6) * N] and build a bitmap for that range alone. Each range covers about 716 million values, so its bitmap takes roughly 85 MB and fits within the limit.
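A minimal sketch of the multi-pass idea, under a few assumptions: `read_numbers` is a hypothetical callable that returns a fresh iterator over the data each time it is called (standing in for one scan of the file), and `bits` and `passes` are illustrative parameters so the code can be exercised on a small range.

```python
def find_missing_multipass(read_numbers, bits=32, passes=6):
    """Scan the data once per sub-range, keeping a bitmap for that sub-range only.
    Returns a missing value, or None if all values are present."""
    total = 1 << bits
    chunk = -(-total // passes)  # ceiling division: values covered per pass
    for p in range(passes):
        lo = p * chunk
        hi = min(lo + chunk, total)
        if lo >= hi:
            break
        bitmap = bytearray((hi - lo + 7) // 8)
        for x in read_numbers():          # one full scan of the "file"
            if lo <= x < hi:              # ignore values outside this pass's range
                off = x - lo
                bitmap[off >> 3] |= 1 << (off & 7)
        for off in range(hi - lo):        # look for an unset bit in this range
            if not bitmap[off >> 3] & (1 << (off & 7)):
                return lo + off
    return None
```

The sketch stops at the first range that contains a gap, so in the best case it needs only one scan; six scans is the worst case.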
Question 3: In Question 1, if memory is limited to only a few hundred bytes, how can the problem be solved?
A: The multi-pass algorithm above would still work, but the value range would have to be split into millions of sub-ranges and the file scanned millions of times. That cost is far too high to be worthwhile.
Alternatively, instead of dividing the value range of 32-bit integers, divide by the bits of each integer. Each integer has only 32 bits, and each bit is either 0 or 1.
On the first pass, read the file and count how many integers have bit 0 equal to 0 and how many have it equal to 1, then select the class with fewer integers. On the second pass, look only at the integers in the class selected previously, count how many have bit 1 equal to 0 and how many have it equal to 1, and again select the smaller class. On the third pass, do the same for bit 2 within the class selected so far. Continue in this way up to the 32nd pass, which counts the integers with bit 31 equal to 0 and to 1 within the selected class.
If one of the two counts is 0 during any of these passes, a missing integer can be output immediately: its low bits are fixed by the selections made in the earlier passes, the bit currently being counted takes the value whose count is 0, and the remaining high bits can be filled in arbitrarily. Because the smaller class is selected each time and the file holds fewer than 2^32 integers, this happens within at most 32 passes over the file. The disadvantage of this method is that every pass has to read all the integers in the file, even though each pass only cares about at most half of the integers considered in the previous pass.
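The bit-by-bit procedure can be sketched as follows. The names are hypothetical: `read_numbers` is assumed to be a callable that returns a fresh iterator for each scan of the file, and `bits` is only there so the sketch can be tried on small inputs. The sketch also assumes the data holds fewer than `2**bits` entries, which is what guarantees that a count of 0 is eventually reached.

```python
def find_missing_by_bits(read_numbers, bits=32):
    """Fix one bit per scan, always following the class with fewer integers.
    Uses only a handful of counters, independent of the data size."""
    prefix = 0  # low bits chosen so far
    mask = 0    # mask covering the bits already fixed
    for b in range(bits):
        counts = [0, 0]
        for x in read_numbers():
            # Only integers that match the choices made so far are counted.
            if (x & mask) == prefix:
                counts[(x >> b) & 1] += 1
        choice = 0 if counts[0] <= counts[1] else 1  # follow the smaller class
        prefix |= choice << b
        mask |= 1 << b
        if counts[choice] == 0:
            # No integer has these low bits; higher bits are left as 0 here,
            # although any values would do.
            return prefix
    return None  # only reached when the assumption above does not hold
```

With the data `[0, 1, 2, 3, 5, 6, 7]` and `bits=3`, the passes select bit 0 = 0, then bit 1 = 0, then find that no remaining integer has bit 2 = 1, so the result is 4.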
Question 4: In Question 3, can the number of integers read and tested be reduced?
Answer: Modify the answer to Question 3 slightly by partitioning the integers into temporary files while counting them. On the first pass, write every integer whose current bit is 1 to one temporary file (labelled 1) and every integer whose current bit is 0 to a second temporary file (labelled 0), keeping the two counts as before. The next pass reads only the one of these two files that was selected, and splits it again in the same way. Each pass therefore handles at most half as many integers as the previous one, which greatly reduces the number of integers read and tested. The only extra cost is the space for the temporary files.
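A rough sketch of the temporary-file variant, using Python's standard `tempfile` and `struct` modules. The function name and the `bits` parameter are hypothetical, and the sketch first copies the input into a temporary file purely to stay self-contained; a real program would read the original file on the first pass. As before, it assumes fewer than `2**bits` entries.

```python
import struct
import tempfile

def find_missing_by_partition(numbers, bits=32):
    """Split the data by one bit per pass into two temp files and keep the smaller."""
    current = tempfile.TemporaryFile()
    for x in numbers:
        current.write(struct.pack("<I", x))  # 4-byte unsigned little-endian
    prefix = 0
    for b in range(bits):
        zeros, ones = tempfile.TemporaryFile(), tempfile.TemporaryFile()
        counts = [0, 0]
        current.seek(0)
        while True:
            raw = current.read(4)
            if len(raw) < 4:
                break
            (x,) = struct.unpack("<I", raw)
            bit = (x >> b) & 1
            (ones if bit else zeros).write(raw)
            counts[bit] += 1
        current.close()
        choice = 0 if counts[0] <= counts[1] else 1  # keep the smaller file
        prefix |= choice << b
        keep, drop = (zeros, ones) if choice == 0 else (ones, zeros)
        drop.close()
        if counts[choice] == 0:
            keep.close()
            return prefix  # higher bits left as 0
        current = keep     # the next pass reads only the kept half
    current.close()
    return None
```

Since each pass reads at most half of what the previous one read, the total number of integers read stays below twice the size of the original data.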
Question 5: Given a file containing 4.3 billion 32-bit integers in random order, find a 32-bit integer that appears at least twice in the file.
Answer: With the bitmap approach of Question 1, before setting an integer's bit in the bitmap to 1, check whether that bit is already 1. If it is, the integer is a duplicate.
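A minimal sketch of the check-before-set idea, with a hypothetical function name and an illustrative `bits` parameter so it can be run on a small value range:

```python
def find_duplicate_with_bitmap(numbers, bits=32):
    """Return the first integer seen twice, or None if all are distinct."""
    bitmap = bytearray(((1 << bits) + 7) // 8)  # 512 MiB when bits=32
    for x in numbers:
        byte, bit = x >> 3, 1 << (x & 7)
        if bitmap[byte] & bit:   # bit already set: x has been seen before
            return x
        bitmap[byte] |= bit
    return None
```

For example, `find_duplicate_with_bitmap([3, 1, 4, 1, 5], bits=3)` returns 1.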
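With the approaches of Question 2, Question 3, and Question 4, change "select the class with fewer integers" to "select the class with more integers", and change "stop when one class has a count of 0" to "return once the selected class has narrowed to a single integer whose count is greater than 1". Because 4.3 billion is more than 2^32, the larger class always holds more integers than it has distinct values, so it must contain a duplicate.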
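The modified bit-by-bit search for a duplicate might look like the sketch below. The names are hypothetical: `read_numbers` is assumed to return a fresh iterator for each scan, and `bits` is an illustrative parameter. The guarantee relies on the data holding more than `2**bits` entries; otherwise the sketch may return None even if a duplicate exists.

```python
def find_duplicate_by_bits(read_numbers, bits=32):
    """Fix one bit per scan, always following the class with more integers."""
    prefix, mask = 0, 0
    chosen_count = 0
    for b in range(bits):
        counts = [0, 0]
        for x in read_numbers():
            if (x & mask) == prefix:          # matches the bits fixed so far
                counts[(x >> b) & 1] += 1
        choice = 1 if counts[1] > counts[0] else 0  # follow the larger class
        prefix |= choice << b
        mask |= 1 << b
        chosen_count = counts[choice]
    # All bits are fixed, so chosen_count is the number of copies of `prefix`.
    return prefix if chosen_count > 1 else None
```

With the nine values `0..7` plus an extra `5` and `bits=3`, the passes select bit 0 = 1, bit 1 = 0, bit 2 = 1, giving 5 with a final count of 2.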