Question: Sort 2 GB of data. The data: 1. No value exceeds 0.8 billion; 2. The data type is int; 3. Each value can be repeated at most once, that is, it appears at most twice. Memory: up to MB of memory may be used for the operation.

I have heard many solutions to problems like this. Some make multiple passes through memory, and some use external storage. I think both approaches are too slow, although both are still correct, since the question does not seem to constrain efficiency. This time I propose what I think is a better algorithm. If you have a better approach, please leave a comment so we can discuss it together. I hope you will join in.

Idea: split the MB of memory into two equal halves and allocate two arrays. The array arr records every value that occurs, and the array arr_2 records only the values that are repeated. Each value is stored as a single bit in the array. Every array element holds 32 bits, so a value x maps to bit x % 32 of element arr[x / 32]. For example, for 18: 18 / 32 = 0 and 18 % 32 = 18, so 18 corresponds to bit 18 of arr[0]. Likewise for 43: 43 / 32 = 1 and 43 % 32 = 11, so 43 corresponds to bit 11 of arr[1]. Once the position is found, set that bit to 1 (the default is 0); a single pass over the data sets all the corresponding bits in memory. Duplicates are handled by the second array: if the bit in arr is already 1 when a value is looked up, set the same position in arr_2 to 1. When outputting the data, both arrays are traversed in step.

Output: this is the reverse process. Traverse every bit in memory; each set bit is identified by an array subscript and a position within that element, and one multiplication and one addition (subscript * 32 + position) restore the value. The values are then written to a file or printed to the screen in order.

Enough talk. Straight to the code.
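Before the full listing, the index arithmetic from the idea above can be checked in isolation. This is only a small sketch: the helper names word_index, bit_offset and value_at are made up for illustration and do not appear in the original program, and it assumes 32-bit array elements as described in the text.

```c
#include <assert.h>
#include <stdint.h>

#define BITS_PER_WORD 32u   /* assumption: each array element holds 32 bits */

/* Index of the array element that holds the bit for `value`. */
static uint32_t word_index(uint32_t value)
{
    return value / BITS_PER_WORD;
}

/* Position of that bit inside its element (0 = least significant bit). */
static uint32_t bit_offset(uint32_t value)
{
    return value % BITS_PER_WORD;
}

/* Reverse mapping used for output: rebuild the value from its location. */
static uint32_t value_at(uint32_t word, uint32_t bit)
{
    assert(bit < BITS_PER_WORD);
    return word * BITS_PER_WORD + bit;
}
```

With these helpers, 18 maps to element 0, bit 18, and 43 maps to element 1, bit 11; value_at(1, 11) gives 43 back.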
If you have any questions, reply in the thread to discuss them.
<pre class = "cpp" name = "code">
#include <stdio.h>
#include <stdlib.h>
#include <string.h>   /* memset */
#include <stdint.h>   /* uint32_t */

#define NUM (1024 * 1024)            /* size of the data carrier: 32-bit words per bitmap */
#define N   (1024 * 1024 * 128 / 10) /* data volume; divided by 10 so correctness can be tested quickly */

uint32_t arr[NUM];    /* bit set: the value occurs at least once */
uint32_t arr_2[NUM];  /* bit set: the value occurs twice */
uint32_t temp[N];     /* test input; not needed when the data is read from a file */

int main(void)
{
    int i, j, flag = 0;             /* flag counts the values printed so far */
    uint32_t temp_num, temp_num_2;

    /* clear memory */
    memset(arr, 0, sizeof(arr));
    memset(arr_2, 0, sizeof(arr_2));

    /* obtain the data and store it in the array:
       descending values, each stored twice */
    for (i = 0; i + 1 < N; i += 2) {
        temp[i] = N - i;
        temp[i + 1] = N - i;
    }

    /* sorting pass: if the bit in arr is already 1,
       set the corresponding bit in arr_2 as well */
    for (i = 0; i < N; i++) {
        if (((arr[temp[i] / 32] >> (temp[i] % 32)) & 1u) == 1u)
            arr_2[temp[i] / 32] |= (1u << (temp[i] % 32));
        arr[temp[i] / 32] |= (1u << (temp[i] % 32));
    }

    printf("\n");
    /* output pass: walk both bitmaps in step */
    for (i = 0; i < NUM && flag < N; i++) {
        if (arr[i] == 0)
            continue;
        temp_num = arr[i];
        temp_num_2 = arr_2[i];
        for (j = 0; j < 32; j++) {
            if ((temp_num & 1u) == 1u) {
                printf("%d ", (i << 5) + j);
                flag++;
                /* output of duplicate data */
                if ((temp_num_2 & 1u) == 1u) {
                    printf("%d ", (i << 5) + j);
                    flag++;
                }
            }
            temp_num >>= 1;
            temp_num_2 >>= 1;
        }
    }
    printf("\n");
    return 0;
}
</pre>
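The listing above hard-codes its array sizes and prints the result, which makes it awkward to test. As a rough sketch only, the same two-bitmap idea can be wrapped in a function that writes into an output buffer. The name bitmap_sort_twice, its parameters, and the use of calloc are assumptions made for this illustration, not part of the original program. It assumes every input value is below max_value and occurs at most twice; a third occurrence would be silently counted as two.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: sorts `in` (n values, each < max_value, each
   occurring at most twice) into `out`, which must have room for n values.
   Returns the number of values written, or (size_t)-1 if allocation fails. */
static size_t bitmap_sort_twice(const uint32_t *in, size_t n,
                                uint32_t max_value, uint32_t *out)
{
    size_t words = ((size_t)max_value + 31) / 32;
    uint32_t *seen  = calloc(words ? words : 1, sizeof *seen);   /* like arr   */
    uint32_t *twice = calloc(words ? words : 1, sizeof *twice);  /* like arr_2 */
    size_t count = 0;

    if (seen == NULL || twice == NULL) {
        free(seen);
        free(twice);
        return (size_t)-1;
    }

    /* Pass 1: mark each value; a bit that is already set means a duplicate. */
    for (size_t k = 0; k < n; k++) {
        assert(in[k] < max_value);
        size_t w = in[k] / 32;
        uint32_t mask = UINT32_C(1) << (in[k] % 32);
        if (seen[w] & mask)
            twice[w] |= mask;
        seen[w] |= mask;
    }

    /* Pass 2: walk both bitmaps in step and rebuild values in ascending order. */
    for (size_t w = 0; w < words; w++) {
        if (seen[w] == 0)
            continue;
        for (uint32_t b = 0; b < 32; b++) {
            uint32_t mask = UINT32_C(1) << b;
            if (seen[w] & mask) {
                uint32_t v = (uint32_t)(w * 32 + b);
                out[count++] = v;
                if (twice[w] & mask)
                    out[count++] = v;
            }
        }
    }

    free(seen);
    free(twice);
    return count;
}
```

Called on the six values 43, 18, 7, 43, 0, 18 with max_value 64, it should fill out with 0, 7, 18, 18, 43, 43. Memory use is two bits per possible value, independent of how many values are actually present.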