Stating the problem accurately and choosing an algorithm that suits the problem data.
Sort a disk file. The file contains at most 10 million records. Each record is a 7-digit integer with no other associated data, and each integer appears at most once. Only about 1 MB of memory is available. Because this is a real-time system, a response may take at most a few minutes, and 10 seconds is the ideal running time.
Accurate Problem description:
Input: a file containing at most n positive integers, each less than n, where n = 10,000,000 (10^7). It is a fatal error if any integer appears twice in the input. No other data is associated with an integer.
Output: a file containing the input integers in ascending order.
Constraints: about 1 MB of memory is available, and there is plenty of disk space. No further optimization is needed once the running time is under 10 seconds.
A bit vector can be used to represent a set of integers. For example, an 8-bit vector can represent any set of integers in the range 1-8: bit i is 1 exactly when i belongs to the set.
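As a minimal sketch of this idea, the helpers below keep a set of integers from 1 to 8 in a single byte. The type name `set8` and the functions `set8_add` and `set8_contains` are hypothetical names chosen for this illustration, and the mapping of integer i to bit i - 1 is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type: a set of integers 1..8 stored in one byte.
   Bit (i - 1) is 1 exactly when i is a member (assumed mapping). */
typedef uint8_t set8;

/* Return the set s with member i added. */
static set8 set8_add(set8 s, int i) {
    assert(i >= 1 && i <= 8);
    return (set8)(s | (1u << (i - 1)));
}

/* Return 1 if i is in the set s, otherwise 0. */
static int set8_contains(set8 s, int i) {
    assert(i >= 1 && i <= 8);
    return (int)((s >> (i - 1)) & 1u);
}
```

Starting from an empty set (0) and adding 2 and 5 sets bits 1 and 4, so the byte holds the binary pattern 00010010.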
Therefore, the algorithm has three steps:
1. Initialize every bit of the bit vector to 0.
2. Read the disk data; for each integer i read, set bit[i] = 1.
3. Scan the bit vector in order; if bit[i] == 1, print i.
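The three steps can be sketched on an in-memory array before any file handling is involved. The function `bitmap_sort` below is a hypothetical helper written for this illustration. It assumes that `unsigned int` is at least 32 bits wide, and it additionally rejects duplicate and out-of-range values, since the problem statement treats a repeated integer as a fatal error.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical in-memory version of the three steps.
   Sorts n distinct integers from in[] (each in 0..max_value-1) into out[].
   Returns 0 on success, or -1 for a bad max_value, an out-of-range value,
   a duplicate value, or a failed allocation. */
static int bitmap_sort(const int *in, size_t n, int *out, int max_value) {
    unsigned int *bits;
    size_t words, k, count = 0;
    int v;

    assert(in != NULL && out != NULL);
    if (max_value <= 0) return -1;

    words = ((size_t)max_value + 31) / 32;
    bits = calloc(words, sizeof *bits);          /* step 1: all bits are 0 */
    if (bits == NULL) return -1;

    for (k = 0; k < n; k++) {                    /* step 2: mark each value */
        v = in[k];
        if (v < 0 || v >= max_value || (bits[v / 32] & (1u << (v % 32)))) {
            free(bits);                          /* out of range or duplicate */
            return -1;
        }
        bits[v / 32] |= 1u << (v % 32);
    }

    for (v = 0; v < max_value; v++) {            /* step 3: scan in order */
        if (bits[v / 32] & (1u << (v % 32))) {
            out[count++] = v;
        }
    }

    free(bits);
    return 0;
}
```

Because step 3 walks the bit positions from low to high, the values come out sorted without any comparison between elements.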
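Feasibility: 1 MB = 1024 x 1024 x 8 bits = 8,388,608 bits. Representing every 7-digit integer takes 10,000,000 bits, roughly 1.19 MB, so the bit vector holds all the data only because the limit is "about" 1 MB; a vector of exactly 1 MB covers just the values 0 to 8,388,607.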
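The memory arithmetic can be checked with a small helper. The function name `bitvec_bytes` is hypothetical; it simply rounds the number of bits up to whole bytes.

```c
#include <assert.h>

/* Hypothetical helper: bytes needed for a bit vector that has
   one bit for each value in 0..n_values-1, rounded up to whole bytes. */
static long bitvec_bytes(long n_values) {
    assert(n_values >= 0);
    return (n_values + 7) / 8;
}
```

For the 10,000,000 possible 7-digit values this gives 1,250,000 bytes, somewhat more than the 1,048,576 bytes in 1 MB, while 8,388,608 values fit in exactly 1,048,576 bytes.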
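#include <stdio.h>

#define MAX_VALUE 10000000                 /* 7-digit integers: 0 .. 9,999,999 */
#define arrSpace ((MAX_VALUE + 31) / 32)   /* 312500 words of 32 bits each */

unsigned int arr[arrSpace];                /* the bit vector, one bit per value */
char fileName[20];

/* Step 1: clear every bit of the bit vector. */
void init(void){
    int i;
    for(i = 0; i < arrSpace; i++){
        arr[i] = 0;
    }
}

int main(void){
    int n;
    int i, j;
    FILE *in, *out;

    /* Read the input file name; %19s keeps it inside fileName. */
    if(scanf("%19s", fileName) != 1){
        return 1;
    }
    in = fopen(fileName, "r");
    if(in == NULL){
        perror(fileName);
        return 1;
    }
    out = fopen("outSort.txt", "w");
    if(out == NULL){
        perror("outSort.txt");
        fclose(in);
        return 1;
    }

    init();

    /* Step 2: set bit n for every integer n read from the file. */
    while(fscanf(in, "%d", &n) == 1){
        if(n < 0 || n >= MAX_VALUE){
            fprintf(stderr, "value out of range: %d\n", n);
            return 1;
        }
        arr[n / 32] |= 1u << (n % 32);
    }

    /* Step 3: scan the bits in ascending order and write each set position. */
    for(i = 0; i < arrSpace; i++){
        for(j = 0; j < 32; j++){
            if((arr[i] & (1u << j)) != 0){
                fprintf(out, "%d\n", i * 32 + j);
            }
        }
    }

    fclose(in);
    fclose(out);
    return 0;
}
On most current platforms an int occupies 32 bits, so the array of 312,500 elements holds 10,000,000 bits (1,250,000 bytes, about 1.2 MB), one bit for every 7-digit integer. An array of 262,144 ints would be exactly 1 MB, but it would cover only the values 0 to 8,388,607, and any larger input value would be written past the end of the array.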
To record a value, divide it by 32 to find the array element and take the remainder to find the bit inside that element. For example, 60 / 32 = 1 and 60 % 32 = 28, so the value 60 is recorded by setting bit 28 of arr[1] to 1 (the bits of an int are numbered 0 to 31).
To do this, shift the integer 1 left by 28 bits and bitwise-OR the result into arr[1]; the corresponding bit is then set to 1.
For the output, shift 1 left by each of 0 to 31 places and bitwise-AND it with each element of arr. If the result is nonzero, the corresponding integer exists in the data. Writing these integers to the file in the order of traversal produces the sorted output and completes the task.
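The two bit operations described above can be isolated in a pair of small helpers. The names `set_bit`, `test_bit`, and `WORD_BITS` are hypothetical, and the sketch assumes that `unsigned int` is 32 bits wide; it uses an unsigned type so that shifting into the top bit is well defined.

```c
#include <assert.h>

#define WORD_BITS 32   /* assumption: unsigned int is 32 bits wide */

/* Set the bit for value n: word n / 32, bit n % 32. */
static void set_bit(unsigned int *arr, int n) {
    assert(n >= 0);
    arr[n / WORD_BITS] |= 1u << (n % WORD_BITS);
}

/* Return 1 if the bit for value n is set, otherwise 0. */
static int test_bit(const unsigned int *arr, int n) {
    assert(n >= 0);
    return (arr[n / WORD_BITS] & (1u << (n % WORD_BITS))) != 0;
}
```

For the value 60, the helpers touch word 60 / 32 = 1 and bit 60 % 32 = 28, so after `set_bit(arr, 60)` on a zeroed two-element array, arr[0] is still 0 and arr[1] equals 1 shifted left by 28.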