Step-by-step algorithms (finding lost numbers)
[Disclaimer: Copyright reserved. Reprinting is welcome, but please do not use for commercial purposes. Contact: feixiaoxing @163.com]
Suppose we have an array of 100 million numbers (100M items) whose values range from 0 to 100 million. Some values are missing from the array, for example 5 or 10. Is there a way to find the missing values? The problem is not difficult, but it helps us broaden our thinking and keep improving the efficiency of the algorithm.
The simplest idea is to flag every value that appears in the data, then scan the flags in order and print each value that was never flagged.
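#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* assumes every data[i] lies in [0, length) */
void get_lost_number(int data[], int length)
{
    int index;
    unsigned char* pFlag;

    assert(NULL != data && 0 != length);
    pFlag = (unsigned char*)malloc(length * sizeof(unsigned char));
    assert(NULL != pFlag);
    memset(pFlag, 0, length * sizeof(unsigned char));

    /* pass 1: flag every value that appears */
    for (index = 0; index < length; index++) {
        if (0 == pFlag[data[index]])
            pFlag[data[index]] = 1;
    }

    /* pass 2: every value that was never flagged is lost */
    for (index = 0; index < length; index++) {
        if (0 == pFlag[index])
            printf("%d\n", index);
    }

    free(pFlag);
    return;
}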
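As a quick way to check the flag-array idea on a small input, the sketch below is a variant that collects the missing values into a caller-supplied buffer instead of printing them. The name find_lost_numbers and the lost[] output parameter are assumptions made for this illustration, not part of the original code; it also assumes every value lies in [0, length).

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical variant of get_lost_number: stores the missing values in
 * lost[] (which must hold at least length entries) and returns how many
 * were found, or -1 if the flag array cannot be allocated.
 * Assumes every data[i] lies in [0, length). */
int find_lost_numbers(const int data[], int length, int lost[])
{
    int index;
    int count = 0;
    unsigned char* flag;

    assert(NULL != data && NULL != lost && length > 0);

    /* calloc returns zeroed memory, so no separate memset is needed */
    flag = (unsigned char*)calloc((size_t)length, sizeof(unsigned char));
    if (NULL == flag)
        return -1;

    /* pass 1: flag every value that appears */
    for (index = 0; index < length; index++)
        flag[data[index]] = 1;

    /* pass 2: unflagged slots are the lost numbers, in ascending order */
    for (index = 0; index < length; index++) {
        if (0 == flag[index])
            lost[count++] = index;
    }

    free(flag);
    return count;
}
```

For the input {0, 1, 3, 4, 6, 7, 1, 3} with length 8, the function returns 2 and stores 2 and 5 in lost[].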
As you may have noticed, the code above allocates a flag array as long as the original data. In fact, a single bit per value is enough for the visited flag, so the memory we request shrinks to about one eighth.
void get_lost_number(int data[], int length)
{
    int index;
    int size = (length + 7) >> 3;  /* one bit per value, rounded up to whole bytes */
    unsigned char* pFlag;

    assert(NULL != data && 0 != length);
    pFlag = (unsigned char*)malloc(size);
    assert(NULL != pFlag);
    memset(pFlag, 0, size);  /* clear only the bytes actually allocated */

    /* byte data[index] / 8, bit data[index] % 8 */
    for (index = 0; index < length; index++) {
        if (0 == (pFlag[data[index] >> 3] & (1 << (data[index] % 8))))
            pFlag[data[index] >> 3] |= 1 << (data[index] % 8);
    }

    /* test the bit of each candidate value, not of data[index] */
    for (index = 0; index < length; index++) {
        if (0 == (pFlag[index >> 3] & (1 << (index % 8))))
            printf("%d\n", index);
    }

    free(pFlag);
    return;
}
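The bit arithmetic in the code above can be isolated into a few small helpers, which may make the byte index and mask computation easier to follow. The names bitmap_bytes, bitmap_set and bitmap_test are hypothetical; the article inlines these expressions instead.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helpers; the article writes these expressions inline. */

/* bytes needed to hold one bit per value in [0, length) */
size_t bitmap_bytes(int length)
{
    assert(length >= 0);
    return ((size_t)length + 7) >> 3;
}

/* mark value as seen: byte value / 8, bit value % 8 */
void bitmap_set(unsigned char flag[], int value)
{
    assert(value >= 0);
    flag[value >> 3] |= (unsigned char)(1u << (value % 8));
}

/* returns 1 if value has been marked, 0 otherwise */
int bitmap_test(const unsigned char flag[], int value)
{
    assert(value >= 0);
    return (flag[value >> 3] >> (value % 8)) & 1;
}
```

For 100 million values the bitmap needs 12,500,000 bytes, compared with 100,000,000 bytes when each flag occupies a whole byte.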
The code above reduces the space required. Is there also a way to process the data in parallel?
void get_lost_number(int data[], int length)
{
    int index;
    int size = (length + 7) >> 3;
    RANGE range[4] = {0};  /* RANGE and _get_lost_number are defined below */
    unsigned char* pFlag;

    assert(NULL != data && 0 != length);
    pFlag = (unsigned char*)malloc(size);
    assert(NULL != pFlag);
    memset(pFlag, 0, size);

    /* split the input into four quarters; note the parentheses,
       since length >> 2 * 3 would parse as length >> 6 */
    range[0].start = 0;                 range[0].end = length >> 2;
    range[1].start = length >> 2;       range[1].end = length >> 1;
    range[2].start = length >> 1;       range[2].end = (length >> 2) * 3;
    range[3].start = (length >> 2) * 3; range[3].end = length;

#pragma omp parallel for
    for (index = 0; index < 4; index++) {
        _get_lost_number(data, range[index].start, range[index].end, pFlag);
    }

    /* serial scan: every value whose bit is still clear is lost */
    for (index = 0; index < length; index++) {
        if (0 == (pFlag[index >> 3] & (1 << (index % 8))))
            printf("%d\n", index);
    }

    free(pFlag);
    return;
}
For multi-core parallel computing we added the sub-function _get_lost_number; its full definition follows.
typedef struct _RANGE {
    int start;
    int end;
} RANGE;

void _get_lost_number(int data[], int start, int end, unsigned char pFlag[])
{
    int index;

    for (index = start; index < end; index++) {
        /* values from different ranges can share a flag byte,
           so the update is made atomic to avoid lost writes */
#pragma omp atomic
        pFlag[data[index] >> 3] |= 1 << (data[index] % 8);
    }
}
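Putting the pieces together, here is a minimal self-contained sketch of the parallel approach that returns its result in a buffer so it can be checked. The names find_lost_parallel, mark_range and PARTS are assumptions for this illustration. Because the ranges partition the input positions rather than the values, two threads may update the same flag byte, so this sketch uses an OpenMP atomic update. With GCC or Clang, build with -fopenmp to run the marking in parallel; without that option the pragmas are ignored and the code runs serially.

```c
#include <assert.h>
#include <stdlib.h>

#define PARTS 4  /* assumed number of ranges, matching the four used above */

/* mark every value found in data[start, end) in the shared bitmap */
static void mark_range(const int data[], int start, int end,
                       unsigned char flag[])
{
    int index;

    for (index = start; index < end; index++) {
        /* different threads may hit the same byte: update atomically */
        #pragma omp atomic
        flag[data[index] >> 3] |= (unsigned char)(1u << (data[index] % 8));
    }
}

/* Stores the missing values in lost[] (at least length entries) and
 * returns their count, or -1 if allocation fails.
 * Assumes every data[i] lies in [0, length). */
int find_lost_parallel(const int data[], int length, int lost[])
{
    int part;
    int index;
    int count = 0;
    size_t size = ((size_t)length + 7) >> 3;
    unsigned char* flag;

    assert(NULL != data && NULL != lost && length > 0);

    flag = (unsigned char*)calloc(size, 1);
    if (NULL == flag)
        return -1;

    /* each iteration marks one contiguous slice of the input */
    #pragma omp parallel for
    for (part = 0; part < PARTS; part++) {
        int start = (int)((long long)length * part / PARTS);
        int end = (int)((long long)length * (part + 1) / PARTS);
        mark_range(data, start, end, flag);
    }

    /* serial scan: a clear bit means the value never appeared */
    for (index = 0; index < length; index++) {
        if (0 == (flag[index >> 3] & (1 << (index % 8))))
            lost[count++] = index;
    }

    free(flag);
    return count;
}
```

Only the marking pass is parallelized here; the final scan stays serial so that the lost numbers come out in ascending order.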
Summary:
(1) Code can be optimized step by step, but a given optimization does not necessarily suit every scenario.
(2) CPUs are moving from 2 cores to 4 cores to 8 cores, so it is worth learning as much about multicore programming as you can.