Original link: http://lixing123.com/?p=285
I recently took Stanford University's "Algorithms: Design and Analysis" course on Coursera. It is a course well worth taking. One of the homework assignments includes the following problem:
The goal of this problem is to implement a variant of the 2-sum algorithm.
The file contains 1 million integers, both positive and negative (there might be some repetitions!). This is your array of integers, with the ith row of the file specifying the ith entry of the array.
Your task is to compute the number of target values t in the interval [-10000, 10000] (inclusive) such that there are distinct numbers x, y in the input file that satisfy x + y = t. (NOTE: ensuring distinctness requires a one-line addition to the algorithm from lecture.)
Write your numeric answer (an integer between 0 and 20001) in the space provided.
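For reference, the hash-set 2-sum from the lectures looks roughly like the sketch below (my own Objective-C illustration, not the course's code); the `y != x` test is, I believe, the "one-line addition" for distinctness that the note refers to.

```objc
#import <Foundation/Foundation.h>

// Returns YES if two distinct numbers in `values` sum to t.
static BOOL hasDistinctPair(NSSet<NSNumber *> *values, long long t) {
    for (NSNumber *n in values) {
        long long x = n.longLongValue;
        long long y = t - x;
        // The "one-line addition": x and y must be distinct numbers.
        if (y != x && [values containsObject:@(y)]) {
            return YES;
        }
    }
    return NO;
}

int main(void) {
    @autoreleasepool {
        NSSet<NSNumber *> *values =
            [NSSet setWithArray:@[@(-3), @1, @2, @4, @9]];
        NSLog(@"t = 3: %d", hasDistinctPair(values, 3)); // 1 + 2 = 3       -> 1 (YES)
        NSLog(@"t = 8: %d", hasDistinctPair(values, 8)); // only 4 + 4 works -> 0 (NO)
    }
    return 0;
}
```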
The problem provides a file containing the 1 million integers, which I have uploaded to this GitHub project (click here). That project is the Objective-C implementation for this problem.
This is a 2-sum problem. The first method that came to mind was the most brute-force one: traversal. Add every pair of the 1 million integers and record the results. This approach is simple, crude, and ineffective: its asymptotic complexity is O(n^2), so for this problem it needs roughly 1,000,000^2 = 10^12 additions. I tried running this brute-force program; it took about 0.1s to sum one integer with all the others, so the whole run would take about 1,000,000 × 0.1s ÷ 3600 ≈ 28 hours, more than a day. That clearly won't work.
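For illustration, here is a minimal brute-force sketch in Objective-C (my own reconstruction, not the exact program mentioned above; the small `numbers` array stands in for the 1,000,000 integers from the file):

```objc
#import <Foundation/Foundation.h>

int main(void) {
    @autoreleasepool {
        // Stand-in for the 1,000,000 integers read from the file.
        NSArray<NSNumber *> *numbers = @[@(-35000), @(-15000), @25000, @40000, @7, @(-7)];

        BOOL found[20001] = {NO};   // found[t + 10000] == YES if some distinct x + y == t
        for (NSUInteger i = 0; i < numbers.count; i++) {
            for (NSUInteger j = i + 1; j < numbers.count; j++) {
                long long x = numbers[i].longLongValue;
                long long y = numbers[j].longLongValue;
                if (x == y) continue;                   // targets require distinct numbers
                long long t = x + y;
                if (t >= -10000 && t <= 10000) {
                    found[t + 10000] = YES;
                }
            }
        }

        NSUInteger count = 0;
        for (int k = 0; k <= 20000; k++) {
            if (found[k]) count++;
        }
        NSLog(@"Number of achievable targets: %lu", (unsigned long)count);
    }
    return 0;
}
```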
Then I thought: since the problem only asks about targets in the range -10000~10000, for any number x we only need to look at numbers in the range [-10000 - x, 10000 - x]. So we can first sort all 1 million numbers. There are several sorting algorithms with O(n log n) asymptotic complexity; I chose a merge sort I had implemented before. In theory this approach should be fairly fast, but whether because of a bug in my code or for some other reason, it produced no result within 5 minutes. Moreover, this method is also rather fiddly: even with the array sorted, you need to maintain several pointers, and for each number you have to recompute the range of candidate partners, which is error-prone.
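A sketch of this sorted-array idea (again my own reconstruction in Objective-C, not the merge-sort code mentioned above): sort once, then for each x binary-search the first candidate ≥ -10000 - x and scan forward while candidates stay ≤ 10000 - x.

```objc
#import <Foundation/Foundation.h>
#include <stdlib.h>

static int compareLongLong(const void *a, const void *b) {
    long long x = *(const long long *)a, y = *(const long long *)b;
    return (x > y) - (x < y);
}

// Index of the first element >= value (lower bound) in a sorted array.
static size_t lowerBound(const long long *arr, size_t n, long long value) {
    size_t lo = 0, hi = n;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (arr[mid] < value) lo = mid + 1; else hi = mid;
    }
    return lo;
}

int main(void) {
    @autoreleasepool {
        // Stand-in for the 1,000,000 integers read from the file.
        long long numbers[] = {-35000, -15000, 25000, 40000, 7, -7};
        size_t n = sizeof(numbers) / sizeof(numbers[0]);
        qsort(numbers, n, sizeof(long long), compareLongLong);

        BOOL found[20001] = {NO};
        for (size_t i = 0; i < n; i++) {
            long long x = numbers[i];
            // Only partners in [-10000 - x, 10000 - x] can produce a target in range.
            for (size_t j = lowerBound(numbers, n, -10000 - x);
                 j < n && numbers[j] <= 10000 - x; j++) {
                long long y = numbers[j];
                if (y == x) continue;                   // distinct numbers only
                found[x + y + 10000] = YES;
            }
        }

        NSUInteger count = 0;
        for (int k = 0; k <= 20000; k++) if (found[k]) count++;
        NSLog(@"Targets found: %lu", (unsigned long)count);
    }
    return 0;
}
```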
Then I thought of another method. First, I found that the minimum of all these numbers is -99,999,887,310 and the maximum is 99,999,662,302, i.e., they all lie between -10^11 and 10^11. So I distributed the 1 million numbers into 5,000,000 groups: a number x is placed in group[i], where i = |x| / 20,000 (integer division). For example, group[0] holds the numbers in -20,000~20,000, and group[1] holds the numbers in -40,000~-20,000 and 20,000~40,000.
With this grouping, any number x only needs to be compared against the numbers in at most two groups. For example, for 25,000, its useful partners lie in -35,000~-15,000, and those numbers are all contained in group[0] and group[1].
Moreover, there are only 1,000,000 integers but 5,000,000 groups, so the vast majority of groups contain 0 or 1 numbers. The amount of computation drops dramatically, and the asymptotic running time is essentially O(n).
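Below is a sketch of this bucketing idea. One assumption differs from the post: I key buckets by the signed value, floorDiv(value, 20000), in a dictionary, rather than preallocating 5,000,000 arrays indexed by |value| / 20,000; the idea (each x is compared only against the few numbers in at most two buckets) is the same.

```objc
#import <Foundation/Foundation.h>

// Floor division that rounds toward negative infinity (C's / truncates toward zero).
static long long floorDiv(long long a, long long b) {
    long long q = a / b, r = a % b;
    return (r != 0 && ((r < 0) != (b < 0))) ? q - 1 : q;
}

int main(void) {
    @autoreleasepool {
        // Stand-in for the 1,000,000 integers read from the file.
        NSArray<NSNumber *> *numbers = @[@(-35000), @(-15000), @25000, @40000, @7, @(-7)];
        const long long W = 20000;  // bucket width

        // Bucket index -> the numbers falling into that bucket.
        NSMutableDictionary<NSNumber *, NSMutableArray<NSNumber *> *> *buckets =
            [NSMutableDictionary dictionary];
        for (NSNumber *n in numbers) {
            NSNumber *key = @(floorDiv(n.longLongValue, W));
            NSMutableArray<NSNumber *> *bucket = buckets[key];
            if (!bucket) { bucket = [NSMutableArray array]; buckets[key] = bucket; }
            [bucket addObject:n];
        }

        BOOL found[20001] = {NO};
        for (NSNumber *n in numbers) {
            long long x = n.longLongValue;
            long long lo = -10000 - x;            // smallest useful partner for x
            long long firstBucket = floorDiv(lo, W);
            // A 20000-wide interval touches at most two consecutive buckets.
            for (long long b = firstBucket; b <= firstBucket + 1; b++) {
                for (NSNumber *m in buckets[@(b)]) {
                    long long y = m.longLongValue;
                    long long t = x + y;
                    if (y != x && t >= -10000 && t <= 10000) {
                        found[t + 10000] = YES;
                    }
                }
            }
        }

        NSUInteger count = 0;
        for (int k = 0; k <= 20000; k++) if (found[k]) count++;
        NSLog(@"Targets found: %lu", (unsigned long)count);
    }
    return 0;
}
```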
In practice this method is indeed very fast: it finishes in only about 1 second.
That is the story of how the running time of this one problem went from more than a day down to about 1 second.
The implementation code is here: click here. Since I do iOS development, it is written in Objective-C.
Algorithm optimization: from 1 day to 1s