[Reposted from] http://blog.csdn.net/wuzhekai1985/article/details/6597351
Over the past few days I read a blog post on algorithm interview questions. It is well summarized and full of classic problems, most of them drawn from three books: "Programming Pearls", "The Beauty of Programming", and "Beautiful Code". Below I add some of the answers and ideas from those books. If anything is wrong, I would welcome corrections from the experts.
[1] Limited Time
Most interview questions come with a time-complexity requirement. Whenever the wording involves "fastest", the first principle to apply is, without doubt, trading space for time. Hash tables, large arrays, and other auxiliary storage are the first choice. In my own interviews I have used hashing and big arrays several times. However, this is usually not the only solution the interviewer wants to hear; the follow-up is "What if there is only XXXX space?". Leading with this approach still buys you thinking time and shows that your reasoning is complete. Simply put, you can use B...
Eg1.1: Count the number of 1 bits in a char (8 bits), the faster the better. -- The Beauty of Programming
The book gives five methods: (1) division, (2) bit shifting, (3) clearing the lowest set bit in place, so that the complexity depends only on the number of 1 bits, (4) branching, (5) table lookup.
Method 2 uses bitwise operations and is much more efficient than method 1. Method 3 is very clever. Methods 4 and 5 trade space for time, but they do not carry over directly to an int (32 bits). Code for method 3:
int count(unsigned char v) {
    int num = 0;
    while (v) {
        v &= (v - 1);  /* clear the lowest set bit */
        num++;
    }
    return num;
}
Eg1.2: Given an integer array A[N], build another array B[N] without using division, where B[i] = A[0] * A[1] * ... * A[N-1] / A[i]. The expected complexity is O(N). -- TopLanguage
Use two auxiliary arrays C[N] and D[N], where C[i] = A[0] * A[1] * ... * A[i] and D[i] = A[i] * A[i+1] * ... * A[N-1]. Then B[i] = C[i-1] * D[i+1], with B[0] = D[1] and B[N-1] = C[N-2] at the two ends.
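The prefix/suffix idea can be sketched in C++ as follows. This is a minimal illustration, not the original author's code: the function name products_except_self is made up, long long is assumed to be wide enough for the products, and the two auxiliary arrays are folded into running variables.

```cpp
#include <cstddef>
#include <vector>

// Returns B where B[i] is the product of every A[j] with j != i.
// No division is used and the running time is O(N).
std::vector<long long> products_except_self(const std::vector<long long>& a) {
    const std::size_t n = a.size();
    std::vector<long long> b(n, 1);

    // Left-to-right pass: b[i] = A[0] * ... * A[i-1] (the role of C[i-1]).
    long long prefix = 1;
    for (std::size_t i = 0; i < n; ++i) {
        b[i] = prefix;
        prefix *= a[i];
    }

    // Right-to-left pass: multiply in A[i+1] * ... * A[N-1] (the role of D[i+1]).
    long long suffix = 1;
    for (std::size_t i = n; i-- > 0;) {
        b[i] *= suffix;
        suffix *= a[i];
    }
    return b;
}
```

For the input {1, 2, 3, 4} the result is {24, 12, 8, 6}. A zero in the input needs no special handling, which is one practical advantage of avoiding division.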
[2] Limited Space
Space limitations here mean limited memory when processing large volumes of data. In most cases the answer is compression. A bitmap is a good method: it replaces a larger type such as int with one bit (or a few bits). The most common bitmap replaces one int with one bit. In fact, one bit can often stand in for even more space, depending on the information you need to retain...
Eg2.1: A large file stores 7-digit phone numbers without duplicates. Sort them using as little memory as possible. -- Programming Pearls
Use a bitmap. If each number is stored in an int, 40 MB is needed ((10^7 * 4) / 10^6 MB). With a bitmap, each number needs only 1 bit, so 1.25 MB is enough ((10^7 / 8) / 10^6 MB).
Each number corresponds to one bit of the bitmap. The bitmap is cleared at the start. When a number is read, the corresponding bit is set. Finally, the bitmap is scanned in order and the number for every set bit is output.
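A small C++ sketch of the bitmap sort described above. The input is an in-memory vector standing in for the file, and the name bitmap_sort and the max_value parameter are assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Sorts distinct integers in [0, max_value] using one bit per possible value.
std::vector<std::uint32_t> bitmap_sort(const std::vector<std::uint32_t>& input,
                                       std::uint32_t max_value) {
    // One bit per value, packed 8 to a byte; every bit starts cleared.
    std::vector<std::uint8_t> bits(static_cast<std::size_t>(max_value) / 8 + 1, 0);

    for (std::uint32_t x : input) {
        if (x > max_value) continue;  // out of range: ignored in this sketch
        bits[x / 8] |= static_cast<std::uint8_t>(1u << (x % 8));
    }

    // Scan the bitmap in order and emit the value of every set bit.
    std::vector<std::uint32_t> sorted;
    for (std::uint64_t v = 0; v <= max_value; ++v) {
        if (bits[v / 8] & (1u << (v % 8))) {
            sorted.push_back(static_cast<std::uint32_t>(v));
        }
    }
    return sorted;
}
```

With max_value = 9999999 the bitmap occupies about 1.25 MB, matching the estimate in the text. Duplicates would be collapsed, which is why the problem statement's "without duplicates" matters.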
Eg2.2: Given 10 MB of memory and a file of 4 million integers, find an integer that is not in the file.
10 MB of memory can hold a bitmap recording the presence of every number from 0 to 8*10^7 - 1. Scan the file once, setting the bit for each number inside that range and discarding numbers outside it. Then traverse the bitmap and find the first zero bit. Because the file holds only 4*10^6 numbers and the bitmap has 8*10^7 bits, some bit must remain unset.
Extension 1: Given 10 MB of memory and a file of 4 billion integers, find an integer that is not in the file.
The preceding method still works, but it may need several scans. Because the file holds far more than 8*10^7 integers, every bit of the bitmap may be set after the first scan. If that happens, reuse the 10 MB (8*10^7 bits) for the next range, 8*10^7 to 16*10^7 - 1, and scan again. On average, close to one scan is enough.
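The multi-scan idea can be sketched like this in C++. It is only an illustration under stated assumptions: the file is modelled as a vector of 32-bit unsigned values, and find_missing and window_bits are hypothetical names (window_bits plays the role of the 8*10^7 bits that fit in 10 MB).

```cpp
#include <cstdint>
#include <vector>

// Looks for a 32-bit value that is absent from `file`, using a bitmap of only
// `window_bits` bits. Each pass covers one window of values and rescans the
// data. Returns -1 if every 32-bit value is present (or window_bits is 0).
long long find_missing(const std::vector<std::uint32_t>& file,
                       std::uint64_t window_bits) {
    if (window_bits == 0) return -1;
    const std::uint64_t universe = 1ULL << 32;
    for (std::uint64_t base = 0; base < universe; base += window_bits) {
        std::vector<bool> seen(window_bits, false);
        for (std::uint32_t x : file) {  // one scan of the "file"
            if (x >= base && x - base < window_bits) seen[x - base] = true;
        }
        for (std::uint64_t i = 0; i < window_bits && base + i < universe; ++i) {
            if (!seen[i]) return static_cast<long long>(base + i);
        }
    }
    return -1;
}
```

With the file {0, 1, 2, 3, 5} and a 4-bit window, the first pass finds the window [0, 4) full and the second pass returns 4.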
Extension 2: Given 10 MB of memory and a file of 4 billion integers, find an integer that is not in the file. The file may be scanned only once.
For the moment I have not thought of a deterministic algorithm, so here is a probabilistic one. Generate 2 million random numbers and sort them. Scan the file once and remove from this set every number that also appears in the file. For example, if 5 is among the 2 million random numbers and also in the file, remove 5 from the array (simply set it to -1). The expected number of survivors is (2*10^6) * (1 - (4*10^9)/2^32), roughly 137,000 if the integers in the file are distinct. Take any one of them. This almost never fails.
[3] File-Based
More and more large companies ask about file processing. The space limitations discussed above are basically about files too. File-based processing involves searching, sorting, and reducing the number of times the file is read. Besides bitmaps, a sentinel is also worth considering; a typical case is increasing the size of each individual file in an external sort.
Eg3.1: Given a sequential file containing 4,300,000,000 32-bit integers, find an integer that appears at least twice. -- Programming Pearls
Idea 1: If memory is unlimited, use a bitmap; some number will land on a bit that is already set. This is the pigeonhole principle: a 32-bit integer has only 2^32 = 4,294,967,296 distinct values, fewer than 4.3 billion.
Idea 2: If memory is limited, use binary search on the value range. Because 4.3 billion exceeds the number of 32-bit values, the pigeonhole principle guarantees a repeated integer. Treat every value as an unsigned int to simplify the problem, so the search range starts as [0, 2^32) with midpoint 2^31. Traverse the file: if more than 2^31 integers are smaller than 2^31, narrow the range to [0, 2^31); otherwise narrow it to [2^31, 2^32). Repeat, scanning the whole file each time and counting only the values inside the current range, until the range holds a single value. That is about log N passes over N integers, so the overall complexity is O(N log N).
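Idea 2 can be sketched in C++ as below. The sketch is hedged: the file is modelled as an in-memory vector, find_duplicate and the range parameter (the size of the value universe, 2^32 in the original problem) are illustrative names, and the precondition that the file holds more entries than there are possible values is assumed rather than checked.

```cpp
#include <cstdint>
#include <vector>

// Finds a value that occurs at least twice by halving a value range [lo, hi)
// that always contains more entries than it has distinct values.
// Assumes every entry is < range and file.size() > range.
std::uint64_t find_duplicate(const std::vector<std::uint32_t>& file,
                             std::uint64_t range) {
    std::uint64_t lo = 0, hi = range;
    while (hi - lo > 1) {
        const std::uint64_t mid = lo + (hi - lo) / 2;
        std::uint64_t lower_count = 0;
        for (std::uint32_t x : file) {  // one full scan per halving step
            if (x >= lo && x < mid) ++lower_count;
        }
        if (lower_count > mid - lo) {
            hi = mid;  // the lower half is overfull, so it holds a duplicate
        } else {
            lo = mid;  // otherwise the upper half must be overfull
        }
    }
    return lo;
}
```

Only a few counters are kept in memory; the price is one full scan of the file per halving step, 32 scans for the full 32-bit range.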
Eg3.2: A file contains a great many integers (perhaps 10 billion). Find the K largest. -- The Beauty of Programming
Several solutions:
Solution 1: If there are not too many elements, quicksort them and then read off the K largest. The total time complexity is O(n log n) + O(k).
Solution 2: Find the smallest of the K largest numbers, that is, the K-th largest. Use binary search to locate it, then traverse once more to collect the answer. The total time complexity is O(n log n).
Solution 3: If the data cannot be fully loaded into memory, the two methods above do not work well. Instead, maintain a min-heap of K elements, as in heapsort. A new number is discarded if it is smaller than the minimum of the heap; if it is larger, it replaces the minimum and the heap is adjusted. The time complexity is O(n log k).
Solution 4: If the value range is limited, use counting: scan the file once, record the number of occurrences of each integer, then read off the K largest from the top down. The time complexity is O(n).
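Solution 3 maps naturally onto std::priority_queue. The following C++ sketch assumes the numbers arrive as a vector (a real implementation would read the file in chunks); the name top_k is illustrative.

```cpp
#include <cstddef>
#include <functional>
#include <queue>
#include <vector>

// Keeps the k largest values seen so far in a min-heap of size k.
// Time is O(n log k) and memory is O(k).
std::vector<int> top_k(const std::vector<int>& stream, std::size_t k) {
    std::priority_queue<int, std::vector<int>, std::greater<int>> heap;
    for (int x : stream) {
        if (heap.size() < k) {
            heap.push(x);
        } else if (k > 0 && x > heap.top()) {
            heap.pop();  // evict the smallest of the current top k
            heap.push(x);
        }
    }
    std::vector<int> result;
    while (!heap.empty()) {
        result.push_back(heap.top());
        heap.pop();
    }
    return result;  // ascending order
}
```

For the input {5, 1, 9, 3, 7, 8} with k = 3 this returns {7, 8, 9}.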
[4] Common Methods
You need to believe that the interviewer is human too. He will not spend 30 minutes describing a problem to you, nor ask you to write a 50-page derivation. The purpose of an algorithm question is to probe your thinking ability, not to make you solve a truly complicated problem. Most problems have a quick and clear solution...
1. Divide and conquer is definitely a method you must consider whenever it applies. Dynamic programming is the interviewer's weapon of choice: it is heavy, hard to describe, and hard to come up with, yet the result is elegant, fast, and short enough to write during an interview. Divide and conquer is used so widely that it is almost ubiquitous; binary search, quicksort, and population count are all extremely elegant...
Eg4.1: Given an integer array of length N, find the maximum subarray sum. -- The Beauty of Programming
This problem can be solved with dynamic programming. Define two auxiliary arrays, Start[N] and All[N]. Start[i] is the largest sum of a contiguous subarray that begins exactly at element i. All[i] is the largest sum of any contiguous subarray within elements i to N-1. Since All[0] = max{A[0], A[0] + Start[1], All[1]}, the problem is easy to solve with dynamic programming.
#include <algorithm>
#include <vector>

/* Assumes n >= 1. */
int maxsum(int *a, int n) {
    std::vector<int> start(n), all(n);
    all[n - 1] = start[n - 1] = a[n - 1];
    for (int i = n - 2; i >= 0; i--) {
        start[i] = std::max(a[i], a[i] + start[i + 1]);
        all[i] = std::max(start[i], all[i + 1]);
    }
    return all[0];
}
If you also need the position of the maximum subarray, record it inside the loop. The algorithm still runs in O(n) time.
Eg4.2: Count the number of 1 bits in an int (32 bits). -- Beautiful Code
See methods 1, 2, and 3 of Eg1.1.
2. Sorting and searching come up all the time. Sorting matters a great deal because sorted data allows binary search, and binary search is so handy that we always want to sort first. Searching and sorting are always closely linked. Of course, you have to weigh the cost of sorting...
Eg4.3: A forum has a "water king" (shuiwang) whose posts account for more than half of all posts. Given the list of author IDs for all posts on the forum, find this ID. -- The Beauty of Programming
Solution 1: Sort the IDs and take the one in the middle.
Solution 2: Repeatedly delete a pair of different IDs; the ID that remains at the end is the answer.
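Solution 2 is commonly known as the Boyer-Moore majority vote. A minimal C++ sketch, with the made-up name find_majority and integer IDs assumed for simplicity:

```cpp
#include <vector>

// Cancelling pairs of different IDs leaves the majority ID as the final
// candidate. Assumes some ID really does own more than half of the posts;
// without that guarantee a second counting pass would be needed.
int find_majority(const std::vector<int>& ids) {
    int candidate = 0;
    int count = 0;
    for (int id : ids) {
        if (count == 0) {
            candidate = id;
            count = 1;
        } else if (id == candidate) {
            ++count;
        } else {
            --count;  // pair off `id` against one copy of the candidate
        }
    }
    return candidate;
}
```

This needs a single pass and O(1) extra memory, compared with O(n log n) for the sorting solution.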
Extension 1: Suppose three IDs each account for more than 1/4 of the total N posts. Find these three IDs.
A similar solution works with three candidates. For each new ID: if it equals one of the candidates, add 1 to that candidate's count. Otherwise, if some candidate's count is 0, make the new ID that candidate and set its count to 1. Otherwise (all three counts are greater than 0 and the new ID matches none of them), reduce all three counts by one. The three IDs left at the end are the answer.
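The three-candidate variant can be sketched the same way. The function name find_three_heavy is hypothetical, IDs are assumed to be integers, and the sketch relies on the problem's guarantee that three such IDs exist.

```cpp
#include <array>
#include <vector>

// Generalized vote with three slots. Assumes three distinct IDs each own
// more than 1/4 of the posts; without that guarantee the survivors would
// need a second counting pass to be verified.
std::array<int, 3> find_three_heavy(const std::vector<int>& ids) {
    std::array<int, 3> cand = {{0, 0, 0}};
    std::array<int, 3> count = {{0, 0, 0}};
    for (int id : ids) {
        int match = -1, empty = -1;
        for (int s = 0; s < 3; ++s) {
            if (count[s] > 0 && cand[s] == id) {
                match = s;
            } else if (count[s] == 0 && empty < 0) {
                empty = s;
            }
        }
        if (match >= 0) {
            ++count[match];            // seen again: strengthen the candidate
        } else if (empty >= 0) {
            cand[empty] = id;          // a free slot adopts the new ID
            count[empty] = 1;
        } else {
            for (int s = 0; s < 3; ++s) --count[s];  // cancel four different IDs
        }
    }
    return cand;
}
```

Each decrement step discards four different IDs at once (the new one plus one copy of each candidate), so an ID holding more than a quarter of the posts cannot be cancelled out completely.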
Eg4.4: Given a set of one-dimensional target intervals such as [1, 6], [2, 4], ... and a source interval, decide whether the source interval is covered by the target intervals. -- The Beauty of Programming
Solution: Sort the target intervals by their left endpoint, merge the ones that intersect, then scan the merged intervals to check whether the source interval lies inside one of them. The last step can also use binary search.
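A C++ sketch of the sort, merge, and binary-search steps. It assumes closed intervals with integer endpoints, and the names Interval and is_covered are illustrative only.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

typedef std::pair<int, int> Interval;  // closed interval [first, second]

// Returns true if `source` lies inside the union of `targets`.
bool is_covered(std::vector<Interval> targets, Interval source) {
    std::sort(targets.begin(), targets.end());  // orders by left endpoint

    // Merge intersecting intervals into disjoint, sorted ranges.
    std::vector<Interval> merged;
    for (const Interval& t : targets) {
        if (!merged.empty() && t.first <= merged.back().second) {
            merged.back().second = std::max(merged.back().second, t.second);
        } else {
            merged.push_back(t);
        }
    }

    // Binary search for the last merged range that starts at or before
    // source.first; only that range can contain the source interval.
    std::vector<Interval>::iterator it = std::upper_bound(
        merged.begin(), merged.end(), source.first,
        [](int value, const Interval& r) { return value < r.first; });
    if (it == merged.begin()) return false;
    --it;
    return source.second <= it->second;
}
```

Sorting dominates at O(n log n); once the targets are merged, each further query costs only O(log n).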
3. Reducing the problem size. Some problems look very scary at first. After careful analysis you can strip away most of the irrelevant content and find what the question really asks. This is very important. In addition, some questions limit space; in that case consider shrinking the data dynamically, for example by using subtraction or division to cancel things out...
Eg4.5: Given an integer N, how many trailing zeros does its factorial N! have? -- The Beauty of Programming
Solution: Each trailing 0 comes from a factor 2*5, so the answer is min(number of factors of 2, number of factors of 5). Because factors of 2 occur more often than factors of 5, only the factors of 5 need to be counted: floor(N/5) + floor(N/25) + floor(N/125) + ...
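Counting the factors of 5 takes only a few lines. A C++ sketch, with the illustrative name trailing_zeros_of_factorial:

```cpp
// Counts the trailing zeros of n! by counting the factors of 5 in 1..n.
long long trailing_zeros_of_factorial(long long n) {
    long long zeros = 0;
    while (n >= 5) {
        n /= 5;
        zeros += n;  // adds floor(n/5), then floor(n/25), and so on
    }
    return zeros;
}
```

For example, 25! has 25/5 + 25/25 = 6 trailing zeros, and the factorial itself is never computed.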
Eg4.6: A box holds balls of three colors: red, yellow, and blue. You may exchange any two balls of different colors for two balls of the third color, for example 1 red + 1 yellow = 2 blue. The box now holds 171 red, 172 yellow, and 173 blue balls. Can they all be turned into the same color after some number of exchanges? -- TopLanguage
Conjecture: no. Take the three counts modulo 3: the remainders are 0, 1, and 2. Every exchange changes each count by the same amount modulo 3 (-1 and +2 are congruent), so the three remainders always stay distinct. A single-color state needs two counts equal to 0, which share a remainder, so it can never be reached.
Eg4.7: In a group of numbers, every number appears as a pair except one. Find the unpaired number. And what should be done if there are two unpaired numbers?
Solution: If there is only one unpaired number, XOR all the numbers together; the result is the number we are looking for. If there are two, XOR all the numbers together to get a value, pick any bit of that value that is 1, and use this bit to split the numbers into two groups. The two unpaired numbers fall into different groups, so apply the single-number algorithm to each group.
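Both cases can be sketched in C++ as follows. The function names are made up for illustration, and the inputs are assumed to satisfy the problem's conditions (exactly one, or exactly two distinct, unpaired values).

```cpp
#include <utility>
#include <vector>

// XOR of all values: paired values cancel, leaving the single unpaired one.
unsigned find_one_unpaired(const std::vector<unsigned>& v) {
    unsigned x = 0;
    for (unsigned e : v) x ^= e;
    return x;
}

// Two unpaired values a != b: split the input by a bit where a and b differ,
// then each group contains exactly one unpaired value.
std::pair<unsigned, unsigned> find_two_unpaired(const std::vector<unsigned>& v) {
    unsigned all = find_one_unpaired(v);  // equals a ^ b, which is nonzero
    unsigned bit = all & (~all + 1u);     // lowest set bit of a ^ b
    unsigned with_bit = 0, without_bit = 0;
    for (unsigned e : v) {
        if (e & bit) with_bit ^= e;
        else without_bit ^= e;
    }
    return std::make_pair(with_bit, without_bit);
}
```

For {1, 2, 3, 2, 1, 5} the XOR of everything is 3 ^ 5 = 6, the lowest set bit is 2, and the two groups reduce to 3 and 5.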
4. The constant method is a typical shortcut. The idea is to use a known constant to work backwards over a group of numbers, which in some cases solves the problem very quickly...
Eg4.9? -- Microsoft interview questions
Solution: Compute the sum X of the cards that are present. If the sum of the full set is Y, the missing card is Y - X.
5. Encoding is really a good thing. It can abstract complicated problems. For example, encoding a sequence lets it be mapped directly to array subscripts, which greatly improves access speed...
Eg4.10: The last Baidu written-test question.
Eg4.11: There are 1000 bottles of extremely expensive wine, one of which is poisoned. The poison is very powerful: even diluted 1,000,000 times it is still lethal. However, it only takes effect after a fixed period of 1 month. In order not to waste the wine, it has been decided to spend five weeks identifying the poisoned bottle, sacrificing at most 10 people, and you are asked to arrange the test. -- TopLanguage
To be answered
6. Do not underestimate probability questions, even ones that need only the most basic probability knowledge. Interviewers love them because the answers often run against intuition and candidates get confused easily. I once got a simple probability question in a Baidu interview and still broke into a sweat. So, for your own safety, review the basics of probability...
Eg4.12: Given a linked list of unknown length n, traverse it only once and pick k elements from it with equal probability. -- TopLanguage
A very good solution from a blog: http://blog.csdn.net/potty15/article/details/6221715
A: First take the first k elements and store them in pick[1...k]. Then traverse from element k + 1:
for i = k + 1 to n do   // n is unknown here, but list->next == NULL tells you when the end of the linked list is reached
    r = random(1, i);
    if (1 <= r <= k)
        pick[r] = i;
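This technique is usually called reservoir sampling. A C++ sketch is given below; the Node type, the function name reservoir_sample, and the use of std::mt19937 as the random source are assumptions made for illustration, and the sketch stores element values rather than positions.

```cpp
#include <cstddef>
#include <random>
#include <vector>

struct Node {
    int value;
    Node* next;
};

// One pass over a list of unknown length, keeping k values so that every
// element ends up selected with probability k/n.
std::vector<int> reservoir_sample(const Node* head, std::size_t k,
                                  std::mt19937& rng) {
    std::vector<int> pick;
    std::size_t i = 0;  // number of elements seen so far
    for (const Node* p = head; p != nullptr; p = p->next) {
        ++i;
        if (i <= k) {
            pick.push_back(p->value);  // the first k fill the reservoir
        } else {
            std::uniform_int_distribution<std::size_t> dist(1, i);
            std::size_t r = dist(rng);  // plays the role of random(1, i)
            if (r <= k) pick[r - 1] = p->value;
        }
    }
    return pick;
}
```

If the list has fewer than k elements, the sketch simply returns all of them.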
A simple proof by induction follows.
At the start of the algorithm, each of the first k elements is selected with probability 1. Without loss of generality, take the j-th of them (1 <= j <= k).
Round i = k + 1:
random(1, i) returns j with probability 1/(k + 1), so j is kept with probability k/(k + 1).
Round i = k + 2:
random(1, i) returns j with probability 1/(k + 2), so j is kept with probability (k/(k + 1)) * ((k + 1)/(k + 2)) = k/(k + 2).
...
Round i = n:
random(1, i) returns j with probability 1/n, so j is kept with probability (k/(k + 1)) * ((k + 1)/(k + 2)) * ... * ((n - 1)/n) = k/n.
For the elements from k + 1 to n, take the m-th.
Round i = m:
random(1, i) falls in [1, k] with probability k/m, so m is selected with probability k/m and stored in some slot s.
Round i = m + 1:
random(1, i) returns s with probability 1/(m + 1), so m is kept with probability (k/m) * (m/(m + 1)) = k/(m + 1).
...
Round i = n:
random(1, i) returns s with probability 1/n, so m is kept with probability (k/m) * (m/(m + 1)) * ... * ((n - 1)/n) = k/n.
[5] Acceleration Method
Most of the time the algorithm you give is basically correct but not good enough, and the interviewer will want you to optimize it. There are many ways to optimize; the basic idea is to look for waste. Two kinds of waste are common. One is heavy operations such as division and modulo, which you may need to replace with cheaper ones. The other is an algorithm that does too much: for example, you only need the sign, but you compute the whole value...
Eg5.1: Compute the greatest common divisor of two numbers. -- The Beauty of Programming
Solution 1: Use f(x, y) = f(y, x % y), that is, the Euclidean algorithm by repeated division.
Solution 2: Use f(x, y) = f(y, x - y) for x >= y, that is, repeated subtraction.
Solution 3: Use the parity of the two numbers:
x even, y even: f(x, y) = 2 * f(x >> 1, y >> 1)
x even, y odd: f(x, y) = f(x >> 1, y)
x odd, y even: f(x, y) = f(x, y >> 1)
x odd, y odd: f(x, y) = f(y, x - y)
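Solution 3 is often called the binary GCD algorithm. An iterative C++ sketch, with the illustrative name binary_gcd and unsigned inputs assumed:

```cpp
// Binary GCD following the four parity cases; it uses only shifts,
// comparisons and subtraction, never division or modulo.
unsigned binary_gcd(unsigned x, unsigned y) {
    if (x == 0) return y;
    if (y == 0) return x;

    unsigned shift = 0;
    while (((x | y) & 1u) == 0) {  // both even: factor out a common 2
        x >>= 1;
        y >>= 1;
        ++shift;
    }
    while ((x & 1u) == 0) x >>= 1;  // x even, y odd: halve x

    do {
        while ((y & 1u) == 0) y >>= 1;  // x odd, y even: halve y
        if (x > y) {                    // both odd: keep the smaller in x
            unsigned t = x;
            x = y;
            y = t;
        }
        y -= x;                         // odd minus odd is even (maybe 0)
    } while (y != 0);

    return x << shift;  // restore the common factors of 2
}
```

For example, binary_gcd(48, 18) strips one common factor of 2, reduces 24 and 9 to 3, and returns 6.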
Eg5.2: Given an integer array a[n], compute the maximum product of any n-1 of its elements. -- The Beauty of Programming
Solution 1: Use the Eg1.2 algorithm to compute the product of every choice of n-1 elements, then traverse the results to find the maximum.
Solution 2: Use the distribution of signs among the n numbers. Scan the array once and record the number of positives P, the number of negatives M, the number of zeros Z, the positive number A with the smallest absolute value, and the negative number B with the smallest absolute value.
If Z >= 2, the result is 0.
If Z = 1:
If M is odd, the result is 0.
If M is even, the result is the product of everything except the 0.
If Z = 0:
If M is odd, the result is the product after removing B, the negative number with the smallest absolute value.
If M is even, the result is the product after removing A, the positive number with the smallest absolute value. (If there are no positive numbers, remove the negative number with the largest absolute value instead.)
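The sign-counting solution can be sketched in C++ as below. The function name max_product_without_one is made up, the sketch assumes n >= 2 and that the product fits in long long, and it adds one case the text leaves implicit: when all numbers are negative and their count is even, the negative with the largest absolute value is dropped.

```cpp
#include <cstddef>
#include <vector>

// Maximum product of any n-1 of the n values, choosing the element to drop
// from the counts of positives, negatives and zeros.
long long max_product_without_one(const std::vector<long long>& a) {
    std::size_t zeros = 0, negatives = 0, positives = 0;
    std::size_t zero_at = 0, small_pos = 0, small_neg = 0, big_neg = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (a[i] == 0) {
            ++zeros;
            zero_at = i;
        } else if (a[i] > 0) {
            if (positives == 0 || a[i] < a[small_pos]) small_pos = i;
            ++positives;
        } else {
            if (negatives == 0 || a[i] > a[small_neg]) small_neg = i;  // closest to 0
            if (negatives == 0 || a[i] < a[big_neg]) big_neg = i;      // farthest from 0
            ++negatives;
        }
    }
    if (zeros >= 2) return 0;
    if (zeros == 1 && negatives % 2 == 1) return 0;  // keeping the zero is best

    std::size_t drop;
    if (zeros == 1) drop = zero_at;
    else if (negatives % 2 == 1) drop = small_neg;
    else if (positives > 0) drop = small_pos;
    else drop = big_neg;  // all negative, even count

    long long product = 1;
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (i != drop) product *= a[i];
    }
    return product;
}
```

For {-3, -2, 4, 5} the count of negatives is even, so the smallest positive (4) is dropped and the result is 30.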
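Eg5.3: Estimate the number of comparisons made by quicksort. -- Beautiful Code
Solution:
#include <stdlib.h>

/* Simulates the number of comparisons quicksort makes on n elements. */
int cc(int n) {
    int m;
    if (n <= 1) return 0;
    m = 1 + rand() % n;  /* stands in for randint(1, n) */
    return n - 1 + cc(m - 1) + cc(n - m);
}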
[6] Data Structure
In most interviews the algorithms are designed around arrays, because arrays allow many simple variations and the interviewer is on sure ground. However, other data structures are equally important. AVL trees and B-trees may be too complex, but linked lists and simple tree structures come up often. I have personally met them many times...
1. Linked list
Eg6.1: Given the head pointer of a singly linked list, check whether the list contains a loop, without using a large amount of extra storage and without modifying the original data. -- Microsoft interview question
Solution: Use two pointers: a slow pointer that advances by p = p->next and a fast pointer that advances by q = q->next->next. If the two ever meet, the list has a loop.
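A C++ sketch of the two-pointer check, in which the fast pointer advances two nodes for every one node of the slow pointer. The ListNode type and the name has_cycle are illustrative assumptions.

```cpp
struct ListNode {
    int value;
    ListNode* next;
};

// The fast pointer moves two nodes per step and the slow pointer one.
// If the list has a loop the fast pointer eventually catches the slow one;
// otherwise the fast pointer reaches the end of the list.
bool has_cycle(const ListNode* head) {
    const ListNode* slow = head;
    const ListNode* fast = head;
    while (fast != nullptr && fast->next != nullptr) {
        slow = slow->next;
        fast = fast->next->next;
        if (slow == fast) return true;
    }
    return false;
}
```

The check uses O(1) extra memory and never writes to the list, which satisfies both constraints of the question.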
Eg6.2: Given two linked lists, determine whether they intersect and, if they do, find their first common node. -- The Beauty of Programming
Solution: Traverse both lists to the end, that is, until p->next == NULL and q->next == NULL, then test whether p == q. To find the first common node, compute both lengths, advance the longer list by the difference, then step both pointers together until they are equal.
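One way to locate the first common node is to align the two lists by length and then walk them in step. The C++ sketch below assumes both lists are free of loops; ListNode, list_length and first_common_node are illustrative names.

```cpp
#include <cstddef>

struct ListNode {
    int value;
    ListNode* next;
};

static std::size_t list_length(const ListNode* p) {
    std::size_t n = 0;
    for (; p != nullptr; p = p->next) ++n;
    return n;
}

// Returns the first node shared by both lists, or nullptr if they never meet.
const ListNode* first_common_node(const ListNode* a, const ListNode* b) {
    std::size_t la = list_length(a), lb = list_length(b);
    for (; la > lb; --la) a = a->next;  // skip the extra head of the longer list
    for (; lb > la; --lb) b = b->next;
    while (a != b) {                    // equal remaining length: walk in step
        a = a->next;
        b = b->next;
    }
    return a;  // nullptr when both lists run out together
}
```

Intersecting lists share their whole tail, so after alignment the two pointers reach the first common node at the same step.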
Eg6.3: You are given only a pointer to one element in a linked list. Delete this element. -- The Beauty of Programming
Solution: Copy the next element's value into the current element (p->value = p->next->value), then delete the next element. This does not work when p is the last element of the list.
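A C++ sketch of the copy-then-unlink trick. The ListNode type and the name remove_given_node are assumptions; the function hands back the unlinked node so that the caller decides how to free it.

```cpp
struct ListNode {
    int value;
    ListNode* next;
};

// Logically deletes the element at `p` by overwriting it with its successor
// and unlinking the successor. Returns the unlinked node, or nullptr when
// `p` is null or is the last node (the trick cannot delete the tail).
ListNode* remove_given_node(ListNode* p) {
    if (p == nullptr || p->next == nullptr) return nullptr;
    ListNode* victim = p->next;
    p->value = victim->value;
    p->next = victim->next;
    victim->next = nullptr;
    return victim;
}
```

Note that the node object that disappears from the list is the successor, not p itself, so any outside pointers to the successor become stale.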
2. Tree
Eg6.4: The heapsort algorithm.
Algorithm books generally cover this, so it is not listed here.
Eg6.5: Determine whether a binary tree T contains the structure of another binary tree P. -- Microsoft interview question
To be answered
The content above is only a small glimpse of the whole. The questions mainly come from "fast-food" books and forums, including "The Beauty of Programming", "Beautiful Code", and "Programming Pearls". The TopLanguage group's "Today We Think" series is especially recommended. A diet of fast food alone is never nutritious, though. You also need to sit down to a slow, nourishing meal, such as Knuth's "bible", "Introduction to Algorithms", and Polya's "How to Solve It".