Binary Search 2

Source: Internet
Author: User

Taking it further: the main principle

When you encounter a problem that you think can be solved with binary search, you need some way of proving that it will work. I will now present another level of abstraction that lets us solve more problems, makes proving binary search solutions easy, and also helps in implementing them. This part is a bit formal, but it is not as difficult as it looks.

Consider a predicate p defined over some ordered set S (the search space). The search space consists of the candidate solutions to the problem. In this article, a predicate is a function that returns a Boolean value, true or false (we will also use yes and no). We use the predicate to verify whether a candidate solution is legal (does not violate some constraint) according to the definition of the problem.

The main principle is as follows: binary search can be used if and only if, for all x in S, p(x) implies p(y) for all y > x. This property is what we use when we discard the second half of the search space. It is equivalent to saying that ¬p(x) implies ¬p(y) for all y < x (the symbol ¬ denotes logical negation), which is what we use when we discard the first half of the search space. The principle is easy to prove, but I will omit the proof to save space.

Behind the cryptic mathematics, I am really saying this: if you have a yes-or-no question (the predicate), getting a yes for some candidate solution x means that you would also get a yes for every element after x. Similarly, if you get a no, you would get a no for every element before x. Consequently, if you asked the question for each element of the search space in order, you would get a run of no answers followed by a run of yes answers.
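The no-then-yes shape can be checked by brute force on a small search space. The following is only a sketch for building intuition; the helper name `is_no_then_yes` is hypothetical and not part of the article.

```python
from typing import Callable, Sequence


def is_no_then_yes(space: Sequence[int], p: Callable[[int], bool]) -> bool:
    """Return True if p gives only 'no' answers followed by only 'yes' answers."""
    answers = [p(x) for x in space]
    # False sorts before True, so a no...no yes...yes list is already sorted.
    return answers == sorted(answers)


# A threshold predicate has the required shape; a parity predicate does not.
threshold_ok = is_no_then_yes(range(10), lambda x: x >= 4)
parity_ok = is_no_then_yes(range(10), lambda x: x % 2 == 0)
```

Such a check costs a full pass over the search space, so it is useful only for testing a predicate on small inputs, not as part of the search itself.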

Careful readers will notice that binary search can also be used when the predicate yields a run of yes answers followed by a run of no answers. This is true, and the negation of such a predicate satisfies the original condition. For simplicity, we will deal only with predicates of the form described in the main principle.

If the condition in the main principle is met, we can use binary search to find the smallest legal solution, that is, the smallest x for which p(x) is true. The first part of devising a solution based on binary search is designing a predicate that can be evaluated and for which binary search makes sense: we need to choose what the algorithm should find. We can have it find either the first x for which p(x) is true or the last x for which p(x) is false. As you will see, the difference between the two is subtle, but it is necessary to settle on one. To begin with, let us look for the first yes (the first option).

The second part is proving that binary search can be applied to the predicate. This is where we use the main principle and verify that its condition is met. The proof does not need to be overly rigorous. You just need to convince yourself that p(x) implies p(y) for all y > x, or that ¬p(x) implies ¬p(y) for all y < x. This can usually be done in a sentence or two.

When the domain of the predicate is the integers, it suffices to prove that p(x) implies p(x+1), or that ¬p(x) implies ¬p(x-1); the rest follows by induction.

These two parts are usually interleaved: when we think a problem can be solved by binary search, we aim to design the predicate so that it satisfies the condition of the main principle.

One might wonder why we use this abstraction rather than the simpler-looking algorithm we have used so far. The reason is that many problems cannot be modeled as a search for a particular value, yet it is possible to define and evaluate a predicate such as "Is there an assignment that costs x or less?" when we are looking for an assignment with the lowest cost. For example, the familiar traveling salesman problem (TSP) looks for the cheapest round trip that visits every city exactly once. Here the target value is not defined as such, but we can define the predicate "Is there a round trip that costs x or less?" and then apply binary search to find the smallest x that satisfies it. This is called reducing the original problem to a decision (yes/no) problem. Unfortunately, we know of no way to evaluate this particular predicate efficiently, so the TSP is not easily solved by binary search, but many optimization problems are.
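To make the reduction concrete, here is a minimal sketch on a hypothetical problem that does not come from the article: cut a list of weights into at most `groups` consecutive runs so that the largest run sum is as small as possible. The names `can_split` and `min_limit_by_scan` are assumptions made for illustration. The decision predicate "can it be done with limit x or less?" is easy to evaluate greedily, and its answers are no...no yes...yes as x grows. For clarity the smallest x is found here by a plain scan over the candidates; that scan is the part a binary search would replace.

```python
from typing import Sequence


def can_split(weights: Sequence[int], groups: int, limit: int) -> bool:
    """Decision problem: do at most `groups` consecutive runs, each <= limit, suffice?"""
    used, current = 1, 0
    for w in weights:
        if w > limit:
            return False          # a single item already exceeds the limit
        if current + w > limit:
            used += 1             # start a new run with this item
            current = w
        else:
            current += w
    return used <= groups


def min_limit_by_scan(weights: Sequence[int], groups: int) -> int:
    """Smallest limit for which can_split answers yes (linear scan over candidates)."""
    candidates = range(max(weights), sum(weights) + 1)
    return next(x for x in candidates if can_split(weights, groups, x))


best = min_limit_by_scan([10, 20, 30, 40, 50, 60, 70, 80, 90], 3)
```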

Let us now convert the simple binary search on sorted arrays, described in the introduction, to this abstract definition. First, let us rephrase the problem: "Given an array A and a target value, return the index of the first element of A that is equal to or greater than the target value." Incidentally, this is more or less how lower_bound behaves in C++.

We want to find the index of the target value, so every index into the array is a candidate solution. The search space S is the set of all candidate solutions, so it is an interval containing all the indices. Consider the predicate "Is A[x] greater than or equal to the target value?". If we find the first x for which the predicate says yes, we get exactly what we decided to look for in the previous paragraph.

The condition in the main principle is met because the array is sorted in ascending order: if A[x] is greater than or equal to the target value, all the elements after it are also greater than or equal to the target value.

Let's take this sample sequence, list the search space (the indices), and apply our predicate with a target value of 55:

Sequence:     0    5   13   19   22   41   55   68   72   81   98
Index:        1    2    3    4    5    6    7    8    9   10   11
Predicate:   no   no   no   no   no   no  yes  yes  yes  yes  yes

This is a run of no answers followed by a run of yes answers, as we expected. Note that index 7 (the position of the target value) is the first index for which the predicate yields yes, so this is what our binary search will find.
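The table can be reproduced in a few lines. This is only a sketch: it uses Python's standard `bisect.bisect_left` as a stand-in for C++ lower_bound, and adds 1 because the article numbers indices from 1 while Python counts from 0.

```python
from bisect import bisect_left

A = [0, 5, 13, 19, 22, 41, 55, 68, 72, 81, 98]
target = 55

# Evaluate the predicate "Is A[x] >= target?" for every index.
answers = ["yes" if a >= target else "no" for a in A]

# First yes, converted to the 1-based index used in the table.
first_yes = answers.index("yes") + 1

# bisect_left returns the 0-based position of the first element >= target.
lower_bound_index = bisect_left(A, target) + 1
```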

Implementing the discrete algorithm

The most important thing to remember before you start coding is to settle on what the two numbers you maintain (the lower and upper bounds) mean. A likely answer is a closed interval that surely contains the first x for which p(x) is true. All of your code should then be directed at maintaining this invariant: it tells you how to move the bounds correctly, which is where bugs easily creep in if you are not careful.

Another thing to be careful about is how high to set the bounds. By "high" I really mean "wide", since there are two bounds to worry about. Every so often a coder concludes while coding that the bounds are wide enough, only to find a counterexample later, when it is too late. Unfortunately, there is little helpful advice to give here other than to double-check and triple-check your bounds. Also, since the execution time grows only logarithmically with the width of the bounds, you can always set them a little wider, as long as that does not break the evaluation of the predicate. Keep an eye out for overflow errors everywhere, especially when calculating the median.

Now we can finally code the binary search we discussed earlier:

binary_search(lo, hi, p):
    while lo < hi:
        mid = lo + (hi-lo)/2
        if p(mid) == true:
            hi = mid
        else:
            lo = mid+1

    if p(lo) == false:
        complain        // p(x) is false for all x in S!

    return lo           // lo is the least x for which p(x) is true

The two key lines are hi = mid and lo = mid+1. When p(mid) is true, we can discard the second half of the search space, because the predicate is true for every element in it (by the main principle). However, we cannot discard mid itself, because it may well be the first element for which p is true. Moving the upper bound to mid is therefore as aggressive as we can be without introducing bugs.

Similarly, if p(mid) is false, we can discard the first half of the search space, this time including mid. p(mid) is false, so we do not need it in the search space. This means we can move the lower bound to mid+1.
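A runnable rendering of the first-yes search might look as follows. It is a sketch under some assumptions: the function name `first_true` is chosen here, "complain" is modeled by raising `ValueError`, and the sample array is searched with 0-based indices, so the answer is 6 rather than 7.

```python
from typing import Callable


def first_true(lo: int, hi: int, p: Callable[[int], bool]) -> int:
    """Least x in the closed interval [lo, hi] with p(x) true (p must be no...no yes...yes)."""
    while lo < hi:
        mid = lo + (hi - lo) // 2   # rounds down, toward lo
        if p(mid):
            hi = mid                # mid may be the first yes, so keep it
        else:
            lo = mid + 1            # mid is a no, so drop it
    if not p(lo):
        raise ValueError("p(x) is false for all x in the search space")
    return lo


A = [0, 5, 13, 19, 22, 41, 55, 68, 72, 81, 98]
index_of_55 = first_true(0, len(A) - 1, lambda i: A[i] >= 55)
index_of_56 = first_true(0, len(A) - 1, lambda i: A[i] >= 56)
```

Because mid rounds down and is strictly less than hi whenever lo < hi, both branches shrink the interval, so the loop always terminates.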

If we wanted to find the last x for which p(x) is false, we could devise (using similar reasoning) something like this:

// Warning: there is a nasty bug in this snippet!
binary_search(lo, hi, p):
    while lo < hi:
        mid = lo + (hi-lo)/2    // note: division truncates
        if p(mid) == true:
            hi = mid-1
        else:
            lo = mid

    if p(lo) == true:
        complain        // p(x) is true for all x in S!

    return lo           // lo is the greatest x for which p(x) is false

You can verify that this code satisfies our condition that the element we are looking for always stays within the interval [lo, hi]. However, there is another problem. Consider what happens when you run this code on a search space for which the predicate gives:

no   yes

The code gets stuck in an endless loop. It always selects the first element as mid, but then does not move the lower bound, because it wants to keep the no in the search space. The solution is to change mid = lo + (hi-lo)/2 to mid = lo + (hi-lo+1)/2, so that it rounds up instead of down. There are other ways to get around the problem, but this is probably the simplest. Just remember to always test your code on a two-element set where the predicate is false for the first element and true for the second.
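A sketch of the last-no search with the rounding fix applied is shown below, together with the two-element test the text recommends. The name `last_false` and the use of `ValueError` for "complain" are assumptions made for this example.

```python
from typing import Callable


def last_false(lo: int, hi: int, p: Callable[[int], bool]) -> int:
    """Greatest x in the closed interval [lo, hi] with p(x) false (p must be no...no yes...yes)."""
    while lo < hi:
        mid = lo + (hi - lo + 1) // 2   # rounds up, so mid > lo and lo = mid makes progress
        if p(mid):
            hi = mid - 1                # mid is a yes, so drop it
        else:
            lo = mid                    # mid may be the last no, so keep it
    if p(lo):
        raise ValueError("p(x) is true for all x in the search space")
    return lo


# Two-element test: no, yes. The rounded-down midpoint would loop forever here.
two = [False, True]
two_element_result = last_false(0, 1, lambda i: two[i])

A = [0, 5, 13, 19, 22, 41, 55, 68, 72, 81, 98]
last_below_55 = last_false(0, len(A) - 1, lambda i: A[i] >= 55)
```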

You may also wonder why mid is calculated as mid = lo + (hi-lo)/2 instead of the usual mid = (lo+hi)/2. This avoids another potential rounding bug: in the first-yes search, we want the division to always round down, toward the lower bound. But division truncates toward zero, so when lo+hi is negative, the result is rounded up, toward the upper bound. Coding the calculation this way ensures that the number being divided (hi-lo) is never negative, so it always rounds as we want. The bug does not surface when the search space contains only positive integers or real numbers, but I decided to write it this way throughout the article for consistency.
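The rounding difference can be seen on a small negative interval. Note one assumption in this sketch: Python's `//` operator floors rather than truncates, so the helper `trunc_div2` (a name invented here) imitates the truncating integer division of languages such as C, C++, and Java.

```python
def trunc_div2(n: int) -> int:
    """Divide by 2, truncating toward zero as C-style integer division does."""
    return int(n / 2)


lo, hi = -3, -2

naive_mid = trunc_div2(lo + hi)      # -5 / 2 = -2.5, truncated toward zero
safe_mid = lo + trunc_div2(hi - lo)  # hi - lo = 1 is non-negative, so 1 / 2 truncates to 0
```

Here `naive_mid` equals hi, so a first-yes search that sets hi = mid would make no progress on this interval, while `safe_mid` equals lo as intended.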

Real numbers

Binary search can also be used on monotonic functions whose domain is the set of real numbers. Implementing binary search on real numbers is usually easier than on integers, because you do not need to watch out for how to move the bounds:

binary_search(lo, hi, p):
    while we choose not to terminate:
        mid = lo + (hi-lo)/2
        if p(mid) == true:
            hi = mid
        else:
            lo = mid

    return lo           // lo is close to the border between no and yes


Since the set of real numbers is dense, it should be clear that we usually cannot find the exact target value. However, we can quickly find some x that lies within a given tolerance of the border between no and yes. There are two ways to decide when to terminate: stop when the search space becomes smaller than some predetermined bound (say 10^-12), or run a fixed number of iterations. On TopCoder, your best bet is to use a few hundred iterations, which gives you the best possible precision without much thought. 100 iterations reduce the search space to approximately 10^-30 of its initial size, which should be enough for most (if not all) problems.
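As a sketch of the fixed-iteration variant, the following approximates the square root of 2 by searching for the border of the predicate x*x >= 2 on the interval [0, 2]. The function name `bisect_real` and the choice of example are assumptions made for illustration.

```python
import math
from typing import Callable


def bisect_real(lo: float, hi: float, p: Callable[[float], bool],
                iterations: int = 100) -> float:
    """Approximate the border between no and yes on [lo, hi] with a fixed iteration count."""
    for _ in range(iterations):
        mid = lo + (hi - lo) / 2
        if p(mid):
            hi = mid
        else:
            lo = mid
    return lo   # lo is close to the border between no and yes


root = bisect_real(0.0, 2.0, lambda x: x * x >= 2.0)
error = abs(root - math.sqrt(2.0))
```

With doubles, the interval stops shrinking once lo and hi are adjacent representable values, so extra iterations beyond that point are harmless but gain nothing.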

If you need as few iterations as possible, you can terminate when the interval becomes small enough, but compare the bounds relatively rather than just absolutely. The reason is that doubles carry only about 15 significant decimal digits, so when the search space contains large numbers (say, on the order of billions), you can never get an absolute difference smaller than about 10^-7.
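One way to express a relative stopping rule is sketched below. The name `bisect_real_rel`, the tolerance value, and the floor of 1.0 (which falls back to an absolute test near zero) are all assumptions of this example. It searches for the cube root of 10^27, which is about 10^9, a magnitude at which an absolute tolerance of 10^-12 could never be reached.

```python
from typing import Callable


def bisect_real_rel(lo: float, hi: float, p: Callable[[float], bool],
                    rel_tol: float = 1e-12) -> float:
    """Shrink [lo, hi] until its width is small relative to the size of its bounds."""
    while hi - lo > rel_tol * max(abs(lo), abs(hi), 1.0):
        mid = lo + (hi - lo) / 2
        if mid == lo or mid == hi:
            break               # no representable double lies strictly between the bounds
        if p(mid):
            hi = mid
        else:
            lo = mid
    return lo


big_root = bisect_real_rel(0.0, 1e10, lambda x: x ** 3 >= 1e27)
relative_error = abs(big_root - 1e9) / 1e9
```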
