Binary search: do you really understand it?

Reposted from: http://duanple.blog.163.com/blog/static/709717672009049528185/

Author: phylips@bmy

I have recently been practicing dynamic programming (DP) problems, and the "longest increasing subsequence" problem calls for a binary search. I suddenly realized that binary search is not as simple as we tend to think. How do you choose the stopping condition? How should the lower bound be raised and the upper bound lowered for different problems? How do you make sure the problem size keeps shrinking so that the loop does not run forever? The article below explains the nature of binary search well and handles all of these questions with one general method. At the end is an example program that I wrote myself with reference to the article ...

Historically, Knuth notes in Section 6.2.1 of his book "Sorting and Searching" that although the first binary search algorithm appeared in 1946, the first fully correct binary search algorithm did not appear until 1962.

A binary search written without careful thought often suffers from off-by-one errors or infinite loops. Here we discuss the theory behind binary search, how to implement it, and which techniques guarantee a correct implementation, so that boundary handling and termination conditions stop being a source of trouble.

The C++ STL provides lower_bound, upper_bound, binary_search and equal_range, and these are exactly the functions whose implementation we need to think through. Implementing them yourself is a good test of whether you have really mastered binary search.
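For reference, here is a small sketch of what the four STL functions return on an ascending vector. The wrapper names (lower_index, upper_index, contains, key_range) are hypothetical helpers introduced only for this illustration; the std:: calls themselves are the standard ones from <algorithm>.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Index of the first element that is >= key (a.size() if there is none).
int lower_index(const std::vector<int>& a, int key) {
    return static_cast<int>(std::lower_bound(a.begin(), a.end(), key) - a.begin());
}

// Index of the first element that is > key (a.size() if there is none).
int upper_index(const std::vector<int>& a, int key) {
    return static_cast<int>(std::upper_bound(a.begin(), a.end(), key) - a.begin());
}

// True if key occurs somewhere in a.
bool contains(const std::vector<int>& a, int key) {
    return std::binary_search(a.begin(), a.end(), key);
}

// Half-open index range [first, second) covering every occurrence of key.
std::pair<int, int> key_range(const std::vector<int>& a, int key) {
    auto r = std::equal_range(a.begin(), a.end(), key);
    return {static_cast<int>(r.first - a.begin()),
            static_cast<int>(r.second - a.begin())};
}
```

For a = {1, 2, 3, 3, 4} and key 3, lower_index gives 2, upper_index gives 4, and key_range gives (2, 4).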

Theoretical basis:
When we encounter a problem, we first need to decide whether binary search can solve it. That is easy for the most common case, looking up a number. But problems such as binary search combined with a greedy check, solving an equation by bisection, or certain monotone-function problems can also be solved by binary search, even though this is not always obvious.

Consider a predicate p defined on an ordered set S, where S is the search space that contains the candidate solutions of the problem. In this article a predicate (the "assertion") is simply a function that returns a Boolean value. It tells us whether a candidate is a valid solution to the problem as defined.

We call the following the main theorem: binary search can be used if and only if, for all x in S, p(x) implies p(y) for all y > x. This property is what lets us discard half of the search space at each step. In other words, if candidate solutions can be checked by such a validation function, and the function satisfies the condition above, then binary search can find a suitable solution, for example the leftmost valid one. The theorem has an equivalent form: !p(x) implies !p(y) for all y < x. It is easy to prove, so the proof is omitted here.
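The theorem translates almost directly into code. The sketch below is only an illustration under stated assumptions: the name first_true, the half-open range [lo, hi) and the use of std::function are choices made here, not part of the original article.

```cpp
#include <cassert>
#include <functional>

// Returns the smallest x in [lo, hi) with p(x) == true, or hi if there is none.
// Assumes p is monotone: p(x) implies p(y) for all y > x.
long long first_true(long long lo, long long hi,
                     const std::function<bool(long long)>& p) {
    assert(lo <= hi);
    while (lo < hi) {
        long long mid = lo + (hi - lo) / 2;  // lo <= mid < hi
        if (p(mid))
            hi = mid;      // the answer is mid or something to its left
        else
            lo = mid + 1;  // mid and everything to its left is false
    }
    return lo;
}
```

For instance, with p(x) = (x * x >= 50) on [0, 100) the call returns 8, since 7 * 7 = 49 fails the predicate and 8 * 8 = 64 satisfies it.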

In fact, if we apply such a function p to the whole sequence, we get a sequence of the form
false false ... true true ...
Written with 0 and 1, this is the sequence 0 0 0 0 ... 1 1 1 1 ...
Every binary search problem can be reduced to finding the first 1 in such a 0/1 sequence, which gives us a unified model for binary search. This is similar to the 0-1 principle used for sorting networks: a network that sorts every 0/1 sequence sorts every sequence. Binary search can also handle true true ... false false ..., that is 1 1 1 1 ... 0 0 0 0 ... Of course, if we negate the definition of p, that sequence turns into the one above, so it fits the same model.

So every problem becomes finding the position of the first 1 in a 0011-pattern sequence. In practice we may instead look for the position of the last 1 in a 1100-pattern sequence. Note that the implementations for the two cases differ slightly, and this difference is critical to the correctness of the program.
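One way to see that the 1100 case reduces to the 0011 case is to search the negated sequence: the first 0 of a 1100 sequence sits immediately after its last 1. The following sketch assumes the input really has the 1...10...0 shape; the function name last_one is made up for this example.

```cpp
#include <vector>

// Index of the last 1 in a 1...10...0 sequence, or -1 if there is no 1.
// Internally this finds the first position where the negated predicate
// (bits[i] == 0) holds, which is a 0011-pattern search, and steps back by one.
int last_one(const std::vector<int>& bits) {
    int lo = 0;
    int hi = static_cast<int>(bits.size());  // half-open range [lo, hi)
    while (lo < hi) {
        int mid = lo + (hi - lo) / 2;
        if (bits[mid] == 0)
            hi = mid;      // first 0 is at mid or to its left
        else
            lo = mid + 1;  // first 0 is to the right of mid
    }
    return lo - 1;         // position just before the first 0
}
```

With this formulation the midpoint can round down as usual; the direct 1100 search discussed later needs a different midpoint.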

The examples below cover both cases. Problems that ask for a maximum usually have a predicate with the 1100 pattern, such as poj3258 River Hopscotch. Problems that ask for a minimum usually have a predicate with the 0011 pattern, such as poj3273 Monthly Expense.
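To make the 0011 pattern of a minimization problem concrete, here is a sketch in the spirit of poj3273 Monthly Expense. The problem statement is assumed here rather than quoted: split a sequence of daily costs into at most m consecutive groups so that the largest group sum is as small as possible. The names feasible and min_largest_group are hypothetical. The predicate "a cap of this size is enough" is false for small caps and true for large ones, and the greedy check is what the article calls the binary + greedy combination.

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

// True if costs can be cut into at most m consecutive groups whose sums
// are all <= cap. Greedy: start a new group only when the current one is full.
bool feasible(const std::vector<long long>& costs, int m, long long cap) {
    int groups = 1;
    long long sum = 0;
    for (long long c : costs) {
        if (c > cap) return false;  // a single cost already exceeds the cap
        if (sum + c > cap) {
            ++groups;
            sum = c;
        } else {
            sum += c;
        }
    }
    return groups <= m;
}

// Smallest cap for which feasible() holds: the leftmost 1 of a 0011 pattern.
// Assumes costs is non-empty and m >= 1.
long long min_largest_group(const std::vector<long long>& costs, int m) {
    long long lo = *std::max_element(costs.begin(), costs.end());
    long long hi = std::accumulate(costs.begin(), costs.end(), 0LL);
    while (lo < hi) {
        long long mid = lo + (hi - lo) / 2;
        if (feasible(costs, m, mid))
            hi = mid;      // mid works, so the answer is mid or smaller
        else
            lo = mid + 1;  // mid is too small
    }
    return lo;
}
```

For the costs {1, 2, 3, 4, 5} and m = 2, the result is 9 (groups 1+2+3 and 4+5).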

For looking up a number key, we can choose a predicate that produces one of the patterns above. For example, with the predicate "x is greater than or equal to key", an ascending sequence gives the pattern 0 0 0 0 ... 1 1 1 1 ..., and finding the leftmost key (similar to lower_bound in the STL) means finding the leftmost 1 in that model. If the problem is instead to find the last occurrence of the key (related to upper_bound in the STL, which returns the position just past it), simply change the predicate to "x is less than or equal to key". The problem then becomes a search on the sequence 1 1 1 1 ... 0 0 0 0 ...
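The two predicates can be tried out directly with std::partition_point, a standard algorithm that returns the first element for which a predicate is false on a partitioned range. This is only a sketch; the function names leftmost_at_least and rightmost_at_most are invented for the example.

```cpp
#include <algorithm>
#include <vector>

// Index of the first element >= key, i.e. the leftmost 1 of "x >= key".
// partition_point skips the prefix where "x < key" holds.
int leftmost_at_least(const std::vector<int>& a, int key) {
    auto it = std::partition_point(a.begin(), a.end(),
                                   [key](int x) { return x < key; });
    return static_cast<int>(it - a.begin());
}

// Index of the last element <= key, i.e. the rightmost 1 of "x <= key",
// or -1 if every element is greater than key.
int rightmost_at_most(const std::vector<int>& a, int key) {
    auto it = std::partition_point(a.begin(), a.end(),
                                   [key](int x) { return x <= key; });
    return static_cast<int>(it - a.begin()) - 1;
}
```

For a = {1, 2, 3, 3, 4, 5} and key 3, the first function returns 2 (the first 3) and the second returns 3 (the last 3).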

The lookup problem thus becomes the problem of finding the leftmost or the rightmost 1 in such a sequence.

Finding a root of a monotone function is similar: use the predicate "the function value is greater than or equal to 0". This again gives a sequence like those above: the 0011 pattern if the function is monotonically increasing, and the 1100 pattern otherwise. When the argument of the function ranges over the real numbers, the sequence is effectively infinite: 1111 ... 0000 with infinitely many values in between and an infinitely thin boundary between 0 and 1. Looking for the rightmost 1 then generally means looking for an approximation. We bisect over the reals (source program 4 below) and use fabs(begin - end) to control the precision and decide when to stop iterating. For example, POJ 3122 looks for the rightmost 1 in an infinite 1111 ... 0000 sequence and the argument value that corresponds to it.


Implementation:
The basic binary search that checks whether a key is in a sequence (returns -1 if the key does not exist, otherwise the index of its position):

Source program 1:

#include <vector>

int binary_search(const std::vector<int>& array, int key) {
    int begin = 0;
    int end = static_cast<int>(array.size()) - 1;
    // Use <= so that a range with a single element is still examined.
    while (begin <= end) {
        int mid = begin + (end - begin) / 2;
        if (array[mid] < key)
            begin = mid + 1;
        else if (array[mid] > key)
            end = mid - 1;
        else
            return mid;
    }
    return -1;
}


In the programs below we always assume that such a 1 exists. If it might not, add a check at the end that the element found really is 1.

Finding the leftmost 1 of a 0011 sequence:

Source program 2:

// array holds the 0/1 values of the predicate.
int binary_search(const std::vector<int>& array) {
    int begin = 0;
    int end = static_cast<int>(array.size()) - 1;
    while (begin < end) {
        int mid = begin + (end - begin) / 2;
        if (array[mid] == 1) {
            end = mid;
        }
        else {
            begin = mid + 1;
        }
    }
    return begin;
}
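The final check for a missing 1 can be added in one line. The sketch below is a variant of the leftmost-1 search that returns -1 when the sequence contains no 1; the function name and the -1 convention are assumptions made for this example.

```cpp
#include <vector>

// Index of the leftmost 1 in a 0...01...1 sequence, or -1 if it has no 1.
int leftmost_one_or_minus1(const std::vector<int>& bits) {
    if (bits.empty()) return -1;
    int begin = 0;
    int end = static_cast<int>(bits.size()) - 1;
    while (begin < end) {
        int mid = begin + (end - begin) / 2;
        if (bits[mid] == 1)
            end = mid;        // mid may be the leftmost 1, keep it in range
        else
            begin = mid + 1;  // mid is 0, the answer is to the right
    }
    // The loop always stops on one candidate; verify that it really is a 1.
    return bits[begin] == 1 ? begin : -1;
}
```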

Finding the rightmost 1 of a 1100 sequence:

Source program 3:

// array holds the 0/1 values of the predicate.
int binary_search(const std::vector<int>& array) {
    int begin = 0;
    int end = static_cast<int>(array.size()) - 1;
    while (begin < end) {
        int mid = begin + (end - begin) / 2;  // wrong!!! It should be: int mid = begin + (end - begin + 1) / 2;
        if (array[mid] == 1) {
            begin = mid;
        }
        else {
            end = mid - 1;
        }
    }
    return begin;
}

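Bisection over floating-point numbers to solve a monotone function. This case is simpler than the previous ones, because there is no need to think hard about the begin and end boundaries; plain assignments of mid are enough. The loop runs while fabs(begin - end) exceeds 0.0000000001, which controls the precision. Sometimes this times out, and then either the value 0.0000000001 has to be adjusted or an iteration counter added so that the loop ends after a fixed number of steps:

Source program 4:

#include <cmath>

double f(double x);  // the monotonically decreasing function to solve, defined elsewhere

double binary_search(double min, double max) {
    double begin = min;
    double end = max;
    // Keep halving while the interval is wider than the required precision.
    while (std::fabs(begin - end) > 0.0000000001) {
        double mid = begin + (end - begin) / 2;
        if (f(mid) > 0) begin = mid;  // still on the 1 side, move right
        else end = mid;
    }
    return begin + (end - begin) / 2;
}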
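The iteration-counter variant mentioned above could look like the following sketch. It assumes f is continuous and monotonically decreasing on [lo, hi] with f(lo) >= 0 >= f(hi); the name bisect_decreasing and the default of 100 iterations are choices made for this example. A fixed number of halvings avoids the situation where the tolerance is smaller than the spacing between neighboring doubles and the interval stops shrinking.

```cpp
#include <functional>

// Approximates the root of a monotonically decreasing f on [lo, hi].
// Runs a fixed number of halvings instead of testing fabs(hi - lo),
// so the loop is guaranteed to terminate.
double bisect_decreasing(const std::function<double(double)>& f,
                         double lo, double hi, int iterations = 100) {
    for (int i = 0; i < iterations; ++i) {
        double mid = lo + (hi - lo) / 2;
        if (f(mid) > 0)
            lo = mid;  // f is still positive, the root is to the right
        else
            hi = mid;  // f is zero or negative, the root is at mid or to the left
    }
    return lo;
}
```

For example, calling it with f(x) = 2 - x * x on [0, 2] approximates the square root of 2.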

The correspondence between theory and implementation:

On careful observation, the implementation and the predicate from the theory are closely related. The if condition used inside the loop of each implementation is simply the predicate from the theoretical basis above, expressed in a programming language. Finding such a predicate is therefore what guides the implementation.

Guaranteeing the correctness of the program:

The correctness of the program rests mainly on loop invariants.

For a binary search, two invariants generally have to be established:

1. The current search range must contain the target element.
2. The search range becomes smaller on every iteration.

Invariant 1 keeps us from missing the target element, and invariant 2 guarantees that the program eventually terminates. If every branch of the loop preserves both invariants, the binary search normally contains no logic errors.
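Both invariants can be checked mechanically while developing. The following sketch instruments the leftmost-1 search with assert; the function name and the exact form of the checks are assumptions made for this illustration, and they rely on the input having the 0...01...1 shape with at least one 1.

```cpp
#include <cassert>
#include <vector>

// Leftmost 1 of a 0...01...1 sequence, with both invariants asserted.
int leftmost_one_checked(const std::vector<int>& bits) {
    assert(!bits.empty() && bits.back() == 1);  // assumes a 1 exists
    int begin = 0;
    int end = static_cast<int>(bits.size()) - 1;
    while (begin < end) {
        int old_size = end - begin + 1;
        int mid = begin + (end - begin) / 2;
        if (bits[mid] == 1)
            end = mid;
        else
            begin = mid + 1;
        // Invariant 1: the leftmost 1 is still inside [begin, end].
        assert(bits[end] == 1 && (begin == 0 || bits[begin - 1] == 0));
        // Invariant 2: the range is strictly smaller than before.
        assert(end - begin + 1 < old_size);
    }
    return begin;
}
```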

Look at source programs 2 and 3 above: some updates use mid + 1 or mid - 1 and some use plain mid. The purpose is to preserve invariant 1. For example, source program 2 writes end = mid rather than end = mid - 1 for a reason: in a 0011 sequence this mid may itself be the leftmost 1. With end = mid - 1, that 1 would fall outside the search range in the next iteration, and invariant 1 would no longer hold.

Now look carefully at source program 3 above. What happens if it is given the sequence 1 0?

It goes into an infinite loop. Why? Because invariant 2 is violated: nothing guarantees that the range shrinks. This comes from the way integer division rounds down, for example 3/2 = 1. With begin = 0 and end = 1 we get mid = 0, so begin = mid leaves the range unchanged. This is the off-by-one error most commonly made when writing a binary search. The fix is to use int mid = begin + (end - begin + 1) / 2. The branch end = mid - 1 obviously always decreases end, and rounding up guarantees mid > begin whenever begin < end, so begin = mid always increases begin. Invariant 2 is thereby preserved.
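The difference between the two midpoints can be observed safely with a step limit. In this sketch, rightmost_one_steps is a hypothetical helper that runs the rightmost-1 loop and reports how many iterations it needed, or -1 if it is still running after max_steps iterations.

```cpp
#include <vector>

// Runs the "rightmost 1" loop on a 1...10...0 sequence and returns the number
// of iterations used, or -1 if the loop is still running after max_steps.
// round_up selects mid = begin + (end - begin + 1) / 2 instead of rounding down.
int rightmost_one_steps(const std::vector<int>& bits, bool round_up,
                        int max_steps = 1000) {
    int begin = 0;
    int end = static_cast<int>(bits.size()) - 1;
    int steps = 0;
    while (begin < end) {
        if (++steps > max_steps) return -1;  // the range has stopped shrinking
        int mid = begin + (end - begin + (round_up ? 1 : 0)) / 2;
        if (bits[mid] == 1)
            begin = mid;
        else
            end = mid - 1;
    }
    return steps;
}
```

On the sequence 1 0, the round-down version hits the step limit and returns -1, while the round-up version finishes in a single iteration.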

Some may ask why source program 2 does not have this problem. The reason is that int mid = begin + (end - begin) / 2 guarantees, whenever begin != end, that mid is smaller than the current end (and when end - begin = 1, mid equals begin). So in source program 2, begin = mid + 1 always increases begin, and end = mid always decreases end. Together they preserve invariant 2.

When writing a program, making sure that these two invariants hold helps you write a correct binary search.

Basic examples

POJ 3233 3497 2104 2413 3273 3258 1905 3122

Notes:

poj1905 actually solves a transcendental equation, L"sinx-lx=0; source program 4 can be used to solve the equation by bisection.

poj3258 looks for the largest possible distance, which means finding the rightmost 1 in a 111000 sequence; see source program 3 (with the corrected mid).

poj3273 looks for the smallest possible value, which means finding the leftmost 1 in a 000111 sequence; see source program 2.
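As an illustration of the 111000 case, here is a sketch in the spirit of poj3258 River Hopscotch. The problem statement is assumed, not quoted: rocks lie at given positions between a start at 0 and an end at length, at most m rocks may be removed, and the smallest remaining gap should be as large as possible. The names gap_feasible and max_min_gap are hypothetical. Note the rounded-up midpoint, which the direct rightmost-1 search requires.

```cpp
#include <algorithm>
#include <vector>

// pos is sorted and includes the start (0) and the end of the river.
// True if removing at most m rocks can make every remaining gap >= d.
bool gap_feasible(const std::vector<int>& pos, int m, int d) {
    int removed = 0;
    int last = pos.front();
    for (std::size_t i = 1; i < pos.size(); ++i) {
        if (pos[i] - last < d)
            ++removed;      // gap too small, one rock has to go
        else
            last = pos[i];  // keep this rock
    }
    return removed <= m;
}

// Largest d for which gap_feasible() holds: the rightmost 1 of a 1100 pattern.
int max_min_gap(std::vector<int> rocks, int length, int m) {
    rocks.push_back(0);
    rocks.push_back(length);
    std::sort(rocks.begin(), rocks.end());
    int lo = 0;       // d = 0 is always feasible
    int hi = length;
    while (lo < hi) {
        int mid = lo + (hi - lo + 1) / 2;  // round up so lo = mid makes progress
        if (gap_feasible(rocks, m, mid))
            lo = mid;
        else
            hi = mid - 1;
    }
    return lo;
}
```

With rocks at 2, 14, 11, 21, 17, a river of length 25 and m = 2, the result is 4.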

Summary

First, find what justifies a binary search: a predicate that satisfies the main theorem, giving the pattern 0 0 0 ... 1 1 1 ...

Determine the upper and lower bounds of the search. Make them as loose as necessary so that no valid candidate is left outside the range. The upper bound can also be found by repeated doubling.

Decide whether the problem is a 0011-pattern or a 1100-pattern search.

When writing the program, make sure the two invariants are preserved.

Verify that the program handles two-element test cases such as 0 1 without error.

Use mid = begin + (end - begin) / 2, because mid = (begin + end) / 2 risks integer overflow. The binary search in early versions of the Java JDK had exactly this bug, and it was fixed only after Joshua Bloch, a well-known Java expert, pointed it out.
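The overflow risk is easy to demonstrate without actually triggering it. In the sketch below, safe_mid and sum_overflows_int are hypothetical helper names; the sum is computed in a wider type because overflowing a signed int is undefined behavior in C++.

```cpp
#include <climits>

// Midpoint of [begin, end] for 0 <= begin <= end, computed without overflow:
// end - begin always fits in an int when both values are non-negative.
int safe_mid(int begin, int end) {
    return begin + (end - begin) / 2;
}

// True if the intermediate value begin + end needed by (begin + end) / 2
// would not fit in an int.
bool sum_overflows_int(int begin, int end) {
    long long sum = static_cast<long long>(begin) + end;
    return sum > INT_MAX;
}
```

For begin = 1500000000 and end = 2000000000 with a 32-bit int, the sum does not fit, while safe_mid returns 1750000000.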

// BS_BasicBinarySearchSelfTest.cpp : Defines the entry point for the console application.
// This file is a summary of binary search.
// For number-key lookups, a predicate gives the pattern 0000...1111 or 1111...0000.
// With "x is greater than or equal to key" an ascending sequence becomes 0 0 ... 1 1, and
// finding the leftmost key (similar to lower_bound in the STL) means finding the leftmost 1.
// For the last occurrence of the key (similar to upper_bound in the STL), use
// "x is less than or equal to key"; the sequence becomes 1 1 ... 0 0.
// A binary search generally has to maintain two invariants:
//   1. The current search range must contain the target element.
//   2. The search range becomes smaller on every iteration.
// Invariant 1 keeps us from missing the target; invariant 2 guarantees termination.
// If every branch of the loop preserves both, the search normally has no logic errors.
#include <iostream>
#include <vector>
using namespace std;

// Monotone sequence: returns the position of the key if it exists, otherwise -1.
int BinarySearchIsKeyExists(const vector<int>& nums, int nKey) {
    int nBegin = 0;
    int nEnd = static_cast<int>(nums.size()) - 1;
    while (nBegin <= nEnd) {  // <= so that a one-element range is still checked
        int nMid = nBegin + (nEnd - nBegin) / 2;
        if (nums[nMid] == nKey) return nMid;
        else if (nums[nMid] < nKey) nBegin = nMid + 1;
        else nEnd = nMid - 1;
    }
    return -1;
}

// The functions below assume that the 1 exists. If it may not, check the result at the
// end, e.g. for "first element greater than key" verify that it really is greater.
// Leftmost 1 in a 0011 sequence: the position of the first element that is
// greater than the key in an ascending sequence.
int BinarySearchLowerBound(const vector<int>& nums, int nKey) {
    int nBegin = 0;
    int nEnd = static_cast<int>(nums.size()) - 1;
    while (nBegin < nEnd) {
        int nMid = nBegin + (nEnd - nBegin) / 2;
        if (nums[nMid] > nKey) nEnd = nMid;
        else nBegin = nMid + 1;
    }
    return nBegin;
}

// Rightmost 1 in a 1100 sequence: the position of the last element that is
// less than the key in an ascending sequence.
int BinarySearchUpperBound(const vector<int>& nums, int nKey) {
    int nBegin = 0;
    int nEnd = static_cast<int>(nums.size()) - 1;
    while (nBegin < nEnd) {
        int nMid = nBegin + (nEnd - nBegin + 1) / 2;  // notice the +1
        if (nums[nMid] < nKey) nBegin = nMid;
        else nEnd = nMid - 1;
    }
    return nBegin;
}

int main() {
    vector<int> testArray = {1, 2, 3, 3, 4, 5, 6, 7, 8};
    for (size_t i = 0; i < testArray.size(); ++i) cout << testArray[i] << ", ";
    cout << endl;
    for (size_t i = 0; i < testArray.size(); ++i) cout << i << ", ";
    cout << endl;
    cout << "is exists:" << BinarySearchIsKeyExists(testArray, 4) << endl;
    cout << "Lower bound:" << BinarySearchLowerBound(testArray, 4) << endl;
    cout << "Upper bound:" << BinarySearchUpperBound(testArray, 4) << endl;
    return 0;
}

