binary_search Function Usage

Source: Internet
Author: User
STL binary search (binary search in STL)

Section I
Distinguishing the search algorithms: count, find, binary_search, lower_bound, upper_bound, equal_range
This section summarizes Item 45 of Effective STL. It explains the similarities and differences among the search algorithms and when to use each.

The algorithms available for searching are count, find, binary_search, lower_bound, upper_bound, and equal_range. The variants that take a predicate or comparison function, such as count_if, find_if, or the comparator overload of binary_search, are used in much the same way and do not affect the choice, so they are not considered separately.
Note that these algorithms are intended for sequence containers and arrays. Associative containers provide member functions with the same names for all of them except binary_search.

First, whether the range is sorted is the critical factor when choosing a search algorithm.
The algorithms fall into two groups according to whether they require a sorted range:
A. count, find
B. binary_search, lower_bound, upper_bound, equal_range
Group A does not require a sorted range; group B does.
When the range is sorted, prefer group B, because its algorithms run in logarithmic time, whereas group A runs in linear time.

In addition, the two groups differ in how they decide that two values match. Group A uses equality (the element type must define operator==). Group B uses equivalence (the element type must define operator<, which must return false when the two values are equal).
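The difference between the two rules shows up as soon as the ordering ignores something that operator== does not. The following is a minimal sketch, assuming a case-insensitive comparison named ci_less; that name and the two helper functions are hypothetical and are not part of the original article.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// Hypothetical case-insensitive "less than" used as the sort order.
bool ci_less(const std::string& a, const std::string& b) {
    return std::lexicographical_compare(
        a.begin(), a.end(), b.begin(), b.end(),
        [](unsigned char x, unsigned char y) {
            return std::tolower(x) < std::tolower(y);
        });
}

// Group A: std::find compares with operator== (equality).
bool found_by_equality(const std::vector<std::string>& v, const std::string& s) {
    return std::find(v.begin(), v.end(), s) != v.end();
}

// Group B: std::binary_search treats x and y as matching when neither
// ci_less(x, y) nor ci_less(y, x) holds (equivalence).
bool found_by_equivalence(const std::vector<std::string>& v, const std::string& s) {
    assert(std::is_sorted(v.begin(), v.end(), ci_less));  // precondition of group B
    return std::binary_search(v.begin(), v.end(), s, ci_less);
}
```

With a vector holding "apple", "Banana", "cherry", a search for "banana" fails under equality but succeeds under equivalence, because the ordering cannot tell "Banana" and "banana" apart.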

Differences within group A
count: returns the number of elements in the range equal to the value.
find: returns the position of the first element equal to the value.
When the lookup succeeds, find returns immediately, whereas count keeps going until it has examined the entire range, so find is more efficient.
Therefore, do not use count unless you actually need the number of matching elements.

Differences within group B (the examples use the sorted sequence {1, 3, 4, 5, 6})
binary_search: determines whether an element equivalent to the value exists.
lower_bound: returns the position of the first element >= the value; lower_bound(2) refers to 3, and lower_bound(3) refers to 3.
If the value exists, this is the position of its first occurrence; if it does not, it is the position of the next larger element.
upper_bound: returns the position of the first element > the value; upper_bound(2) refers to 3, and upper_bound(3) refers to 4.
This is the position just past the value, whether or not the value exists.
equal_range: returns a pair made up of the return values of lower_bound and upper_bound, that is, the range of all elements equivalent to the value.
Two points about equal_range are worth noting:
1. If the two iterators returned are the same, the range is empty and no such value exists.
2. The distance between the two iterators equals the number of equivalent elements, so for a sorted range equal_range does the work of both count and find.
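The values quoted for {1, 3, 4, 5, 6} can be checked with a short sketch. The helper names lower_index, upper_index, and equal_count are hypothetical; they only wrap the standard algorithms and report indices instead of iterators.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

typedef std::vector<int>::const_iterator Iter;

// Index of the first element >= value.
std::ptrdiff_t lower_index(const std::vector<int>& v, int value) {
    assert(std::is_sorted(v.begin(), v.end()));  // group B needs a sorted range
    return std::lower_bound(v.begin(), v.end(), value) - v.begin();
}

// Index of the first element > value.
std::ptrdiff_t upper_index(const std::vector<int>& v, int value) {
    assert(std::is_sorted(v.begin(), v.end()));
    return std::upper_bound(v.begin(), v.end(), value) - v.begin();
}

// Number of elements equivalent to value: the length of the equal_range.
std::ptrdiff_t equal_count(const std::vector<int>& v, int value) {
    assert(std::is_sorted(v.begin(), v.end()));
    std::pair<Iter, Iter> r = std::equal_range(v.begin(), v.end(), value);
    return r.second - r.first;
}
```

For the sequence above, lower_index(v, 2) and lower_index(v, 3) both give index 1 (the element 3), upper_index(v, 3) gives index 2 (the element 4), and equal_count(v, 2) is 0 because the range returned for a missing value is empty.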

Section II Binary search in the STL

If a C++ STL container holds a sorted sequence, the STL provides four functions for searching it, all implemented with binary search.
Assuming that several elements may share the same value:
lower_bound returns the position of the first element that is not less than the specified value.
upper_bound returns the position just past the last element equal to the specified value, that is, the first element greater than it.
equal_range returns the first and one-past-the-last positions of all elements equal to the specified value, which are the results of lower_bound and upper_bound.
binary_search returns whether the element being looked for exists.

Section III Effective STL #45

Item 45: Distinguish among count, find, binary_search, lower_bound, upper_bound, and equal_range

So you want to look for something, and you have a container or a pair of iterators delimiting a range where you think it is. How do you conduct the search? The arrows in your quiver are count, count_if, find, find_if, binary_search, lower_bound, upper_bound, and equal_range. Faced with all of them, how do you choose?

Simple. You reach for whatever is fast and simple. The faster and simpler, the better.

For the time being, I'll assume you have a pair of iterators specifying the range to search. Later, I'll consider the case where you have a container instead of a range.

In selecting a search strategy, much depends on whether your iterators define a sorted range. If they do, you can get speedy (usually logarithmic-time, see Item 34) lookups via binary_search, lower_bound, upper_bound, and equal_range. If the iterators do not delimit a sorted range, you are limited to the linear-time algorithms count, count_if, find, and find_if. In what follows, I'll ignore the _if variants of count and find, just as I'll ignore the variants of binary_search, lower_bound, upper_bound, and equal_range that take a predicate. Whether you rely on the default search predicate or specify your own, the considerations for choosing a search algorithm are the same.

If you have an unsorted range, your choices are count or find. They answer slightly different questions, so it's worth distinguishing them carefully. count answers the question, "Is the value there, and if so, how many copies are there?" find answers the question, "Is it there, and if so, where is it?"

Suppose all you want to know is whether some particular Widget value w is in a list. Using count, the code looks like this:
list<Widget> lw;                        // list of Widgets
Widget w;                               // particular Widget value
...
if (count(lw.begin(), lw.end(), w)) {
  ...                                   // w is in lw
} else {
  ...                                   // it's not
}

This demonstrates a common idiom: using count as an existence test. count returns either zero or a positive number, so nonzero converts to true and zero to false. It would arguably be clearer to state the intent explicitly:
if (count(lw.begin(), lw.end(), w) != 0) ...

Some programmers do write it that way, but relying on the implicit conversion, as in the original example, is more common.

Compared with that code, using find is slightly more complicated, because you have to test find's return value against the list's end iterator:
if (find(lw.begin(), lw.end(), w) != lw.end()) {
  ...                                   // found it
} else {
  ...                                   // didn't find it
}

For existence testing, the count idiom is slightly simpler to write. When the search succeeds, however, it is less efficient, because find stops as soon as it finds a matching value, while count must keep searching to the end of the range for further matches. For most programmers, find's efficiency advantage is enough to justify the slight increase in complexity.

Often, knowing whether a value is in the range is not enough. Instead, you want the first object in the range that equals the value. For example, you might want to print the object, insert something in front of it, or erase it (but see Item 9 for guidance on erasing while iterating). When you need to know not just whether a value exists but also which object (or objects) has that value, you need find:
list<Widget>::iterator i = find(lw.begin(), lw.end(), w);
if (i != lw.end()) {
  ...                                   // found it; i points to the first one
} else {
  ...                                   // didn't find it
}

For sorted ranges, you have other choices, and you will definitely want to use them. count and find run in linear time, but the search algorithms for sorted ranges (binary_search, lower_bound, upper_bound, and equal_range) run in logarithmic time.

The shift from unsorted to sorted ranges brings another shift: from using equality to decide whether two values are the same to using equivalence. Item 19 describes the difference between equality and equivalence in detail, so I won't repeat it here. I'll simply note that count and find search using equality, while binary_search, lower_bound, upper_bound, and equal_range use equivalence.

To test whether a value exists in a sorted range, use binary_search. Unlike bsearch in the standard C library (and hence also in the standard C++ library), binary_search returns only a bool: whether the value was found. binary_search answers the question, "Is it there?" and its answer can only be yes or no. If you need more information than that, you need a different algorithm.

Here is an example of binary_search applied to a sorted vector (Item 23 explains the advantages of sorted vectors):
vector<Widget> vw;                      // create vector, put
...                                     // data into it,
sort(vw.begin(), vw.end());             // sort the data
Widget w;                               // value to search for
...
if (binary_search(vw.begin(), vw.end(), w)) {
  ...                                   // w is in vw
} else {
  ...                                   // it's not
}

If you have a sorted range and your question is, "Is it there, and if so, where is it?", you want equal_range, though you may think you want lower_bound. I'll discuss equal_range shortly, but first let's look at using lower_bound to locate a value in a range.

When you ask lower_bound to look for a value, it returns an iterator pointing either to the first copy of that value (if found) or to the position where the value could be inserted (if not). lower_bound thus answers the question, "Is it there? If so, where is the first copy? If not, where would it go?" As with find, you have to test the result of lower_bound to see whether it points to the value you are looking for. Unlike with find, you cannot just compare lower_bound's return value with the end iterator. Instead, you must check whether the object lower_bound identifies has the value you want.

Many programmers use lower_bound like this:
vector<Widget>::iterator i = lower_bound(vw.begin(), vw.end(), w);
if (i != vw.end() && *i == w) {         // make sure i points to an object;
                                        // also make sure the object has the
                                        // correct value. This is a bug!
  ...                                   // found the value; i points to the
                                        // first object with that value
} else {
  ...                                   // not found
}

This works most of the time, but it is not really correct. Look again at the test that determines whether the desired value was found:
if (i != vw.end() && *i == w) ...

This is an equality test, but lower_bound searched using equivalence. Most of the time the two tests yield the same result, but Item 19 shows that it is not hard to come up with cases where equality and equivalence differ. In such cases, the code above is wrong.

To do it properly, you must check whether the object the iterator returned by lower_bound points to is equivalent to the value you are looking for. You can do this by hand (Item 19 shows how, and Item 24 gives an example of when it may be worth the effort), but it is tricky, because you must be sure to use the same comparison function that lower_bound used. In general, that can be an arbitrary function (or function object). If you pass a comparison function to lower_bound, your hand-written equivalence test must use the same one. That means that if you change the comparison function passed to lower_bound, you must make the corresponding change in your equivalence test. Keeping the comparison functions in sync is not rocket science, but it is one more thing to remember, and I suspect you already have plenty to remember.
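One way to keep the two comparisons in sync is to route both through a single parameter. The following is a minimal sketch of a hand-written equivalence test; the names find_equivalent and contains_descending are hypothetical helpers, not part of the STL or of the original text.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Returns an iterator to the first element equivalent to value under comp,
// or last if there is none.
template <typename Iter, typename T, typename Compare>
Iter find_equivalent(Iter first, Iter last, const T& value, Compare comp) {
    assert(std::is_sorted(first, last, comp));  // range must be sorted by comp
    Iter i = std::lower_bound(first, last, value, comp);
    // lower_bound already guarantees !comp(*i, value); equivalence also
    // requires !comp(value, *i). Both tests use the same comp.
    if (i != last && !comp(value, *i)) return i;
    return last;
}

// Example use: a vector kept in descending order.
bool contains_descending(const std::vector<int>& v, int value) {
    return find_equivalent(v.begin(), v.end(), value, std::greater<int>()) != v.end();
}
```

Because the comparison function is passed once and used for both the search and the final check, changing the ordering cannot leave the equivalence test behind.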

There is an easier way: use equal_range. equal_range returns a pair of iterators, the first equal to the iterator lower_bound would return, the second equal to the one upper_bound would return (that is, the one-past-the-end iterator for the range of values equivalent to the one searched for). equal_range thus returns a pair of iterators that delimit the range of values equivalent to the one you searched for. A well-named algorithm, isn't it? (Of course, equivalent_range might be a better name, but equal_range is still very good.)

There are two important things to note about equal_range's return value. First, if the two iterators are the same, the range of objects is empty: the value was not found. That observation is how equal_range answers the question, "Is it there?" You use it like this:

vector<Widget> vw;
...
sort(vw.begin(), vw.end());
typedef vector<Widget>::iterator VWIter;        // convenience typedefs
typedef pair<VWIter, VWIter> VWIterPair;
VWIterPair p = equal_range(vw.begin(), vw.end(), w);
if (p.first != p.second) {              // if equal_range didn't return
                                        // an empty range...
  ...                                   // found it; p.first points to the
                                        // first one and p.second points
                                        // to one past the last
} else {
  ...                                   // not found; p.first and p.second
                                        // both point to the insertion
}                                       // location for the value searched for

This code uses only equivalence, so it is always correct.

The second thing to note is that the distance between the two iterators equal_range returns equals the number of objects in the range, that is, the number of objects equivalent to the one searched for. As a result, equal_range not only does the job of find for sorted ranges, it also does the job of count. For example, to locate the Widgets in vw that are equivalent to w and then print how many there are, you could do this:
VWIterPair p = equal_range(vw.begin(), vw.end(), w);
cout << "There are " << distance(p.first, p.second)
     << " elements in vw equivalent to w.";
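To make the counting behavior concrete, here is a compilable sketch. The Widget defined here is an assumption for illustration only: it is ordered by id alone, so two widgets with the same id are equivalent even when their labels differ. The helper name count_equivalent is likewise hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <string>
#include <utility>
#include <vector>

// Hypothetical Widget, ordered by id only.
struct Widget {
    int id;
    std::string label;
};
bool operator<(const Widget& lhs, const Widget& rhs) { return lhs.id < rhs.id; }

typedef std::vector<Widget>::const_iterator VWIter;

// Number of widgets in the sorted vector vw that are equivalent to w.
std::ptrdiff_t count_equivalent(const std::vector<Widget>& vw, const Widget& w) {
    assert(std::is_sorted(vw.begin(), vw.end()));
    std::pair<VWIter, VWIter> p = std::equal_range(vw.begin(), vw.end(), w);
    return std::distance(p.first, p.second);  // 0 means "not found"
}
```

Note that this Widget defines no operator== at all, so the lower_bound idiom with *i == w would not even compile here, while equal_range needs nothing beyond operator<.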

So far, the discussion has assumed that we want to search for a value in a range, but sometimes we are more interested in finding a location in a range. For example, suppose we have a Timestamp class and a vector of Timestamps sorted so that older timestamps come first:
class Timestamp { ... };
bool operator<(const Timestamp& lhs,    // returns whether lhs
               const Timestamp& rhs);   // precedes rhs in time
vector<Timestamp> vt;                   // create vector, fill it with
...                                     // data, sort it so that older
sort(vt.begin(), vt.end());             // times precede newer ones

Now suppose we have a special timestamp, ageLimit, and we want to remove from vt all timestamps older than ageLimit. In this case, we do not want to search vt for a Timestamp equivalent to ageLimit, because there might be no element with that exact value. Instead, we need to find a location in vt: the first element that is no older than ageLimit. That is easy, because lower_bound gives us exactly that:
Timestamp ageLimit;
...
vt.erase(vt.begin(), lower_bound(vt.begin(),    // eliminate from vt all
                                 vt.end(),      // objects that precede
                                 ageLimit));    // ageLimit's value

If our requirements change slightly so that we must eliminate all timestamps at least as old as ageLimit, we need to find the location of the first timestamp that is younger than ageLimit. That is a job tailor-made for upper_bound:
vt.erase(vt.begin(), upper_bound(vt.begin(),    // eliminate from vt all
                                 vt.end(),      // objects that precede or
                                 ageLimit));    // are equivalent to ageLimit

upper_bound is also useful if you want to insert things into a sorted range so that objects with equivalent values are stored in the order in which they were inserted. For example, you might have a sorted list of Person objects, ordered by name:
class Person {
public:
  ...
  const string& name() const;
  ...
};

struct PersonNameLess:
  public binary_function<Person, Person, bool> {   // see Item 40; note that
                                                   // binary_function was removed in C++17
  bool operator()(const Person& lhs, const Person& rhs) const
  {
    return lhs.name() < rhs.name();
  }
};

list<Person> lp;
...
lp.sort(PersonNameLess());              // sort lp using PersonNameLess

To keep the list sorted the way we want (by name, with equivalent names kept in insertion order), we can use upper_bound to determine the insertion position:
Person newPerson;
...
lp.insert(upper_bound(lp.begin(),       // insert newPerson after
                      lp.end(),         // the last object in lp
                      newPerson,        // that precedes or is
                      PersonNameLess()),// equivalent to newPerson
          newPerson);

This works well and is convenient, but don't be misled into thinking that this use of upper_bound magically finds the insertion position in a list in logarithmic time. It doesn't. As Item 34 explains, because we are working with a list, the lookup takes linear time, although it performs only a logarithmic number of comparisons.
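The insertion-order guarantee can be observed with a small sketch. The Person below is a simplified stand-in with public members and an added id field so that people with equivalent names can be told apart; both the id field and the helper insert_sorted are assumptions for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <list>
#include <string>

// Hypothetical Person; id exists only to distinguish equivalent names.
struct Person {
    std::string name;
    int id;
};

struct PersonNameLess {
    bool operator()(const Person& lhs, const Person& rhs) const {
        return lhs.name < rhs.name;
    }
};

// Insert p after every person whose name precedes or is equivalent to p's.
// On a list the iterator walk is linear, even though the number of
// comparisons is logarithmic.
void insert_sorted(std::list<Person>& lp, const Person& p) {
    assert(std::is_sorted(lp.begin(), lp.end(), PersonNameLess()));
    lp.insert(std::upper_bound(lp.begin(), lp.end(), p, PersonNameLess()), p);
}
```

Inserting "Bob" (id 2), then "Ann" (id 1), then a second "Ann" (id 3) yields the ids 1, 3, 2: the list stays sorted by name and the later "Ann" follows the earlier one. Using lower_bound instead would place the later "Ann" first.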

Up to this point, I have considered only the case where you have a pair of iterators defining a range to search. Often you have a container rather than a range. In that case, you must distinguish between sequence and associative containers. For the standard sequence containers (vector, string, deque, and list), follow the advice given so far, using the containers' begin and end iterators to delimit the range.

The situation is different for the standard associative containers (set, multiset, map, and multimap), because they offer member functions for searching that are generally better choices than the STL algorithms. Item 44 explains in detail why they are better; briefly, they are faster and behave more naturally. Fortunately, the member functions usually have the same names as the corresponding algorithms, so where the preceding discussion recommends the algorithms count, find, equal_range, lower_bound, or upper_bound, you simply use the member functions of the same name when searching associative containers.

binary_search calls for a different strategy, because there is no member function analogue of this algorithm. To test whether a value exists in a set or map, use count in its idiomatic role as a test for membership:
set<Widget> s;                          // create set, put data into it
...
Widget w;                               // w still holds the value to search for
...
if (s.count(w)) {
  ...                                   // a value equivalent to w exists
} else {
  ...                                   // no such value exists
}

To test whether a value exists in a multiset or multimap, find is generally better than count, because find can stop once it finds a single object with the desired value, while count, in the worst case, must examine every object in the container. (For set and map this is not an issue, because set does not allow duplicate values and map does not allow duplicate keys.)

However, count is the reliable way to count things in associative containers. In particular, it is a better choice than calling equal_range and applying distance to the resulting iterators. First, it is clearer: count means "count." Second, it is easier: there is no need to create a pair and pass its first and second components to distance. Third, it is probably a little faster.
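The member-function advice can be sketched with containers of int. The function names contains and count_in_multiset are hypothetical; the assertion inside count_in_multiset is there only to show that member count agrees with the equal_range-plus-distance approach it replaces.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <set>
#include <utility>

// Existence test for a set: member count returns 0 or 1.
bool contains(const std::set<int>& s, int w) { return s.count(w) != 0; }

// Existence test for a multiset: member find can stop at the first match.
bool contains(const std::multiset<int>& ms, int w) { return ms.find(w) != ms.end(); }

// Counting in a multiset: member count gives the same answer as
// equal_range followed by distance, with less code at the call site.
std::size_t count_in_multiset(const std::multiset<int>& ms, int w) {
    typedef std::multiset<int>::const_iterator Iter;
    std::pair<Iter, Iter> r = ms.equal_range(w);
    const std::size_t n = ms.count(w);
    assert(n == static_cast<std::size_t>(std::distance(r.first, r.second)));
    return n;
}
```

In real code the equal_range call would simply be dropped; ms.count(w) on its own states the intent.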

Given everything considered in this Item, where do we stand? The following table says it all.

What you want to know | Unsorted range | Sorted range | set or map | multiset or multimap
----------------------|----------------|--------------|------------|---------------------
Does the desired value exist? | find | binary_search | count | find
Does the desired value exist? If so, where is the first object with that value? | find | equal_range | find | find or lower_bound (see below)
Where is the first object with a value not preceding the desired value? | find_if | lower_bound | lower_bound | lower_bound
Where is the first object with a value succeeding the desired value? | find_if | upper_bound | upper_bound | upper_bound
How many objects have the desired value? | count | equal_range, then distance | count | count
Where are all the objects with the desired value? | find (iteratively) | equal_range | equal_range | equal_range



In the column of the table summarizing how to work with sorted ranges, the frequency with which equal_range appears may be surprising. That frequency arises from the importance of testing for equivalence when searching. With lower_bound and upper_bound, it is too easy to fall back on equality tests, but with equal_range, testing only for equivalence is the natural thing to do. In the second row for sorted ranges, equal_range beats out find for an additional reason: equal_range runs in logarithmic time, while find takes linear time.

For multisets and multimaps, the table lists both find and lower_bound as candidates when you are looking for the first object with a particular value. find is the usual choice for this job, and you may have noticed that it is the only one listed in the set and map column. For the multi containers, however, find is not guaranteed to identify the first element with the given value if more than one is present; it only identifies one of them. If you really need the first object with the given value, use lower_bound, and then perform manually the second half of the equivalence test described in Item 19 to confirm that you have found the value you were looking for. (You could avoid the manual equivalence test by using equal_range, but calling equal_range is more expensive than calling lower_bound.)

Choosing among count, find, binary_search, lower_bound, upper_bound, and equal_range is easy. Choose the algorithm or member function that gives you the behavior and performance you need and that requires the least work when you call it. Follow that advice (or consult the table), and you should never get confused.

Binary search, also known as half-interval search, works as follows. The elements of the dictionary are stored in an array in ascending order of key. The given value key is first compared with the key of the element in the middle of the dictionary. If they are equal, the search succeeds. If key is smaller, the search continues in the first half of the dictionary; if key is larger, it continues in the second half. Each comparison thus halves the search range, and the process continues until the search succeeds or the range is empty and the search fails. Binary search is an efficient search method, but it requires the dictionary to be stored in a sequential (array-like) table sorted by key.
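The procedure just described can be written out by hand. This is a minimal sketch over a vector of int keys, not the way any particular STL implementation does it; the function name binary_search_index and the convention of returning -1 on failure are assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the index of key in the ascending vector dict, or -1 if absent.
std::ptrdiff_t binary_search_index(const std::vector<int>& dict, int key) {
    assert(std::is_sorted(dict.begin(), dict.end()));
    std::size_t lo = 0, hi = dict.size();  // search the half-open range [lo, hi)
    while (lo < hi) {
        const std::size_t mid = lo + (hi - lo) / 2;  // avoids overflow of lo + hi
        if (dict[mid] == key) return static_cast<std::ptrdiff_t>(mid);
        if (key < dict[mid]) hi = mid;       // continue in the first half
        else                 lo = mid + 1;   // continue in the second half
    }
    return -1;  // the range became empty: the search failed
}
```

Unlike std::binary_search, this version reports where the key was found, but when duplicates are present it returns whichever matching position it reaches first, not necessarily the first one; std::lower_bound is the tool for that.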
