Differences between the STL algorithms count, find, binary_search, lower_bound, upper_bound, and equal_range


Suppose you want to look for something, and you have a container or a range delimited by iterators in which it should be located. How do you conduct the search? The arrows in your quiver are count, count_if, find, find_if, binary_search, lower_bound, upper_bound, and equal_range. Faced with these choices, how do you decide?

Simple. You reach for something that is fast and simple. The faster and simpler, the better.

For the moment, I will assume you have a pair of iterators specifying the range to search. Later, I will consider the case where you have a container instead of a range.

To select a search strategy, you must determine whether your iterators define a sorted range. If they do, you can get fast (usually logarithmic-time; see Clause 34) lookups through binary_search, lower_bound, upper_bound, and equal_range. If the iterators do not delimit a sorted range, you are limited to the linear-time algorithms count, count_if, find, and find_if. In what follows, I will ignore the _if variants of count and find, just as I will ignore the variants of binary_search, lower_bound, upper_bound, and equal_range that take a predicate. Whether you rely on the default search predicate or specify your own, the considerations for choosing a search algorithm are the same.

If you have an unsorted range, your choices are count and find. They answer slightly different questions, so it is worth distinguishing them carefully. count answers the question "Is the value there, and if so, how many copies are there?" find answers the question "Is it there, and if so, where is it?"

Suppose all you want to know is whether a particular Widget value w is in a list. Using count, the code looks like this:

list<Widget> lw;                        // list of Widgets
Widget w;                               // a particular Widget value
...
if (count(lw.begin(), lw.end(), w)) {
    ...                                 // w is in lw
} else {
    ...                                 // it is not
}

This demonstrates a common idiom: count used as an existence test. count returns zero or a positive number, so we rely on nonzero converting to true and zero converting to false. It would arguably be clearer to state the intent explicitly:

if (count(lw.begin(), lw.end(), w) != 0) ...

Some programmers do write it that way, but the implicit conversion, as in the first example, is more common.

Compared with that code, using find is slightly more complicated, because you must test whether find's return value equals the list's end iterator:

if (find(lw.begin(), lw.end(), w) != lw.end()) {
    ...                                 // found it
} else {
    ...                                 // did not find it
}

If all you need is an existence test, the count idiom is slightly easier to code. However, it is less efficient when the search succeeds, because find stops as soon as it finds a matching value, while count must continue to the end of the range looking for further matches. For most programmers, find's efficiency advantage is enough to justify the slight increase in complexity.

Often, knowing whether a value is in a range is not enough. Instead, you want to get at the first object in the range that is equal to the value. For example, you may want to print the object, insert something before it, or erase it (but see Clause 9 for guidance on erasing while iterating). When you need to know not only whether a value exists but also which object (or objects) has that value, you need find:

list<Widget>::iterator i = find(lw.begin(), lw.end(), w);
if (i != lw.end()) {
    ...                                 // found it; i points to the first one
} else {
    ...                                 // did not find it
}
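
As a minimal sketch of the two idioms above, the following uses a list of ints in place of Widgets. The helper names containsByCount and containsByFind are hypothetical and exist only for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <list>

// Existence test via count: always walks the whole range.
bool containsByCount(const std::list<int>& values, int target) {
    return std::count(values.begin(), values.end(), target) != 0;
}

// Existence test via find: stops at the first match.
bool containsByFind(const std::list<int>& values, int target) {
    return std::find(values.begin(), values.end(), target) != values.end();
}
```

Both functions give the same answer; they differ only in how much of the range they visit when the value is present.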

For sorted ranges, you have other options, and you should definitely use them. count and find run in linear time, but the search algorithms for sorted ranges (binary_search, lower_bound, upper_bound, and equal_range) run in logarithmic time.

Moving from unsorted to sorted ranges brings another shift: from using equality to decide whether two values are the same to using equivalence. Clause 19 describes the difference between equality and equivalence in detail, so I won't repeat it here. I will simply note that count and find search using equality, while binary_search, lower_bound, upper_bound, and equal_range use equivalence.

To test whether a value exists in a sorted range, use binary_search. Unlike bsearch in the standard C library (and hence also in the standard C++ library), binary_search returns only a bool: whether the value was found. binary_search answers the question "Is it there?" and its answer is yes or no. If you need more information than that, you need a different algorithm.

Here is an example of binary_search applied to a sorted vector (Clause 23 describes the advantages of sorted vectors):

vector<Widget> vw;                      // create a vector, put
...                                     // data into it,
sort(vw.begin(), vw.end());             // and sort the data
Widget w;                               // the value to search for
...
if (binary_search(vw.begin(), vw.end(), w)) {
    ...                                 // w is in vw
} else {
    ...                                 // it is not
}
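
The same pattern applies when the range is sorted with a custom ordering: the comparison passed to binary_search has to match the one used for sorting. The sketch below assumes a vector of ints sorted in descending order; the function name containsDescending is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Sorts a copy in descending order, then tests for membership.
// The comparator given to binary_search must match the one used to sort.
bool containsDescending(std::vector<int> values, int target) {
    std::sort(values.begin(), values.end(), std::greater<int>());
    return std::binary_search(values.begin(), values.end(), target,
                              std::greater<int>());
}
```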

If you have a sorted range and your question is "Is it there, and if so, where is it?", you may be tempted to reach for lower_bound. I will discuss equal_range shortly, but first, let's see how lower_bound locates a value in a range.

When you ask lower_bound to look for a value, it returns an iterator pointing either to the first copy of that value (if it is found) or to the position where the value could be inserted (if it is not). lower_bound thus answers the question "Is it there? If so, where is the first copy, and if not, where would it go?" As with find, you must test the result of lower_bound to see whether it points to the value you are looking for. Unlike with find, you cannot just compare lower_bound's return value against the end iterator. Instead, you must check whether the object lower_bound identifies has the value you want.

Many programmers use lower_bound as follows:

vector<Widget>::iterator i = lower_bound(vw.begin(), vw.end(), w);
if (i != vw.end() && *i == w) {         // make sure i points to an object;
                                        // make sure the object has the
                                        // correct value. This is a bug!
    ...                                 // found the value; i points to the
                                        // first object equal to it
} else {
    ...                                 // not found
}

This works in most cases, but it is not really correct. Look again at the test that decides whether the desired value was found:

if (i != vw.end() && *i == w) ...

This is an equality test, but lower_bound searched using equivalence. In most cases, equivalence tests and equality tests yield the same results, but Clause 19 shows that it is not hard to find cases where they differ. In such cases, the code above is wrong.

To do things properly, you must check whether the iterator returned by lower_bound points to an object whose value is equivalent to the one you searched for. You could do that manually (Clause 19 shows how, and Clause 24 gives an example of when it is worthwhile), but it can get tricky, because you must be sure to use the same comparison function that lower_bound used. In general, that could be an arbitrary function (or function object). If you passed a comparison function to lower_bound, you must use the same comparison function in your hand-coded equivalence test. That means that if you change the comparison function passed to lower_bound, you also have to change your equivalence check. Keeping the comparison functions in sync is not rocket science, but it is one more thing to remember, and I suspect you already have plenty to keep in mind.
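
To make the difference concrete, here is a small sketch in which equality and equivalence disagree. It assumes strings ordered by a hypothetical case-insensitive comparison, CaseInsensitiveLess; the two function names are likewise invented for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// Hypothetical ordering: compares strings while ignoring letter case.
struct CaseInsensitiveLess {
    bool operator()(const std::string& lhs, const std::string& rhs) const {
        return std::lexicographical_compare(
            lhs.begin(), lhs.end(), rhs.begin(), rhs.end(),
            [](unsigned char a, unsigned char b) {
                return std::tolower(a) < std::tolower(b);
            });
    }
};

// The flawed test: searches by equivalence, then confirms by equality.
bool foundByEquality(const std::vector<std::string>& sorted,
                     const std::string& target) {
    auto i = std::lower_bound(sorted.begin(), sorted.end(), target,
                              CaseInsensitiveLess());
    return i != sorted.end() && *i == target;
}

// The hand-written equivalence test, using the same comparison as the search.
// lower_bound already guarantees that *i does not precede target, so the two
// are equivalent exactly when target does not precede *i either.
bool foundByEquivalence(const std::vector<std::string>& sorted,
                        const std::string& target) {
    CaseInsensitiveLess less;
    auto i = std::lower_bound(sorted.begin(), sorted.end(), target, less);
    return i != sorted.end() && !less(target, *i);
}
```

With the range {"apple", "Banana", "cherry"}, a search for "banana" fails under the equality test but succeeds under the equivalence test, because "Banana" and "banana" are equivalent under this ordering without being equal.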

There is an easier way: use equal_range. equal_range returns a pair of iterators, the first equal to the iterator lower_bound would return and the second equal to the one upper_bound would return (that is, the one-past-the-end iterator for the range of values equivalent to the one searched for). equal_range therefore returns a pair of iterators that delimit the range of values equivalent to the one you searched for. A well-named algorithm, no? (equivalent_range might be better, of course, but equal_range is still pretty good.)

There are two important observations about equal_range's return value. First, if the two iterators are the same, the range of objects is empty: the value was not found. That observation is the key to using equal_range to answer the question "Is it there?" You use it like this:

vector<Widget> vw;
...
sort(vw.begin(), vw.end());
typedef vector<Widget>::iterator VWIter;            // convenience typedefs
typedef pair<VWIter, VWIter> VWIterPair;
VWIterPair p = equal_range(vw.begin(), vw.end(), w);
if (p.first != p.second) {              // if equal_range did not return
                                        // an empty range...
    ...                                 // found it; p.first points to the
                                        // first one and p.second points
                                        // to one past the last
} else {
    ...                                 // not found; both p.first and
                                        // p.second point to the insertion
}                                       // location for the value

This code uses only equivalence, so it is always correct.
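
A compact version of this idiom, sketched with a sorted vector of ints rather than Widgets, might look like the following. The function names containsSorted and firstOrInsertionIndex are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// True when the sorted vector holds at least one element equivalent to target.
bool containsSorted(const std::vector<int>& sorted, int target) {
    auto p = std::equal_range(sorted.begin(), sorted.end(), target);
    return p.first != p.second;         // an empty range means "not found"
}

// Index of the first equivalent element or, if there is none, of the position
// where target could be inserted without breaking the sort order.
std::size_t firstOrInsertionIndex(const std::vector<int>& sorted, int target) {
    auto p = std::equal_range(sorted.begin(), sorted.end(), target);
    return static_cast<std::size_t>(std::distance(sorted.begin(), p.first));
}
```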

The second thing to note is that the iterators equal_range returns delimit a range, and the distance between them equals the number of objects in that range, that is, the objects equivalent to the value searched for. As a result, equal_range not only does the job of find for sorted ranges, it also replaces count. For example, to locate the Widgets in vw that are equivalent to w and then print how many such Widgets exist, you could do this:

VWIterPair p = equal_range(vw.begin(), vw.end(), w);
cout << "There are " << distance(p.first, p.second)
     << " elements in vw equivalent to w.";

So far, the discussion has assumed that we want to search for a value in a range, but sometimes we are more interested in finding a position in the range. For example, suppose we have a Timestamp class and a vector of Timestamps that is sorted so that older timestamps come before newer ones:

class Timestamp { ... };
bool operator<(const Timestamp& lhs,    // returns whether lhs
               const Timestamp& rhs);   // precedes rhs in time
vector<Timestamp> vt;                   // create a vector, fill it with data,
...                                     // then sort it so that older times
sort(vt.begin(), vt.end());             // precede newer ones

Now suppose we have a special Timestamp, ageLimit, and we want to remove from vt all the Timestamps that are older than ageLimit. In this case, we do not want to search vt for a Timestamp equivalent to ageLimit, because there might be no element with that exact value. Instead, we need to find a position in vt: the first element that is no older than ageLimit. This is easy, because lower_bound gives us exactly that:

Timestamp ageLimit;
...
vt.erase(vt.begin(), lower_bound(vt.begin(),    // eliminate from vt all
                                 vt.end(),      // objects that precede
                                 ageLimit));    // ageLimit's value

If our requirements change slightly, so that we want to eliminate all Timestamps that are at least as old as ageLimit, we need to find the first Timestamp that is younger than ageLimit. That is a job tailor-made for upper_bound:

vt.erase(vt.begin(), upper_bound(vt.begin(),    // eliminate from vt all
                                 vt.end(),      // objects that precede or
                                 ageLimit));    // are equivalent to ageLimit
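
The difference between the two calls shows up only when elements equivalent to the limit are present. The sketch below assumes plain ints stand in for Timestamps (smaller means older); the function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Removes every time strictly older (smaller) than ageLimit.
void eraseOlderThan(std::vector<int>& sortedTimes, int ageLimit) {
    sortedTimes.erase(sortedTimes.begin(),
                      std::lower_bound(sortedTimes.begin(), sortedTimes.end(),
                                       ageLimit));
}

// Removes every time at least as old as ageLimit (older or equivalent).
void eraseOlderOrEqual(std::vector<int>& sortedTimes, int ageLimit) {
    sortedTimes.erase(sortedTimes.begin(),
                      std::upper_bound(sortedTimes.begin(), sortedTimes.end(),
                                       ageLimit));
}
```

Starting from {10, 20, 20, 30, 40} with a limit of 20, the first function leaves {20, 20, 30, 40} and the second leaves {30, 40}. With a limit that matches no element, such as 25, both leave {30, 40}.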

upper_bound is also useful when you want to insert objects into a sorted range so that objects with equivalent values stay in the order in which they were inserted. For example, you might have a sorted list of Person objects, ordered by name:

class Person {
public:
    ...
    const string& name() const;
    ...
};

struct PersonNameLess:
    public binary_function<Person, Person, bool> {  // see Clause 40;
                                        // binary_function was removed in
                                        // C++17, where this base class
                                        // can simply be omitted
    bool operator()(const Person& lhs, const Person& rhs) const
    {
        return lhs.name() < rhs.name();
    }
};

list<Person> lp;
...
lp.sort(PersonNameLess());              // sort lp using PersonNameLess

We can use upper_bound to find the insertion position that keeps the list in the desired order (sorted by name, with equivalent names remaining in insertion order):

Person newPerson;
...
lp.insert(upper_bound(lp.begin(),           // insert newPerson after
                      lp.end(),             // the last object in lp
                      newPerson,            // that precedes or is
                      PersonNameLess()),    // equivalent to newPerson
          newPerson);

This works well and is quite convenient, but it is important not to be misled into thinking that this use of upper_bound finds the insertion position in a list in logarithmic time. It does not. Clause 34 explains that because we are working with a list, the lookup takes linear time, although it performs only a logarithmic number of comparisons.
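
The ordering guarantee can be checked with a small self-contained sketch. It assumes a hypothetical NamedEntry struct in place of Person, with an id field that records insertion order; NameLess and insertSorted are also invented names.

```cpp
#include <algorithm>
#include <cassert>
#include <list>
#include <string>

// Hypothetical stand-in for Person: a name plus an id recording insertion order.
struct NamedEntry {
    std::string name;
    int id;
};

// Orders entries by name only, so entries with the same name are equivalent.
struct NameLess {
    bool operator()(const NamedEntry& lhs, const NamedEntry& rhs) const {
        return lhs.name < rhs.name;
    }
};

// Inserts after the last entry that precedes or is equivalent to newEntry.
void insertSorted(std::list<NamedEntry>& entries, const NamedEntry& newEntry) {
    entries.insert(
        std::upper_bound(entries.begin(), entries.end(), newEntry, NameLess()),
        newEntry);
}
```

Inserting Bob (id 1), Alice (id 2), Bob (id 3), and Bob (id 4) in that order yields Alice 2, Bob 1, Bob 3, Bob 4: the three equivalent entries keep their insertion order.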

So far, I have considered only the case where you have a pair of iterators defining the range to search. Often you have a container instead of a range. In that case, you must distinguish between sequence and associative containers. For the standard sequence containers (vector, string, deque, and list), follow the advice given so far, using the containers' begin and end iterators to delimit the range.

The situation is different for the standard associative containers (set, multiset, map, and multimap), because they offer member functions for searching that are generally better choices than the STL algorithms. Clause 44 explains in detail why they are better; briefly, they are faster and behave more naturally. Fortunately, the member functions usually have the same names as the corresponding algorithms, so when searching an associative container you can simply use the member function with the same name as the algorithm recommended above.

binary_search calls for a different strategy, because no member function corresponds to this algorithm. To test whether a value exists in a set or map, use the count member function in its idiomatic role as a membership test:

set<Widget> s;                          // create a set, put data into it
...
Widget w;                               // w still holds the value to search for
...
if (s.count(w)) {
    ...                                 // a value equivalent to w exists
} else {
    ...                                 // no such value exists
}

To test whether a value exists in a multiset or multimap, find is generally better than count, because find can stop as soon as it finds a single object with the desired value, while count, in the worst case, must examine every object in the container. (This is not a problem for set and map, because set does not allow duplicate values and map does not allow duplicate keys.)

However, count is a reliable way to count things in associative containers. In particular, it is a better choice than calling equal_range and then applying distance to the resulting iterators. First, it is clearer: count means "count". Second, it is simpler: you do not need to create a pair of iterators and then pass its components (first and second) to distance. Third, it is probably a little faster.
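
The member-function advice for associative containers can be sketched as follows, assuming containers of ints. The wrapper names inSet, inMultiset, and copiesIn are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Membership test on a set: the count member returns 0 or 1.
bool inSet(const std::set<int>& s, int value) {
    return s.count(value) != 0;
}

// Membership test on a multiset: find can stop at the first equivalent element.
bool inMultiset(const std::multiset<int>& ms, int value) {
    return ms.find(value) != ms.end();
}

// Counting on a multiset: the count member states the intent directly.
std::size_t copiesIn(const std::multiset<int>& ms, int value) {
    return ms.count(value);
}
```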

Where does everything considered so far leave us? The following table summarizes it.

| What you want to know | Algorithm to use: unsorted range | Algorithm to use: sorted range | Member function to use: set or map | Member function to use: multiset or multimap |
|---|---|---|---|---|
| Does the desired value exist? | find | binary_search | count | find |
| Does the desired value exist? If so, where is the first object with that value? | find | equal_range | find | find or lower_bound (see below) |
| Where is the first object with a value not preceding the desired value? | find_if | lower_bound | lower_bound | lower_bound |
| Where is the first object with a value succeeding the desired value? | find_if | upper_bound | upper_bound | upper_bound |
| How many objects have the desired value? | count | equal_range, then distance | count | count |
| Where are all the objects with the desired value? | find (iteratively) | equal_range | equal_range | equal_range |

In the column summarizing how to work with sorted ranges, the frequency with which equal_range appears may be surprising. That frequency arises from the importance of testing for equivalence when searching. With lower_bound and upper_bound, it is too easy to fall back on equality tests, but with equal_range, testing only for equivalence is the natural thing to do. In the second row for sorted ranges, equal_range beats find for an additional reason: equal_range runs in logarithmic time, while find takes linear time.

For multiset and multimap, the table lists both find and lower_bound as candidates when you are looking for the first object with a particular value. find is the usual choice for this job, and you may have noticed that it is the only one listed in the set and map column. For the multi containers, however, if more than one such value is present, find is not guaranteed to identify the first element in the container with the given value; it identifies just one of those elements. If you really need the first object with a given value, use lower_bound, and then manually perform the second half of the equivalence check described in Clause 19 to confirm that you have found the value you were looking for. (You could avoid the manual equivalence check by using equal_range, but calling equal_range is more expensive than calling lower_bound.)
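
One way to write that lower_bound-plus-check step is sketched below. It assumes a multiset of hypothetical (key, tag) pairs ordered by key only, and it takes the comparison from the container's own key_comp so the check cannot drift out of sync with the ordering. The names Entry, KeyLess, EntrySet, and findFirst are invented for illustration.

```cpp
#include <cassert>
#include <set>
#include <utility>

typedef std::pair<int, char> Entry;     // hypothetical element: key plus a tag

// Orders entries by key only; entries sharing a key are equivalent.
struct KeyLess {
    bool operator()(const Entry& lhs, const Entry& rhs) const {
        return lhs.first < rhs.first;
    }
};

typedef std::multiset<Entry, KeyLess> EntrySet;

// Returns an iterator to the first entry equivalent to target, or end().
EntrySet::const_iterator findFirst(const EntrySet& entries,
                                   const Entry& target) {
    EntrySet::const_iterator i = entries.lower_bound(target);
    // lower_bound guarantees *i does not precede target, so the two are
    // equivalent exactly when target does not precede *i either.
    if (i != entries.end() && !entries.key_comp()(target, *i)) {
        return i;
    }
    return entries.end();
}
```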

Choosing among count, find, binary_search, lower_bound, upper_bound, and equal_range is easy. Choose the algorithm or member function that gives you the behavior and performance you need and that requires the least work when you call it. Follow that advice (or consult the table), and you should never get confused.
