Using the STL to Implement DFS/BFS Algorithms -- Checking for Duplicate States

Source: Internet
Author: User
A few days ago a reader commented that the two-character short forms I had been using for "depth-first search" and "breadth-first search" are second-rate wording, and that the correct terms are "depth first" and "breadth first". The tone and wording of the comment were a little hard to accept, but the point itself is somewhat reasonable. To be honest, I no longer remember where I first picked up the short forms, but I am certain I did not coin them; I simply found them neat and concise, so I used them. I later checked on Google and found that the four-character forms are indeed more common than the two-character ones, so it seems most people prefer the longer terms. I wondered whether I should "correct" my "mistake" and switch to the more common wording. Then I was happy to see that some readers use the short forms as well, and it occurred to me that changing would be a little unfair to the readers who, like me, prefer them. After thinking it over, I simply switched to English, which is how this new title came about.

As mentioned earlier, our DFS/BFS algorithms so far can only search state-space trees that contain no repeated states. In problems such as chess or the box-pushing game, the state may return to an earlier state after several moves. When a repeated state appears, we must be able to recognize it and keep it out of the search tree. The simplest and most direct method is to check the states one by one when the nextStep() member function of the problem state class returns the set of possible next states, and to keep duplicates out of the search tree (that is, not push them onto the stack or into the queue). As for how to check for duplicates, we again take the familiar function-template approach: users supply their own checking method, and a generic template parameter in our DFS/BFS algorithm specifies it.
With that change, the algorithm looks like this (taking BFS as an example):

    #include <queue>
    #include <vector>
    using namespace std;  // the code in this article uses queue, vector, etc. unqualified

    template <class T1, class T2, class T3>
    int BreadthFirstSearch(const T1& initState, const T2& afterFindSolution, T3& checkDup)
    // The first two parameters are the same as in the old version.
    // checkDup: function-like object called for each possible next state;
    //           it accepts a const T1& and returns a bool: true means the
    //           state is a duplicate, false means it is not.
    // return: number of answers found
    {
        int n = 0;
        queue<T1> states;
        states.push(initState);
        vector<T1> nextStates;
        bool stop = false;
        while (!stop && !states.empty())
        {
            T1 s = states.front();
            states.pop();
            nextStates.clear();
            s.nextStep(nextStates);
            for (typename vector<T1>::iterator i = nextStates.begin();
                 i != nextStates.end(); ++i)
            {
                if (i->isTarget())
                {
                    // found a target state
                    ++n;
                    if (afterFindSolution(*i))  // process the result and decide whether to stop
                    {
                        stop = true;
                        break;
                    }
                }
                else
                {
                    // not a target state: decide whether to put it in the search queue
                    if (!checkDup(*i))  // only non-duplicate states enter the queue
                    {
                        states.push(*i);
                    }
                }
            }
        }
        return n;
    }

Compared with the old version, it only adds a template parameter T3, a call parameter checkDup, and one line of code, if (!checkDup(*i)). The additions are self-explanatory, so I will not dwell on them. BFS users now need to provide one more function object (or function-like) type. This function object is responsible for storing each state object passed to it (of type T1) and checking whether it is the same as one passed in earlier. To reduce the burden on BFS users, I think we should provide a few simple checking methods to choose from; users only need to write their own when none of the ones we provide fits.
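The text shows only the BFS variant. As a rough sketch of the DFS counterpart, assuming it differs from BFS only in keeping pending states on a std::stack instead of a std::queue (an assumption, since the DFS code is not reproduced in this article), the same checkDup parameter slots in unchanged. RingState, SeenPositions, and StopAtFirst are hypothetical helpers invented here only to exercise the function.

```cpp
#include <cassert>
#include <stack>
#include <vector>

// Sketch of a DFS counterpart: same shape as the BFS version, except that
// pending states are kept on a stack instead of in a queue.
template <class T1, class T2, class T3>
int DepthFirstSearch(const T1& initState, const T2& afterFindSolution, T3& checkDup)
{
    int n = 0;
    std::stack<T1> states;
    states.push(initState);
    std::vector<T1> nextStates;
    bool stop = false;
    while (!stop && !states.empty())
    {
        T1 s = states.top();
        states.pop();
        nextStates.clear();
        s.nextStep(nextStates);
        for (typename std::vector<T1>::iterator i = nextStates.begin();
             i != nextStates.end(); ++i)
        {
            if (i->isTarget())
            {
                ++n;
                if (afterFindSolution(*i)) { stop = true; break; }
            }
            else if (!checkDup(*i))  // only non-duplicate states are pushed
            {
                states.push(*i);
            }
        }
    }
    return n;
}

// Hypothetical toy state: a position on a ring of `size` cells. Stepping
// forward eventually revisits earlier positions, so the search terminates
// for an unreachable target only if duplicates are filtered out.
struct RingState
{
    int pos, size, target;
    RingState(int p, int sz, int t) : pos(p), size(sz), target(t) { assert(sz > 0); }
    void nextStep(std::vector<RingState>& out) const
    {
        out.push_back(RingState((pos + 1) % size, size, target));
    }
    bool isTarget() const { return pos == target; }
};

// Hypothetical duplicate checker: remembers which ring positions were seen.
struct SeenPositions
{
    std::vector<bool> seen;
    explicit SeenPositions(int size) : seen(size, false) {}
    bool operator()(const RingState& s)
    {
        if (seen[s.pos]) return true;
        seen[s.pos] = true;
        return false;
    }
};

// Hypothetical callback: stop at the first solution found.
struct StopAtFirst
{
    bool operator()(const RingState&) const { return true; }
};
```

Calling DepthFirstSearch(RingState(0, 5, 3), StopAtFirst(), checker) with a SeenPositions checker finds the target once. With an unreachable target, the search still ends because every revisited position is rejected.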
The first and simplest checking method is not to check at all, which behaves the same as the old version of BFS. Not checking is trivial: given the contract of this BFS algorithm, we just return false.

    // Function-like class that performs no duplicate check.
    // std::unary_function is declared in <functional>; it was deprecated in
    // C++11 and removed in C++17.
    template <class T>
    struct NoCheckDup : std::unary_function<T, bool>
    {
        bool operator()(const T&) const
        {
            return false;
        }
    };

Here we can go one step further and make the new BFS compatible with the old one by offering the same interface as before, which is more convenient for users. This is also simple: define an overload of BreadthFirstSearch that accepts two parameters (as in the old version) and calls the new version with a NoCheckDup<T1> object as the third argument.

    template <class T1, class T2>
    int BreadthFirstSearch(const T1& initState, const T2& afterFindSolution)
    // Two-parameter version
    // initState: initial state; class T1 must provide the member functions
    //            nextStep() and isTarget(). nextStep() returns all possible
    //            next states in a vector<T1>; isTarget() determines whether
    //            the current state is a required answer.
    // afterFindSolution: function-like object called after a valid answer is
    //            found; it accepts a const T1& and returns a bool: true means
    //            stop searching, false means continue.
    // return: number of answers found
    {
        NoCheckDup<T1> noCheckDup;
        return BreadthFirstSearch(initState, afterFindSolution, noCheckDup);
    }

With this overload, the Sudoku program and the N-queens program can be recompiled without modification. Next, we implement several simple duplicate-check methods. Naturally, they all use STL containers and algorithms. There are many ways to check for duplicates with the STL.
Three commonly used approaches come to mind: the find algorithm with linear complexity, the set container with logarithmic complexity, and the faster but also more complicated hash container. The hash container is not part of the standard STL, but most STL implementations provide one, so we may as well try it. Let us look at each in turn. The first is linear search. We store every state passed in into a vector and use the find algorithm to search it linearly. The code is as follows:

    // Function-like class that uses a vector to check for repeated state
    // nodes; linear complexity. Requires the state class to provide operator==.
    // Needs <algorithm>, <functional> and <vector>.
    template <class T>
    class SequenceCheckDup : std::unary_function<T, bool>
    {
        typedef vector<T> Cont;
        Cont states_;
    public:
        bool operator()(const T& s)
        {
            typename Cont::iterator i = find(states_.begin(), states_.end(), s);
            if (i != states_.end())  // the state already exists: duplicate
            {
                return true;
            }
            states_.push_back(s);  // not a duplicate: record it
            return false;
        }
    };

Why a vector? Our requirement is simple: state objects can be put into the container one after another, and the find algorithm can run over it. A deque or a list would also work, with time complexity of the same order (because the order of the inserted elements does not matter, we can simply always insert at the back). Given that, I chose the vector, which uses the least space. To make the find algorithm work, the state class T must meet one requirement: it must provide operator==. This is the responsibility of the provider of the state class (that is, the user of BFS).
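As an illustration of that requirement, here is a minimal sketch of a state class that provides operator==, together with the same find-then-push_back logic applied to it. BoardState, isDuplicate, and makeBoard are hypothetical names made up for this example; they do not come from the article.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical problem state: the cells of a small board.
struct BoardState
{
    std::vector<int> cells;
    explicit BoardState(const std::vector<int>& c) : cells(c)
    {
        assert(!c.empty());  // this sketch assumes a board has at least one cell
    }
};

// std::find compares elements with operator==, so the state class must supply it.
inline bool operator==(const BoardState& a, const BoardState& b)
{
    return a.cells == b.cells;
}

// Stand-in for the linear check: returns true if `s` was seen before,
// otherwise records it in `seen` and returns false.
inline bool isDuplicate(std::vector<BoardState>& seen, const BoardState& s)
{
    if (std::find(seen.begin(), seen.end(), s) != seen.end())
        return true;
    seen.push_back(s);
    return false;
}

// Helper that builds a three-cell board without C++11 initializer lists.
inline BoardState makeBoard(int a, int b, int c)
{
    std::vector<int> cells;
    cells.push_back(a);
    cells.push_back(b);
    cells.push_back(c);
    return BoardState(cells);
}
```

Two boards with the same cells compare equal, so the second occurrence is reported as a duplicate and is not stored again.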
Once a BFS user has a problem state class MyState that supports operator==, the new BFS can be used like this:

    MyState initState(n);
    SequenceCheckDup<MyState> checkDup;
    int total = BreadthFirstSearch(initState, continueSearch, checkDup);

Note that, unlike NoCheckDup, SequenceCheckDup is stateful: calling the same SequenceCheckDup object twice with the same argument gives different results. A call may change the state of the function object, so its operator() cannot be const. Be especially careful when using stateful function objects like this. That is a digression, so I will not say more.

As you can imagine, the speed of linear search is unsatisfactory. Once the search tree grows beyond a certain size, linear search slows down noticeably. BFS calls checkDup every time a new state node is generated, so if the search tree has N nodes, the time complexity of BFS is O(N * N). The set container in the STL is designed to speed up lookups. If we replace the vector with a set, we can expect the time complexity of BFS to drop to O(N * log N). This gives us the second duplicate-check method, OrderCheckDup, shown below:

    // Function-like class that uses a set to check for repeated state nodes.
    // Requires the state class to provide operator<.
    // Needs <functional> and <set>.
    template <class T>
    class OrderCheckDup : std::unary_function<T, bool>
    {
        typedef set<T> Cont;
        Cont states_;
    public:
        bool operator()(const T& s)
        {
            typename Cont::iterator i = states_.find(s);
            if (i != states_.end())  // the state already exists: duplicate
            {
                return true;
            }
            states_.insert(i, s);  // not a duplicate: record it
            return false;
        }
    };

Compared with SequenceCheckDup, OrderCheckDup changes the container that stores the existing state objects from a vector to a set, replaces the general find algorithm with set::find(), and replaces vector::push_back() with set::insert(). That's all.
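Because a std::set keeps its elements ordered, the state class has to be comparable with operator<. The sketch below shows what that could look like for a hypothetical state class (PuzzleState, not from the article). The ordering only needs to be a strict weak ordering so that the set can locate elements; it does not have to mean anything for the problem itself. SetCheckDup is also an illustrative name, and it relies on the return value of std::set::insert rather than a separate find, which is a small variation on the code above.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical problem state: a player position plus box positions.
struct PuzzleState
{
    int player;
    std::vector<int> boxes;
    PuzzleState(int p, const std::vector<int>& b) : player(p), boxes(b)
    {
        assert(p >= 0);  // this sketch assumes positions are non-negative cell indices
    }
};

// Strict weak ordering: compare the player first, then the boxes
// (std::vector's operator< is already lexicographic).
inline bool operator<(const PuzzleState& a, const PuzzleState& b)
{
    if (a.player != b.player)
        return a.player < b.player;
    return a.boxes < b.boxes;
}

// Set-based duplicate check: std::set::insert reports whether the element
// was newly inserted, so one call both looks up and records the state.
class SetCheckDup
{
    std::set<PuzzleState> states_;
public:
    bool operator()(const PuzzleState& s)
    {
        return !states_.insert(s).second;  // true means "already seen"
    }
    std::size_t size() const { return states_.size(); }
};
```

Feeding the same state twice returns false the first time and true the second, and the set holds each distinct state once.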
Unlike SequenceCheckDup, OrderCheckDup stores state objects in a set, so it requires the state class T to provide operator< rather than operator==. To use OrderCheckDup, you therefore need to provide operator< for your problem state class MyState. Using it with BFS is the same as before:

    MyState initState(n);
    OrderCheckDup<MyState> checkDup;
    int total = BreadthFirstSearch(initState, continueSearch, checkDup);

In general, the O(N * log N) time complexity that OrderCheckDup gives is quite good and suits most problems. Some users may still need faster lookups, though, and in that case a hash container is worth considering. This is our third duplicate-check method, HashCheckDup, shown below:

    // Function-like class that uses a hash_set to check for repeated state nodes.
    // Requires the state class to provide operator== and a hash function.
    // hash_set and hash come from a non-standard extension; the header name
    // and namespace depend on the STL implementation.
    template <class T, class HashFcn = hash<T> >
    class HashCheckDup : std::unary_function<T, bool>
    {
        typedef hash_set<T, HashFcn> Cont;
        Cont states_;
    public:
        typedef typename Cont::hasher hasher;
        HashCheckDup(const hasher& hf) : states_(100, hf) {}
        bool operator()(const T& s)
        {
            if (states_.find(s) != states_.end())  // the state already exists: duplicate
            {
                return true;
            }
            states_.insert(s);  // not a duplicate: record it
            return false;
        }
    };

It uses hash_set, which has not officially entered the C++ standard, although many STL implementations provide it; see the documentation of your implementation for details. The hash_set container requires the stored type T to provide operator== and a hash function. So, besides replacing set with hash_set, HashCheckDup also has an extra template parameter and a constructor parameter, both for the hash function. The quality of the hash function directly determines how fast the hash_set container can search, and there is no universal hash algorithm.
Therefore, the provider of MyState must also supply the hash algorithm for MyState, in the following form:

    struct HashMyState
    {
        size_t operator()(const MyState& s) const
        {
            ...
        }
    };

Once operator== and the hash algorithm are ready for MyState, BFS can be called like this:

    MyState initState(n);
    // The extra parentheses in the following variable definition are required
    // to remove an ambiguity; without them the compiler treats the line as a
    // function declaration.
    HashCheckDup<MyState, HashMyState> checkDup((HashMyState()));
    int total = BreadthFirstSearch(initState, continueSearch, checkDup);

This code contains an interesting detail: in the definition of checkDup, the argument passed to the constructor is wrapped in an extra pair of parentheses. At first glance the parentheses look redundant, but they are not. Without them the code does not compile. The compiler complains that instantiating the BreadthFirstSearch() template function on the next line fails because it cannot convert MyState to HashMyState (*)(). At first I was confused too, and only vaguely remembered having seen this error somewhere. After some searching, I found the answer in Item 29 of Exceptional C++ Style. Without the extra parentheses, the compiler treats this line:

    HashCheckDup<MyState, HashMyState> checkDup(HashMyState());

as a function declaration rather than a variable definition! The book gives the simplest fix: remove the ambiguity by adding a pair of parentheses. That is why the seemingly redundant parentheses are there.

We now have four ways to look for duplicate states: no search, linear search, binary search (the set container in effect provides binary search over an ordered container), and hash search. You can pick whichever fits your needs. The methods place different requirements on the problem state class, summarized as follows:
Search method    Requirements on the problem state class
No search        No special requirements; applicable only when no duplicate states can occur
Linear search    Must provide operator==; slow lookups
Binary search    Must provide operator<; fast lookups
Hash search      Must provide operator== and a hash algorithm; fastest lookups if the hash algorithm is suitable
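The hash_set container never entered the standard under that name, but C++11 added std::unordered_set, which fills the same role. The following sketch shows how the hash-based check might look on top of it, assuming a C++11 compiler. The members of MyState and the mixing formula in HashMyState are hypothetical placeholders, since the article leaves both unspecified, and UnorderedCheckDup and countDuplicates are names invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <unordered_set>

// Hypothetical problem state (the article does not show MyState's members).
struct MyState
{
    int x, y;
    MyState(int x_, int y_) : x(x_), y(y_)
    {
        assert(x_ >= 0 && y_ >= 0);  // this sketch assumes non-negative coordinates
    }
    bool operator==(const MyState& other) const
    {
        return x == other.x && y == other.y;
    }
};

// Hypothetical hash functor; the combining formula is only an illustration.
struct HashMyState
{
    std::size_t operator()(const MyState& s) const
    {
        std::size_t h = std::hash<int>()(s.x);
        return h * 31u + std::hash<int>()(s.y);
    }
};

// Same shape as the hash_set-based checker, with std::unordered_set instead.
template <class T, class HashFcn = std::hash<T> >
class UnorderedCheckDup
{
    typedef std::unordered_set<T, HashFcn> Cont;
    Cont states_;
public:
    explicit UnorderedCheckDup(const HashFcn& hf) : states_(100, hf) {}
    bool operator()(const T& s)
    {
        return !states_.insert(s).second;  // true means "already seen"
    }
};

// Usage: feeds three states through the checker and counts the duplicates.
inline int countDuplicates()
{
    // Brace initialization (C++11), like an extra pair of parentheses,
    // keeps this line from being parsed as a function declaration.
    UnorderedCheckDup<MyState, HashMyState> checkDup{HashMyState()};
    int duplicates = 0;
    if (checkDup(MyState(1, 2))) ++duplicates;
    if (checkDup(MyState(2, 1))) ++duplicates;
    if (checkDup(MyState(1, 2))) ++duplicates;  // repeats the first state
    return duplicates;
}
```

Only the third call repeats an earlier state, so countDuplicates() reports one duplicate. As with hash_set, lookup speed depends on how well the hash functor spreads the states.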
If none of these four methods suits you, you can write your own duplicate-check method and pass it to BFS. Having covered all this, I think we should test the new BFS on a real problem. The example I have chosen is one I worked on a few years ago: the box-pushing game. It is much more complicated than the Sudoku and N-queens problems, so the next article may take more time.