Analysis of PHP keyword replacement classes (avoid repeated replacement, retain and restore original links)

Source: Internet
Author: User
This article analyses a keyword replacement class, used mainly for keyword filtering or for search-and-replace of keywords. Keyword replacement is essentially a str_replace() process; the class discussed here additionally avoids repeated replacement and retains and restores the original links.

The main content of this section:

A keyword replacement class

It can be used for keyword filtering or keyword search replacement.

Implementation process analysis:

Keyword replacement is essentially a str_replace() process. With a plain str_replace, processing an article takes only about 2 seconds.

Problem:

A keyword can be replaced more than once. For example, a should be replaced with <a>a</a>, but the result may end up as <a><a>a</a></a>.
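The problem is easy to reproduce. The sketch below is only an illustration: the helper name naive_link, the sample sentence, and the /php URL are all made up, and the second call stands in for any later pass that sees the text again.

```php
<?php
// Hypothetical helper for illustration: wraps every occurrence of $word in a link.
function naive_link($text, $word, $url)
{
    return str_replace($word, '<a href="' . $url . '">' . $word . '</a>', $text);
}

$once  = naive_link('A PHP class for keywords.', 'PHP', '/php');
// A second pass finds "PHP" inside the link text and wraps it again.
$twice = naive_link($once, 'PHP', '/php');

echo $once, "\n";
echo $twice, "\n";
```

After the second pass the link is nested inside another link, which is exactly the repeated replacement the class has to prevent.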

Therefore, a way to protect already-replaced tags is required. Before the article is processed, each tag is swapped for a placeholder of the form [_tnum_] (a type prefix plus a number, for example [_tag0_]), and the original tags are restored once the article has been processed.
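A minimal sketch of this protect-and-restore idea, assuming numbered [_tagN_] placeholders. The helper names protect_tags and restore_tags are hypothetical and are not part of the class; the replacement of PHP by Perl is just a stand-in for the keyword processing.

```php
<?php
// Swap every HTML tag for a numbered placeholder and remember the originals.
function protect_tags($text)
{
    $saved = array();
    $masked = preg_replace_callback('#<[^>]+>#s', function ($m) use (&$saved) {
        $saved[] = $m[0];
        return '[_tag' . (count($saved) - 1) . '_]';
    }, $text);
    return array($masked, $saved);
}

// Put the remembered tags back in place of their placeholders.
function restore_tags($text, array $saved)
{
    foreach ($saved as $n => $tag) {
        $text = str_replace('[_tag' . $n . '_]', $tag, $text);
    }
    return $text;
}

$html = '<p class="PHP">PHP</p>';
list($masked, $saved) = protect_tags($html);
// The attribute value is hidden inside [_tag0_], so only the visible text changes.
$edited = str_replace('PHP', 'Perl', $masked);
$result = restore_tags($edited, $saved);
echo $result, "\n";
```

Because the tag is masked while the keyword pass runs, the class attribute survives untouched and only the text between the tags is rewritten.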

The second problem is that a keyword, or the article itself, may contain part of a [_tnum_] placeholder, and such matches have to be excluded. str_replace cannot do this, so preg_replace must be used, with a regular expression that rules those matches out.
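One way to express that exclusion is a negative lookahead after the keyword. The sketch below assumes [_tagN_] placeholders only; the function name replace_outside_placeholders and the sample strings are hypothetical, and the lookahead list would need extending for other placeholder types.

```php
<?php
// Replace $key only where it is not followed by the tail of a [_tagN_] placeholder.
function replace_outside_placeholders($text, $key, $replacement)
{
    $pattern = '#' . preg_quote($key, '#') . '(?!ag\d+_\]|g\d+_\]|\d+_\]|_\])#u';
    return preg_replace($pattern, $replacement, $text);
}

$masked = '[_tag0_]a tag cloud[_tag1_]';

// str_replace() also rewrites the placeholders, so they can no longer be restored.
echo str_replace('tag', 'label', $masked), "\n";

// The guarded pattern leaves the placeholders alone.
echo replace_outside_placeholders($masked, 'tag', 'label'), "\n";
```

The guard is deliberately conservative: a keyword that happens to be followed by digits and "_]" in ordinary text would also be skipped.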

The third problem is what to do when there are two keywords such as a and ab. The longer one should be matched first and the shorter one afterwards, so the keywords have to be sorted by length before matching.
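Sorting longest-first can be sketched with usort and a closure. The keyword list here is illustrative, and mb_strlen assumes the mbstring extension is available.

```php
<?php
$keys = array('a', 'abc', 'ab');

// Longest keyword first, so "ab" gets its chance before "a".
usort($keys, function ($x, $y) {
    return mb_strlen($y) - mb_strlen($x);
});

print_r($keys);
```

Using a closure rather than a named comparison function also avoids a "cannot redeclare" error if the sorting code runs more than once in the same request.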

The last problem is speed. After str_replace is changed to preg_replace, the author reports that it takes 5 seconds to match the same string 10 times. Among the string functions strpos is faster, so strpos is used first to find which keywords occur in the text at all: by the figures quoted, less than 1 second for 10 million lookups, and a little over 8 seconds for 1 million.
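The pre-filtering step can be sketched as follows. The function name keywords_in_text and the sample data are hypothetical; the point is only that the expensive regular-expression pass runs for keywords that strpos has already found.

```php
<?php
// Keep only the keywords that actually occur in $text.
function keywords_in_text($text, array $keywords)
{
    $hits = array();
    foreach ($keywords as $word => $url) {
        // strpos() returns 0 for a match at the very start, so compare with !==.
        if (strpos($text, (string) $word) !== false) {
            $hits[$word] = $url;
        }
    }
    return $hits;
}

$text = 'PHP keyword replacement';
$keywords = array('PHP' => '/php', 'Python' => '/python', 'keyword' => '/kw');
$hits = keywords_in_text($text, $keywords);

print_r(array_keys($hits));
```

Note the strict comparison: with a loose != false, a keyword at position 0 (here PHP) would be treated as missing.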

Code for the keyword matching and replacement class:

<?php
/**
 * Keyword matching class
 * @author ylx
 * @package mipang
 *
 * Usage example:
 * $str = "The green shell egg, zafdensa, will pop up next year to open the room Lucas local Army";
 * $key = new KeyReplace($str, array("xxxx" => "sadf", "next year" => 'http://baidu.com', "Next year" => 'google.com'));
 * echo $key->getResultText();
 * echo $key->getRuntime();
 */
class KeyReplace
{
    private $keys = array();
    private $text = "";
    private $runtime = 0;
    private $url = true;
    private $stopkeys = array();
    private $all = false;

    /**
     * @access public
     * @param string  $text     the article to be processed
     * @param array   $keys     dictionary array(key => url, ...); a url may itself be an array, in which case one entry is picked at random
     * @param boolean $url      true replaces keywords with links; otherwise the keyword is only replaced by the given value
     * @param array   $stopkeys stop words array(key, ...); these words are not processed
     * @param boolean $all      true replaces every occurrence found; otherwise only the first
     */
    public function __construct($text = '', $keys = array(), $url = true, $stopkeys = array(), $all = false)
    {
        $this->keys = $keys;
        $this->text = $text;
        $this->url = $url;
        $this->stopkeys = $stopkeys;
        $this->all = $all;
    }

    /**
     * Get the processed article
     * @access public
     * @return string text
     */
    public function getResultText()
    {
        $start = microtime(true);
        $keys = $this->hits_keys();
        $keys_tmp = array_keys($keys);
        // Longest keywords first. A closure is used because a named function
        // declared inside a method cannot be declared a second time.
        usort($keys_tmp, function ($a, $b) {
            if (mb_strlen($a) == mb_strlen($b)) {
                return 0;
            }
            return (mb_strlen($a) < mb_strlen($b)) ? 1 : -1;
        });
        foreach ($keys_tmp as $key) {
            if (is_array($keys[$key])) {
                $url = $keys[$key][rand(0, count($keys[$key]) - 1)];
            } else {
                $url = $keys[$key];
            }
            $this->text = $this->r_s($this->text, $key, $url);
        }
        $this->runtime = microtime(true) - $start;
        return $this->text;
    }

    /**
     * Get the processing time
     * @access public
     * @return float
     */
    public function getRuntime()
    {
        return $this->runtime;
    }

    /**
     * Set the keywords
     * @access public
     * @param array $keys array(key => url, ...)
     */
    public function setKeys($keys)
    {
        $this->keys = $keys;
    }

    /**
     * Set the stop words
     * @access public
     * @param array $keys array(key, ...)
     */
    public function setStopKeys($keys)
    {
        $this->stopkeys = $keys;
    }

    /**
     * Set the article
     * @access public
     * @param string $text
     */
    public function setText($text)
    {
        $this->text = $text;
    }

    /**
     * Find the keywords that occur in the text
     * @access public
     * @return array $keys the matched words array(key => url, ...)
     */
    public function hits_keys()
    {
        $ar = $this->keys ? $this->keys : array();
        $result = array();
        $str = $this->text;
        foreach ($ar as $k => $url) {
            $k = trim((string) $k);
            if (!$k) {
                continue;
            }
            // strpos() returns 0 for a match at position 0, hence !== false.
            if (strpos($str, $k) !== false && !in_array($k, $this->stopkeys)) {
                $result[$k] = $url;
            }
        }
        return $result;
    }

    /**
     * Find the stop words that occur in the text
     * @access public
     * @return array $keys the matched words array(key, ...)
     */
    public function hits_stop_keys()
    {
        $ar = $this->stopkeys ? $this->stopkeys : array();
        $result = array();
        $str = $this->text;
        foreach ($ar as $k) {
            $k = trim((string) $k);
            if (!$k) {
                continue;
            }
            if (strpos($str, $k) !== false && in_array($k, $this->stopkeys)) {
                $result[] = $k;
            }
        }
        return $result;
    }

    /**
     * Perform the replacement for one keyword
     * @access private
     * @param string $text the article to be replaced in
     * @param string $key  keyword
     * @param string $url  link
     * @return string $text the processed article
     */
    private function r_s($text, $key, $url)
    {
        $tmp = $text;
        $stop_keys = $this->hits_stop_keys();
        $stopkeys = $tags = $a = array();

        // Protect existing links as [_aN_].
        if (preg_match_all('#<a[^>]+>[^<]*</a[^>]*>#su', $tmp, $m)) {
            $a = $m[0];
            foreach ($m[0] as $k => $z) {
                $tmp = preg_replace('#' . preg_quote($z, '#') . '#su', '[_a' . $k . '_]', $tmp, 1);
            }
        }
        // Protect the remaining HTML tags as [_tagN_].
        if (preg_match_all('#<[^>]+>#s', $tmp, $m)) {
            $tags = $m[0];
            foreach ($m[0] as $k => $z) {
                $tmp = preg_replace('#' . preg_quote($z, '#') . '#s', '[_tag' . $k . '_]', $tmp, 1);
            }
        }
        // Protect stop words as [_sN_].
        if (!empty($stop_keys)) {
            $quoted = array_map(function ($s) {
                return preg_quote($s, '#');
            }, $stop_keys);
            if (preg_match_all('#' . implode('|', $quoted) . '#s', $tmp, $m)) {
                $stopkeys = $m[0];
                foreach ($m[0] as $k => $z) {
                    $tmp = preg_replace('#' . preg_quote($z, '#') . '#s', '[_s' . $k . '_]', $tmp, 1);
                }
            }
        }

        // Replace the keyword, skipping matches that are part of a placeholder.
        $pattern = '#(?!\[_)' . preg_quote($key, '#') . '(?!ag\d+_\]|g\d+_\]|s\d+_\]|\d+_\]|_\])#us';
        $replacement = $this->url ? '<a href="' . $url . '">' . $key . '</a>' : $url;
        $tmp = preg_replace($pattern, $replacement, $tmp, $this->all ? -1 : 1);

        // Restore in reverse order of protection, so that a placeholder caught
        // inside a later placeholder is restored as well.
        foreach ($stopkeys as $n => $at) {
            $tmp = str_replace('[_s' . $n . '_]', $at, $tmp);
        }
        foreach ($tags as $n => $at) {
            $tmp = str_replace('[_tag' . $n . '_]', $at, $tmp);
        }
        foreach ($a as $n => $at) {
            $tmp = str_replace('[_a' . $n . '_]', $at, $tmp);
        }
        return $tmp;
    }
}
 

The above is the PHP keyword replacement class introduced in this article (avoid repeated replacement, retain and restore the original links).
