The following is the quoted content:

fcicq: Let's look at how the word segmentation is done in PHP here.

function &dv_chinesewordsegment($str, $encodingName = 'GBK')
{
    static $objEnc = null;
    if ($objEnc === null) {
        if (!class_exists('dv_encoding')) {
            require_once root_path . 'inc/dv_encoding.class.php';
        }
        $objEnc =& dv_encoding::getencoding($encodingName);
    }
    $strLen = $objEnc->strlength($str);
    $returnVal = array();
    if ($strLen <= 1) {
        return $str;
    }
    $arrStopWords =& dv_getstopwordlist();
    // print_r($arrStopWords);   // debug output
    // Filter all HTML tags
    $str = preg_replace('#<[a-zA-Z]+?.*?>#is', '', $str);
    // Filter all stop words
    $str = str_replace($arrStopWords['Strrepl'], ' ', $str);
    $str = preg_replace($arrStopWords['Pregrepl'], ' ', $str);
    // echo "\$str: {$str}";     // debug output
    $arr = explode(' ', $str);
    // fcicq: OK, here is the key part of the PHP word segmentation *************
    foreach ($arr as $tmpStr) {
        if (preg_match("/^[\\x00-\\x7f]+$/i", $tmpStr) === 1) {
            // fcicq: all English, so it doesn't matter; MySQL can handle that itself
            $returnVal[] = $tmpStr;
        } else {
            // fcicq: mixed Chinese and English ...
            preg_match_all("/([a-zA-Z]+)/i", $tmpStr, $matches);
            if (!empty($matches)) {
                // fcicq: the English part
                foreach ($matches[0] as $matche) {
                    $returnVal[] = $matche;
                }
            }
            // Filter out ASCII characters
            $tmpStr = preg_replace("/([\\x00-\\x7f]+)/i", '', $tmpStr);
            // fcicq: You see, isn't everything that is left Chinese?
            $strLen = $objEnc->strlength($tmpStr) - 1;
            for ($i = 0; $i < $strLen; $i++) {
                $returnVal[] = $objEnc->substring($tmpStr, $i, 2);
                // fcicq: Note the substring here; it is not the substr from the manual.
                // fcicq: Look closely: every phrase is cut into overlapping
                // two-character pieces. For example, the Chinese phrase for
                // "database application" is split into each pair of adjacent
                // characters, and so is the phrase for "full-text search".
                // As word segmentation this is naturally not very good, but the
                // same splitting is applied to the query when searching. So
                // searching for the word "database" is equivalent to searching
                // for its two-character pieces. This is a fairly traditional
                // method for full-text search.
            }
        }
    }
    return $returnVal;
} // end function dv_chinesewordsegment

fcicq: This is the legendary substring. I believe plenty of people write better PHP code than this.

function &substring(&$str, $start, $length = null)
{
    if (!is_numeric($start)) {
        return false;
    }
    $strLen = strlen($str);
    if ($strLen <= 0) {
        return false;
    }
    // The character count is only needed when an offset is negative
    if ($start < 0 || $length < 0) {
        $mbStrLen = $this->strlength($str);
    } else {
        $mbStrLen = $strLen;
    }
    if (!is_numeric($length)) {
        $length = $mbStrLen;
    } elseif ($length < 0) {
        $length = $mbStrLen + $length - 1;
    }
    if ($start < 0) {
        $start = $mbStrLen + $start;
    }
    $returnVal = '';
    $mbStart = 0;   // index of the current character
    $mbCount = 0;   // number of characters copied so far
    for ($i = 0; $i < $strLen; $i++) {
        if ($mbCount >= $length) {
            break;
        }
        $currOrd = ord($str[$i]);
        if ($mbStart >= $start) {
            $returnVal .= $str[$i];
            // A byte above 0x7f is treated as the start of a three-byte character
            if ($currOrd > 0x7f) {
                $returnVal .= $str[$i + 1] . $str[$i + 2];
                $i += 2;
            }
            $mbCount++;
        } elseif ($currOrd > 0x7f) {
            $i += 2;
        }
        $mbStart++;
    }
    return $returnVal;
} // end function substring

Inserting the word list into the full-text search tables. There are two of them: topic_ft and bbs_ft.

$arrTopicIndex =& dv_chinesewordsegment($topic);
if (!empty($arrTopicIndex) && is_array($arrTopicIndex)) {
    $topicindex = $db->escape_string(implode(' ', $arrTopicIndex));
    if ($topicindex !== '') {
        $db->query("UPDATE {$dv}topic_ft SET topicindex='{$topicindex}' WHERE topicid='{$RootID}'");
    } else {
        $db->query("DELETE FROM {$dv}topic_ft WHERE topicid='{$RootID}'");
    }
}