Implement a simple MySQL-weighted Chinese full-text search
I am writing a web, want to do full-text search database. But Google learned that because of Chinese word segmentation, MySQL only support full-text search in English, want to support Chinese, need a variety of plug-ins or implement some more complex mechanisms, and buy the virtual host does not support these complex things. Think carefully, because the function of their own needs are relatively simple, mainly 2 fields of search, and the amount of data is not large, even if the number of fields to add more than a few select will not have too much impact on speed, so through some work around to achieve the demand.
Step 1: Use locate for a simple search
Locate can determine if the substring is in a child disorder.
There are two column, one name, and one description.
So you can use locate>0 to determine if the keyword appears in it.
is actually
SELECT * FROM table WHERE LOCATE (key, ' name ') >0 OR LOCATE (key, ' description ');
This allows us to simply implement a search for a key in two domains
Step 2: Search for multiple keywords
Usually, search has more than one keyword, so we need to execute the next STEP1 query for each key word. (Of course, you can also synthesize one, here lazy to query only 1 keywords at a time)
Then we merge each query out of the array, so we get a final set.
The PHP code is as follows:
- function Selectlocate ($tarcols, $skey) {
- $where = "";
- $connector = "";
- Global $count;
- foreach ($tarcols as $tarcol) {
- $where. = $connector;
- $where. = "LOCATE (' $skey ', $tarcol)! = 0 ";
- if ($connector = = "") {
- $connector = "OR";
- }
- }
- $sql = "SELECT * from pets_table WHERE $where";
- $result = mysql_query ($sql);
- $ret = Array ();
- while ($item = Mysql_fetch_array ($result, Mysql_assoc)) {
- $count + +;
- $ret [] = $item;
- }
- return $ret;
Step 3: Matching weights
The result of the above STEP2 is in fact disordered. Usually, if we search for a field:
1. If the field and the keyword are identical, then generally speaking, the result should be the most relevant
2. If he only appeared once, the correlation was the lowest.
3. If he appears more frequently than in other row, then his correlation is higher than the result of 2.
So, the search is based on this order to consider the weight,
A. If it is exactly equal, the weight is 1000
B. If it appears 1 times, the weight is 10, and n times occurs
C. Weight of n*10
The results of each search are appended with the weight----"then merge the same item----" and add the weights
Finally, by the weight of the order, you can get a sorted search results.
Here are the code for the two 1 keywords that correspond to 1 fields (the code above is the 1 keyword multiple fields) query (without the code that merges two arrays, the related code in STEP4), just traverse each keyword and field to complete the search
- $count = 0;
- function selectequal ($col, $skey) {
- $connector = "";
- Global $count;
- $sql = "SELECT * from Pets_table WHERE LOWER ($col) =lower (' $skey ')";
- $result = mysql_query ($sql);
- $ret = Array ();
- while ($item = Mysql_fetch_array ($result, Mysql_assoc)) {
- $count + +;
- $item ["weight"] = 1000;
- $ret [] = $item;
- }
- return $ret;
- }
- function Selectlocate ($col, $skey) {
- Global $count;
- $sql = "SELECT *, (Length (description)-Length (REPLACE (description, ' $skey ', ')))/length (' $skey ') *10 as weight from pets _table WHERE LOCATE (LOWER (' $skey '), LOWER ($col)) >0 ";
- $result = mysql_query ($sql);
- $ret = Array ();
- while ($item = Mysql_fetch_array ($result, Mysql_assoc)) {
- $count + +;
- $ret [] = $item;
- }
- return $ret;
- }
Step 4: Weight of the field
In my needs, it is clear that the name field is more important than description, so the result of the name field should be skewed when matching, so you can add a weighting factor to the field.
1. If the match is in the Name field, set the coefficient to 10;
2. If it is matched in description, set the coefficient to 1;
A new, more effective weight value can be obtained by multiplying the weight of step 3 for each calculation, multiplied by the coefficient.
Finally, sort by weight, you can get a search result with the most relevant degree
Other Details:
If a keyword already satisfies the equal condition, then the locate condition will still return a result, so when the locate condition is used, the case of equal is filtered out.
-
- $count = 0;
- function selectequal ($col, $val, $skey) {
- $connector = "";
- Global $count;
- $sql = "SELECT * from Pets_table WHERE LOWER ($col) =lower (' $skey ')";
- $result = mysql_query ($sql);
- $ret = Array ();
- while ($item = Mysql_fetch_array ($result, Mysql_assoc)) {
- $count + +;
- $item ["weight"] = 1000* $val;
- $ret [] = $item;
- }
- return $ret;
- }
- function Selectlocate ($col, $val, $skey) {
- Global $count;
- $sql = "SELECT *, (Length (description)-Length (REPLACE (description, ' $skey ', ')))/length (' $skey ') *10* $val as weight FRO M pets_table WHERE LOCATE (LOWER (' $skey '), LOWER ($col)) >0 and LOWER ($col)!=lower (' $skey ') ";
- $result = mysql_query ($sql);
- $ret = Array ();
- while ($item = Mysql_fetch_array ($result, Mysql_assoc)) {
- $count + +;
- $ret [] = $item;
- }
- return $ret;
- }
- function Cleanarr ($arr) {
- Global $count;
- $tmp = Array ();
- $tmpall = Array ();
- foreach ($arr as $item) {
- if (array_key_exists ($item [' uid '], $tmp)) {
- $tmp [$item [' uid ']]+= $item ["Weight"];
- }
- else{
- $tmp [$item [' uid ']] = $item ["Weight"];
- $tmpall [$item [' uid ']] = $item;
- }
- }
- Sort by weight in descending order
- Arsort ($TMP);
- $ret = Array ();
- Rebuildthe return Arary
- $count = 0;
- foreach ($tmp as $k = = $v) {
- $count + +;
- $tmpall [$k] [' Weight ']= $v;
- $ret []= $tmpall [$k];
- }
- return $ret;
- }
- Require_once ("consvr.php");
- $colshash = Array ("name" =>10, "description" =>1);
- $ret = Array ();
- $keywords =explode ("", $keywords);
- $cols = Array_keys ($colshash);
- foreach ($keywords as $keyword) {
- foreach ($colshash as $col = = $val) {
- $ret = Array_merge ($ret, Selectequal ($col, $val, $keyword));
- $ret = Array_merge ($ret, Selectlocate ($col, $val, $keyword));
- }
- }
- $ret = Cleanarr ($ret);
- $ret = Array (' msg ' = + ' Success ', ' count ' = $count, ' children ' + $ret, ' query ' = ' complex:not readable ');
- echo Json_encode ($ret);
- Mysql_close ();
- ?>
http://www.bkjia.com/PHPjc/827622.html www.bkjia.com true http://www.bkjia.com/PHPjc/827622.html techarticle implement a simple MySQL with weight of Chinese full-text search itself in writing a web, want to do full-text retrieval of the database. But Google learned that because of Chinese word segmentation, MySQL only support ...