Implement a simple MySQL-weighted Chinese full-text Search _php Tutorial

Source: Internet
Author: User
Implement a simple MySQL-weighted Chinese full-text search
I am writing a web, want to do full-text search database. But Google learned that because of Chinese word segmentation, MySQL only support full-text search in English, want to support Chinese, need a variety of plug-ins or implement some more complex mechanisms, and buy the virtual host does not support these complex things. Think carefully, because the function of their own needs are relatively simple, mainly 2 fields of search, and the amount of data is not large, even if the number of fields to add more than a few select will not have too much impact on speed, so through some work around to achieve the demand.

Step 1: Use locate for a simple search
Locate can determine if the substring is in a child disorder.
There are two column, one name, and one description.
So you can use locate>0 to determine if the keyword appears in it.
is actually
SELECT * FROM table WHERE LOCATE (key, ' name ') >0 OR LOCATE (key, ' description ');
This allows us to simply implement a search for a key in two domains

Step 2: Search for multiple keywords
Usually, search has more than one keyword, so we need to execute the next STEP1 query for each key word. (Of course, you can also synthesize one, here lazy to query only 1 keywords at a time)
Then we merge each query out of the array, so we get a final set.

The PHP code is as follows:

 
  
  
  1. function Selectlocate ($tarcols, $skey) {
  2. $where = "";
  3. $connector = "";
  4. Global $count;
  5. foreach ($tarcols as $tarcol) {
  6. $where. = $connector;
  7. $where. = "LOCATE (' $skey ', $tarcol)! = 0 ";
  8. if ($connector = = "") {
  9. $connector = "OR";
  10. }
  11. }

  12. $sql = "SELECT * from pets_table WHERE $where";
  13. $result = mysql_query ($sql);
  14. $ret = Array ();
  15. while ($item = Mysql_fetch_array ($result, Mysql_assoc)) {
  16. $count + +;
  17. $ret [] = $item;
  18. }
  19. return $ret;
Step 3: Matching weights
The result of the above STEP2 is in fact disordered. Usually, if we search for a field:
1. If the field and the keyword are identical, then generally speaking, the result should be the most relevant
2. If he only appeared once, the correlation was the lowest.
3. If he appears more frequently than in other row, then his correlation is higher than the result of 2.
So, the search is based on this order to consider the weight,
A. If it is exactly equal, the weight is 1000
B. If it appears 1 times, the weight is 10, and n times occurs
C. Weight of n*10
The results of each search are appended with the weight----"then merge the same item----" and add the weights
Finally, by the weight of the order, you can get a sorted search results.

Here are the code for the two 1 keywords that correspond to 1 fields (the code above is the 1 keyword multiple fields) query (without the code that merges two arrays, the related code in STEP4), just traverse each keyword and field to complete the search

 
 
  1. $count = 0;
  2. function selectequal ($col, $skey) {
  3. $connector = "";
  4. Global $count;
  5. $sql = "SELECT * from Pets_table WHERE LOWER ($col) =lower (' $skey ')";
  6. $result = mysql_query ($sql);
  7. $ret = Array ();
  8. while ($item = Mysql_fetch_array ($result, Mysql_assoc)) {
  9. $count + +;
  10. $item ["weight"] = 1000;
  11. $ret [] = $item;
  12. }
  13. return $ret;
  14. }
  15. function Selectlocate ($col, $skey) {
  16. Global $count;
  17. $sql = "SELECT *, (Length (description)-Length (REPLACE (description, ' $skey ', ')))/length (' $skey ') *10 as weight from pets _table WHERE LOCATE (LOWER (' $skey '), LOWER ($col)) >0 ";
  18. $result = mysql_query ($sql);
  19. $ret = Array ();
  20. while ($item = Mysql_fetch_array ($result, Mysql_assoc)) {
  21. $count + +;
  22. $ret [] = $item;
  23. }
  24. return $ret;
  25. }



Step 4: Weight of the field
In my needs, it is clear that the name field is more important than description, so the result of the name field should be skewed when matching, so you can add a weighting factor to the field.

1. If the match is in the Name field, set the coefficient to 10;
2. If it is matched in description, set the coefficient to 1;

A new, more effective weight value can be obtained by multiplying the weight of step 3 for each calculation, multiplied by the coefficient.
Finally, sort by weight, you can get a search result with the most relevant degree

Other Details:
If a keyword already satisfies the equal condition, then the locate condition will still return a result, so when the locate condition is used, the case of equal is filtered out.

 
 
  1. $count = 0;
  2. function selectequal ($col, $val, $skey) {
  3. $connector = "";
  4. Global $count;
  5. $sql = "SELECT * from Pets_table WHERE LOWER ($col) =lower (' $skey ')";
  6. $result = mysql_query ($sql);
  7. $ret = Array ();
  8. while ($item = Mysql_fetch_array ($result, Mysql_assoc)) {
  9. $count + +;
  10. $item ["weight"] = 1000* $val;
  11. $ret [] = $item;
  12. }
  13. return $ret;
  14. }
  15. function Selectlocate ($col, $val, $skey) {
  16. Global $count;
  17. $sql = "SELECT *, (Length (description)-Length (REPLACE (description, ' $skey ', ')))/length (' $skey ') *10* $val as weight FRO M pets_table WHERE LOCATE (LOWER (' $skey '), LOWER ($col)) >0 and LOWER ($col)!=lower (' $skey ') ";
  18. $result = mysql_query ($sql);
  19. $ret = Array ();
  20. while ($item = Mysql_fetch_array ($result, Mysql_assoc)) {
  21. $count + +;
  22. $ret [] = $item;
  23. }
  24. return $ret;
  25. }
  26. function Cleanarr ($arr) {
  27. Global $count;
  28. $tmp = Array ();
  29. $tmpall = Array ();
  30. foreach ($arr as $item) {
  31. if (array_key_exists ($item [' uid '], $tmp)) {
  32. $tmp [$item [' uid ']]+= $item ["Weight"];
  33. }
  34. else{
  35. $tmp [$item [' uid ']] = $item ["Weight"];
  36. $tmpall [$item [' uid ']] = $item;
  37. }
  38. }

  39. Sort by weight in descending order
  40. Arsort ($TMP);

  41. $ret = Array ();

  42. Rebuildthe return Arary
  43. $count = 0;
  44. foreach ($tmp as $k = = $v) {
  45. $count + +;
  46. $tmpall [$k] [' Weight ']= $v;
  47. $ret []= $tmpall [$k];
  48. }
  49. return $ret;
  50. }

  51. Require_once ("consvr.php");


  52. $colshash = Array ("name" =>10, "description" =>1);
  53. $ret = Array ();
  54. $keywords =explode ("", $keywords);
  55. $cols = Array_keys ($colshash);
  56. foreach ($keywords as $keyword) {
  57. foreach ($colshash as $col = = $val) {
  58. $ret = Array_merge ($ret, Selectequal ($col, $val, $keyword));
  59. $ret = Array_merge ($ret, Selectlocate ($col, $val, $keyword));
  60. }

  61. }
  62. $ret = Cleanarr ($ret);
  63. $ret = Array (' msg ' = + ' Success ', ' count ' = $count, ' children ' + $ret, ' query ' = ' complex:not readable ');
  64. echo Json_encode ($ret);
  65. Mysql_close ();

  66. ?>











http://www.bkjia.com/PHPjc/827622.html www.bkjia.com true http://www.bkjia.com/PHPjc/827622.html techarticle implement a simple MySQL with weight of Chinese full-text search itself in writing a web, want to do full-text retrieval of the database. But Google learned that because of Chinese word segmentation, MySQL only support ...

  • Related Article

    Contact Us

    The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

    If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

    A Free Trial That Lets You Build Big!

    Start building with 50+ products and up to 12 months usage for Elastic Compute Service

    • Sales Support

      1 on 1 presale consultation

    • After-Sales Support

      24/7 Technical Support 6 Free Tickets per Quarter Faster Response

    • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.