Sphinx Full-Text Search tutorial for PHP use

Source: Internet
Author: User
Tags mysql code

The Coreseek api directory provides the PHP interface file sphinxapi.php, which contains the SphinxClient class.

Include this file in your PHP script and create a new instance:

    require 'sphinxapi.php';

    $sphinx = new SphinxClient();
    // Sphinx host name and port
    $sphinx->SetServer('localhost', 9312);
    // Return the result set as a plain PHP array
    $sphinx->SetArrayResult(true);
    // Result offsets: start position, number of results to return, maximum number of matches
    // (20 results is an example value)
    $sphinx->SetLimits(0, 20, 1000);
    // Maximum search time, in milliseconds
    $sphinx->SetMaxQueryTime(10);
    // Run a simple search. It queries all fields; to query specific fields, read on.
    // The index name refers to an index defined in the configuration file.
    // To use several indexes, separate them with commas ('email,diary'), or use '*' for all of them.
    $index = 'email';
    $result = $sphinx->Query('search keywords', $index);
    echo '<pre>';
    print_r($result);
    echo '</pre>';

$result is an array in which:

total is the number of matching records

matches is the matched data; each entry contains id, weight, and attrs

words holds the terms that the search keywords were split into

You may wonder why the result contains nothing like the content of the email. Sphinx does not return data rows the way MySQL does, because it never stored the complete data in the first place; it only stores the data after tokenization.

To get at the data, look at the matches array. The id in matches refers to the first field of the sql_query SELECT statement in the configuration file, which in our configuration file is:

    sql_query = SELECT emailid, fromid, toid, subject, content, sendtime, attachement FROM email

So the id in matches refers to emailid.

weight is the match weight. In general, the higher the weight, the earlier the match is returned. For how weights are computed, see the official documentation.

attrs holds the attributes declared with sql_attr_* in the configuration file; how to use them is covered later.

After all that, the search results still aren't the email data we want. Because Sphinx does not store the real data, you have to take the id values in matches and query the MySQL email table to get the actual emails. Overall this is still much faster than MySQL LIKE, provided the data volume is in the hundreds of thousands of rows or more; below that, using Sphinx will only be slower.
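
The lookup step described above can be sketched roughly as follows. This is only an illustration: extractMatchIds and buildEmailLookup are made-up helper names, not part of sphinxapi.php, the sample $result values are invented, and the code assumes SetArrayResult(true) was used so that each match is an array with an id key.

```php
<?php
// Hypothetical helpers; they are not part of sphinxapi.php.

/**
 * Collect document IDs from a Sphinx result, in the order Sphinx ranked them.
 * Assumes SetArrayResult(true), so each match is an array with an 'id' key.
 */
function extractMatchIds(array $result): array
{
    $ids = [];
    foreach ($result['matches'] ?? [] as $match) {
        $ids[] = (int) $match['id'];
    }
    return $ids;
}

/**
 * Build a parameterized query that fetches the email rows for the given IDs
 * and keeps the Sphinx ranking by ordering with MySQL's FIELD() function.
 * Returns [sql, params], or null when there is nothing to fetch.
 */
function buildEmailLookup(array $ids): ?array
{
    if ($ids === []) {
        return null;
    }
    $marks = implode(', ', array_fill(0, count($ids), '?'));
    $sql = "SELECT * FROM email WHERE emailid IN ($marks) ORDER BY FIELD(emailid, $marks)";
    // The ID list is bound twice: once for IN (...), once for FIELD(...).
    return [$sql, array_merge($ids, $ids)];
}

// Sample result shaped like the array described above (values are made up).
$result = [
    'total'   => 2,
    'matches' => [
        ['id' => 42, 'weight' => 3, 'attrs' => ['fromid' => 1, 'toid' => 7]],
        ['id' => 17, 'weight' => 1, 'attrs' => ['fromid' => 2, 'toid' => 9]],
    ],
];

$ids = extractMatchIds($result);
[$sql, $params] = buildEmailLookup($ids);
echo $sql, "\n";
```

The $sql and $params pair can then be handed to a prepared statement, for example through PDO's prepare() and execute(). Ordering by FIELD() is one way to keep the rows in the order Sphinx ranked them; sorting the rows in PHP would work as well.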

Next, some Sphinx usage that resembles MySQL conditions:

    // Range of emailid
    $sphinx->SetIDRange($min, $max);

    // Attribute filtering. The filtered attributes must be declared with sql_attr_* in the
    // configuration file. Earlier we defined:
    //   sql_attr_uint = fromid
    //   sql_attr_uint = toid
    //   sql_attr_timestamp = sendtime
    // If you change these attributes, remember to rebuild the index afterwards for the
    // change to take effect.

    // Specify a set of values
    $sphinx->SetFilter('fromid', array(1, 2)); // fromid can only be 1 or 2
    // For the opposite condition, pass true as the third parameter (exclude)
    $sphinx->SetFilter('fromid', array(1, 2), true); // fromid cannot be 1 or 2

    // Specify a range of values
    $sphinx->SetFilterRange('toid', 5, 200); // toid between 5 and 200
    // For the opposite condition, pass true as the fourth parameter (exclude)
    $sphinx->SetFilterRange('toid', 5, 200, true); // toid outside 5-200

    // Perform the search
    $result = $sphinx->Query('keywords', '*');
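
To make the role of the optional exclude argument concrete, here is a small plain-PHP model of which attribute values the two kinds of filter let through. The function names passesValueFilter and passesRangeFilter are made up for this sketch; the real filtering happens inside searchd, not in PHP.

```php
<?php
// A plain-PHP model of the filter semantics, for illustration only.
// These functions are hypothetical and are not how searchd is implemented.

/** Models SetFilter($attr, $values, $exclude): keep or drop the listed values. */
function passesValueFilter(int $value, array $values, bool $exclude = false): bool
{
    $listed = in_array($value, $values, true);
    return $exclude ? !$listed : $listed;
}

/** Models SetFilterRange($attr, $min, $max, $exclude): both bounds are inclusive. */
function passesRangeFilter(int $value, int $min, int $max, bool $exclude = false): bool
{
    $inside = $value >= $min && $value <= $max;
    return $exclude ? !$inside : $inside;
}

// Example toid values (made up).
$toids = [3, 5, 120, 200, 201];

// Values accepted by the range 5-200 without the exclude flag.
$inside = array_values(array_filter($toids, function ($v) {
    return passesRangeFilter($v, 5, 200);
}));

// Values accepted by the same range with the exclude flag set.
$outside = array_values(array_filter($toids, function ($v) {
    return passesRangeFilter($v, 5, 200, true);
}));

echo implode(',', $inside), "\n";
echo implode(',', $outside), "\n";
```

Leaving the exclude argument out, or passing false, keeps only the listed values or the values inside the range; passing true inverts the condition.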

Sort mode

Search results can be sorted using the following pattern:

SPH_SORT_RELEVANCE mode, sorted in descending order of relevance (best matches first)

SPH_SORT_ATTR_DESC mode, sorted by an attribute in descending order (rows with larger attribute values come first)

SPH_SORT_ATTR_ASC mode, sorted by an attribute in ascending order (rows with smaller attribute values come first)

SPH_SORT_TIME_SEGMENTS mode, sorted first by time segment in descending order (last hour/day/week/month), then by relevance

SPH_SORT_EXTENDED mode, which combines columns in ascending or descending order in a SQL-like way

SPH_SORT_EXPR mode, sorted by an arithmetic expression

    // Sort by an attribute: descending order of fromid.
    // Note that calling SetSortMode again overwrites the previous sort.
    $sphinx->SetSortMode(SPH_SORT_ATTR_DESC, 'fromid');

    // To sort by several fields, use SPH_SORT_EXTENDED mode.
    // @id is a Sphinx built-in keyword; here it refers to emailid
    // (the first field of the sql_query SELECT statement).
    $sphinx->SetSortMode(SPH_SORT_EXTENDED, 'fromid ASC, toid DESC, @id DESC');

    // Perform the search
    $result = $sphinx->Query('keywords', '*');

    // See the sorting modes section of the official documentation for more information
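
When the sort columns come from application code, the SPH_SORT_EXTENDED clause can be assembled with a small helper. The sketch below is only one way to do it; buildExtendedSort is a made-up name, and the rule for what counts as a valid column name is an assumption of this example.

```php
<?php
/**
 * Hypothetical helper: turn ['fromid' => 'asc', ...] into an
 * SPH_SORT_EXTENDED clause such as "fromid ASC, toid DESC".
 */
function buildExtendedSort(array $columns): string
{
    $parts = [];
    foreach ($columns as $name => $direction) {
        $direction = strtoupper($direction);
        if ($direction !== 'ASC' && $direction !== 'DESC') {
            throw new InvalidArgumentException("Bad sort direction for $name");
        }
        // Accept plain attribute names plus built-in keywords such as @id.
        if (!preg_match('/^@?[A-Za-z_][A-Za-z0-9_]*$/', $name)) {
            throw new InvalidArgumentException("Bad sort column: $name");
        }
        $parts[] = "$name $direction";
    }
    return implode(', ', $parts);
}

$clause = buildExtendedSort(['fromid' => 'asc', 'toid' => 'desc', '@id' => 'desc']);
echo $clause, "\n";
```

The resulting string is what would be passed as the second argument of SetSortMode() together with the SPH_SORT_EXTENDED constant.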

Matching mode

The following matching modes are available:

SPH_MATCH_ALL, matches all query terms (default mode);

SPH_MATCH_ANY, matches any one of the query terms;

SPH_MATCH_PHRASE, treats the whole query as a phrase, which must match completely and in order;

SPH_MATCH_BOOLEAN, treats the query as a Boolean expression;

SPH_MATCH_EXTENDED, treats the query as an expression in the Coreseek/Sphinx internal query language. Starting with Coreseek 3/Sphinx 0.9.9, this option is superseded by SPH_MATCH_EXTENDED2, which provides more functionality and better performance. It is retained for compatibility with legacy code, so that old application code keeps working even after Sphinx and its components, including the API, are upgraded;

SPH_MATCH_EXTENDED2, matches the query using the second version of the extended match mode;

SPH_MATCH_FULLSCAN, forces the query to be matched using full scan mode. Note that in this mode all query terms are ignored: filters, filter ranges, and grouping still work, but no text matching takes place.

Our main concern is the SPH_MATCH_EXTENDED2 extended match mode, which allows some conditional statements similar to MySQL:

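    // Set the extended match mode
    $sphinx->SetMatchMode(SPH_MATCH_EXTENDED2);

    // Use conditions in the query; fields start with @.
    // Search for emails whose content contains 测试 ("test") and whose toid equals 1:
    $result = $sphinx->Query('@content (测试) & @toid =1', '*');

    // Use parentheses with & (AND), | (OR), and - (NOT, i.e. !=) to build more complex conditions
    $result = $sphinx->Query('(@content (测试) & @subject =呃) | (@fromid -(100))', '*');

    // For more syntax, see the matching modes section of the official documentation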
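
Because characters such as parentheses, &, | and - act as operators in the extended syntax, keywords typed by users are usually escaped before being placed into a query like the ones above. Here is a minimal sketch; escapeKeyword and fieldCondition are made-up names, and the list of escaped characters is an assumption of this example (the bundled client also has its own EscapeString() method, which may be preferable).

```php
<?php
/**
 * Hypothetical helper: backslash-escape characters that act as operators
 * in the extended query syntax. The character list is an assumption.
 */
function escapeKeyword(string $text): string
{
    $special = ['\\', '(', ')', '|', '-', '!', '@', '~', '"', '&', '/', '^', '$', '='];
    $map = [];
    foreach ($special as $char) {
        $map[$char] = '\\' . $char;
    }
    // strtr() with an array replaces in a single pass, so backslashes
    // that were just inserted are not escaped a second time.
    return strtr($text, $map);
}

/** Hypothetical helper: build "@field (keyword)" for one full-text field. */
function fieldCondition(string $field, string $keyword): string
{
    return '@' . $field . ' (' . escapeKeyword($keyword) . ')';
}

// Combine two field conditions with & (AND).
$query = fieldCondition('content', 'e-mail (draft)')
    . ' & '
    . fieldCondition('subject', 'report');
echo $query, "\n";
```

The resulting string would be passed to Query() after SetMatchMode(SPH_MATCH_EXTENDED2). The field names themselves are not escaped here, so they should come from your own code rather than from user input.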

One thing worth noting about extended match mode is which fields can be searched. If a field is declared as an attribute, it is by default not included among the fields searched in extended match mode; it can only be filtered with SetFilter() or SetFilterRange().

Earlier we declared fromid, toid, and sendtime as attributes. What if we also want to use them as conditions in extended match mode?

Simply select each of them a second time in the sql_query statement:

    sql_query = SELECT emailid, fromid, fromid, toid, toid, subject, content, sendtime, sendtime, attachement FROM email

After changing the configuration, remember to rebuild the index.

More conditional tricks

These are just tricks and are not recommended in a production environment; the reason is given at the end of the article.

<, <=, >, >=

By default Sphinx has no such comparison operators.

What if you want the emails sent after a certain date? You can simulate the comparison with the SetFilterRange() method:

    // Greater than or equal to a timestamp $time
    // The largest timestamp is ten 9s; adding 1 gives a bound that cannot be exceeded
    $sphinx->SetFilterRange('sendtime', $time, 10000000000);

    // Greater than the timestamp $time
    $sphinx->SetFilterRange('sendtime', $time + 1, 10000000000);

    // Less than or equal to the timestamp $time
    // The smallest timestamp is 0, so the lower bound is 0 minus 1
    $sphinx->SetFilterRange('sendtime', -1, $time);

    // Less than the timestamp $time
    $sphinx->SetFilterRange('sendtime', -1, $time - 1);
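
The four cases above follow one pattern, which can be captured in a small helper. This is a sketch only: comparisonToRange, RANGE_MIN and RANGE_MAX are made-up names, the bounds are the ones used in the text, and the code assumes 64-bit PHP integers.

```php
<?php
// Bounds taken from the text: -1 sits just below the smallest timestamp (0),
// and 10000000000 sits just above the largest 10-digit timestamp.
const RANGE_MIN = -1;
const RANGE_MAX = 10000000000;

/**
 * Hypothetical helper: translate a comparison on an integer attribute
 * into the inclusive [min, max] pair that SetFilterRange() takes.
 */
function comparisonToRange(string $operator, int $value): array
{
    switch ($operator) {
        case '>=':
            return [$value, RANGE_MAX];
        case '>':
            return [$value + 1, RANGE_MAX];
        case '<=':
            return [RANGE_MIN, $value];
        case '<':
            return [RANGE_MIN, $value - 1];
    }
    throw new InvalidArgumentException("Unsupported operator: $operator");
}

$time = 1300000000; // example Unix timestamp
[$min, $max] = comparisonToRange('>', $time);
echo "sendtime between $min and $max\n";
```

The pair would then be used as $sphinx->SetFilterRange('sendtime', $min, $max).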

IS NOT NULL

How do you search on empty fields? For example, to find emails whose attachment is empty, you might think of @attachment (''). In fact, that searches for two single-quote characters, because Sphinx search strings are not wrapped in quotes.

Sphinx currently provides no such feature, but you can work around it in the MySQL statement:

    sql_query = SELECT emailid, fromid, toid, subject, content, sendtime, attachement != '' AS attachisnotnull FROM email

This returns a new field, attachisnotnull. When attachisnotnull is 1, the attachment is not empty.

After changing the configuration, remember to rebuild the index.

FIND_IN_SET()

To search for emails that contain a given attachment, MySQL users habitually reach for a simple FIND_IN_SET statement. In Sphinx you must declare a multi-value attribute (MVA) with sql_attr_multi in the configuration:

    # attachment can hold attachment IDs separated by commas, or by spaces, semicolons, etc.; Sphinx recognizes them all
    sql_attr_multi = uint attachment from field

After changing the configuration, remember to rebuild the index.

You can then use SetFilter() in PHP:

    // Search for emails containing attachment ID 1 or 2;
    // the MySQL syntax would be FIND_IN_SET(`attachment`, '1,2')
    $sphinx->SetFilter('attachment', array(1, 2));

    // SetFilterRange also works: search for emails containing an attachment ID in the range 50-100
    $sphinx->SetFilterRange('attachment', 50, 100);

Summary

If you want a free, easy-to-use, fast full-text search engine, Sphinx is undoubtedly a great choice, but do not forget what Sphinx is for: full-text search. Forget about those messy condition tricks. If you want Sphinx search to be as flexible as MySQL and to handle complex multi-condition searches entirely on its own, such as the advanced search of an email system, I suggest spending that time on optimizing your PHP or MySQL code instead, because such tricks may well make your search slower.

The best approach is to search the content with Sphinx in the simplest way possible and pass the returned IDs to MySQL to fetch the data.
