Sphinx Full-Text Search: PHP Usage Tutorial


Data table:

CREATE TABLE email (
emailid mediumint(8) unsigned NOT NULL auto_increment COMMENT 'mail id',
fromid int(10) unsigned NOT NULL default '0' COMMENT 'sender id',
toid int(10) unsigned NOT NULL default '0' COMMENT 'recipient id',
content text NOT NULL COMMENT 'message content',
subject varchar(100) NOT NULL COMMENT 'message subject',
sendtime int(10) NOT NULL COMMENT 'send time',
attachment varchar(100) NOT NULL COMMENT 'attachment IDs, comma-separated',
PRIMARY KEY (emailid)
) ENGINE=MyISAM;
Open a console and start searchd; PHP can only connect to Sphinx while it is running (make sure the index has already been built):
D:\coreseek\bin\searchd -c d:\coreseek\bin\sphinx.conf
The coreseek/api directory provides the PHP interface file sphinxapi.php, which contains a SphinxClient class.
Include this file in PHP and create a new instance:

require 'sphinxapi.php'; // adjust the path to point at the coreseek/api directory
$sphinx = new SphinxClient();
// Sphinx host name and port
$sphinx->SetServer('localhost', 9312);
// Return the result set as a PHP array
$sphinx->SetArrayResult(true);
// Offset into the matches; the parameters are: start position, number of results to return, maximum number of matches
$sphinx->SetLimits(0, 20, 1000);
// Maximum search time
$sphinx->SetMaxQueryTime(10);
// Perform a simple search. This searches all fields; to search specific fields, read on
$index = 'email'; // the index name is the one defined in the configuration file; to search several indexes, separate them with commas ('email,diary') or use '*' for all indexes
$result = $sphinx->Query('search keyword', $index);
echo '<pre>';
print_r($result);
echo '</pre>';


$result is an array, in which:
total is the total number of matching records
matches is the matched data, containing id, weight, and attrs
words is the tokenized form of the search keywords
You may wonder why the result contains no email content. Unlike MySQL, Sphinx does not return the data rows, because it does not store the complete data, only the tokenized form.
What matters is the matches array. The id in matches refers to the first field of the sql_query SELECT statement in the configuration file. In our configuration file:
sql_query = SELECT emailid, fromid, toid, subject, content, sendtime, attachment FROM email
so the id in matches is the emailid.
weight is the match weight; in general, the higher the weight, the earlier the result is returned. See the official documentation for details on weighting.
attrs holds the attributes declared with sql_attr_* in the configuration file; their use is covered later.
So the search result is still not the email data we want. Because Sphinx does not store the real data, you have to query the MySQL email table with the ids in matches to get the actual emails. Even so, the overall speed is far faster than MySQL LIKE, provided the table holds hundreds of thousands of rows or more; below that, using Sphinx will only be slower.
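The second step, going from matches back to MySQL, might look roughly like the sketch below. The helper names extractMatchIds and buildEmailQuery are hypothetical (they are not part of sphinxapi.php), the sample $result only mimics the structure described above, and the code assumes SetArrayResult(true) was called so that each match carries an 'id' key. FIELD() is used so that MySQL returns the rows in the same order Sphinx ranked them.

```php
<?php
// Hypothetical helpers for illustration; not part of sphinxapi.php.

/**
 * Collect the document IDs from a Sphinx result.
 * Assumes SetArrayResult(true), so each match is an array with an 'id' key.
 */
function extractMatchIds(array $result): array
{
    $ids = [];
    foreach ($result['matches'] ?? [] as $match) {
        $ids[] = (int) $match['id']; // cast to int so the IDs are safe to inline in SQL
    }
    return $ids;
}

/**
 * Build a SELECT that fetches the real rows and keeps Sphinx's ordering.
 * Returns null when there is nothing to fetch.
 */
function buildEmailQuery(array $ids): ?string
{
    if ($ids === []) {
        return null;
    }
    $list = implode(',', $ids);
    return "SELECT * FROM email WHERE emailid IN ($list) ORDER BY FIELD(emailid, $list)";
}

// Made-up result with the structure described above.
$result = [
    'total'   => 2,
    'matches' => [
        ['id' => 7, 'weight' => 2, 'attrs' => ['fromid' => 1]],
        ['id' => 3, 'weight' => 1, 'attrs' => ['fromid' => 2]],
    ],
];

echo buildEmailQuery(extractMatchIds($result)), "\n";
```

Query() returns false on failure, so a real script would check for that before passing $result to these helpers.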
Next, here are some Sphinx conditions that resemble MySQL WHERE conditions.

// Range of emailid
$sphinx->SetIDRange($min, $max);
// Attribute filters. Attributes used in filters must be declared with sql_attr_* in the configuration file; earlier we defined these:
    sql_attr_uint  = fromid
    sql_attr_uint  = toid
    sql_attr_timestamp  = sendtime
// If you change these attributes, remember to rebuild the index afterward for the change to take effect
// Match specific values
$sphinx->SetFilter('fromid', array(1, 2));    // fromid can only be 1 or 2
// To invert the condition above, pass true as the third parameter ($exclude)
$sphinx->SetFilter('fromid', array(1, 2), true);    // fromid cannot be 1 or 2
// Match a range of values
$sphinx->SetFilterRange('toid', 5, 200);    // toid is between 5 and 200
// To invert the condition above, pass true as the fourth parameter ($exclude)
$sphinx->SetFilterRange('toid', 5, 200, true);    // toid is outside 5-200
// Execute the search
$result = $sphinx->Query('keyword', '*');
Sort mode
Search results can be sorted using the following modes:
SPH_SORT_RELEVANCE mode, descending by relevance (best matches first)
SPH_SORT_ATTR_DESC mode, descending by attribute (larger attribute values first)
SPH_SORT_ATTR_ASC mode, ascending by attribute (smaller attribute values first)
SPH_SORT_TIME_SEGMENTS mode, descending by time segment (last hour/day/week/month), then by relevance
SPH_SORT_EXTENDED mode, combines columns in a SQL-like fashion, each ascending or descending
SPH_SORT_EXPR mode, sorts by an arithmetic expression
Sorting by attribute
// Sort by fromid in descending order. Note that calling SetSortMode again overwrites the previous sort
$sphinx->SetSortMode(SPH_SORT_ATTR_DESC, 'fromid');
// To sort by several fields, use SPH_SORT_EXTENDED mode
// @id is a Sphinx built-in keyword; here it refers to emailid, because the id is the first field of sql_query
$sphinx->SetSortMode(SPH_SORT_EXTENDED, 'fromid ASC, toid DESC, @id DESC');
// Execute the search
$result = $sphinx->Query('keyword', '*');
See the sorting modes section of the official documentation for more information
Match mode
The following match modes are available:

SPH_MATCH_ALL, matches all query words (default mode);
SPH_MATCH_ANY, matches any one of the query words;
SPH_MATCH_PHRASE, treats the whole query as a phrase, which must match completely and in order;
SPH_MATCH_BOOLEAN, treats the query as a Boolean expression;
SPH_MATCH_EXTENDED, treats the query as an expression in the Coreseek/Sphinx internal query language. Starting with Coreseek 3/Sphinx 0.9.9, this option is superseded by SPH_MATCH_EXTENDED2, which provides more functionality and better performance. It is retained for compatibility with legacy code, so that old application code keeps working when Sphinx and its components, including the API, are upgraded;
SPH_MATCH_EXTENDED2, matches the query using the second version of the extended match mode;
SPH_MATCH_FULLSCAN, forces the query to be matched in full scan mode. Note that in this mode all query words are ignored: filters, filter ranges, and grouping still work, but no text matching takes place.
We focus on SPH_MATCH_EXTENDED2. The extended match mode lets you write conditions somewhat like those in MySQL
// Set the extended match mode
$sphinx->SetMatchMode(SPH_MATCH_EXTENDED2);
// Conditions go in the query string, with each field prefixed by @. Search for content containing test where toid equals 1:
$result = $sphinx->Query('@content (test) & @toid =1', '*');
// Build more complex conditions with parentheses and the operators & (AND), | (OR), and - (NOT, like !=)
$result = $sphinx->Query('(@content (test) & @subject =er) | (@fromid -())', '*');
For more syntax, see the match mode description in the official documentation
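When the search terms come from user input, characters such as & or | would be read as operators in extended mode. Below is a minimal sketch of assembling such a query string by hand. The helpers escapeTerm, fieldClause, and allOf are hypothetical names introduced only for illustration, and the list of escaped characters is an assumption based on the operators mentioned above. The PHP API also ships its own SphinxClient::EscapeString() method, which is the safer choice when the client is available.

```php
<?php
// Hypothetical helpers for illustration; not part of sphinxapi.php.

/** Backslash-escape characters assumed to act as operators in extended queries. */
function escapeTerm(string $term): string
{
    $ops = '\\()|-!@~"&/^$=';
    $out = '';
    for ($i = 0, $n = strlen($term); $i < $n; $i++) {
        $ch = $term[$i];
        // Multi-byte UTF-8 bytes never collide with these ASCII operators.
        $out .= (strpos($ops, $ch) !== false) ? '\\' . $ch : $ch;
    }
    return $out;
}

/** Build one "@field (terms)" clause. */
function fieldClause(string $field, string $terms): string
{
    return '@' . $field . ' (' . escapeTerm($terms) . ')';
}

/** Join clauses with & so that all of them must match. */
function allOf(array $clauses): string
{
    return implode(' & ', $clauses);
}

$query = allOf([
    fieldClause('content', 'test'),
    fieldClause('subject', 'hello'),
]);
echo $query, "\n";
// With a connected client: $result = $sphinx->Query($query, '*');
```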

One point about search fields in extended match mode is worth mentioning: once a field is declared as an attribute, extended-match searches no longer cover it by default, and it can only be used with SetFilter() or SetFilterRange().
Earlier we declared fromid, toid, and sendtime as attributes. What if we also want to use them as conditions in extended match mode?
Just select the field one more time in the sql_query statement:
sql_query = SELECT emailid, fromid, fromid, toid, toid, subject, content, sendtime, sendtime, attachment FROM email
Remember to rebuild the index after changing the configuration.
More condition tips
These are only tips and are not recommended for production use; the reason is given at the end of the article.

<, <=, >, >=
Sphinx has no such comparison operators by default.
What if I want mail sent after a certain date? Simulate it with SetFilterRange():
// Greater than or equal to a timestamp $time
$sphinx->SetFilterRange('sendtime', $time, 10000000000); // the largest 10-digit timestamp is ten 9s; adding 1 gives a bound that can never be exceeded
// Greater than a timestamp $time
$sphinx->SetFilterRange('sendtime', $time + 1, 10000000000);
// Less than or equal to a timestamp $time
$sphinx->SetFilterRange('sendtime', -1, $time);    // the smallest timestamp is 0, so subtract 1
// Less than a timestamp $time
$sphinx->SetFilterRange('sendtime', -1, $time - 1);
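The four cases above can be folded into one small function. This is only a sketch: comparisonToRange, TS_MIN, and TS_MAX are hypothetical names, the bounds are the ones used in the text, and it assumes a 64-bit PHP build so that 10000000000 stays an integer.

```php
<?php
// Hypothetical helper for illustration; not part of sphinxapi.php.
// Maps a comparison operator on a timestamp attribute to the inclusive
// [min, max] pair expected by SetFilterRange().

const TS_MIN = -1;           // one below the smallest timestamp (0), as in the text
const TS_MAX = 10000000000;  // one above the largest 10-digit timestamp, as in the text

function comparisonToRange(string $op, int $time): array
{
    switch ($op) {
        case '>=':
            return [$time, TS_MAX];
        case '>':
            return [$time + 1, TS_MAX];
        case '<=':
            return [TS_MIN, $time];
        case '<':
            return [TS_MIN, $time - 1];
        default:
            throw new InvalidArgumentException("Unsupported operator: $op");
    }
}

[$min, $max] = comparisonToRange('>', 1300000000);
echo "$min..$max\n";
// With a connected client: $sphinx->SetFilterRange('sendtime', $min, $max);
```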
is not NULL
How do you search for an empty field, for example mail whose attachment is empty? You might think @attachment ('') would work, but that actually searches for two single quotes; Sphinx searches for strings without quotes.
Sphinx currently provides no such feature, but you can work around it in the MySQL statement:
sql_query = SELECT emailid, fromid, toid, subject, content, sendtime, attachment != '' AS attachisnotnull FROM email
This returns a new field, attachisnotnull; when attachisnotnull is 1, the attachment is not empty.
Remember to rebuild the index after changing the configuration.
FIND_IN_SET()
To search for mail that contains a given attachment, MySQL users are used to a simple FIND_IN_SET() call. In Sphinx you must declare a multi-valued attribute (MVA) with sql_attr_multi in the configuration:
sql_attr_multi = uint attachment from field # attachment can hold comma-separated attachment IDs; spaces, semicolons, and so on are also recognized by Sphinx
Remember to rebuild the index after changing the configuration.
Then PHP can use SetFilter():
// Search for mail containing attachment ID 1 or 2; in MySQL this is roughly FIND_IN_SET('1', attachment) OR FIND_IN_SET('2', attachment)
$sphinx->SetFilter('attachment', array(1, 2));
// You can also use SetFilterRange to search for mail with attachment IDs in the range 50-100
$sphinx->SetFilterRange('attachment', 50, 100);
If you want a free, easy-to-use, fast full-text search engine, Sphinx is an excellent choice, but do not forget what Sphinx is for: full-text search. Do not get carried away with elaborate conditions. If you want Sphinx to be more flexible than MySQL and to handle complex, multi-criteria searches entirely on its own, such as an advanced email search, I suggest you spend that time optimizing your PHP or MySQL code instead, because piling conditions onto Sphinx may make your search slower.

The best approach is to search the content with Sphinx in the simplest way possible, then hand the returned IDs to MySQL for the rest of the query.
