Sphinx Reference Manual (vi)

Source: Internet
Author: User
Reproduced from http://sphinxsearch.com/wiki/doku.php?id=sphinx_manual_chinese. The version is older, but most of the features and configuration instructions still apply to the current release. Readers comfortable with English should consult the latest English manual.
6. API Reference


Sphinx has several implementations of the searchd client API for different programming languages. At the time of writing, we officially support our own PHP, Python, and Java implementations. In addition, there are free, open-source third-party API implementations for Perl, Ruby, and C++.

The API reference implementation is written in PHP, because (we believe) Sphinx is used more widely with PHP than with any other language. This reference is therefore based on the PHP API, and all code samples in this section are given in PHP.

All other APIs provide the same methods and use exactly the same network protocol, so this document applies to them as well. There may be small differences in method naming conventions or in the specific data structures used, but the functionality offered by the APIs does not differ between languages.

6.1. Common API Methods

6.1.1. GetLastError (Error message)

prototype: function GetLastError ()

Returns the most recent error description in human-readable form. Returns an empty string if the previous API call produced no error.

Call this function whenever another function (such as Query()) fails, which is generally signaled by a return value of false. It returns a description of the error.

This function does not reset the error description, so it can be called multiple times if necessary.

6.1.2. GetLastWarning (Warning message)

prototype: function GetLastWarning ()

Returns the most recent warning description in human-readable form. Returns an empty string if the previous API call produced no warning.

Call this function to check whether a request (such as Query()) completed but generated a warning. For example, a search query against a distributed index might complete successfully even if several remote agents timed out. In that case a warning message is generated.

This function does not reset the warning message, so it can be called multiple times if necessary.
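The error and warning checks described in the two sections above might be combined as in the following sketch. The helper name runQuery and the shape of its return value are assumptions made for illustration; $cl is assumed to be a SphinxClient instance created from the PHP API file (sphinxapi.php).

```php
<?php
// Hypothetical helper (not part of the Sphinx API): runs a query and
// separates hard failures from warnings.
// $cl is assumed to be a SphinxClient instance from sphinxapi.php.
function runQuery($cl, $query, $index = "*")
{
    $result = $cl->Query($query, $index);

    if ($result === false) {
        // Query() failed; GetLastError() describes why.
        throw new RuntimeException("Sphinx query failed: " . $cl->GetLastError());
    }

    // The query completed, but may still have produced a warning
    // (for example, because a remote agent timed out).
    $warning = $cl->GetLastWarning();

    return array("result" => $result, "warning" => $warning);
}
```

A caller would typically log a non-empty "warning" entry and still use the result.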

6.1.3. SetServer (Set search server)

prototype: function SetServer ($host, $port)

Sets the searchd host name and TCP port. All subsequent requests use the new host and port settings. The default host and port are "localhost" and 9312, respectively.

6.1.4. SetRetries (Set retries on failure)

prototype: function SetRetries ($count, $delay = 0)

Sets the distributed search retry count and delay.

On transient failures, searchd retries up to $count times per agent. $delay is the delay between retries, in milliseconds. Retries are disabled by default. Note that this call does not make the API itself retry on transient failures; it only tells searchd to do so. Transient failures currently include all kinds of connect() failures and the case where a remote agent has exceeded its maximum number of connections (too busy).

6.1.5. SetConnectTimeout (Set connection timeout)

prototype: function SetConnectTimeout ($timeout)

Sets the connection timeout. If a connection to the server cannot be established within this time, the attempt is abandoned.

Sometimes the server is slow to respond, either because of network delays or because it is backed up with unprocessed queries. In either case, this option gives the client application some control over how to proceed when searchd is unavailable, and keeps the script from failing because it exceeded its execution limits (especially in PHP).

When the connection fails, an appropriate error code is returned to the application so that it can handle the error and notify the user at the application level.
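As a rough illustration, the three connection-related calls above could be grouped into one setup step. The helper name configureConnection and the specific values are assumptions. The timeout is assumed here to be expressed in seconds, which the text above does not state. $cl is assumed to be a SphinxClient instance from sphinxapi.php.

```php
<?php
// Hypothetical helper: applies connection settings to a client.
// $cl is assumed to be a SphinxClient instance from sphinxapi.php.
function configureConnection($cl, $host = "localhost", $port = 9312)
{
    $cl->SetServer($host, $port);  // searchd host name and TCP port
    $cl->SetConnectTimeout(1);     // give up connecting after this long (assumed seconds)
    $cl->SetRetries(2, 100);       // searchd retries each agent up to 2 times, 100 ms apart
    return $cl;
}
```

Keeping the timeout short lets the application fall back (for example, to an error page) well before the script's own execution limit is reached.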

6.1.6. SetArrayResult (Set result format)

prototype: function SetArrayResult ($arrayresult)

PHP specific. Controls the format of the returned search result set (matches are returned either as an array or as a hash).

The $arrayresult parameter must be a Boolean. If $arrayresult is false (the default), matches are returned as a PHP hash, with the document ID as the key and the remaining information (weight, attributes) as the value. If $arrayresult is true, matches are returned as a plain array, with each entry holding all the information for one match (including the document ID).

This call was introduced together with group-by support for MVA attributes. Grouping by an MVA may produce duplicate document IDs, so such results need to be returned as a plain array, because a hash can hold only one record per document ID.
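The difference between the two formats can be shown without a server. The sketch below uses made-up match data (document IDs and weights are illustrative only) to show why a hash keyed by document ID loses duplicates while a plain array keeps them.

```php
<?php
// Illustrative data only: three matches, two of which share document ID 7,
// as can happen when grouping by an MVA attribute.
$matches = array(
    array("id" => 7, "weight" => 3),
    array("id" => 7, "weight" => 1),
    array("id" => 9, "weight" => 2),
);

// Hash form: keyed by document ID, so the second match for ID 7
// overwrites the first.
$asHash = array();
foreach ($matches as $m) {
    $asHash[$m["id"]] = array("weight" => $m["weight"]);
}

// Array form: every match is kept, and the ID travels with the record.
$asArray = $matches;

echo count($asHash), " entries in hash form\n";   // 2
echo count($asArray), " entries in array form\n"; // 3
```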

6.1.7. IsConnectError (Check for connection errors)

prototype: function IsConnectError ()

Checks whether the last error was a network error on the API side or a remote error reported by searchd. Returns true if the last attempt to connect to searchd failed on the API side, and false otherwise (the error was remote, or no connection attempt was made at all). Introduced in version 0.9.9-rc1.

6.2. General Search Settings

6.2.1. SetLimits (Set result set offset and limits)

prototype: function SetLimits ($offset, $limit, $max_matches = 0, $cutoff = 0)

Sets the offset ($offset) into the server-side result set and the number of matches ($limit) to return to the client starting from that offset. It can also set the maximum server-side result set size for the current query ($max_matches) and a threshold ($cutoff) at which the search stops once that many matches have been found. All parameters must be non-negative integers.

The first two parameters behave exactly like the parameters of the MySQL LIMIT clause. They instruct searchd to return at most $limit matches starting from match number $offset. The defaults for $offset and $limit are 0 and 20 respectively, meaning that the first 20 matches are returned.

The max_matches setting controls how many matches searchd keeps in memory while searching. Even if max_matches is set to 1, all matching documents are still processed, ranked, filtered, and sorted. However, only the best N documents are kept in memory at any given moment, for performance and memory usage reasons, and this setting controls that N. Note that max_matches is set in two places. The per-query limit is specified by this API call, but there is also a per-server limit controlled by the max_matches setting in the configuration file. To prevent memory abuse, the server does not allow the per-query limit to be set higher than the per-server limit.

The client can never receive more than max_matches matches. The default limit is 1000, and you should rarely need to raise it: 1000 records are enough to present to an end user. If you intend to pass the results to the application for further sorting or filtering, note that doing this work on the Sphinx side is much more efficient.

The $cutoff setting is intended for advanced performance tuning. It tells searchd to stop searching once it has found and processed $cutoff matches.
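A common use of $offset and $limit is paging. The following sketch converts a page number into the pair of values to pass to SetLimits(). The helper name pageToLimits and the decision to clamp the window at max_matches are assumptions made for illustration, not part of the API.

```php
<?php
// Hypothetical helper: converts a 1-based page number into the
// ($offset, $limit) pair expected by SetLimits().
function pageToLimits($page, $perPage = 20, $maxMatches = 1000)
{
    $page   = max(1, (int) $page);
    $offset = ($page - 1) * $perPage;

    // Nothing beyond $maxMatches can be returned, so clamp the window.
    if ($offset >= $maxMatches) {
        return null; // page lies entirely outside the result window
    }
    $limit = min($perPage, $maxMatches - $offset);

    return array($offset, $limit);
}

// Usage with a SphinxClient instance $cl (assumed to exist):
// list($offset, $limit) = pageToLimits(3);
// $cl->SetLimits($offset, $limit, 1000);
```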

6.2.2. SetMaxQueryTime (Set maximum search time)

prototype: function SetMaxQueryTime ($max_query_time)

Sets the maximum search time, in milliseconds. The parameter must be a non-negative integer. The default value is 0, which means no limit.

This setting is similar to $cutoff in SetLimits(), but it limits elapsed query time rather than the number of matches processed. Once processing has taken too long, the local search query is stopped. Note that if a search queries several local indexes, the limit applies to each index independently.

6.2.3. SetOverride (Set temporary attribute value overrides)

prototype: function SetOverride ($attrname, $attrtype, $values)

Sets temporary (per-query) attribute value overrides for individual documents. Only scalar attributes are supported. $values must be a hash that maps document IDs to the overriding attribute values. Introduced in version 0.9.9-rc1.

The override feature lets you "temporarily" change attribute values for some documents within a single query, leaving all other queries unaffected. This can be useful for personalized data. For example, suppose you are implementing a personalized search that boosts posts recommended by the user's friends. Such data is not only dynamic but also personal, so it cannot simply be put into the index, because it must not affect other users' searches. Overrides, on the other hand, are local to a single query and invisible to everyone else. So you can, for example, define a "friends_weight" attribute for every document with a default value of 0, temporarily override it with 1 for documents 123, 456, and 789 (those recommended by the current user's friends), and then use that value when ranking.
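The friends example above might look roughly like this in code. The helper name boostFriendPicks is hypothetical. The attribute name "friends_weight" follows the example in the text. $cl is assumed to be a SphinxClient instance and SPH_ATTR_INTEGER a constant defined by sphinxapi.php.

```php
<?php
// Hypothetical helper: marks the given documents as friend recommendations
// for the next query only.
// $cl is assumed to be a SphinxClient instance from sphinxapi.php, which is
// also assumed to define the SPH_ATTR_INTEGER constant.
function boostFriendPicks($cl, array $docIds)
{
    // Build the "document ID => overriding value" hash.
    $values = array_fill_keys($docIds, 1);
    $cl->SetOverride("friends_weight", SPH_ATTR_INTEGER, $values);
    return $values;
}

// Usage (assuming $cl exists): boostFriendPicks($cl, array(123, 456, 789));
```

The overridden attribute could then be referenced in a sorting or SetSelect() expression for that query.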

6.2.4. SetSelect (Set the returned content)

prototype: function SetSelect ($clause)

Sets the select clause, listing the specific attributes to fetch and the expressions to compute and fetch. The clause syntax mimics SQL. Introduced in version 0.9.9-rc1.

SetSelect() is very similar to the part of a standard SQL query between SELECT and FROM. It lets you choose which attributes (columns) to fetch and which expressions over those columns to compute and fetch. The difference from SQL is that every expression must be given an alias with the AS keyword, and the alias must be a valid identifier (consisting of letters and digits). SQL allows aliases too but does not require them. Sphinx enforces aliases so that computed results can always be returned under a "normal" name in the result set, referenced in other clauses, and so on.

Everything else is basically identical to SQL. The asterisk ("*") is supported, functions are supported, and any number of expressions may be given. Computed expressions can be used for sorting, filtering, and grouping, just like regular attributes.

Starting with version 0.9.9-rc2, aggregate functions (AVG(), MIN(), MAX(), SUM()) are supported when GROUP BY is used.

Expression sorting (Section 4.5, "SPH_SORT_EXPR mode") and geodistance calculation (Section 6.4.5, "SetGeoAnchor (Set geodistance anchor point)") are now implemented internally on top of this expression evaluation mechanism, using the magic names "@expr" and "@geodist" respectively.

Example:

$cl->SetSelect ( "*, @weight+(user_karma+ln(pageviews))*0.1 AS myweight" );
$cl->SetSelect ( "exp_years, salary_gbp*{$gbp_usd_rate} AS salary_usd,
   IF(age>40,1,0) AS over40" );
$cl->SetSelect ( "*, AVG(price) AS avgprice" );

6.3. Full-Text Search Settings

6.3.1. SetMatchMode (Set matching mode)

prototype: function SetMatchMode ($mode)

Sets the full-text query matching mode, as described in Section 4.1, "Matching modes". The parameter must be a constant corresponding to one of the known modes.

Warning: (PHP only) query mode constants must not be enclosed in quotes, because that passes a string instead of a constant:

$cl->SetMatchMode ( "SPH_MATCH_ANY" ); // INCORRECT! Will not work as expected
$cl->SetMatchMode ( SPH_MATCH_ANY ); // correct, works OK

6.3.2. SetRankingMode (Set ranking mode)

prototype: function SetRankingMode ($ranker)



Sets the ranking mode. Currently available only in the SPH_MATCH_EXTENDED2 matching mode. The parameter must be a constant corresponding to one of the known modes.

By default, Sphinx computes two factors that contribute to the final match weight. The major one is the proximity of the query phrase to the document text. The minor one is a statistical function called BM25, which varies between 0 and 1 depending on the keyword frequency within the document (more occurrences yield a higher weight) and within the whole index (rarer keywords yield a higher weight).

Sometimes, however, you may want to compute the weight differently, or skip weight computation altogether for performance reasons because the result set is sorted by some other criterion anyway. This can be done by setting the appropriate ranking mode.

The implemented modes are:

SPH_RANK_PROXIMITY_BM25: the default mode, which computes both phrase proximity and BM25 and combines the two.

SPH_RANK_BM25: statistical ranking mode that uses BM25 only (similar to most other full-text engines). This mode is faster but may lower result quality for queries that contain more than one keyword.

SPH_RANK_NONE: ranking disabled. This is the fastest mode and is essentially equivalent to Boolean searching. A weight of 1 is assigned to every match.

SPH_RANK_WORDCOUNT: ranking by keyword occurrence count. This ranker counts the keyword occurrences in each field, multiplies each count by the field weight, and sums the products to get the final result.

SPH_RANK_PROXIMITY: added in version 0.9.9-rc1. Returns the raw phrase proximity value as the result. This mode is used internally to emulate SPH_MATCH_ALL queries.

SPH_RANK_MATCHANY: added in version 0.9.9-rc1. Returns the rank as it was previously computed in SPH_MATCH_ANY mode. This mode is used internally to emulate SPH_MATCH_ANY queries.

SPH_RANK_FIELDMASK: added in version 0.9.9-rc2. Returns a 32-bit mask in which the Nth bit corresponds to the Nth full-text field, counting from 0. A bit is set to 1 only if the corresponding field contains keyword occurrences that satisfy the query.
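For SPH_RANK_FIELDMASK, the returned weight has to be decoded bit by bit. The sketch below shows one way to do that on the client side. The helper name matchedFields is hypothetical, and the input is assumed to be the weight value of a single match.

```php
<?php
// Hypothetical helper: decodes a SPH_RANK_FIELDMASK weight into the list
// of zero-based field indexes whose bits are set.
function matchedFields($mask)
{
    $fields = array();
    for ($n = 0; $n < 32; $n++) {
        if (($mask >> $n) & 1) {
            $fields[] = $n;
        }
    }
    return $fields;
}

print_r(matchedFields(5)); // bits 0 and 2 are set, so fields 0 and 2 matched
```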

6.3.3. SetSortMode (Set sorting mode)

prototype: function SetSortMode ($mode, $sortby = "")

Sets the match sorting mode, as described in Section 4.5, "Sorting modes". The parameter must be a constant corresponding to one of the known modes.

Warning: (PHP only) sorting mode constants must not be enclosed in quotes, because that passes a string instead of a constant:

$cl->SetSortMode ( "SPH_SORT_ATTR_DESC" ); // INCORRECT! Will not work as expected
$cl->SetSortMode ( SPH_SORT_ATTR_ASC ); // correct, works OK

6.3.4. SetWeights (Set weights)

prototype: function SetWeights ($weights)



Sets per-field weights in the order in which the fields appear in the index. Deprecated; use SetFieldWeights() instead.

6.3.5. SetFieldWeights (Set field weights)

prototype: function SetFieldWeights ($weights)

Sets field weights by field name. The parameter must be a hash (associative array) that maps field name strings to integer weights.

Field weights affect match ranking. Section 4.4, "Weight calculation" explains how phrase proximity affects ranking. This call lets you specify non-default weights for individual full-text fields.

The weights must be positive 32-bit integers. The final weight is also a 32-bit integer. The default weight is 1. Unknown field names are ignored.

There is currently no enforced upper limit on weights. Be aware, however, that setting weights too high can cause 32-bit integer overflow. For example, if you set a weight of 10,000,000 and search in extended mode, the maximum possible weight equals 10 million (your weight) multiplied by 1,000 (the internal BM25 scale factor, see Section 4.4, "Weight calculation") multiplied by 1 or more (the phrase proximity rank). The result is at least 10 billion, which does not fit in a 32-bit integer and leads to unexpected results.
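The overflow arithmetic above can be checked before calling SetFieldWeights(). The sketch below is only a rough upper-bound estimate. The helper name fitsInInt32 is hypothetical, the scale factor of 1000 is inferred from the "10 billion" figure in the text, and a signed 32-bit range is assumed.

```php
<?php
// Rough upper-bound estimate following the arithmetic in the text:
// field weight * BM25 scale factor * phrase proximity rank.
// Assumptions: scale factor 1000, signed 32-bit limit.
function fitsInInt32($fieldWeight, $bm25Scale = 1000, $proximityRank = 1)
{
    $int32Max = 2147483647;
    // Use floating point so the check itself cannot overflow on 32-bit PHP builds.
    $estimate = (float) $fieldWeight * $bm25Scale * $proximityRank;
    return $estimate <= $int32Max;
}

var_dump(fitsInInt32(10000000)); // bool(false): 10,000,000 * 1000 exceeds 2^31 - 1
var_dump(fitsInInt32(1000));     // bool(true)
```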

6.3.6. SetIndexWeights (Set index weights)

prototype: function SetIndexWeights ($weights)

Sets per-index weights and enables weighted summing of match weights across different indexes. The parameter must be a hash (associative array) that maps index name strings to integer weights. The default is an empty array, which disables weighted summing.

When the same document ID is matched in several local indexes, Sphinx by default selects the match from the index specified last in the query. This is to support partially overlapping index partitions.

In some cases, however, the indexes are not just partitions, and you may want to sum the weights across indexes instead of picking one. SetIndexWeights() lets you do that. With summing enabled, the final match weight is computed as the sum of the match weight from each index multiplied by that index's weight, as specified in this call. That is, if document 123 is found in index A with a weight of 2 and in index B with a weight of 3, and you call SetIndexWeights ( array ( "A"=>100, "B"=>10 ) ), then the final weight of document 123 returned to the client is 2*100+3*10 = 230.
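The weighted sum in the example above can be reproduced with a few lines of PHP. searchd performs this computation internally; the function below only mirrors the arithmetic. The name combinedWeight and the fallback weight of 1 for unlisted indexes are assumptions of this sketch.

```php
<?php
// Hypothetical illustration of the weighted sum described above.
// $matchWeights: index name => weight of the document in that index.
// $indexWeights: index name => weight passed to SetIndexWeights().
function combinedWeight(array $matchWeights, array $indexWeights)
{
    $total = 0;
    foreach ($matchWeights as $index => $weight) {
        // Assumption for this sketch: indexes without an explicit weight count as 1.
        $factor = isset($indexWeights[$index]) ? $indexWeights[$index] : 1;
        $total += $weight * $factor;
    }
    return $total;
}

echo combinedWeight(array("A" => 2, "B" => 3), array("A" => 100, "B" => 10)), "\n"; // 230
```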

6.4. Result set filter settings

6.4.1. SetIDRange (Set query ID range)

prototype: function SetIDRange ($min, $max)

