[Coreseek/Sphinx Learning Note 5] General API

[For details, see the Coreseek Full-Text Search Server 2.0 (Sphinx 0.9.8) reference manual: http://www.coreseek.cn/docs/sphinx_doc_zhcn_0.9.pdf]

function GetLastError ()
Returns the most recent error description in human-readable form. Returns an empty string if the previous API call produced no error. This function does not reset the error description, so it can be called multiple times if necessary.

function GetLastWarning ()
Returns the most recent warning description in human-readable form. Returns an empty string if the previous API call produced no warning. This function does not reset the warning description, so it can be called multiple times if necessary.

function SetServer ($host, $port)
Sets the host name and TCP port of searchd. All subsequent requests use the new host and port settings. The default host and port are "localhost" and 3312, respectively.

function SetRetries ($count, $delay = 0)
Sets the retry count and the delay between retries for distributed searches. On a transient failure, searchd retries up to $count times for each agent. $delay is the time to wait between two retries, in milliseconds. Retries are disabled by default.
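The retry behavior described above can be sketched in plain Python. This is only a model under stated assumptions: `call_with_retries` and `make_flaky_agent` are hypothetical helpers, a transient failure is represented by `ConnectionError`, and nothing here talks to a real searchd.

```python
import time

def call_with_retries(agent_call, count=0, delay_ms=0):
    """Call agent_call(); on a transient failure retry up to `count` times."""
    retries_left = count
    while True:
        try:
            return agent_call()
        except ConnectionError:
            if retries_left == 0:
                raise  # retries disabled (count=0) or exhausted
            retries_left -= 1
            time.sleep(delay_ms / 1000.0)  # delay is given in milliseconds

def make_flaky_agent(failures):
    """Hypothetical agent that fails `failures` times, then succeeds."""
    state = {"calls": 0}
    def agent():
        state["calls"] += 1
        if state["calls"] <= failures:
            raise ConnectionError("transient failure")
        return "ok after %d calls" % state["calls"]
    return agent

print(call_with_retries(make_flaky_agent(2), count=2, delay_ms=10))
```

With the default `count=0` the first failure is reported immediately, which corresponds to retries being disabled.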

function SetArrayResult ($arrayresult)
PHP only. Controls the format of the returned matches (as an array or as a hash).
The $arrayresult parameter should be a Boolean. If $arrayresult is false (the default), matches are returned as a PHP hash, with the document ID as the key and the other information (weight, attributes) as the value. If $arrayresult is true, matches are returned as a plain array in which each entry holds all the information about a match (including the document ID).
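The two return formats can be pictured with a small Python model, where a dict stands in for the PHP hash and a list for the plain array. `format_matches` is a hypothetical helper, and the entry keys "id", "weight", and "attrs" are assumptions of this sketch.

```python
def format_matches(matches, arrayresult=False):
    """matches: list of (doc_id, weight, attrs) tuples in result order."""
    if arrayresult:
        # Plain array: every entry carries its own document ID.
        return [{"id": doc_id, "weight": weight, "attrs": attrs}
                for doc_id, weight, attrs in matches]
    # Hash: the document ID is the key, the rest is the value.
    return {doc_id: {"weight": weight, "attrs": attrs}
            for doc_id, weight, attrs in matches}

sample = [(7, 2, {"group_id": 1}), (3, 1, {"group_id": 4})]
as_hash = format_matches(sample)
as_array = format_matches(sample, arrayresult=True)
print(as_hash[7]["weight"], as_array[0]["id"])
```

The array form is the one to pick when the order of matches must survive or when the same document ID can appear more than once, for example in grouped results.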

function SetLimits ($offset, $limit, $max_matches = 0, $cutoff = 0)
Sets the offset ($offset) into the server-side result set and the number of matches ($limit) to return to the client starting from that offset. It can also set the maximum size of the server-side result set for the current query ($max_matches) and a threshold ($cutoff) at which the search stops once that many matches have been found. All of these parameters must be non-negative integers.
The first two parameters behave the same as the arguments of the MySQL LIMIT clause. The defaults for $offset and $limit are 0 and 20 respectively, so the first 20 matches are returned.
The max_matches setting controls how many matches searchd keeps in memory while searching. All matching documents are still processed, ranked, filtered, and sorted even if max_matches is set to 1. There is also a server-wide limit, controlled by the max_matches setting in the configuration file. To prevent memory abuse, the server does not allow the per-query limit to be set higher than the per-server limit.
The client cannot receive more than max_matches matches. The default limit is 1000, and you should not normally need to raise it: 1000 records are enough to present to an end user. If you intend to pull the results into the application for further sorting or filtering, note that doing this work on the Sphinx side is much more efficient.
The $cutoff setting is intended for advanced performance tuning. It tells searchd to forcibly stop the search once $cutoff matches have been found and processed.
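How the four parameters interact can be sketched with a pure-Python model. It is an illustration under stated assumptions: `apply_limits` is a hypothetical helper, matches are sorted by weight only, and a `max_matches` default of 1000 stands in for the server default.

```python
def apply_limits(weighted_docs, offset=0, limit=20, max_matches=1000, cutoff=0):
    """weighted_docs: iterable of (doc_id, weight) in the order they are found."""
    found = []
    for doc in weighted_docs:
        found.append(doc)
        if cutoff and len(found) >= cutoff:
            break  # forced stop once `cutoff` matches were found and processed
    # Server-side result set: only the best `max_matches` matches are kept.
    best = sorted(found, key=lambda d: d[1], reverse=True)[:max_matches]
    # offset/limit then behave like "LIMIT offset, limit" in MySQL.
    return best[offset:offset + limit]

docs = [(1, 10), (2, 30), (3, 20), (4, 40), (5, 5)]
print(apply_limits(docs, offset=0, limit=2))                 # two best matches
print(apply_limits(docs, offset=1, limit=5, max_matches=2))  # capped by max_matches
print(apply_limits(docs, cutoff=3))                          # only the first 3 found
```

In the second call the client asks for five matches but receives one, because the server-side result set holds only two matches and the offset skips the first.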

function SetMaxQueryTime ($max_query_time)
Sets the maximum search time, in milliseconds. The parameter must be a non-negative integer. The default value is 0, which means no limit.
This setting is similar to $cutoff in SetLimits (), but it limits the elapsed query time rather than the number of matches processed. Once processing has taken too long, the local search query is stopped. Note that if a search queries several local indexes, the limit applies to each index separately.

function SetMatchMode ($mode)
Sets the matching mode for full-text queries; see section 4.1, "Matching modes." The parameter must be a constant corresponding to one of the known modes.
Warning: (PHP only) matching mode constants must not be enclosed in quotation marks; quoting yields a string instead of a constant.

function SetRankingMode ($ranker)
Sets the ranking mode. Currently available only in the SPH_MATCH_EXTENDED2 matching mode. The parameter must be a constant corresponding to one of the known modes.
By default, Sphinx computes two factors that contribute to the final match weight. The major part is the proximity of the query phrase to the document text. The minor part is a statistical function called BM25, which varies from 0 to 1 depending on the keyword frequency within the document (more occurrences yield a higher weight) and within the whole index (rarer keywords yield a higher weight).
The modes implemented so far are:
SPH_RANK_PROXIMITY_BM25, the default mode, which computes both phrase proximity and BM25 and combines the two.
SPH_RANK_BM25, a statistical ranking mode that uses BM25 only (as most full-text search engines do). This mode is faster, but it may lower the result quality for queries that contain more than one keyword.
SPH_RANK_NONE, which disables ranking and is the fastest mode. It is effectively the same as a Boolean search. All matches are given a weight of 1.

function SetSortMode ($mode, $sortby = "")
Sets the sorting mode for matches. The $mode parameter must be a constant corresponding to one of the known modes.
Warning: (PHP only) sorting mode constants must not be enclosed in quotation marks; quoting yields a string instead of a constant.

function SetWeights ($weights)
Sets per-field weights in the order in which the fields appear in the index. Deprecated; use SetFieldWeights () instead.

function SetFieldWeights ($weights)
Sets field weights by field name. The argument must be a hash (associative array) that maps field names (strings) to integer weights.
Field weights affect the ranking of matches. This call specifies non-default weights for individual full-text fields. Each weight must be a positive 32-bit integer, and the final weight is also a 32-bit integer. The default weight is 1. Unknown field names are ignored. There is currently no enforced maximum weight, but be aware that setting a weight too high may cause a 32-bit integer overflow.

function SetIndexWeights ($weights)
Sets per-index weights and enables weighted summing of match weights across different indexes. The parameter must be a hash (associative array) that maps index names (strings) to integer weights. The default is an empty array, which disables weighted summing.

function SetIDRange ($min, $max)
Sets the accepted range of document IDs. Both parameters must be integers. The defaults are 0 and 0, which means no restriction. After this call, only documents with IDs between $min and $max (inclusive) are matched.

function SetFilter ($attribute, $values, $exclude = false)
Adds an integer value filter. This call adds a new filter to the list of existing filters.
$attribute is the attribute name. $values is an array of integers. $exclude is a Boolean that controls whether matching documents are accepted (the default, when $exclude is false) or rejected. A document is matched (or rejected, if $exclude is true) only if the value of the $attribute column in the index equals any of the values in $values.
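The accept/reject logic of a value filter can be modeled in a few lines of Python. `value_filter` and the sample documents are hypothetical and only mirror the rule stated above; they are not part of the client API.

```python
def value_filter(attribute, values, exclude=False):
    """Build a predicate that models an integer value filter."""
    allowed = set(values)
    def accept(doc):
        hit = doc[attribute] in allowed
        # exclude=False keeps hits, exclude=True keeps everything else.
        return hit != exclude
    return accept

docs = [
    {"id": 1, "group_id": 2},
    {"id": 2, "group_id": 5},
    {"id": 3, "group_id": 7},
]
keep = value_filter("group_id", [2, 7])
drop = value_filter("group_id", [2, 7], exclude=True)
kept_ids = [d["id"] for d in docs if keep(d)]
dropped_ids = [d["id"] for d in docs if drop(d)]
print(kept_ids, dropped_ids)
```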

function SetFilterRange ($attribute, $min, $max, $exclude = false)
Adds a new integer range filter. This call adds a new filter to the list of existing filters.
$attribute is the attribute name. $min and $max define a closed integer interval. $exclude is a Boolean that controls whether matching documents are accepted (the default, when $exclude is false) or rejected.
A document is matched (or rejected, if $exclude is true) only if the value of the $attribute column in the index falls between $min and $max (inclusive).
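A range filter, and the effect of stacking several filters, can be sketched as follows. `range_filter` is a hypothetical helper; the assumption that every filter in the list must accept a document (logical AND) is labeled as such and is not spelled out in the text above.

```python
def range_filter(attribute, lo, hi, exclude=False):
    """Build a predicate for the closed interval [lo, hi]."""
    def accept(doc):
        inside = lo <= doc[attribute] <= hi
        return inside != exclude
    return accept

docs = [
    {"id": 1, "price": 10, "year": 1999},
    {"id": 2, "price": 20, "year": 2003},
    {"id": 3, "price": 21, "year": 1990},
]
# Assumption: all filters in the list must accept the document.
filters = [
    range_filter("price", 10, 20),                    # both bounds included
    range_filter("year", 2000, 2005, exclude=True),   # reject 2000..2005
]
matched_ids = [d["id"] for d in docs if all(f(d) for f in filters)]
print(matched_ids)
```

The same predicate works for floating-point bounds, since the comparison is identical for a floating-point closed interval.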

function SetFilterFloatRange ($attribute, $min, $max, $exclude = false)
Adds a new floating-point range filter. This call adds a new filter to the list of existing filters.
$attribute is the attribute name. $min and $max define a closed floating-point interval. $exclude must be a Boolean that controls whether matching documents are accepted (the default, when $exclude is false) or rejected. A document is matched (or rejected, if $exclude is true) only if the value of the $attribute column in the index falls between $min and $max (inclusive).

function SetGeoAnchor ($attrlat, $attrlong, $lat, $long)
Sets the anchor point for geographic (surface) distance calculations and enables them.
$attrlat and $attrlong are strings that name the attributes holding latitude and longitude, respectively. $lat and $long are floating-point values that specify the latitude and longitude of the anchor point, in radians.
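To give a feel for what a surface distance from an anchor point involves, here is a minimal sketch using the haversine great-circle formula. This is an assumption for illustration: the text above does not say which formula searchd uses, and `great_circle_distance`, the Earth radius constant, and the sample coordinates are all hypothetical. The helper works in radians, so coordinates held in degrees are converted first.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in meters (assumed value)

def great_circle_distance(lat1, lon1, lat2, lon2):
    """Haversine distance in meters; all four inputs are in radians."""
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

# Hypothetical anchor and document position, given in degrees.
anchor = (math.radians(48.85), math.radians(2.35))
document = (math.radians(51.51), math.radians(-0.13))
meters = great_circle_distance(anchor[0], anchor[1], document[0], document[1])
print(round(meters / 1000.0), "km")
```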

function SetGroupBy ($attribute, $func, $groupsort = "@group desc")
Sets the grouping attribute, the grouping function, and the sort order between groups, and enables grouping.
$attribute is a string that contains the name of the attribute to group by.
$func is a constant that selects the built-in function applied to the value of the grouping attribute. The currently available values are SPH_GROUPBY_DAY, SPH_GROUPBY_WEEK, SPH_GROUPBY_MONTH, SPH_GROUPBY_YEAR, and SPH_GROUPBY_ATTR.
$groupsort is a clause that controls how the groups are sorted. Its syntax is similar to that described in section 4.5, "SPH_SORT_EXTENDED mode". Grouping is essentially the same as the GROUP BY clause in SQL.

function SetGroupDistinct ($attribute)
Sets the name of the attribute whose distinct values are counted within each group. Valid only for grouping queries.
$attribute is a string that contains the attribute name. The values of this attribute are stored for each group (as long as memory permits), and the number of distinct values in each group is then computed and returned to the client. This feature is similar to the COUNT (DISTINCT) clause in standard SQL.
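The COUNT (DISTINCT) analogy can be made concrete with a small Python model of grouping plus a per-group distinct count. `group_with_distinct` and the result keys "count" and "distinct" are hypothetical names chosen for this sketch.

```python
from collections import defaultdict

def group_with_distinct(docs, group_attr, distinct_attr):
    """Per group: number of matches and number of distinct attribute values."""
    counts = defaultdict(int)
    seen = defaultdict(set)
    for doc in docs:
        key = doc[group_attr]
        counts[key] += 1
        seen[key].add(doc[distinct_attr])  # values are stored per group
    return {key: {"count": counts[key], "distinct": len(seen[key])}
            for key in counts}

docs = [
    {"vendor": "a", "color": "red"},
    {"vendor": "a", "color": "red"},
    {"vendor": "a", "color": "blue"},
    {"vendor": "b", "color": "red"},
]
stats = group_with_distinct(docs, "vendor", "color")
print(stats)
```

The per-group sets are what makes the feature memory-bound: every distinct value seen in a group has to be remembered until the count is final.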

function Query ($query, $index = "*")
Connects to the searchd server, runs the given query with the current settings, then obtains and returns the result set.
$query is the query string; $index is a string containing one or more index names. On a general error, false is returned and the GetLastError () message is set. On success, the search result set is returned. The default value of $index is "*", which means all local indexes are queried. Index names may contain Latin letters (a-z), digits (0-9), minus (-), and underscore (_); all other characters are treated as separators.
The result set is a hash (PHP only; other language APIs may use other data structures) with the following keys and values:
"matches": a hash that maps each document ID to a hash containing the document weight and attribute values (or an array of such hashes if SetArrayResult () is enabled).
"total": the number of matching documents retrieved by this query on the server (that is, the size of the server-side result set). This is the upper limit on the number of matches that can be fetched from the server for this query with the current settings.
"total_found": the total number of matching documents in the index (found and processed on the server).
"words": a hash that maps each query keyword (after case folding, stemming, and other processing) to a small hash with statistics about that keyword ("docs": the number of documents it occurs in; "hits": the total number of occurrences).
"error": the error message reported by searchd (a human-readable string). An empty string if there was no error.
"warning": the warning message reported by searchd (a human-readable string). An empty string if there was no warning.

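The shape of the Query () result set described above can be pictured as a nested structure. The following Python sketch uses a hand-written sample, not output from a real server; the lowercase key spellings, the "weight" and "attrs" entry keys, and the `describe` helper are assumptions of this sketch.

```python
sample_result = {
    "matches": {
        101: {"weight": 3, "attrs": {"group_id": 2}},
        205: {"weight": 1, "attrs": {"group_id": 5}},
    },
    "total": 2,
    "total_found": 2,
    "words": {"shoe": {"docs": 2, "hits": 3}},
    "error": "",
    "warning": "",
}

def describe(result):
    """Turn a result set (or False on a general error) into report lines."""
    if result is False:
        return ["query failed"]
    lines = ["%d of %d matches" % (result["total"], result["total_found"])]
    for doc_id, match in result["matches"].items():
        lines.append("doc %d weight %d" % (doc_id, match["weight"]))
    for word, stats in result["words"].items():
        lines.append("%s: %d docs, %d hits" % (word, stats["docs"], stats["hits"]))
    return lines

for line in describe(sample_result):
    print(line)
```
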
function AddQuery ($query, $index = "*")
Adds a query to the batch. $query is the query string. $index is a string that contains one or more index names. Returns the index of this query's result in the array returned by RunQueries ().

function RunQueries ()
Connects to searchd, runs all queries added by AddQuery (), then obtains and returns their result sets. On a general error, such as a network I/O failure, returns false and sets the GetLastError () message. On success, returns a plain array of result sets.
Each result set in the array is exactly the same as the result set returned by Query ().

function ResetFilters ()
Clears all currently set filters. This call is typically needed only when using batch queries. To apply different filters to different queries in a batch, call ResetFilters () and then add the new filters with the usual calls.
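The interplay of batch queries and filter resets can be modeled with a small stand-in class. This is a sketch, not the client: `BatchModel` is hypothetical, it never contacts searchd, and it assumes that each queued query keeps a snapshot of the filters that were active when it was added.

```python
import copy

class BatchModel:
    """Toy model of a client that queues queries with their filter state."""

    def __init__(self):
        self.filters = []
        self.queue = []

    def set_filter(self, attribute, values):
        self.filters.append((attribute, list(values)))

    def reset_filters(self):
        self.filters = []

    def add_query(self, query, index="*"):
        # Assumption: the current filters are copied along with the query.
        self.queue.append({"query": query, "index": index,
                           "filters": copy.deepcopy(self.filters)})
        return len(self.queue) - 1  # position of this query's result

    def run_queries(self):
        # A real client would send the batch; here the queue is echoed back.
        results, self.queue = self.queue, []
        return results

client = BatchModel()
client.set_filter("group_id", [1])
first = client.add_query("shoes")
client.reset_filters()            # without this, both filters would apply
client.set_filter("group_id", [2])
second = client.add_query("boots")
results = client.run_queries()
print(results[first]["filters"], results[second]["filters"])
```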

function ResetGroupBy ()
Clears all current grouping settings and disables grouping. This call is typically needed only when using batch queries. Individual grouping settings can be changed with SetGroupBy () and SetGroupDistinct (), but those calls cannot turn grouping off. ResetGroupBy () fully resets the previous grouping settings and disables grouping mode in the current state, so that subsequent AddQuery () calls can perform non-grouping searches.

function BuildExcerpts ($docs, $index, $words, $opts = array ())
Produces document excerpts (snippets). Connects to searchd, asks it to generate excerpts from the given documents, and returns the results.
$docs is an array of strings holding the document contents. $index is a string containing an index name; the settings of that index (such as character set, morphology, and word forms) are used.
$words is a string containing the keywords to highlight. They are processed according to the index settings. For example, if English stemming is enabled in the index, the word "shoes" is highlighted even if the keyword is "shoe".
$opts is a hash with additional optional highlighting parameters:
"before_match": the string inserted before a matched keyword. The default is "<b>".
"after_match": the string inserted after a matched keyword. The default is "</b>".
"chunk_separator": the string inserted between excerpt chunks (passages). The default is " ... ".
"limit": the maximum size of the excerpt, in symbols (code points). An integer that defaults to 256.
"around": the number of words picked around each matched keyword block. An integer that defaults to 5.
"exact_phrase": whether to highlight only exact matches of the whole query phrase instead of individual keywords. A Boolean that defaults to false.
"single_passage": whether to extract only the single best passage. A Boolean that defaults to false.
Returns false on failure. On success, returns an array of excerpt strings.

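The effect of the match markers can be illustrated with a deliberately simplified Python model. `highlight` is a hypothetical helper that only wraps exact keyword matches; it applies no stemming, chunking, or length limit, so it covers a small part of what searchd does. The closing marker default "</b>" is an assumption of this sketch.

```python
import re

def highlight(doc, words, before_match="<b>", after_match="</b>"):
    """Wrap whole-word, case-insensitive keyword matches in markers."""
    keywords = [re.escape(w) for w in words.split()]
    if not keywords:
        return doc
    pattern = re.compile(r"\b(" + "|".join(keywords) + r")\b", re.IGNORECASE)
    return pattern.sub(lambda m: before_match + m.group(0) + after_match, doc)

print(highlight("Red shoe sale", "shoe"))
print(highlight("Red Shoe sale", "shoe sale", "[", "]"))
```

Because this model does not stem, "shoes" is left untouched for the keyword "shoe", which is exactly the case where the index morphology settings make a difference on the server.
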
function UpdateAttributes ($index, $attrs, $values)
Instantly updates the given attribute values in the given documents. On success, returns the number of documents actually updated (0 or more); on failure, returns -1.
$index is the name of the index (or indexes) to update. Like the $index parameter of Query (), it can be either a single index name or a list of names. The list may include distributed indexes (updates are propagated to all agents).
$attrs is an array of attribute name strings; the listed attributes are updated.
$values is a hash whose keys are document IDs and whose values are plain arrays of new attribute values.
Updates work only with the docinfo=extern storage strategy. They are very fast because they are performed entirely in memory. They can also be made persistent: updates are written to disk when searchd shuts down cleanly (on receiving the SIGTERM signal).
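How $attrs and $values line up can be shown with an in-memory Python model. `update_attributes` and the `store` dict are hypothetical; the sketch assumes that each value array follows the order of the attribute names and that unknown document IDs are simply not counted.

```python
def update_attributes(store, attrs, values):
    """store: {doc_id: {attr: value}}. Returns the number of updated docs."""
    updated = 0
    for doc_id, new_values in values.items():
        if doc_id not in store:
            continue  # unknown document: nothing to update
        if len(new_values) != len(attrs):
            raise ValueError("expected one value per attribute")
        # The i-th value belongs to the i-th attribute name.
        for name, value in zip(attrs, new_values):
            store[doc_id][name] = value
        updated += 1
    return updated

store = {
    1: {"group_id": 2, "price": 100},
    2: {"group_id": 5, "price": 250},
}
count = update_attributes(store, ["group_id", "price"],
                          {1: [3, 120], 99: [1, 1]})
print(count, store[1])
```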


Article Source: http://my.oschina.net/wzwitblog/blog/109999
