Example: sphinx.conf fragment:
...
sql_query = SELECT id, title, content, author_id, forum_id, post_date FROM my_forum_posts
sql_attr_uint = author_id
sql_attr_uint = forum_id
sql_attr_timestamp = post_date
...
Example: Application code (PHP):
// only search posts by the author whose ID is 123
$cl->SetFilter ( "author_id", array ( 123 ) );

// only search posts in sub-forums 1, 3 and 7
$cl->SetFilter ( "forum_id", array ( 1, 3, 7 ) );

// sort found posts by posting date in descending order
$cl->SetSortMode ( SPH_SORT_ATTR_DESC, "post_date" );
Attributes are referred to by name, and attribute names are case-insensitive (note: Sphinx does not currently support Chinese attribute names). Attributes are not full-text indexed; they are stored in the index file as is.
Every document ID must be a unique, nonzero, unsigned integer (32-bit or 64-bit, depending on the options Sphinx was built with).
When building an index, Sphinx fetches the text documents from the specified data source, splits the text into words, and case-folds each word, so that "Abc", "ABC" and "abc" are all treated as the same word (or, to be pedantic, the same term).
In order to get the job done correctly, Sphinx needs to know:
- What encoding the source text is in;
- Which characters are letters and which are not;
- Which letters need to be folded, and what they are folded to.
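As a rough illustration of why these three pieces of information matter, here is a toy tokenizer in Python. It is not Sphinx's implementation; the function names and the folding table (digits, underscore, a-z, and A-Z folded to a-z) are assumptions made up for this sketch.

```python
# Toy illustration only; this is not how Sphinx is implemented internally.

def build_fold_map():
    """Hypothetical folding table: digits, underscore and a-z map to
    themselves, and A-Z map to a-z. Anything else is a non-letter."""
    fold = {}
    for ch in "0123456789_abcdefghijklmnopqrstuvwxyz":
        fold[ch] = ch
    for ch in "ABCDEFGHIJKLMNOPQRSTUVWXYZ":
        fold[ch] = ch.lower()
    return fold


def tokenize(raw_bytes, encoding, fold):
    """Decode the source bytes, fold every character found in the table,
    and treat every character missing from the table as a separator."""
    text = raw_bytes.decode(encoding)
    words, current = [], []
    for ch in text:
        if ch in fold:
            current.append(fold[ch])
        elif current:
            # A non-letter ends the word collected so far.
            words.append("".join(current))
            current = []
    if current:
        words.append("".join(current))
    return words


fold_map = build_fold_map()
print(tokenize(b"Abc ABC, abc-DEF_1!", "utf-8", fold_map))
# prints ['abc', 'abc', 'abc', 'def_1']
```

Note that a character missing from the table splits a word in two, which is why the table has to cover every letter of the languages being indexed.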
These settings are configured per index with the charset_type and charset_table options. charset_type specifies whether the document encoding is single-byte (SBCS) or UTF-8. In Coreseek, if Chinese word segmentation is enabled through charset_dictpath, the GBK and BIG5 encodings can also be used, although internally the text is still converted to UTF-8 before processing. charset_table specifies the table that maps letter characters to their case-folded versions. Characters that do not appear in this table are considered non-letters and are treated as word separators during both indexing and searching.
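As a sketch, an index section for UTF-8 source text might set the two options as shown here. The index name forum_posts is hypothetical, the other required index settings are omitted, and the fragment assumes a Sphinx version that still accepts charset_type. The table is the one the Sphinx documentation gives as the UTF-8 default (digits, underscore, and English and Russian letters).

```
index forum_posts
{
    # source, path and other required settings omitted
    charset_type  = utf-8
    charset_table = 0..9, A..Z->a..z, _, a..z, \
        U+410..U+42F->U+430..U+44F, U+430..U+44F
}
```

In this syntax, A..Z->a..z declares the uppercase letters and folds them to lowercase, while a plain range such as a..z declares characters that map to themselves.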
In Coreseek, when Chinese word segmentation is enabled, the system uses the code table built into MMSEG (it is hardcoded in the MMSEG program), so charset_table has no effect once segmentation is enabled.
Sphinx Simple Configuration