Type: Data Source Type
Data source type. Required. No default value. Available types includemysql
,pgsql
,mssql
,xmlpipe
Andxmlpipe2
,odbc
, Andpython
.
SQL _host: Database Server
The address of the SQL server host to connect. Required. No default value. Only for SQL data sources (mysql
,pgsql
,mssql
.
In the simplest case, sphenders and MySQL or PostgreSQL servers are installed on the same host. In this case, you only need to set them to localhost. Note: The MYSQL client database determines whether to connect to the server through TCP/IP or UNIX socket based on the host name. Generally, "localhost" forces the use of UNIX socket connections (this is the default and recommended mode), while "127.0.01" forces the use of TCP/IP. For details, see MySQL
Manual
SQL _port: Database Port
IP port of the SQL Server to be connected. Optional. Default Value:mysql
Port 3306,pgsql
Port 5432. Only applicable to SQL data sources (mysql
,pgsql
,mssql
). Note: whether this option is actually used depends on the SQL _host option.
SQL _pass: Database Password
The SQL user password used to connect to SQL _host. Required. No default value. Only applicable to SQL data sources (mysql
,pgsql
,mssql
).
SQL _db: Database Name
The SQL database used after connecting to the SQL data source. Subsequent queries are performed on this database. Required. No default value. Only applicable to SQL data sources (mysql
,pgsql
,mssql
).
SQL _sock: Database socket File
The UNIX socket name used to connect to the local SQL Server. Optional. The default value is null (use the default setting of the client library ). Only applicable to SQL data sources (mysql
,pgsql
,mssql
).
In Linux/var/lib/mysql/mysql.sock
. On FreeBSD/tmp/mysql.sock
. Check whether the dependency and SQL _host settings are used.
Example:
sql_sock = /tmp/mysql.sock
11.1.8. mysql_connect_flags: MySQL connection Parameters
Connection flags ). Optional. The default value is 0 (no flag is set ). Applicable onlymysql
Data source.
This option must contain the integer value obtained from the addition of each sign. This integer is passed to mysql_real_connect () as is (). The available flag is listed in mysql_com.h. The following lists several index-related flags and their values:
- Client_compress = 32; allow the use of the compression protocol Protocol
- Client_ssl = 2048; Switch to SSL after HandShaking
- Client_secure_connection = 32768; new MySQL 4.1 Identity Authentication
For example, flag 2080 (2048 + 32) indicates that both compression and SSL are used, and 32768 indicates that only new authentication is used. At first, this option was usedindexer
Andmysql
The compression protocol is used for different hosts. Although the network bandwidth consumption is reduced, enabling compression on a 1 Gbps link may deteriorate the index time theoretically or practically. However, enabling compression on a 30% Mbps connection may significantly improve the index time (it is reported that the total index time is reduced by 20 ). Depending on your network connection, the improvements you have made may vary.
Example:
Mysql_connect_flags = 32 # enable Compression
Mysql_ssl_cert, mysql_ssl_key, and mysql_ssl_ca: the ssl authentication option used by MySQL to connect to the MySQL server. Optional. The default value is an empty string (that is, SSL authentication is not used ). Applicable only
mysql
Data source.
These commands are used inindexer
Establish a secure SSL connection with MySQL. For more information about how to establish an authentication mechanism and set up a MySQL server, see the MySQL documentation.
Example:
mysql_ssl_cert = /etc/ssl/client-cert.pemmysql_ssl_key = /etc/ssl/client-key.pemmysql_ssl_ca = /etc/ssl/cacert.pem
Odbc_dsn: ODBC connection string (DSN)
The odbc dsn to connect. Required. No default value. Applicable onlyodbc
Data source.
Dbc dsn (Data Source Name) specifies the authentication options (host address, user name, password, etc.) used to connect to the ODBC data source ). The specific format is related to the specific ODBC driver.
Example:
odbc_dsn = Driver={Oracle ODBC Driver};Dbq=myDBName;Uid=myUsername;Pwd=myPassword
SQL _query_pre: Query before the index data is obtained
Pre-Fetch query, or pre-query ). Multi-value option. Optional. The default value is an empty query list. Only applicable to SQL data sources (mysql
,pgsql
,mssql
).
Multi-value indicates that you can set this command multiple times to specify multiple pre-queries. They are executed before index data is obtained to query SQL _query, and are executed strictly in the order that appears in the configuration file. The pre-query results are ignored.
Pre-query is often useful. They are used to set character set encoding, Mark records to be indexed, update internal counters, and set SQL server connection options and variables.
Perhaps the most common application of pre-query is to specify the character encoding used by the server to return rows. ThisRequiredThe encoding is the same as that expected by sphinx (set in the charset_type and charset_table options ). The following are two mysql-related configuration examples:
sql_query_pre = SET CHARACTER_SET_RESULTS=cp1251sql_query_pre = SET NAMES utf8
For MySQL data sources, it is useful to disable query cache (only for indexer connections) in the pre-query, because index queries usually run again frequently, buffering their results makes no sense. This can be implemented as follows:
sql_query_pre = SET SESSION query_cache_type=OFF
Example:
sql_query_pre = SET NAMES utf8sql_query_pre = SET SESSION query_cache_type=OFF
SQL _query: obtain the data to be indexed.
Obtain the master query of the document (data) to be indexed. Required. No Default options. Only applicable to SQL data sources (mysql
,pgsql
,mssql
).
Only one primary query is allowed. It is used to obtain documents (document list) from the SQL server ). You can specify up to 32 full-text data fields (strictly speaking, sph_max_fields defined in sphsf-. H) and any number of attributes. All data in neither the Document ID (first column) nor attribute columns will be used to create a full-text index.
Document IDRequiredIs the first column, andMust be a unique positive integer (neither 0 nor negative)It can be either 32-bit or 64-bit. This depends on how sphenders are built. By default, the Document ID is 32-bit,configure
Specify--enable-id64
Supports 64-bit Document ID and Word ID.
Example:
sql_query = \SELECT id, group_id, UNIX_TIMESTAMP(date_added) AS date_added, \title, content \FROM documents
11.1.13. SQL _joined_field: SQL connection field settings
Query the connection/payload fields. Multi-value option. Optional. The default value is null. Valid only for SQL data sources (mysql
,pgsql
,mssql
).
sql_joined_field
Two different methods are provided: Connection field or payload (payload field ). The syntax format is as follows:
sql_joined_field = FIELD-NAME 'from' ( 'query' | 'payload-query' ); \ QUERY [ ; RANGE-QUERY ]
Where
- Field-name is the name of the Connection/payload field;
- Query is an SQL query used to obtain data from the index.
- Range-query is an optional SQL query used to obtain a range-aware index. (version 2.0.1-beta is added .)
Joined FieldsLet you avoid join and/or group_concat statements in the main document fetch query (SQL _query ). this can be useful when SQL-side join is slow, or needs to be offloaded on sphsf-side, or simply
Emulate mysql-specific group_concat funcionality in case your database server does not support it.
The query must return exactly 2 columns: Document ID, and text to append to a joined field. Document IDS can be duplicate, but theyMustBe in ascending order. All the text rows fetched for a given ID will be concatented
Together, And the concatenation result will be indexed as the entire contents of a joined field. rows will be concatenated in the order returned from the query, and separating whitespace will be inserted between them. for instance, if joined field query returns
The following rows:
( 1, 'red' )( 1, 'right' )( 1, 'hand' )( 2, 'mysql' )( 2, 'sphinx' )
Then the indexing results wocould be equivalent to that of adding a new text field with a value of 'red right hand' to document 1 and 'mysql sphs' to document 2.
Joined fields are only indexed differently. There are no other differences between joined fields and regular text fields.
Starting with 2.0.1-Beta,Ranged queriesCan be used when a single query is not efficient enough or does not work because of the database driver limitations. It works similar to the ranged queries in the main Indexing
Loop, see section 3.7 "section query". The range will be queried for and fetched upfront once, then multiple queries with different$start
And$end
Substitutions
Will be run to fetch the actual data.
PayloadsLet you create a special field in which, instead of keyword positions, so-called user payloads are stored. payloads are m integer values attached to every keyword. They can then be used in search time
To affect the ranking.
The payload query must return exactly 3 columns: Document ID; keyword; and integer payload value. Document IDS can be duplicate, but theyMustBe in ascending order. payloads must be unsigned integers within 24-bit
Range, ie. from 0 to 16777215. For reference, payloads are currently internally stored as in-field keyword positions, but that is not guaranteed and might change in the future.
Currently, the only method to account for payloads is to use sph_rank_proximity_bm25 ranker. on indexes with payload fields, it will automatically switch to a variant that matches keywords in those fields, computes a sum of matched payloads multiplied
Field wieghts, and adds that sum to the final rank.
Example:
sql_joined_field = \tagstext from query; \SELECT docid, CONCAT('tag',tagid) FROM tags ORDER BY docid ASC