Sphtracing Configuration Analysis

Source: Internet
Author: User
Tags odbc connection ssl connection
Type: Data Source Type

Data source type. Required. No default value. Available types includemysql,pgsql,mssql,xmlpipeAndxmlpipe2,odbc, Andpython.

SQL _host: Database Server

The address of the SQL server host to connect. Required. No default value. Only for SQL data sources (mysql,pgsql,mssql.

In the simplest case, sphenders and MySQL or PostgreSQL servers are installed on the same host. In this case, you only need to set them to localhost. Note: The MYSQL client database determines whether to connect to the server through TCP/IP or UNIX socket based on the host name. Generally, "localhost" forces the use of UNIX socket connections (this is the default and recommended mode), while "127.0.01" forces the use of TCP/IP. For details, see MySQL
Manual

SQL _port: Database Port

IP port of the SQL Server to be connected. Optional. Default Value:mysqlPort 3306,pgsqlPort 5432. Only applicable to SQL data sources (mysql,pgsql,mssql). Note: whether this option is actually used depends on the SQL _host option.

SQL _pass: Database Password

The SQL user password used to connect to SQL _host. Required. No default value. Only applicable to SQL data sources (mysql,pgsql,mssql).

SQL _db: Database Name

The SQL database used after connecting to the SQL data source. Subsequent queries are performed on this database. Required. No default value. Only applicable to SQL data sources (mysql,pgsql,mssql).

SQL _sock: Database socket File

The UNIX socket name used to connect to the local SQL Server. Optional. The default value is null (use the default setting of the client library ). Only applicable to SQL data sources (mysql,pgsql,mssql).

In Linux/var/lib/mysql/mysql.sock. On FreeBSD/tmp/mysql.sock. Check whether the dependency and SQL _host settings are used.

Example:
sql_sock = /tmp/mysql.sock
11.1.8. mysql_connect_flags: MySQL connection Parameters

Connection flags ). Optional. The default value is 0 (no flag is set ). Applicable onlymysqlData source.

This option must contain the integer value obtained from the addition of each sign. This integer is passed to mysql_real_connect () as is (). The available flag is listed in mysql_com.h. The following lists several index-related flags and their values:

  • Client_compress = 32; allow the use of the compression protocol Protocol
  • Client_ssl = 2048; Switch to SSL after HandShaking
  • Client_secure_connection = 32768; new MySQL 4.1 Identity Authentication

For example, flag 2080 (2048 + 32) indicates that both compression and SSL are used, and 32768 indicates that only new authentication is used. At first, this option was usedindexerAndmysqlThe compression protocol is used for different hosts. Although the network bandwidth consumption is reduced, enabling compression on a 1 Gbps link may deteriorate the index time theoretically or practically. However, enabling compression on a 30% Mbps connection may significantly improve the index time (it is reported that the total index time is reduced by 20 ). Depending on your network connection, the improvements you have made may vary.

Example:
Mysql_connect_flags = 32 # enable Compression
Mysql_ssl_cert, mysql_ssl_key, and mysql_ssl_ca: the ssl authentication option used by MySQL to connect to the MySQL server. Optional. The default value is an empty string (that is, SSL authentication is not used ). Applicable only mysqlData source.

These commands are used inindexerEstablish a secure SSL connection with MySQL. For more information about how to establish an authentication mechanism and set up a MySQL server, see the MySQL documentation.

Example:
mysql_ssl_cert = /etc/ssl/client-cert.pemmysql_ssl_key = /etc/ssl/client-key.pemmysql_ssl_ca = /etc/ssl/cacert.pem
Odbc_dsn: ODBC connection string (DSN)

The odbc dsn to connect. Required. No default value. Applicable onlyodbcData source.

Dbc dsn (Data Source Name) specifies the authentication options (host address, user name, password, etc.) used to connect to the ODBC data source ). The specific format is related to the specific ODBC driver.

Example:
odbc_dsn = Driver={Oracle ODBC Driver};Dbq=myDBName;Uid=myUsername;Pwd=myPassword 
SQL _query_pre: Query before the index data is obtained

Pre-Fetch query, or pre-query ). Multi-value option. Optional. The default value is an empty query list. Only applicable to SQL data sources (mysql,pgsql,mssql).

Multi-value indicates that you can set this command multiple times to specify multiple pre-queries. They are executed before index data is obtained to query SQL _query, and are executed strictly in the order that appears in the configuration file. The pre-query results are ignored.

Pre-query is often useful. They are used to set character set encoding, Mark records to be indexed, update internal counters, and set SQL server connection options and variables.

Perhaps the most common application of pre-query is to specify the character encoding used by the server to return rows. ThisRequiredThe encoding is the same as that expected by sphinx (set in the charset_type and charset_table options ). The following are two mysql-related configuration examples:

sql_query_pre = SET CHARACTER_SET_RESULTS=cp1251sql_query_pre = SET NAMES utf8

For MySQL data sources, it is useful to disable query cache (only for indexer connections) in the pre-query, because index queries usually run again frequently, buffering their results makes no sense. This can be implemented as follows:

sql_query_pre = SET SESSION query_cache_type=OFF

Example:
sql_query_pre = SET NAMES utf8sql_query_pre = SET SESSION query_cache_type=OFF 
SQL _query: obtain the data to be indexed.

Obtain the master query of the document (data) to be indexed. Required. No Default options. Only applicable to SQL data sources (mysql,pgsql,mssql).

Only one primary query is allowed. It is used to obtain documents (document list) from the SQL server ). You can specify up to 32 full-text data fields (strictly speaking, sph_max_fields defined in sphsf-. H) and any number of attributes. All data in neither the Document ID (first column) nor attribute columns will be used to create a full-text index.

Document IDRequiredIs the first column, andMust be a unique positive integer (neither 0 nor negative)It can be either 32-bit or 64-bit. This depends on how sphenders are built. By default, the Document ID is 32-bit,configureSpecify--enable-id64Supports 64-bit Document ID and Word ID.

Example:
sql_query = \SELECT id, group_id, UNIX_TIMESTAMP(date_added) AS date_added, \title, content \FROM documents
11.1.13. SQL _joined_field: SQL connection field settings

Query the connection/payload fields. Multi-value option. Optional. The default value is null. Valid only for SQL data sources (mysql,pgsql,mssql).

sql_joined_fieldTwo different methods are provided: Connection field or payload (payload field ). The syntax format is as follows:

sql_joined_field = FIELD-NAME 'from'  ( 'query' | 'payload-query' ); \    QUERY [ ; RANGE-QUERY ]

Where

  • Field-name is the name of the Connection/payload field;
  • Query is an SQL query used to obtain data from the index.
  • Range-query is an optional SQL query used to obtain a range-aware index. (version 2.0.1-beta is added .)

Joined FieldsLet you avoid join and/or group_concat statements in the main document fetch query (SQL _query ). this can be useful when SQL-side join is slow, or needs to be offloaded on sphsf-side, or simply
Emulate mysql-specific group_concat funcionality in case your database server does not support it.

The query must return exactly 2 columns: Document ID, and text to append to a joined field. Document IDS can be duplicate, but theyMustBe in ascending order. All the text rows fetched for a given ID will be concatented
Together, And the concatenation result will be indexed as the entire contents of a joined field. rows will be concatenated in the order returned from the query, and separating whitespace will be inserted between them. for instance, if joined field query returns
The following rows:

( 1, 'red' )( 1, 'right' )( 1, 'hand' )( 2, 'mysql' )( 2, 'sphinx' )

Then the indexing results wocould be equivalent to that of adding a new text field with a value of 'red right hand' to document 1 and 'mysql sphs' to document 2.

Joined fields are only indexed differently. There are no other differences between joined fields and regular text fields.

Starting with 2.0.1-Beta,Ranged queriesCan be used when a single query is not efficient enough or does not work because of the database driver limitations. It works similar to the ranged queries in the main Indexing
Loop, see section 3.7 "section query". The range will be queried for and fetched upfront once, then multiple queries with different$startAnd$endSubstitutions
Will be run to fetch the actual data.

PayloadsLet you create a special field in which, instead of keyword positions, so-called user payloads are stored. payloads are m integer values attached to every keyword. They can then be used in search time
To affect the ranking.

The payload query must return exactly 3 columns: Document ID; keyword; and integer payload value. Document IDS can be duplicate, but theyMustBe in ascending order. payloads must be unsigned integers within 24-bit
Range, ie. from 0 to 16777215. For reference, payloads are currently internally stored as in-field keyword positions, but that is not guaranteed and might change in the future.

Currently, the only method to account for payloads is to use sph_rank_proximity_bm25 ranker. on indexes with payload fields, it will automatically switch to a variant that matches keywords in those fields, computes a sum of matched payloads multiplied
Field wieghts, and adds that sum to the final rank.

Example:
sql_joined_field = \tagstext from query; \SELECT docid, CONCAT('tag',tagid) FROM tags ORDER BY docid ASC

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.