Sphinx installation, configuration, and PHP usage examples

Source: Internet
Author: User
Tags chmod

First, download the Sphinx source package from the Sphinx website: http://www.sphinxsearch.com/downloads/
You also need MySQL installed on your system.

Second, install Sphinx by following the official installation instructions. The basic steps are as follows:

Official getting-started documentation: http://www.sphinxsearch.org/archives/80

1. Unpack the Sphinx source package:

On a Mac, the prebuilt release can be used directly after unpacking:

http://sphinxsearch.com/files/sphinx-2.2.10-release-osx10.10-x86_64.tar.gz

On CentOS, the steps are:

* [root@localhost src]# wget http://www.sphinxsearch.com/downloads/sphinx-0.9.9.tar.gz
* [root@localhost src]# tar zxvf sphinx-0.9.9.tar.gz
* [root@localhost src]# cd sphinx-0.9.9
* [root@localhost sphinx-0.9.9]# ./configure --prefix=/usr/local/sphinx # Note: Sphinx supports MySQL by default
* [root@localhost sphinx-0.9.9]# make && make install # any "warning" messages can be ignored

2. Modify the configuration file:

* [root@localhost ~]# cd /usr/local/sphinx/etc # enter the Sphinx configuration directory
* [root@localhost etc]# cp sphinx.conf.dist sphinx.conf # create the Sphinx configuration file
* [root@localhost etc]# vim sphinx.conf # edit sphinx.conf

Example configuration; the main change is the MySQL connection information:

source article_src
{
type = mysql # data source type
sql_host = 192.168.1.10 # MySQL host
sql_user = root # MySQL user name
sql_pass = pwd # MySQL password
sql_db = test # MySQL database name
sql_port = 3306 # MySQL port
# ... remaining settings unchanged
}

3. Import the test data into the MySQL test database:

mysql -uroot -p test < example.sql

4. Build the index files:

[root@localhost sphinx]# bin/indexer -c etc/sphinx.conf --all # build all indexes defined in the configuration

5. Start Sphinx:

bin/searchd

6. Run the PHP test:

php api/test.php -h localhost

The query results look like this:

Query '' retrieved 4 of 4 matches in 0.000 sec.
Query stats:

Matches:
1. doc_id=1, weight=1, group_id=1, date_added=2016-05-18 07:06:30
2. doc_id=2, weight=1, group_id=1, date_added=2016-05-18 07:06:30
3. doc_id=3, weight=1, group_id=2, date_added=2016-05-18 07:06:30
4. doc_id=4, weight=1, group_id=2, date_added=2016-05-18 07:06:30


Incremental indexing achieves near-real-time updates.


Test conditions: the default sphinx.conf is used as the starting point, and the table data comes from example.sql.

1. First, create a counter table in MySQL (the main and delta indexes are defined in the next step):

CREATE TABLE sph_counter (counter_id INTEGER PRIMARY KEY NOT NULL, max_doc_id INTEGER NOT NULL);

2. Modify sphinx.conf:

source main_src
{
type = mysql
sql_host = localhost
sql_user = YourUserName
sql_pass = YourPassword
sql_db = test # the database you are using
sql_port = 3306 # port to use; the default is 3306
sql_query_pre = SET NAMES utf8
sql_query_pre = SET SESSION query_cache_type=OFF
# the following statement updates max_doc_id in the sph_counter table
sql_query_pre = REPLACE INTO sph_counter SELECT 1, MAX(id) FROM documents
sql_query = SELECT id, group_id, UNIX_TIMESTAMP(date_added) AS date_added, title, \
content FROM documents \
WHERE id <= (SELECT max_doc_id FROM sph_counter WHERE counter_id=1)
}

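Note: delta_src must redeclare the sql_query_pre statements it needs from main_src (all of them except the REPLACE that updates sph_counter), because declaring any sql_query_pre in an inherited source discards the inherited ones. Otherwise searches may not return the expected results.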

source delta_src : main_src
{
sql_ranged_throttle = 100
sql_query_pre = SET NAMES utf8
sql_query_pre = SET SESSION query_cache_type=OFF
sql_query = SELECT id, group_id, UNIX_TIMESTAMP(date_added) AS date_added, title, content FROM documents \
WHERE id > (SELECT max_doc_id FROM sph_counter WHERE counter_id=1)
}
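
To see why the two WHERE clauses split the table without overlap, here is a small shell sketch that mimics the counter logic on a plain list of ids. It does not touch MySQL or Sphinx; split_ids is a hypothetical helper written only for this illustration, and its first argument stands in for sph_counter.max_doc_id as recorded when the main index was last built.

```shell
#!/bin/sh
# Hypothetical illustration: partition document ids the way main_src and delta_src do.

split_ids() {
    max_doc_id=$1          # stands in for sph_counter.max_doc_id
    shift
    main=""
    delta=""
    for id in "$@"; do
        if [ "$id" -le "$max_doc_id" ]; then
            main="$main $id"      # WHERE id <= max_doc_id  (main_src)
        else
            delta="$delta $id"    # WHERE id >  max_doc_id  (delta_src)
        fi
    done
    echo "main:$main"
    echo "delta:$delta"
}

# The main index was built when the highest id was 4; ids 5 and 6 arrived later.
split_ids 4 1 2 3 4 5 6
# prints:
# main: 1 2 3 4
# delta: 5 6
```

Every id lands in exactly one of the two sets, which is why a search over main plus delta sees each document once.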

index main # primary index
{
source = main_src
path = /path/to/main
# example: /usr/local/sphinx/var/data/main .......
charset_type = utf-8 # required for Chinese support
chinese_dictionary = /usr/local/sphinx/etc/xdict
# ... other settings can stay at their defaults
}

The delta index can inherit all the settings of the main index; only source and path need to change, as follows:

index delta : main # delta index
{
source = delta_src
path = /path/to/delta
# example: /usr/local/sphinx/var/data/delta ...
}

Other settings can be left at their defaults. If you have set up a distributed index, change the corresponding index names there as well.

3. Rebuild the indexes:
If Sphinx is running, stop it first, then build all indexes according to sphinx.conf, and finally start the service:

/usr/local/sphinx/bin/searchd --stop
/usr/local/sphinx/bin/indexer -c /usr/local/sphinx/etc/sphinx.conf --all
/usr/local/sphinx/bin/searchd -c /usr/local/sphinx/etc/sphinx.conf

P.S. Alternatively, rebuild with --rotate:

/usr/local/sphinx/bin/indexer -c /usr/local/sphinx/etc/sphinx.conf --all --rotate

With --rotate there is no need to stop searchd beforehand, and searchd does not need to be restarted after indexing.
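
The choice between the two approaches can be sketched as a small shell helper that checks whether searchd is alive and prints the matching indexer command. This is only a sketch: reindex_command is a hypothetical name, and the pid file location is an assumption (check the pid_file setting in your sphinx.conf). The helper prints the command instead of running it, so it is safe to try.

```shell
#!/bin/sh
# Assumed install prefix, taken from the ./configure step above.
SPHINX=/usr/local/sphinx

reindex_command() {
    # $1: path to searchd's pid file (assumed location; see pid_file in sphinx.conf)
    if [ -f "$1" ] && kill -0 "$(cat "$1")" 2>/dev/null; then
        # searchd is serving: build new index files and let searchd swap them in
        echo "$SPHINX/bin/indexer -c $SPHINX/etc/sphinx.conf --all --rotate"
    else
        # searchd is not running: a plain rebuild is enough
        echo "$SPHINX/bin/indexer -c $SPHINX/etc/sphinx.conf --all"
    fi
}

reindex_command "$SPHINX/var/log/searchd.pid"
```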

To test whether the incremental index works, insert new rows into the database table and search for them. At this point the search should return nothing. Then rebuild the delta index on its own:
/usr/local/sphinx/bin/indexer -c /usr/local/sphinx/etc/sphinx.conf delta
Check whether the new records were indexed. If so, use the /usr/local/sphinx/bin/search tool to query them: the main index returns 0 results and the delta index returns the new rows. This only holds if the search term occurs solely in the newly inserted data.

The next question is how to merge the incremental index with the primary index

4. Index merging
Merging two existing indexes is sometimes more efficient than reindexing all the data. Even so, both indexes are read once and the merged result is written to disk once, so merging a 100 GB index with a 1 GB index results in 202 GB of I/O.
Command prototype: indexer --merge DSTINDEX SRCINDEX [--rotate]. This merges SRCINDEX into DSTINDEX, so only DSTINDEX changes. If both indexes are currently being served, the --rotate parameter is required. For example, to merge delta into main:
indexer --merge main delta
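
The I/O figure quoted above follows from reading both indexes once and writing their combined size once. A minimal shell sketch of that arithmetic, where merge_io_gb is a hypothetical helper and sizes are whole gigabytes:

```shell
#!/bin/sh
# Hypothetical helper: estimated I/O for merging two indexes, in GB.
merge_io_gb() {
    # read dst + read src, then write the merged result (dst + src)
    echo $(( 2 * ($1 + $2) ))
}

merge_io_gb 100 1   # prints 202
```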

5. Automatic index updates
This requires scripts.
Create two scripts: build_main_index.sh and build_delta_index.sh.

build_main_index.sh:
#!/bin/sh
# stop the running searchd
/usr/local/sphinx/bin/searchd -c /usr/local/sphinx/etc/mersphinx.conf --stop >> /usr/local/sphinx/var/log/sphinx/searchd.log
# build the main index
/usr/local/sphinx/bin/indexer -c /usr/local/sphinx/etc/mersphinx.conf main >> /usr/local/sphinx/var/log/sphinx/mainindex.log
# start the searchd daemon
/usr/local/sphinx/bin/searchd >> /usr/local/sphinx/var/log/sphinx/searchd.log

build_delta_index.sh:
#!/bin/sh
# stop the Sphinx service, redirecting output
/usr/local/sphinx/bin/searchd --stop >> /usr/local/sphinx/var/log/sphinx/searchd.log
# rebuild the delta index, redirecting output
/usr/local/sphinx/bin/indexer delta -c /usr/local/sphinx/etc/sphinx.conf >> /usr/local/sphinx/var/log/sphinx/deltaindex.log
# merge delta into main
/usr/local/sphinx/bin/indexer --merge main delta -c /usr/local/sphinx/etc/sphinx.conf >> /usr/local/sphinx/var/log/sphinx/deltaindex.log
# start the service
/usr/local/sphinx/bin/searchd >> /usr/local/sphinx/var/log/sphinx/searchd.log

After the scripts are written, make them executable with chmod +x filename:
chmod +x build_main_index.sh
chmod +x build_delta_index.sh

Finally, the scripts need to run automatically, so that the delta index is rebuilt every 30 minutes and the main index is rebuilt only once a day at 2:30 a.m.

This is done with the crontab command.
Run crontab -e to edit the crontab file; if it has not been used before, the file is empty. Add the following two entries:
*/30 * * * * /bin/sh /usr/local/sphinx/etc/build_delta_index.sh > /dev/null 2>&1
30 2 * * * /bin/sh /usr/local/sphinx/etc/build_main_index.sh > /dev/null 2>&1

The first entry runs the build_delta_index.sh script under /usr/local/sphinx/etc/ every 30 minutes, with its output redirected.
The second entry runs the build_main_index.sh script under /usr/local/sphinx/etc/ every day at 2:30 a.m., also with its output redirected.
The first five fields of each entry set the schedule (minute, hour, day of month, month, day of week). The trailing > /dev/null 2>&1 discards both standard output and standard error.
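
To make the field layout concrete, the following shell sketch splits a crontab entry into its five schedule fields and the command. describe_cron is a hypothetical helper for illustration only; it labels the fields and does not validate them or compute run times.

```shell
#!/bin/sh
# Hypothetical helper: label the fields of a single crontab entry.
describe_cron() {
    # read splits on whitespace without expanding the "*" wildcards
    read -r minute hour dom month dow command <<EOF
$1
EOF
    echo "minute=$minute hour=$hour day-of-month=$dom month=$month day-of-week=$dow"
    echo "command=$command"
}

describe_cron '30 2 * * * /bin/sh /usr/local/sphinx/etc/build_main_index.sh'
# prints:
# minute=30 hour=2 day-of-month=* month=* day-of-week=*
# command=/bin/sh /usr/local/sphinx/etc/build_main_index.sh
```

A minute field of */30 means "every 30th minute", so that entry fires at :00 and :30 of every hour.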

After saving, restart the cron service:

[root@test1 init.d]# service crond stop
[root@test1 init.d]# service crond start
or
/etc/init.d/crond start

At this point, if the scripts contain no mistakes, build_delta_index.sh will run every 30 minutes and build_main_index.sh will run at 2:30 a.m.

To verify, recall that the scripts redirect their output to log files; check whether those files gain new records. You can also look at searchd.log under /usr/local/sphinx/var/log, where each index rebuild is recorded.

Summary
1. Index merging: as explained earlier, merging two indexes reads both and writes the result back to disk, so the I/O volume is large. When calling through the PHP API, the $index argument of Query($query, $index) can name several indexes, for example Query($query, "main;delta"), so merging the two indexes is not strictly necessary, or at least can be done less often.
2. Something not yet tried: storing the delta index in shared memory (/dev/shm) to improve indexing performance and reduce system load.

About the PHP API: how to search successfully from a PHP page.
First, searchd must be running on the server.
Then adapt the code in test.php to your own page.
When the page runs, the connection may fail with errno=13 (permission denied). An English-language page eventually pointed to SELinux as the cause; more about SELinux can be found online. No better solution was found than turning SELinux enforcement off. This is done with the setenforce command (under /usr/sbin), which takes two values:
setenforce 1 sets SELinux to enforcing mode
setenforce 0 sets SELinux to permissive mode
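
Before changing the mode, it may help to check what mode SELinux is currently in. A minimal sketch, assuming the standard getenforce utility is installed where SELinux is present; selinux_mode is a hypothetical wrapper that also copes with hosts that have no SELinux tools:

```shell
#!/bin/sh
# Hypothetical wrapper: report the current SELinux mode, if the tools exist.
selinux_mode() {
    if command -v getenforce >/dev/null 2>&1; then
        getenforce            # prints Enforcing, Permissive, or Disabled
    else
        echo "unavailable"    # SELinux tools are not installed on this host
    fi
}

case "$(selinux_mode)" in
    Enforcing)  echo "SELinux is enforcing; the PHP connection to searchd may be denied" ;;
    Permissive) echo "SELinux only logs denials; it should not block the connection" ;;
    *)          echo "SELinux is disabled or not present" ;;
esac
```

Note that setenforce changes the mode only until the next reboot.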
