Building Coreseek (Sphinx + mmseg3): detailed installation and configuration, PHP Sphinx extension installation, and a PHP call example
This document collects installation, incremental indexing, the PHP extension, and API calls in one place, so you do not have to search through a large number of separate articles.
Installing Coreseek (Sphinx + mmseg3)
[Step 1] Install mmseg3
cd /var/install
wget http://www.coreseek.cn/uploads/csft/4.0/coreseek-4.1-beta.tar.gz
tar zxvf coreseek-4.1-beta.tar.gz
cd coreseek-4.1-beta
cd mmseg-3.2.14
./bootstrap
./configure --prefix=/usr/local/mmseg3
make && make install
Problem encountered: "error: cannot find input file: src/Makefile.in" or a similar error.
Solution: run the following commands in order. In my case aclocal itself failed; see the note after the commands.
yum -y install libtool
aclocal
libtoolize --force
automake --add-missing
autoconf
autoheader
make clean
If aclocal fails, install libtool first, then continue the sequence above starting from aclocal, and finally run the original installation steps again.
[Step 2] Install Coreseek
## Install Coreseek
$ cd csft-4.1    # or csft-3.2.14 / csft-4.0.1, depending on the version
$ sh buildconf.sh    # warnings in the output can be ignored; errors must be resolved
$ ./configure --prefix=/usr/local/coreseek --without-unixodbc --with-mmseg --with-mmseg-includes=/usr/local/mmseg3/include/mmseg/ --with-mmseg-libs=/usr/local/mmseg3/lib/ --with-mysql
## If configure reports a MySQL problem, see the MySQL data source installation notes: http://www.coreseek.cn/product_install/install_on_bsd_linux/#mysql
$ make && make install
$ cd ..
## Command-line test of mmseg word segmentation and Coreseek search (the locale must already be set to zh_CN.UTF-8 so that Chinese is displayed correctly)
$ cd testpack
$ cat var/test/test.xml    # Chinese text should be displayed correctly at this point
$ /usr/local/mmseg3/bin/mmseg -d /usr/local/mmseg3/etc var/test/test.xml
$ /usr/local/coreseek/bin/indexer -c etc/csft.conf --all
$ /usr/local/coreseek/bin/search -c etc/csft.conf Web Search
If indexing fails with an error such as "xmlpipe2 support NOT compiled in. To use xmlpipe2, install missing XML libraries ...", the XML library was missing when Coreseek was compiled.
Run the following command:
yum -y install expat expat-devel
Once it is installed, recompile Coreseek and then build the index again.
The results are as follows:
Coreseek Fulltext 4.1 [ Sphinx 2.0.2-dev (r2922)]
Copyright (c) 2007-2011,
Beijing Choice Software Technologies Inc (http://www.coreseek.com)
using config file 'etc/csft.conf'...
index 'xml': query 'Web search': returned 1 matches of 1 total in 0.000 sec
displaying matches:
1. document=1, weight=1590, published=Thu Apr 1 07:20:07, author_id=1
words:
1. 'Network': 1 documents, 1 hits
2. 'Search': 2 documents, 5 hits
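The command-line test above depends on the shell locale being zh_CN.UTF-8. As a small convenience, the sketch below checks the effective locale before running the test. The function name is_zh_utf8_locale is hypothetical, and the optional argument exists only so the check can be tried against an explicit value.

```shell
#!/bin/sh
# Sketch: succeed only when the effective character-type locale is zh_CN with UTF-8.
# is_zh_utf8_locale is a hypothetical helper name, not part of Coreseek or mmseg.
is_zh_utf8_locale() {
    # Precedence: LC_ALL overrides LC_CTYPE, which overrides LANG.
    # An explicit first argument, if given, is checked instead of the environment.
    current="${1:-${LC_ALL:-${LC_CTYPE:-${LANG:-}}}}"
    case "$current" in
        zh_CN.[Uu][Tt][Ff]-8 | zh_CN.[Uu][Tt][Ff]8) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, `is_zh_utf8_locale || export LC_ALL=zh_CN.UTF-8` switches the current shell over when needed; this only helps if the zh_CN.UTF-8 locale is actually installed on the system.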
Next, configure Sphinx to work with MySQL.
Create the Sphinx counter table by running the following statement in the coreseek_test database.
CREATE TABLE sph_counter (
    counter_id INTEGER PRIMARY KEY NOT NULL,
    max_doc_id INTEGER NOT NULL
);
Create a configuration file that connects Sphinx to MySQL:
# vi /usr/local/coreseek/etc/csft_mysql.conf
# MySQL data source configuration; for details see: http://www.coreseek.cn/products-install/mysql/
# First import var/test/documents.sql into the database, then set the MySQL user, password, and database below

# Source definition
source main    # source name
{
    type = mysql
    sql_host = localhost
    sql_user = root
    sql_pass = 123456
    sql_db = coreseek_test
    sql_port = 3306
    sql_query_pre = SET NAMES utf8
    # update sph_counter
    sql_query_pre = REPLACE INTO sph_counter SELECT 1, MAX(id) FROM hr_spider_company
    # read rows up to the ID recorded in sph_counter
    sql_query = SELECT * FROM hr_spider_company WHERE id <= (SELECT max_doc_id FROM sph_counter WHERE counter_id=1)
    # the first column of sql_query (id) must be an integer
    # title and content are string/text fields and are full-text indexed; adjust to the actual database fields
    # values read from SQL for the attributes below must be integers; adjust to the actual database fields
    sql_attr_uint = from_id
    sql_attr_uint = link_id
    sql_attr_uint = add_time
}

# Delta (incremental) source definition
source delta : main    # keep the names consistent with the definitions above
{
    sql_query_pre = SET NAMES utf8
    # read rows newer than the ID recorded in sph_counter
    sql_query = SELECT * FROM hr_spider_company WHERE id > (SELECT max_doc_id FROM sph_counter WHERE counter_id=1)
    # update sph_counter
    sql_query_post_index = REPLACE INTO sph_counter SELECT 1, MAX(id) FROM hr_spider_company
}

# Index definition
index main    # keep the names consistent with the definitions above
{
    source = main    # name of the corresponding source
    # change to the absolute path actually used, for example: /usr/local/coreseek/var/...
    path = /usr/local/coreseek/var/data/mysql
    docinfo = extern
    mlock = 0
    morphology = none
    min_word_len = 1
    html_strip = 0
    # Chinese word segmentation; for details see: http://www.coreseek.cn/products-install/coreseek_mmseg/
    # setting for BSD and Linux; the path must end with /
    charset_dictpath = /usr/local/mmseg3/etc/
    charset_type = zh_cn.utf-8
}

index delta : main    # keep the names consistent with the definitions above
{
    source = delta
    path = /usr/local/coreseek/var/data/delta
}

# Global indexer definition
indexer
{
    mem_limit = 128M
}

# searchd service definition
searchd
{
    listen = 9312
    read_timeout = 5
    max_children = ...
    max_matches = ...
    seamless_rotate = 0
    preopen_indexes = 0
    unlink_old = 1
    # change the three paths below to the absolute paths actually used, for example: /usr/local/coreseek/var/...
    pid_file = /usr/local/coreseek/var/log/searchd_mysql.pid
    log = /usr/local/coreseek/var/log/searchd_mysql.log
    query_log = /usr/local/coreseek/var/log/query_mysql.log
    binlog_path =    # disable the binlog
}
My test table is named hr_spider_company; change it to your own table name as needed.
Command list:
Start the background service (it must be running)
# /usr/local/coreseek/bin/searchd -c /usr/local/coreseek/etc/csft_mysql.conf
Build the full index (must be run once before querying or testing)
# /usr/local/coreseek/bin/indexer -c /usr/local/coreseek/etc/csft_mysql.conf --all --rotate
Run an incremental (delta) index
# /usr/local/coreseek/bin/indexer -c /usr/local/coreseek/etc/csft_mysql.conf delta --rotate
Merge the indexes
# /usr/local/coreseek/bin/indexer -c /usr/local/coreseek/etc/csft_mysql.conf --merge main delta --rotate --merge-dst-range deleted 0 0
(--merge-dst-range deleted 0 0 is added so that the same document does not turn up more than once after the merge. It filters the destination index during the merge and keeps only documents whose deleted attribute is 0, so it relies on the index defining an attribute named deleted.)
Test a search from the command line (AAA is the keyword)
# /usr/local/coreseek/bin/search -c /usr/local/coreseek/etc/csft_mysql.conf AAA
Stop the background service
# /usr/local/coreseek/bin/searchd -c /usr/local/coreseek/etc/csft_mysql.conf --stop
Scheduled tasks:
crontab -e
*/1 * * * * /usr/local/coreseek/bin/indexer -c /usr/local/coreseek/etc/csft_mysql.conf delta --rotate
*/5 * * * * /usr/local/coreseek/bin/indexer -c /usr/local/coreseek/etc/csft_mysql.conf --merge main delta --rotate --merge-dst-range deleted 0 0
30 1 * * * /usr/local/coreseek/bin/indexer -c /usr/local/coreseek/etc/csft_mysql.conf --all --rotate
The schedule above runs the incremental index every minute, the merge every five minutes, and a full reindex at 1:30 every day. The indexer is a compiled program, so it is called directly rather than through /bin/sh.
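With this schedule, all three crontab entries fire together at 01:30, because minute 30 also matches */1 and */5, so indexer runs can overlap. One possible way to keep them apart is to send every job through a small lock wrapper. The sketch below is only an illustration under stated assumptions: the function name run_locked, the lock path /tmp/csft_indexer.lock, and the exit code 75 for a skipped run are hypothetical choices, not anything provided by Coreseek.

```shell
#!/bin/sh
# Sketch: run a command only if no other wrapped command is currently running.
# LOCKDIR is a hypothetical path; pick any directory path writable by the cron user.
LOCKDIR="${LOCKDIR:-/tmp/csft_indexer.lock}"

run_locked() {
    # mkdir either creates the directory or fails atomically, so it can serve as a lock.
    if ! mkdir "$LOCKDIR" 2>/dev/null; then
        echo "lock $LOCKDIR is held, skipping: $*" >&2
        return 75    # hypothetical "skipped" status
    fi
    "$@"             # run the wrapped command, e.g. the indexer with its arguments
    status=$?
    rmdir "$LOCKDIR" # release the lock
    return "$status" # pass the wrapped command's exit status through
}

# When saved as a script and called with arguments, run them under the lock.
if [ "$#" -gt 0 ]; then
    run_locked "$@"
fi
```

Saved as, say, /usr/local/coreseek/bin/run_locked.sh (a hypothetical location), a crontab entry would become `*/1 * * * * /bin/sh /usr/local/coreseek/bin/run_locked.sh /usr/local/coreseek/bin/indexer -c /usr/local/coreseek/etc/csft_mysql.conf delta --rotate`. A job that is killed mid-run leaves the lock directory behind, so it then has to be removed by hand.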
Installing the Sphinx PHP extension
The official Coreseek tutorial recommends that PHP simply include a PHP file to talk to Coreseek. In fact, PHP has a separate sphinx extension that can work with Coreseek directly (Coreseek is Sphinx!). It is part of the official PHP extension repository, and the performance gain is considerable. The PHP extension does, however, depend on the libsphinxclient package.
[Step 1] Install the libsphinxclient dependency
# cd /var/install/coreseek-4.1-beta/csft-4.1/api/libsphinxclient/
# ./configure --prefix=/usr/local/sphinxclient
configure: creating ./config.status
config.status: creating Makefile
config.status: error: cannot find input file: Makefile.in    # configure fails with this error
// Handling the configure error
If configure reports "config.status: error: cannot find input file: src/Makefile.in", run the following commands and then compile again:
# aclocal
# libtoolize --force
# automake --add-missing
# autoconf
# autoheader
# make clean
// Run configure again and compile (keep the same prefix, since the PHP extension is pointed at it in the next step)
# ./configure --prefix=/usr/local/sphinxclient
# make && make install
[Step 2] Install the Sphinx PHP extension
http://pecl.php.net/package/sphinx
# wget http://pecl.php.net/get/sphinx-1.3.0.tgz
# tar zxvf sphinx-1.3.0.tgz
# cd sphinx-1.3.0
# phpize
# ./configure --with-php-config=/usr/bin/php-config --with-sphinx=/usr/local/sphinxclient
# make && make install
# cd /etc/php.d/
# cp gd.ini sphinx.ini
# vi sphinx.ini
extension=sphinx.so
# service php-fpm restart
Open phpinfo() to check whether the sphinx module is now loaded.
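The same check can also be made from the command line, since `php -m` lists the loaded modules one per line. The sketch below wraps that in a small filter; has_php_module is a hypothetical helper name. Note the assumption that the command-line PHP reads the same ini directory as PHP-FPM, which is not guaranteed on every system.

```shell
#!/bin/sh
# Sketch: read a `php -m` listing on standard input and succeed if the named
# module appears as a whole line (case-insensitive).
# has_php_module is a hypothetical helper name.
has_php_module() {
    # -q: no output, -i: ignore case, -x: match the whole line, -F: literal string
    grep -qixF -- "$1"
}
```

Typical use would be `php -m | has_php_module sphinx && echo "sphinx extension loaded"`.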
PHP example calling Sphinx:
<?php
$s = new SphinxClient;
$s->setServer("127.0.0.1", 9312);
$s->setMatchMode(SPH_MATCH_PHRASE);
$s->setMaxQueryTime(...);    // maximum query time, in milliseconds
$res = $s->query("BMW", 'main');    // "BMW" is the keyword, 'main' is the index to search
$err = $s->getLastError();
var_dump(array_keys($res['matches']));
echo "<br>" . "You can read the values in the database by getting the ID." . "<br>";
echo '<pre>';
var_dump($res);
var_dump($err);
echo '</pre>';
Output:
array(20) {
  [0]=> int(1513)
  [1]=> int(42020)
  [2]=> int(57512)
  [3]=> int(59852)
  [4]=> int(59855)
  [5]=> int(60805)
  [6]=> int(94444)
  [7]=> int(94448)
  [8]=> int(99229)
  [9]=> int(107524)
  [10]=> int(111918)
  [11]=> int(148)
  [12]=> int(178)
  [13]=> int(595)
  [14]=> int(775)
  [15]=> int(860)
  [16]=> int(938)
  [17]=> int(1048)
  [18]=> int(1395)
  [19]=> int(1657)
}
You can read the values in the database by getting the ID.
array(10) {
  ["error"]=> string(0) ""
  ["warning"]=> string(0) ""
  ["status"]=> int(0)
  ["fields"]=> array(17) {
    [0]=> string(3) "cid"
    [1]=> string(8) "link_url"
    [2]=> string(12) "company_name"
    [3]=> string(9) "type_name"
    [4]=> string(10) "trade_name"
    [5]=> string(5) "scale"
    [6]=> string(8) "homepage"
    [7]=> string(7) "address"
    [8]=> string(9) "city_name"
    [9]=> string(8) "postcode"
    [10]=> string(7) "contact"
    [11]=> string(9) "telephone"
    [12]=> string(6) "mobile"
    [13]=> string(3) "fax"
    [14]=> string(5) "email"
    [15]=> string(11) "description"
    [16]=> string(11) "update_time"
  }
  ["attrs"]=> array(3) {
    ["from_id"]=> string(1) "1"
    ["link_id"]=> string(1) "1"
    ["add_time"]=> string(1) "1"
  }
  ["matches"]=> array(20) {
    [1513]=> array(2) {
      ["weight"]=> int(2)
      ["attrs"]=> array(3) { ["from_id"]=> string(1) "2" ["link_id"]=> string(7) "3171471" ["add_time"]=> string(10) "1394853454" }
    }
    [42020]=> array(2) {
      ["weight"]=> int(2)
      ["attrs"]=> array(3) { ["from_id"]=> string(1) "2" ["link_id"]=> string(7) "2248093" ["add_time"]=> string(10) "1394913884" }
    }
    [57512]=> array(2) {
      ["weight"]=> int(2)
      ["attrs"]=> array(3) { ["from_id"]=> string(1) "2" ["link_id"]=> string(7) "2684470" ["add_time"]=> string(10) "1394970833" }
    }
    [59852]=> array(2) {
      ["weight"]=> int(2)
      ["attrs"]=> array(3) { ["from_id"]=> string(1) "3" ["link_id"]=> string(1) "0" ["add_time"]=> string(10) "1394977527" }
    }
    [59855]=> array(2) {
      ["weight"]=> int(2)
      ["attrs"]=> array(3) { ["from_id"]=> string(1) "3" ["link_id"]=> string(1) "0" ["add_time"]=> string(10) "1394977535" }
    }
    [60805]=> array(2) {
      ["weight"]=> int(2)
      ["attrs"]=> array(3) { ["from_id"]=> string(1) "3" ["link_id"]=> string(1) "0" ["add_time"]=> string(10) "1394980072" }
    }
    [94444]=> array(2) {
      ["weight"]=> int(2)
      ["attrs"]=> array(3) { ["from_id"]=> string(1) "3" ["link_id"]=> string(1) "0" ["add_time"]=> string(10) "1395084115" }
    }
    [94448]=> array(2) {
      ["weight"]=> int(2)
      ["attrs"]=> array(3) { ["from_id"]=> string(1) "3" ["link_id"]=> string(1) "0" ["add_time"]=> string(10) "1395084124" }
    }
    [99229]=> array(2) {
      ["weight"]=> int(2)
      ["attrs"]=> array(3) { ["from_id"]=> string(1) "2" ["link_id"]=> string(7) "1297992" ["add_time"]=> string(10) "13951005" }
    }
    [107524]=> array(2) {
      ["weight"]=> int(2)
      ["attrs"]=> array(3) { ["from_id"]=> string(1) "5" ["link_id"]=> string(10) "4294967295" ["add_time"]=> string(10) "1395122053" }
    }
    [111918]=> array(2) {
      ["weight"]=> int(2)
      ["attrs"]=> array(3) { ["from_id"]=> string(1) "5" ["link_id"]=> string(10) "4294967295" ["add_time"]=> string(10) "1395127953" }
    }
    [148]=> array(2) {
      ["weight"]=> int(1)
      ["attrs"]=> array(3) { ["from_id"]=> string(1) "2" ["link_id"]=> string(7) "2770294" ["add_time"]=> string(10) "1394852562" }
    }
    [178]=> array(2) {
      ["weight"]=> int(1)
      ["attrs"]=> array(3) { ["from_id"]=> string(1) "2" ["link_id"]=> string(7) "2474558" ["add_time"]=> string(10) "1394852579" }
    }
    [595]=> array(2) {
      ["weight"]=> int(1)
      ["attrs"]=> array(3) { ["from_id"]=> string(1) "2" ["link_id"]=> string(6) "534804" ["add_time"]=> string(10) "1394852862" }
    }
    [775]=> array(2) {
      ["weight"]=> int(1)
      ["attrs"]=> array(3) { ["from_id"]=> string(1) "2" ["link_id"]=> string(7) "3230353" ["add_time"]=> string(10) "1394852980" }
    }
    [860]=> array(2) {
      ["weight"]=> int(1)
      ["attrs"]=> array(3) { ["from_id"]=> string(1) "2" ["link_id"]=> string(7) "2549233" ["add_time"]=> string(10) "1394853048" }
    }
    [938]=> array(2) {
      ["weight"]=> int(1)
      ["attrs"]=> array(3) { ["from_id"]=> string(1) "2" ["link_id"]=> string(7) "3191382" ["add_time"]=> string(10) "1394853114" }
    }
    [1048]=> array(2) {
      ["weight"]=> int(1)
      ["attrs"]=> array(3) { ["from_id"]=> string(1) "2" ["link_id"]=> string(7) "3234645" ["add_time"]=> string(10) "1394853174" }
    }
    [1395]=> array(2) {
      ["weight"]=> int(1)
      ["attrs"]=> array(3) { ["from_id"]=> string(1) "2" ["link_id"]=> string(7) "2661219" ["add_time"]=> string(10) "1394853375" }
    }
    [1657]=> array(2) {
      ["weight"]=> int(1)
      ["attrs"]=> array(3) { ["from_id"]=> string(1) "2" ["link_id"]=> string(7) "2670624" ["add_time"]=> string(10) "1394853540" }
    }
  }
  ["total"]=> int(543)
  ["total_found"]=> int(543)
  ["time"]=> float(0.109)
  ["words"]=> array(1) {
    ["BMW"]=> array(2) {
      ["docs"]=> int(543)
      ["hits"]=> int(741)
    }
  }
}
string(0) ""
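As the example output says, the returned IDs are what you use to read the actual rows from the database. A minimal sketch of that lookup step follows: it turns a list of document IDs into a MySQL query that also preserves the ranking order Sphinx returned. The function name build_lookup_sql and the TABLE variable are hypothetical; the table name defaults to the test table used in this article, and FIELD() is a MySQL function.

```shell
#!/bin/sh
# Sketch: build a SELECT for the matched document IDs, keeping Sphinx's result order.
# TABLE defaults to this article's test table; change it to your own table.
TABLE="${TABLE:-hr_spider_company}"

build_lookup_sql() {
    ids=""
    for id in "$@"; do
        # Accept only plain non-negative integers, so nothing else reaches the SQL text.
        case "$id" in
            '' | *[!0-9]*) echo "not a document id: $id" >&2; return 1 ;;
        esac
        ids="${ids:+$ids,}$id"    # join with commas
    done
    [ -n "$ids" ] || { echo "no ids given" >&2; return 1; }
    # ORDER BY FIELD(id, ...) sorts the rows in the order the IDs were passed.
    printf 'SELECT * FROM %s WHERE id IN (%s) ORDER BY FIELD(id,%s);\n' "$TABLE" "$ids" "$ids"
}
```

For instance, `build_lookup_sql 1513 42020 57512 | mysql -u root -p coreseek_test` would fetch the first three matches shown above in ranked order.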