Building Coreseek (Sphinx + mmseg3): detailed installation and configuration, PHP Sphinx extension installation, and a PHP call example

Source: Internet
Author: User
This single document covers installation, incremental indexing, the PHP extension, and API calls, so you do not have to search through a large number of articles.

Installing Coreseek (Sphinx + mmseg3)


[Step one] Install mmseg3 first

cd /var/install
wget http://www.coreseek.cn/uploads/csft/4.0/coreseek-4.1-beta.tar.gz
tar zxvf coreseek-4.1-beta.tar.gz
cd coreseek-4.1-beta
cd mmseg-3.2.14
./bootstrap
./configure --prefix=/usr/local/mmseg3
make && make install

Problem encountered: "error: cannot find input file: src/Makefile.in" or a similar error.

Solution: run the following commands in order. In my case aclocal itself failed; see the note below for the fix.

yum -y install libtool
aclocal
libtoolize --force
automake --add-missing
autoconf
autoheader
make clean

If aclocal fails, install libtool, then continue the sequence above from aclocal onward, and finally rerun the original installation steps.
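
The recovery sequence above can be wrapped in a small helper that stops at the first failing step. This is only a sketch: the function name regen_autotools and the DRY_RUN switch are hypothetical additions, not part of Coreseek or mmseg, and the sketch assumes the autotools are already installed and on PATH.

```shell
#!/bin/sh
# Sketch: rerun the autotools steps listed above, in order, and stop at the
# first failure. regen_autotools and DRY_RUN are hypothetical names.

regen_autotools() {
    for step in "aclocal" "libtoolize --force" "automake --add-missing" \
                "autoconf" "autoheader" "make clean"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            # Dry run: only print the command that would be executed.
            echo "$step"
        else
            echo ">> $step"
            # $step is left unquoted on purpose so its flags become arguments.
            $step || { echo "failed: $step" >&2; return 1; }
        fi
    done
}

# Usage, from inside mmseg-3.2.14:
#   regen_autotools && ./configure --prefix=/usr/local/mmseg3 && make && make install
```

Running it with DRY_RUN=1 first prints the six commands without touching the source tree, which is a cheap way to check the order before a real run.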

[Step two] Install Coreseek

## Install coreseek
$ cd csft-3.2.14    (or: cd csft-4.0.1, or: cd csft-4.1)
$ sh buildconf.sh    # warnings in the output can be ignored; errors must be resolved
$ ./configure --prefix=/usr/local/coreseek --without-unixodbc --with-mmseg --with-mmseg-includes=/usr/local/mmseg3/include/mmseg/ --with-mmseg-libs=/usr/local/mmseg3/lib/ --with-mysql
## If configure reports a MySQL problem, see the MySQL data source installation notes: http://www.coreseek.cn/product_install/install_on_bsd_linux/#mysql
$ make && make install
$ cd ..

## Command-line test of mmseg word segmentation and Coreseek search (the locale must already be set to zh_CN.UTF-8 so that Chinese text displays correctly)
$ cd testpack
$ cat var/test/test.xml    # Chinese text should display correctly at this point
$ /usr/local/mmseg3/bin/mmseg -d /usr/local/mmseg3/etc var/test/test.xml
$ /usr/local/coreseek/bin/indexer -c etc/csft.conf --all
$ /usr/local/coreseek/bin/search -c etc/csft.conf Web Search


If indexing fails with an error such as "xmlpipe2 support NOT compiled in. To use xmlpipe2, install missing XML libra...", the XML libraries are missing.

Run the following command:
yum -y install expat expat-devel

Once it is installed, recompile Coreseek and then build the index again.

The results are as follows:

Coreseek Fulltext 4.1 [ Sphinx 2.0.2-dev (r2922)]
Copyright (c) 2007-2011,
Beijing Choice Software Technologies Inc (http://www.coreseek.com)

 using config file 'etc/csft.conf'...
index 'xml': query 'Web search': returned 1 matches of 1 total in 0.000 sec

displaying matches:
1. document=1, weight=1590, published=Thu Apr  1 07:20:07, author_id=1

words:
1. 'Network': 1 documents, 1 hits
2. 'Search': 2 documents, 5 hits


Next, configure Sphinx to work with MySQL


Create the Sphinx counter table by running the following statement in the coreseek_test database.

CREATE TABLE sph_counter (
    counter_id INTEGER PRIMARY KEY NOT NULL,
    max_doc_id INTEGER NOT NULL
);

Create a configuration file that configures Sphinx with MySQL

# vi /usr/local/coreseek/etc/csft_mysql.conf

# MySQL data source configuration; for details see: http://www.coreseek.cn/products-install/mysql/
# First import var/test/documents.sql into the database, then set the MySQL user, password and database below

# Source definition
source main    # defines the source name
{
    type                 = mysql

    sql_host             = localhost
    sql_user             = root
    sql_pass             = 123456
    sql_db               = coreseek_test
    sql_port             = 3306
    sql_query_pre        = SET NAMES utf8
    sql_query_pre        = REPLACE INTO sph_counter SELECT 1, MAX(id) FROM hr_spider_company;    # update sph_counter

    sql_query            = SELECT * FROM hr_spider_company WHERE id <= (SELECT max_doc_id FROM sph_counter WHERE counter_id=1)    # read rows up to the ID recorded in sph_counter
    # The first column of sql_query (id) must be an integer
    # title and content are string/text fields and are full-text indexed; adjust to your actual database columns

    sql_attr_uint        = from_id     # the value read from SQL must be an integer; adjust to your actual database columns
    sql_attr_uint        = link_id     # the value read from SQL must be an integer; adjust to your actual database columns
    sql_attr_uint        = add_time    # the value read from SQL must be an integer; adjust to your actual database columns
}

# Delta (incremental) source definition
source delta : main    # keep the names consistent with the definitions above
{
    sql_query_pre        = SET NAMES utf8
    sql_query            = SELECT * FROM hr_spider_company WHERE id > (SELECT max_doc_id FROM sph_counter WHERE counter_id=1)    # read rows above the ID recorded in sph_counter
    sql_query_post_index = REPLACE INTO sph_counter SELECT 1, MAX(id) FROM hr_spider_company    # update sph_counter
}

# Index definition
index main    # keep the names consistent with the definitions above
{
    source               = main    # the corresponding source name
    path                 = /usr/local/coreseek/var/data/mysql    # change to the absolute path actually used, for example: /usr/local/coreseek/var/...
    docinfo              = extern
    mlock                = 0
    morphology           = none
    min_word_len         = 1
    html_strip           = 0

    # Chinese word segmentation configuration; for details see: http://www.coreseek.cn/products-install/coreseek_mmseg/
    charset_dictpath     = /usr/local/mmseg3/etc/    # setting for BSD and Linux; must end with /
    charset_type         = zh_cn.utf-8
}

index delta : main    # keep the names consistent with the definitions above
{
    source               = delta
    path                 = /usr/local/coreseek/var/data/delta
}

# Global indexer definition
indexer
{
    mem_limit            = 128M
}

# searchd service definition
searchd
{
    listen               = 9312
    read_timeout         = 5
    max_children         = ...    # value lost in the source
    max_matches          = ...    # value lost in the source
    seamless_rotate      = 0
    preopen_indexes      = 0
    unlink_old           = 1
    pid_file             = /usr/local/coreseek/var/log/searchd_mysql.pid    # change to the absolute path actually used, for example: /usr/local/coreseek/var/...
    log                  = /usr/local/coreseek/var/log/searchd_mysql.log    # change to the absolute path actually used, for example: /usr/local/coreseek/var/...
    query_log            = /usr/local/coreseek/var/log/query_mysql.log    # change to the absolute path actually used, for example: /usr/local/coreseek/var/...
    binlog_path          =    # disables the binlog
}


My test table is named hr_spider_company; change it to your own table name as needed.
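
Since only the table name differs between setups, one possible way to adapt the configuration is a plain text substitution. The sketch below assumes the table name appears in the file only where it is meant as a table name; retarget_table is a hypothetical helper, and my_articles and the output path in the usage line are made-up examples.

```shell
#!/bin/sh
# Sketch: print a copy of the config with the example table name replaced.
# retarget_table is a hypothetical helper introduced for this example.

retarget_table() {
    # $1 = new table name; config text is read from stdin, result goes to stdout
    case "$1" in
        ''|*[!A-Za-z0-9_]*)
            # Refuse empty names and anything that is not a plain identifier.
            echo "invalid table name: $1" >&2
            return 1
            ;;
    esac
    sed "s/hr_spider_company/$1/g"
}

# Usage:
#   retarget_table my_articles < /usr/local/coreseek/etc/csft_mysql.conf > /tmp/csft_my_articles.conf
```

Writing to a new file rather than editing in place keeps the working configuration intact until the result has been reviewed.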

Command list:

Start the background service (it must be running)
# /usr/local/coreseek/bin/searchd -c /usr/local/coreseek/etc/csft_mysql.conf

Build the index (must be run once before querying or testing)
/usr/local/coreseek/bin/indexer -c /usr/local/coreseek/etc/csft_mysql.conf --all --rotate

Build the incremental index
/usr/local/coreseek/bin/indexer -c /usr/local/coreseek/etc/csft_mysql.conf delta --rotate

Merge the indexes
/usr/local/coreseek/bin/indexer -c /usr/local/coreseek/etc/csft_mysql.conf --merge main delta --rotate --merge-dst-range deleted 0 0

(--merge-dst-range deleted 0 0 is added to prevent multiple keywords from pointing to the same document.)

Test a search from the command line
# /usr/local/coreseek/bin/search -c /usr/local/coreseek/etc/csft_mysql.conf AAA

Stop the background service
# /usr/local/coreseek/bin/searchd -c /usr/local/coreseek/etc/csft_mysql.conf --stop

Automation commands:

crontab -e

*/1 * * * * /usr/local/coreseek/bin/indexer -c /usr/local/coreseek/etc/csft_mysql.conf delta --rotate
*/5 * * * * /usr/local/coreseek/bin/indexer -c /usr/local/coreseek/etc/csft_mysql.conf --merge main delta --rotate --merge-dst-range deleted 0 0
30 1 * * * /usr/local/coreseek/bin/indexer -c /usr/local/coreseek/etc/csft_mysql.conf --all --rotate

The schedule above builds the incremental index every minute, merges the indexes every five minutes, and rebuilds the full index at 1:30 every day.
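
Because the incremental job runs every minute, a slow run can overlap the next one or collide with the merge. One possible safeguard is to route the cron entries through a small wrapper that takes a lock first. The following is a sketch under assumptions: the wrapper itself, the names run_indexer, LOCK_DIR and DRY_RUN, and the lock path /tmp/csft_indexer.lock are all illustrative; only the indexer command lines come from the text above.

```shell
#!/bin/sh
# Sketch: run one indexer job at a time, using an atomic mkdir as the lock.
# run_indexer, LOCK_DIR and DRY_RUN are hypothetical names for this example.

INDEXER=/usr/local/coreseek/bin/indexer
CONF=/usr/local/coreseek/etc/csft_mysql.conf
LOCK_DIR=${LOCK_DIR:-/tmp/csft_indexer.lock}

run_indexer() {
    # $1 = job name: delta | merge | full
    case "$1" in
        delta) set -- delta --rotate ;;
        merge) set -- --merge main delta --rotate --merge-dst-range deleted 0 0 ;;
        full)  set -- --all --rotate ;;
        *)     echo "usage: run_indexer delta|merge|full" >&2; return 2 ;;
    esac

    # mkdir either creates the directory or fails, so it can serve as a lock.
    if ! mkdir "$LOCK_DIR" 2>/dev/null; then
        echo "another indexer job is running, skipping" >&2
        return 0
    fi

    status=0
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Dry run: print the command line instead of executing it.
        echo "$INDEXER -c $CONF $*"
    else
        "$INDEXER" -c "$CONF" "$@" || status=$?
    fi

    rmdir "$LOCK_DIR"
    return "$status"
}

# When saved as a script, pass the job name through from cron.
if [ "$#" -gt 0 ]; then
    run_indexer "$@"
fi
```

A crontab line would then call the wrapper with a job name, for example `/bin/sh /path/to/wrapper.sh delta` (the path is a placeholder). If a job is killed before it removes the lock directory, the directory has to be deleted by hand before later jobs will run again.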

Installing the Sphinx PHP extension


The official Coreseek tutorial recommends that PHP simply include a PHP API file to talk to the server. In fact, PHP has a separate sphinx module that can operate Coreseek directly (Coreseek is Sphinx!). The module is in the official PHP extension repository (PECL), and the gain in efficiency is considerable. The PHP module does, however, depend on the libsphinxclient package.

[Step one] Install the libsphinxclient dependency

# cd /var/install/coreseek-4.1-beta/csft-4.1/api/libsphinxclient/
# ./configure --prefix=/usr/local/sphinxclient
configure: creating ./config.status
config.status: creating Makefile
config.status: error: cannot find input file: Makefile.in    # configure failed here

// Handling the configure error
// If config.status reports "error: cannot find input file: src/Makefile.in" during compilation, run the following commands and then compile again:
# aclocal
# libtoolize --force
# automake --add-missing
# autoconf
# autoheader
# make clean

// Run configure again and compile
# ./configure --prefix=/usr/local/sphinxclient
# make && make install

[Step two] Install the Sphinx PHP extension

http://pecl.php.net/package/sphinx
# wget http://pecl.php.net/get/sphinx-1.3.0.tgz
# tar zxvf sphinx-1.3.0.tgz
# cd sphinx-1.3.0
# phpize
# ./configure --with-php-config=/usr/bin/php-config --with-sphinx=/usr/local/sphinxclient
# make && make install
# cd /etc/php.d/
# cp gd.ini sphinx.ini
# vi sphinx.ini
extension=sphinx.so
# service php-fpm restart

Open phpinfo() to check that the sphinx module is now loaded.
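
The same check can be made from the command line, since `php -m` prints one loaded module per line. The helper below is a sketch; has_php_module is a hypothetical name, and it reads the module list from standard input so it can be tried without PHP installed.

```shell
#!/bin/sh
# Sketch: report whether a module name appears in a list of PHP modules.
# has_php_module is a hypothetical helper introduced for this example.

has_php_module() {
    # $1 = module name; the module list is read from stdin.
    # -x matches whole lines only, -i ignores case, -q suppresses output.
    grep -qix "$1"
}

# Usage:
#   php -m | has_php_module sphinx && echo "sphinx extension loaded"
```

This inspects the command-line PHP only. PHP-FPM may read a different configuration, so phpinfo() served through the web server remains the more reliable check.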

Example of calling Sphinx from PHP:

<?php
    $s = new SphinxClient;
    $s->setServer("127.0.0.1", 9312);
    $s->setMatchMode(SPH_MATCH_PHRASE);
    $s->setMaxQueryTime(30);    # in milliseconds; the original value was lost in the source, 30 is a placeholder
    $res = $s->query("BMW", 'main');    # "BMW" is the keyword, 'main' is the index (built from source main)
    $err = $s->getLastError();
    var_dump(array_keys($res['matches']));
    echo "<br>" . "You can read the values in the database by getting the ID." . "<br>";
    echo '<pre>';
    var_dump($res);
    var_dump($err);
    echo '</pre>';

Output Result:

array(20) {
  [0]=> int(1513)
  [1]=> int(42020)
  [2]=> int(57512)
  [3]=> int(59852)
  [4]=> int(59855)
  [5]=> int(60805)
  [6]=> int(94444)
  [7]=> int(94448)
  [8]=> int(99229)
  [9]=> int(107524)
  [10]=> int(111918)
  [11]=> int(148)
  [12]=> int(178)
  [13]=> int(595)
  [14]=> int(775)
  [15]=> int(860)
  [16]=> int(938)
  [17]=> int(1048)
  [18]=> int(1395)
  [19]=> int(1657)
}
You can read the values in the database by getting the ID.
array(10) {
  ["error"]=> string(0) ""
  ["warning"]=> string(0) ""
  ["status"]=> int(0)
  ["fields"]=> array(17) {
    [0]=> string(3) "cid"
    [1]=> string(8) "link_url"
    [2]=> string(12) "company_name"
    [3]=> string(9) "type_name"
    [4]=> string(10) "trade_name"
    [5]=> string(5) "scale"
    [6]=> string(8) "homepage"
    [7]=> string(7) "address"
    [8]=> string(9) "city_name"
    [9]=> string(8) "postcode"
    [10]=> string(7) "contact"
    [11]=> string(9) "telephone"
    [12]=> string(6) "mobile"
    [13]=> string(3) "fax"
    [14]=> string(5) "email"
    [15]=> string(11) "description"
    [16]=> string(11) "update_time"
  }
  ["attrs"]=> array(3) {
    ["from_id"]=> string(1) "1"
    ["link_id"]=> string(1) "1"
    ["add_time"]=> string(1) "1"
  }
  ["matches"]=> array(20) {
    [1513]=> array(2) { ["weight"]=> int(2) ["attrs"]=> array(3) { ["from_id"]=> string(1) "2" ["link_id"]=> string(7) "3171471" ["add_time"]=> string(10) "1394853454" } }
    [42020]=> array(2) { ["weight"]=> int(2) ["attrs"]=> array(3) { ["from_id"]=> string(1) "2" ["link_id"]=> string(7) "2248093" ["add_time"]=> string(10) "1394913884" } }
    [57512]=> array(2) { ["weight"]=> int(2) ["attrs"]=> array(3) { ["from_id"]=> string(1) "2" ["link_id"]=> string(7) "2684470" ["add_time"]=> string(10) "1394970833" } }
    [59852]=> array(2) { ["weight"]=> int(2) ["attrs"]=> array(3) { ["from_id"]=> string(1) "3" ["link_id"]=> string(1) "0" ["add_time"]=> string(10) "1394977527" } }
    [59855]=> array(2) { ["weight"]=> int(2) ["attrs"]=> array(3) { ["from_id"]=> string(1) "3" ["link_id"]=> string(1) "0" ["add_time"]=> string(10) "1394977535" } }
    [60805]=> array(2) { ["weight"]=> int(2) ["attrs"]=> array(3) { ["from_id"]=> string(1) "3" ["link_id"]=> string(1) "0" ["add_time"]=> string(10) "1394980072" } }
    [94444]=> array(2) { ["weight"]=> int(2) ["attrs"]=> array(3) { ["from_id"]=> string(1) "3" ["link_id"]=> string(1) "0" ["add_time"]=> string(10) "1395084115" } }
    [94448]=> array(2) { ["weight"]=> int(2) ["attrs"]=> array(3) { ["from_id"]=> string(1) "3" ["link_id"]=> string(1) "0" ["add_time"]=> string(10) "1395084124" } }
    [99229]=> array(2) { ["weight"]=> int(2) ["attrs"]=> array(3) { ["from_id"]=> string(1) "2" ["link_id"]=> string(7) "1297992" ["add_time"]=> string(10) "13951005..." } }
    [107524]=> array(2) { ["weight"]=> int(2) ["attrs"]=> array(3) { ["from_id"]=> string(1) "5" ["link_id"]=> string(10) "4294967295" ["add_time"]=> string(10) "1395122053" } }
    [111918]=> array(2) { ["weight"]=> int(2) ["attrs"]=> array(3) { ["from_id"]=> string(1) "5" ["link_id"]=> string(10) "4294967295" ["add_time"]=> string(10) "1395127953" } }
    [148]=> array(2) { ["weight"]=> int(1) ["attrs"]=> array(3) { ["from_id"]=> string(1) "2" ["link_id"]=> string(7) "2770294" ["add_time"]=> string(10) "1394852562" } }
    [178]=> array(2) { ["weight"]=> int(1) ["attrs"]=> array(3) { ["from_id"]=> string(1) "2" ["link_id"]=> string(7) "2474558" ["add_time"]=> string(10) "1394852579" } }
    [595]=> array(2) { ["weight"]=> int(1) ["attrs"]=> array(3) { ["from_id"]=> string(1) "2" ["link_id"]=> string(6) "534804" ["add_time"]=> string(10) "1394852862" } }
    [775]=> array(2) { ["weight"]=> int(1) ["attrs"]=> array(3) { ["from_id"]=> string(1) "2" ["link_id"]=> string(7) "3230353" ["add_time"]=> string(10) "1394852980" } }
    [860]=> array(2) { ["weight"]=> int(1) ["attrs"]=> array(3) { ["from_id"]=> string(1) "2" ["link_id"]=> string(7) "2549233" ["add_time"]=> string(10) "1394853048" } }
    [938]=> array(2) { ["weight"]=> int(1) ["attrs"]=> array(3) { ["from_id"]=> string(1) "2" ["link_id"]=> string(7) "3191382" ["add_time"]=> string(10) "1394853114" } }
    [1048]=> array(2) { ["weight"]=> int(1) ["attrs"]=> array(3) { ["from_id"]=> string(1) "2" ["link_id"]=> string(7) "3234645" ["add_time"]=> string(10) "1394853174" } }
    [1395]=> array(2) { ["weight"]=> int(1) ["attrs"]=> array(3) { ["from_id"]=> string(1) "2" ["link_id"]=> string(7) "2661219" ["add_time"]=> string(10) "1394853375" } }
    [1657]=> array(2) { ["weight"]=> int(1) ["attrs"]=> array(3) { ["from_id"]=> string(1) "2" ["link_id"]=> string(7) "2670624" ["add_time"]=> string(10) "1394853540" } }
  }
  ["total"]=> int(543)
  ["total_found"]=> int(543)
  ["time"]=> float(0.109)
  ["words"]=> array(1) {
    ["BMW"]=> array(2) { ["docs"]=> int(543) ["hits"]=> int(741) }
  }
}
string(0) ""

