Installing Coreseek and the Sphinx PHP extension on CentOS

Source: Internet
Author: User
Tags php language

One, Introduction to Coreseek

Official website: http://www.coreseek.cn/

Coreseek is Chinese full-text search software released as open source under the GPLv2 license. It is developed on top of Sphinx and published independently, and it focuses on Chinese search and information processing. Typical application scenarios include industry/vertical search, forum/site search, database search, document/literature search, information retrieval, and data mining. Commercial use (for example, embedding it in other programs) requires a commercial license.

Coreseek is a full-text search engine that supports Chinese. It aims to give other applications fast, low-footprint, highly relevant Chinese full-text search, and it integrates easily with SQL databases and scripting languages.

The native search API shipped with Sphinx supports PHP, Python, Perl, Ruby, and Java. The search API is very lightweight and can be ported to a new language within a few hours. Third-party API ports and plug-ins provide support for Perl, C#, Haskell, Ruby on Rails, and possibly other languages or frameworks.


Two, Installing Coreseek

Note: This tutorial installs Coreseek on CentOS with MySQL as the data source. Installing MySQL itself is not covered.


1, Download the Coreseek 3.2 stable release. For other versions, go to the official website.

cd /usr/local/src/

wget http://www.coreseek.cn/uploads/csft/3.2/coreseek-3.2.14.tar.gz

tar xzvf coreseek-3.2.14.tar.gz

cd coreseek-3.2.14

The following packages must be installed before building Coreseek (this list is for 64-bit CentOS):

yum install make gcc g++ gcc-c++ libtool autoconf automake imake mysql-devel libxml2-devel expat-devel

For other systems please refer to http://www.coreseek.cn/product_install/install_on_bsd_linux/#deps
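
Before starting the build, it can help to confirm that the build tools are actually on the PATH. The following is a minimal sketch, not part of the official instructions; the helper name missing_tools is made up for illustration, and it only checks commands (packages such as mysql-devel provide headers rather than commands, so they are not covered).

```shell
#!/bin/sh
# Hypothetical helper: print every command from the argument list
# that cannot be found on PATH, one per line.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Commands provided by the packages in the yum line above.
missing=$(missing_tools make gcc g++ libtool autoconf automake)

if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line.
    echo "Missing build tools:" $missing
else
    echo "All build tools found."
fi
```

If anything is reported as missing, rerun the yum command before continuing.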


2, Install mmseg

$ cd mmseg-3.2.14

$ ./bootstrap    # warnings in the output can be ignored; errors must be resolved

$ ./configure --prefix=/usr/local/mmseg3

$ make && make install

$ cd ..


## If you see "libtool: unrecognized option '--tag=CC'", see the libtool problem solution

## After installation, the dictionaries and configuration files used by mmseg are installed into /usr/local/mmseg3/etc

## Chinese word segmentation test; if the output looks garbled, check the locale and UTF-8 display settings of the current environment

$ /usr/local/mmseg3/bin/mmseg -d /usr/local/mmseg3/etc src/t1.txt

Chinese/x/x Word/x Test/x

Chinese/x Shanghai/x


Word Splite took: 1 ms.
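
If the segmentation output looks garbled, the locale is the first thing to check. The sketch below is only an illustration under simple assumptions: the function name is_utf8_locale is hypothetical, and it looks only at the standard LC_ALL, LC_CTYPE, and LANG variables, not at the terminal emulator's own encoding setting.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a locale name mentions UTF-8
# (matches both "UTF-8" and "utf8" spellings, case-insensitively).
is_utf8_locale() {
    case "$1" in
        *[Uu][Tt][Ff]-8*|*[Uu][Tt][Ff]8*) return 0 ;;
        *) return 1 ;;
    esac
}

# Effective character locale: LC_ALL overrides LC_CTYPE, which overrides LANG.
current="${LC_ALL:-${LC_CTYPE:-${LANG:-}}}"

if is_utf8_locale "$current"; then
    echo "Locale '$current' is UTF-8."
else
    echo "Locale '$current' is not UTF-8; mmseg output may look garbled."
fi
```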



3, Install Coreseek

$ cd csft-3.2.14

## Run configure to set up the build:

$ sh buildconf.sh

$ ./configure --prefix=/usr/local/coreseek --without-unixodbc --with-mmseg --with-mmseg-includes=/usr/local/mmseg3/include/mmseg/ --with-mmseg-libs=/usr/local/mmseg3/lib/ --with-mysql


If configure reports that the MySQL include files cannot be found, use the following command instead, adjusting the MySQL paths to your installation:


$ ./configure --prefix=/usr/local/coreseek --without-unixodbc --with-mmseg --with-mmseg-includes=/usr/local/mmseg3/include/mmseg/ --with-mmseg-libs=/usr/local/mmseg3/lib/ --with-mysql-includes=/alidata/server/mysql/include/ --with-mysql-libs=/alidata/server/mysql/bin/

$ make && make install
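
After make install finishes, a quick check that the three programs used in the rest of this tutorial were installed can save time later. This is a rough sketch; the helper name missing_binaries is made up, and the prefix is the one passed to configure above.

```shell
#!/bin/sh
# Hypothetical helper: print the path of each expected Coreseek program
# that is not present as an executable file under the given prefix.
missing_binaries() {
    # $1 = installation prefix, e.g. /usr/local/coreseek
    for name in indexer search searchd; do
        [ -x "$1/bin/$name" ] || printf '%s\n' "$1/bin/$name"
    done
}

missing=$(missing_binaries /usr/local/coreseek)

if [ -n "$missing" ]; then
    echo "Not installed:"
    echo "$missing"
else
    echo "indexer, search and searchd are installed."
fi
```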



4, Test Coreseek

$ cd ../testpack

$ /usr/local/coreseek/bin/indexer -c etc/csft.conf

## Normal output looks like this:

Coreseek fulltext 3.2 [Sphinx 0.9.9-release (r2117)]

Copyright (c) 2007-2010,

Beijing Choice Software Technologies Inc (http://www.coreseek.com)


using config file 'etc/csft.conf'...

total 0 reads, 0.000 sec, 0.0 kb/call avg, 0.0 msec/call avg

total 0 writes, 0.000 sec, 0.0 kb/call avg, 0.0 msec/call avg

##

## csft 4.0 prints this instead: ERROR: nothing to do.

##

$ /usr/local/coreseek/bin/indexer -c etc/csft.conf --all

## Normal output when indexing all data (csft 4.0 is similar):

Coreseek fulltext 3.2 [Sphinx 0.9.9-release (r2117)]

Copyright (c) 2007-2010,

Beijing Choice Software Technologies Inc (http://www.coreseek.com)


using config file 'etc/csft.conf'...

indexing index 'xml'...

collected 3 docs, 0.0 MB

sorted 0.0 Mhits, 100.0% done

total 3 docs, 7585 bytes

total 0.075 sec, 101043 bytes/sec, 39.96 docs/sec

total 2 reads, 0.000 sec, 5.6 kb/call avg, 0.0 msec/call avg

total 7 writes, 0.000 sec, 3.9 kb/call avg, 0.0 msec/call avg


$ /usr/local/coreseek/bin/indexer -c etc/csft.conf xml

## Normal output when indexing one named index (csft 4.0 is similar):

Coreseek fulltext 3.2 [Sphinx 0.9.9-release (r2117)]

Copyright (c) 2007-2010,

Beijing Choice Software Technologies Inc (http://www.coreseek.com)


using config file 'etc/csft.conf'...

indexing index 'xml'...

collected 3 docs, 0.0 MB

sorted 0.0 Mhits, 100.0% done

total 3 docs, 7585 bytes

total 0.069 sec, 109614 bytes/sec, 43.35 docs/sec

total 2 reads, 0.000 sec, 5.6 kb/call avg, 0.0 msec/call avg

total 7 writes, 0.000 sec, 3.9 kb/call avg, 0.0 msec/call avg


$ /usr/local/coreseek/bin/search -c etc/csft.conf

## Normal output for a test search (csft 4.0 is similar):

Coreseek fulltext 3.2 [Sphinx 0.9.9-release (r2117)]

Copyright (c) 2007-2010,

Beijing Choice Software Technologies Inc (http://www.coreseek.com)


using config file 'etc/csft.conf'...

index 'xml': query '': returned 3 matches of 3 total in 0.093 sec


displaying matches:

1. document=1, weight=1, published=Thu Apr 1 22:20:07, author_id=1

2. document=2, weight=1, published=Thu Apr 1 23:25:48, author_id=1

3. document=3, weight=1, published=Thu Apr 1 12:01:00, author_id=2


words:



$ /usr/local/coreseek/bin/search -c etc/csft.conf -a Twittter and Opera all offer search services

## Normal output for a keyword search (csft 4.0 is similar). The query in the original article is a Chinese sentence, shown here in translation; the word list in the output is the translated result of Chinese word segmentation:

Coreseek fulltext 3.2 [Sphinx 0.9.9-release (r2117)]

Copyright (c) 2007-2010,

Beijing Choice Software Technologies Inc (http://www.coreseek.com)


using config file 'etc/csft.conf'...

index 'xml': query 'Twittter and Opera all offer search services': returned 3 matches of 3 total in 0.038 sec


displaying matches:

1. document=3, weight=24, published=Thu Apr 1 12:01:00, author_id=2

2. document=1, weight=4, published=Thu Apr 1 22:20:07, author_id=1

3. document=2, weight=3, published=Thu Apr 1 23:25:48, author_id=1


words:

1. 'Twittter': 1 documents, 3 hits

2. 'and': 3 documents, hits

3. 'Opera': 1 documents, hits

4. 'All': 2 documents, 4 hits

5. 'Offer': 0 documents, 0 hits

6. 'Up': 3 documents, hits

7. 'Search': 2 documents, 5 hits

8. 'Service': 1 documents, 1 hits


$ /usr/local/coreseek/bin/searchd -c etc/csft.conf

## Normal output when the search service starts (csft 4.0 is similar):

Coreseek fulltext 3.2 [Sphinx 0.9.9-release (r2117)]

Copyright (c) 2007-2010,

Beijing Choice Software Technologies Inc (http://www.coreseek.com)


using config file 'etc/csft.conf'...

listening on all interfaces, port=9312



Three, Configuring Coreseek to use a MySQL data source


1, Configure the csft_mysql.conf file

Copy the sample MySQL configuration file into the etc/ directory of the Coreseek installation (e.g. /usr/local/coreseek/etc/):

cp /usr/local/src/coreseek-3.2.14/testpack/etc/csft_mysql.conf /usr/local/coreseek/etc/

cd /usr/local/coreseek/etc/

vi csft_mysql.conf

In the configuration below, the database connection settings, SQL queries, source and index names, and file paths are the parts you need to set for your own environment.


Official reference document (data source configuration, MySQL data source): http://www.coreseek.cn/products-install/datasource/


For other data sources, please refer to the official documentation.

==============================================================

# source definition

source phperz

{

type = mysql


sql_host = localhost

sql_user = root

sql_pass = xxxx

sql_db = phperz

sql_port = 3306

sql_query_pre = SET NAMES utf8


sql_query = SELECT id, title, descs, status FROM article

# the first column of sql_query (id) must be an integer

# title and descs are string/text fields and are full-text indexed

sql_attr_uint = status # the value read from SQL must be an integer

#sql_attr_timestamp = date_added # the value read from SQL must be an integer; used as a time attribute


sql_query_info_pre = SET NAMES utf8 # for command-line queries: set the correct character set

sql_query_info = SELECT * FROM article WHERE id=$id # for command-line queries: read the original row from the database

}


# index definition

index phperz

{

source = phperz # name of the corresponding source

path = /usr/local/coreseek/var/data/phperz # change to the absolute path you actually use, for example: /usr/local/coreseek/var/...

docinfo = extern

mlock = 0

morphology = none

min_word_len = 1

html_strip = 0


# Chinese word segmentation settings; for details see: http://www.coreseek.cn/products-install/coreseek_mmseg/

charset_dictpath = /usr/local/mmseg3/etc/ # setting for BSD and Linux; must end with /

#charset_dictpath = etc/ # setting for Windows; must end with /; an absolute path is best, for example: C:/usr/local/coreseek/etc/...

charset_type = zh_cn.utf-8

}

# global indexer definition

indexer

{

mem_limit = 128M

}


# searchd service definition

searchd

{

listen = 9312

read_timeout = 5

max_children = 30

max_matches = 1000

seamless_rotate = 0

preopen_indexes = 0

unlink_old = 1

pid_file = /usr/local/coreseek/var/log/searchd_mysql.pid # change to the absolute path you actually use, for example: /usr/local/coreseek/var/...

log = /usr/local/coreseek/var/log/searchd_mysql.log # change to the absolute path you actually use, for example: /usr/local/coreseek/var/...

query_log = /usr/local/coreseek/var/log/query_mysql.log # change to the absolute path you actually use, for example: /usr/local/coreseek/var/...

}

==============================================================



2, Build the index

Change the paths below to match your own installation.

/usr/local/coreseek/bin/indexer -c /usr/local/coreseek/etc/csft_mysql.conf --all


An error you may encounter:

ERROR: index 'phperz': sql_connect: Can't connect to local MySQL server through socket '/var/lib/mysql/mysql.sock' (2) (DSN=mysql://root:***@localhost:3306/phperz).

This happens when the MySQL socket file is not at the path the client library expects.

Confirm the actual path of your mysql.sock and create a symbolic link to it, for example:

ln -s /tmp/mysql.sock /var/lib/mysql/mysql.sock
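
To find out where the socket really is before creating the link, something like the following sketch may help. The helper name first_socket and the list of candidate paths are assumptions for illustration; the authoritative location is the socket setting in your my.cnf. As an alternative to the symbolic link, the Sphinx source option sql_sock can point the source definition at the socket path directly.

```shell
#!/bin/sh
# Hypothetical helper: print the first argument that exists and is a
# UNIX socket; return non-zero if none of them is.
first_socket() {
    for path in "$@"; do
        if [ -S "$path" ]; then
            printf '%s\n' "$path"
            return 0
        fi
    done
    return 1
}

# Candidate locations are guesses; extend the list for your system.
if sock=$(first_socket /var/lib/mysql/mysql.sock /tmp/mysql.sock /var/run/mysqld/mysqld.sock); then
    echo "MySQL socket found at: $sock"
else
    echo "No MySQL socket found at the candidate paths; check the socket setting in my.cnf."
fi
```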


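3, Once indexing is complete, test a search:

/usr/local/coreseek/bin/search -c /usr/local/coreseek/etc/csft_mysql.conf I am a little apple

Test results (this sample output appears to come from a different setup, so its config file name, index name, and fields differ from the example configuration above):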

Coreseek fulltext 3.2 [Sphinx 0.9.9-release (r2117)]

Copyright (c) 2007-2011,

Beijing Choice Software Technologies Inc (http://www.coreseek.com)


using config file '/usr/local/coreseek/etc/csft.conf'...

index 'mysql': query 'I am a little apple': returned 1 matches of 1 total in 0.003 sec


displaying matches:

1. document=291, weight=4, prize=1

id=291

winner_name=Chaoli

subject_name=I'm a little apple

school_name=Beijing Haidian District First Kindergarten

sub_url=http://www.xxxxx.com

prize=1


words:

1. 'Me': documents, hits

2. 'Yes': documents, hits

3. 'Little': 5 documents, 5 hits

4. 'Apple': 2 documents, 2 hits


------------------ end of test results ------------------


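Four, Installing the Sphinx extension for PHP (using Coreseek from PHP)


First build the libsphinxclient library that ships with the Coreseek source:

cd /web/src/coreseek-3.2.14/csft-3.2.14/api/libsphinxclient

./configure --prefix=/usr/local/sphinxclient

make && make install


Then build the PHP extension. sphinx-1.3.0 is the source of the PECL sphinx extension, which must already be downloaded and unpacked in this directory:

cd /web/src/sphinx-1.3.0

/usr/local/php/bin/phpize

./configure --with-php-config=/usr/local/php/bin/php-config --with-sphinx=/usr/local/sphinxclient

make

make install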

Edit php.ini (vi /usr/local/php/etc/php.ini) and add the following two lines:

[sphinx]

extension=sphinx.so

The Sphinx extension is now installed. Restart Apache and test it.
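
Before testing through Apache, the command-line PHP binary can give a quick hint about whether the extension loads. This is only a sketch: the helper name has_sphinx_module is made up, and it assumes the command-line php reads the same php.ini as Apache, which is not always the case.

```shell
#!/bin/sh
# Hypothetical helper: read a module list (one name per line, as printed
# by "php -m") on stdin and succeed if a line is exactly "sphinx".
has_sphinx_module() {
    grep -qix 'sphinx'
}

if command -v php >/dev/null 2>&1 && php -m 2>/dev/null | has_sphinx_module; then
    echo "sphinx extension is loaded in command-line PHP."
else
    echo "sphinx extension not detected; check extension=sphinx.so and the extension directory."
fi
```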

The test code is as follows:

<?php

$cl = new SphinxClient();

// Set the Sphinx server address and port; use localhost if searchd runs on this machine

$cl->setServer("192.168.1.23", 9312); // must match the searchd listen port

// Return the matches as a plain array

$cl->setArrayResult(true);

$cl->setMatchMode(SPH_MATCH_BOOLEAN);

// Arguments: keywords, index name (use the index name defined in your config file)

$result = $cl->query('I am a little apple', 'mysql');

if ($result === false) {

echo "Query failed: " . $cl->getLastError() . ".\n";

} else {

if ($cl->getLastWarning()) {

echo "WARNING: " . $cl->getLastWarning() . "\n";

}

print_r($result);

}

?>


Five, Routine Coreseek maintenance


Start the search service:

/usr/local/coreseek/bin/searchd -c /usr/local/coreseek/etc/csft_mysql.conf

Stop the search service:

/usr/local/coreseek/bin/searchd -c /usr/local/coreseek/etc/csft_mysql.conf --stop

Build the index:

/usr/local/coreseek/bin/indexer -c /usr/local/coreseek/etc/csft_mysql.conf --all

Rebuild the index while searchd is running:

/usr/local/coreseek/bin/indexer -c /usr/local/coreseek/etc/csft_mysql.conf --all --rotate


Add the start command to the system startup scripts so that searchd starts at boot.

Add the rebuild command to a scheduled task (cron) so that it runs daily.
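
The two maintenance steps could look roughly like the sketch below, which only prints the lines to add and changes nothing on the system. The helper names boot_line and cron_line, the use of /etc/rc.local, and the 03:00 schedule are assumptions for illustration; adapt them to your own startup mechanism and schedule.

```shell
#!/bin/sh
PREFIX=/usr/local/coreseek
CONF=$PREFIX/etc/csft_mysql.conf

# Hypothetical helper: the command that starts searchd at boot.
boot_line() {
    printf '%s/bin/searchd -c %s\n' "$PREFIX" "$CONF"
}

# Hypothetical helper: a crontab entry that rebuilds all indexes daily.
# $1 = hour (0-23), $2 = minute (0-59)
cron_line() {
    printf '%s %s * * * %s/bin/indexer -c %s --all --rotate >/dev/null 2>&1\n' \
        "$2" "$1" "$PREFIX" "$CONF"
}

echo "# Append this line to /etc/rc.local (assumed startup script):"
boot_line
echo "# Add this line with 'crontab -e' to rebuild daily at 03:00:"
cron_line 3 0
```

The --rotate flag matters here because searchd is normally running when the scheduled rebuild starts.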

