Coreseek + MySQL + ThinkPHP

Source: Internet
Author: User

I. Preface

1. Motivation for studying Coreseek

I have a blog of my own where I often write technical articles. To search those articles, I could previously only use LIKE fuzzy matching in MySQL against the article content. Once there are many articles, this approach becomes inefficient. So I turned to Coreseek, a Chinese full-text search engine, and successfully applied it to my project.

I hope this write-up helps you avoid the detours I took.

2. Concepts

Sphinx is an open-source search engine that supports full-text retrieval of English text. English has a natural word delimiter, the space, whereas Chinese requires more complex word segmentation. Coreseek is a Chinese full-text search engine for enterprise use that is built on Sphinx. In other words, the kernel of Coreseek is still Sphinx; the biggest difference is that Coreseek includes the Chinese word segmentation tool mmseg.

3. Environment Introduction

System: Ubuntu

Http service: Apache/2.2.22

Mysql: Ver 14.14 Distrib 5.5.41

PHP: PHP 5.3.10

II. Download and install Coreseek

Installation Steps

Download coreseek-3.2.14.tar.gz and place it in /usr/local/src

First, to avoid missing dependency packages during installation, run:

apt-get install make gcc g++ automake libtool mysql-client libmysqlclient15-dev libxml2-dev libexpat1-dev

Run the command above before anything else. Otherwise you may hit all sorts of strange problems caused by missing packages. In my case it pulled in 159 MB of package updates. (This is a lesson learned after falling into various pitfalls.)
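Before starting the build, it can help to confirm that the build tools are actually on PATH. The following is a minimal sketch, not part of the original steps; the function name missing_tools is hypothetical, and it only checks command names, not the development libraries (libmysqlclient, libxml2, libexpat).

```shell
# Hypothetical helper: print the name of every given command
# that cannot be found on PATH. Prints nothing when all are present.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}
```

For example, missing_tools make gcc g++ automake autoconf libtoolize lists whichever of those commands still need to be installed.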

 

1. Install the mmseg word segmentation module.

cd /usr/local/src
tar zxvf coreseek-3.2.14.tar.gz   # unpack
cd coreseek-3.2.14
cd mmseg-3.2.14
./bootstrap   # warnings in the output can be ignored; errors must be resolved
./configure --prefix=/usr/local/mmseg3   # configure
make   # compile
make install   # install

1.1) Possible problem and solution:
Running ./bootstrap fails with the error: ./bootstrap: 27: ./bootstrap: autoconf: not found

Cause: the autoconf and automake tools are not installed. On Ubuntu 10.04 they can be installed with the following command.

sudo apt-get install autoconf automake libtool

1.2) Possible problem: when compiling and installing mmseg, the error "cannot find input file: src/Makefile.in" occurs.
I looked it up and found the following solution:

aclocal   # a perl script program, described as "aclocal - create aclocal.m4 by scanning configure.ac"
libtoolize --force   # this reports an error when run; it can be ignored
automake --add-missing
autoconf
autoheader
make clean

Then recompile:
./configure --prefix=/usr/local/mmseg3
make && make install
The build and installation now complete.

Conclusion: I did not find the root cause of this error, but the fix works. If anyone knows the cause, please leave a comment. Thank you.
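Because the same regeneration sequence is needed again later for libsphinxclient, it may be convenient to wrap it in a function. This is only a sketch under stated assumptions: the names run, regen_and_build, and DRY_RUN are hypothetical, and tolerating a failing make clean is my assumption rather than something the steps above state.

```shell
# Hypothetical helper: run a command, or just print it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical wrapper around the sequence above; $1 is the install prefix.
# Must be called from inside the source directory being built.
regen_and_build() {
    prefix=$1
    run aclocal || return 1
    run libtoolize --force || true      # the article says errors here can be ignored
    run automake --add-missing || return 1
    run autoconf || return 1
    run autoheader || return 1
    run make clean || true              # assumption: tolerate a missing Makefile
    run ./configure --prefix="$prefix" || return 1
    run make || return 1
    run make install
}
```

Running DRY_RUN=1 regen_and_build /usr/local/mmseg3 prints the commands without executing them, which is a cheap way to review the sequence before running it for real.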

2. Install CoreSeek

cd /usr/local/src
cd coreseek-3.2.14
cd csft-3.2.14
sh buildconf.sh   # warnings in the output can be ignored; errors must be resolved
./configure --prefix=/usr/local/coreseek --without-unixodbc --with-mmseg --with-mmseg-includes=/usr/local/mmseg3/include/mmseg/ --with-mmseg-libs=/usr/local/mmseg3/lib/ --with-mysql   # configure
make   # compile
make install   # install

3. Test mmseg word segmentation, coreseek search, and MySQL data source.

cd /usr/local/src

cd coreseek-3.2.14

cd testpack

cat /usr/local/src/coreseek-3.2.14/testpack/var/test.xml   # Chinese characters should display correctly

/usr/local/mmseg3/bin/mmseg -d /usr/local/mmseg3/etc /usr/local/src/coreseek-3.2.14/testpack/var/test.xml

/usr/local/coreseek/bin/indexer -c /usr/local/src/coreseek-3.2.14/testpack/etc/csft.conf --all

/usr/local/coreseek/bin/search -c /usr/local/src/coreseek-3.2.14/testpack/etc/csft.conf network search

/usr/local/coreseek/bin/indexer -c /usr/local/coreseek/etc/sphinx-min.conf.dist

/usr/local/coreseek/bin/indexer -c /usr/local/src/coreseek-3.2.14/testpack/etc/csft.conf --all --rotate   # update the index while the service is running

If no errors are reported, your Coreseek installation is working properly.

3.1) Possible problem and solution:

Running /usr/local/coreseek/bin/indexer -c etc/csft.conf --all reports an error that begins: "xmlpipe2 support NOT compiled in. To use xmlpipe2, install missing"

Cause: the library required for xmlpipe2 support is missing.

Solution:

apt-get install expat-*

Then recompile Coreseek, and remember to run make clean first.

 

4. Configure and use coreseek

cp /usr/local/src/coreseek-3.2.14/testpack/etc/csft_mysql.conf /usr/local/coreseek/etc/csft_mysql.conf   # copy the MySQL data source configuration file
ln -s /usr/local/coreseek/etc/csft_mysql.conf /etc/csft_mysql.conf   # add a symbolic link
vim /etc/csft_mysql.conf   # edit the file

III. Modify the Coreseek configuration file

Take my own configuration file as an example:

/usr/local/coreseek/etc/csft_mysql.conf

# source definition
source mysql
{
    type = mysql
    sql_host = localhost
    sql_user = xxxx
    sql_pass = xxxx
    sql_db = xxxx
    sql_port = 3306
    sql_query_pre = SET NAMES utf8
    sql_query = SELECT id, id, uid, title, data FROM notebook_notepad
    # the first column of sql_query (id) must be an integer
    # title and data are indexed as string/text fields
    sql_attr_uint = id    # the value read from SQL must be an integer
    #sql_attr_timestamp = time    # the value read from SQL must be an integer; used as a time attribute
    sql_attr_uint = uid
    sql_query_info_pre = SET NAMES utf8    # set the correct character set when querying from the command line
    sql_query_info = SELECT * FROM notebook_notepad WHERE id=$id    # read the original data from the database when querying from the command line
}

# index definition
index mysql
{
    source = mysql    # name of the corresponding source
    path = /usr/local/coreseek/var/data/mysql    # change to the actual absolute path, for example: /usr/local/coreseek/var/...
    docinfo = extern
    mlock = 0
    morphology = none
    min_word_len = 1
    html_strip = 0
    # Chinese word segmentation configuration. For details, see: http://www.coreseek.cn/products-install/coreseek_mmseg/
    charset_dictpath = /usr/local/mmseg3/etc/    # BSD and Linux setting; must end with /
    #charset_dictpath = etc/    # Windows setting; must end with /; an absolute path is best, for example: C:/usr/local/coreseek/etc/...
    charset_type = zh_cn.utf-8
}

# global indexer definition
indexer
{
    mem_limit = 128M
}

# searchd service definition
searchd
{
    listen = 9312
    read_timeout = 5
    max_children = 30
    max_matches = 1000
    seamless_rotate = 0
    preopen_indexes = 0
    unlink_old = 1
    pid_file = /usr/local/coreseek/var/log/searchd_mysql.pid    # change to the actual absolute path, for example: /usr/local/coreseek/var/...
    log = /usr/local/coreseek/var/log/searchd_mysql.log    # change to the actual absolute path, for example: /usr/local/coreseek/var/...
    query_log = /usr/local/coreseek/var/log/query_mysql.log    # change to the actual absolute path, for example: /usr/local/coreseek/var/...
}

With this configuration, title and data are indexed as full-text fields, while id and uid are stored as attributes that are returned with each match.
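The configuration comments note that charset_dictpath must end with a slash, which is easy to get wrong when editing by hand. As a rough sketch (the function name check_dictpath is hypothetical, and the check is a plain text match rather than a real parser of the configuration format), the rule could be verified like this:

```shell
# Hypothetical check: succeed when every active (uncommented)
# charset_dictpath value in the given config file ends with "/".
check_dictpath() {
    ! grep -E '^[[:space:]]*charset_dictpath[[:space:]]*=' "$1" |
        sed 's/#.*$//; s/[[:space:]]*$//' |   # drop trailing comment and whitespace
        grep -qv '/$'                         # any line NOT ending in "/" is a problem
}
```

Usage would be something like check_dictpath /usr/local/coreseek/etc/csft_mysql.conf || echo "charset_dictpath must end with /". A file with no charset_dictpath line at all also passes, since there is nothing to violate the rule.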

OK. Once the configuration is complete, build the index and restart the Coreseek service. You are then free of the limits of MySQL LIKE queries: searches work on both Chinese and English text, with word segmentation. How about that? Has it opened the door to a new world?

The following describes errors that may occur when rebuilding the index, along with their solutions. If you are interested, read on. Otherwise, skip to the next section: PHP testing Coreseek.

Error when rebuilding the index: WARNING: failed to open pid_file '/usr/local/coreseek/var/log/searchd_mysql.pid'.

Solution:
First stop the Coreseek service:
/usr/local/coreseek/bin/searchd -c /usr/local/coreseek/etc/csft_mysql.conf --stop   # stop the service

Then start it again:
/usr/local/coreseek/bin/searchd -c /usr/local/coreseek/etc/csft_mysql.conf   # start the service

Then build the index again:
/usr/local/coreseek/bin/indexer -c /usr/local/coreseek/etc/csft_mysql.conf --all   # build the index

If the message is: FATAL: failed to lock /usr/local/coreseek/var/data/xxxx.spl: Resource temporarily unavailable, will not index. Try --rotate option.

then rebuild the index with --rotate:
/usr/local/coreseek/bin/indexer -c /usr/local/coreseek/etc/csft_mysql.conf --all --rotate
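The "failed to lock" case arises when searchd is running and holds the index files, so the choice between --all and --all --rotate can be made from the pid file. The sketch below is one possible way to do that, not the author's method; the function name indexer_args is hypothetical, and it assumes the script runs as a user allowed to signal the searchd process (otherwise kill -0 fails and the function falls back to --all).

```shell
# Hypothetical helper: print the indexer arguments to use,
# given the path to searchd's pid file.
indexer_args() {
    pid_file=$1
    if [ -r "$pid_file" ] && kill -0 "$(cat "$pid_file")" 2>/dev/null; then
        echo "--all --rotate"   # searchd is running and holds the index lock
    else
        echo "--all"            # no running searchd was detected
    fi
}
```

It could then be used as /usr/local/coreseek/bin/indexer -c /usr/local/coreseek/etc/csft_mysql.conf $(indexer_args /usr/local/coreseek/var/log/searchd_mysql.pid), leaving the substitution unquoted so the two flags are passed as separate arguments.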

 

IV. PHP testing Coreseek

1. Put sphinxapi.php in the test directory.

cp /usr/local/src/coreseek-3.2.14/testpack/api/sphinxapi.php ./

vim test.php

<?php
header("Content-type: text/html; charset=UTF-8");
require("./sphinxapi.php");

$s = new SphinxClient;
$s->setServer("127.0.0.1", 9312);
// SPH_MATCH_ALL: match all query words (default mode); SPH_MATCH_ANY: match any of the query words; SPH_MATCH_EXTENDED2: supports special operators in queries
$s->setMatchMode(SPH_MATCH_ALL);
$s->setMaxQueryTime(30);          // set the maximum search time (milliseconds)
$s->SetArrayResult(false);        // false: matches are keyed by document ID
$s->SetSelect("*");               // set which columns are returned, like a SQL select list
$s->SetRankingMode(SPH_RANK_BM25);
$s->SetLimits(0, 30, 1000, 0);    // set the result set offset and limits
$res = $s->query('coreseek', 'mysql', '--single-0-query--');   // [coreseek] keyword, [mysql] data source
$err = $s->GetLastError();
echo '<pre>';
var_dump($res);
var_dump($res['matches']);
var_export($err);
echo '</pre>';

php5 test.php

Result: the matches entry holds the matched result set.

 

V. Use Coreseek in ThinkPHP

1. Install the Sphinx extension

The official Coreseek tutorial recommends including a PHP file directly. In fact, PHP has a standalone sphinx module that can operate Coreseek directly (Coreseek is Sphinx!). It is already part of the official PHP function library, and it is more efficient. However, the PHP module depends on the libsphinxclient package. I installed the Sphinx extension by following the steps in the article below.

Thanks to http://blog.csdn.net/e421083458/article/details/21529969

[Step 1] install libsphinxclient

# cd /var/install/coreseek-4.1-beta/csft-4.1/api/libsphinxclient/
# ./configure --prefix=/usr/local/sphinxclient
configure: creating ./config.status
config.status: creating Makefile
config.status: error: cannot find input file: Makefile.in
// error: configure failed

// Handling the configure error:
// if configure reports "config.status: error: cannot find input file: src/Makefile.in", run the following commands and then recompile
# aclocal
# libtoolize --force
# automake --add-missing
# autoconf
# autoheader
# make clean

// Recompile, starting again from configure
# ./configure
# make && make install

[Step 2] install the PHP extension of sphinx

http://pecl.php.net/package/sphinx
# wget http://pecl.php.net/get/sphinx-1.3.0.tgz
# tar zxvf sphinx-1.3.0.tgz
# cd sphinx-1.3.0
# phpize
# ./configure --with-php-config=/usr/bin/php-config --with-sphinx=/usr/local/sphinxclient
# make && make install
# cd /etc/php.d/
# cp gd.ini sphinx.ini
# vi sphinx.ini
extension=sphinx.so
# service php-fpm restart

After installing the PHP Sphinx extension, you can use $coreseek = new SphinxClient() directly, without including the source file.

In short, this is how I use Coreseek in TP to search and highlight keywords:

1. Use Sphinx to find the set of matching ids and uids.
2. Then run $sql = "select * from post where id in ($ids)"; $res = mysql_query($sql); to get the actual data from the database.
3. Use BuildExcerpts to highlight the keywords in title and data, then display the results page by page.

Key code:

$where = array();
$where['uid'] = $uid;
if (!empty($search)) {
    // if there is something to search for, ask coreseek for the matching ids
    $coreseek = new \SphinxClient();
    $coreseek->setServer("127.0.0.1", 9312);
    // SPH_MATCH_ALL: match all query words (default mode); SPH_MATCH_ANY: match any of the query words; SPH_MATCH_EXTENDED2: supports special operators in queries
    $coreseek->setMatchMode(SPH_MATCH_ALL);
    $coreseek->setMaxQueryTime(30);        // set the maximum search time
    $coreseek->SetArrayResult(false);      // false: matches are keyed by document ID
    $coreseek->SetSelect("*");             // set which columns are returned, like a SQL select list
    $coreseek->SetLimits(0, 30, 1000, 0);  // set the result set offset and limits
    $res = $coreseek->query($search, 'mysql', '--single-0-query--');
    $key = array_keys($res['matches']);
    $where['id'] = array('in', $key);
    $coreseek->close();
} else {
}
// get the total number of records
$total = $mod->where($where)->count();

Key highlighted code:

if (!empty($search)) {
    $page->parameter['search'] = $search;
    // keyword highlighting
    $opt = array(
        "before_match" => "<font style='font-weight:bold;color:#f00'>",
        "after_match"  => "</font>",
    );
    $coreseek1 = new \SphinxClient();
    $coreseek1->setServer("127.0.0.1", 9312);
    $coreseek1->SetMatchMode(SPH_MATCH_ALL);
    $i = 0;
    $tags_title = array();
    foreach ($info as $key => $row) {
        $tags_title[] = $row['title'];
    }
    $replace_title = $coreseek1->BuildExcerpts($tags_title, 'mysql', $search, $opt);
    // write through $info[$key]; no by-reference loop variable is needed
    foreach ($info as $key => $row) {
        $info[$key]['title'] = $replace_title[$key];
    }
    $coreseek1->close();
}

OK. Coreseek now runs well in TP, and this article can end here. The above walks through the installation step by step; I hope it helps anyone who needs it. There is a lot of information in this article, so if I have missed anything, corrections are welcome!

 
