Coreseek + MySQL + ThinkPHP
I. Preface
1. Why I looked into Coreseek
I run my own blog and often write technical articles on it. Until now, searching those articles meant using LIKE fuzzy matching in MySQL. Once there are many articles, that approach is clearly not efficient. So I turned to Coreseek, a Chinese full-text search engine, and successfully applied it to my project.
I hope this walkthrough helps you avoid the detours I took.
2. Concepts
Sphinx is an open-source search engine that supports full-text search for English. English words are naturally separated by spaces, whereas Chinese needs real word segmentation. Chinese developers built Coreseek, a Chinese full-text search engine based on Sphinx that is suitable for enterprise use. In other words, the kernel of Coreseek is still Sphinx. The biggest difference is that Coreseek adds mmseg, a Chinese word segmentation tool.
3. Environment
System: Ubuntu
HTTP server: Apache/2.2.22
MySQL: Ver 14.14 Distrib 5.5.41
PHP: PHP 5.3.10
II. Download and install Coreseek
Installation Steps
Download coreseek-3.2.14.tar.gz and place it in /usr/local/src.
First, to avoid missing dependencies during installation, run:
apt-get install make gcc g++ automake libtool mysql-client libmysqlclient15-dev libxml2-dev libexpat1-dev
Run the command above before building. Otherwise you may hit all sorts of strange errors caused by missing packages. In my case it pulled in 159M of packages. (I learned this the hard way, after falling into several pitfalls.)
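Before starting the build, it can save time to confirm that the basic build tools are actually on the PATH. The following is only a minimal sketch and not part of the original procedure: the function name missing_tools is hypothetical, and it checks executables only, not development libraries such as libmysqlclient.

```shell
#!/bin/sh
# Print the name of every listed tool that cannot be found on PATH.
# missing_tools is a hypothetical helper name used only for illustration.
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
        fi
    done
}

# Example: check the tools this build relies on; prints nothing if all exist.
missing_tools make gcc g++ automake autoconf libtoolize
```

If the call prints any names, install the matching packages first and run it again.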
1. Install the mmseg word segmentation module.
cd /usr/local/src
tar zxvf coreseek-3.2.14.tar.gz   # unpack
cd coreseek-3.2.14
cd mmseg-3.2.14
./bootstrap   # warnings in the output can be ignored; errors must be resolved
./configure --prefix=/usr/local/mmseg3   # configure
make   # compile
make install   # install
1.1) Possible problem and solution:
Running ./bootstrap fails with the error: ./bootstrap: 27: ./bootstrap: autoconf: not found
Cause: the automake tools are not installed. On Ubuntu (10.04) they can be installed with the following command.
sudo apt-get install autoconf automake libtool
1.2) Possible problem: while compiling and installing mmseg, the error "cannot find input file: src/Makefile.in" appears.
I looked it up and found the following solution:
aclocal   # a Perl script, described as "aclocal - create aclocal.m4 by scanning configure.ac"
libtoolize --force   # this reported an error when I ran it; it can be ignored
automake --add-missing
autoconf
autoheader
make clean
Then configure and compile again:
./configure --prefix=/usr/local/mmseg3
make && make install
After this, mmseg compiled and installed successfully.
Conclusion: I never found the cause of this error, but the fix above works. If anyone knows the cause, please leave a comment. Thank you.
2. Install Coreseek
cd /usr/local/src
cd coreseek-3.2.14
cd csft-3.2.14
sh buildconf.sh   # warnings in the output can be ignored; errors must be resolved
./configure --prefix=/usr/local/coreseek --without-unixodbc --with-mmseg --with-mmseg-includes=/usr/local/mmseg3/include/mmseg/ --with-mmseg-libs=/usr/local/mmseg3/lib/ --with-mysql   # configure
make   # compile
make install   # install
3. Test mmseg word segmentation, Coreseek search, and the MySQL data source.
cd /usr/local/src
cd coreseek-3.2.14
cd testpack
cat /usr/local/src/coreseek-3.2.14/testpack/var/test.xml   # Chinese characters should display correctly
/usr/local/mmseg3/bin/mmseg -d /usr/local/mmseg3/etc /usr/local/src/coreseek-3.2.14/testpack/var/test.xml
/usr/local/coreseek/bin/indexer -c /usr/local/src/coreseek-3.2.14/testpack/etc/csft.conf --all
/usr/local/coreseek/bin/search -c /usr/local/src/coreseek-3.2.14/testpack/etc/csft.conf network search
/usr/local/coreseek/bin/indexer -c /usr/local/coreseek/etc/sphinx-min.conf.dist
/usr/local/coreseek/bin/indexer -c /usr/local/src/coreseek-3.2.14/testpack/etc/csft.conf --all --rotate   # start the service and update the index
If no errors are reported, your Coreseek installation is working properly.
3.1) Possible problem and solution:
Running /usr/local/coreseek/bin/indexer -c etc/csft.conf --all reports the error: xmlpipe2 support NOT compiled in. To use xmlpipe2, install missing
Cause:
The XML library that xmlpipe2 depends on (expat) was missing when Coreseek was built.
Solution:
apt-get install expat-*
Then recompile Coreseek, and remember to run make clean first.
4. Configure and use Coreseek
cp /usr/local/src/coreseek-3.2.14/testpack/etc/csft_mysql.conf /usr/local/coreseek/etc/csft_mysql.conf   # copy the MySQL data source configuration file
ln -s /usr/local/coreseek/etc/csft_mysql.conf /etc/csft_mysql.conf   # add a symlink
vim /etc/csft_mysql.conf   # edit the configuration
III. Modify the Coreseek configuration file
Take my own configuration file as an example:
/usr/local/coreseek/etc/csft_mysql.conf
# source definition
source mysql
{
    type = mysql
    sql_host = localhost
    sql_user = xxxx
    sql_pass = xxxx
    sql_db = xxxx
    sql_port = 3306
    sql_query_pre = SET NAMES utf8
    sql_query = SELECT id, id, uid, title, data FROM notebook_notepad
    # the first column of sql_query (id) must be an integer
    # title and data are indexed as string/text fields
    sql_attr_uint = id   # the value read from SQL must be an integer
    #sql_attr_timestamp = time   # the value read from SQL must be an integer; used as a time attribute
    sql_attr_uint = uid
    sql_query_info_pre = SET NAMES utf8   # set the correct character set when querying from the command line
    sql_query_info = SELECT * FROM notebook_notepad WHERE id=$id   # when querying from the command line, read the original row from the database
}

# index definition
index mysql
{
    source = mysql   # name of the corresponding source
    path = /usr/local/coreseek/var/data/mysql   # change to the actual absolute path, e.g. /usr/local/coreseek/var/...
    docinfo = extern
    mlock = 0
    morphology = none
    min_word_len = 1
    html_strip = 0
    # Chinese word segmentation settings; for details see http://www.coreseek.cn/products-install/coreseek_mmseg/
    charset_dictpath = /usr/local/mmseg3/etc/   # BSD/Linux setting; must end with /
    #charset_dictpath = etc/   # Windows setting; must end with /; an absolute path is best, e.g. C:/usr/local/coreseek/etc/...
    charset_type = zh_cn.utf-8
}

# global indexer definition
indexer
{
    mem_limit = 128M
}

# searchd service definition
searchd
{
    listen = 9312
    read_timeout = 5
    max_children = 30
    max_matches = 1000
    seamless_rotate = 0
    preopen_indexes = 0
    unlink_old = 1
    pid_file = /usr/local/coreseek/var/log/searchd_mysql.pid   # change to the actual absolute path, e.g. /usr/local/coreseek/var/...
    log = /usr/local/coreseek/var/log/searchd_mysql.log   # change to the actual absolute path, e.g. /usr/local/coreseek/var/...
    query_log = /usr/local/coreseek/var/log/query_mysql.log   # change to the actual absolute path, e.g. /usr/local/coreseek/var/...
}
With this configuration, the id, uid, title, and data fields are served from the index.
OK. Once the configuration is complete, restart the Coreseek service and rebuild the index. You are then free of the limits of MySQL's LIKE matching: queries can be in Chinese or English, and word segmentation is applied. How about that? It feels like a door to a new world has opened.
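Several of the configuration values are absolute paths, and a missing directory is an easy thing to overlook before restarting the service. As a rough sketch (the helper names conf_value and check_conf_dirs are hypothetical, and the parsing only handles simple "key = value" lines with paths that contain no spaces), the paths can be checked like this:

```shell
#!/bin/sh
# Print the value(s) assigned to a key in a Sphinx/Coreseek-style config file.
# Hypothetical helper: handles plain "key = value" lines and strips a
# trailing "# comment"; commented-out lines are ignored.
conf_value() {
    key=$1
    file=$2
    sed -n "s/^[[:space:]]*${key}[[:space:]]*=[[:space:]]*//p" "$file" |
        sed 's/[[:space:]]*#.*$//'
}

# Report every path setting whose parent directory does not exist.
# Returns 0 when all directories exist, 1 otherwise.
check_conf_dirs() {
    file=$1
    status=0
    for key in path pid_file log query_log; do
        for value in $(conf_value "$key" "$file"); do
            dir=$(dirname "$value")
            if [ ! -d "$dir" ]; then
                echo "missing directory for $key: $dir"
                status=1
            fi
        done
    done
    return $status
}
```

For example, check_conf_dirs /usr/local/coreseek/etc/csft_mysql.conf prints one line per missing directory and nothing when everything is in place.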
The following describes errors you may hit when rebuilding the index, along with their solutions. Read on if you are interested; otherwise, skip to the next section: testing Coreseek from PHP.
Error when rebuilding the index: WARNING: failed to open pid_file '/usr/local/coreseek/var/log/searchd_mysql.pid'.
Solution:
Try stopping the Coreseek service:
/usr/local/coreseek/bin/searchd -c /usr/local/coreseek/etc/csft_mysql.conf --stop   # stop the service
Then start it again:
/usr/local/coreseek/bin/searchd -c /usr/local/coreseek/etc/csft_mysql.conf   # start the service
Then build the index again:
/usr/local/coreseek/bin/indexer -c /usr/local/coreseek/etc/csft_mysql.conf --all   # build the index
If the message is: FATAL: failed to lock /usr/local/coreseek/var/data/xxxx.spl: Resource temporarily unavailable, will not index. Try --rotate option.
Then rebuild the index with --rotate:
/usr/local/coreseek/bin/indexer -c /usr/local/coreseek/etc/csft_mysql.conf --all --rotate
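The lock error above appears when the indexer runs without --rotate while searchd is holding the index files. One way to avoid guessing is to look at the pid file first. This is only a sketch under my own assumptions: searchd_running and indexer_args are hypothetical helper names, and the kill -0 check only works when the script runs as root or as the same user as searchd.

```shell
#!/bin/sh
# Succeed if the PID recorded in the given pid file belongs to a live process.
# Hypothetical helper; assumes we may signal that process (same user or root).
searchd_running() {
    pidfile=$1
    [ -r "$pidfile" ] || return 1
    pid=$(cat "$pidfile")
    [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}

# Print the indexer arguments to use: add --rotate only while searchd is running.
indexer_args() {
    if searchd_running "$1"; then
        echo "--all --rotate"
    else
        echo "--all"
    fi
}
```

The result can then be passed to the indexer, for example: /usr/local/coreseek/bin/indexer -c /usr/local/coreseek/etc/csft_mysql.conf $(indexer_args /usr/local/coreseek/var/log/searchd_mysql.pid)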
IV. Testing Coreseek from PHP
1. Put sphinxapi.php in the test directory.
cp /usr/local/src/coreseek-3.2.14/testpack/api/sphinxapi.php ./
vim test.php
<?php
header("Content-type: text/html; charset=UTF-8");
require("./sphinxapi.php");   // the API file copied above; not needed once the PHP sphinx extension is installed
$s = new SphinxClient;
$s->setServer("127.0.0.1", 9312);
// SPH_MATCH_ALL: match all query words (default mode); SPH_MATCH_ANY: match any of the query words;
// SPH_MATCH_EXTENDED2: support special query operators
$s->setMatchMode(SPH_MATCH_ALL);
$s->setMaxQueryTime(30);   // maximum search time, in milliseconds
$s->SetArrayResult(false);   // false: matches are keyed by document ID; true: matches are a plain array
$s->SetSelect("*");   // columns to return, like a SQL select list
$s->SetRankingMode(SPH_RANK_BM25);
$s->SetLimits(0, 30, 1000, 0);   // offset, limit, max matches, cutoff
$res = $s->query('coreseek', 'mysql', '--single-0-query--');   // 'coreseek' is the keyword, 'mysql' is the index name
$err = $s->GetLastError();
echo '<pre>';
var_dump($res);
var_dump($res['matches']);
var_export($err);
echo '</pre>';
php5 test.php
In the output, matches is the matched result set.
V. Using Coreseek in ThinkPHP
1. Install the Sphinx extension
The official Coreseek tutorial recommends that PHP simply include a PHP file. In fact, PHP has a standalone sphinx module that can talk to Coreseek directly (Coreseek is Sphinx!). It is already part of the official PHP function reference, and it is more efficient. However, the PHP module depends on the libsphinxclient package. I installed the Sphinx extension by following the steps in the article below.
Thanks to http://blog.csdn.net/e421083458/article/details/21529969
[Step 1] Install libsphinxclient
# cd /var/install/coreseek-4.1-beta/csft-4.1/api/libsphinxclient/
# ./configure --prefix=/usr/local/sphinxclient
configure: creating ./config.status
config.status: creating Makefile
config.status: error: cannot find input file: Makefile.in   # error: configure failed
// Handling the configure error
If configure reports "config.status: error: cannot find input file: src/Makefile.in", run the following commands and then compile again:
# aclocal
# libtoolize --force
# automake --add-missing
# autoconf
# autoheader
# make clean
// Configure and compile again
# ./configure
# make && make install
[Step 2] Install the PHP sphinx extension
http://pecl.php.net/package/sphinx
# wget http://pecl.php.net/get/sphinx-1.3.0.tgz
# tar zxvf sphinx-1.3.0.tgz
# cd sphinx-1.3.0
# phpize
# ./configure --with-php-config=/usr/bin/php-config --with-sphinx=/usr/local/sphinxclient
# make && make install
# cd /etc/php.d/
# cp gd.ini sphinx.ini
# vi sphinx.ini
extension=sphinx.so
# service php-fpm restart
After installing the PHP Sphinx extension, you can use $coreseek = new SphinxClient() directly, without including the source file.
Briefly, this is how I used Coreseek in ThinkPHP (TP) to search and to highlight keywords:
1. Use Sphinx to find the set of matching ids and uids.
2. Then run $sql = "select * from post where id in ($ids)"; $res = mysql_query($sql); to fetch the actual rows from the database.
3. Use BuildExcerpts to highlight the keywords in title and data, then display the results page by page.
Key code:
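Before relying on the extension, it is worth confirming that PHP actually loaded it. A small sketch, assuming the extension shows up under the name sphinx in the output of php -m; the helper name module_listed is hypothetical:

```shell
#!/bin/sh
# Succeed if the named module appears in a module list read from stdin
# (one name per line, as printed by "php -m"). Matching is case-insensitive
# and must cover the whole line. module_listed is a hypothetical helper name.
module_listed() {
    grep -i -x -F -q -- "$1"
}
```

It can be used as: php -m | module_listed sphinx && echo "sphinx extension loaded". If nothing is printed, check the extension=sphinx.so line and restart PHP again.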
$where = array();
$where['uid'] = $uid;
if (!empty($search)) {
    // If there is a search term, ask Coreseek for the matching ids
    $coreseek = new \SphinxClient();
    $coreseek->setServer("127.0.0.1", 9312);
    // SPH_MATCH_ALL: match all query words (default mode); SPH_MATCH_ANY: match any of the query words;
    // SPH_MATCH_EXTENDED2: support special query operators
    $coreseek->setMatchMode(SPH_MATCH_ALL);
    $coreseek->setMaxQueryTime(30);   // maximum search time, in milliseconds
    $coreseek->SetArrayResult(false);   // false: matches are keyed by document ID
    $coreseek->SetSelect("*");   // columns to return, like a SQL select list
    $coreseek->SetLimits(0, 30, 1000, 0);   // offset, limit, max matches, cutoff
    $res = $coreseek->query($search, 'mysql', '--single-0-query--');
    // Guard against a failed query or an empty result; array(0) matches no row,
    // assuming ids start at 1
    $key = !empty($res['matches']) ? array_keys($res['matches']) : array(0);
    $where['id'] = array('in', $key);
    $coreseek->close();
} else {
}
// Get the total number of records
$total = $mod->where($where)->count();
Key highlighting code:
if (!empty($search)) {
    $page->parameter['search'] = $search;
    // Keyword highlighting
    $opt = array(
        "before_match" => "<font style='font-weight:bold;color:#f00'>",
        "after_match" => "</font>",
    );
    $coreseek1 = new \SphinxClient();
    $coreseek1->setServer("127.0.0.1", 9312);
    $coreseek1->SetMatchMode(SPH_MATCH_ALL);
    $tags_title = array();
    foreach ($info as $key => $row) {
        $tags_title[] = $row['title'];
    }
    $replace_title = $coreseek1->BuildExcerpts($tags_title, 'mysql', $search, $opt);
    // BuildExcerpts returns false on failure; excerpts come back in input order
    if (is_array($replace_title)) {
        $i = 0;
        foreach ($info as $key => $row) {
            $info[$key]['title'] = $replace_title[$i++];
        }
    }
    $coreseek1->close();
}
OK. Coreseek now runs nicely in TP, and this article can end here. The above covers the installation step by step, and I hope it helps anyone who wants to set it up. The article covers a lot of ground, so if I have missed anything, corrections are welcome!