Sphinx provides full-text search. By default it only supports simple word splitting, so for better Chinese word segmentation you can use Coreseek, an engine based on libmmseg.
yum install gcc-c++
yum install gcc
yum install make
yum install mysql mysql-devel php-mysql qt4-mysql
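Before building, it can help to confirm that the tools installed above are actually on the PATH. The following is a minimal sketch, not part of the original procedure: the helper name missing_cmds is hypothetical, and the list of commands to check (including mysql_config, which mysql-devel normally provides) is an assumption about what the build needs.

```shell
# missing_cmds CMD...: print the names of the commands that are not on PATH,
# separated by spaces. Prints an empty line when everything is available.
# (Hypothetical helper for illustration.)
missing_cmds() {
    missing=""
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            missing="$missing $cmd"
        fi
    done
    # Strip the leading space left by the first append.
    echo "${missing# }"
}

# Example (assumed tool list):
# missing_cmds gcc g++ make mysql_config
```

If the function prints anything, install the corresponding package before running configure.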
wget http://www.coreseek.cn/uploads/sources/mmseg3_0b3.tar.gz
wget http://www.coreseek.cn/uploads/sources/csft3_0b4.tar.gz
tar -xzvf mmseg3_0b3.tar.gz
tar -xzvf csft3_0b4.tar.gz
cd mmseg.3.0b3/
./configure --prefix=/var/mmseg
make
make install
cd ..
cd csft3_0b4
./configure --prefix=/var/coreseek --with-mysql --with-mmseg-includes=/var/mmseg/include/mmseg --with-mmseg-libs=/var/mmseg/lib/
make
make install
cd /var/coreseek/
mkdir dict
cd /home/hfahe/mmseg.3.0b3/data
/var/mmseg/bin/mmseg -u unigram.txt
cp unigram.txt.uni /var/coreseek/dict/uni.lib
cd /var/coreseek/dict/
vi mmseg.ini
Enter the following:
[mmseg]
merge_number_and_ascii=1;
number_and_ascii_joint=-;
compress_space=0;
seperate_number_ascii=1;
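If you prefer not to edit the file by hand in vi, the same content can be written with a here-document. This is a sketch under the assumption that the dictionary directory is passed as an argument; the function name write_mmseg_ini is hypothetical, and the settings are copied from this post.

```shell
# write_mmseg_ini DIR: create DIR/mmseg.ini with the settings used in this post.
# (Hypothetical helper; DIR would be /var/coreseek/dict in the setup above.)
write_mmseg_ini() {
    dict_dir=$1
    mkdir -p "$dict_dir" || return 1
    # The quoted 'EOF' keeps the shell from expanding anything in the body.
    cat > "$dict_dir/mmseg.ini" <<'EOF'
[mmseg]
merge_number_and_ascii=1;
number_and_ascii_joint=-;
compress_space=0;
seperate_number_ascii=1;
EOF
}

# Example:
# write_mmseg_ini /var/coreseek/dict
```

Note that seperate_number_ascii is spelled that way in the configuration shown in this post, so the sketch keeps the spelling unchanged.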
cd /var/coreseek/etc/
cp sphinx.conf.dist sphinx.conf
mysql -h 192.168.1.xxx -u root -pxxx test < example.sql
vi sphinx.conf
In the configuration, change the database host IP address, user name, password, and database name to match your database.
/var/coreseek/bin/indexer --config /var/coreseek/etc/sphinx.conf
At this point a libmysqlclient error may occur. To fix it, find the library and symlink it into /lib:
locate libmysqlclient.so
ln -s /usr/local/lib/mysql/libmysqlclient.so.16 /lib/libmysqlclient.so.16
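The library path and version suffix differ between systems, so the two steps above can be generalized a little. The sketch below is an illustration only: the function name link_mysqlclient is hypothetical, and it assumes the library file is named libmysqlclient.so.<version> somewhere under the directory you pass in.

```shell
# link_mysqlclient SRC_DIR DEST_DIR: find libmysqlclient.so.* under SRC_DIR
# and symlink the first match (in sorted order) into DEST_DIR under the
# same file name. Returns 1 if no library is found.
# (Hypothetical helper for illustration.)
link_mysqlclient() {
    src_dir=$1
    dest_dir=$2
    lib=$(find "$src_dir" -name 'libmysqlclient.so.*' 2>/dev/null | sort | head -n 1)
    if [ -z "$lib" ]; then
        echo "libmysqlclient not found under $src_dir" >&2
        return 1
    fi
    ln -sf "$lib" "$dest_dir/$(basename "$lib")"
}

# Example matching the paths in this post (run as root):
# link_mysqlclient /usr/local/lib/mysql /lib
```

To see which library the binary is actually looking for, running ldd on /var/coreseek/bin/indexer and checking the libmysqlclient line is usually enough.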
/var/coreseek/bin/indexer --config /var/coreseek/etc/sphinx.conf --all
/var/coreseek/bin/search --config /var/coreseek/etc/sphinx.conf doc
displaying matches:
1. document=3, weight=1, group_id=2, date_added=Thu Apr 22 15:15:25 2010
    id=3
    group_id=2
    group_id2=7
    date_added=15:15:25
    title=another doc
    content=This is another group
2. document=4, weight=1, group_id=2, date_added=Thu Apr 22 15:15:25 2010
    id=4
    group_id=2
    group_id2=8
    date_added=15:15:25
    title=Doc number four
    content=This is to test groups

words:
1. 'doc': 2 documents, 2 hits
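When scripting a quick check, it can be convenient to pull just the matched document ids out of this output. The following is a minimal sketch that assumes the output format shown above (numbered lines of the form "N. document=ID, ..."); the function name matched_ids is hypothetical.

```shell
# matched_ids: read `search` output on stdin and print one matched
# document id per line. Tolerates optional spaces around "=".
# (Hypothetical helper; relies on the output layout shown in this post.)
matched_ids() {
    sed -n 's/^[0-9][0-9]*\. document *= *\([0-9][0-9]*\),.*/\1/p'
}

# Example:
# /var/coreseek/bin/search --config /var/coreseek/etc/sphinx.conf doc | matched_ids
```

The numbered line in the "words:" section does not contain "document=", so it is not picked up.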
To support Chinese, change the charset_type value in the configuration to zh_cn.utf-8 and add charset_dictpath = /var/coreseek/dict.
You also need to enable (uncomment) the sql_query_pre = SET NAMES utf8 setting.
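After editing, a quick check can confirm that all three settings are active and not commented out. This is a sketch, not an official tool: the function names has_setting and check_chinese_conf are hypothetical, and the check only looks at how each line starts, so it does not verify which source or index block a setting belongs to.

```shell
# has_setting FILE KEY VALUE_REGEX: succeed if FILE has an uncommented
# "KEY = value" line whose value starts with VALUE_REGEX (case-insensitive).
has_setting() {
    grep -Eiq "^[[:space:]]*$2[[:space:]]*=[[:space:]]*$3" "$1"
}

# check_chinese_conf FILE: print one message per missing setting and
# return 1 if any of the settings described in this post is missing.
# (Hypothetical helper for illustration.)
check_chinese_conf() {
    conf=$1
    rc=0
    has_setting "$conf" charset_type 'zh_cn\.utf-8' ||
        { echo "charset_type is not zh_cn.utf-8"; rc=1; }
    has_setting "$conf" charset_dictpath '/' ||
        { echo "charset_dictpath is not set"; rc=1; }
    has_setting "$conf" sql_query_pre 'set[[:space:]]+names[[:space:]]+utf8' ||
        { echo "sql_query_pre = SET NAMES utf8 is not enabled"; rc=1; }
    return $rc
}

# Example:
# check_chinese_conf /var/coreseek/etc/sphinx.conf
```

A line that still begins with # is treated as missing, which matches the "enable" step above.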
/var/coreseek/bin/indexer --config /var/coreseek/etc/sphinx.conf --all
/var/coreseek/bin/search --config /var/coreseek/etc/sphinx.conf 中文
Check whether Chinese text can be retrieved correctly (the query here is the Chinese word 中文, "Chinese").
Coreseek's default configuration file is csft.conf under etc. If you use that file, you do not need to pass the --config option.
If everything is set up correctly, the expected results are displayed.