When I compiled and installed sphinx from source, a lot of garbled Chinese characters were printed and the build finally failed with an error. I was busy with development and did not want to study the specific error, so I downloaded an rpm package directly from the official website, which installs without trouble.
Install two packages: mmseg, the program that generates the Chinese dictionary, and csft, the Chinese version of sphinx.
Installing with rpm -ivh goes smoothly; the installation completes in less than half a minute.
I downloaded the Chinese dictionary files directly from csft:
unigram.txt uni.lib
unigram.txt is the dictionary text; you can add your own keywords to it.
Then run mmseg -u unigram.txt to generate the dictionary file unigram.txt.uni, and rename it to uni.lib, which is the dictionary that sphinx recognizes.
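Adding a keyword to the dictionary text can be scripted. The following is only a minimal Python sketch. It assumes each record in unigram.txt is two lines, a "word<TAB>frequency" line followed by an "x:frequency" line; that is my understanding of the mmseg text format, so compare it with the entries already in your file before relying on it. The helper names and the default encoding are my own choices, not part of mmseg.

```python
import os


def format_entry(word: str, freq: int = 1) -> str:
    """Build one dictionary record (assumed format): 'word<TAB>freq' then 'x:freq'."""
    if not word or "\t" in word or "\n" in word:
        raise ValueError("keyword must be non-empty and contain no tabs or newlines")
    return f"{word}\t{freq}\nx:{freq}\n"


def add_keyword(dict_path: str, word: str, freq: int = 1,
                encoding: str = "utf-8") -> None:
    """Append a keyword record to the dictionary text file.

    The encoding is an assumption; use whatever your unigram.txt is stored in.
    """
    # If the file does not end with a newline, add one so records stay separate.
    needs_newline = False
    if os.path.exists(dict_path) and os.path.getsize(dict_path) > 0:
        with open(dict_path, "rb") as f:
            f.seek(-1, os.SEEK_END)
            needs_newline = f.read(1) != b"\n"
    with open(dict_path, "a", encoding=encoding, newline="\n") as f:
        if needs_newline:
            f.write("\n")
        f.write(format_entry(word, freq))
```

After calling add_keyword("unigram.txt", "your keyword"), the dictionary still has to be rebuilt with mmseg -u unigram.txt and renamed to uni.lib as described above.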
Where does it go? Put it in the dictionary path configured in the sphinx.conf file, which is covered below. That is nearly everything. Let's take a look at the practical sphinx programs:
[root@beihai365 /]# csft-
csft-indexer  csft-search  csft-searchd
csft-indexer is the program that generates the full-text search indexes.
csft-search is used to test whether searching works. It is a convenient way to check that full-text search succeeds before any client script has been written.
csft-searchd is the sphinx search daemon. After it is started, scripts such as php and python can send queries to it.
It's that simple. Let's take a look at the two key parts.
The sphinx.conf configuration file:
source tmsgs
{
    type            = mysql
    sql_host        = localhost
    sql_user        = root
    sql_pass        = 1
    sql_db          = phpwind75sp3
    sql_port        = 3306    # optional, default is 3306
    #sql_sock       = /tmp/mysql3307.sock
    sql_query_pre   = SET NAMES gbk
    sql_query       = SELECT id,name,type,stock FROM pw_tools
    #sql_attr_uint  = id
    sql_attr_uint   = stock
}

index tmsgsindex
{
    source            = tmsgs
    path              = /var/mmseg/searchdata/beihai365
    docinfo           = extern
    charset_type      = zh_cn.gbk
    #min_prefix_len   = 0
    #min_infix_len    = 2
    #ngram_len        = 2
    charset_dictpath  = /var/mmseg/data
    #min_prefix_len   = 0
    #min_infix_len    = 0
    #min_word_len     = 2
}

indexer
{
    mem_limit = 128M
}

searchd
{
    #listen           = 3312
    log               = /var/log/searchd.log
    query_log         = /var/log/query.log
    read_timeout      = 5
    max_children      = 30
    pid_file          = /var/log/searchd.pid
    max_matches       = 1000
    #seamless_rotate  = 1
    #preopen_indexes  = 0
    #unlink_old       = 1
}
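One easy mistake in a file like this is an index block whose source = ... names a source block that does not exist. The Python sketch below is a rough check for that, not a real sphinx config parser. It assumes the opening brace sits on the same line as the block header, treats everything after a # as a comment (which would break a SQL query containing #), and ignores block inheritance. All function names are hypothetical.

```python
import re


def parse_blocks(text):
    """Collect {(kind, name): {key: [values]}} for 'source NAME {' and 'index NAME {' blocks."""
    blocks = {}
    current = None  # settings dict of the block being read, or None outside a block
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if current is None:
            m = re.match(r"(source|index)\s+(\w+)", line)
            if m and line.endswith("{"):
                current = blocks.setdefault((m.group(1), m.group(2)), {})
            elif line.endswith("{"):
                current = {}  # unnamed block such as indexer or searchd; not recorded
        elif line == "}":
            current = None
        elif "=" in line:
            key, value = (part.strip() for part in line.split("=", 1))
            current.setdefault(key, []).append(value)
    return blocks


def undefined_sources(text):
    """Return (index_name, source_name) pairs where the source block is missing."""
    blocks = parse_blocks(text)
    sources = {name for kind, name in blocks if kind == "source"}
    missing = []
    for (kind, name), settings in blocks.items():
        if kind == "index":
            for src in settings.get("source", []):
                if src not in sources:
                    missing.append((name, src))
    return missing
```

For a config that follows the one-setting-per-line layout, undefined_sources(open(path).read()) returns an empty list when every index points at a defined source.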
Let's take a look at the client code:
<?php
// Assumes the sphinxapi.php client that ships with sphinx is on the include path.
require 'sphinxapi.php';

$cl = new SphinxClient();
$cl->SetServer('localhost', 3312);      // searchd host and port
$cl->SetMatchMode(SPH_MATCH_ALL);       // every query word must match
$cl->SetArrayResult(true);              // return matches as a plain array
$res = $cl->Query("name card", "*");    // search all indexes
print_r($res);
The keyword "name card" was added to the dictionary by hand to check whether it can be found. The output is as follows:
Array
(
    [error] =>
    [warning] =>
    [status] => 0
    [fields] => Array
        (
            [0] => name
            [1] => type
        )
    [attrs] => Array
        (
            [stock] => 1
        )
    [matches] => Array
        (
            [0] => Array
                (
                    [id] => 8
                    [weight] => 1
                    [attrs] => Array
                        (
                            [stock] => 100
                        )
                )
        )
    [total] => 1
    [total_found] => 1
    [time] => 0.018
    [words] => Array
        (
            [name card] => Array
                (
                    [docs] => 1
                    [hits] => 1
                )
        )
)
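Since the daemon can also be queried from python, here is a small Python sketch of how a result with this shape might be consumed. It works on a plain dict that mirrors the printed structure above; whether your client library returns exactly this shape is an assumption, and matched_ids is a hypothetical helper name.

```python
def matched_ids(result):
    """Return (list of matched document ids, total_found) from a query result dict."""
    if result is None:
        raise RuntimeError("query failed: no result returned")
    if result.get("error"):
        raise RuntimeError(f"query failed: {result['error']}")
    matches = result.get("matches") or []
    ids = [match["id"] for match in matches]
    return ids, int(result.get("total_found", 0))


# A dict copied from the output shown above.
sample = {
    "error": "",
    "warning": "",
    "status": 0,
    "fields": ["name", "type"],
    "attrs": {"stock": 1},
    "matches": [{"id": 8, "weight": 1, "attrs": {"stock": 100}}],
    "total": 1,
    "total_found": 1,
    "time": "0.018",
    "words": {"name card": {"docs": 1, "hits": 1}},
}

ids, total_found = matched_ids(sample)
print(ids, total_found)
```

The ids can then be used to fetch the full rows from pw_tools, because the index itself only returns the document id and the stock attribute.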
There is no problem at all; the search works. Several key operations are as follows:
[root@beihai365 /]# csft-searchd --stop
Stops the search daemon.
[root@beihai365 /]# csft-indexer --all
Generates indexes for all nodes. You can also generate the index for a single node, for example csft-indexer xx.
[root@beihai365 /]# csft-search App
Searches for App. In this case nothing is found and no documents are hit:
coreseek Full Text Server 3.1
Copyright (c) 2006-2008 coreseek.com
using config file './csft.conf'...
1, pt: 1, 1;
index 'tmsgsindex': query 'app': returned 0 matches of 0 total in 0.017 sec
words:
1. 'app': 0 documents, 0 hits
When you run these commands, you have to specify the path of the sphinx.conf configuration file manually with --config, which is inconvenient, so I simply created a symlink to it in ./ with ln -s. That way there is no need to type --config every time.
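If you would rather keep the config file where it is, a thin wrapper can add --config for you. The Python sketch below only builds and runs the command line. The function names are hypothetical, the config path in the usage note is a placeholder, and it assumes the three csft programs are on your PATH.

```python
import subprocess

CSFT_TOOLS = {"csft-indexer", "csft-search", "csft-searchd"}


def csft_command(tool, *args, config=None):
    """Build the argument list for a csft program, adding --config when a path is given."""
    if tool not in CSFT_TOOLS:
        raise ValueError(f"unknown tool: {tool}")
    argv = [tool]
    if config:
        argv += ["--config", config]
    argv += list(args)
    return argv


def run_csft(tool, *args, config=None):
    """Run the command and raise CalledProcessError if it exits with a non-zero status."""
    return subprocess.run(csft_command(tool, *args, config=config), check=True)
```

For example, run_csft("csft-indexer", "--all", config="/path/to/sphinx.conf") rebuilds every index, and run_csft("csft-searchd", "--stop", config="/path/to/sphinx.conf") stops the daemon.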