Setting up Sphinx full-text search, with examples
When I compiled and installed Sphinx from source, the output was full of garbled Chinese characters and the build finally failed with an error. I went to the official website and downloaded an rpm package instead, and the installation went great... I don't want to study the specific error. Busy with development ~~
Install two packages. One is mmseg, the program that generates the Chinese dictionary. The other is csft, which is the Chinese version of Sphinx.
Install both with rpm -ivh. Very smooth ~~ It takes less than half a minute to complete the installation...
I'm lazy, so I downloaded the Chinese dictionary from the official csft website. Very thoughtful of them...
unigram.txt uni.lib
unigram.txt is the dictionary text; you can add your own keywords to it.
Then run
mmseg -u unigram.txt
This generates a dictionary file, unigram.txt.uni. Rename it to uni.lib, which is the dictionary that Sphinx recognizes.
Where does it go? Put it in the dictionary path (charset_dictpath) configured in sphinx.conf.
After that, you're almost done.
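The dictionary steps above could be scripted roughly like this. This is only a sketch in Python: it assumes mmseg is on the PATH and writes unigram.txt.uni next to the input file, as described above. The function names build_dictionary and install_dictionary and their arguments are made up for illustration.

```python
import shutil
import subprocess
from pathlib import Path


def install_dictionary(generated: Path, dict_dir: Path) -> Path:
    """Copy the generated dictionary into dict_dir under the name uni.lib."""
    dict_dir.mkdir(parents=True, exist_ok=True)
    target = dict_dir / "uni.lib"
    shutil.copyfile(generated, target)
    return target


def build_dictionary(unigram: Path, dict_dir: Path) -> Path:
    """Run `mmseg -u` on the dictionary text and install the result as uni.lib."""
    # Assumption: mmseg writes <input>.uni into the same directory as the input.
    subprocess.run(["mmseg", "-u", unigram.name], cwd=unigram.parent, check=True)
    generated = unigram.with_name(unigram.name + ".uni")
    return install_dictionary(generated, dict_dir)
```

With the paths used in this post, a call would look like build_dictionary(Path("unigram.txt"), Path("/var/mmseg/data")). The script copies instead of renaming, so the generated file stays around in case something goes wrong.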
Let's take a look at several utility programs of sphinx.
[root@beihai365 /]# csft-
csft-indexer csft-search csft-searchd
csft-indexer is the program that generates the full-text search indexes.
csft-search is used to test whether search works. It is handy for checking that full-text search succeeds before you have developed any client script.
csft-searchd is the Sphinx search daemon. Once it is running, you can send queries from scripts such as PHP and Python.
That simple ~~
Let's take a look at the two key parts.
sphinx.conf configuration file
source tmsgs
{
    type = mysql
    sql_host = localhost
    sql_user = root
    sql_pass = 1
    sql_db = phpwind75sp3
    sql_port = 3306 # optional, default is 3306
    # sql_sock = /tmp/mysql3307.sock
    sql_query_pre = SET NAMES gbk
    sql_query = SELECT id, name, type, stock FROM pw_tools
    # sql_attr_uint = id
    sql_attr_uint = stock
}
index tmsgsindex
{
    source = tmsgs
    path = /var/mmseg/searchdata/beihai365
    docinfo = extern
    charset_type = zh_cn.gbk
    # min_prefix_len = 0
    # min_infix_len = 2
    # ngram_len = 2
    charset_dictpath = /var/mmseg/data
    # min_prefix_len = 0
    # min_infix_len = 0
    # min_word_len = 2
}
indexer
{
    mem_limit = 128M
}
searchd
{
    # listen = 3312
    log = /var/log/searchd.log
    query_log = /var/log/query.log
    read_timeout = 5
    max_children = 30
    pid_file = /var/log/searchd.pid
    max_matches = 1000
    # seamless_rotate = 1
    # preopen_indexes = 0
    # unlink_old = 1
}
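Since the index only works if uni.lib really sits in the directory named by charset_dictpath, a quick check of the configuration can save some head-scratching. The following is a rough sketch in Python, not a real parser for the config format: read_settings and dictionary_installed are hypothetical helper names, comments are stripped naively at the first #, and a key that appears more than once in a block keeps only its last value.

```python
import re
from pathlib import Path


def read_settings(conf_text: str, block: str) -> dict:
    """Return the key = value pairs of one block, e.g. 'index tmsgsindex'."""
    pattern = re.compile(r"^\s*" + re.escape(block) + r"\s*\{(.*?)\}", re.S | re.M)
    match = pattern.search(conf_text)
    if match is None:
        raise KeyError(block)
    settings = {}
    for line in match.group(1).splitlines():
        line = line.split("#", 1)[0].strip()  # naive: drop everything after '#'
        if "=" in line:
            key, value = line.split("=", 1)
            settings[key.strip()] = value.strip()
    return settings


def dictionary_installed(conf_text: str, index_block: str) -> bool:
    """True if uni.lib exists in the index block's charset_dictpath."""
    dict_dir = read_settings(conf_text, index_block).get("charset_dictpath")
    return dict_dir is not None and (Path(dict_dir) / "uni.lib").is_file()
```

For the configuration above, dictionary_installed(text, "index tmsgsindex") would look for /var/mmseg/data/uni.lib.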
Let's take a look at the test client code.
<?php
header("Content-type: text/html; charset=utf-8");
include 'sphinxapi.php';
$cl = new SphinxClient();
$cl->SetServer('localhost', 3312);
$cl->SetMatchMode(SPH_MATCH_ALL);
$cl->SetArrayResult(true);
$res = $cl->Query("name card", "*");
print_r($res);
?>
The keyword "name card" was added to the dictionary by hand. Let's check whether the search can find it.
Array
(
    [error] =>
    [warning] =>
    [status] => 0
    [fields] => Array
        (
            [0] => name
            [1] => type
        )
    [attrs] => Array
        (
            [stock] => 1
        )
    [matches] => Array
        (
            [0] => Array
                (
                    [id] => 8
                    [weight] => 1
                    [attrs] => Array
                        (
                            [stock] => 100
                        )
                )
        )
    [total] => 1
    [total_found] => 1
    [time] => 0.018
    [words] => Array
        (
            [card] => Array
                (
                    [docs] => 1
                    [hits] => 1
                )
        )
)
No problem at all. The search finds it.
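Reading a result like this by eye gets tedious, so here is a small sketch of how a script might pull out the useful parts. It is written in Python against a plain dict that mirrors the PHP array printed above; whether a Python client returns exactly this shape is an assumption, and summarize is a made-up helper name.

```python
def summarize(result):
    """Return (ok, ids, unmatched_words) for a result shaped like the array above."""
    if result is None or result.get("error"):
        return False, [], []
    ids = [match["id"] for match in result.get("matches", [])]
    # Words that hit no documents explain an empty result.
    unmatched = [word for word, stats in result.get("words", {}).items()
                 if stats["docs"] == 0]
    return True, ids, unmatched


# The same values as in the output above, typed in by hand.
sample = {
    "error": "", "warning": "", "status": 0,
    "fields": ["name", "type"],
    "attrs": {"stock": 1},
    "matches": [{"id": 8, "weight": 1, "attrs": {"stock": 100}}],
    "total": 1, "total_found": 1, "time": "0.018",
    "words": {"card": {"docs": 1, "hits": 1}},
}

print(summarize(sample))  # (True, [8], [])
```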
Several key operations
[root@beihai365 /]# csft-searchd --stop
Stops the search daemon.
[root@beihai365 /]# csft-indexer --all
Generates indexes for all nodes. You can also generate the index for a single node, for example csft-indexer xx.
[root@beihai365 /]# csft-search App
Searches for the keyword App. As the output below shows, nothing was found; the query did not hit any documents.
Coreseek Full Text Server 3.1
Copyright (c) 2006-2008 coreseek.com
using config file './csft.conf'...
1,
Pt: 1, 1; index 'tmsgsindex': query 'app': returned 0 matches of 0 total in 0.017 sec
words:
1. 'app': 0 documents, 0 hits
When you run these commands, you have to specify the path to the configuration file by hand with --config. Very inconvenient..
So I simply symlinked the configuration file into ./ with ln -s. This way, there is no need to type --config every time.
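If you'd rather not rely on a symlink, another option is a tiny wrapper that fills in --config for you. A minimal sketch in Python: csft_command is a hypothetical helper, and the default path ./csft.conf is simply the one that shows up in the csft-search output above, so adjust it to wherever your configuration file lives.

```python
DEFAULT_CONFIG = "./csft.conf"  # assumption: taken from the sample output above


def csft_command(tool, *args, config=DEFAULT_CONFIG):
    """Build the argument list for a csft tool with --config filled in."""
    return [tool, "--config", config, *args]


print(" ".join(csft_command("csft-indexer", "--all")))
```

The returned list can be passed straight to subprocess.run, for example subprocess.run(csft_command("csft-searchd", "--stop"), check=True).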