What is Sphinx?
Sphinx is a full-text search engine released under the GPLv2. Commercial use (for example, embedding it in other programs) requires a commercial license; contact the authors (Sphinxsearch.com) to obtain one.
Generally, Sphinx is a standalone search engine designed to give other applications fast, space-efficient, and highly relevant full-text search. Sphinx integrates easily with SQL databases and scripting languages.
It currently has built-in support for MySQL and PostgreSQL data sources and can also read a specific format from standard input. By modifying the source code, you can add new data sources (for example, native support for other DBMS types).
The search API supports PHP, Python, Perl, Ruby, and Java, and Sphinx can also be used as a MySQL storage engine. The search API is very simple and can be ported to a new language within a few hours.
Sphinx is an acronym for SQL Phrase Index; unfortunately, it shares its name with CMU's Sphinx project.
Sphinx features
High-speed indexing (peak performance can reach 10 MB/s on modern CPUs);
High-performance search (on 2-4 GB of text data, the average query response time is under 0.1 seconds);
Handles large data sets (it is known to process more than 100 GB of text, or 100 M documents, on a single-CPU system);
Provides excellent relevance ranking through a compound method based on phrase proximity and statistics (BM25);
Supports distributed search;
Generates document excerpts;
Can be used as a MySQL storage engine to provide search services;
Supports multiple search modes such as Boolean, phrase, and word proximity;
Each document supports multiple full-text fields (up to 32);
Each document supports multiple additional attributes (such as group information and timestamps);
Supports stop words;
Supports single-byte encodings and UTF-8;
Native MySQL support (both MyISAM and InnoDB);
Native PostgreSQL support.
I. Required files
mmseg-0.7.3.tar.gz    Chinese word segmentation library
mysql-5.1.26-rc.tar.gz    MySQL 5.1.26 source code
sphinx-0.9.9.tar.gz    Sphinx 0.9.9-release source code
fix-crash-in-excerpts.patch    Sphinx word segmentation patch
sphinx-0.98rc2.zhcn-support.patch    Sphinx word segmentation patch
II. Installation
1. Install libmmseg
tar -zxvf mmseg-0.7.3.tar.gz
cd mmseg-0.7.3
./configure --prefix=/usr/local/mmseg
make
make install
cd ..
The mmseg installation is complete. Test it:
mmseg
Coreseek COS(tm) MM Segment 1.0
Copyright By Coreseek.com All Right Reserved.
Usage: mmseg
-u Unigram Dictionary
-r Combine with -u, used a plain text build Unigram Dictionary, default Off
-b Synonyms Dictionary
-h print this help and exit
If the test fails, run the following commands (note >> rather than >, so that the existing contents of /etc/ld.so.conf are not overwritten):
echo '/usr/local/mmseg/lib' >> /etc/ld.so.conf
ldconfig -v
ln -s /usr/local/mmseg/bin/mmseg
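Running the library-path step more than once would leave duplicate entries in /etc/ld.so.conf. A minimal sketch of an idempotent variant follows. The helper name add_ld_path is hypothetical (it is not part of mmseg or Sphinx), and the file is passed as a parameter so the function can be tried on a scratch file first.

```shell
#!/bin/sh
# add_ld_path DIR CONF: append DIR to CONF unless an identical line exists.
# add_ld_path is a hypothetical helper name used only for this sketch.
add_ld_path() {
    dir=$1
    conf=$2
    # -x matches whole lines, -F treats DIR as a fixed string, -q stays silent
    if ! grep -qxF "$dir" "$conf" 2>/dev/null; then
        printf '%s\n' "$dir" >> "$conf"
    fi
}

# Real use (as root), followed by a linker cache refresh:
#   add_ld_path /usr/local/mmseg/lib /etc/ld.so.conf && ldconfig
```

Calling the function twice with the same directory leaves a single line in the file.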
2. Recompile MySQL
Two patches must be applied to the Sphinx source before building:
tar -zxvf sphinx-0.9.8-rc2.tar.gz
cd sphinx-0.9.8-rc2
patch -p1 < ../sphinx-0.98rc2.zhcn-support.patch
patch -p1 < ../fix-crash-in-excerpts.patch
I had already installed MySQL 5.1.26, so those installation steps are skipped here.
MySQL source (build) path:
/root/lemp/mysql-5.1.26-rc/
MySQL installation path:
/opt/mysql
Stop MySQL before installing:
/opt/mysql/bin/mysql.server stop
Next, copy the mysqlse folder from the Sphinx source tree to mysql-5.1.26-rc/storage/sphinx
(so that the SphinxSE storage engine is built when MySQL is compiled):
cp -rf mysqlse /root/lemp/mysql-5.1.26-rc/storage/sphinx
cd /root/lemp/mysql-5.1.26-rc
make clean
sh BUILD/autorun.sh
# This step is required; do not omit it.
Start the recompilation:
CFLAGS="-O3" CXX=gcc CXXFLAGS="-O3 -felide-constructors -fno-exceptions -fno-rtti" ./configure \
--prefix=/opt/mysql --localstatedir=/opt/mysql/var --sysconfdir=/opt/mysql \
--without-debug --with-unix-socket-path=/opt/mysql.sock \
--with-big-tables --with-charset=gbk --with-collation=gbk_chinese_ci \
--with-client-ldflags=-all-static --with-mysqld-ldflags=-all-static \
--enable-assembler --with-extra-charsets=gbk,gb2312,utf8 \
--with-pthread --enable-thread-safe-client --with-innodb --with-plugins=sphinx
make
make install
If configure reports:
configure: error: unknown plugin: sphinx
Solution:
sudo yum install autoconf automake libtool
sh BUILD/autorun.sh
./configure -h
Check that the help output includes at least the following for sphinx:
=== Sphinx Storage Engine ===
Plugin Name: sphinx
Description: Sphinx Storage Engines
Supports build: static and dynamic
Configurations: max, max-no-ndb
Then compile.
If make fails with:
../libtool: line 466: CDPATH: command not found
../libtool: line 1144: func_opt_split: command not found
libtool: Version mismatch error. This is libtool 2.2.6, but the
libtool: definition of this LT_INIT comes from an older release.
libtool: You should recreate aclocal.m4 with macros from libtool 2.2.6
libtool: and run autoconf again.
make[1]: *** [conf_to_src] Error 63
make[1]: Leaving directory `/home/andychu/lemp2/mysql-5.1.26-rc/strings'
make: *** [all-recursive] Error 1
This error occurs when the libtool versions differ. Copy the installed libtool over the one in the build directory:
cp /usr/local/bin/libtool .
Then recompile.
On Fedora, libtinfo.so.5 is missing:
cd client
vim Makefile
Find LIBS and add /lib/libtinfo.so.5
After compiling, start MySQL and check whether the SphinxSE storage engine was built in:
/opt/mysql/bin/mysql.server start
/opt/mysql/bin/mysql -uroot -p
mysql> show engines;
+------------+---------+-----------------------------------------------------------+--------------+----+------------+
| Engine     | Support | Comment                                                   | Transactions | XA | Savepoints |
+------------+---------+-----------------------------------------------------------+--------------+----+------------+
| CSV        | YES     | CSV storage engine                                        | NO           | NO | NO         |
| SPHINX     | YES     | Sphinx storage engine 0.9.9                               | NO           | NO | NO         |
| MEMORY     | YES     | Hash based, stored in memory, useful for temporary tables | NO           | NO | NO         |
| MRG_MYISAM | YES     | Collection of identical MyISAM tables                     | NO           | NO | NO         |
| MyISAM     | DEFAULT | Default engine as of MySQL 3.23 with great performance    | NO           | NO | NO         |
+------------+---------+-----------------------------------------------------------+--------------+----+------------+
5 rows in set (0.00 sec)
The SPHINX engine now appears in the list.
3. Install Sphinx
./configure --prefix=/usr/local/sphinx --with-mysql=/usr/local/mysql/ \
--with-mysql-includes=/usr/local/mysql/include/mysql/ --with-mysql-libs=/usr/local/mysql/lib/mysql/ \
--with-mmseg-includes=/usr/local/mmseg/include/mmseg/ --with-mmseg-libs=/usr/local/mmseg/lib --with-mmseg
The build fails because the header files cannot be found:
tokenizer_zhcn.cpp:1:30: SegmenterManager.h: No such file or directory
tokenizer_zhcn.cpp:2:23: Segmenter.h: No such file or directory
Clean up and configure again:
make clean
./configure --prefix=/usr/local/sphinx --with-mysql=/opt/mysql \
--with-mysql-includes=/opt/mysql/include/mysql --with-mysql-libs=/opt/mysql/lib/mysql \
--with-mmseg-includes=/usr/local/mmseg/include/mmseg --with-mmseg-libs=/usr/local/mmseg/lib --with-mmseg
Linking then fails:
/root/sphinx/sphinx-0.9.8-rc2/src/tokenizer_zhcn.cpp:34: undefined reference to `libiconv_close'
collect2: ld returned 1 exit status
Solution from the official website:
"In the meantime I've changed the configuration file and set
#define USE_LIBICONV 0 in line 8179."
That is, edit the configure file and change the value in the last #define USE_LIBICONV line from 1 to 0.
Then recompile:
make clean
./configure --prefix=/usr/local/sphinx --with-mysql=/opt/mysql \
--with-mysql-includes=/opt/mysql/include/mysql --with-mysql-libs=/opt/mysql/lib/mysql \
--with-mmseg-includes=/usr/local/mmseg/include/mmseg --with-mmseg-libs=/usr/local/mmseg/lib --with-mmseg
vi configure
Type /define USE_LIBICONV to find the target line.
Press i and change 1 to 0, then press Esc and type :wq to save and exit.
Then run make and make install.
Copy the sample Sphinx configuration:
cd /usr/local/sphinx/etc
cp sphinx.conf.dist sphinx.conf
4. Configure Sphinx
Edit /usr/local/sphinx/etc/sphinx.conf:
type = mysql
# some straightforward parameters for SQL source types
sql_host = localhost
sql_user = root
sql_pass =
sql_db = test
sql_port = 3306 # optional, default is 3306
address = 127.0.0.1 # for security, listen only on the local machine
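To show where these settings sit in the file, here is a minimal sketch that writes a small sphinx.conf to a path of your choice. The helper name write_min_conf, the source name src1, the sql_query text, the attribute declarations, and the data and log paths are assumptions for illustration; the sphinx.conf.dist shipped with your version is the authoritative template.

```shell
#!/bin/sh
# write_min_conf FILE: write a minimal sphinx.conf sketch to FILE.
# write_min_conf and src1 are hypothetical names used only for this sketch.
write_min_conf() {
    cat > "$1" <<'EOF'
source src1
{
    type      = mysql
    sql_host  = localhost
    sql_user  = root
    sql_pass  =
    sql_db    = test
    sql_port  = 3306
    # assumed query against the sample documents table
    sql_query = SELECT id, group_id, UNIX_TIMESTAMP(date_added) AS date_added, title, content FROM documents
    sql_attr_uint      = group_id
    sql_attr_timestamp = date_added
}

index test1
{
    source = src1
    path   = /usr/local/sphinx/var/data/test1
}

searchd
{
    address  = 127.0.0.1
    port     = 3312
    log      = /usr/local/sphinx/var/log/searchd.log
    pid_file = /usr/local/sphinx/var/log/searchd.pid
}
EOF
}

# Example: write_min_conf /tmp/sphinx.conf.sketch
```

Writing to a scratch path first makes it easy to compare the result with the existing configuration before replacing anything.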
5. Create the index
After installing Sphinx, the Sphinx directory contains three subdirectories: bin, etc, and var.
bin holds the Sphinx executables: indexer, which builds indexes; searchd, the query server; and search, the query tool.
For the tests below, first import the example.sql script that ships alongside sphinx.conf into MySQL.
(This creates the test database, the documents table, and sample data.)
/opt/mysql/bin/mysql -uroot -p < /usr/local/sphinx/etc/example.sql
To build the index:
/usr/local/sphinx/bin/indexer --config /usr/local/sphinx/etc/sphinx.conf test1
test1 is the index name. To build all indexes, pass --all instead of an index name.
Note:
During indexing, indexer may fail to find the shared library libmysqlclient.so.16 because of a different database version.
Copy /opt/mysql/lib/mysql/libmysqlclient.so.16.0.0 to /usr/lib, or create a symbolic link to it there.
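The symbolic-link option can be sketched as a small function. The helper name link_client_lib is hypothetical, and the directory is a parameter so the function can be tried in a scratch directory before touching /usr/lib.

```shell
#!/bin/sh
# link_client_lib SRC DESTDIR SONAME: make DESTDIR/SONAME a symlink to SRC.
# link_client_lib is a hypothetical helper name used only for this sketch.
link_client_lib() {
    src=$1
    destdir=$2
    soname=$3
    # Refuse to create a dangling link if the library is not where we expect.
    [ -e "$src" ] || { echo "missing: $src" >&2; return 1; }
    # -s creates a symbolic link, -f replaces an existing link of the same name
    ln -sf "$src" "$destdir/$soname"
}

# Real use (as root), followed by a linker cache refresh:
#   link_client_lib /opt/mysql/lib/mysql/libmysqlclient.so.16.0.0 /usr/lib libmysqlclient.so.16 && ldconfig
```

A link keeps indexer pointed at the library of the running MySQL installation, whereas a copy would have to be repeated after each MySQL rebuild.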
6. Run the query server
/usr/local/sphinx/bin/searchd --config /usr/local/sphinx/etc/sphinx.conf # start
/usr/local/sphinx/bin/searchd --config /usr/local/sphinx/etc/sphinx.conf --stop # stop
There are roughly three ways to query Sphinx:
1. Query from within the database engine (SphinxSE)
2. Query with the search tool
/usr/local/sphinx/bin/search --config /usr/local/sphinx/etc/sphinx.conf test
3. Query through the PHP interface; see sphinxapi.php for details.
III. Use SphinxSE to call Sphinx in MySQL
1. Use SphinxSE to call Sphinx in MySQL
First, create a table dedicated to the index:
CREATE TABLE `sphinx` (
`id` int(11) NOT NULL,
`weight` int(11) NOT NULL,
`query` varchar(255) NOT NULL,
`catalogid` INT NOT NULL,
`edituserid` INT NOT NULL,
`hits` INT NULL,
`addtime` INT NOT NULL,
KEY `query` (`query`)
) ENGINE=SPHINX DEFAULT CHARSET=utf8 CONNECTION='sphinx://localhost:3312/test1';
test1 is the index name, as defined in sphinx.conf.
Once the table exists, you can query it from MySQL. For example:
SELECT doc.* FROM documents doc JOIN sphinx ON (doc.id = sphinx.id) WHERE query = 'doc;mode=any';
The result set contains the rows that match the string doc.
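The value compared against the query column is the search keywords followed by semicolon-separated options. A minimal sketch that assembles such a string is shown here; the helper name sphinxse_query is hypothetical, only the mode and limit options are covered, and no SQL quoting or escaping is attempted.

```shell
#!/bin/sh
# sphinxse_query KEYWORDS MODE LIMIT: build the text for the query column.
# sphinxse_query is a hypothetical helper name used only for this sketch.
sphinxse_query() {
    printf '%s;mode=%s;limit=%s\n' "$1" "$2" "$3"
}

sphinxse_query 'doc' any 20   # prints: doc;mode=any;limit=20
```

The printed string would go inside the quotes of WHERE query = '...' in the SELECT statement.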
For more about the query syntax and Sphinx configuration, see:
http://www.sphinxsearch.com/doc.html
2. Chinese word segmentation
Generate the dictionary
Go to the mmseg source directory, then:
cd data
mmseg -u unigram.txt
This creates the file unigram.txt.uni under data.
This is the generated dictionary; rename it to uni.lib and put it in a readable directory:
cp unigram.txt.uni /usr/local/sphinx/uni.lib
Edit the configuration file sphinx.conf (/usr/local/sphinx/etc/sphinx.conf)
and add the following to the index section:
charset_type = zh_cn.utf-8
charset_dictpath = /usr/local/sphinx/
Add a row of Chinese data to the database:
INSERT INTO `test`.`documents` (
`id`,
`group_id`,
`group_id2`,
`date_added`,
`title`,
`content`
) VALUES (NULL, '3', '9', NOW(), 'Sphinx Chinese search', 'Sphinx is an SQL-based full-text search engine that can be combined with MySQL and PostgreSQL to perform full-text search. It provides more professional search functions than the database itself, making it easier for applications to implement professional full-text search. Sphinx specially designs search API interfaces for scripting languages such as PHP, Python, Perl, and Ruby. It also designs a storage engine plug-in for MySQL.');
(If searchd is already running, kill it before continuing.)
Note: after adding data, you must rebuild the index so that the new data is included.
Rebuild the index; once it succeeds, start searchd again:
/usr/local/sphinx/bin/indexer --config /usr/local/sphinx/etc/sphinx.conf --all
/usr/local/sphinx/bin/searchd --config /usr/local/sphinx/etc/sphinx.conf
Indexing speed on an IDE hard disk:
indexing index 'test1'...
collected 423228 docs, 637.2 MB
sorted 125.5 Mhits, 100.0% done
total 423228 docs, 637201412 bytes
total 753.401 sec, 845766.13 bytes/sec, 561.76 docs/sec
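The totals in this log can be turned into more readable rates with a little arithmetic. The sketch below assumes 1 MB = 1048576 bytes; the helper name throughput is hypothetical.

```shell
#!/bin/sh
# throughput BYTES DOCS SECONDS: print indexing rates in MB/s and docs/s.
# throughput is a hypothetical helper name used only for this sketch.
throughput() {
    awk -v b="$1" -v d="$2" -v s="$3" \
        'BEGIN { printf "%.2f MB/s, %.2f docs/s\n", b / s / 1048576, d / s }'
}

throughput 637201412 423228 753.401   # prints: 0.81 MB/s, 561.76 docs/s
```

On this IDE disk the run therefore averaged roughly 0.8 MB/s, well below the 10 MB/s peak quoted in the feature list.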
You can now run the query, for example in phpMyAdmin:
SELECT doc.* FROM documents doc JOIN sphinx ON (doc.id = sphinx.id)
WHERE query = 'design;mode=any'
At first, nothing came back.
Edit sphinx.conf
and uncomment the following line:
sql_query_pre = SET NAMES utf8
Because sql_query_pre is used by indexer, rebuild the index and restart searchd; the search then returns results.
References:
http://www.coreseek.com/uploads/pdf/sphinx_doc_zhcn_0.9.pdf
http://www.sphinxsearch.com/wiki/doku.php?id=sphinx_chinese_tutorial
http://www.cnblogs.com/hushixiu/articles/1295605.html
http://blog.xoyo.com/dcyhldcyhl/article/839863.shtml
http://blog.sina.com.cn/s/blog_5aefd9770100axf1.html
http://blog.s135.com/post/360/
Updated on
Encoding solution (not tested)
-------------------------------------------------------------------------------
Convert the existing table data (dump.sql is the GB18030 input, dump_utf8.sql the UTF-8 output):
iconv -f GB18030 -t UTF-8 -o dump_utf8.sql dump.sql
Alternatively, use the existing GBK data directly without conversion, but set the connection character sets:
mysql_query("SET character_set_client = 'gbk'", $conn);
mysql_query("SET character_set_connection = 'gbk'", $conn); // sets character_set_connection and collation_connection
// mysql_query("SET collation_connection = 'gbk'", $conn);
mysql_query("SET character_set_results = 'utf8'", $conn);
With these three settings, query results come back UTF-8 encoded, which is what Sphinx needs.
mysql_query("SET SESSION query_cache_type = OFF", $conn);
// The queries indexer runs when building an index do not need to be cached.
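The file conversion can be sketched as a small function that writes to a new file and then checks that the result is valid UTF-8. The helper name to_utf8 and the file names in the example are assumptions, and the sketch relies on the local iconv supporting GB18030.

```shell
#!/bin/sh
# to_utf8 IN OUT: convert a GB18030/GBK file to UTF-8, then verify the result.
# to_utf8 is a hypothetical helper name used only for this sketch.
to_utf8() {
    iconv -f GB18030 -t UTF-8 "$1" > "$2" || return 1
    # A second pass from UTF-8 to UTF-8 fails on any invalid byte sequence.
    iconv -f UTF-8 -t UTF-8 "$2" > /dev/null
}

# Example: to_utf8 dump.sql dump_utf8.sql
```

Keeping the input untouched and writing to a separate output file means a failed conversion never destroys the original dump.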
-------------------------------------------------------------------------------
Updated on
The coreseek website seems to have a problem, and the files cannot be downloaded from it.
I have packaged the installation files (3.6 MB in total); the package includes the following:
build_delta_index.sh
build_main_index.sh
fix-crash-in-excerpts.patch
mmseg-0.7.3.tar.gz
sphinx-0.9.8-rc2.tar.gz
sphinx-0.98rc2.zhcn-support.patch
sphinx.conf
sphinxapi.php
test.php
test2.php
-------------------------------------------------------------------------------
Updated on
Control Sphinx with service
First add a sphinx user that belongs to the website group (which already exists), and change the owner of the sphinx directory:
useradd -d /usr/local/sphinx -g website -s /sbin/nologin sphinx
chown -R sphinx:website /usr/local/sphinx
Create the /etc/init.d/sphinx script:
#!/bin/sh
# sphinx: Startup script for Sphinx search
#
# chkconfig: 345 86 14
# description: This is a daemon for high performance full text \
#              search of MySQL and PostgreSQL databases. \
#              See http://www.sphinxsearch.com/ for more info.
#
# processname: searchd
# pidfile: $sphinxlocation/var/log/searchd.pid

# Source function library.
. /etc/rc.d/init.d/functions

processname=searchd
servicename=sphinx
username=sphinx
sphinxlocation=/usr/local/sphinx
pidfile=$sphinxlocation/var/log/searchd.pid
searchd=$sphinxlocation/bin/searchd
RETVAL=0
PATH=$PATH:$sphinxlocation/bin

start() {
    echo -n $"Starting Sphinx daemon: "
    daemon --user=$username --check $servicename $processname
    RETVAL=$?
    echo
    # Create the lock file only if searchd started successfully.
    [ $RETVAL -eq 0 ] && touch /var/lock/subsys/$servicename
}

stop() {
    echo -n $"Stopping Sphinx daemon: "
    $searchd --stop
    # killproc -p $pidfile $servicename -TERM
    RETVAL=$?
    echo
    if [ $RETVAL -eq 0 ]; then
        rm -f /var/lock/subsys/$servicename
        rm -f $pidfile
    fi
}

# See how we were called.
case "$1" in
    start)
        start
        ;;
    stop)
        stop
        ;;
    status)
        status $processname
        RETVAL=$?
        ;;
    restart)
        stop
        sleep 3
        start
        ;;
    condrestart)
        if [ -f /var/lock/subsys/$servicename ]; then
            stop
            sleep 3
            start
        fi
        ;;
    *)
        echo $"Usage: $0 {start|stop|status|restart|condrestart}"
        ;;
esac
exit $RETVAL
Set the permissions and register the script as a service so that it starts automatically at boot:
chmod 755 /etc/init.d/sphinx
chkconfig --add sphinx
chkconfig --level 345 sphinx on
chkconfig --list | grep sphinx # check
service sphinx start # start
service sphinx stop # stop; the official script had some problems on my AS4, so I changed it crudely
service sphinx restart # restart
service sphinx status # check whether it is running
Verify that searchd is running as the sphinx user:
ps aux | grep searchd
sphinx 24612 0.0 0.3 11376 6256 pts/1 S searchd