Mysql+sphinx implementing full-Text Search

Source: Internet
Author: User
Tags curl openssl openldap

Recently in a search engine, mainly on the book of object-level search, first to understand the next Sphinx bar.

It can improve the speed of your queries, this is not general fast.

Sphinx is an SQL-based full-text search engine that can be combined with mysql,postgresql for full-text searching, providing more specialized search capabilities than the database itself, making it easier for applications to implement specialized full-text searches. Sphinx specifically designed search API interfaces for some scripting languages, such as PHP, Python, Perl, Ruby, and a storage engine plugin for MySQL.

Sphinx A single index can contain up to 100 million records, and the query speed in the case of 10 million records is at the millisecond level. Sphinx indexes are created at a rate of 3-4 minutes to create an index of 1 million records, an index to create 10 million records can be completed in 50 minutes, and only an incremental index of the latest 100,000 records is required, only dozens of seconds to rebuild.

Key features of the Sphinx include:

High-speed index (on the new CPU, nearly ten MB/s);
High-speed search (the average query speed in the 2-4g text volume is less than 0.1 seconds);
High availability (up to a maximum of 100M documents on a single CPU);
Provide a good relevance ranking

Support distributed search;

Provide document summary generation;

Provides search from a plug-in storage engine inside MySQL

Support Boolean, phrase, and synonyms query;

Support for multiple full-text search domains per document (default maximum of 32);

Supports multiple attributes per document;
Support word breaking;

Support single-byte encoding and UTF-8 encoding;

Look at the above features are very good, look at the way to use it.

Native MySQL storage engine retrieval process:

Search based on the Sphinx storage Engine:

I still prefer to use the second storage engine, even if your programming language does not support the Sphinx API's interface can also be used yo.

Need to install some necessary components before starting the installation

Yum-y install gcc g++ gcc-c++ libjpeg libjpeg-devel libpng libpng-devel freetype freetype-devel libxml2 libxml2-devel Zli b zlib-devel glibc glibc-devel glib2 glib2-devel bzip2 bzip2-devel ncurses ncurses-devel Curl curl-devel e2fsprogs E2fspro Gs-devel krb5 krb5-devel libidn libidn-devel OpenSSL openssl-devel openldap openldap-devel nss_ldap openldap-clients Open Ldap-servers Patch Libtool automake imake mysql-devel expat-devel

(1) Install Python support

Yum Install–y python Python-devel

(2) Compile and install libmmseg (libmmseg is a Chinese word-breaker software designed for Sphinx full-text search engine, its Chinese word segmentation method under the GPL agreement, Tsai algorithm using Chin-hao mmseg. Libmmseg is used in this article to generate Chinese word segmentation thesaurus).

wget http://www.coreseek.com/uploads/sources/mmseg-0.7.3.tar.gz

Tar zxvf mmseg-0.7.3.tar.gz

CD mmseg-0.7.3

./configure

Make

Make install

(1) Compile and install MYSQL5.1.26-RC, Sphinx, Sphinxse storage engine

wget http://blog.s135.com/soft/linux/nginx_php/mysql/mysql-5.1.26-rc.tar.gz

Tar zxvf mysql-5.1.26-rc.tar.gz

wget http://www.sphinxsearch.com/downloads/sphinx-0.9.8-rc2.tar.gz

wget Http://www.coreseek.com/uploads/sources/sphinx-0.98rc2.zhcn-support.patch

wget Http://www.coreseek.com/uploads/sources/fix-crash-in-excerpts.patch

Tar zxvf sphinx-0.9.8.rc2.tar.gz

Patch–p1 <. /sphinx-0.98rc2.zhcn-support.patch #补丁

PATCH–P1 < /fix-crash-in-excerpts.patch #补丁

CP–RF Mysqlse. /mysql-5.1.26-rc/storage/sphinx

Cd.. /

CD mysql-5.1.26-rc/

SH build/autorun.sh

./configure--with-plugins= Partition,innobase,myisammrg,sphinx--prefix=/usr/local/mysql/--enable-assembler-- With-extra-charsets=complex--enable-thread-safe-client--with-big-tables--with-readline--with-ssl-- With-embedded-server--enable-local-infile

Make && make install

Cd.. /

Start the MySQL database

CP SUPPORT-FILES/MY-MEDIUM.CNF/ETC/MY.CNF # Configuration file

CP support-files/mysql.server/etc/rc.d/mysqld # Add MySQL service control

Cd/usr/local/mysql

bin/mysql_install_db--user=mysql # Installation

Bin/mysqld_safe--user=mysql & # Test Installation success

Bin/mysql # go to MySQL command prompt

Start stop

/etc/rc.d/mysqld start

/etc/rc.d/mysqld stop

So we created the file/etc/rc.local ourselves and gave execution permission. The general content is:

#!/bin/sh

/usr/local/mysql/bin/mysqld_safe--user=mysql &

Or

/etc/rc.d/mysqld start

Entering the following command appears sphinx indicates that Sphinxse has been ported to MySQL.

Show engines;

In the 0.9.8 version used in this article, it is recommended to use the 0.9.9 version, the 0.9.9 version is the most stable version, I finally changed to 0.9.9 version.

Sphinx Default does not support the Chinese index and retrieval, previously with Coreseek patch to solve, currently coreseek not provide patches alone, and based on Sphinx developed a Coreseek full-text retrieval server, Coreseek should be the most current Sphinx Chinese full-text search, it provides Sphinx design for the Chinese word breaker libmmseg contains mmseg Chinese word, In fact, Coreseek-3.2.14.tar.gz has included the Sphinx, the front of the installation Sphinxse can also use the Mysqlse in the compression package.

Installing autoconf

Tar zxvf autoconf-2.64.tar.gz

CD autoconf-2.64

./configure–prefix=/usr

Make

Make install

Installing Coreseek

Tar zxvf coreseek-3.2.14.tar.gz

CD coreseek-3.2.14

CD mmseg-3.2.14/

./bootstrap

./configure–prefix=/usr/local/mmseg3

Make

Make install

Cd.. /csft-3.2.14/

SH buildconf.sh

./configure--prefix=/usr/local/coreseek--without-python--without-unixodbc--with-mmseg--with-mmseg-includes=/ usr/local/mmseg3/include/mmseg/--with-mmseg-libs=/usr/local/mmseg3/lib/--with-mysql--host=arm

Make

Make install

Cd/usr/local/coreseek/etc

Enter configuration directory by command LS to see 3 files

Example.sql sphinx.conf.dist Sphinx-min.conf.dist

Where Example.sql is an instance of SQL script we import it into the test database in the database as testing data (will create documents and tags tables)

VI sphinx.conf

Enter some content:

SOURCE Src1

{

Type = MySQL

Sql_host = localhost

Sql_user = root

Sql_pass =12345678

sql_db = Test

Sql_port = 3306 # optional, default is 3306

Sql_sock =/tmp/mysql.sock

Sql_query_pre = SET NAMES UTF8

Sql_query = \

SELECT ID, group_id, Unix_timestamp (date_added) as date_added, title, content \

From documents

Sql_attr_uint = group_id

Sql_attr_timestamp = date_added

Sql_query_info = SELECT * from documents WHERE id= $id

}

Index Test1

{

Source = Src1

Path =/usr/local/coreseek/var/data/test1

DocInfo = extern

Charset_type = Zh_cn.utf-8

Mlock = 0

Morphology = None

Min_word_len = 1

Html_strip = 0

Charset_dictpath =/usr/local/mmseg3/etc/

Ngram_len = 0

}

Indexer

{

Mem_limit = 32M

}

Searchd

{

Port = 9312

Log =/usr/local/coreseek/var/log/searchd.log

Query_log =/usr/local/coreseek/var/log/query.log

Read_timeout = 5

Max_children = 30

Pid_file =/usr/local/coreseek/var/log/searchd.pid

max_matches = 1000

Seamless_rotate = 1

preopen_indexes = 0

Unlink_old = 1

}

Description: Code snippet Sorce src1{***} represents the data source contains the database configuration information, SRC1 represents the data source name, can be written casually.

The code snippet Index test1{***} represents the creation of an index for that data source, which appears paired with source, where the value of the source parameter must be the name of a data source.

Build Index

/usr/local/coreseek/bin/indexer-c/usr/local/coreseek/etc/sphinx.conf--all

Problems that arise:

Question 1: If SH build/autorun.sh

But Sphinx will not appear in the configure–h inside, need to execute SH build/cleanup and then execute SH build/autorun.sh and then execute./CONFIGURE–H can now see Sphinx.

Issue 2: If compiling MySQL is an error check to see if the Ncurses installation package is installed

Can be performed: Yum List|grep ncurses

Yum–y Install Ncurses-devel

Yum Install Ncurses-devel

And then execute the./configure.

Issue 3 Installing LIBMMSEG requires that you first perform the Yum install Mysql-devel libxml2-devel expat-devel

Issue 4 An error message appears when installing MMSEG: css/unigramcorpusreader.cpp:89:error: ' strncmp ' is not declared in this scope

Manually modified the Src/css/unigramcorpusreader.cpp

Add a sentence to the above

#include <string.h>

You can then start compiling the installation.

Reprinted from: http://www.cnblogs.com/sunwubin/p/3250554.html

Mysql+sphinx implementing full-Text Search

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.