The company project just import a large number of product data, and then found that the site's product search is very card, originally is the original SQL like to do fuzzy search, the amount of data 20W can also be accepted, but millions of cards, so need to do optimization.
After consideration, it is intended to be optimized using the architecture of full-Text search Sphinx + database Middleware (ATLAS/MYCAT).
My environment:
centos6.5 64-bit
lnmp1.3 One-click Environment Pack
CentOS6.4 X64 installing Sphinx and Sphinx for PHP extensions
Before installing, be sure to install the usual components and then download the latest Sphinx on the official website.
Yum install-y python Python-devel
http://sphinxsearch.com/downloads/release/
Installing Sphinx
Tar zxvf sphinx-2.2.10-release.tar.gzcd sphinx-2.2.10-release./configure--prefix=/usr/local/sphinx–- With-mysqlmake && make Install
If there is a undefined reference to libiconv error in make, refer to http://www.lvtao.net/database/sphinx-make-error.html workaround
Libsphinxclient Installation (PHP module required)
CD Api/libsphinxclient./configure–prefix=/usr/local/sphinxmake && make Install
Installing the PHP Sphinx module
: Http://pecl.php.net/package/sphinx
wget Http://pecl.php.net/get/sphinx-1.3.0.tgztar zxf sphinx-1.3.3.tgzcd sphinx-1.3.3/usr/local/php/bin/phpize./ Configure--with-php-config=/usr/local/php/bin/php-config--with-sphinx=/usr/local/sphinx/make && make Install
Add PHP Extension Library
View PHP.ini Location
PHP--ini
Edit Configuration
Vi/usr/local/php/etc/php.ini
: $ jump to end of file
Extension_dir= "/usr/local/php/lib/php/extensions/no-debug-non-zts-20131226/"
[Sphinx]
Extension=sphinx.so
Php-m or Phpinfo () to see if the extension has been loaded
First we have to set up the index on the server side so that PHP can access it via port.
Copy the default profile and recreate a configuration file
Cp/usr/local/sphinx/etc/sphinx-min.conf.dist/usr/local/sphinx/etc/sphinx.conf
Sphinx.conf.dist is the full version of the default configuration, there is a lot of content, I choose to replicate here is the Sphinx-min.conf.dist mini version, as long as the basic query needs to meet
# # Minimal Sphinx Configuration sample (clean, simple, functional) #source src1{type = MySQL Sql_host = localhost Sql_user = root Sql_pass = root sql_db = Allchips_test Sql_port = 3306 # optional, default is 3306 Sql_que ry = select * FROM products #sql_attr_uint = id #sql_attr_timestamp = date_added sql_field_string = product_id sql_field_string = Partno}source src2{type = MySQL Sql_host = localhost Sql_user = root Sql_pass = Root sql_db = Allchips_test Sql_port = 3306 # optional, default is 33 Sql_query = select * from product_prices}source src3{type = MySQL Sql_host = localhost Sql_user = root Sql_pass = Root sql_db = Allchips_test sql_port = 3306 # optional, default is 3306 sql_query = Select * from Product_attrs}index products{Source = src1 Path =/mnt/data/prod Ucts Min_infix_len = 1 Infix_fields = partno,short_desc}index prices{Source = Src2 Path =/mnt/data/prices}index attrs{Source = src3 Path =/mnt/data/attrs}indexer{Mem_limit = 128m}searchd{Listen = 9312 Listen = 9306:mysql41 log =/mnt/data/log/searchd.log Query_log =/mnt/data/log/query.log Read_timeout = 5 Max_children = pid_ File =/mNt/data/log/searchd.pid seamless_rotate = 1 preopen_indexes = 1 unlink_old = 1 workers = threads # for RT-to-Work Binlog_path =/mnt/data}
The bottom indexer and searchd are index creation, and the configuration of the query command, basically as long as you set the log path you want to
Important above section, source (source) and index (index)
Analysis of my needs, my product search main 3 sheets
Product table products, (id,product_id)
Product Price List product_prices,
Product parameter Table Product_attrs
The product_id of the product list is related to 1-to-many
SOURCE Src1 corresponds to index products
SOURCE SRC2 corresponding to index prices
SOURCE SRC3 corresponding to index attrs
In source, you can set the fields that are returned by the custom
As the above
sql_field_string = product_id
sql_field_string= PartNo
Once configured, create an index
I use/usr/local/sphinx/bin/indexer-c/usr/local/sphinx/etc/sphinx.conf--all--rotate command, except for the no rotate hint. To make sure that the index is built, I still build it separately from multiple sources
/usr/local/sphinx/bin/indexer-c/usr/local/sphinx/etc/sphinx.conf Products
/usr/local/sphinx/bin/indexer-c/usr/local/sphinx/etc/sphinx.conf Prices
/usr/local/sphinx/bin/indexer-c/usr/local/sphinx/etc/sphinx.conf Attrs
If there are no problems, this is usually the case.
The next step is to use Searchd as the daemon for the Sphinx server.
/usr/local/sphinx/bin/searchd-c/usr/local/sphinx/etc/sphinx.conf (test.conf on the way was previously tested, using sphinx.conf)
General if the error
folder does not exist, the folder is created
If the port process is already running, there are 2 ways to stop
1,/usr/local/sphinx/bin/searchd-c/usr/local/sphinx/etc/sphinx.conf--stop
2, NETSTAT-TNL see if Port 9312 is listening
lsof-i:9312 View 9312 port information, get PID
Kill {PID}
Re-execute the SEARCHD command after the kill process starts
==========
PHP side
<?PHP//index.php Phpinfo(); die; $s=Newsphinxclient; $s->setserver ("127.0.0.1", 9312); $s-Setmatchmode (sph_match_phrase); $s->setmaxquerytime (30); $res 1=$s->query (' USB ', ' products '); $res 2=$s->query (' 53e6dde17a667c4b2af1d38ba0a466c4 ', ' prices '); $res 3=$s->query (' 53e6dde17a667c4b2af1d38ba0a466c4 ', ' attrs '); //$res = $s->query (' switch ', ' products '); $res = $s->query (' products '); $err=$s-GetLastError (); //Var_dump (Array_keys ($res [' matches ')]); echo "<br>". " You can read the values in the database by getting the ID. "." <br> "; Echo' <pre> '; $products=!Empty($res 1[' matches '])?$res 1[' Matches ']: ""; $prices=!Empty($res 2[' matches '])?$res 2[' Matches ']: ""; $attrs=!Empty($res 3[' matches '])?$res 3[' Matches ']: ""; Print_r($products); Print_r($prices); Print_r($attrs); if(!Empty($err)){ Print_r($err); } $s->close ();
Coreseek of the official website is not downloaded, so temporarily do not get Chinese. Later to see the time to download a Chinese dictionary to hit in
This is the matches match result returned by the printed query, and if you want to see the entire query result, you can see the PHP manual http://php.net/manual/zh/sphinxclient.query.php
Return data structure
Key |
Value Description |
"Matches" |
Stores the document ID and another hash table that contains the document weights and attribute values |
"Total" |
The total number of matching documents retrieved by this query on the server (that is, the size of the server-side result set and related settings) |
"Total_found" |
The total number of matching documents in the index (found and processed on the server) |
"Words" |
Maps the query keyword (the keyword has been case-converted, stemming, and other processing) to a small hash table that contains statistics about the keyword ("docs"-how many documents appear, "hits"-How many times it appears). |
"Error" |
SEARCHD reported error messages |
"Warning" |
SEARCHD Report warning message |
================================================================
Atlas heard that a lot of people are in use, to be continued in the installation test-
Sphinx Full-Text Search installation configuration and use