Sphinx + MySQL full-text search: architecture and installation


Preface:

This article describes a full-text retrieval (search engine) architecture for data sets on the order of ten million records, which has been tested in a production environment. It contains only excerpts from the first few chapters and does not provide the full text.

The test environment was a Dell PowerEdge 6850 server (four 64-bit Intel Xeon MP 7110N processors, 8 GB RAM) running RedHat AS4 Linux and MySQL 5.1.26 with the MyISAM storage engine and key_buffer=1024M. The test table held 10 million records and had more than 10 fields of types such as int, datetime, varchar and text, with only a primary key and no other index. On this table, an SQL query that uses the primary key (PRIMARY KEY) as its WHERE condition is very fast and takes only 0.01 seconds.

Sphinx, an open-source full-text search engine from Russia, can hold up to 100 million records in a single index, and with 10 million records a query takes 0.x seconds (millisecond level). Sphinx builds an index of 1 million records in 3-4 minutes and an index of 10 million records within 50 minutes. An incremental index that contains only the latest 100,000 records can be rebuilt in a few tens of seconds.
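As a rough sanity check on the indexing figures quoted above, the following sketch converts them into records per second and estimates how long a 100,000-record incremental index would take at the slowest of those rates. The function and variable names are illustrative assumptions, and the calculation assumes throughput stays roughly constant regardless of index size, which the source does not claim.

```python
def records_per_second(records: int, minutes: float) -> float:
    """Average indexing throughput implied by a record count and a duration."""
    return records / (minutes * 60)


# Figures quoted in the text: 1 million records in 3-4 minutes,
# 10 million records within 50 minutes.
one_million_slow = records_per_second(1_000_000, 4)   # 4-minute case
one_million_fast = records_per_second(1_000_000, 3)   # 3-minute case
ten_million = records_per_second(10_000_000, 50)      # 50-minute upper bound

# Assumption: the incremental index is built at the slowest rate above.
delta_seconds = 100_000 / ten_million

print(f"1M records:  {one_million_slow:.0f} to {one_million_fast:.0f} records/s")
print(f"10M records: at least {ten_million:.0f} records/s")
print(f"100k-record incremental index: about {delta_seconds:.0f} s")
```

Under these assumptions the estimate comes out at about 30 seconds, which is in line with the "tens of seconds" reported for rebuilding the incremental index.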

Based on the above points, I designed this search engine architecture. It has been running in a production environment for a week, and the results are very good. When I have time, I plan to develop a MySQL storage engine plug-in specifically to work with the Sphinx search engine: simple in logic, fast, low in memory use and free of table locks. It would replace the MyISAM engine and solve the delays caused by table locking when MyISAM handles frequent update operations. In addition, there are no remaining technical problems with distributed search.

I. Search engine architecture design:

1. Search engine architecture diagram:

2. Design ideas behind the search engine architecture:

(1) The simplest possible way to call it:

To make things as easy as possible for front-end web engineers, a single simple SQL statement, "SELECT ... FROM myisam_table JOIN sphinx_table ON (sphinx_table.sphinx_id=myisam_table.id) WHERE query='...';", is all that is needed to perform an efficient search.
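To illustrate how a web application might assemble that statement, here is a minimal sketch. It assumes the sphinx_table is a SphinxSE table, whose query column takes the search keywords followed by semicolon-separated options such as mode, offset and limit. The helper name build_sphinx_query, the default option values and the choice to select only myisam_table.id are assumptions made for illustration; the source elides both the column list and the query string.

```python
def build_sphinx_query(keywords: str, mode: str = "any",
                       limit: int = 20, offset: int = 0) -> str:
    """Build the string compared against the `query` column of a SphinxSE table.

    SphinxSE separates the keywords and each option with ';', so this sketch
    simply rejects keywords that contain one instead of trying to escape them.
    """
    if ";" in keywords:
        raise ValueError("keywords must not contain ';'")
    return f"{keywords};mode={mode};offset={offset};limit={limit}"


# Hypothetical column list: the source only shows "SELECT ...".
# %s is the placeholder style used by the common Python MySQL drivers.
SEARCH_SQL = (
    "SELECT myisam_table.id FROM myisam_table "
    "JOIN sphinx_table ON (sphinx_table.sphinx_id = myisam_table.id) "
    "WHERE query = %s"
)

params = (build_sphinx_query("mysql sphinx", mode="any", limit=10),)
print(SEARCH_SQL)
print(params)
```

With a DB-API MySQL driver, SEARCH_SQL and params would be passed together to cursor.execute, so the search string is sent as a bound parameter rather than spliced into the SQL text.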

(2) Fast index creation and fast queries:

① Sphinx Search is a high-performance full-text search package developed by Andrew Aksyonoff of Russia and released under a dual license: the GPL and a commercial license.

Features of Sphinx:

High-speed indexing (up to 10 MB/s, while Lucene builds indexes at 1.8 MB/s)

High-performance search (queries on 2-4 GB of text return results in 0.1 seconds on average)

High scalability (up to 100 GB of text indexed in testing; a single index can contain 100 million records)

Support for distributed search

Relevance ranking of results that combines phrase-based and statistical ranking

Support for any number of document fields (numeric attributes or full-text fields)

Support for different search modes ("match all", "match phrase" and "match any")

Can be used as a MySQL storage engine (SphinxSE)
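The three search modes in the feature list can be pictured with a toy matcher. This is only a sketch of the semantics as I understand them (all query words present, query words adjacent and in order, or at least one query word present); it is not how Sphinx is implemented, and the function name matches and the whitespace tokenization are assumptions.

```python
def matches(document: str, query: str, mode: str) -> bool:
    """Toy illustration of the 'all', 'phrase' and 'any' matching modes."""
    doc_words = document.lower().split()
    query_words = query.lower().split()
    if mode == "all":
        # Every query word must occur somewhere in the document.
        return all(word in doc_words for word in query_words)
    if mode == "any":
        # At least one query word must occur in the document.
        return any(word in doc_words for word in query_words)
    if mode == "phrase":
        # The query words must occur next to each other, in order.
        n = len(query_words)
        return any(doc_words[i:i + n] == query_words
                   for i in range(len(doc_words) - n + 1))
    raise ValueError(f"unknown mode: {mode}")


doc = "Sphinx is a full-text search engine"
print(matches(doc, "search engine", "phrase"))   # adjacent and in order
print(matches(doc, "engine search", "phrase"))   # same words, wrong order
print(matches(doc, "engine search", "all"))      # order does not matter
print(matches(doc, "engine lucene", "any"))      # one word is enough
```

In this toy version only the second call returns False, since "engine search" does not appear as a contiguous phrase in the sample document.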
