2014-10-16 09:56:37
target : full-Text Search
Using SQL to search for keywords, while guaranteeing fast and accurate, is still quite difficult.
One of the basic primitives of SQL (and the principle of how SQL Inherits) is that the individual data in a column is atomic .
anti-pattern : pattern matching
Use like or regular expressions.
Cons: (1) Unable to use the index, the full table traversal, very time-consuming, very low performance.
(2) Sometimes return to medical results. SELECT * from bugs where description like '%one% ',
The return result may be money, prone, lonely.
Regular expressions may provide a pattern for word boundaries to solve the problem of word matching.
how to recognize anti-patterns : May be anti-pattern when the following conditions occur
1. How do i insert a variable between the 2 wildcard characters in a like expression?
2. How do I write a regular expression to check if a string contains multiple words, does not contain a specific word, or contains any form of a given word?
3, the search function of the website has been added a lot of documents in the slow unreasonable.
Rational use of anti-patterns :
1, performance is always the most important, if some of the query process is rarely implemented, you do not have to spend a lot of effort to optimize it
2, it is difficult to use pattern matching operation to make complex queries, but if you design such pattern matching for some simple requirements, they can help you to use
Get the right results for the least amount of work.
solution : Using Tools
The best solution is to use special search engine technology instead of SQL.
Another scenario is to save the results to reduce the duplication of search overhead.
1. Full-text indexing in MySQL: You can define a full-text index on a column of type char, varchar, or text. Then use the match function to search.
2. Text index in Oracle: Context, Ctxcat, Ctxxpath, Ctxule.
3. Full-text search in SQL Server: Use the CONTAINS () operator to work with full-text indexing. It needs to be configured in complex steps before use.
4. PostgreSQL Text Search: Provides a complex, large and configurable way to convert text into searchable vocabulary collections and enable these documents to be
Pattern matching search.
5, SQLite full-Text search: Using SQLite extension components to achieve.
6. Third-party search engine:
(1) Sphinx search: Open source search engine, used for MySQL and PostgreSQL to support the use.
(2) Apache Lucene: is a mature search engine for Java programs.
7, to achieve their own search engine:
Use reverse indexing scheme: A reverse index is a list of all the words that may be searched.
(1) Define a keywords table to record all user-searched keywords, and then define a crosstab to establish a many-to-many relationship.
(2) Add each keyword and matching content to the crosstab.
When you have a new search word, use the like query result and save the result to the crosstab so you don't have to like it next time.
When a new document is in storage, it is necessary to populate the crosstab with a trigger (or timing).
Conclusion : It is not necessary to use SQL to solve all problems.
SQL anti-Pattern Learning Note 17 Full text Search