Notes for using sphdomains in mysql to search for subdomains

Source: Internet
Author: User

Search for subdomains

For example, jb51.net searches for www.jb51.net, jb51.net, and host.jb51.net.

If mysql is used with like, the efficiency is very low. millions or even tens of millions of data cannot be used, so sphinx is used.

Many problems have been found during use. Here we will summarize them and let unknown friends pay attention to these characters.

Analysis:
Sphinx is a full-text index that searches for contained records.

First, if we do not make any settings, search for jb51.net will display aajb51.net, jb51.a.cn, jb51.net.com (of course, this domain name suffix does not exist, but there will be domain names that comply with the relevant rules, here only for example.

Why is this happening?

We use. /search-c configuration file-I index name 'jb51. when searching, you will find that the following words are divided into two parts: 'jb51 'and 'cn'. The default value is. as a separator. If we do not want to use it as a separator, we need to use it. add it to charset_table. here we need to say that we only need letters, numbers, "-", and other characters to search for a domain name. The settings are as follows:
Charset_table = 0 .. 9, .. z-> .. z, .. z, U + 002e, U + 002d, U + 0040, U + 0060 # Where U + 002e represents ". ", U + 002d stands for"-", U + 0040 stands for" @ ", U + 0060 stands for" '", here is the ascii code value.
In this way, domain names such as jb51.a.cn will be searched out.

So what about jb51.net.com? We can add a unique suffix such as "XXXXX" and concat (search, 'xxxxx') to the index fields.

Now there are domain names such as aajb51.net. We use the keyword "'" .jb51.net "' (note that it is a pair of double quotation marks in single quotes) to search for them. The primary domain name is added separately, however, during the use process, we found that domain names irrelevant to this domain name, such as aa.bb.cn, were found ". "Question, later put ". "Replace with" @ "to search for the problem that many domain names such as 12306 cannot be read. Later, it is estimated that these special characters have special meanings in sphinx, so I found the character "'" and changed it to this character before everything was normal.

Note: After you Replace "." with "'" and other corresponding characters, you must add this character to charset_table. Otherwise, it will be ignored.

Therefore, we need to note these special characters during the search process.

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.