Search for subdomains
For example, a search for jb51.net should return www.jb51.net, jb51.net, and host.jb51.net.
Doing this in MySQL with LIKE is very inefficient and becomes unusable with millions or tens of millions of rows, so Sphinx is used instead.
Several problems came up along the way. They are summarized here so that anyone who has not run into them yet knows which characters to watch out for.
Analysis:
Sphinx is a full-text search engine: it returns the records that contain the search terms.
First, if we do not change any settings, a search for jb51.net also returns aajb51.net, jb51.a.cn, and jb51.net.com. (Of course, that last domain suffix does not exist, but real domain names that follow the same pattern do; it is used here only as an example.)
Why is this happening?
When we run ./search -c CONFIG_FILE -i INDEX_NAME with a query such as 'jb51.cn', we find that the query is split into two words, 'jb51' and 'cn'. By default, "." is treated as a separator. If we do not want it to act as a separator, we need to add it to charset_table. For domain names, only letters, digits, "-", and a few other characters need to be searchable. The settings are as follows:
charset_table = 0..9, A..Z->a..z, a..z, U+002e, U+002d, U+0040, U+0060
Here U+002e represents ".", U+002d represents "-", U+0040 represents "@", and U+0060 represents "`" (the backtick). These are the characters' code points, which are the same as their ASCII values.
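To make the effect of this setting concrete, here is a rough Python sketch of how the choice of token characters changes the way a domain name is split. It is only a simulation written for illustration, not Sphinx's actual tokenizer; the function name tokenize and its extra_chars parameter are made up for this example.

```python
import re

def tokenize(text, extra_chars=""):
    """Split text into lowercase tokens.

    Letters and digits always count as token characters; anything listed in
    extra_chars is also kept inside tokens instead of acting as a separator.
    This only imitates the idea behind charset_table.
    """
    allowed = "a-z0-9" + re.escape(extra_chars)
    return re.findall("[" + allowed + "]+", text.lower())

# With letters and digits only, "." separates the words.
print(tokenize("jb51.cn"))

# With ".", "-", "@" and "`" added, the whole domain name stays one token.
print(tokenize("jb51.cn", extra_chars=".-@`"))
```

The first call yields two tokens and the second yields one, which mirrors the difference between the default behaviour and the charset_table shown above.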
With this setting, domain names such as jb51.a.cn are no longer returned.
So what about jb51.net.com? We can append a unique suffix such as "xxxxx" to the indexed field, for example with concat(search, 'xxxxx'), so that the end of each domain name is marked.
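The suffix trick can be sketched in a few lines of Python. This is a simplified stand-in, assuming that the same suffix is also appended to the search keyword and that matching behaves like a substring test; the names SUFFIX, indexed_value and matches are hypothetical and exist only in this example.

```python
SUFFIX = "xxxxx"  # assumed end marker; it must never occur in a real domain name

def indexed_value(domain):
    """Value stored in the index: the domain with the end marker appended."""
    return domain.lower() + SUFFIX

def matches(query_domain, domain):
    """Substring test standing in for the full-text match (an assumption)."""
    return (query_domain.lower() + SUFFIX) in indexed_value(domain)

for candidate in ["www.jb51.net", "jb51.net", "jb51.net.com", "aajb51.net"]:
    print(candidate, matches("jb51.net", candidate))
```

Under these assumptions jb51.net.com is rejected because the marker no longer directly follows "jb51.net", while aajb51.net still matches and has to be handled separately.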
That leaves domain names such as aajb51.net. To exclude them, we search with the keyword '".jb51.net"' (note that this is a pair of double quotation marks inside single quotation marks) and add the primary domain name separately. In practice, however, this returned domain names that have nothing to do with the one we wanted, such as aa.bb.cn, and the cause turned out to be the "." character. We then replaced "." with "@", but after that many domain names, such as 12306, could no longer be found. Our guess is that these special characters have a special meaning in Sphinx. Finally we tried the character "`" (U+0060), and once "." was replaced with it everything worked normally.
Note: after you replace "." with "`" or another substitute character, you must add that character to charset_table. Otherwise it will be ignored.
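The replacement step and the leading-separator keyword can be illustrated with a small Python sketch. It uses plain string comparison in place of Sphinx, so it only shows the intended matching rule; SEP is the U+0060 character from the charset_table above, and the helper names normalize, subdomain_query and is_match are invented for this example.

```python
SEP = "`"  # U+0060, the substitute for "." (must also be listed in charset_table)

def normalize(domain):
    """Lowercase the domain and swap every "." for the substitute character."""
    return domain.lower().replace(".", SEP)

def subdomain_query(domain):
    """Keyword with a leading separator, so only whole labels can match."""
    return SEP + normalize(domain)

def is_match(query_domain, candidate):
    """Plain-Python stand-in for the search: the primary domain is checked
    separately, and subdomains must end with separator + domain."""
    cand = normalize(candidate)
    return cand == normalize(query_domain) or cand.endswith(subdomain_query(query_domain))

for candidate in ["www.jb51.net", "jb51.net", "aajb51.net", "aa.bb.cn"]:
    print(candidate, is_match("jb51.net", candidate))
```

In this sketch the leading separator is what rejects aajb51.net, and the separate equality check plays the role of adding the primary domain name on its own.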
In short, pay attention to these special characters whenever you search.