Notes for using sphdomains in mysql to search for subdomains bitsCN.com
Search for subdomains
For example, bitsCN.com searches for www.bitsCN.com, bitsCN.com, and host.bitsCN.com.
If mysql is used with like, the efficiency is very low. millions or even tens of millions of data cannot be used, so sphinx is used.
Many problems have been found during use. here we will summarize them and let unknown friends pay attention to these characters.
Analysis:
Sphinx is a full-text index that searches for contained records.
First, if we do not make any settings, the search for bitsCN.com will display aabitsCN.com, jb51.a.cn, bitsCN.com.com (of course, this domain name suffix does not exist, but there will be a domain name that complies with the relevant rules, here only.
Why is this happening?
We use. /search-c configuration file-I Index name 'bitscn. the following words are divided into two parts: 'jb51' and 'cn'. the default value is. as a separator. if we do not want to use it as a separator, we need to use it. add it to charset_table. here we need to say that we only need letters, numbers, "-", and other characters to search for a domain name. The settings are as follows:
Charset_table = 0 .. 9, .. z-> .. z, .. z, U + 002e, U + 002d, U + 0040, U + 0060 # Where U + 002e represents ". ", U + 002d stands for"-", U + 0040 stands for" @ ", U + 0060 stands for" '", here is the ascii code value.
In this way, domain names such as jb51.a.cn will be searched out.
What about bitsCN.com.com? We can add a unique suffix such as "XXXXX" and concat (search, 'xxxxx') to the index fields.
Now there are domain names like aabitsCN.com. We use the keyword "'" .bitsCN.com "' (note that it is a pair of double quotation marks in single quotes) to search for them. The primary domain name is added separately, however, during the use process, we found that domain names irrelevant to this domain name, such as aa.bb.cn, were found ". "question, later put ". "replace with" @ "to search for the problem that many domain names such as 12306 cannot be read. later, it is estimated that these special characters have special meanings in sphinx, so I found the character "'" and changed it to this character before everything was normal.
Note: After you replace "." with "'" and other corresponding characters, you must add this character to charset_table. Otherwise, it will be ignored.
Therefore, we need to note these special characters during the search process.
BitsCN.com