Difference between full-text search and LIKE in SQL Server

Source: Internet
Author: User

In SQL Server, the LIKE keyword performs fuzzy queries by determining whether a string matches a specified pattern. A pattern can contain regular characters and wildcards. During pattern matching, regular characters must exactly match the characters in the string, whereas wildcards such as % and _ can match arbitrary parts of the string. This makes LIKE suitable for fuzzy queries in the database.

The database administrator can also use the full-text search feature to query SQL Server tables. Before a full-text query can run against a table, a full-text index must be created on it. Full-text search can serve the same purpose as a LIKE fuzzy query, for example searching a résumé table for records that match a specific string. Although the two look similar, their implementation details are quite different, and a database administrator needs to understand the differences in order to choose the appropriate approach.

1. Differences in query efficiency

For structured data, LIKE queries are generally fast and flexible. For unstructured text data, however, a LIKE fuzzy query performs poorly and is much slower than a full-text query. The gap widens as the number of rows grows.
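As a rough sketch of the two approaches compared in this article, the T-SQL below runs a LIKE query and a full-text query for the same phrase. The table dbo.Resumes, its columns, the key index PK_Resumes, and the catalog ResumeCatalog are hypothetical names chosen for illustration. The two predicates are not strictly equivalent: LIKE matches substrings, while CONTAINS matches whole words and phrases.

```sql
-- Hypothetical table, used only for illustration.
CREATE TABLE dbo.Resumes (
    ResumeId INT NOT NULL CONSTRAINT PK_Resumes PRIMARY KEY,
    Content  NVARCHAR(MAX) NOT NULL
);
GO

-- Fuzzy query with LIKE: % matches any string of zero or more characters.
SELECT ResumeId
FROM dbo.Resumes
WHERE Content LIKE N'%database administrator%';
GO

-- Full-text search requires a full-text catalog and a full-text index.
-- The KEY INDEX must be a unique, single-column, non-nullable index.
CREATE FULLTEXT CATALOG ResumeCatalog;
GO

CREATE FULLTEXT INDEX ON dbo.Resumes (Content)
    KEY INDEX PK_Resumes
    ON ResumeCatalog;
GO

-- Full-text query for the same phrase. The index is populated in the
-- background, so rows may not be found until population has finished.
SELECT ResumeId
FROM dbo.Resumes
WHERE CONTAINS(Content, N'"database administrator"');
GO
```

The LIKE query with a leading % generally has to scan the text of every row, while the CONTAINS query is answered from the full-text index, which is one reason the difference grows with table size.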
For example, suppose a table contains about 3 million rows of text data. A LIKE search over it may take several minutes to return the result, whereas a full-text search over the same data may return in less than a minute. Therefore, when a table holds a large number of text rows, for example more than 10,000, the database administrator can use full-text search to significantly improve query efficiency.

2. Sensitivity to space characters

In a LIKE fuzzy search, every character in the pattern is significant, including spaces. For example, with LIKE 'ABCD  ' (ABCD followed by two spaces), the trailing spaces are part of the pattern: only data that is also 'ABCD  ' with two trailing spaces matches. If the stored value is 'ABCD' with no space or with only one space, the database considers it inconsistent with the condition and does not return the record. Pay special attention to this when using LIKE. If the user or developer cannot be sure whether spaces surround ABCD, use wildcards instead: with '%ABCD%' as the condition, the row is found regardless of any spaces before or after ABCD.

Full-text search, by contrast, usually ignores spaces. The system first processes the query condition and typically filters out any spaces it finds, so full-text search is not sensitive to characters such as spaces.

3. Processing requirements for some special characters

Data is stored differently depending on its data type.
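A minimal sketch of how storage differences between data types can affect LIKE, assuming nothing beyond built-in T-SQL: a CHAR(10) variable is padded with trailing spaces to its full length, so a pattern stored in it carries spaces that a VARCHAR pattern does not. The variable names are illustrative. According to the documented LIKE behavior, the padded pattern is expected not to match 'ABCD', while the unpadded and trimmed patterns are expected to match.

```sql
-- CHAR(10) pads the value with trailing spaces; VARCHAR(10) does not.
DECLARE @CharPattern    CHAR(10)    = 'AB%';
DECLARE @VarcharPattern VARCHAR(10) = 'AB%';

SELECT
    DATALENGTH(@CharPattern)    AS CharPatternBytes,     -- 10 bytes (padded)
    DATALENGTH(@VarcharPattern) AS VarcharPatternBytes,  -- 3 bytes
    -- The padded pattern requires trailing spaces after the wildcard.
    CASE WHEN 'ABCD' LIKE @CharPattern
         THEN 'match' ELSE 'no match' END AS CharResult,
    CASE WHEN 'ABCD' LIKE @VarcharPattern
         THEN 'match' ELSE 'no match' END AS VarcharResult,
    -- Trimming the padding restores the intended pattern 'AB%'.
    CASE WHEN 'ABCD' LIKE RTRIM(@CharPattern)
         THEN 'match' ELSE 'no match' END AS TrimmedResult;
```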
Because of these storage differences, LIKE fuzzy queries may not behave as expected with some data types. For example, a comparison whose pattern mixes char and varchar data may fail to match because of how each type stores its data (char values are padded with trailing spaces to their full length). In addition, the LIKE predicate works only on character data; it cannot be used to query formatted binary data. A database administrator who uses LIKE therefore needs to understand how each data type is stored and why a LIKE comparison can fail. Only then can query errors caused by improper use of LIKE be avoided.

LIKE supports both ASCII and Unicode pattern matching. If all arguments are of an ASCII character data type, ASCII pattern matching is performed. If any argument is of a Unicode data type, all arguments are converted to Unicode and Unicode pattern matching is performed. With Unicode data (nchar or nvarchar), trailing spaces are significant and are taken into account in the comparison; with non-Unicode data, trailing spaces are not significant and do not affect the result.

With full-text search there is usually no such concern, because it supports not only character data but other data types as well, and it can also query formatted binary data. Therefore, if the data types in a table are not uniform, or binary data must be queried, full-text search is recommended over LIKE.

4. Effect of escape characters on queries
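To make the wildcard problem in this section concrete, here is a minimal sketch using a table variable with made-up sample values. The choice of ! as the escape character is arbitrary; any character that does not otherwise appear in the pattern would do.

```sql
-- Sample data is invented for illustration.
DECLARE @Rates TABLE (FailureRate VARCHAR(10));
INSERT INTO @Rates (FailureRate)
VALUES ('10'), ('10%'), ('100'), ('10.5%');

-- % acts as a wildcard here: all four rows start with 10 and are returned.
SELECT FailureRate FROM @Rates WHERE FailureRate LIKE '10%';

-- With ESCAPE, the % after ! is a literal percent sign: only '10%' is returned.
SELECT FailureRate FROM @Rates WHERE FailureRate LIKE '10!%' ESCAPE '!';

-- Alternative without ESCAPE: [%] also matches a literal percent sign.
SELECT FailureRate FROM @Rates WHERE FailureRate LIKE '10[%]';
```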
For example, suppose a table stores percentage values, such as a failure rate of 10% for the product with serial number 10. A user may need to find the rows with a failure rate of 10% for further processing. However, the % in 10% is a wildcard in LIKE patterns. A query with LIKE '10%' matches every value that starts with 10, so it returns both 10 and 10%, which is obviously not what is needed. To keep wildcards from distorting a LIKE query, use the ESCAPE clause to search for strings that contain one or more wildcard characters. In this example, treating % as an ordinary character requires the ESCAPE keyword together with an escape character. If the escape character in a LIKE pattern is not followed by any character, the pattern is invalid and LIKE returns FALSE. If the character following the escape character is not a wildcard, the escape character is discarded and the following character is treated as a regular character in the pattern.

Full-text search is not affected by escape characters in this way. For example, suppose the database contains the rows ABCD, AB, ABEF, and AB*. To find the rows that start with AB, that is, to perform a prefix search, the database administrator can use '"AB*"' as the condition, with the text and asterisk enclosed in double quotation marks. The system then returns all text that matches the text specified before the asterisk. To query only the AB* record, use 'AB*' without the double quotation marks. When the text and asterisk are not enclosed in double quotation marks, full-text search does not treat the asterisk as a wildcard. This is much easier than using escape characters.

5. Differences in specific applications
Because full-text search and LIKE differ in functionality and performance, they also differ in where they are applied; in SQL Server database design, each has its own role. Compared with LIKE, full-text search can run more specific kinds of queries. It can query one or more specific words or phrases, inflectional forms of a word, words or phrases near another word or phrase, synonyms of a word, and words or phrases with weighted values.

These features make full-text search particularly useful in certain scenarios. In practice, it performs especially well in the following application areas. First, on an e-commerce website, full-text search can power fuzzy search by product type or name from the home page. Second, on a recruitment website, it can search the back-end database for candidates by education, work experience, technical expertise, and other criteria.

Regardless of the business scenario, the basic administration and development tasks of full-text search are the same. In a given business scenario, however, full-text indexes and queries can be tuned to meet business goals. For an e-commerce site, for example, maximizing performance may matter more than the ranking of results, recall accuracy (how many of the existing matches a full-text query actually returns), or support for multiple languages. For a law firm, returning every possible match may be the first consideration.
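The query types listed in this section can be sketched with the CONTAINS and CONTAINSTABLE predicates. The table dbo.Products, its ProductId and Description columns, and the search terms are hypothetical; the sketch assumes a full-text index already exists on Description with ProductId as its key, and the thesaurus query only finds synonyms if a thesaurus file has been configured for the language.

```sql
-- A specific phrase.
SELECT ProductId FROM dbo.Products
WHERE CONTAINS(Description, N'"mountain bike"');

-- Prefix search: the term and asterisk are inside double quotation marks.
SELECT ProductId FROM dbo.Products
WHERE CONTAINS(Description, N'"AB*"');

-- Inflectional forms of a word (for example ride, rides, rode, riding).
SELECT ProductId FROM dbo.Products
WHERE CONTAINS(Description, N'FORMSOF(INFLECTIONAL, ride)');

-- A word close to another word.
SELECT ProductId FROM dbo.Products
WHERE CONTAINS(Description, N'bike NEAR frame');

-- Synonyms of a word, taken from the configured thesaurus.
SELECT ProductId FROM dbo.Products
WHERE CONTAINS(Description, N'FORMSOF(THESAURUS, bicycle)');

-- Weighted terms; CONTAINSTABLE returns a KEY and a RANK for sorting.
SELECT p.ProductId, k.[RANK]
FROM dbo.Products AS p
INNER JOIN CONTAINSTABLE(
        dbo.Products, Description,
        N'ISABOUT (lightweight WEIGHT(0.8), durable WEIGHT(0.4))') AS k
    ON p.ProductId = k.[KEY]
ORDER BY k.[RANK] DESC;
```

None of these queries has a direct LIKE equivalent, which is why the two features end up serving different kinds of applications.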
The e-commerce projects, legal case libraries, and other projects I have worked on so far all adopted full-text search and achieved good results. In general, for simple queries a LIKE fuzzy query may work better. For complex queries, especially those that search for content in large bodies of text, full-text search is the better choice, offering excellent performance and accuracy.
