Oracle Fuzzy Query optimization

Source: Internet
Author: User

Fuzzy queries are frequently used in database queries, and are commonly used in the following formats:

(1) field like '% keyword% ' field contains ' keyword ' records even if the target field is indexed and not indexed, the slowest

(2) field like ' Keyword% ' field records starting with "keyword" can be used in ascending index established in target Field

(3) The record of the field like '% keyword ' field ending with ' keyword ' can use descending index established to target Field

For the '% keyword% ' pattern that cannot be indexed, there is no way to optimize it, the answer is yes,

The InStr (Strsource,strtarget) function is available in Oracle, which is much more efficient than using the '% keyword% ' mode.

InStr Function Description:

INSTR

(Source string, target string, start position, match ordinal)

In Oracle/plsql, the InStr function returns the position of the string to intercept in the source string. Retrieve only once, that is, from the beginning of the character

Ends at the end of the character.

The syntax is as follows:

InStr (string1, string2 [, Start_position [, Nth_appearance]])

Parametric analysis:

String1

The source string to look up in this string.

string2

The string to find in the string1.

Start_position

Represents the location of the string1 to begin the search. This parameter is optional if omitted by default to 1. The string index starts at 1. If this parameter is positive, it is retrieved from left to right, and if this parameter is negative, right-to-left, returns the starting index of the string to find in the source string.

Nth_appearance

Represents the string2 to find the occurrence of the first few occurrences. This parameter is optional, and if omitted, the default is 1. If the system is negative, it will give an error.

Attention:

If String2 is not found in String1, the InStr function returns 0.

Example:

SELECT InStr (' Syranmo ', ' s ') from dual; --Return 1

SELECT InStr (' Syranmo ', ' RA ') from dual; --Return 3

SELECT InStr (' Syran Mo ', ' a ', up to) from dual; --Return 0

Contrast:

InStr (title, ' manual ') >0 equivalent to title like '% manual% '

InStr (Title, ' Handbook ') =1 equivalent to title like ' Handbook% '

InStr (title, ' manual ') =0 equivalent to title not a like '% manual% '

Fuzzy query optimization:

Knowing the use of the InStr function, optimization becomes simpler, such as the% keyword% equivalent to InStr (field, ' keywords ') >0

Practical application:

there are nearly 11 million data in the T table, and many times we want to do string matching, in SQL statements we usually use like to reach our search target. However, the actual test shows thatthe efficiency of like is quite different from the InStr function. Here are some test results:

Sql> Set Timing on
Sql> Select COUNT (*) from T where InStr (title, ' Manual ') >0;

COUNT (*)
----------
65881

elapsed:00:00:11.04
Sql> Select COUNT (*) from T where title like '% manual % ';

COUNT (*)
----------
65881

elapsed:00:00:31.47
Sql> Select COUNT (*) from T where InStr (title, ' Manual ') =0;

COUNT (*)
----------
11554580

elapsed:00:00:11.31
Sql> Select COUNT (*) from T where title is not a like '% manual % ';

COUNT (*)
----------
11554580

In addition, I am in the knot of another 200 million more tables, using 8 parallel, using like query for a long time will not come out results, but using instr,4 minutes to complete the search, performance is quite good. These tips are good to use and work much better. The above test shows that some of thefunctions built by Oracle are optimized to a considerable extent.

InStr (title, ' AAA ') >0 equivalent to Like

InStr (title, ' AAA ') =0 equivalent to not

Special usage:

Select ID, name from users where InStr (' 101914, 104703 ', id) > 0;
It is equivalent to
Select ID, name from users where id = 101914 or id = 104703;

Use Oracle of the InStr function and index matching improve the efficiency of fuzzy query

In general, in the Oracle database, we use the following two ways to make a fuzzy query on the name field of the TB table:
1.select * from TB where name is like '%xx% ';
2.select * from TB where InStr (name, ' XX ') >0;

If there is no index on the name field, the efficiency is almost the same, basically no difference.

To increase efficiency, we can add a non-unique index to the name field:
CREATE INDEX Idx_tb_name on TB (name);

This way, then use

SELECT * from TB where InStr (name, ' XX ') >0;

Such a statement query, the efficiency can be improved a lot, the larger the amount of table data, the greater the difference. But also take into account that the Name field is indexed after the DML statement causes the index data to reorder the effect.

Another type of unknown scenario:

Some people say that with full-text indexing, I looked, the steps are very troublesome, but is a good way to keep the spare:

http://sandish.itpub.net/post/4899/464369

Full-text Search for the Address field in the Cmng_custominfo table:
1, in the oracle9201 need to create a word participle of something:

BEGIN
Ctx_ddl.create_preference (' sms_address_lexer ', ' chinese_lexer ');
--ctx_ddl.create_preference (' my_lexer ', ' chinese_vgram_lexer '); No
End

2, create a full-text search:

CREATE INDEX Inx_custominfo_addr_docs on Cmng_custominfo (address) Indextype is Ctxsys. CONTEXT PARAMETERS (' LEXER sms_address_lexer ');

3, query time, use:

SELECT * from Cmng_custominfo where contains (address, ' Golden Town ') >1;

4, synchronization and optimization are required on a regular basis:
Sync: Updates the index of the full-text search based on the text content of the new record.

Begin
Ctx_ddl.sync_index (' Inx_custominfo_addr_docs ');
End

Optimization: Clears garbage from the full-text search index based on deleted records

Begin
Ctx_ddl.optimize_index (' Inx_custominfo_addr_docs ', ' FAST ');
End

5, use job to do the work in step 4:

1) This function requires the use of Oracle's job function to complete
Because Oracle9i does not enable the job feature by default, you first need to increase the job configuration parameters for the Oracle DB instance:
Job_queue_processes=5
Restart the Oracle database service and the listener service.

2) Synchronization and optimization
--Sync Sync:
Variable jobno number;
BEGIN
Dbms_job. SUBMIT (: Jobno, ' Ctx_ddl.sync_index (' Inx_custominfo_addr_docs '); ', Sysdate, ' Sysdate + (1/24/4) ');
Commit
END;

--Optimization
Variable jobno number;
Begin
Dbms_job. SUBMIT (: Jobno, ' Ctx_ddl.optimize_index (' Inx_custominfo_addr_docs ', ' full '); ', Sysdate, ' sysdate + 1 ');
Commit
END;

Among them, the first job of Sysdate + (1/24/4) is to synchronize every 15 minutes, the second job of Sysdate + 1 is every 1 days to do a full optimization. Specific time intervals, depending on the needs of the application

6, Index rebuild
Rebuilding the index removes the original index and rebuilds the index, which takes a long time.
The rebuild index syntax is as follows:
ALTER INDEX Inx_custominfo_addr_docs REBUILD;

According to some web-based experience, Oracle rebuilds the index is also faster, there is a use of the home description:

Oracle's full-text search builds and maintains indexes much faster than MS SQL Server, with the author's 650,000-record table indexed in just 20 minutes and only 1 minutes in sync.
Therefore, you may also consider using the job method to periodically rebuild the index.

Reference:
1,http://blog.csdn.net/yurenjia/archive/2007/04/08/1556306.aspx
2,http://topic.csdn.net/u/20080117/23/ 34004f4a-4989-47ef-8764-0b7e3bf737a7.html
3,http://tenwe.com/tech/ database/oracle/200702/content_561_4.shtml
4,http://www.knowsky.com/ 389357.html
5,http://yangtingkun.itpub.net/post/468/195520
6,http://bbs.zdnet.com.cn/archiver/tid-120474.html
7,http://bbs.51cto.com/archiver/tid-26270.html
8,http:// oracle.chinaitlab.com/exploiture/720104_3.html
9,http://www.33kuai.cn/ html/shujuku/20080126/5314_2.html
10,http://www.xrss.cn/dev/database/ 20084218963.Html

Oracle Fuzzy query optimization

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.