Research on full-text retrieval of Oracle (all 10)

Last Update:2013-11-24 Source: Internet

Author: User

Tags lexer

Developer on Alibaba Coud: Build your first app with APIs, SDKs, and tutorials on the Alibaba Cloud. Read more ＞

4. Operate instances

4.1 single and multi-columns support Chinese search

Create table mytable1 (id number primary key, doc1 varchar2 (400), doc2 clob, doc3 clob );

Insert into mytable1 values (1. Today's weather is good. I want to go shopping. Today is Sunday and I don't have to go to work. Good mood every day. Tomorrow is Monday. I want to go to work. Bad mood );

Insert into mytable1 values (2, the day is blue, cloudification. The weather is very good ., It is cloudy and the weather looks like rain. It's not suitable to go out. It's raining, and it's raining. Cannot go out .);

Insert into mytable1 values (3, this is a text, this is a word, this is a pdf );

Commit;

-- Delete the reference first

Begin

Ctx_ddl.drop_preference (my_chinese_vgram_lexer );

Ctx_ddl.drop_preference (my_chinese_lexer );

End;

-- Supports Chinese Word Segmentation

Begin

Ctx_ddl.create_preference (my_chinese_vgram_lexer, chinese_vgram_lexer );

Ctx_ddl.create_preference (my_chinese_lexer, chinese_lexer );

End;

-- Delete the reference first

Begin

Ctx_ddl.drop_preference (my_multi );

End;

-- Multi-column query. If it is only a single column, you do not need to set this type.

Begin

Ctx_ddl.create_preference (my_multi, multi_column_datastore );

Ctx_ddl.set_attribute (my_multi, columns, doc1, doc2, doc3 );

End;

Drop index myindex;

-- Single-Column query, supporting Chinese index creation

Create index myindex on mytable (docs)

Indextype is ctxsys. context

Parameters (datastore ctxsys. default_datastore lexer foo. my_chinese_lexer)

Drop index idx_mytable;

-- Multi-column query: supports the creation of Chinese Indexes

Create index idx_mytable on mytable1 (doc1) indextype is ctxsys. context

Parameters (datastore my_multi lexer foo. my_chinese_lexer );

-- The result of the chinese_lexer lexical analyzer can be queried in all three columns.

Select * from mytable1 where contains (doc1, today)> 0; -- the first data item is retrieved.

Select * from mytable1 where contains (doc1, not suitable)> 0; -- the second data record is retrieved.

Select * from mytable1 where contains (doc1, applicable)> 0; -- the data cannot be retrieved. The word segmentation technology is too simple, and 'unsuitable 'is used as a word.

Select * from mytable1 where contains (doc1, go out)> 0; -- the second data record is retrieved.

Select * from mytable1 where contains (doc1, this is a word)> 0; -- the third data record is retrieved, which is applicable in Chinese and English.

-- Result of the chinese_vgram_lexer lexical analyzer,

-- Although the chinese_vgram_lexer lexical analyzer is not so intelligent, the search results often meet our requirements,

-- For example, the word "unsuitable" should be divided into two words: "unsuitable" and "inappropriate", instead of being a word separately,

-- Chinese_vgram_lexer can be queried, but chinese_lexer cannot.

Drop index idx_mytable;

-- Multi-column query: supports the creation of Chinese Indexes

Create index idx_mytable on mytable1 (doc1) indextype is ctxsys. context

Parameters (datastore my_multi lexer foo. my_chinese_vgram_lexer );

-- The results under the chinese_vgram_lexer lexical analyzer can be queried by all three columns

Select * from mytable1 where contains (doc1, today)> 0; -- the first data item is retrieved.

Select * from mytable1 where contains (doc1, not suitable)> 0; -- the second data record is retrieved.

Select * from mytable1 where contains (doc1, suitable)> 0; -- the second data record is retrieved. Although this word segmentation is inefficient, the retrieval result is still acceptable.

Select * from mytable1 where contains (doc1, go out)> 0; -- the second data record is retrieved.

Select * from mytable1 where contains (doc1, this is a word)> 0; -- the third data record is retrieved, which is applicable in Chinese and English.

-- Column update operations for multi-column queries

-- Only update the slave table to check whether the updated information can be found.

Update mytable1 set doc2 = adladlhadad this datastore when your text is stored test where id = 2;

-- Synchronously update Indexes

Begin

Ctx_ddl.sync_index (idx_mytable );

End;

-- It can be seen that although you search for three columns, the column you update is not the index corresponding to (doc1), and synchronizing the index does not work.

Select * from mytable1 where contains (doc1, adladlhadad)> 0; -- no record

-- Update the original content of the doc1 column (the actual content remains unchanged and only operations are required)

Update mytable1 set doc1 = the day is blue and cloudification is not supported. The weather is very good. Where id = 2;

-- Synchronously update Indexes

Begin

Ctx_ddl.sync_index (idx_mytable );

End;

-- Query again

Select * from mytable1 where contains (doc1, adladlhadad)> 0; -- results are displayed. You can see that update operations are performed on other queried columns (non-indexed columns, you can update the index together with the column corresponding to the index. You can synchronize the index without changing the index content.

4.2 local disk search

Create table mytable3 (id number primary key, docs varchar2 (2000 ));

Insert into mytable3 values(111555,1.txt );

Insert into mytable3 values(111556,1.doc );

Insert into mytable3 values(111557,1.xls );

Insert into mytable3 values(111558,1.pdf );

Insert into mytable3 values(111559,2.txt );

Insert into mytable3 values(111560,2.doc );

Insert into mytable3 values(111561,2.xls );

Insert into mytable3 values(111562,2.pdf );

Commit;

-- Delete the reference first

Begin

Ctx_ddl.drop_preference (COMMON_DIR );

End;

-- Create file datastore

Begin

Ctx_ddl.create_preference (COMMON_DIR, FILE_DATASTORE );

Ctx_ddl.set_attribute (COMMON_DIR, PATH, D: search );

End;

-- Delete the index first

Drop index myindex3;

-- Index creation, 8 files, simple content, 1.5 s

Create index myindex3 on mytable3 (docs) indextype is ctxsys. context parameters (datastore COMMON_DIR lexer foo. my_chinese_lexer );

Select * from mytable3 where contains (docs, text)> 0; -- Query, supports txt

Select * from mytable3 where contains (docs, pdf)> 0; -- Query, pdf supported

Select * from mytable3 where contains (docs, excel)> 0; -- Query, excel supported

Select * from mytable3 where contains (docs, word)> 0; -- Query, support for doc

Select * from mytable3 where contains (docs, text)> 0; -- Query, Chinese supported

Select * from mytable3 where contains (docs, document)> 0; -- Query, Chinese supported

Select * from mytable3 where contains (docs, read)> 0; -- Query, supports Chinese pdf

Select * from mytable3 where contains (docs, Which is Excel)> 0; -- Query, supports Chinese

-- Support for doc, txt, xls, and pdf

--Updated file content 2.txt

Select * from mytable3 where contains (docs, the text used for this test)> 0; -- no updated data is queried.

-- The index is not synchronized and is invalid.

-- Synchronously update Indexes

Begin

Ctx_ddl.sync_index (myindex3 );

End;

-- Query again

Select * from mytable3 where contains (docs, test)> 0; -- still invalid

--Use the same value as 2.txt and then synchronize the index

Update mytable3 set docs1_2.txt where id = 111559;

-- Synchronize indexes again

-- Synchronously update Indexes

Begin

Ctx_ddl.sync_index (myindex3 );

End;

-- Query again

Select * from mytable3 where contains (docs, test)> 0; -- the result is displayed. It can be seen that the synchronization index is invalid for the content of a single Update file, and the index recognizes the database record, the index is updated only when the database record changes

-- Adds a new file and returns the same results. The key is to update the database record. Even if the file content is changed, the database record should be updated with the same value.

4.3 highlight search results

Create table my_high (id number primary key, docs varchar2 (1000 ));

Insert into my_high values (1, this is a oracle text example. And oracle is the key word .);

Insert into my_high values (2, <title> oracle text </title> <body> this is a oracle ctx_doc hightlight example. </body> );

Commit;

-- Create an index

Create index ind_m_high on my_high (docs) indextype is ctxsys. context;

-- Return result offset

Set serverout on

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

Get Started for Free

Sales Support

1 on 1 presale consultation

Chat Contact Sales
After-Sales Support

24/7 Technical Support 6 Free Tickets per Quarter Faster Response

Open a Ticket
Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.

Learn More