Research on full-text retrieval of Oracle (full 1)

Source: Internet
Author: User

1. Preparation Process

 

1.1 check and set database roles

First, check whether there are CTXSYS users and CTXAPP roles in the database. If you do not have this user or role, it means that the intermedia function is not installed when your database is created. You must modify the database to install this function. By default, the ctxsys user is locked, so you must enable the ctxsys user first.

 

By default, the ctxsys user is locked and the password becomes invalid immediately. Therefore, we use the sys user to enter em, and then modify the status and password of the ctxsys user.

 

 

1.2 grant permissions

Take the previously created foo user as an example, and the T_DOCNEWS under the user as an example.

Log on to the system as a dba and grant resource and connect permissions to foo.

GRANT resource, connect to foo;

 

Log On with the ctxsys user and grant permissions to the foo user.

GRANT ctxapp TO foo;

GRANT execute ON ctxsys. ctx_cls TO foo;

GRANT execute ON ctxsys. ctx_ddl TO foo;

GRANT execute ON ctxsys. ctx_doc TO foo;

GRANT execute ON ctxsys. ctx_output TO foo;

GRANT execute ON ctxsys. ctx_query TO foo;

GRANT execute ON ctxsys. ctx_report TO foo;

GRANT execute ON ctxsys. ctx_thes TO foo;

GRANT execute ON ctxsys. ctx_ulexer TO foo;

 

View the system's default oracle text Parameters

Select pre_name, pre_object from ctx_preferences

 

 

2. Oracle Text Indexing principles

The Oracle text Index converts all characters in the text into tokens. For example, www.taobao.com converts

Mark as www, taobao, com.

Oracle10g supports four types of indexes: context, ctxcat, ctxrule, and ctxxpath.

 

 

 

2.1 Context Index

Oracle text Index converts all words into tokens. The context index architecture is reverse index (inverted

Index), each mark is mapped to a location containing its own text. For example, the word dog may have the following entries:

This indicates that dog has appeared in the documents doc1, doc3, and doc5. After the index is created, the system automatically generates

See the following DR $ MYINDEX $ I, DR $ MYINDEX $ K, DR $ MYINDEX $ R, DR $ MYINDEX $ X, and MYTABLE5 tables (assume that the table is

Mytable, index is myindx ). After the Dml operation, the context index is not automatically synchronized and must be used

Ctx_ddl.sync_index: manually synchronize indexes.

 

Example:

Create table docs (id number primary key, text varchar2 (200 ));

Insert into docs values (1,

Insert into docs values (2,

Insert into docs values (3,

Commit;

/

-- Create a context Index

Create index idx_docs on docs (text)

Indextype is ctxsys. context parameters

(Filter ctxsys. null_filter section group ctxsys.html _ section_group );

-- Query

Column text format a40; -- the string is 40 characters in length.

Select id, text from docs where contains (text, france)> 0;

 

Id text

-----------------------------------------

3

2

-- Continue to insert data

Insert into docs values (4,

Insert into docs values (5,

Commit;

Select id, text from docs where contains (text, city)> 0; -- newly inserted data not found

 

Id text

--------------------------------------------

2

 

 

-- Index Synchronization

Begin

Ctx_ddl.sync_index (idx_docs, 2 m); -- use 2 m to synchronize the index

End;

-- Query

Column text format a50;

Select id, text from docs where contains (text, city)> 0; -- query data

Id text

-----------------------------------------------

5

4

2

 

 

-- Or operator

Select id, text from docs where contains (text, city or state)> 0;

-- And Operator

Select id, text from docs where contains (text, city and state)> 0;

Or

Select id, text from docs where contains (text, city state)> 0;

 

-- Score indicates the score. The higher the score, the more accurate the data is.

Select score (1), id, text FROM docs where contains (text, oracle, 1)> 0;

The Context-type indexes are not automatically synchronized. After Dml is performed, you must manually synchronize the indexes. The query operator relative to the context index is INS ins.

 

2.2 Ctxcat Index

Used in Multi-column hybrid Query

Ctxcat can use index set to create an index set and add some query columns frequently used in combination with ctxcat queries to the index set. For example, when you query a product name, you also need to query the production date, price, description, and so on. You can add these columns to the index set. Oracle encapsulates these queries into the catsearch operation to improve the efficiency of full-text indexing. In some transactions with high real-time requirements, context indexes cannot be automatically synchronized. ctxcat automatically synchronizes indexes.

 

 

Example:

Create table auction (Item_id number, Title varchar2 (100), Category_id number, Price number, Bid_close date );

Insert into auction values (1, nikon camera, 1,400, 24--200-2002 );

Insert into auction values (2, olympus camera, 1,300, 25--200-2002 );

Insert into auction values (3, pentax camera, 1,200, 26-oct-2002 );

Insert into auction values (4, canon camera, 1,250, 27--200-2002 );

Commit;

 

 

/

-- Determine your query conditions (important)

-- Determine that all queries search the title column for item descriptions

-- Create an index set

Begin

Ctx_ddl.create_index_set (auction_iset );

Ctx_ddl.add_index (auction_iset, price);/* sub-index */

End;

-- Create an index

Create index auction_titlex on auction (title) indextype is ctxsys. ctxcat

Parameters (index set auction_iset );

Column title format a40;

Select title, price from auction where catsearch (title, camera, order by price)> 0;

 

 

Title price

-------------------------

Pentax camera 200

Canon camera 250

Olympus camera 300

Nikon camera 400

 

 

Insert into auction values (5, aigo camera, 1, 10, 27--200-2002 );

Insert into auction values (6, len camera, 1, 23, 27--200-2002 );

Commit;

 

 

 

/

-- Test whether the index is automatically synchronized

Select title, price from auction where catsearch (title, camera,

Price <= 100)> 0;

 

Title price

-------------------------

Aigo camera 10

Len camera 23

 

 

Add multiple subqueries to the index set:

 

Begin

Ctx_ddl.drop_index_set (auction_iset );

Ctx_ddl.create_index_set (auction_iset );

Ctx_ddl.add_index (auction_iset, price);/* sub-index */

Ctx_ddl.add_index (auction_iset, price, bid_close);/* sub-index B */

End;

 

Drop index auction_titlex;

 

Create index auction_titlex on auction (title) indextype is ctxsys. ctxcat

Parameters (index set auction_iset );

SELECT * FROM auction where catsearch (title, camera, price = 200 order by bid_close)> 0;

SELECT * FROM auction where catsearch (title, camera, order by price, bid_close)> 0;

 

After any Dml operation, the index of Ctxcat is automatically synchronized without manual execution. The query operator corresponding to the index of ctxcat is catsearch.

Syntax:

Catsearch (

[Schema.] column,

Text_query varchar2,

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.