How Oracle creates and uses Full-text indexing _oracle

Source: Internet
Author: User
Tags character set commit create index lexer oracle database

There are also many ways to search for text in an Oracle database without using the Oracle text feature. You can use the standard InStr function and the LIKE operator.

SELECT *from mytext WHERE INSTR (thetext, ' Oracle ') > 0;

SELECT * from MyText WHERE thetext like '%oracle% ';

There are many times, using InStr and like is ideal, especially when searching for only a small table. However, this method of positioning through these text will result in a full table scan, which is expensive for resources, and has very limited search capabilities, so when searching for massive amounts of text data, It is recommended that you use the Full-text Search feature provided by Oralce to establish a full text retrieval step by step, check and set the database role to first check the database for Ctxsys users and Ctxapp feet. Without this user and persona, it means that your database was created without the Intermedia feature installed. You must modify the database to install this feature. The Ctxsys user is locked in the default installation, so the Ctxsys user is enabled first. Step two assigns the execution permission of the CTX_DDL to the user who will use the Full-text index under the Ctxsys user, for example:

Grant execute on CTX_DDL to Pomoho;

First, set the lexical analyzer

Oracle to achieve Full-text search, the mechanism is actually very simple. That is, through Oracle's patented lexical Analyzer (lexer), all of the ideographic units in the article (Oracle called term) are found, recorded in a set of tables that begin with dr$, and note the location, number, hash value, etc. of the term. When retrieving, Oracle looks for the corresponding term from this set of tables and calculates the frequency of its occurrence, calculating the score (score) of each document according to an algorithm, known as the ' match rate '. And Lexer is the core of the mechanism, which determines the efficiency of full-text search. Oracle offers different lexer for different languages, and we can usually use three of them:

Basic_lexer: for English. It can separate the English words from the sentences according to the spaces and punctuation, and can automatically treat some words that have lost the retrieval meaning by the high frequency, such as ' rubbish ' processing, such as if, is, etc., with high processing efficiency. However, the lexer has many problems in Chinese, because it only recognize spaces and punctuation, and Chinese words usually do not have spaces, therefore, it will be the whole sentence as a term, in fact, lost the ability to retrieve. With the words ' Chinese people stand up ', for example, the result of basic_lexer analysis is only one term, that is ' Chinese people stand up '. If you retrieve ' China ' at this time, the content will not be retrieved.

Chinese_vgram_lexer: A specialized Chinese analyzer, All Kanji character sets (Zhs16cgb231280zhs16gbkzht32euczht16big5zht32triszht16mswin950zht16hkscsutf8) are supported. The analyzer analyzes Chinese sentences by word-unit. ' The Chinese people stand up ' this sentence, will be analyzed as follows several term: ' Middle ', ' China ', ' Chinese ', ' People ', ' the People's Station ', ' Stand up ', get up ', ' Come ', '. It can be seen that this method of analysis, the implementation of the algorithm is simple, and can achieve ' catch ', but the efficiency is not satisfactory.

Chinese_lexer: This is a new Chinese parser that only supports the UTF8 character set. Above has seen, Chinese vgram lexer This parser because does not know commonly used Chinese vocabulary, therefore the analysis Unit is very mechanical, like above ' The People Station ', ' stands ' in the Chinese does not appear alone, therefore this kind of term is meaningless, but affects the efficiency. Chinese_lexer's biggest improvement is that the analyzer can understand most of the commonly used Chinese vocabulary, so it can be more efficient to analyze sentences, like the above two stupid units will not appear again, greatly improve the efficiency. But it only supports UTF8, if your database is ZHS16GBK character set, you can only use the stupid Chinese vgram lexer.
If you do not make any settings, Oracle defaults to using the Basic_lexer parser. To specify which lexer to use, you can do this:

BEGIN
 ctx_ddl.create_preference (' my_lexer ', ' chinese_vgram_lexer ');
End;
/

Where My_lexer is the parser name.

Ii. Establishment of Full-text indexing

When establishing the Intermedia Index, indicate the lexer used:

CREATE INDEX Myindex on MyTable (MyColumn) indextype is ctxsys.context parameters (' lexer my_lexer ');

※ Personal experience: After the establishment of Full-text indexing, use Pl/sql Developer Tool View table, in the index column is not see the index information.

And I encountered an error when I deleted the Full-text index:

sql> DROP Index Searchkeytbl_key;
Drop INDEX Searchkeytbl_key
Ora-29868:cannot issue DDL on a domain index marked as LOADING

Workaround:

Ora-29868:cannot issue DDL on a domain index marked as LOADING
Note: When the index is created, disconnect, reboot and so on, cause the index interrupt is not executed successfully, then drop or rebuild and so on will be reported this error
Resolve: Only DROP index ind_name force forcibly deleted, and then rebuilt

Third, the synchronization of the index maintenance

Complete with the following two jobs (the job is built under the same user as the table):

VARIABLE jobno number;
BEGIN
Dbms_job. SUBMIT (: Jobno, ' Ctx_ddl.sync_index (' index_name '); ',
sysdate, ' Sysdate + (1/24/4) ');
commit;
End;      Synchronous
 
VARIABLE jobno number;
BEGIN
Dbms_job. SUBMIT (: Jobno, ' Ctx_ddl.optimize_index (' myindex ', ' ' full '); ',
sysdate, ' sysdate + 1 ');
commit;     Optimization

After the completion of the manual run:

EXEC Dbms_job.run (jobno);

※ Personal experience: Run the job may have problems, at this time can run the index alone, try

exec ctx_ddl.sync_index (' index_name ');

If there is no problem running on your own, check that the job is wrong or that the Oracle database user currently operating has permission to run the stored procedure

Sql> exec Dbms_job.run (190);
Begin Dbms_job.run (190); End

Ora-12011:execution of 1 jobs failed
Ora-06512:at "SYS. Dbms_ijob ", line 406
Ora-06512:at "SYS. Dbms_job ", line 272
Ora-06512:at Line 1

The escalation error is caused by the user not running any stored procedures, and the user is required to add this permission:

Sql> Grant execute any procedure to oracle_username;

And look at the job.

SELECT * from User_jobs;

Four, test

Associated query: SELECT * FROM table_name where CONTAINS (column_name, ' keyword ') >0;
Sql> SELECT * from Searchkeytbl where type= ' city ' and contains (key, ' Yangpu ') >0;

USERNAME TYPE KEY
-------------------- ---------------------------------------- --------------------------------------------------- -----------------------------
Mujian80 City, Yangpu District, Shanghai city

V. Issues

Problems with Full-text indexing (keep updating)

Sql> CREATE index Gh_ghname_idx on GH (ghname) indextype is ctxsys.context parameters (' lexer gh_ghname_lexer ');
Create index gh_ghname_idx on GH (ghname) indextype is ctxsys.context parameters (' lexer gh_ghname_lexer ')

Ora-24795:illegal COMMIT attempt made
Ora-29855:error occurred in the execution of Odciindexcreate routine
Ora-20000:oracle Text Error:
Drg-50857:oracle error in DRVDDL. Indexcreate
Ora-20000:oracle Text Error:
Drg-50857:oracle error in DRVDML. Maintainktab
Ora-24795:illegal COMMIT attempt made
Ora-06512:at "Ctxsys. Drue ", line 160
Ora-06512:at "Ctxsys. Textindexmethods ", line 364
To avoid the error, please use one of the following solutions

1. Don ' t use a 32k-blocksized tablespace to store the internal index objects
-OR-
2. Download Patch 5596325 from Metalink and apply it as described in the README file.

Look, there's probably not enough table space to create an index.

Reports-->dba-->total free space   pl/sql Developer tool to view the remaining space in the Tablespace
Select * from V$datafile;               View data file information

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.