Oracle Full-text indexing creation and use

Source: Internet
Author: User
Tags create index lexer

Sort out all the questions I've encountered about full-text indexing.

I. Setting up a lexical analyzer

Oracle implements full-text retrieval, and its mechanism is simple. That is, through the Oracle patented Lexical Analyzer (lexer), all the ideographic units in the article (Oracle called term) are found, recorded in a set of tables beginning with dr$, and note the position, number of times, hash value and other information. When retrieved, Oracle looks for the appropriate term from this set of tables and calculates its frequency, based on an algorithm that calculates the score for each document (score), the so-called ' match rate '. And Lexer is the core of the mechanism, it determines the efficiency of full-text search. Oracle offers different lexer for different languages, and we typically have three of them:

1, Basic_lexer: for English. It can be based on space and punctuation to the English words from the sentence separation, but also automatically some of the occurrence of high frequency has lost the meaning of the word as ' garbage ' processing, such as if, is, has a high processing efficiency. However, the lexer applied to Chinese has many problems, because it only recognize spaces and punctuation, and Chinese words usually do not have spaces, so it will be the whole sentence as a term, in fact, the loss of retrieval ability. With the words ' Chinese people stand up ' as an example, the result of basic_lexer analysis is only one term, that is ' Chinese people stand up '. If you retrieve ' China ' at this point, the content will not be retrieved.

2, Chinese_vgram_lexer: Special Chinese parser, support all Chinese character set (zhs16cgb231280 zhs16gbk zht32euc zht16big5 zht32tris zht16mswin950 Zht16hkscs UTF8). The analyzer analyzes Chinese sentences in terms of the word unit. ' The Chinese people stand up ' this sentence will be analyzed by it into the following term: ' Chinese ', ' China ', ' countrymen ', ' People ', ' min zhan ', ' Stand Up ', ' up ', ' Come ', ' up '. It can be seen that this method of analysis, the implementation of the algorithm is very simple, and can achieve ' clean sweep ', but the efficiency is passable.

3, Chinese_lexer: This is a new Chinese parser, only supports the UTF8 character set. As seen above, Chinese Vgram lexer This analyzer because do not know the commonly used Chinese vocabulary, so the analysis of the unit is very mechanical, like the above "station", "Stand Up" in the Chinese will not appear alone, so this term is meaningless, but affect efficiency. Chinese_lexer's biggest improvement is that the analyzer can recognize most commonly used Chinese words, so it can more efficiently analyze sentences, like the above two foolish units will not appear again, greatly improved efficiency. But it only supports UTF8, if your database is the ZHS16GBK character set, you can only use the stupid Chinese vgram lexer.

If you do not make any settings, Oracle uses the Basic_lexer parser by default. To specify which lexer to use, you can do this:

BEGIN

Ctx_ddl.create_preference (' my_lexer ', ' chinese_vgram_lexer ');

END;
/

Where My_lexer is the parser name.

Ii. establishment of full-text indexing

When establishing the Intermedia Index, indicate the lexer used:


CREATE INDEX myindex on MyTable (MyColumn) Indextype is Ctxsys.context parameters (' lexer my_lexer ');

※ Personal experience: After the establishment of the full-text index, with PL/SQL Developer Tool View table, in index This column is not see the index information.

And I encountered an error when deleting the full-text index:

sql> DROP Index Searchkeytbl_key;

Drop INDEX Searchkeytbl_key

Ora-29868:cannot issue DDL on a domain index marked as LOADING

Workaround:

Ora-29868:cannot issue DDL on a domain index marked as LOADING
Description: This error will be reported when the index is created, broken, restarted, and so on, causing the index break to fail to succeed, and then drop or rebuild.
Resolution: Only DROP Index ind_name force is removed and then rebuilt

Third, index synchronization maintenance

Complete with the following two jobs (the job is to be built under the same user as the table):

VARIABLE jobno number;

BEGIN

Dbms_job. SUBMIT (: Jobno, ' Ctx_ddl.sync_index (' index_name '); ',

Sysdate, ' Sysdate + (1/24/4) ');

Commit

END; Synchronous

VARIABLE jobno number;

BEGIN

Dbms_job. SUBMIT (: Jobno, ' Ctx_ddl.optimize_index (' myindex ', ' full '); ',

Sysdate, ' sysdate + 1 ');

Commit Optimization

After the completion of the manual operation:

EXEC Dbms_job.run (jobno);

※ Personal experience: Running the job may be problematic, you can run the index separately, try

exec ctx_ddl.sync_index (' index_name ');

If there is no problem running alone, check if the job is wrong or the Oracle database user for the current operation has no permissions to run the stored procedure

Sql> exec Dbms_job.run (190);

Begin Dbms_job.run (190); End

Ora-12011:execution of 1 jobs failed
Ora-06512:at "SYS. Dbms_ijob ", line 406
Ora-06512:at "SYS. Dbms_job ", line 272
Ora-06512:at Line 1

To escalate the error is caused by the user not running any stored procedures, you need to add this permission to the user:

Sql> Grant execute any procedure to oracle_username;

And look at the job situation.

SELECT * from User_jobs;

Iv. Testing

Correlation query: SELECT * FROM table_name where CONTAINS (column_name, ' keyword ') >0;

Sql> SELECT * from Searchkeytbl where type= ' city ' and contains (key, ' Yangpu ') >0;

USERNAME TYPE KEY
-------------------- ---------------------------------------- --------------------------------------------------- -----------------------------
Mujian80 City, Yangpu District, Shanghai city

V. Questions

Issues with full-text indexing (constantly updated)

Sql> CREATE index Gh_ghname_idx on GH (ghname) indextype is ctxsys.context parameters (' lexer gh_ghname_lexer ');

Create index gh_ghname_idx on GH (ghname) indextype is ctxsys.context parameters (' lexer gh_ghname_lexer ')

Ora-24795:illegal COMMIT attempt made
Ora-29855:error occurred in the execution of Odciindexcreate routine
Ora-20000:oracle Text Error:
Drg-50857:oracle error in DRVDDL. Indexcreate
Ora-20000:oracle Text Error:
Drg-50857:oracle error in DRVDML. Maintainktab
Ora-24795:illegal COMMIT attempt made
Ora-06512:at "Ctxsys. Drue ", line 160
Ora-06512:at "Ctxsys. Textindexmethods ", line 364

To avoid the error, please use one of the following solutions

1. Don ' t use a 32k-blocksized tablespace to store the internal index objects
-OR-
2. Download Patch 5596325 from Metalink and apply it as described in the README file.

Take a look. The table space that might be used to create the index is not enough

reports-->dba-->total free Space PL/SQL Developer tool to see the rest of the table space

SELECT * from V$datafile; Viewing data file information

Oracle Full-text indexing creation and use

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.