Sort out all the questions I've encountered about full-text indexing.
I. Setting up a lexical analyzer
Oracle implements full-text retrieval, and its mechanism is simple. That is, through the Oracle patented Lexical Analyzer (lexer), all the ideographic units in the article (Oracle called term) are found, recorded in a set of tables beginning with dr$, and note the position, number of times, hash value and other information. When retrieved, Oracle looks for the appropriate term from this set of tables and calculates its frequency, based on an algorithm that calculates the score for each document (score), the so-called ' match rate '. And Lexer is the core of the mechanism, it determines the efficiency of full-text search. Oracle offers different lexer for different languages, and we typically have three of them:
1, Basic_lexer: for English. It can be based on space and punctuation to the English words from the sentence separation, but also automatically some of the occurrence of high frequency has lost the meaning of the word as ' garbage ' processing, such as if, is, has a high processing efficiency. However, the lexer applied to Chinese has many problems, because it only recognize spaces and punctuation, and Chinese words usually do not have spaces, so it will be the whole sentence as a term, in fact, the loss of retrieval ability. With the words ' Chinese people stand up ' as an example, the result of basic_lexer analysis is only one term, that is ' Chinese people stand up '. If you retrieve ' China ' at this point, the content will not be retrieved.
2, Chinese_vgram_lexer: Special Chinese parser, support all Chinese character set (zhs16cgb231280 zhs16gbk zht32euc zht16big5 zht32tris zht16mswin950 Zht16hkscs UTF8). The analyzer analyzes Chinese sentences in terms of the word unit. ' The Chinese people stand up ' this sentence will be analyzed by it into the following term: ' Chinese ', ' China ', ' countrymen ', ' People ', ' min zhan ', ' Stand Up ', ' up ', ' Come ', ' up '. It can be seen that this method of analysis, the implementation of the algorithm is very simple, and can achieve ' clean sweep ', but the efficiency is passable.
3, Chinese_lexer: This is a new Chinese parser, only supports the UTF8 character set. As seen above, Chinese Vgram lexer This analyzer because do not know the commonly used Chinese vocabulary, so the analysis of the unit is very mechanical, like the above "station", "Stand Up" in the Chinese will not appear alone, so this term is meaningless, but affect efficiency. Chinese_lexer's biggest improvement is that the analyzer can recognize most commonly used Chinese words, so it can more efficiently analyze sentences, like the above two foolish units will not appear again, greatly improved efficiency. But it only supports UTF8, if your database is the ZHS16GBK character set, you can only use the stupid Chinese vgram lexer.
If you do not make any settings, Oracle uses the Basic_lexer parser by default. To specify which lexer to use, you can do this:
BEGIN
Ctx_ddl.create_preference (' my_lexer ', ' chinese_vgram_lexer ');
END;
/
Where My_lexer is the parser name.
Ii. establishment of full-text indexing
When establishing the Intermedia Index, indicate the lexer used:
CREATE INDEX myindex on MyTable (MyColumn) Indextype is Ctxsys.context parameters (' lexer my_lexer ');
※ Personal experience: After the establishment of the full-text index, with PL/SQL Developer Tool View table, in index This column is not see the index information.
And I encountered an error when deleting the full-text index:
sql> DROP Index Searchkeytbl_key;
Drop INDEX Searchkeytbl_key
Ora-29868:cannot issue DDL on a domain index marked as LOADING
Workaround:
Ora-29868:cannot issue DDL on a domain index marked as LOADING
Description: This error will be reported when the index is created, broken, restarted, and so on, causing the index break to fail to succeed, and then drop or rebuild.
Resolution: Only DROP Index ind_name force is removed and then rebuilt
Third, index synchronization maintenance
Complete with the following two jobs (the job is to be built under the same user as the table):
VARIABLE jobno number;
BEGIN
Dbms_job. SUBMIT (: Jobno, ' Ctx_ddl.sync_index (' index_name '); ',
Sysdate, ' Sysdate + (1/24/4) ');
Commit
END; Synchronous
VARIABLE jobno number;
BEGIN
Dbms_job. SUBMIT (: Jobno, ' Ctx_ddl.optimize_index (' myindex ', ' full '); ',
Sysdate, ' sysdate + 1 ');
Commit Optimization
After the completion of the manual operation:
EXEC Dbms_job.run (jobno);
※ Personal experience: Running the job may be problematic, you can run the index separately, try
exec ctx_ddl.sync_index (' index_name ');
If there is no problem running alone, check if the job is wrong or the Oracle database user for the current operation has no permissions to run the stored procedure
Sql> exec Dbms_job.run (190);
Begin Dbms_job.run (190); End
Ora-12011:execution of 1 jobs failed
Ora-06512:at "SYS. Dbms_ijob ", line 406
Ora-06512:at "SYS. Dbms_job ", line 272
Ora-06512:at Line 1
To escalate the error is caused by the user not running any stored procedures, you need to add this permission to the user:
Sql> Grant execute any procedure to oracle_username;
And look at the job situation.
SELECT * from User_jobs;
Iv. Testing
Correlation query: SELECT * FROM table_name where CONTAINS (column_name, ' keyword ') >0;
Sql> SELECT * from Searchkeytbl where type= ' city ' and contains (key, ' Yangpu ') >0;
USERNAME TYPE KEY
-------------------- ---------------------------------------- --------------------------------------------------- -----------------------------
Mujian80 City, Yangpu District, Shanghai city
V. Questions
Issues with full-text indexing (constantly updated)
Sql> CREATE index Gh_ghname_idx on GH (ghname) indextype is ctxsys.context parameters (' lexer gh_ghname_lexer ');
Create index gh_ghname_idx on GH (ghname) indextype is ctxsys.context parameters (' lexer gh_ghname_lexer ')
Ora-24795:illegal COMMIT attempt made
Ora-29855:error occurred in the execution of Odciindexcreate routine
Ora-20000:oracle Text Error:
Drg-50857:oracle error in DRVDDL. Indexcreate
Ora-20000:oracle Text Error:
Drg-50857:oracle error in DRVDML. Maintainktab
Ora-24795:illegal COMMIT attempt made
Ora-06512:at "Ctxsys. Drue ", line 160
Ora-06512:at "Ctxsys. Textindexmethods ", line 364
To avoid the error, please use one of the following solutions
1. Don ' t use a 32k-blocksized tablespace to store the internal index objects
-OR-
2. Download Patch 5596325 from Metalink and apply it as described in the README file.
Take a look. The table space that might be used to create the index is not enough
reports-->dba-->total free Space PL/SQL Developer tool to see the rest of the table space
SELECT * from V$datafile; Viewing data file information
Oracle Full-text indexing creation and use