Simple Optimization of a Lucene Index Library

Source: Internet
Author: User

Tuning the index library to the actual workload can speed up both index creation and search.

1. Merge index library fragment files

The optimize() method of IndexWriter is deprecated because it is very inefficient. Fragment files can instead be merged by setting a merge factor. IndexWriter.setMergeFactor(int) used to serve this purpose, but as of Lucene 3.6 it is deprecated as well; use LogMergePolicy.setMergeFactor(int) instead.

When the value passed to setMergeFactor(int) is small, files are merged more often, so index creation is slower. When the value is large, files are merged less often, so index creation is faster, at the cost of leaving more files in the index. A value larger than 10 is suitable for batch index creation.
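The trade-off can be illustrated with a toy model. The sketch below is not Lucene's actual LogMergePolicy; it assumes every flush produces one equally sized segment and that segments are merged whenever mergeFactor of them exist at the same level. The class name MergeFactorModel and its methods are hypothetical and exist only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of level-based segment merging; NOT Lucene's real merge policy. */
class MergeFactorModel {

    /** Outcome of a simulated indexing run. */
    static final class Result {
        final int merges;        // number of merge operations performed
        final int segmentsLeft;  // segments remaining in the index at the end

        Result(int merges, int segmentsLeft) {
            this.merges = merges;
            this.segmentsLeft = segmentsLeft;
        }
    }

    /**
     * Simulates flushing `flushes` equally sized segments. Whenever
     * `mergeFactor` segments exist at one level, they are merged into
     * a single segment at the next level.
     */
    static Result simulate(int flushes, int mergeFactor) {
        if (mergeFactor < 2) {
            throw new IllegalArgumentException("mergeFactor must be >= 2");
        }
        List<Integer> perLevel = new ArrayList<>(); // perLevel.get(i) = segments at level i
        int merges = 0;
        for (int f = 0; f < flushes; f++) {
            int level = 0;
            while (true) {
                if (perLevel.size() <= level) {
                    perLevel.add(0);
                }
                int count = perLevel.get(level) + 1;
                if (count < mergeFactor) {
                    perLevel.set(level, count);
                    break;
                }
                // mergeFactor segments at this level: merge them into one
                // segment and carry it up to the next level
                perLevel.set(level, 0);
                merges++;
                level++;
            }
        }
        int segmentsLeft = 0;
        for (int c : perLevel) {
            segmentsLeft += c;
        }
        return new Result(merges, segmentsLeft);
    }

    public static void main(String[] args) {
        for (int mergeFactor : new int[] {3, 10, 30}) {
            Result r = simulate(100, mergeFactor);
            System.out.println("mergeFactor=" + mergeFactor
                    + " merges=" + r.merges
                    + " segmentsLeft=" + r.segmentsLeft);
        }
    }
}
```

In this model, 100 flushes with a merge factor of 3 cause 48 merges and leave 4 segments, while a merge factor of 30 causes only 3 merges but leaves 13 segments: less merge work during indexing, more files to search afterwards.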

2. Use an in-memory index directory in combination with the file system index directory

Operations on an in-memory index directory are very fast. We can therefore load the index library from the file system into memory before working on the index, and write it back to the file system once the work is complete.

When the in-memory index is written back to the file system, the index directory on disk must be re-created. For example, suppose the index directory in the original file system contains 10 files. Loading it into the memory directory copies those 10 files into memory, and adding one index file brings the in-memory directory to 11 files. If these are then written on top of the existing directory, the 11 in-memory files are added to the 10 files already on disk, giving 21 files, 10 of which are duplicates. The index directory in the original file system therefore has to be deleted and re-created, for example by opening the file system IndexWriter with OpenMode.CREATE.

However, this approach is not recommended for a very large index library, because it consumes a large amount of memory.
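The file counts in this example can be checked with a small model. This is only a sketch of the arithmetic: directories are represented as lists of hypothetical file names (file1 to file11), and the class WriteBackModel and its methods are invented for this illustration and are not part of Lucene.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model of the write-back step; file names are hypothetical labels. */
class WriteBackModel {

    /** Write-back without clearing the target: old files stay, in-memory files are added. */
    static List<String> appendTo(List<String> fsFiles, List<String> ramFiles) {
        List<String> result = new ArrayList<>(fsFiles);
        result.addAll(ramFiles);
        return result;
    }

    /** Write-back after re-creating the target: only the in-memory files remain. */
    static List<String> recreateFrom(List<String> ramFiles) {
        return new ArrayList<>(ramFiles);
    }

    /** Counts entries that have already occurred earlier in the list. */
    static int countDuplicates(List<String> files) {
        Set<String> seen = new HashSet<>();
        int duplicates = 0;
        for (String file : files) {
            if (!seen.add(file)) {
                duplicates++;
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        // 10 files in the original file system index directory
        List<String> fsFiles = new ArrayList<>();
        for (int i = 1; i <= 10; i++) {
            fsFiles.add("file" + i);
        }

        // Load them into memory, then add one more index file
        List<String> ramFiles = new ArrayList<>(fsFiles);
        ramFiles.add("file11");

        List<String> appended = appendTo(fsFiles, ramFiles);
        System.out.println("append:    " + appended.size() + " files, "
                + countDuplicates(appended) + " duplicates");

        List<String> recreated = recreateFrom(ramFiles);
        System.out.println("re-create: " + recreated.size() + " files, "
                + countDuplicates(recreated) + " duplicates");
    }
}
```

Appending yields 21 files with 10 duplicates, while re-creating the directory first leaves exactly the 11 files held in memory.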
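One possible way to decide whether an index is small enough to load is to compare its size on disk with the JVM's maximum heap. The sketch below uses only the JDK. The class IndexSizeCheck, its methods, and the 25% threshold are assumptions made for illustration; the size on disk is also only a rough proxy for the memory the index would actually occupy.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

/** Rough pre-check before loading an index directory into memory (illustrative only). */
class IndexSizeCheck {

    /** Sums the sizes in bytes of the regular files directly inside dir. */
    static long directorySizeBytes(Path dir) throws IOException {
        try (Stream<Path> files = Files.list(dir)) {
            return files.filter(Files::isRegularFile)
                        .mapToLong(p -> p.toFile().length())
                        .sum();
        }
    }

    /** True if the index occupies at most maxHeapFraction of the given heap size. */
    static boolean fitsInMemory(long indexBytes, long maxHeapBytes, double maxHeapFraction) {
        return indexBytes <= (long) (maxHeapBytes * maxHeapFraction);
    }

    public static void main(String[] args) throws IOException {
        Path indexDir = Paths.get("./indexdir"); // same path as the article's example
        if (!Files.isDirectory(indexDir)) {
            System.out.println("No index directory at " + indexDir);
            return;
        }
        long indexBytes = directorySizeBytes(indexDir);
        long maxHeapBytes = Runtime.getRuntime().maxMemory();
        // 0.25 is an arbitrary safety margin chosen for this sketch
        System.out.println("Index size: " + indexBytes + " bytes, load into memory: "
                + fitsInMemory(indexBytes, maxHeapBytes, 0.25));
    }
}
```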

3. Implementation code

    import java.io.File;
    import java.io.IOException;

    import org.apache.lucene.analysis.Analyzer;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.document.Field.Index;
    import org.apache.lucene.document.Field.Store;
    import org.apache.lucene.index.CorruptIndexException;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.index.IndexWriterConfig;
    import org.apache.lucene.index.IndexWriterConfig.OpenMode;
    import org.apache.lucene.index.LogByteSizeMergePolicy;
    import org.apache.lucene.index.LogMergePolicy;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.FSDirectory;
    import org.apache.lucene.store.LockObtainFailedException;
    import org.apache.lucene.store.RAMDirectory;
    import org.apache.lucene.util.Version;
    import org.junit.Before;
    import org.junit.Test;

    /**
     * Index library optimization
     * @author Luxh
     */
    public class IndexOptimizeTest {

        // Analyzer (tokenizer)
        private Analyzer analyzer;

        // Index storage directory
        private Directory directory;

        /**
         * Initialize the analyzer and the directory
         * @throws IOException
         */
        @Before
        public void before() throws IOException {

            // Create a standard analyzer
            // Version.LUCENE_36 indicates matching the Lucene 3.6 version.
            analyzer = new StandardAnalyzer(Version.LUCENE_36);

            // Create a directory named indexdir in the current path
            File indexDir = new File("./indexdir");

            // Create the index directory
            directory = FSDirectory.open(indexDir);
        }

        /**
         * Merge index fragment files
         * @throws IOException
         * @throws LockObtainFailedException
         * @throws CorruptIndexException
         */
        @Test
        public void testMergeFactor() throws CorruptIndexException, LockObtainFailedException, IOException {

            IndexWriterConfig indexWriterConfig = new IndexWriterConfig(Version.LUCENE_36, analyzer);

            LogMergePolicy mergePolicy = new LogByteSizeMergePolicy();
            // Merge when 3 files are reached
            mergePolicy.setMergeFactor(3);

            indexWriterConfig.setMergePolicy(mergePolicy);
            IndexWriter indexWriter = new IndexWriter(directory, indexWriterConfig);

            indexWriter.addDocument(createBookDocument());
            indexWriter.close();
        }

        /**
         * The combination of the memory index directory and the file system index directory
         * @throws IOException
         */
        @Test
        public void testDirectoryCombination() throws IOException {

            // Create a memory index directory and load the index library from the file system
            RAMDirectory ramDirectory = new RAMDirectory(directory);

            IndexWriterConfig ramIndexWriterConfig = new IndexWriterConfig(Version.LUCENE_36, analyzer);

            IndexWriter ramIndexWriter = new IndexWriter(ramDirectory, ramIndexWriterConfig);

            ramIndexWriter.addDocument(createBookDocument());
            ramIndexWriter.close();

            IndexWriterConfig fsIndexWriterConfig = new IndexWriterConfig(Version.LUCENE_36, analyzer);
            // Create a new index directory or overwrite the original index directory
            fsIndexWriterConfig.setOpenMode(OpenMode.CREATE);

            IndexWriter fsIndexWriter = new IndexWriter(directory, fsIndexWriterConfig);
            // Write the index library in memory to the file system
            fsIndexWriter.addIndexes(ramDirectory);
            fsIndexWriter.close();
        }

        /**
         * Build the sample document that both tests index.
         * Book is a plain bean (id, title, author, content) that the article does not show.
         */
        private Document createBookDocument() {
            Book book = new Book();
            book.setId(1);
            book.setTitle("the eternal path of architecture");
            book.setAuthor("Alexander");
            book.setContent("\"the eternal path of building\" puts forward a new theory and idea "
                    + "about architectural design, building and planning. The core of this theory "
                    + "is that social members set the world order in which they live according to "
                    + "their own status of existence. This ancient method fundamentally forms the "
                    + "foundation of post-industrial architecture; these buildings are created by people.");

            // Create document
            Document doc = new Document();
            // Store specifies whether the field value is stored;
            // Index specifies whether the field is analyzed (tokenized) when indexed
            doc.add(new Field("id", book.getId().toString(), Store.YES, Index.NOT_ANALYZED));
            doc.add(new Field("title", book.getTitle(), Store.YES, Index.ANALYZED));
            doc.add(new Field("author", book.getAuthor(), Store.YES, Index.ANALYZED));
            doc.add(new Field("content", book.getContent(), Store.YES, Index.ANALYZED));
            return doc;
        }
    }

 

 
