Use Lucene.NET for data retrieval

Source: Internet
Author: User

  • Introduction
Querying data is a routine task in a software system. But how do we retrieve data when the volume is large, the storage medium is not a database, or the retrieval needs to be more flexible? One answer is to build an index over the data and search that index, which makes retrieval both faster and more flexible. The following describes how Lucene.NET is applied in an actual project.
  • Case Summary
Take a file retrieval system as an example. Its main function is to provide a unified retrieval platform for a large number of files on the hard disk, without using a database.
  • Ideas
The system is divided into two parts. The first is index management, which creates or updates indexes for files. The second is file retrieval, which matches keywords against the index library and returns the relevant information. The two functions can be integrated into one project or split across separate projects.
  • Word Segmentation
Note that both index management and file retrieval depend on word segmentation. Segmentation is what makes it possible to match multiple keywords precisely against a large index library, according to the segmentation rules. Lucene was developed outside China and its built-in support for Chinese word segmentation is limited, so I recommend PanGu word segmentation here.
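As a rough illustration of what segmentation produces, the sketch below splits a piece of text into keywords with PanGu. The class name `KeywordSplitter` is hypothetical, and the code assumes the PanGu DLL is referenced, that its configuration and dictionary files are in their default location, and that `PanGu.Segment.Init()` has been called once at application startup.

```csharp
using System.Collections.Generic;
using PanGu;

// Hypothetical helper, not part of the original project.
public static class KeywordSplitter
{
    // Splits free text into keywords using PanGu's segmenter.
    // Assumes PanGu.Segment.Init() was called once before first use.
    public static List<string> Split(string text)
    {
        var keywords = new List<string>();
        if (string.IsNullOrEmpty(text))
        {
            return keywords;
        }

        var segment = new Segment();
        foreach (WordInfo info in segment.DoSegment(text))
        {
            // Skip empty entries defensively.
            if (info != null && !string.IsNullOrEmpty(info.Word))
            {
                keywords.Add(info.Word);
            }
        }
        return keywords;
    }
}
```

The same segmenter should be used on both sides: if files are indexed with one set of segmentation rules and queries are split with another, the keywords will not line up with the terms stored in the index.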
  • Index management
Index management mainly involves creating, updating, and deleting indexes. Note that the ID field used to identify a document should not contain special characters; use letters or digits wherever possible. Otherwise, the index entry may not be deleted or updated properly.
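  // Reconstructed from a snippet that lost its keywords, literals, and line
  // breaks. Everything marked "assumed" is not in the source.
  // Member names follow Lucene.NET 2.9.
  FSDirectory directory = FSDirectory.Open(new DirectoryInfo(this.IndexDataDir),
                                           new NativeFSLockFactory());   // lock factory assumed
  bool isExist = IndexReader.IndexExists(directory);
  PanGuAnalyzer analyzer = new PanGuAnalyzer();
  // Create a new index only when none exists yet.
  IndexWriter writer = new IndexWriter(directory, analyzer, !isExist,
                                       IndexWriter.MaxFieldLength.UNLIMITED); // limit assumed
  string idField = "id";   // assumed: the real ID field name was lost
  while (IndexDataQueue.Count > 0)
  {
      Document document = new Document();
      BaseDataMode mode = IndexDataQueue.Dequeue();   // assumed: a Queue<BaseDataMode>
      switch (mode.Type)                              // assumed member and enum names
      {
          case IndexOperation.Add:
              foreach (KeyValuePair<string, string> kv in mode.Content)
              {
                  // Store and index options assumed.
                  document.Add(new Field(kv.Key, kv.Value, Field.Store.YES, Field.Index.ANALYZED));
              }
              writer.AddDocument(document);
              break;
          case IndexOperation.Delete:
          {
              MultiFieldQueryParser parser = new MultiFieldQueryParser(
                  Lucene.Net.Util.Version.LUCENE_29, new string[] { idField }, analyzer);
              Query query = parser.Parse(mode.Content[idField]);
              writer.DeleteDocuments(query);
              break;
          }
          case IndexOperation.Update:
          {
              foreach (KeyValuePair<string, string> kv in mode.Content)
              {
                  document.Add(new Field(kv.Key, kv.Value, Field.Store.YES, Field.Index.ANALYZED));
              }
              MultiFieldQueryParser parser = new MultiFieldQueryParser(
                  Lucene.Net.Util.Version.LUCENE_29, new string[] { idField }, analyzer);
              Query query = parser.Parse(mode.Content[idField]);
              // Assumed: replace the old entry by deleting it, then adding the new one.
              writer.DeleteDocuments(query);
              writer.AddDocument(document);
              break;
          }
          default: { break; }
      }
  }
  writer.Close();      // assumed: commits pending changes
  directory.Close();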
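The indexing code refers to `BaseDataMode` and `IndexDataQueue`, which the article does not define. The sketch below shows one shape they might take, together with a guard that enforces the ID rule mentioned earlier. The enum `IndexOperation`, the `Type` property, the `IndexQueue` class, and the `IsSafeId` and `Enqueue` helpers are all assumptions made for illustration; only the names `BaseDataMode`, `Content`, and `IndexDataQueue` come from the original code.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical operation kinds; the article does not show this enum.
public enum IndexOperation { Add, Update, Delete }

// Assumed shape of a queued index request: field name -> field value.
public class BaseDataMode
{
    public IndexOperation Type { get; set; }
    public Dictionary<string, string> Content { get; } = new Dictionary<string, string>();
}

public static class IndexQueue
{
    // Pending index requests, consumed by the index management loop.
    public static readonly Queue<BaseDataMode> IndexDataQueue = new Queue<BaseDataMode>();

    // True when the ID consists only of Unicode letters or digits.
    public static bool IsSafeId(string id)
    {
        if (string.IsNullOrEmpty(id))
        {
            return false;
        }
        foreach (char c in id)
        {
            if (!char.IsLetterOrDigit(c))
            {
                return false;
            }
        }
        return true;
    }

    // Rejects requests whose ID could later fail to match on delete or update.
    public static void Enqueue(BaseDataMode mode, string idField)
    {
        if (mode == null)
        {
            throw new ArgumentNullException(nameof(mode));
        }
        string id;
        if (!mode.Content.TryGetValue(idField, out id) || !IsSafeId(id))
        {
            throw new ArgumentException("ID must contain only letters or digits.", nameof(mode));
        }
        IndexDataQueue.Enqueue(mode);
    }
}
```

Checking the ID at enqueue time keeps the problem visible at the point where the bad value enters the system, rather than surfacing later as an index entry that cannot be deleted or updated.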
  • File Retrieval
File retrieval first splits the query text into multiple keywords, then uses Lucene's built-in search to query the index library created earlier, and finally displays the search results.
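  // Reconstructed from a snippet that lost its keywords, literals, and line
  // breaks. Everything marked "assumed" is not in the source.
  // Member names follow Lucene.NET 2.9; later versions rename some of them.
  FSDirectory directory = FSDirectory.Open(new DirectoryInfo(this.IndexDir),
                                           new NoLockFactory());   // lock factory assumed
  IndexReader reader = IndexReader.Open(directory, true);          // read-only flag assumed
  IndexSearcher searcher = new IndexSearcher(reader);
  BooleanQuery queryOr = new BooleanQuery();
  foreach (string word in words)                                   // assumed: keywords from segmentation
  {
      foreach (KeyValuePair<string, string> kv in searchFields)    // assumed: fields to search
      {
          TermQuery query = new TermQuery(new Term(kv.Key, word)); // arguments assumed
          queryOr.Add(query, BooleanClause.Occur.SHOULD);          // OR semantics, per the variable name
      }
  }
  TopDocs tds = searcher.Search(queryOr, null, maxHits);           // assumed: no filter; limit lost in source
  ScoreDoc[] docs = tds.scoreDocs;
  for (int i = 0; i < docs.Length; i++)
  {
      int docId = docs[i].doc;
      Document doc = searcher.Doc(docId);
      string content = doc.Get("content");                         // field name assumed
  }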
  • Resources

DLL and dictionary: http://download.csdn.net/detail/aaakingwin/7208679
