Querying data is a common task in software systems. But how do you retrieve data when the volume is large, when the data is not stored in a database, or when retrieval needs to be more flexible? One answer is to build an index over the data and search through that index, which makes retrieval both faster and more flexible. This article describes how Lucene.Net is applied in a real project.
Take a file retrieval system as an example. Its main function is to provide a unified search platform over a large number of files on the hard disk, without using a database.
The system has two parts. The first is index management, which creates or updates indexes for the files. The second is file retrieval, which matches keywords against the index library and returns the relevant information. The two functions can live in one project or in separate projects.
Note that neither index management nor file retrieval can do without one thing: word segmentation. It is word segmentation that makes it possible to match multiple keywords accurately against a large index library, according to the segmentation rules. Because Lucene was developed outside China, its built-in support for Chinese word segmentation is limited. Here I recommend PanGu word segmentation.
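As a rough illustration of what word segmentation produces, the sketch below splits a piece of text into keywords with PanGu. It assumes that PanGu.dll is referenced and that its PanGu.xml configuration and dictionary files sit next to the executable. The class name SegmentationDemo, the helper SplitKeywords, and the sample text are made up for this example.

```csharp
using System;
using System.Collections.Generic;
using PanGu; // PanGu.dll (assumed to be referenced by the project)

public static class SegmentationDemo
{
    // Hypothetical helper: splits a piece of text into keywords using PanGu.
    public static List<string> SplitKeywords(string text)
    {
        var keywords = new List<string>();
        Segment segment = new Segment();
        ICollection<WordInfo> words = segment.DoSegment(text);
        foreach (WordInfo word in words)
        {
            if (word != null)
            {
                keywords.Add(word.Word);
            }
        }
        return keywords;
    }

    public static void Main()
    {
        // Load the dictionary once before segmenting.
        // Assumption: PanGu.xml and the dictionary files are in the working directory.
        Segment.Init();

        foreach (string keyword in SplitKeywords("Lucene.Net 文件检索系统"))
        {
            Console.WriteLine(keyword);
        }
    }
}
```

The same segmentation has to be applied both when the index is built and when a query is split into keywords; otherwise the terms in the query may not line up with the terms stored in the index.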
Index management mainly covers creating, updating, and deleting indexes. Note that the ID field used to identify a document should not contain special characters; use letters or digits wherever possible. Otherwise, the index entry may fail to be deleted or updated properly.
    FSDirectory directory = FSDirectory.Open(new DirectoryInfo(this.IndexDataDir), ...);
    ... isExist = ...;
    PanGuAnalyzer analyzer = ...;
    IndexWriter writer = new IndexWriter(directory, analyzer, !..., ...);
    ... (IndexDataQueue.Count > ...)
    {
        Document document = ...;
        BaseDataMode mode = ...;
        ...
        foreach (KeyValuePair<..., ...> kv in ...)
        {
            document.Add(...);
        }
        ...
        MultiFieldQueryParser parser = new MultiFieldQueryParser(Lucene.Net.Util.Version.LUCENE_29, new string[] { ... }, ...);
        Query query = parser.Parse(mode.Content[...]);
        ...
        foreach (KeyValuePair<..., ...> kv in ...)
        {
            document.Add(...);
        }
        ...
        MultiFieldQueryParser parser = new MultiFieldQueryParser(Lucene.Net.Util.Version.LUCENE_29, new string[] { ... }, ...);
        Query query = parser.Parse(mode.Content[...]);
        ...
    }
    ...
    directory.Close();
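The listing above looks up documents by passing the ID through MultiFieldQueryParser, and a query parser treats characters such as +, -, : and * as query syntax. One way to make updates and deletes less sensitive to the content of the ID is to index the ID without analysis and match it as an exact term. The following is a minimal sketch of that idea, not the code used in the project above. It assumes Lucene.Net 3.0.3 (member names differ slightly on 2.9.x); the class FileIndexManager and the field names "id" and "content" are hypothetical.

```csharp
using System;
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.Store;

// Hypothetical helper class; assumes Lucene.Net 3.0.3.
public class FileIndexManager
{
    private readonly string indexDir;
    private readonly Analyzer analyzer;

    public FileIndexManager(string indexDir, Analyzer analyzer)
    {
        this.indexDir = indexDir;
        this.analyzer = analyzer;
    }

    // Adds a file, or replaces the existing document that has the same ID.
    public void AddOrUpdate(string id, string content)
    {
        Document doc = new Document();
        // The ID is indexed as one exact term; it is not run through the analyzer.
        doc.Add(new Field("id", id, Field.Store.YES, Field.Index.NOT_ANALYZED));
        // The content is analyzed (segmented) so it can be searched by keyword.
        doc.Add(new Field("content", content, Field.Store.YES, Field.Index.ANALYZED));

        WithWriter(writer => writer.UpdateDocument(new Term("id", id), doc));
    }

    // Deletes the document whose "id" term matches exactly.
    public void Delete(string id)
    {
        WithWriter(writer => writer.DeleteDocuments(new Term("id", id)));
    }

    // Opens a writer, runs the action, and releases the writer and directory.
    private void WithWriter(Action<IndexWriter> action)
    {
        FSDirectory directory = FSDirectory.Open(new DirectoryInfo(indexDir));
        try
        {
            // Create a new index only if none exists yet in this directory.
            bool create = !IndexReader.IndexExists(directory);
            IndexWriter writer = new IndexWriter(
                directory, analyzer, create, IndexWriter.MaxFieldLength.UNLIMITED);
            try
            {
                action(writer);
            }
            finally
            {
                // Disposing the writer commits pending changes and releases the write lock.
                writer.Dispose();
            }
        }
        finally
        {
            directory.Dispose();
        }
    }
}
```

Passing a PanGuAnalyzer instance as the analyzer gives Chinese segmentation for the "content" field. Opening a writer per operation keeps the sketch short; a queue-driven design like the one above would normally keep one writer open while it drains the queue.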
The main flow of file retrieval is to split the query text into multiple keywords, search the index library with Lucene's built-in search functions, and then display the results.
    FSDirectory directory = FSDirectory.Open(new DirectoryInfo(this.IndexDir), ...);
    IndexReader reader = IndexReader.Open(directory, ...);
    IndexSearcher searcher = ...;
    BooleanQuery queryOr = ...;
    foreach (... word in ...)
    {
        foreach (KeyValuePair<..., ...> kv in ...)
        {
            TermQuery query = new TermQuery(...);
            ...
        }
    }
    TopDocs tds = searcher.Search(queryOr, ..., ...);
    ScoreDoc[] docs = ...;
    for (int i = ...; i < docs.Length; i++)
    {
        ... docId = ...;
        Document doc = ...;
        ... content = doc.Get(...);
    }
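The retrieval flow described above (keywords in, an OR query over them, top hits out) can be sketched as follows. This is a simplified illustration under stated assumptions rather than the project's actual code: it targets Lucene.Net 3.0.3 (member names differ slightly on 2.9.x), it searches a single hypothetical field named "content" instead of looping over several fields, and the class FileSearcher and its parameters are invented for the example.

```csharp
using System.Collections.Generic;
using System.IO;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.Search;
using Lucene.Net.Store;

// Hypothetical helper class; assumes Lucene.Net 3.0.3.
public static class FileSearcher
{
    // Returns the stored "content" value of up to maxHits documents
    // that contain at least one of the given keywords.
    public static List<string> Search(string indexDir, IEnumerable<string> keywords, int maxHits)
    {
        var results = new List<string>();
        FSDirectory directory = FSDirectory.Open(new DirectoryInfo(indexDir));
        IndexReader reader = IndexReader.Open(directory, true); // true = read-only
        IndexSearcher searcher = new IndexSearcher(reader);
        try
        {
            // Combine one TermQuery per keyword with OR semantics.
            BooleanQuery queryOr = new BooleanQuery();
            foreach (string word in keywords)
            {
                queryOr.Add(new TermQuery(new Term("content", word)), Occur.SHOULD);
            }

            // No filter; keep only the best maxHits results.
            TopDocs tds = searcher.Search(queryOr, null, maxHits);
            foreach (ScoreDoc scoreDoc in tds.ScoreDocs)
            {
                Document doc = searcher.Doc(scoreDoc.Doc);
                results.Add(doc.Get("content"));
            }
        }
        finally
        {
            searcher.Dispose();
            reader.Dispose();
            directory.Dispose();
        }
        return results;
    }
}
```

The keywords passed in should come from the same word segmentation that was used when the index was built, since a TermQuery matches indexed terms exactly and does not analyze its input.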
DLL and dictionary: http://download.csdn.net/detail/aaakingwin/7208679