Using Lucene.Net to implement intra-site search

Source: Internet
Author: User

Import the Lucene.Net development kit

Lucene is an open-source full-text search engine toolkit from the Apache Software Foundation. It is a full-text search engine architecture rather than a finished search application: it provides a complete query engine and index engine, plus part of a text analysis engine. Lucene aims to give software developers a simple, easy-to-use toolkit for adding full-text retrieval to a target system, or for building a complete full-text search engine on top of it. Lucene.Net is the .NET port of Lucene.
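
To make the "index engine" and "query engine" split concrete, here is a minimal sketch of a toy inverted index in plain C#. It is only a conceptual illustration and does not reflect how Lucene stores or scores data. The class ToyIndex and its methods are hypothetical names made up for this example, and whitespace splitting stands in for a real analyzer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Conceptual sketch only: a toy inverted index, not Lucene's implementation.
public static class ToyIndex
{
    // Maps each term to the set of document ids that contain it.
    private static readonly Dictionary<string, HashSet<int>> postings =
        new Dictionary<string, HashSet<int>>(StringComparer.OrdinalIgnoreCase);

    // "Index engine": split the text on spaces and record which document holds each term.
    public static void Add(int docId, string text)
    {
        foreach (string term in text.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries))
        {
            HashSet<int> ids;
            if (!postings.TryGetValue(term, out ids))
            {
                ids = new HashSet<int>();
                postings[term] = ids;
            }
            ids.Add(docId);
        }
    }

    // "Query engine": return the ids of documents that contain every query term.
    public static List<int> Search(string query)
    {
        IEnumerable<int> result = null;
        foreach (string term in query.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries))
        {
            HashSet<int> ids;
            if (!postings.TryGetValue(term, out ids))
            {
                // One term is missing everywhere, so no document can match.
                return new List<int>();
            }
            result = result == null ? ids : result.Intersect(ids);
        }
        return result == null ? new List<int>() : result.OrderBy(id => id).ToList();
    }
}
```

Looking up terms in a prebuilt map, instead of scanning every document at query time, is the basic idea that a full-text engine such as Lucene builds on.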

You can download the latest version of Lucene.Net.

Create, update, and delete index entries

Search the index

IndexHelper: add, update, and delete index entries

using System;
using Lucene.Net.Store;
using Lucene.Net.Index;
using Lucene.Net.Analysis.PanGu;
using Lucene.Net.Documents;

namespace BLL
{
    class IndexHelper
    {
        /// <summary>
        /// Log helper
        /// </summary>
        static Common.LogHelper logger = new Common.LogHelper(typeof(IndexHelper));

        /// <summary>
        /// Location where the index is saved, read from the configuration file
        /// </summary>
        static string indexPath = Common.ConfigurationHelper.AppSettingMapPath("IndexPath");

        /// <summary>
        /// Create an index entry, or update it if it already exists
        /// </summary>
        /// <param name="item">Index information</param>
        public static void CreateIndex(Model.HelperModel.IndexFileHelper item)
        {
            try
            {
                // Index directory
                FSDirectory directory = FSDirectory.Open(new System.IO.DirectoryInfo(indexPath), new NativeFSLockFactory());
                // Check whether an index already exists
                bool isUpdate = IndexReader.IndexExists(directory);
                if (isUpdate)
                {
                    // If the index directory is locked (for example, the program
                    // exited unexpectedly while indexing), unlock it first
                    if (IndexWriter.IsLocked(directory))
                    {
                        IndexWriter.Unlock(directory);
                    }
                }
                // Create the IndexWriter; the third argument creates a new index only when none exists
                IndexWriter writer = new IndexWriter(directory, new PanGuAnalyzer(), !isUpdate, Lucene.Net.Index.IndexWriter.MaxFieldLength.UNLIMITED);
                // News title
                string title = item.FileTitle;
                // News body
                string body = item.FileContent;
                // To avoid duplicate entries, delete any existing document with this id
                // before adding it again. This matters especially when updating.
                writer.DeleteDocuments(new Term("id", item.FileName));
                // Create the document to index
                Document document = new Document();
                // Use ANALYZED only for fields that need full-text retrieval
                // Add the id field
                document.Add(new Field("id", item.FileName, Field.Store.YES, Field.Index.NOT_ANALYZED));
                // Add the title field
                document.Add(new Field("title", title, Field.Store.YES, Field.Index.NOT_ANALYZED));
                // Add the body field
                document.Add(new Field("body", body, Field.Store.YES, Field.Index.ANALYZED, Lucene.Net.Documents.Field.TermVector.WITH_POSITIONS_OFFSETS));
                // Add the url field
                document.Add(new Field("url", item.FilePath, Field.Store.YES, Field.Index.NOT_ANALYZED));
                // Write the document to the index
                writer.AddDocument(document);
                // Release resources. Do not forget to close the writer;
                // otherwise the new entry will not show up in search results.
                writer.Close();
                directory.Close();
                // Log the result
                logger.Debug(String.Format("index {0} created", item.FileName));
            }
            catch (Exception ex)
            {
                // Record the error and rethrow
                logger.Error(ex);
                throw;
            }
        }

        /// <summary>
        /// Delete the index entry with the given id
        /// </summary>
        /// <param name="guid">Id of the index entry to delete</param>
        public static void DeleteIndex(string guid)
        {
            try
            {
                // Index directory
                FSDirectory directory = FSDirectory.Open(new System.IO.DirectoryInfo(indexPath), new NativeFSLockFactory());
                // Check whether an index already exists
                bool isUpdate = IndexReader.IndexExists(directory);
                if (isUpdate)
                {
                    // If the index directory is locked (for example, the program
                    // exited unexpectedly while indexing), unlock it first
                    if (IndexWriter.IsLocked(directory))
                    {
                        IndexWriter.Unlock(directory);
                    }
                }
                IndexWriter writer = new IndexWriter(directory, new PanGuAnalyzer(), !isUpdate, Lucene.Net.Index.IndexWriter.MaxFieldLength.UNLIMITED);
                // Delete the index entry
                writer.DeleteDocuments(new Term("id", guid));
                // Do not forget to close the writer; otherwise the deletion
                // will not be reflected in search results.
                writer.Close();
                directory.Close();
                logger.Debug(String.Format("index {0} deleted", guid));
            }
            catch (Exception ex)
            {
                // Record the error and rethrow
                logger.Error(ex);
                throw;
            }
        }
    }
}
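
The article does not show the Model.HelperModel.IndexFileHelper class that CreateIndex receives. The sketch below is one plausible shape for it, assuming four string properties named after the members CreateIndex reads (FileName, FileTitle, FileContent, FilePath). The BuildItem helper and the "/news/{id}.html" URL pattern are hypothetical and only there to show how an item might be filled in.

```csharp
using System;

namespace Model.HelperModel
{
    // Assumed shape: this class is not shown in the article. The property names
    // are the ones CreateIndex reads; the string types are inferred from their use.
    public class IndexFileHelper
    {
        public string FileName { get; set; }     // stored in the "id" field
        public string FileTitle { get; set; }    // stored in the "title" field
        public string FileContent { get; set; }  // analyzed and stored in the "body" field
        public string FilePath { get; set; }     // stored in the "url" field
    }
}

public static class IndexingExample
{
    // Hypothetical helper: builds the item for one article.
    // The URL pattern is made up for illustration.
    public static Model.HelperModel.IndexFileHelper BuildItem(Guid id, string title, string content)
    {
        return new Model.HelperModel.IndexFileHelper
        {
            FileName = id.ToString(),
            FileTitle = title,
            FileContent = content,
            FilePath = "/news/" + id + ".html"
        };
    }
}
```

With an item built this way, BLL.IndexHelper.CreateIndex(item) would add or replace the entry, and BLL.IndexHelper.DeleteIndex(item.FileName) would remove it, since both methods key on the "id" field.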

SearchBLL: search the index

using Lucene.Net.Analysis;
using Lucene.Net.Analysis.PanGu;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.Search;
using Lucene.Net.Store;
using Model.HelperModel;
using System;
using System.Collections.Generic;

namespace BLL
{
    public static class SearchBLL
    {
        // A class may write log entries from many places,
        // so the logger is usually kept in a static field.
        /// <summary>
        /// Log helper
        /// </summary>
        static Common.LogHelper logger = new Common.LogHelper(typeof(SearchBLL));

        /// <summary>
        /// Index storage location
        /// </summary>
        static string indexPath = Common.ConfigurationHelper.AppSettingMapPath("IndexPath");

        /// <summary>
        /// Search
        /// </summary>
        /// <param name="keywords">Keywords entered by the user</param>
        /// <returns>The search results, or null if nothing matched or an error occurred</returns>
        public static List<SearchResult> Search(string keywords)
        {
            try
            {
                // Index directory
                FSDirectory directory = FSDirectory.Open(new System.IO.DirectoryInfo(indexPath), new NoLockFactory());
                // Create a read-only IndexReader
                IndexReader reader = IndexReader.Open(directory, true);
                // Create the IndexSearcher
                IndexSearcher searcher = new IndexSearcher(reader);
                // Create the PhraseQuery
                PhraseQuery query = new PhraseQuery();
                // Segment the user's keywords and add each word as a search term
                foreach (string word in SplitWord(keywords))
                {
                    query.Add(new Term("body", word));
                }
                // Allow the terms to be up to 100 positions apart
                query.SetSlop(100);
                // Member names below (create, scoreDocs, doc) follow the Lucene.Net 2.9 API
                TopScoreDocCollector collector = TopScoreDocCollector.create(1000, true);
                // Run the query
                searcher.Search(query, null, collector);
                // Get the ScoreDoc results
                ScoreDoc[] docs = collector.TopDocs(0, collector.GetTotalHits()).scoreDocs;
                // List that holds the search results
                List<SearchResult> listResult = new List<SearchResult>();
                for (int i = 0; i < docs.Length; i++)
                {
                    // Document number (a primary key allocated by Lucene.Net).
                    // The search results contain only document ids; call Doc to
                    // fetch the full Document. This keeps memory usage down.
                    int docId = docs[i].doc;
                    // Fetch the Document by id
                    Document doc = searcher.Doc(docId);
                    string number = doc.Get("id");
                    string title = doc.Get("title");
                    string body = doc.Get("body");
                    string url = doc.Get("url");
                    // Build the search result object
                    SearchResult result = new SearchResult();
                    result.Number = number;
                    result.Title = title;
                    result.BodyPreview = Preview(body, keywords);
                    result.Url = url;
                    // Add it to the result list
                    listResult.Add(result);
                }
                if (listResult.Count == 0)
                {
                    return null;
                }
                else
                {
                    return listResult;
                }
            }
            catch (Exception ex)
            {
                logger.Error(ex);
                return null;
            }
        }

        /// <summary>
        /// Get a content preview
        /// </summary>
        /// <param name="body">Content</param>
        /// <param name="keyword">Keywords</param>
        /// <returns>The best-matching fragment with the keywords highlighted</returns>
        private static string Preview(string body, string keyword)
        {
            // Create the HTML formatter; the arguments are the prefix and suffix of each highlighted word
            PanGu.HighLight.SimpleHTMLFormatter simpleHTMLFormatter = new PanGu.HighLight.SimpleHTMLFormatter("<font color=\"red\">", "</font>");
            // Create the Highlighter from the formatter and a PanGu Segment object
            PanGu.HighLight.Highlighter highlighter = new PanGu.HighLight.Highlighter(simpleHTMLFormatter, new PanGu.Segment());
            // Number of characters in each preview fragment
            highlighter.FragmentSize = 100;
            // Get the best-matching fragment
            string bodyPreview = highlighter.GetBestFragment(keyword, body);
            return bodyPreview;
        }

        /// <summary>
        /// PanGu word segmentation for the keywords entered by the user
        /// </summary>
        /// <param name="str">Keywords entered by the user</param>
        /// <returns>Array of the words produced by segmentation</returns>
        private static string[] SplitWord(string str)
        {
            List<string> list = new List<string>();
            Analyzer analyzer = new PanGuAnalyzer();
            TokenStream tokenStream = analyzer.TokenStream("", new System.IO.StringReader(str));
            Lucene.Net.Analysis.Token token = null;
            while ((token = tokenStream.Next()) != null)
            {
                list.Add(token.TermText());
            }
            return list.ToArray();
        }
    }
}

SearchResult Model

namespace Model.HelperModel
{
    public class SearchResult
    {
        public string Number { get; set; }
        public string Title { get; set; }
        public string BodyPreview { get; set; }
        public string Url { get; set; }
    }
}
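
Because Search returns null both when nothing matches and when an error occurs, callers have to check for null before using the list. The sketch below shows one way a page might turn the results into HTML. SearchResultRenderer and its markup are hypothetical and not part of the article's code, and the SearchResult class is repeated only so the block compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Text;

// Copy of the article's model so that this sketch is self-contained.
public class SearchResult
{
    public string Number { get; set; }
    public string Title { get; set; }
    public string BodyPreview { get; set; }
    public string Url { get; set; }
}

public static class SearchResultRenderer
{
    // Hypothetical helper: renders the list returned by SearchBLL.Search.
    public static string Render(List<SearchResult> results)
    {
        // Search returns null for "no hits" as well as for errors.
        if (results == null || results.Count == 0)
        {
            return "<p>No results found.</p>";
        }

        StringBuilder html = new StringBuilder("<ul>");
        foreach (SearchResult r in results)
        {
            html.Append("<li><a href=\"")
                .Append(WebUtility.HtmlEncode(r.Url))
                .Append("\">")
                .Append(WebUtility.HtmlEncode(r.Title))
                .Append("</a><p>")
                // BodyPreview already carries the highlight markup added by Preview,
                // so it is emitted as-is. This assumes the indexed body text is trusted.
                .Append(r.BodyPreview)
                .Append("</p></li>");
        }
        return html.Append("</ul>").ToString();
    }
}
```

A caller would then write something like Render(BLL.SearchBLL.Search(keywords)). Returning an empty list instead of null from Search would remove the need for the null check, but that would change the article's code.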

That is all for this article. I hope you find it useful.
