Lucene.NET full-text search in ASP.NET

Source: Internet
Author: User
Tags: createindex

A recent project of ours used Lucene, and a more experienced colleague was responsible for that module. In my spare time I built a small demo of Lucene full-text search, so that I can refer back to it later.
There are long articles online about how Lucene works; if you are interested, read those. Here I will go straight to the point and explain the principles alongside the code.

1. Create an index (the PanGu word segmenter is used here)

Note: add a line such as #define notes as the first line of the code-behind file so that the #if !notes block used in the search code below is handled correctly. It will make sense once you have tried it.
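As a minimal sketch of what this note describes (the class and method names are hypothetical and not part of the demo): a symbol defined with #define at the top of a file decides which #if blocks the compiler keeps. With notes defined, everything inside #if !notes is skipped.

```csharp
// #define must appear before any other code in the file.
#define notes

public static class ConditionalDemo   // hypothetical name, for illustration only
{
    public static string ActiveBranch()
    {
#if !notes
        // Compiled only when the symbol "notes" is NOT defined.
        return "notes block compiled";
#else
        // Compiled when "notes" is defined, as it is above.
        return "notes block skipped";
#endif
    }
}
```

Removing the #define line flips the result, which is a convenient way to switch a block of explanatory code on and off.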

    // Namespaces used: System.IO, Lucene.Net.Documents, Lucene.Net.Index,
    // Lucene.Net.Store, Lucene.Net.Analysis.PanGu

    #region Create index: void CreateIndex(object sender, EventArgs e)
    /// <summary>
    /// Create the index
    /// </summary>
    /// <param name="sender"></param>
    /// <param name="e"></param>
    private void CreateIndex(object sender, EventArgs e)
    {
        // Physical path of the index
        // this.CreateDirectory(); // assigns indexPath
        FSDirectory directory = FSDirectory.Open(new DirectoryInfo(indexPath), new NativeFSLockFactory());

        // True if the index folder exists and contains the index signature files
        bool isUpdate = IndexReader.IndexExists(directory);
        if (isUpdate)
        {
            // Only one writer may write to the index at a time.
            // Opening the directory with IndexWriter locks the index automatically.
            // If the index directory is still locked (for example, the program exited
            // unexpectedly while indexing), unlock it first.
            if (IndexWriter.IsLocked(directory))
            {
                IndexWriter.Unlock(directory);
            }
        }

        // Obsolete overload:
        // IndexWriter writer = new IndexWriter(indexPath, new PanGuAnalyzer(), !isUpdate, Lucene.Net.Index.IndexWriter.MaxFieldLength.UNLIMITED);
        IndexWriter writer = new IndexWriter(directory, new PanGuAnalyzer(), !isUpdate, Lucene.Net.Index.IndexWriter.MaxFieldLength.UNLIMITED);

        IEnumerable<Story> list = bllHelper.GetAllStory();
        foreach (Story story in list)
        {
            // Delete any existing document with the same ID, then add the new one
            writer.DeleteDocuments(new Term("ID", story.ID.ToString()));

            // One Document per novel; set its fields.
            // Fields that must be full-text searchable have to use Field.Index.ANALYZED!
            Document document = new Document();
            document.Add(new Field("ID", story.ID.ToString(), Field.Store.YES, Field.Index.NOT_ANALYZED));
            document.Add(new Field("Title", story.Title, Field.Store.YES, Field.Index.ANALYZED, Lucene.Net.Documents.Field.TermVector.WITH_POSITIONS_OFFSETS));
            document.Add(new Field("Author", story.Author, Field.Store.YES, Field.Index.NOT_ANALYZED));
            document.Add(new Field("Content", story.Content, Field.Store.YES, Field.Index.ANALYZED, Lucene.Net.Documents.Field.TermVector.WITH_POSITIONS_OFFSETS));
            document.Add(new Field("URL", story.URL, Field.Store.YES, Field.Index.NOT_ANALYZED));
            writer.AddDocument(document);
        }

        writer.Close();
        directory.Close();
    }
    #endregion
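To make the difference between ANALYZED and NOT_ANALYZED fields more concrete, here is a toy inverted index in plain C#. It is only a sketch under simplifying assumptions: ToyIndex and its methods are hypothetical names, splitting on spaces stands in for PanGu segmentation, and this is not how Lucene.NET stores its data internally.

```csharp
using System;
using System.Collections.Generic;

// Toy inverted index: maps "field:term" to the ids of the documents containing it.
public class ToyIndex
{
    private readonly Dictionary<string, HashSet<int>> postings =
        new Dictionary<string, HashSet<int>>();

    // analyzed = true  -> split the value into lower-cased words and index each word
    //                     (the idea behind Field.Index.ANALYZED)
    // analyzed = false -> index the whole value as one single term
    //                     (the idea behind Field.Index.NOT_ANALYZED)
    public void AddField(int docId, string field, string value, bool analyzed)
    {
        string[] terms = analyzed
            ? value.ToLowerInvariant().Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
            : new[] { value };

        foreach (string term in terms)
        {
            string key = field + ":" + term;
            HashSet<int> docs;
            if (!postings.TryGetValue(key, out docs))
            {
                docs = new HashSet<int>();
                postings[key] = docs;
            }
            docs.Add(docId);
        }
    }

    // Exact term lookup, comparable in spirit to a single-term query.
    public bool Matches(string field, string term, int docId)
    {
        HashSet<int> docs;
        return postings.TryGetValue(field + ":" + term, out docs) && docs.Contains(docId);
    }
}
```

If a title "Journey to the West" is added with analyzed set to true, a lookup for the single word "journey" matches. An author added with analyzed set to false only matches when the complete value is given, which is why ID, Author and URL above can only be matched exactly while Title and Content are searchable word by word.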

2. Search

    // Namespaces used: System.IO, Lucene.Net.Documents, Lucene.Net.Index,
    // Lucene.Net.Search, Lucene.Net.Store

    #region Search: IEnumerable<Story> Search(string keyWord)
    /// <summary>
    /// Search
    /// </summary>
    /// <param name="keyWord">Keyword</param>
    private IEnumerable<Story> Search(string keyWord)
    {
        FSDirectory directory = FSDirectory.Open(new DirectoryInfo(indexPath), new NoLockFactory());
        IndexReader reader = IndexReader.Open(directory, true);
        IndexSearcher searcher = new IndexSearcher(reader);

        // Multi-condition query
        // Condition on the title
        PhraseQuery queryTitle = new PhraseQuery();
        // Split the user's input, e.g. "Beijing is the capital", into the words
        // "Beijing", "is" and "capital", then add each word as a query term
        foreach (string word in CommonHelper.SplitWords(keyWord))
        {
            queryTitle.Add(new Term("Title", word));
        }
        // Maximum distance between the query words; words that are very far apart
        // in a text are usually unrelated
        queryTitle.SetSlop(100);

        // Condition on the content
        PhraseQuery queryContent = new PhraseQuery();
        foreach (string word in CommonHelper.SplitWords(keyWord))
        {
            queryContent.Add(new Term("Content", word));
        }
        queryContent.SetSlop(100);

        // Use BooleanQuery to combine several conditions into one larger query
        BooleanQuery query = new BooleanQuery();
        query.Add(queryTitle, BooleanClause.Occur.SHOULD);   // may match
        query.Add(queryContent, BooleanClause.Occur.SHOULD); // may match
    #if !notes
        // Meaning of the clause combinations:
        // 1. MUST and MUST: "and", i.e. the intersection of the two result sets.
        // 2. MUST and MUST_NOT: results of the first clause that do not match the second.
        // 3. MUST_NOT and MUST_NOT: meaningless; nothing is returned.
        // 4. SHOULD and MUST: the result set is that of MUST; SHOULD no longer decides what matches.
        // 5. SHOULD and MUST_NOT: equivalent to MUST and MUST_NOT.
        // 6. SHOULD and SHOULD: "or", i.e. the union of the two result sets.
    #endif

        // Create a collector that holds the query results
        TopScoreDocCollector collector = TopScoreDocCollector.create(1000, true);
        searcher.Search(query, null, collector);
        ScoreDoc[] docs = collector.TopDocs(0, collector.GetTotalHits()).scoreDocs;

        // Load the matching documents
        List<Story> list = new List<Story>();
        foreach (ScoreDoc doc in docs)
        {
            int docID = doc.doc; // id of the result document (assigned by Lucene)
            Document document = searcher.Doc(docID); // look up the Document by that id
            Story story = new Story();
            story.ID = Convert.ToInt32(document.Get("ID"));
            story.Title = CommonHelper.Highlight(keyWord, document.Get("Title"));
            story.Author = document.Get("Author");
            story.Content = CommonHelper.Highlight(keyWord, document.Get("Content"));
            // story.Content = document.Get("Content");
            story.URL = document.Get("URL");
            list.Add(story);
        }
        return list;
    }
    #endregion
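The clause combinations listed in the comments of the search code can be modelled with plain set operations. The following is a sketch only: ClauseDemo.Combine is a hypothetical helper, each clause is represented simply by the set of document ids it matches, and scoring (which SHOULD clauses can still influence in Lucene) is ignored.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class ClauseDemo   // hypothetical helper, not part of Lucene.NET
{
    // Each list holds, per clause of that kind, the set of document ids the clause matches.
    public static HashSet<int> Combine(
        IList<HashSet<int>> must,
        IList<HashSet<int>> should,
        IList<HashSet<int>> mustNot)
    {
        HashSet<int> result;
        if (must.Count > 0)
        {
            // MUST clauses: a document has to match every one of them (intersection).
            // SHOULD clauses then no longer decide which documents match.
            result = new HashSet<int>(must[0]);
            foreach (HashSet<int> set in must.Skip(1))
            {
                result.IntersectWith(set);
            }
        }
        else
        {
            // Only SHOULD clauses: matching any one of them is enough (union).
            result = new HashSet<int>();
            foreach (HashSet<int> set in should)
            {
                result.UnionWith(set);
            }
        }

        // MUST_NOT clauses only remove documents; on their own they match nothing.
        foreach (HashSet<int> set in mustNot)
        {
            result.ExceptWith(set);
        }
        return result;
    }
}
```

For example, if the title clause matches documents 1 and 2 and the content clause matches documents 2 and 3, then SHOULD with SHOULD yields 1, 2 and 3, while MUST with MUST yields only document 2. The demo search uses SHOULD for both fields, so a hit in either the title or the content is enough.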

3. Helper classes

3.1 BusinessHelper class

    #region Get a novel by ID: Story GetStoryById(int id)
    /// <summary>
    /// Get a novel by ID
    /// </summary>
    /// <param name="id">ID</param>
    /// <returns></returns>
    public Story GetStoryById(int id)
    {
        // Note: written like this, "nolock" is parsed as a table alias, not as a locking hint;
        // use WITH (NOLOCK) if the hint is really intended.
        string sql = "SELECT * FROM Story nolock WHERE Id = @Id";
        using (SqlDataReader reader = SqlHelper.ExecuteDataReader(sql, new SqlParameter("@Id", id)))
        {
            if (reader.Read())
            {
                return ToModel(reader);
            }
            else
            {
                return null;
            }
        }
    }
    #endregion

    #region Get all novels: IEnumerable<Story> GetAllStory()
    /// <summary>
    /// Get all novels
    /// </summary>
    /// <returns></returns>
    public IEnumerable<Story> GetAllStory()
    {
        var list = new List<Story>();
        string sql = "SELECT * FROM Story nolock";
        using (SqlDataReader reader = SqlHelper.ExecuteDataReader(sql))
        {
            while (reader.Read())
            {
                list.Add(ToModel(reader));
            }
        }
        return list;
    }
    #endregion

    #region Convert a SqlDataReader row to an entity: Story ToModel(SqlDataReader reader)
    /// <summary>
    /// Convert the current SqlDataReader row to an entity object
    /// </summary>
    /// <param name="reader"></param>
    /// <returns></returns>
    private Story ToModel(SqlDataReader reader)
    {
        Story story = new Story();
        story.ID = (int)ToModelValue(reader, "Id");
        story.Title = (string)ToModelValue(reader, "Title");
        story.Author = (string)ToModelValue(reader, "Author");
        story.Content = (string)ToModelValue(reader, "Content");
        story.URL = (string)ToModelValue(reader, "URL");
        return story;
    }
    #endregion

    // Convert null to DBNull.Value before sending a value to the database
    private object ToDBValue(object value)
    {
        if (value == null)
        {
            return DBNull.Value;
        }
        else
        {
            return value;
        }
    }

    // Convert a database NULL to null when reading a column
    private object ToModelValue(SqlDataReader reader, string columnName)
    {
        if (reader.IsDBNull(reader.GetOrdinal(columnName)))
        {
            return null;
        }
        else
        {
            return reader[columnName];
        }
    }
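One detail of ToModel is worth spelling out: casting the result of ToModelValue with (int) throws if the column is NULL, because null cannot be unboxed to a value type. A minimal sketch of a generic variant with a fallback value follows; DbValue, ToDb and FromDb are hypothetical names and not part of the demo.

```csharp
using System;

public static class DbValue   // hypothetical helper, for illustration only
{
    // null -> DBNull.Value when sending a value to the database.
    public static object ToDb(object value)
    {
        return value ?? DBNull.Value;
    }

    // DBNull (or null) -> fallback when reading a value back;
    // any other value is cast to the requested type.
    public static T FromDb<T>(object value, T fallback = default(T))
    {
        return (value == null || value is DBNull) ? fallback : (T)value;
    }
}
```

With such a helper, a line like story.ID = DbValue.FromDb<int>(reader["Id"]) would yield 0 for a NULL column instead of throwing. Whether a silent default is preferable to an exception depends on the data, so treat this as one option rather than a fix.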

3.2 CommonHelper class

    // Namespaces used: System.IO, Lucene.Net.Analysis, Lucene.Net.Analysis.PanGu, PanGu

    /// <summary>
    /// Split the string s entered by the user into words
    /// </summary>
    /// <param name="s"></param>
    /// <returns></returns>
    public static string[] SplitWords(string s)
    {
        List<string> list = new List<string>();
        Analyzer analyzer = new PanGuAnalyzer();
        TokenStream tokenStream = analyzer.TokenStream("", new StringReader(s));
        Lucene.Net.Analysis.Token token = null;
        // Next() returns null when there are no more words
        while ((token = tokenStream.Next()) != null)
        {
            list.Add(token.TermText()); // the word itself
        }
        return list.ToArray();
    }

    public static string Highlight(string keyword, string content)
    {
        try
        {
            // Create the HTML formatter; the arguments are the prefix and suffix of a highlighted word
            PanGu.HighLight.SimpleHTMLFormatter simpleHTMLFormatter =
                new PanGu.HighLight.SimpleHTMLFormatter("<font color=\"red\"><b>", "</b></font>");
            // Create the Highlighter from the formatter and a PanGu Segment object
            PanGu.HighLight.Highlighter highlighter =
                new PanGu.HighLight.Highlighter(simpleHTMLFormatter, new Segment());
            // Number of characters per summary fragment
            highlighter.FragmentSize = 5000;
            // Get the best-matching summary fragment
            string result = highlighter.GetBestFragment(keyword, content);
            if (string.IsNullOrEmpty(result))
            {
                return content;
            }
            else
            {
                return result;
            }
        }
        catch
        {
            // Fall back to the unhighlighted text if highlighting fails
            return content;
        }
    }
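To illustrate what the prefix and suffix arguments of the formatter do, here is a toy highlighter in plain C#. It is a sketch only: ToyHighlighter.Wrap is a hypothetical helper based on a regular expression, it does not use or imitate the PanGu.HighLight API, and it does no fragment selection.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ToyHighlighter   // hypothetical, not the PanGu.HighLight API
{
    // Wraps every occurrence of any of the given words in prefix/suffix,
    // ignoring case and keeping the original casing of the matched text.
    public static string Wrap(string content, string[] words, string prefix, string suffix)
    {
        if (string.IsNullOrEmpty(content) || words == null)
        {
            return content;
        }

        // Escape each word so that characters such as "." or "+" are matched literally;
        // skip empty entries, which would otherwise match everywhere.
        var escaped = new List<string>();
        foreach (string word in words)
        {
            if (!string.IsNullOrEmpty(word))
            {
                escaped.Add(Regex.Escape(word));
            }
        }
        if (escaped.Count == 0)
        {
            return content;
        }

        string pattern = string.Join("|", escaped.ToArray());
        return Regex.Replace(
            content,
            pattern,
            match => prefix + match.Value + suffix,
            RegexOptions.IgnoreCase);
    }
}
```

Like the Highlight method above, this returns the content unchanged when there is nothing to highlight, so the caller can always display the result.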

3.3 SqlHelper class

    public static string CONNECTIONSTRING = ConfigurationManager.ConnectionStrings["connreceivedb"].ConnectionString;

    #region Execute a query: static DataTable ExecuteDataTable(string sql)
    /// <summary>
    /// Execute a query
    /// <para>Returns a DataTable</para>
    /// </summary>
    /// <param name="sql">SQL statement</param>
    public static DataTable ExecuteDataTable(string sql)
    {
        using (SqlConnection conn = new SqlConnection(SqlHelper.CONNECTIONSTRING))
        {
            conn.Open();
            using (SqlCommand cmd = new SqlCommand(sql, conn))
            {
                SqlDataAdapter da = new SqlDataAdapter(cmd);
                DataTable dt = new DataTable();
                da.Fill(dt);
                return dt;
            }
        }
    }
    #endregion

    #region Execute a query and return a DataReader: static SqlDataReader ExecuteDataReader(string cmdText, params SqlParameter[] parameters)
    /// <summary>
    /// Execute a query and return a DataReader object
    /// </summary>
    /// <param name="cmdText">SQL statement</param>
    /// <param name="parameters">Query parameters</param>
    /// <returns></returns>
    public static SqlDataReader ExecuteDataReader(string cmdText, params SqlParameter[] parameters)
    {
        // The connection is not wrapped in using: CommandBehavior.CloseConnection
        // closes it when the caller closes or disposes the returned reader.
        SqlConnection conn = new SqlConnection(CONNECTIONSTRING);
        conn.Open();
        using (SqlCommand cmd = conn.CreateCommand())
        {
            cmd.CommandText = cmdText;
            cmd.Parameters.AddRange(parameters);
            return cmd.ExecuteReader(CommandBehavior.CloseConnection);
        }
    }
    #endregion

    #region Execute an insert, delete or update: static bool ExecuteNonQuery(string sql)
    /// <summary>
    /// Execute an insert, delete or update statement
    /// </summary>
    /// <param name="sql">SQL statement</param>
    /// <returns>true if at least one row was affected, otherwise false</returns>
    public static bool ExecuteNonQuery(string sql)
    {
        var flag = false;
        using (SqlConnection conn = new SqlConnection(SqlHelper.CONNECTIONSTRING))
        {
            conn.Open();
            using (SqlCommand cmd = new SqlCommand(sql, conn))
            {
                flag = cmd.ExecuteNonQuery() > 0;
            }
        }
        return flag;
    }
    #endregion

4. Novel entity

    /// <summary>
    /// Novel entity class
    /// </summary>
    public class Story
    {
        /// <summary>
        /// Novel ID
        /// </summary>
        public int ID { get; set; }

        /// <summary>
        /// Novel title
        /// </summary>
        public string Title { get; set; }

        /// <summary>
        /// Author
        /// </summary>
        public string Author { get; set; }

        /// <summary>
        /// Novel content
        /// </summary>
        public string Content { get; set; }

        /// <summary>
        /// URL for reading the novel online
        /// </summary>
        public string URL { get; set; }
    }

5. Front-end

    <form id="form1" runat="server" method="post">
        <asp:TextBox ID="txtKW" runat="server" Width="291px"></asp:TextBox>
        <asp:Button ID="btnSearch" runat="server" Text="Search" onclick="btnSearch_Click" />
        <asp:Button ID="btnCreateIndex" runat="server" Text="Create Index" onclick="btnCreateIndex_Click" />
        <asp:GridView ID="gdvShowStory" runat="server" AutoGenerateColumns="False" CellPadding="4" ForeColor="#333333" GridLines="None">
            <AlternatingRowStyle BackColor="White" ForeColor="#284775" />
            <Columns>
                <asp:TemplateField HeaderStyle-Width="3%">
                    <HeaderTemplate>No.</HeaderTemplate>
                    <ItemTemplate>
                        <asp:Label ID="Label1" runat="server" Text='<%# Eval("ID") %>'></asp:Label>
                    </ItemTemplate>
                </asp:TemplateField>
                <asp:TemplateField HeaderStyle-Width="10%">
                    <HeaderTemplate>Title</HeaderTemplate>
                    <ItemTemplate>
                        <asp:Label ID="Label2" runat="server" Text='<%# Eval("Title") %>'></asp:Label>
                    </ItemTemplate>
                </asp:TemplateField>
                <asp:TemplateField HeaderStyle-Width="8%">
                    <HeaderTemplate>Author</HeaderTemplate>
                    <ItemTemplate>
                        <asp:Label ID="Label3" runat="server" Text='<%# Eval("Author") %>'></asp:Label>
                    </ItemTemplate>
                </asp:TemplateField>
                <asp:TemplateField HeaderStyle-Width="70%">
                    <HeaderTemplate>Content</HeaderTemplate>
                    <ItemTemplate>
                        <asp:Label ID="Label4" runat="server" Text='<%# Eval("Content") %>'></asp:Label>
                    </ItemTemplate>
                </asp:TemplateField>
                <asp:TemplateField HeaderStyle-Width="5%">
                    <HeaderTemplate>Operation</HeaderTemplate>
                    <ItemTemplate>
                        <a href='<%# Eval("URL") %>'>Read online</a>
                    </ItemTemplate>
                </asp:TemplateField>
            </Columns>
            <EditRowStyle BackColor="#999999" />
            <FooterStyle BackColor="#5D7B9D" Font-Bold="True" ForeColor="White" />
            <HeaderStyle BackColor="#5D7B9D" Font-Bold="True" ForeColor="White" />
            <PagerStyle BackColor="#284775" ForeColor="White" HorizontalAlign="Center" />
            <RowStyle BackColor="#F7F6F3" ForeColor="#333333" />
            <SelectedRowStyle BackColor="#E2DED6" Font-Bold="True" ForeColor="#333333" />
            <SortedAscendingCellStyle BackColor="#E9E7E2" />
            <SortedAscendingHeaderStyle BackColor="#506C8C" />
            <SortedDescendingCellStyle BackColor="#FFFDF8" />
            <SortedDescendingHeaderStyle BackColor="#6F8DAE" />
        </asp:GridView>
    </form>

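Note: several class libraries need to be referenced. Judging from the namespaces used in the code above, these include Lucene.Net, PanGu, PanGu.Lucene.Analyzer and PanGu.HighLight.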

Okay, that completes a simple demo. Let's see the effect:

[Screenshots of the running demo]

(PS: Anyone getting started with Lucene is welcome to get in touch. If you are interested, you can reach me on QQ: 1686336218.)


Lucene.NET full-text indexing works locally, but an error is reported after the site is uploaded to the server.

The error message already states the cause: access to the path is denied. Grant access permissions on the index directory; without them, ASP.NET cannot access the specified directory to read and create indexes.
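One way to confirm such a permission problem on the server is to probe the index directory with a small write test before opening it with Lucene. The sketch below assumes nothing beyond System.IO; IndexDirectoryCheck and CanWrite are hypothetical names. Note that the check runs under whatever account executes the code, which for a site hosted in IIS is typically the application pool identity rather than your own login.

```csharp
using System;
using System.IO;

public static class IndexDirectoryCheck   // hypothetical helper, for illustration only
{
    // Returns true if the current process can create and delete a file in the directory.
    public static bool CanWrite(string directoryPath)
    {
        try
        {
            // Does nothing if the directory already exists.
            Directory.CreateDirectory(directoryPath);

            // Write and remove a uniquely named probe file.
            string probe = Path.Combine(directoryPath, Guid.NewGuid().ToString("N") + ".tmp");
            File.WriteAllText(probe, "probe");
            File.Delete(probe);
            return true;
        }
        catch (UnauthorizedAccessException)
        {
            // The account lacks permission on the directory.
            return false;
        }
        catch (IOException)
        {
            // The path is unavailable or otherwise not writable.
            return false;
        }
    }
}
```

If CanWrite returns false for the index path on the server, grant the site's account modify rights on that folder and try again.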

How do I configure Lucene for full-text search?

To develop a system based on Lucene, first understand how it works. I recommend reading a series of articles by a friend of mine at JavaEye; it may help you with your company's full-text indexing or search engine work.
