Lucene.Net Site Search 3: The Simplest Search Engine Code

Lucene.Net Core Class Introduction

Run the finished indexing code first, then work through what each class does. There is no need to memorize the code.

(*) Directory represents the place where the index files are stored (the index is what Lucene.Net uses to hold the data the user hands it). It is an abstract class with two subclasses: FSDirectory (index kept in files on disk) and RAMDirectory (index kept in memory). Do not confuse it with System.IO.Directory when you use it.

To create an FSDirectory: FSDirectory directory = FSDirectory.Open(new DirectoryInfo(indexPath), new NativeFSLockFactory()); where indexPath is the path of the folder that holds the index.

IndexReader is the class that reads the index, and IndexWriter is the class that writes to it.

The static method bool IndexExists(Directory directory) of IndexReader determines whether the given directory is an index directory. The static method bool IsLocked(Directory directory) of IndexWriter determines whether the directory is locked. A directory is locked before it is written to, so two IndexWriter instances cannot write to the same index at the same time. IndexWriter locks the directory automatically when it writes and unlocks it automatically on Close. IndexWriter.Unlock unlocks it manually (for example, when the program crashed before the IndexWriter was closed, which leaves the lock behind).
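
The lock lifecycle described above can be observed directly. The following is a minimal sketch, assuming the Lucene.Net 2.9.x API used in this article. The class name LockLifecycleSketch, the in-memory RAMDirectory, and the StandardAnalyzer are illustrative assumptions, not part of the original example.

```csharp
using System;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Index;
using Lucene.Net.Store;

class LockLifecycleSketch
{
    static void Main()
    {
        // Assumption: an in-memory index, so the sketch leaves no files behind.
        // System.IO is not imported here, so Directory means Lucene.Net.Store.Directory.
        Directory directory = new RAMDirectory();

        // Assumption: StandardAnalyzer stands in for whatever analyzer the real index uses.
        IndexWriter writer = new IndexWriter(
            directory,
            new StandardAnalyzer(Lucene.Net.Util.Version.LUCENE_29),
            true, // create a new index
            IndexWriter.MaxFieldLength.UNLIMITED);
        try
        {
            // The writer acquires the write lock when it is constructed.
            Console.WriteLine("Locked while open: " + IndexWriter.IsLocked(directory));
        }
        finally
        {
            // Close releases the lock even if the indexing code threw an exception.
            writer.Close();
        }

        Console.WriteLine("Locked after Close: " + IndexWriter.IsLocked(directory));
        Console.WriteLine("Index exists: " + IndexReader.IndexExists(directory));
    }
}
```

Wrapping the writer in try/finally is one way to make sure Close always runs, so that a forced IndexWriter.Unlock is only needed after a real crash.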

Create an index

Constructor: IndexWriter(Directory dir, Analyzer a, bool create, MaxFieldLength mfl). The create argument says whether to create a new index (true) or append to an existing one (false). When IndexWriter writes input to the index, Lucene.Net first splits the article into words with the specified analyzer (so that searches can be answered quickly) and then puts the words into the index files.

void AddDocument(Document doc) adds (inserts) a document into the index. The Document class represents a document (article) to be indexed. Its most important method is Add(Field), which adds a field to the document. A Document is one document and a Field is one of its attributes: a Document corresponds to a database record, and a Field corresponds to a column.

Field class constructor: Field(string name, string value, Field.Store store, Field.Index index, Field.TermVector termVector). name is the field name and value is the field value.

store indicates whether the value itself is stored. The possible values are Field.Store.YES (store), Field.Store.NO (do not store), and Field.Store.COMPRESS (store compressed). By default only the words produced by word segmentation are saved, not the text as it was before segmentation, and a search cannot reconstruct the original text from the segmented words. So if you want to display the original (for example, the body of an article), you need to store it.

index indicates how the field is indexed. Field.Index.NOT_ANALYZED indexes the value as a single term without passing it through the analyzer. Field.Index.ANALYZED passes the value through the analyzer, and only analyzed fields can be searched well by their individual words. In other words: should the value be chopped into pieces or not? That is, do you need full-text search on this field?

termVector indicates how the distance between indexed words is recorded. For "Beijing welcomes you all", it is about how the index records how many words lie between "Beijing" and "all". This makes it easy to retrieve only words that are within a certain distance of each other.
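
To make the idea of "distance between words" concrete, here is a minimal sketch in plain C#. It is not Lucene.Net's actual data structure, and its gap check is a simplification of how Lucene.Net measures slop. The names PositionSketch, BuildPositions, and WithinDistance are hypothetical.

```csharp
using System;
using System.Collections.Generic;

static class PositionSketch
{
    // Maps each word to the list of 0-based positions at which it occurs.
    public static Dictionary<string, List<int>> BuildPositions(string[] words)
    {
        var positions = new Dictionary<string, List<int>>();
        for (int i = 0; i < words.Length; i++)
        {
            List<int> list;
            if (!positions.TryGetValue(words[i], out list))
            {
                list = new List<int>();
                positions[words[i]] = list;
            }
            list.Add(i);
        }
        return positions;
    }

    // True if some occurrence of 'second' follows some occurrence of 'first'
    // with at most maxGap other words in between.
    public static bool WithinDistance(Dictionary<string, List<int>> positions,
                                      string first, string second, int maxGap)
    {
        List<int> a, b;
        if (!positions.TryGetValue(first, out a) || !positions.TryGetValue(second, out b))
            return false;
        foreach (int p in a)
            foreach (int q in b)
                if (q > p && q - p - 1 <= maxGap)
                    return true;
        return false;
    }

    static void Main()
    {
        string[] words = { "Beijing", "welcomes", "you", "all" };
        var positions = BuildPositions(words);
        // "Beijing" is at position 0 and "all" at position 3, so two words lie between them.
        Console.WriteLine(WithinDistance(positions, "Beijing", "all", 2)); // True
        Console.WriteLine(WithinDistance(positions, "Beijing", "all", 1)); // False
    }
}
```

Because the positions are recorded at indexing time, the distance check needs only the index and never has to rescan the original text.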

Why store the URL of a post as a field? Because the search results page has to output the post's address to build a hyperlink, so use Field.Store.YES. The URL generally does not need full-text search, so use Field.Index.NOT_ANALYZED. An analogy: once you have built a "word: page number" slip of paper from "Dream of the Red Chamber", the original book can be thrown away.
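
The "word: page number" analogy can be sketched in a few lines of plain C#. This is an illustration of the idea only, not how Lucene.Net stores its index. PageIndexSketch and Build are hypothetical names, the sample pages are made up, and splitting on spaces stands in for a real analyzer.

```csharp
using System;
using System.Collections.Generic;

static class PageIndexSketch
{
    // Builds a "word: pages" index from pages of text (page numbers start at 1).
    public static Dictionary<string, SortedSet<int>> Build(string[] pages)
    {
        var index = new Dictionary<string, SortedSet<int>>(StringComparer.OrdinalIgnoreCase);
        for (int page = 0; page < pages.Length; page++)
        {
            // Assumption: splitting on spaces stands in for a real analyzer.
            string[] words = pages[page].Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
            foreach (string word in words)
            {
                SortedSet<int> set;
                if (!index.TryGetValue(word, out set))
                {
                    set = new SortedSet<int>();
                    index[word] = set;
                }
                set.Add(page + 1);
            }
        }
        return index;
    }

    static void Main()
    {
        // Made-up sample pages for illustration.
        string[] pages = { "the stone speaks", "the garden in spring", "a stone in the garden" };
        var index = Build(pages);

        // The index answers "which pages mention this word?" without the book.
        Console.WriteLine(string.Join(",", index["stone"]));  // 1,3
        Console.WriteLine(string.Join(",", index["garden"])); // 2,3

        // It cannot give back the sentences themselves. Displaying the original
        // text requires keeping a copy of it, which is what Field.Store.YES does.
    }
}
```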

Case: index posts 1000 to 1100. "As long as you can read the examples and the documentation, a few modifications are enough to meet your own needs." Beyond the fundamentals, with a third-party library it is enough to be able to read the code and adapt it.

Import the namespaces:

    using System;
    using System.Collections.Generic;
    using System.IO;
    using Lucene.Net.Store;
    using Lucene.Net.Index;
    using Lucene.Net.Analysis.PanGu;
    using Lucene.Net.Documents;
    using Lucene.Net.Search;

1. Index the data

    string indexPath = @"C:\1017index"; // Match the case of the folder name on disk, otherwise you will get an error.
    FSDirectory directory = FSDirectory.Open(new DirectoryInfo(indexPath), new NativeFSLockFactory());
    bool isUpdate = IndexReader.IndexExists(directory); // Does the index already exist?
    if (isUpdate)
    {
        // If the index directory is locked (for example, the program exited
        // unexpectedly while indexing), unlock it first.
        // Lucene.Net locks the index automatically before writing and unlocks it on Close.
        // This must not run on multiple threads; it only handles a lock that was
        // accidentally left in place for good.
        if (IndexWriter.IsLocked(directory))
        {
            IndexWriter.Unlock(directory); // Force the unlock
        }
    }
    // Create a new index if none exists yet, otherwise update the existing one.
    IndexWriter writer = new IndexWriter(directory, new PanGuAnalyzer(), !isUpdate,
        Lucene.Net.Index.IndexWriter.MaxFieldLength.UNLIMITED);
    for (int i = 1000; i < 1100; i++)
    {
        string txt = File.ReadAllText(@"D:\My Documents\Fast disk\Wisdom data\Class Information\2011-10-17 employment class\article\" + i + ".txt");
        Document document = new Document(); // One Document corresponds to one record
        // Each Document can have its own attributes (fields). Field names are
        // up to you, and the values are strings.
        document.Add(new Field("id", i.ToString(), Field.Store.YES, Field.Index.NOT_ANALYZED));
        // Field.Store.YES keeps the original text as well as the segmented words,
        // so there is no need to go back to the database for it.
        // Fields that need full-text search use Field.Index.ANALYZED.
        document.Add(new Field("msg", txt, Field.Store.YES, Field.Index.ANALYZED,
            Lucene.Net.Documents.Field.TermVector.WITH_POSITIONS_OFFSETS));
        // Avoid indexing the same post twice: delete any existing document with
        // this id first, like "delete from t where id = i". If none exists,
        // zero documents are deleted.
        writer.DeleteDocuments(new Term("id", i.ToString()));
        writer.AddDocument(document); // Write the document to the index
    }
    writer.Close();
    directory.Close(); // Do not forget Close, otherwise the indexed data cannot be searched

2. Search the index

    string indexPath = @"C:\1017index";
    string kw = TextBox1.Text;
    FSDirectory directory = FSDirectory.Open(new DirectoryInfo(indexPath), new NoLockFactory());
    IndexReader reader = IndexReader.Open(directory, true); // true = open read-only
    IndexSearcher searcher = new IndexSearcher(reader);
    PhraseQuery query = new PhraseQuery();
    query.Add(new Term("msg", kw)); // Like: where contains("msg", kw)
    // First version: split on spaces and let the user do the word segmentation,
    // with the words separated by spaces, e.g. "computer professional".
    //foreach (string word in kw.Split(' '))
    //{
    //    query.Add(new Term("msg", word)); // contains("msg", word)
    //}
    query.SetSlop(100); // Words more than 100 apart are left out of the results: too far apart means low relevance
    TopScoreDocCollector collector = TopScoreDocCollector.create(1000, true); // Container for the query results
    searcher.Search(query, null, collector); // Run the query and put the results into collector
    // collector.GetTotalHits() is the total number of hits.
    // TopDocs(m, n) takes the results from the m-th to the n-th.
    ScoreDoc[] docs = collector.TopDocs(0, collector.GetTotalHits()).scoreDocs;
    List<SearchResult> list = new List<SearchResult>();
    for (int i = 0; i < docs.Length; i++) // Walk through the query results
    {
        // Get the internal ID of the document. A Document can take a lot of memory
        // (compare DataSet with DataReader), so the query results hold only IDs
        // and the actual content takes a second lookup.
        int docId = docs[i].doc;
        // Look up the content by ID. What went in was a Document, and what comes out is a Document.
        Document doc = searcher.Doc(docId);
        //Console.WriteLine(doc.Get("id"));
        //Console.WriteLine(doc.Get("msg"));
        SearchResult result = new SearchResult();
        result.Id = Convert.ToInt32(doc.Get("id"));
        result.MSG = doc.Get("msg"); // Only fields stored with Field.Store.YES can be read back with Get
        list.Add(result);
    }
    Repeater1.DataSource = list;
    Repeater1.DataBind();
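
The search code relies on a SearchResult type that the article does not show. A minimal sketch of what it might look like follows. The definition is hypothetical and inferred only from how the code uses it: an integer Id and a string MSG, both as properties, because Eval data binding in the page reads properties rather than public fields.

```csharp
using System;

// Hypothetical definition: the article uses SearchResult but does not show it.
public class SearchResult
{
    // Value of the stored "id" field, converted to an integer.
    public int Id { get; set; }

    // Stored text of the "msg" field. The name matches Eval("MSG") in the page.
    public string MSG { get; set; }
}
```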

ASPX code:

    <form id="Form1" runat="server">
    <div>
        <asp:Button ID="Button1" runat="server" OnClick="Button1_Click" Text="Create Index" />
        <br />
        <br />
        <asp:TextBox ID="TextBox1" runat="server"></asp:TextBox>
        <asp:Button ID="Button2" runat="server" OnClick="Button2_Click" Text="Search" />
        <br />
        <ul>
        <asp:Repeater ID="Repeater1" runat="server">
            <ItemTemplate><li>Id:<%#Eval("Id") %><br /><%#Eval("MSG") %></li></ItemTemplate>
        </asp:Repeater>
        </ul>
    </div>
    </form>
