Steps for implementing full-text search with Lucene.Net in ASP.NET

Source: Internet
Author: User

While working on a project, I needed to add full-text search and chose Lucene.Net. After some investigation it basically met the requirement, so I want to share the result here. Please bear with any shortcomings.

I consulted a lot of material while implementing this requirement. This article does not describe how to set up a Lucene.Net project in detail; it only describes how to perform full-text search on documents. For details about how to set up a Lucene.Net project, visit

A Lucene.Net search has two parts: first create an index from the text content, then search against that index. Indexing a document mainly means indexing its content, so the key step is extracting that content. Going from easy to hard, plain txt text is the simplest to extract, so we start there. This is the foundation of the project.
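The two phases can be illustrated without Lucene.Net at all. The following is only a conceptual sketch of "index first, then search": the class `TinyIndex` and its methods are hypothetical names made up for this illustration, and the code is a toy word-to-file lookup table, not how Lucene.Net is implemented or used.

```csharp
using System;
using System.Collections.Generic;

public static class TinyIndex
{
    // Phase 1: map each lower-cased word to the set of document names containing it.
    public static Dictionary<string, HashSet<string>> Build(IDictionary<string, string> docs)
    {
        var index = new Dictionary<string, HashSet<string>>();
        foreach (var doc in docs)
        {
            string[] words = doc.Value.ToLowerInvariant()
                .Split(new[] { ' ', '\t', '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries);
            foreach (string word in words)
            {
                HashSet<string> names;
                if (!index.TryGetValue(word, out names))
                {
                    names = new HashSet<string>();
                    index[word] = names;
                }
                names.Add(doc.Key);
            }
        }
        return index;
    }

    // Phase 2: look the keyword up in the index instead of rescanning every document.
    public static ICollection<string> Search(Dictionary<string, HashSet<string>> index, string keyword)
    {
        HashSet<string> names;
        return index.TryGetValue(keyword.ToLowerInvariant(), out names)
            ? (ICollection<string>)names
            : new string[0];
    }
}
```

The point of the split is that the expensive work (reading and tokenizing every file) happens once when the index is built, while each search is a cheap lookup.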

1. Create an ASP.NET page first.

This is an extremely simple page. After creating the page, double-click each button to generate the corresponding Click event handler, and write the logic inside that handler.

2. Implement the index section.

As mentioned above, the index is built from the text content, so we first need to extract that content. Create a function that extracts the text from a txt file.

The code is as follows:
// Extract the text of a txt file
public static string FileReaderAll(FileInfo fileName)
{
    // Read the text with the system default encoding to prevent garbled characters
    StreamReader reader = new StreamReader(fileName.FullName, System.Text.Encoding.Default);
    string line = "";
    string temp = "";
    // Read the text line by line
    while ((line = reader.ReadLine()) != null)
    {
        temp += line;
    }
    reader.Close();
    // Return the string, which is used to build the Lucene.Net index
    return temp;
}
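One thing to watch in the function above: lines are concatenated with no separator, so for space-delimited languages the last word of one line runs into the first word of the next. A minimal variant is sketched below, assuming that joining lines with a single space is acceptable for indexing. `TextExtractor` and `ReadAllTextJoined` are hypothetical names introduced for illustration and are not part of the original article; the encoding is passed in rather than fixed.

```csharp
using System;
using System.IO;
using System.Text;

public static class TextExtractor
{
    // Hypothetical helper: reads every line of the file and joins the lines
    // with a single space so that words on adjacent lines stay separate.
    public static string ReadAllTextJoined(FileInfo file, Encoding encoding)
    {
        // File.ReadAllLines opens, reads and closes the file in one call.
        string[] lines = File.ReadAllLines(file.FullName, encoding);
        return string.Join(" ", lines);
    }
}
```

It could be called as `TextExtractor.ReadAllTextJoined(fileInfo, Encoding.Default)` in place of `FileReaderAll(fileInfo)`.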

The text content has been extracted. Next, create an index from the extracted content. The code is as follows:
// filesDirectory, indexDirectory and analyzer are fields of the page class (not shown here)
protected void Button2_Click(object sender, EventArgs e)
{
    // Check whether the folder containing the text files exists
    if (!System.IO.Directory.Exists(filesDirectory))
    {
        Response.Write("<script>alert('The specified directory does not exist');</script>");
        return;
    }
    // Read the folder content
    DirectoryInfo dirInfo = new DirectoryInfo(filesDirectory);
    FileInfo[] files = dirInfo.GetFiles("*.*");
    // The folder is empty
    if (files.Length == 0)
    {
        Response.Write("<script>alert('The files directory does not contain any files');</script>");
        return;
    }
    // Create the index folder if it does not exist
    if (!System.IO.Directory.Exists(indexDirectory))
    {
        System.IO.Directory.CreateDirectory(indexDirectory);
    }
    // Create the index
    IndexWriter writer = new IndexWriter(FSDirectory.Open(new DirectoryInfo(indexDirectory)),
        analyzer, true, IndexWriter.MaxFieldLength.LIMITED);

    for (int i = 0; i < files.Length; i++)
    {
        string str = "";
        FileInfo fileInfo = files[i];
        // Check the file format, to prepare for other file formats in the future
        if (fileInfo.FullName.EndsWith(".txt") || fileInfo.FullName.EndsWith(".xml"))
        {
            // Obtain the text
            str = FileReaderAll(fileInfo);
        }
        Lucene.Net.Documents.Document doc = new Lucene.Net.Documents.Document();
        doc.Add(new Lucene.Net.Documents.Field("FileName", fileInfo.Name, Lucene.Net.Documents.Field.Store.YES, Lucene.Net.Documents.Field.Index.ANALYZED));
        // Index the text content
        doc.Add(new Lucene.Net.Documents.Field("Content", str, Lucene.Net.Documents.Field.Store.YES, Lucene.Net.Documents.Field.Index.ANALYZED));
        doc.Add(new Lucene.Net.Documents.Field("Path", fileInfo.FullName, Lucene.Net.Documents.Field.Store.YES, Lucene.Net.Documents.Field.Index.NO));
        // Add the document to the index
        writer.AddDocument(doc);
    }
    // Optimize once after all documents are added, rather than after every document
    writer.Optimize();
    writer.Dispose();
    Response.Write("<script>alert('Index created successfully');</script>");
}

3. Once the index is created, the next step is searching. The search code follows a fixed pattern; as long as you keep to it, it should work. The code is as follows:
// strTable is a StringBuilder field of the page class (not shown here)
protected void Button1_Click(object sender, EventArgs e)
{
    // Obtain the keyword
    string keyword = TextBox1.Text.Trim();
    int num = 10;
    // The keyword is empty
    if (string.IsNullOrEmpty(keyword))
    {
        Response.Write("<script>alert('Enter the keyword to look up');</script>");
        return;
    }

    IndexReader reader = null;
    IndexSearcher searcher = null;
    try
    {
        reader = IndexReader.Open(FSDirectory.Open(new DirectoryInfo(indexDirectory)), true);
        searcher = new IndexSearcher(reader);
        // Create the query
        PerFieldAnalyzerWrapper wrapper = new PerFieldAnalyzerWrapper(analyzer);
        wrapper.AddAnalyzer("FileName", analyzer);
        wrapper.AddAnalyzer("Path", analyzer);
        wrapper.AddAnalyzer("Content", analyzer);
        string[] fields = { "FileName", "Path", "Content" };

        QueryParser parser = new MultiFieldQueryParser(Lucene.Net.Util.Version.LUCENE_30, fields, wrapper);
        // Query by keyword
        Query query = parser.Parse(keyword);

        TopScoreDocCollector collector = TopScoreDocCollector.Create(num, true);

        searcher.Search(query, collector);
        // The hits are returned in order of relevance score
        var hits = collector.TopDocs().ScoreDocs;

        int numTotalHits = collector.TotalHits;

        // The collected hits can be processed further from here
        for (int i = 0; i < hits.Length; i++)
        {
            var hit = hits[i];
            Lucene.Net.Documents.Document doc = searcher.Doc(hit.Doc);
            Lucene.Net.Documents.Field fileNameField = doc.GetField("FileName");
            Lucene.Net.Documents.Field pathField = doc.GetField("Path");
            Lucene.Net.Documents.Field contentField = doc.GetField("Content");
            // Show at most the first 300 characters; shorter content is shown in full
            string content = contentField.StringValue;
            // Output the table rows on the page
            strTable.Append("<tr>");
            strTable.Append("<td>" + fileNameField.StringValue + "</td>");
            strTable.Append("</tr>");
            strTable.Append("<tr>");
            strTable.Append("<td>" + pathField.StringValue + "</td>");
            strTable.Append("</tr>");
            strTable.Append("<tr>");
            strTable.Append("<td>" + content.Substring(0, Math.Min(300, content.Length)) + "</td>");
            strTable.Append("</tr>");
        }
    }
    finally
    {
        if (searcher != null)
            searcher.Dispose();

        if (reader != null)
            reader.Dispose();
    }
}
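The result loop above writes file names, paths and file content straight into the HTML table. If a file contains characters such as `<` or `&`, that can break the page markup. A minimal sketch of a helper that truncates and HTML-encodes a value before it is appended is shown below; `SnippetHelper` and `ToHtmlSnippet` are hypothetical names for illustration and do not appear in the original article.

```csharp
using System;
using System.Net;

public static class SnippetHelper
{
    // Hypothetical helper: returns at most maxLength characters of text,
    // HTML-encoded so it can be placed inside a table cell safely.
    public static string ToHtmlSnippet(string text, int maxLength)
    {
        if (string.IsNullOrEmpty(text))
        {
            return string.Empty;
        }
        // Truncate first, then encode, so an entity is never cut in half.
        string cut = text.Length <= maxLength ? text : text.Substring(0, maxLength);
        return WebUtility.HtmlEncode(cut);
    }
}
```

In the loop it could be used as `"<td>" + SnippetHelper.ToHtmlSnippet(contentField.StringValue, 300) + "</td>"`.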

That completes the whole Lucene.Net full-text search process. At this point you can search files in txt format; support for other formats will be added later. The core idea stays the same: extract the text content from each file format.

The display effect is as follows:

In future blog posts, I will continue with searching documents in other formats.
