While working on a project, I needed to add full-text search. I chose Lucene.Net, did some research, and got a basic version of the requirement working, so I am sharing it here. Please forgive me where my understanding is not deep enough.
I read through a lot of material while completing the requirement. This article does not cover setting up a Lucene.Net project in detail; it only describes how to run a full-text search over documents. For how to set up a Lucene.Net project, please visit
Searching with Lucene.Net has two parts: first build an index from the text content, then search against that index. So how do you index a document? The key to indexing is extracting the document's text. Working from simple to difficult, extracting text from TXT files is the simplest case, and once that works the rest becomes much easier. Tall buildings rise from the ground up, and this is the foundation.
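To make the "index first, then search" split concrete, here is a toy sketch in plain C#. It is not Lucene.Net's API and not how Lucene.Net stores its index; the class name ToyIndex and its methods are made up for illustration. It only shows the idea that the text is processed once into a word-to-document map, and that queries are then answered from the map.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy illustration only: NOT Lucene.Net's API or its internal data structure.
public static class ToyIndex
{
    // Phase 1: map each lowercased word to the names of the documents that contain it.
    public static Dictionary<string, HashSet<string>> Build(IDictionary<string, string> documents)
    {
        var index = new Dictionary<string, HashSet<string>>();
        var separators = new[] { ' ', '\t', '\r', '\n', '.', ',' };
        foreach (var pair in documents)
        {
            var words = pair.Value.ToLowerInvariant()
                .Split(separators, StringSplitOptions.RemoveEmptyEntries);
            foreach (var word in words)
            {
                HashSet<string> names;
                if (!index.TryGetValue(word, out names))
                {
                    names = new HashSet<string>();
                    index[word] = names;
                }
                names.Add(pair.Key);
            }
        }
        return index;
    }

    // Phase 2: answer a one-word query from the index, without rereading the documents.
    public static List<string> Search(Dictionary<string, HashSet<string>> index, string keyword)
    {
        HashSet<string> names;
        return index.TryGetValue(keyword.ToLowerInvariant(), out names)
            ? names.OrderBy(n => n, StringComparer.Ordinal).ToList()
            : new List<string>();
    }
}
```

Lucene.Net does far more than this (analysis, scoring, on-disk storage), but the two phases are the same ones the rest of this article implements.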
1. First, create the ASP.NET page.
This is an extremely simple page. After creating it, double-click each button to generate its click event handler, then implement the logic inside those handlers.
2. Implement the indexing part.
As mentioned above, the index is built mainly from text content, so the text has to be extracted first. Create a function that extracts the text content of a TXT file.
The code is as follows:
// Extract the text of a TXT file (requires: using System.IO;)
public static string FileReaderAll(FileInfo fileName)
{
    // Read the text with the system default encoding to avoid garbled characters
    StreamReader reader = new StreamReader(fileName.FullName, System.Text.Encoding.Default);
    string line = "";
    string temp = "";
    // Loop through the text line by line
    while ((line = reader.ReadLine()) != null)
    {
        // Keep a line break between lines so that words on adjacent lines do not run together
        temp += line + Environment.NewLine;
    }
    reader.Close();
    // Return the string that Lucene.Net will use to build the index
    return temp;
}
With the text extracted, the next step is to create an index from it.
The code is as follows:
// Requires: using System.IO; using Lucene.Net.Index; using Lucene.Net.Store;
// filesDirectory, indexDirectory (folder paths) and analyzer (a Lucene.Net Analyzer)
// are page-level fields that this article does not show.
protected void Button2_Click(object sender, EventArgs e)
{
    // Check whether the folder that holds the text files exists
    if (!System.IO.Directory.Exists(filesDirectory))
    {
        Response.Write("<script>alert('The specified directory does not exist');</script>");
        return;
    }
    // Read the folder contents
    DirectoryInfo dirInfo = new DirectoryInfo(filesDirectory);
    FileInfo[] files = dirInfo.GetFiles("*.*");
    // The folder is empty
    if (files.Length == 0)
    {
        Response.Write("<script>alert('There are no files in the files directory');</script>");
        return;
    }
    // Create the index folder if it does not exist yet
    if (!System.IO.Directory.Exists(indexDirectory))
    {
        System.IO.Directory.CreateDirectory(indexDirectory);
    }
    // Create the index
    IndexWriter writer = new IndexWriter(FSDirectory.Open(new DirectoryInfo(indexDirectory)),
        analyzer, true, IndexWriter.MaxFieldLength.LIMITED);
    for (int i = 0; i < files.Length; i++)
    {
        string str = "";
        FileInfo fileInfo = files[i];
        // Check the file format; other formats can be added here later
        if (fileInfo.FullName.EndsWith(".txt") || fileInfo.FullName.EndsWith(".xml"))
        {
            // Get the text
            str = FileReaderAll(fileInfo);
        }
        Lucene.Net.Documents.Document doc = new Lucene.Net.Documents.Document();
        doc.Add(new Lucene.Net.Documents.Field("FileName", fileInfo.Name, Lucene.Net.Documents.Field.Store.YES, Lucene.Net.Documents.Field.Index.ANALYZED));
        // Index the extracted text
        doc.Add(new Lucene.Net.Documents.Field("Content", str, Lucene.Net.Documents.Field.Store.YES, Lucene.Net.Documents.Field.Index.ANALYZED));
        // The path is stored for display only and is not indexed
        doc.Add(new Lucene.Net.Documents.Field("Path", fileInfo.FullName, Lucene.Net.Documents.Field.Store.YES, Lucene.Net.Documents.Field.Index.NO));
        // Add the document to the index
        writer.AddDocument(doc);
    }
    // Optimize once after all documents are added, rather than after every document
    writer.Optimize();
    writer.Dispose();
    Response.Write("<script>alert('Index created successfully');</script>");
}
3. With the index created, the next step is searching. The search code follows a fixed pattern, so as long as you stick to that pattern it should work.
The code is as follows:
// Requires: using System.IO; using Lucene.Net.Analysis; using Lucene.Net.Index;
// using Lucene.Net.QueryParsers; using Lucene.Net.Search; using Lucene.Net.Store;
// indexDirectory, analyzer and strTable (a StringBuilder that collects the result
// table) are page-level fields that this article does not show.
protected void Button1_Click(object sender, EventArgs e)
{
    // Get the keyword
    string keyword = TextBox1.Text.Trim();
    int num = 10;
    // The keyword is empty
    if (string.IsNullOrEmpty(keyword))
    {
        Response.Write("<script>alert('Please enter the keyword to find');</script>");
        return;
    }
    IndexReader reader = null;
    IndexSearcher searcher = null;
    try
    {
        reader = IndexReader.Open(FSDirectory.Open(new DirectoryInfo(indexDirectory)), true);
        searcher = new IndexSearcher(reader);
        // Create the query
        PerFieldAnalyzerWrapper wrapper = new PerFieldAnalyzerWrapper(analyzer);
        wrapper.AddAnalyzer("FileName", analyzer);
        wrapper.AddAnalyzer("Path", analyzer);
        wrapper.AddAnalyzer("Content", analyzer);
        string[] fields = { "FileName", "Path", "Content" };
        QueryParser parser = new MultiFieldQueryParser(Lucene.Net.Util.Version.LUCENE_30, fields, wrapper);
        // Query by keyword
        Query query = parser.Parse(keyword);
        TopScoreDocCollector collector = TopScoreDocCollector.Create(num, true);
        searcher.Search(query, collector);
        // The hits come back ranked by score
        var hits = collector.TopDocs().ScoreDocs;
        int numTotalHits = collector.TotalHits;
        // From here on you can work with the collected results
        for (int i = 0; i < hits.Length; i++)
        {
            var hit = hits[i];
            Lucene.Net.Documents.Document doc = searcher.Doc(hit.Doc);
            Lucene.Net.Documents.Field fileNameField = doc.GetField("FileName");
            Lucene.Net.Documents.Field pathField = doc.GetField("Path");
            Lucene.Net.Documents.Field contentField = doc.GetField("Content");
            // Output one table row per field for each hit
            strTable.Append("<tr>");
            strTable.Append("<td>" + fileNameField.StringValue + "</td>");
            strTable.Append("</tr>");
            strTable.Append("<tr>");
            strTable.Append("<td>" + pathField.StringValue + "</td>");
            strTable.Append("</tr>");
            strTable.Append("<tr>");
            strTable.Append("<td>" + contentField.StringValue.Substring(0) + "</td>");
            strTable.Append("</tr>");
        }
    }
    finally
    {
        if (searcher != null)
            searcher.Dispose();
        if (reader != null)
            reader.Dispose();
    }
}
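The search handler writes the whole stored content of each hit into a table cell. If the files are large or contain markup, it may be worth shortening and HTML-encoding that text first. The following is only a sketch of one way to do it: SnippetHelper and MakeSnippet are hypothetical names, not part of Lucene.Net or of the original code, and the only library call used is WebUtility.HtmlEncode from System.Net.

```csharp
using System;
using System.Net;

// Hypothetical helper, not part of Lucene.Net or of the original article's code.
public static class SnippetHelper
{
    // Returns at most maxLength characters of content, HTML-encoded,
    // followed by "..." when the text was cut short.
    public static string MakeSnippet(string content, int maxLength)
    {
        if (string.IsNullOrEmpty(content) || maxLength <= 0)
        {
            return string.Empty;
        }
        bool truncated = content.Length > maxLength;
        string head = truncated ? content.Substring(0, maxLength) : content;
        // Encode so that markup inside the indexed text is displayed rather than interpreted.
        string encoded = WebUtility.HtmlEncode(head);
        return truncated ? encoded + "..." : encoded;
    }
}
```

In the result loop, the content cell could then be built from SnippetHelper.MakeSnippet(contentField.StringValue, 200), where 200 is an arbitrary length chosen for illustration.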
The whole Lucene.Net full-text search flow is now in place and can search TXT files. Searching other file formats later comes down to the same core idea: extract the text content from each file format.
The result looks like this:
Future posts will continue with searching documents in other formats.
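Since supporting more formats comes down to extracting text per format, one possible way to organize that is a lookup from file extension to extraction function. This is a sketch under assumptions: TextExtractors, Extract and ReadPlainText are hypothetical names, and only plain-text reading is implemented. Extractors for formats such as Word or PDF would need their own libraries and are not shown.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical organization of per-format extractors; not from the original article.
public static class TextExtractors
{
    // File extension (case-insensitive) -> function that returns the file's text.
    private static readonly Dictionary<string, Func<string, string>> Extractors =
        new Dictionary<string, Func<string, string>>(StringComparer.OrdinalIgnoreCase)
        {
            { ".txt", ReadPlainText },
            { ".xml", ReadPlainText }
        };

    // Reads the whole file with the system default encoding, as the article's TXT reader does.
    private static string ReadPlainText(string path)
    {
        return File.ReadAllText(path, Encoding.Default);
    }

    // Returns the extracted text, or an empty string when no extractor is registered.
    public static string Extract(string path)
    {
        Func<string, string> extractor;
        return Extractors.TryGetValue(Path.GetExtension(path), out extractor)
            ? extractor(path)
            : string.Empty;
    }
}
```

With something like this in place, the indexing loop could call TextExtractors.Extract(fileInfo.FullName) instead of checking extensions inline, and a new format would only need one more entry in the dictionary.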