The word-segmentation (analyzer) system is now in place; it is both the foundation and the core of the whole setup, and the index we build next relies on it. The following sections explain, in turn, how the index is built and how it is searched.
The index is built as an inverted index. The principle: traverse all of the text, segment it into words, and then record, for each word, the documents in which it appears. The resulting table looks roughly like this:

word → document 1, document 2, document 3 ...
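The idea can be sketched with a toy inverted index. This is purely illustrative; the class and method names below are my own and not part of Lucene.Net, and the whitespace split stands in for the real analyzer:

```csharp
using System;
using System.Collections.Generic;

public static class InvertedIndexDemo
{
    // Builds a map from each word to the sorted set of document IDs containing it.
    // "Tokenization" here is a naive whitespace split; Lucene.Net would run the
    // text through an Analyzer (e.g. a Chinese word breaker) instead.
    public static Dictionary<string, SortedSet<int>> Build(IList<string> docs)
    {
        var index = new Dictionary<string, SortedSet<int>>();
        for (int docId = 0; docId < docs.Count; docId++)
        {
            foreach (string word in docs[docId].Split(
                new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries))
            {
                if (!index.TryGetValue(word, out SortedSet<int> postings))
                {
                    postings = new SortedSet<int>();
                    index[word] = postings;
                }
                postings.Add(docId);
            }
        }
        return index;
    }
}
```

For example, building over the two documents `"a b"` and `"b c"` maps `"b"` to documents 0 and 1, which is exactly the word → document list shown above.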
Two terms matter when building an index: Document and Field. A Document represents one file and contains one or more Fields. A Field represents a named region of data; you can add as many fields as you like to a document, with names of your own choosing, but the document's content must be added. It is done like this:
```csharp
doc.Add(new Field("contents", str, Field.Store.YES, Field.Index.ANALYZED,
    Field.TermVector.WITH_POSITIONS_OFFSETS));
```
The complete index-building code is as follows:
```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Web;
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Index;
using Lucene.Net.Documents;
using Lucene.Net.Search;
using Lucene.Net.Analysis.DChinese;
using Version = Lucene.Net.Util.Version;
using FSDirectory = Lucene.Net.Store.FSDirectory;
using NativeFSLockFactory = Lucene.Net.Store.NativeFSLockFactory;

namespace WebApplication6
{
    public class IndexFiles
    {
        public static bool CreateIndexFromFile(DirectoryInfo docDir, DirectoryInfo indexDir)
        {
            string strUserDicPath = System.AppDomain.CurrentDomain.BaseDirectory;
            string strTestDic = strUserDicPath;
            HashSet<string> lstStopWords = new HashSet<string>();
            strUserDicPath = strUserDicPath + "UserDictionary\\stopwords.txt";
            string[] strs = null;

            // Read the stop-word list, and also echo it to stoptest.txt for checking.
            StreamWriter sw = new StreamWriter(strTestDic + "UserDictionary\\stoptest.txt");
            using (StreamReader strReader = new StreamReader(strUserDicPath))
            {
                string strLine;
                while ((strLine = strReader.ReadLine()) != null)
                {
                    strLine = strLine.Trim();
                    strs = strLine.Split();
                    foreach (string str in strs)
                    {
                        lstStopWords.Add(str);
                        sw.WriteLine(str);
                    }
                }
                strReader.Close();
                sw.Close();
            }

            bool bExist = File.Exists(docDir.FullName) || Directory.Exists(docDir.FullName);
            if (!bExist)
            {
                return false;
            }

            FSDirectory fsDirectory = FSDirectory.Open(indexDir, new NativeFSLockFactory());
            Analyzer analyzer = new DChineseAnalyzer(Version.LUCENE_30, lstStopWords);
            IndexWriter writer = new IndexWriter(fsDirectory, analyzer, true,
                IndexWriter.MaxFieldLength.LIMITED);
            try
            {
                IndexDirectory(writer, docDir);
                writer.Optimize();
                writer.Commit();
            }
            finally
            {
                writer.Dispose();
                fsDirectory.Dispose();
            }
            return true;
        }

        internal static void IndexDirectory(IndexWriter writer, DirectoryInfo directory)
        {
            foreach (var subDirectory in directory.GetDirectories())
                IndexDirectory(writer, subDirectory);
            foreach (var file in directory.GetFiles())
                IndexDocs(writer, file);
        }

        internal static void IndexDocs(IndexWriter writer, FileInfo file)
        {
            Console.Out.WriteLine("adding " + file);
            try
            {
                writer.AddDocument(Document(file));
            }
            catch (FileNotFoundException)
            {
                // At least on Windows, some temporary files raise this exception with an
                // "access denied" message; checking if the file can be read doesn't help.
            }
            catch (UnauthorizedAccessException)
            {
                // Handle any access-denied errors that occur while reading the file.
            }
            catch (IOException)
            {
                // Generic handler for any IO-related exceptions that occur.
            }
        }

        public static Document Document(FileInfo f)
        {
            // Make a new, empty document.
            Document doc = new Document();

            // Add the path of the file as a field named "path". Use a field that is
            // indexed (i.e. searchable), but don't tokenize it into words.
            doc.Add(new Field("path", f.FullName, Field.Store.YES, Field.Index.NOT_ANALYZED));

            // Add the last modified date of the file as a field named "modified".
            // Indexed (i.e. searchable), but not tokenized.
            doc.Add(new Field("modified",
                DateTools.DateToString(f.LastWriteTime, DateTools.Resolution.MINUTE),
                Field.Store.YES, Field.Index.NOT_ANALYZED));

            // Add the contents of the file to a field named "contents", stored and
            // analyzed, with term vectors (positions and offsets) for highlighting.
            string str = File.ReadAllText(f.FullName);
            doc.Add(new Field("contents", str, Field.Store.YES, Field.Index.ANALYZED,
                Field.TermVector.WITH_POSITIONS_OFFSETS));

            return doc;
        }
    }
}
```
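A minimal call site for the indexing class might look like the following; the two folder paths are placeholders of my own, not taken from the original post:

```csharp
using System;
using System.IO;

// Hypothetical paths; substitute the real document folder and index folder.
var docDir = new DirectoryInfo(@"C:\data\docs");
var indexDir = new DirectoryInfo(@"C:\data\index");

bool ok = IndexFiles.CreateIndexFromFile(docDir, indexDir);
Console.WriteLine(ok ? "Index built." : "Document folder not found.");
```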
Next, the search implementation:
Lucene.Net offers a variety of query classes, but to implement a multi-term query you need the PhraseQuery class. The Search method fills a collector with the results. When the final results are rendered, we put them into a list; if we also want the keywords highlighted, a little extra work is needed, which I do here with a ColorWord helper class. The search code is as follows:
```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Web;
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.DChinese;
using Lucene.Net.Documents;
using Lucene.Net.QueryParsers;
using Lucene.Net.Index;
using Lucene.Net.Search;
using FSDirectory = Lucene.Net.Store.FSDirectory;
using NoLockFactory = Lucene.Net.Store.NoLockFactory;
using Version = Lucene.Net.Util.Version;

namespace WebApplication6
{
    public static class SearchFiles
    {
        public static List<ItemList> SearchIndex(DirectoryInfo dirIndex, List<string> termList)
        {
            FSDirectory dirFs = FSDirectory.Open(dirIndex, new NoLockFactory());
            IndexReader reader = IndexReader.Open(dirFs, true);
            IndexSearcher searcher = new IndexSearcher(reader);
            Analyzer analyzer = new DChineseAnalyzer(Version.LUCENE_30);

            // Build a phrase query from all the search terms; a large slop lets
            // the terms match even when they are far apart in the text.
            PhraseQuery query = new PhraseQuery();
            foreach (string word in termList)
            {
                query.Add(new Term("contents", word));
            }
            query.Slop = 100;

            // The hit limit was illegible in the original post; 1000 is an assumed value.
            TopScoreDocCollector collector = TopScoreDocCollector.Create(1000, true);
            searcher.Search(query, collector);
            ScoreDoc[] hits = collector.TopDocs().ScoreDocs;

            List<ItemList> lstResult = new List<ItemList>();
            for (int i = 0; i < hits.Length; i++)
            {
                Document doc = searcher.Doc(hits[i].Doc);
                ItemList item = new ItemList();
                item.ItemContent = ColorWord.AddColor(doc.Get("contents"), termList);
                item.ItemPath = doc.Get("path");
                lstResult.Add(item);
            }
            return lstResult;
        }
    }
}
```

The final effect is shown below:
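The ColorWord class used above is not shown in the post. A minimal sketch, assuming it simply wraps each search term in a colored HTML tag (the original implementation may differ):

```csharp
using System.Collections.Generic;

public static class ColorWord
{
    // Wraps every occurrence of each search term in a red <span> so the
    // keywords stand out when the content is rendered as HTML.
    // This is an assumed minimal implementation of the helper referenced above.
    public static string AddColor(string content, List<string> termList)
    {
        foreach (string term in termList)
        {
            if (string.IsNullOrEmpty(term)) continue;
            content = content.Replace(term,
                "<span style=\"color:red\">" + term + "</span>");
        }
        return content;
    }
}
```

A plain string replace like this is fragile (it ignores case and can nest tags when one term contains another); for production use, the Highlighter in Lucene.Net's contrib packages works from the stored term vectors instead.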
The final code: download