This document records, step by step, how to use Lucene together with the Paoding analyzer.

First, download Lucene (official site: http://archive.apache.org/dist/lucene/java/). This article uses version 2.9.4. After downloading and unzipping, Lucene requires the following basic jar files:
lucene-core-2.9.4.jar (Lucene core)
lucene-analyzers-2.9.4.jar (Lucene analyzers / word segmentation)
lucene-highlighter-2.9.4.jar (Lucene highlighting)

Second, because Lucene's built-in Chinese word segmentation cannot achieve the functionality we need, we download the third-party Paoding analyzer ("Paoding Jie Niu", official site: http://code.google.com/p/paoding/). The latest version is paoding-analysis-2.0.4-beta.zip. After downloading and extracting it, using Paoding with Lucene requires the following files:
paoding-analysis.jar (the jar Lucene needs for Chinese word segmentation)
commons-logging.jar (logging)
{paoding_home}/dic ({paoding_home} is the directory Paoding was extracted to)

Third, open Eclipse and create a Java project (neither the project name nor the project path may contain spaces); in this example the project name is paoding.
1_1: Create a folder lib (for storing all jars) in the paoding project, copy the jar files mentioned above into lib, and add all the jars under lib to the project classpath.
1_2: Copy the {paoding_home}/dic directory into the paoding project's src directory. (The original post shows a screenshot of the complete project layout here.)

Fourth, create the TestFileIndex.java class. It reads all files under D:\data\*.txt into memory and writes them to the index directory (D:\luceneindex).

TestFileIndex.java:

package com.lixing.paoding.index;
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.InputStreamReader;

import net.paoding.analysis.analyzer.PaodingAnalyzer;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
public class TestFileIndex {

    public static void main(String[] args) throws Exception {
        String dataDir = "D:/data";
        String indexDir = "D:/luceneindex";
        File[] files = new File(dataDir).listFiles();
        System.out.println(files.length);
        Analyzer analyzer = new PaodingAnalyzer();
        Directory dir = FSDirectory.open(new File(indexDir));
        IndexWriter writer = new IndexWriter(dir, analyzer,
                IndexWriter.MaxFieldLength.UNLIMITED);
        for (int i = 0; i < files.length; i++) {
            StringBuffer strBuffer = new StringBuffer();
            String line = "";
            FileInputStream is = new FileInputStream(files[i].getCanonicalPath());
            BufferedReader reader = new BufferedReader(
                    new InputStreamReader(is, "gb2312"));
            line = reader.readLine();
            while (line != null) {
                strBuffer.append(line);
                strBuffer.append("\n");
                line = reader.readLine();
            }
            Document doc = new Document();
            doc.add(new Field("FileName", files[i].getName(),
                    Field.Store.YES, Field.Index.ANALYZED));
            doc.add(new Field("Contents", strBuffer.toString(),
                    Field.Store.YES, Field.Index.ANALYZED));
            writer.addDocument(doc);
            reader.close();
            is.close();
        }
        writer.optimize();
        writer.close();
        dir.close();
        System.out.println("ok");
    }
}

Next, create TestFileSearcher.java; its function is to search the index and print the matching documents.

TestFileSearcher.java:

package com.lixing.paoding.index;
import java.io.File;

import net.paoding.analysis.analyzer.PaodingAnalyzer;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;
public class TestFileSearcher {

    public static void main(String[] args) throws Exception {
        String indexDir = "D:/luceneindex";
        Analyzer analyzer = new PaodingAnalyzer();
        Directory dir = FSDirectory.open(new File(indexDir));
        IndexSearcher searcher = new IndexSearcher(dir, true);
        QueryParser parser = new QueryParser(Version.LUCENE_29, "Contents", analyzer);
        Query query = parser.parse("Cry for Help");
        // Term term = new Term("FileName", "university");
        // TermQuery query = new TermQuery(term);
        TopDocs docs = searcher.search(query, 1000);
        ScoreDoc[] hits = docs.scoreDocs;
        System.out.println(hits.length);
        for (int i = 0; i < hits.length; i++) {
            Document doc = searcher.doc(hits[i].doc);
            System.out.print(doc.get("FileName") + "--:\n");
            System.out.println(doc.get("Contents") + "\n");
        }
        searcher.close();
        dir.close();
    }
}
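A note on the read loop in TestFileIndex: each source file is decoded with an explicit "gb2312" charset before it ever reaches the PaodingAnalyzer, so getting the decoding right is a prerequisite for correct Chinese word segmentation. The following self-contained sketch (plain JDK only, no Lucene or Paoding required; the class and method names here are my own, not part of the article's code) isolates that loop so it can be tried independently:

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Writer;

public class Gb2312ReadDemo {

    // Reads a whole text file with an explicit charset, mirroring the
    // while-loop in TestFileIndex: each line is appended followed by '\n'.
    static String readAll(File file, String charsetName) throws IOException {
        StringBuilder sb = new StringBuilder();
        BufferedReader reader = new BufferedReader(
                new InputStreamReader(new FileInputStream(file), charsetName));
        try {
            String line = reader.readLine();
            while (line != null) {
                sb.append(line).append('\n');
                line = reader.readLine();
            }
        } finally {
            reader.close();
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Write a small GB2312-encoded file, then read it back.
        File tmp = File.createTempFile("paoding-demo", ".txt");
        tmp.deleteOnExit();
        Writer w = new OutputStreamWriter(new FileOutputStream(tmp), "gb2312");
        w.write("中文分词\n测试");
        w.close();
        System.out.print(readAll(tmp, "gb2312"));
    }
}
```

If the source files may contain characters outside GB2312, swapping the charset name for "GBK" (a superset of GB2312 that the JDK also supports) is a common adjustment.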
This article is from the "Li Xin Blog"; please be sure to keep this source: http://kinglixing.blog.51cto.com/3421535/702663