Indexer:
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.store.Directory;
import org.apache.lucene.util.Version;

import java.io.File;
import java.io.FileFilter;
import java.io.IOException;
import java.io.FileReader;

// From chapter 1

/**
 * This code was originally written for
 * Erik's Lucene intro java.net article
 */
public class Indexer {

  public static void main(String[] args) throws Exception {
    if (args.length != 2) {
      throw new IllegalArgumentException("Usage: java " + Indexer.class.getName()
        + " <index dir> <data dir>");
    }
    String indexDir = args[0];   // create the index in this directory
    String dataDir = args[1];    // index the *.txt files from this directory

    long start = System.currentTimeMillis();
    Indexer indexer = new Indexer(indexDir);
    int numIndexed;
    try {
      numIndexed = indexer.index(dataDir, new TextFilesFilter());
    } finally {
      indexer.close();
    }
    long end = System.currentTimeMillis();

    System.out.println("Indexing " + numIndexed + " files took "
      + (end - start) + " milliseconds");
  }

  private IndexWriter writer;

  public Indexer(String indexDir) throws IOException {
    Directory dir = FSDirectory.open(new File(indexDir));
    // create the Lucene IndexWriter (true = create a new index)
    writer = new IndexWriter(dir,
                 new StandardAnalyzer(Version.LUCENE_30),
                 true,
                 IndexWriter.MaxFieldLength.UNLIMITED);
  }

  public void close() throws IOException {
    writer.close();              // close the IndexWriter
  }

  public int index(String dataDir, FileFilter filter) throws Exception {
    File[] files = new File(dataDir).listFiles();

    for (File f : files) {
      if (!f.isDirectory() &&
          !f.isHidden() &&
          f.exists() &&
          f.canRead() &&
          (filter == null || filter.accept(f))) {
        indexFile(f);
      }
    }

    return writer.numDocs();     // return the number of documents indexed
  }

  // index .txt files only, using a FileFilter
  private static class TextFilesFilter implements FileFilter {
    public boolean accept(File path) {
      return path.getName().toLowerCase().endsWith(".txt");
    }
  }

  protected Document getDocument(File f) throws Exception {
    Document doc = new Document();
    // index the file content
    doc.add(new Field("contents", new FileReader(f)));
    // index the file name
    doc.add(new Field("filename", f.getName(),
                Field.Store.YES, Field.Index.NOT_ANALYZED));
    // index the full file path
    doc.add(new Field("fullpath", f.getCanonicalPath(),
                Field.Store.YES, Field.Index.NOT_ANALYZED));
    return doc;
  }

  private void indexFile(File f) throws Exception {
    System.out.println("Indexing " + f.getCanonicalPath());
    Document doc = getDocument(f);
    writer.addDocument(doc);     // add the document to the Lucene index
  }
}
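The file selection step in the Indexer above can be tried on its own, without Lucene on the classpath. The following is only a sketch using the JDK: the class name `TxtFileSelection` and the helper `shouldIndex` are made up for illustration and are not part of the book's code. It applies the same checks as `index()` and, as an extra precaution, guards against `listFiles()` returning null, which happens when the path is not a readable directory.

```java
import java.io.File;
import java.io.FileFilter;

// Illustrative sketch (class and method names are hypothetical): the same file
// checks that Indexer.index() applies before handing a file to indexFile().
class TxtFileSelection {

    // Accepts only names ending in ".txt", ignoring case, like TextFilesFilter.
    static final FileFilter TXT_ONLY =
        path -> path.getName().toLowerCase().endsWith(".txt");

    // Returns true when the file would be indexed; a null filter accepts everything.
    static boolean shouldIndex(File f, FileFilter filter) {
        return !f.isDirectory()
            && !f.isHidden()
            && f.exists()
            && f.canRead()
            && (filter == null || filter.accept(f));
    }

    public static void main(String[] args) {
        File dir = new File(args.length > 0 ? args[0] : ".");
        File[] files = dir.listFiles();   // null if dir is not a readable directory
        if (files == null) {
            System.err.println("Not a readable directory: " + dir);
            return;
        }
        for (File f : files) {
            if (shouldIndex(f, TXT_ONLY)) {
                System.out.println("Would index " + f.getName());
            }
        }
    }
}
```

Running it with a directory path as the only argument lists the files that the Indexer would pick up from that directory.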
Core classes of the indexing process:
IndexWriter
Creates a new index or opens an existing one, and adds, deletes, or updates documents in the index. Its constructor typically takes a Directory and an Analyzer.
Directory
Abstract class that describes where the index is stored
Analyzer
Extracts tokens from the text to be indexed. It handles plain text only; content in any other format must be converted to plain text first (for example, with Tika).
Document
A Document object represents a collection of fields.
Field
Lucene handles only the text that has been extracted from a binary document and presented as fields. The document's metadata is stored and indexed separately, as different fields of the same document.
Digression: the Lucene core itself handles only java.lang.String, java.io.Reader, and native numeric types (int, float, and so on).
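To make the Analyzer's job described above more concrete, here is a toy tokenizer written against the JDK only. It is a simplified stand-in and not a Lucene API: the class name `SimpleTokenizer` is made up, and the real StandardAnalyzer does more than this (stop-word removal, for example). It only shows the idea of turning plain text into normalized tokens.

```java
import java.util.ArrayList;
import java.util.List;

// Toy stand-in for an analyzer (hypothetical class, not a Lucene API):
// splits text on non-alphanumeric characters and lowercases each token.
class SimpleTokenizer {

    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String part : text.split("[^\\p{L}\\p{N}]+")) {
            if (!part.isEmpty()) {   // split() can yield a leading empty string
                tokens.add(part.toLowerCase());
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [hello, lucene, world]
        System.out.println(tokenize("Hello, Lucene World!"));
    }
}
```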
Searcher:
import org.apache.lucene.document.Document;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.store.Directory;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.queryParser.ParseException;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.util.Version;

import java.io.File;
import java.io.IOException;

// From chapter 1

/**
 * This code was originally written for
 * Erik's Lucene intro java.net article
 */
public class Searcher {

  public static void main(String[] args) throws IllegalArgumentException,
        IOException, ParseException {
    if (args.length != 2) {
      throw new IllegalArgumentException("Usage: java " + Searcher.class.getName()
        + " <index dir> <query>");
    }

    String indexDir = args[0];   // index directory
    String q = args[1];          // query string

    search(indexDir, q);
  }

  public static void search(String indexDir, String q)
      throws IOException, ParseException {

    // open the index
    Directory dir = FSDirectory.open(new File(indexDir));
    IndexSearcher is = new IndexSearcher(dir);

    // parse the query against the "contents" field
    QueryParser parser = new QueryParser(Version.LUCENE_30,
                                         "contents",
                                         new StandardAnalyzer(Version.LUCENE_30));
    Query query = parser.parse(q);

    long start = System.currentTimeMillis();
    TopDocs hits = is.search(query, 10);   // search the index for the top 10 hits
    long end = System.currentTimeMillis();

    // write search statistics
    System.err.println("Found " + hits.totalHits +
      " document(s) (in " + (end - start) +
      " milliseconds) that matched query '" +
      q + "':");

    for (ScoreDoc scoreDoc : hits.scoreDocs) {
      Document doc = is.doc(scoreDoc.doc);        // retrieve the matching document
      System.out.println(doc.get("fullpath"));    // display its file path
    }

    is.close();                  // close the IndexSearcher
  }
}
Core classes of the search process:
IndexSearcher
Searches an index created by IndexWriter. Its constructor takes the Directory that holds the index, and the class then provides the search methods.
Term
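A Term object is the basic unit of search. Like a Field, it consists of a pair of strings: the name of the field and the text value of that field.
Query q = new TermQuery(new Term("contents", "lucene"));
TopDocs hits = searcher.search(q, 10);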
Query
Query is the base class of all query classes, such as TermQuery and BooleanQuery.
TermQuery
TermQuery is the most basic and simplest query type. It matches documents that contain the specified term in the specified field.
TopDocs
A simple container of pointers to the top-ranked search results.
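The relationship between Term, TermQuery, and TopDocs described above can be pictured with a small in-memory model. This is a conceptual toy only, not how Lucene is implemented: the class `ToyTermIndex` and its methods are hypothetical, and there is no scoring, so "top n" here simply means the first n matches in insertion order, whereas Lucene ranks hits by score.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Conceptual toy (all names hypothetical, not Lucene classes): an in-memory
// map from "field:term" to the ids of the documents containing that term.
class ToyTermIndex {

    private final Map<String, List<Integer>> postings = new HashMap<>();

    // Records that document docId contains `term` in `field`.
    void add(int docId, String field, String term) {
        postings.computeIfAbsent(field + ":" + term, k -> new ArrayList<>()).add(docId);
    }

    // Returns all matching doc ids, playing the role of a term query.
    List<Integer> matches(String field, String term) {
        return postings.getOrDefault(field + ":" + term, new ArrayList<>());
    }

    // Returns at most n of the matches, like asking a search for its top n hits.
    List<Integer> top(String field, String term, int n) {
        List<Integer> all = matches(field, term);
        return new ArrayList<>(all.subList(0, Math.min(n, all.size())));
    }

    public static void main(String[] args) {
        ToyTermIndex index = new ToyTermIndex();
        index.add(0, "contents", "lucene");
        index.add(1, "contents", "search");
        index.add(2, "contents", "lucene");

        List<Integer> all = index.matches("contents", "lucene");
        List<Integer> top1 = index.top("contents", "lucene", 1);
        // prints: 2 total hit(s), showing [0]
        System.out.println(all.size() + " total hit(s), showing " + top1);
    }
}
```

The split between the total match count and the shorter returned list mirrors how the Searcher prints `hits.totalHits` but only iterates over the `hits.scoreDocs` entries it asked for.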
Lucene in Action "Hello Lucene World"