Lucene makes it easy to add full-text indexing to an application, and it takes only five minutes to learn the basics.
1. Download Lucene from the official website. The latest version is 4.4.0, and the sample code below is based on 4.4.
2. Create a Java project, find these JARs in the Lucene directory, and add them to the project's classpath: lucene-analyzers-common-4.4.0.jar, lucene-core-4.4.0.jar, lucene-queries-4.4.0.jar, lucene-queryparser-4.4.0.jar.
3. Write the sample Java code, which adds a few strings to an in-memory index and displays the query results.
1) Create the content index
// Create an analyzer. StandardAnalyzer is used here. It suits most scenarios and includes some Chinese analysis and processing.
// Lucene also has a ChineseAnalyzer, but it will be removed in version 5.0 in favor of StandardAnalyzer.
// analyzers-common includes analyzers for many different languages, including Chinese.
StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_44);

// Directory is an abstract class used to store index files. Its subclasses either write the index to files or keep it in memory. RAMDirectory keeps the index in memory.
// The advantage is speed; the disadvantage is that it is not suitable for large amounts of data. The data here is small, so RAMDirectory fits well.
// For details, see the API descriptions of Directory, RAMDirectory, and FSDirectory. Note that FSDirectory is the abstract class for file-based index storage.
// It has three subclasses, MMapDirectory, NIOFSDirectory, and SimpleFSDirectory, chosen according to the operating system and scenario.
Directory index = new RAMDirectory();

// IndexWriterConfig holds all the configuration for creating an IndexWriter. Once the IndexWriter is created, modifying the IndexWriterConfig does not affect that IndexWriter instance.
// To obtain the writer's actual configuration, use IndexWriter.getConfig(). IndexWriterConfig itself is a final class.
IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_44, analyzer);

// As the name suggests, IndexWriter maintains the index and adds documents to it.
IndexWriter w = new IndexWriter(index, config);
addDoc(w, "Lucene in Action", "193398817");
addDoc(w, "Lucene for Dummies", "55320055Z");
addDoc(w, "Managing Gigabytes", "55063554A");
addDoc(w, "The Art of Computer Science", "9900333X");
w.close();
The addDoc method is shown below. It adds one piece of content to the index.
private static void addDoc(IndexWriter w, String title, String isbn) throws IOException {
    Document doc = new Document();
    // The title is tokenized, so individual words can be searched
    doc.add(new TextField("title", title, Field.Store.YES));
    // The ISBN is indexed as one whole value, without tokenizing
    doc.add(new StringField("isbn", isbn, Field.Store.YES));
    w.addDocument(doc);
}
Note that the title field is added as a TextField and the ISBN field as a StringField. Both are IndexableField implementations. A TextField is tokenized (split into terms) before it is indexed, while a StringField is indexed as a single whole value without being split.
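To make the difference between the two field types concrete, here is a minimal sketch of a toy in-memory index written in plain Java. It is not how Lucene works internally: the class FieldIndexSketch and its methods addTokenized, addWhole, and lookup are hypothetical names invented for this illustration, and the tokenizer is a crude lowercase-and-split that only approximates what an analyzer does.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Toy illustration only: a hypothetical class, not part of Lucene.
public class FieldIndexSketch {
    // term -> ids of the documents that contain the term
    private final Map<String, List<Integer>> postings = new HashMap<>();

    // The TextField idea: lowercase the value and index each word separately.
    public void addTokenized(int docId, String value) {
        for (String token : value.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!token.isEmpty()) {
                addPosting(token, docId);
            }
        }
    }

    // The StringField idea: index the entire value as one term, unchanged.
    public void addWhole(int docId, String value) {
        addPosting(value, docId);
    }

    private void addPosting(String term, int docId) {
        List<Integer> ids = postings.computeIfAbsent(term, k -> new ArrayList<>());
        if (!ids.contains(docId)) {
            ids.add(docId);
        }
    }

    // Returns the ids of the documents indexed under exactly this term.
    public List<Integer> lookup(String term) {
        return postings.getOrDefault(term, new ArrayList<>());
    }

    public static void main(String[] args) {
        FieldIndexSketch title = new FieldIndexSketch();
        FieldIndexSketch isbn = new FieldIndexSketch();
        title.addTokenized(0, "Lucene in Action");
        isbn.addWhole(0, "193398817");
        title.addTokenized(1, "Lucene for Dummies");
        isbn.addWhole(1, "55320055Z");

        System.out.println(title.lookup("lucene"));   // [0, 1]
        System.out.println(isbn.lookup("55320055Z")); // [1]
        System.out.println(isbn.lookup("55320055"));  // []
    }
}
```

In this sketch a single word finds every title that contains it, while an ISBN matches only when the whole value is supplied, which is why identifiers such as ISBNs are a natural fit for whole-value indexing.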
2) Build the query: read the query string from the command-line arguments and pass it to Lucene's QueryParser, which returns the Query to execute
String querystr = args.length > 0 ? args[0] : "lucene";
// Create a Query using QueryParser.
// QueryParser is generated by JavaCC (http://javacc.java.net). Its most important method is QueryParserBase.parse(String).
// Note that QueryParser is not thread-safe.
Query q = new QueryParser(Version.LUCENE_44, "title", analyzer).parse(querystr);
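The "title" argument passed to QueryParser is the default field: it is used when the query string does not name a field itself with the field:term syntax. The sketch below shows only that fallback rule in plain Java. DefaultFieldSketch and resolve are hypothetical names for this illustration, and the real QueryParser grammar handles far more (phrases, boolean operators, escaping) than this toy does.

```java
// Toy illustration only: a hypothetical class, not part of Lucene.
public class DefaultFieldSketch {
    // Returns {field, term}. A "field:term" query names its field explicitly;
    // anything else falls back to defaultField.
    public static String[] resolve(String defaultField, String query) {
        int colon = query.indexOf(':');
        if (colon > 0 && colon < query.length() - 1) {
            return new String[] { query.substring(0, colon), query.substring(colon + 1) };
        }
        return new String[] { defaultField, query };
    }

    public static void main(String[] args) {
        String[] a = resolve("title", "lucene");
        String[] b = resolve("title", "isbn:193398817");
        System.out.println(a[0] + " -> " + a[1]); // title -> lucene
        System.out.println(b[0] + " -> " + b[1]); // isbn -> 193398817
    }
}
```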
3) Execute the query: create an IndexSearcher on the index, then use a TopScoreDocCollector to collect the query results
// Maximum number of results displayed per page
int hitsPerPage = 10;
// Create an index reader
IndexReader reader = DirectoryReader.open(index);
// Create an index searcher
IndexSearcher searcher = new IndexSearcher(reader);
// Collect the results as TopDocs, keeping at most hitsPerPage hits
TopScoreDocCollector collector = TopScoreDocCollector.create(hitsPerPage, true);
// Execute the query
searcher.search(q, collector);
ScoreDoc[] hits = collector.topDocs().scoreDocs;
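The collector keeps only the best hitsPerPage results no matter how many documents match. One common way to do that is a bounded min-heap, sketched below in plain Java. This is an illustration of the idea under that assumption, not Lucene's actual implementation: TopHitsSketch, Hit, collect, and topHits are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

// Toy illustration only: a hypothetical class, not part of Lucene.
public class TopHitsSketch {
    // Stand-in for a scored hit: a document id plus its relevance score.
    public static final class Hit {
        public final int doc;
        public final float score;

        public Hit(int doc, float score) {
            this.doc = doc;
            this.score = score;
        }
    }

    private final int limit;
    private int totalHits = 0;
    // Min-heap on score, so the weakest hit kept so far sits at the head.
    private final PriorityQueue<Hit> heap =
            new PriorityQueue<>((a, b) -> Float.compare(a.score, b.score));

    public TopHitsSketch(int limit) {
        if (limit <= 0) {
            throw new IllegalArgumentException("limit must be positive");
        }
        this.limit = limit;
    }

    // Called once per matching document.
    public void collect(int doc, float score) {
        totalHits++;
        if (heap.size() < limit) {
            heap.add(new Hit(doc, score));
        } else if (score > heap.peek().score) {
            heap.poll();                 // evict the weakest kept hit
            heap.add(new Hit(doc, score));
        }
    }

    // Number of documents seen, which can exceed the number kept.
    public int getTotalHits() {
        return totalHits;
    }

    // The kept hits, best score first.
    public List<Hit> topHits() {
        List<Hit> result = new ArrayList<>(heap);
        result.sort((a, b) -> Float.compare(b.score, a.score));
        return result;
    }
}
```

The same distinction applies in the Lucene code: the hits array never holds more than hitsPerPage entries, so its length is the number of results returned rather than the total number of matching documents.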
4) Display the query results
System.out.println("Found " + hits.length + " hits.");for(int i=0;i
The following is the complete code:
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.queryparser.classic.ParseException;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopScoreDocCollector;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

import java.io.IOException;

public class HelloLucene {
    public static void main(String[] args) throws IOException, ParseException {
        // 0. Specify the analyzer for tokenizing text.
        //    The same analyzer should be used for indexing and searching
        StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_40);

        // 1. create the index
        Directory index = new RAMDirectory();
        IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_40, analyzer);
        IndexWriter w = new IndexWriter(index, config);
        addDoc(w, "Lucene in Action", "193398817");
        addDoc(w, "Lucene for Dummies", "55320055Z");
        addDoc(w, "Managing Gigabytes", "55063554A");
        addDoc(w, "The Art of Computer Science", "9900333X");
        w.close();

        // 2. query
        String querystr = args.length > 0 ? args[0] : "lucene";
        // the "title" arg specifies the default field to use
        // when no field is explicitly specified in the query.
        Query q = new QueryParser(Version.LUCENE_40, "title", analyzer).parse(querystr);

        // 3. search
        int hitsPerPage = 10;
        IndexReader reader = DirectoryReader.open(index);
        IndexSearcher searcher = new IndexSearcher(reader);
        TopScoreDocCollector collector = TopScoreDocCollector.create(hitsPerPage, true);
        searcher.search(q, collector);
        ScoreDoc[] hits = collector.topDocs().scoreDocs;

        // 4. display results
        System.out.println("Found " + hits.length + " hits.");
        for(int i=0;i