Lucene 5-Minute Quick Start Example

Source: Internet
Author: User

Using Lucene makes it easy to add full-text indexing to our applications. It takes only five minutes to learn how to use it.

1. Download Lucene from the official website. At the time of writing, the latest version is 4.4.0, and the sample code below is based on 4.4.

2. Create a Java project, locate the following JAR files in the Lucene distribution directory, and add them to the project's classpath: lucene-analyzers-common-4.4.0.jar, lucene-core-4.4.0.jar, lucene-queries-4.4.0.jar, lucene-queryparser-4.4.0.jar.

3. The sample Java code adds a few strings to an in-memory index and then displays the query results.

1) Create the index

    // Create an analyzer. StandardAnalyzer is used here. It is suitable for most
    // scenarios and includes some handling of Chinese text. Lucene also has a
    // ChineseAnalyzer, but it will be removed in version 5.0 in favor of StandardAnalyzer.
    // The analyzers-common module contains analyzers for many different languages,
    // including Chinese.
    StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_44);

    // Directory is an abstract class used to store index files. Its subclasses either
    // write the index to files or keep it directly in memory. RAMDirectory, used here,
    // keeps the index in memory.
    // The advantage is speed; the drawback is that it is not suitable for large amounts
    // of data. The data here is small, so RAMDirectory is a good fit.
    // For details, see the API documentation of Directory, RAMDirectory, and FSDirectory.
    // Note that FSDirectory is the abstract base class for file-based index storage. It has
    // three subclasses: MMapDirectory, NIOFSDirectory, and SimpleFSDirectory, chosen
    // according to the operating system and the scenario.
    Directory index = new RAMDirectory();

    // IndexWriterConfig holds all the configuration used to create an IndexWriter.
    // Once the IndexWriter has been created, modifying the IndexWriterConfig does not
    // affect that IndexWriter instance. To obtain the configuration the writer is actually
    // using, call IndexWriter.getConfig(). IndexWriterConfig itself is a final class.
    IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_44, analyzer);

    // As the name suggests, IndexWriter creates and maintains the index.
    IndexWriter w = new IndexWriter(index, config);
    addDoc(w, "Lucene in Action", "193398817");
    addDoc(w, "Lucene for Dummies", "55320055Z");
    addDoc(w, "Managing Gigabytes", "55063554A");
    addDoc(w, "The Art of Computer Science", "9900333X");
    w.close();

The addDoc method is shown below. It adds one document, made up of a title and an ISBN, to the index.

    private static void addDoc(IndexWriter w, String title, String isbn) throws IOException {
        Document doc = new Document();
        doc.add(new TextField("title", title, Field.Store.YES));
        doc.add(new StringField("isbn", isbn, Field.Store.YES));
        w.addDocument(doc);
    }

Note that the title is added as a TextField and the ISBN as a StringField. Both extend Field, which implements IndexableField. A TextField is tokenized by the analyzer and the resulting terms are indexed, whereas a StringField is indexed as a single term, without tokenization.
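
The difference between the two field types can be illustrated with a toy inverted index. The following is a minimal sketch in plain Java, not Lucene code: the class FieldIndexingSketch and its methods addTokenized, addWhole, and lookup are hypothetical names invented for this illustration, and the tokenization (lowercasing and splitting on non-word characters) is only a rough stand-in for what an analyzer does.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model only: this is NOT Lucene code. It illustrates "tokenized"
// (TextField-like) versus "whole value" (StringField-like) indexing.
class FieldIndexingSketch {

    // term -> ids of the documents that contain the term
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // TextField-like: split the value into lowercase words and index each word.
    void addTokenized(int docId, String value) {
        for (String token : value.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!token.isEmpty()) {
                addTerm(token, docId);
            }
        }
    }

    // StringField-like: index the complete value as one term, unchanged.
    void addWhole(int docId, String value) {
        addTerm(value, docId);
    }

    private void addTerm(String term, int docId) {
        postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
    }

    // Exact term lookup; returns an empty set when the term was never indexed.
    Set<Integer> lookup(String term) {
        return postings.getOrDefault(term, Collections.<Integer>emptySet());
    }

    public static void main(String[] args) {
        FieldIndexingSketch titles = new FieldIndexingSketch();
        FieldIndexingSketch isbns = new FieldIndexingSketch();
        titles.addTokenized(0, "Lucene in Action");
        titles.addTokenized(1, "Lucene for Dummies");
        isbns.addWhole(0, "193398817");
        isbns.addWhole(1, "55320055Z");

        // A single word finds every title that contains it.
        System.out.println(titles.lookup("lucene"));
        // An ISBN is found only by its complete value.
        System.out.println(isbns.lookup("55320055Z"));
        System.out.println(isbns.lookup("55320055"));
    }
}
```

In this sketch a title can be found by any of its words, while an ISBN can be found only by its exact value, which is the behavior one usually wants for identifiers.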

2) Build the query: read the query string from the command-line arguments and pass it to Lucene's QueryParser, which produces the Query object to execute

    String querystr = args.length > 0 ? args[0] : "lucene";

    // Create a Query using QueryParser.
    // QueryParser is generated by JavaCC (http://javacc.java.net). Its most important
    // method is QueryParserBase.parse(String).
    // Note that QueryParser is not thread-safe.
    Query q = new QueryParser(Version.LUCENE_44, "title", analyzer).parse(querystr);
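
The "title" argument passed to QueryParser is the default field: it is used for any part of the query that does not name a field explicitly, while a clause written as field:term targets the named field. The following plain-Java sketch illustrates only that rule. It is not Lucene's query grammar, and the class DefaultFieldSketch and its parse method are hypothetical names made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: NOT Lucene's query grammar. It shows just the default-field
// rule: a clause without a "field:" prefix is searched in the default field.
class DefaultFieldSketch {

    // Splits the query on whitespace and returns one {field, term} pair per clause.
    static List<String[]> parse(String defaultField, String query) {
        List<String[]> clauses = new ArrayList<>();
        for (String clause : query.trim().split("\\s+")) {
            if (clause.isEmpty()) {
                continue; // only happens for an empty query string
            }
            int colon = clause.indexOf(':');
            if (colon > 0 && colon < clause.length() - 1) {
                // Explicit field, e.g. "isbn:193398817"
                clauses.add(new String[] {clause.substring(0, colon), clause.substring(colon + 1)});
            } else {
                // No field given: fall back to the default field
                clauses.add(new String[] {defaultField, clause});
            }
        }
        return clauses;
    }

    public static void main(String[] args) {
        for (String[] clause : parse("title", "lucene isbn:193398817")) {
            System.out.println(clause[0] + " -> " + clause[1]);
        }
    }
}
```

Here the unqualified word is assigned to the title field and the qualified clause to the isbn field.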
 

3) Execute the query: create an IndexSearcher on top of the index, then use a TopScoreDocCollector to collect the query results

    // Maximum number of results displayed per page
    int hitsPerPage = 10;
    // Create an index reader
    IndexReader reader = DirectoryReader.open(index);
    // Create an index searcher
    IndexSearcher searcher = new IndexSearcher(reader);
    // Collect at most hitsPerPage top-scoring results, returned as TopDocs
    TopScoreDocCollector collector = TopScoreDocCollector.create(hitsPerPage, true);
    // Execute the query
    searcher.search(q, collector);
    ScoreDoc[] hits = collector.topDocs().scoreDocs;
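
The idea of keeping only the best hitsPerPage results while all matching documents are visited can be sketched with a bounded min-heap. The following is a plain-Java illustration under that assumption, not Lucene's actual TopScoreDocCollector implementation. The names TopNSketch, Hit, collect, and topHits are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

// Toy model only: NOT Lucene's TopScoreDocCollector. It shows the general idea
// of retaining the n best-scoring hits while each matching document is seen once.
class TopNSketch {

    // Hypothetical stand-in for a hit: a document id plus its score.
    static final class Hit {
        final int doc;
        final float score;

        Hit(int doc, float score) {
            this.doc = doc;
            this.score = score;
        }
    }

    private final int n;
    // Min-heap ordered by score: the weakest retained hit sits at the head.
    private final PriorityQueue<Hit> heap =
            new PriorityQueue<>((a, b) -> Float.compare(a.score, b.score));

    TopNSketch(int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        this.n = n;
    }

    // Called once per matching document.
    void collect(int doc, float score) {
        if (heap.size() < n) {
            heap.add(new Hit(doc, score));
        } else if (score > heap.peek().score) {
            heap.poll(); // drop the current weakest hit
            heap.add(new Hit(doc, score));
        }
    }

    // Retained hits, best score first.
    List<Hit> topHits() {
        List<Hit> hits = new ArrayList<>(heap);
        hits.sort((a, b) -> Float.compare(b.score, a.score));
        return hits;
    }

    public static void main(String[] args) {
        TopNSketch collector = new TopNSketch(2);
        collector.collect(0, 0.3f);
        collector.collect(1, 0.9f);
        collector.collect(2, 0.1f);
        collector.collect(3, 0.5f);
        for (Hit hit : collector.topHits()) {
            System.out.println("doc=" + hit.doc + " score=" + hit.score);
        }
    }
}
```

With this approach memory use is bounded by the page size rather than by the number of matching documents.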
 

4) Display the query results

    System.out.println("Found " + hits.length + " hits.");
    for(int i=0;i

The following is the complete code:

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.document.StringField;
    import org.apache.lucene.document.TextField;
    import org.apache.lucene.index.DirectoryReader;
    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.index.IndexWriterConfig;
    import org.apache.lucene.queryparser.classic.ParseException;
    import org.apache.lucene.queryparser.classic.QueryParser;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.search.Query;
    import org.apache.lucene.search.ScoreDoc;
    import org.apache.lucene.search.TopScoreDocCollector;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.RAMDirectory;
    import org.apache.lucene.util.Version;

    import java.io.IOException;

    public class HelloLucene {
      public static void main(String[] args) throws IOException, ParseException {
        // 0. Specify the analyzer for tokenizing text.
        //    The same analyzer should be used for indexing and searching
        StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_40);

        // 1. create the index
        Directory index = new RAMDirectory();
        IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_40, analyzer);
        IndexWriter w = new IndexWriter(index, config);
        addDoc(w, "Lucene in Action", "193398817");
        addDoc(w, "Lucene for Dummies", "55320055Z");
        addDoc(w, "Managing Gigabytes", "55063554A");
        addDoc(w, "The Art of Computer Science", "9900333X");
        w.close();

        // 2. query
        String querystr = args.length > 0 ? args[0] : "lucene";
        // the "title" arg specifies the default field to use
        // when no field is explicitly specified in the query.
        Query q = new QueryParser(Version.LUCENE_40, "title", analyzer).parse(querystr);

        // 3. search
        int hitsPerPage = 10;
        IndexReader reader = DirectoryReader.open(index);
        IndexSearcher searcher = new IndexSearcher(reader);
        TopScoreDocCollector collector = TopScoreDocCollector.create(hitsPerPage, true);
        searcher.search(q, collector);
        ScoreDoc[] hits = collector.topDocs().scoreDocs;

        // 4. display results
        System.out.println("Found " + hits.length + " hits.");
        for(int i=0;i
