Lucene Basics (II): Index Operations

Last Update:2015-10-19 Source: Internet

Author: User

Tags createindex

Index operations

We build an index for one purpose: fast retrieval. Like a database, an index supports its own create, read, update, and delete operations.
Of these, creating, updating, and deleting are write operations and are handled mainly by the methods of IndexWriter. Querying is a read operation and is done with the methods of IndexSearcher. In the official Lucene documentation, the page for the class org.apache.lucene.index.IndexWriter lists many of its methods.

Create an index

As in the code from the previous chapter, creating an index means creating a Document, adding fields to it, and then passing it to IndexWriter's addDocument() method. The core code is as follows:

iwriter = new IndexWriter(directory, new IndexWriterConfig(version, new StandardAnalyzer(version)));
for (String text : content) {
    doc = new Document();
    // There are many Field types; learn how they differ, e.g. TextField and StringField.
    doc.add(new TextField("content", text, Field.Store.YES));
    iwriter.addDocument(doc);
}

Index deletion

Deleting from an index covers two things: deleting documents inside the index, and deleting index files.
IndexWriter provides the following methods:

deleteAll(): deletes all documents in the index.
deleteDocuments(Query... queries): deletes the documents that match the given queries.
deleteDocuments(Term... terms): deletes the documents that contain the given terms.
deleteUnusedFiles(): deletes index files that are no longer in use.
forceMergeDeletes(): purges the documents that are marked as deleted. In other words, the delete methods above do not physically remove documents; they only mark them as deleted. As I understand it, this is similar to a logical (soft) delete in a database.
forceMergeDeletes(boolean doWait): the same, where doWait specifies whether to block until the operation completes.
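
The mark-then-purge behaviour described above can be illustrated with a small stand-alone model. This is only a sketch of the idea and not how Lucene stores deletions internally: the class SoftDeleteSketch and its methods are hypothetical names invented for this illustration, and matching a word with String.contains is a simplifying assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of "logical" deletion: deleting only sets a flag,
// and a separate purge step physically drops the flagged entries.
public class SoftDeleteSketch {
    private final List<String> docs = new ArrayList<String>();
    private final List<Boolean> deleted = new ArrayList<Boolean>();

    void add(String text) {
        docs.add(text);
        deleted.add(false);
    }

    // Marks every live document containing the word; returns how many were marked.
    int markDeleted(String word) {
        int marked = 0;
        for (int i = 0; i < docs.size(); i++) {
            if (!deleted.get(i) && docs.get(i).contains(word)) {
                deleted.set(i, true);
                marked++;
            }
        }
        return marked;
    }

    // Number of stored entries, including those only marked as deleted.
    int maxDoc() {
        return docs.size();
    }

    // Number of live (not marked) documents.
    int numDocs() {
        int live = 0;
        for (boolean d : deleted) {
            if (!d) {
                live++;
            }
        }
        return live;
    }

    // Physically removes the marked entries (the role forceMergeDeletes plays in the text).
    void purge() {
        for (int i = docs.size() - 1; i >= 0; i--) {
            if (deleted.get(i)) {
                docs.remove(i);
                deleted.remove(i);
            }
        }
    }

    public static void main(String[] args) {
        SoftDeleteSketch index = new SoftDeleteSketch();
        index.add("Hello Lucene");
        index.add("I can play basketball");
        index.add("I can play football");
        index.markDeleted("play");
        System.out.println(index.numDocs() + " live of " + index.maxDoc()); // prints "1 live of 3"
        index.purge();
        System.out.println(index.numDocs() + " live of " + index.maxDoc()); // prints "1 live of 1"
    }
}
```

Until purge() runs, the marked entries still occupy space even though they no longer count as live documents, which is the situation the forceMergeDeletes methods are meant to clean up.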

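Index update

Updates work the same way: they go through IndexWriter, so consult its documentation for the update methods. For example, updateDocument deletes the documents that contain a given term and then adds the new document.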

Index query

Query
To query an index, either build the query from one of the concrete subclasses of Query, or construct it with the QueryParser class. Then run it with one of IndexSearcher's search methods.

Paging

Mode 1: page over the ScoreDoc array. All hits are fetched in a single search and the result set is then paged in memory. With a large result set this can easily exhaust memory.
Mode 2: use searchAfter. The number of searches is comparable, but the whole result set is never held in memory, so it does not overflow. This is the recommended approach and resembles a paged query in a database.

This mirrors the two ways of paging database results: paging over a result set that has already been fetched corresponds to mode 1, and paging directly in the query corresponds to mode 2.
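
The index arithmetic behind the two modes can be sketched without Lucene. The class below is a hypothetical illustration: PagingSketch, slicePage, and hitsBeforePage are names invented here, and a plain list of strings stands in for the hits. slicePage corresponds to mode 1, cutting a page out of hits that are already in memory and clamping the end so that a short last page does not run past the end of the results. hitsBeforePage gives the number of hits that precede a page, which is what mode 2 needs in order to find the last hit of the previous page before continuing from it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PagingSketch {

    // Mode 1: cut one page out of a hit list that is already in memory.
    // pageIndex starts at 1; the end is clamped, so the last page may be short.
    static <T> List<T> slicePage(List<T> hits, int pageIndex, int pageSize) {
        if (pageIndex < 1 || pageSize < 1) {
            throw new IllegalArgumentException("pageIndex and pageSize must be >= 1");
        }
        long start = (long) (pageIndex - 1) * pageSize;
        if (start >= hits.size()) {
            return new ArrayList<T>(); // the page lies beyond the last hit
        }
        int end = (int) Math.min(start + pageSize, (long) hits.size());
        return new ArrayList<T>(hits.subList((int) start, end));
    }

    // Mode 2: number of hits that come before the requested page.
    // The last of these is the hit to continue "after"; 0 means start from the top.
    static int hitsBeforePage(int pageIndex, int pageSize) {
        return (pageIndex - 1) * pageSize;
    }

    public static void main(String[] args) {
        List<String> hits = Arrays.asList(
                "I can play basketball", "I can play football", "I can play DotA");
        System.out.println(slicePage(hits, 1, 2)); // prints [I can play basketball, I can play football]
        System.out.println(slicePage(hits, 2, 2)); // prints [I can play DotA]
        System.out.println(slicePage(hits, 3, 2)); // prints []
        System.out.println(hitsBeforePage(3, 2));  // prints 4
    }
}
```

With three hits and a page size of two, the second page holds a single hit. A loop that runs from start to pageIndex * pageSize without clamping would read past the end of the results on that page.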

Index operation example

package lucene_demo03;

import java.io.IOException;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.CorruptIndexException;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.queryparser.classic.ParseException;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

/**
 * Index queries with paging.
 *
 * Mode 1: page over the ScoreDoc array. All hits are fetched at once and paged in memory;
 * a large result set can easily exhaust memory.
 * Mode 2: use searchAfter. The number of searches is comparable, but the result set does
 * not overflow. Recommended; similar to a paged query in a database.
 *
 * @author Yipfun
 */
public class LuceneDemo03
{

    private static final Version version = Version.LUCENE_4_9;
    private Directory directory = null;
    private DirectoryReader ireader = null;
    private IndexWriter iwriter = null;

    // Test data
    private String[] content = { "Hello Lucene", "I Love coding", "I can play basketball", "I can play football", "I can play DotA" };

    /**
     * Constructor
     */
    public LuceneDemo03()
    {
        directory = new RAMDirectory();
    }

    /**
     * Create the index
     */
    public void createIndex()
    {
        Document doc = null;
        try
        {
            iwriter = new IndexWriter(directory, new IndexWriterConfig(version, new StandardAnalyzer(version)));
            for (String text : content)
            {
                doc = new Document();
                // There are many Field types; learn how they differ, e.g. TextField and StringField.
                doc.add(new TextField("content", text, Field.Store.YES));
                iwriter.addDocument(doc);
            }
        } catch (IOException e)
        {
            e.printStackTrace();
        } finally
        {
            try
            {
                if (iwriter != null)
                    iwriter.close();
            } catch (IOException e)
            {
                e.printStackTrace();
            }
        }
    }

    /**
     * Returns a searcher over the current state of the index, reopening the reader if the index has changed.
     */
    public IndexSearcher getSearcher()
    {
        try
        {
            if (ireader == null)
            {
                ireader = DirectoryReader.open(directory);
            } else
            {
                // openIfChanged returns null when the index has not changed
                DirectoryReader tr = DirectoryReader.openIfChanged(ireader);
                if (tr != null)
                {
                    ireader.close();
                    ireader = tr;
                }
            }
            return new IndexSearcher(ireader);
        } catch (CorruptIndexException e)
        {
            e.printStackTrace();
        } catch (IOException e)
        {
            e.printStackTrace();
        }
        return null;
    }

    /**
     * Search with a TermQuery.
     *
     * @param field
     *            name of the field to search
     * @param term
     *            term to look for
     * @param num
     *            search the top n hits
     */
    public void searchByTerm(String field, String term, int num)
    {
        IndexSearcher isearcher = getSearcher();
        // Note the difference between using a Query implementation class and using QueryParser
        TermQuery query = new TermQuery(new Term(field, term));
        ScoreDoc[] hits;
        try
        {
            // Note the several search methods of the searcher
            hits = isearcher.search(query, null, num).scoreDocs;
            // Iterate through the results
            for (int i = 0; i < hits.length; i++)
            {
                Document hitDoc = isearcher.doc(hits[i].doc);
                System.out.println("This is the text to be indexed=" + hitDoc.get("content"));
            }
        } catch (IOException e)
        {
            e.printStackTrace();
        }
    }

    /**
     * Unlike the previous search, the query passed to this method is built with QueryParser's parse method.
     *
     * @param query
     * @param num
     */
    public void searchByQueryParse(Query query, int num)
    {
        try
        {
            IndexSearcher searcher = getSearcher();
            TopDocs tds = searcher.search(query, num);
            System.out.println("Total hits: " + tds.totalHits);
            for (ScoreDoc sd : tds.scoreDocs)
            {
                Document doc = searcher.doc(sd.doc);
                System.out.println("This is the text to be indexed=" + doc.get("content"));
            }
        } catch (CorruptIndexException e)
        {
            e.printStackTrace();
        } catch (IOException e)
        {
            e.printStackTrace();
        }
    }

    /**
     * First paging mode: page over the ScoreDoc array.
     *
     * @param query
     * @param pageIndex
     *            starts from 1, i.e. the first page
     * @param pageSize
     *            page size
     * @param num
     *            search the top n hits
     */
    public void searchForPage(Query query, int pageIndex, int pageSize, int num)
    {
        try
        {
            IndexSearcher searcher = getSearcher();
            TopDocs tds = searcher.search(query, num);
            System.out.println("Total hits: " + tds.totalHits);
            // Page over the ScoreDoc array
            ScoreDoc[] scoreDocs = tds.scoreDocs;
            int start = (pageIndex - 1) * pageSize;
            // Clamp the end so a short last page does not run past the array
            int end = Math.min(pageIndex * pageSize, scoreDocs.length);
            for (int i = start; i < end; i++)
            {
                Document doc = searcher.doc(scoreDocs[i].doc);
                System.out.println("This is the text to be indexed=" + doc.get("content"));
            }
        } catch (CorruptIndexException e)
        {
            e.printStackTrace();
        } catch (IOException e)
        {
            e.printStackTrace();
        }
    }

    /**
     * Use searchAfter to page at query time.
     *
     * @param query
     * @param pageIndex
     * @param pageSize
     * @throws IOException
     */
    public void searchForPageByAfter(Query query, int pageIndex, int pageSize) throws IOException
    {
        IndexSearcher searcher = getSearcher();
        // First get the last hit of the previous page
        ScoreDoc lastSd = getLastScoreDoc(pageIndex, pageSize, query, searcher);
        TopDocs tds = searcher.searchAfter(lastSd, query, pageSize);
        for (ScoreDoc sd : tds.scoreDocs)
        {
            Document doc = searcher.doc(sd.doc);
            System.out.println("This is the text to be indexed=" + doc.get("content"));
        }
    }

    /**
     * Returns the last hit of the page before the requested one.
     *
     * @param pageIndex
     * @param pageSize
     * @param query
     * @param searcher
     * @return the last hit of the previous page, or null for the first page
     * @throws IOException
     */
    private ScoreDoc getLastScoreDoc(int pageIndex, int pageSize, Query query, IndexSearcher searcher) throws IOException
    {
        if (pageIndex == 1)
            return null; // the first page has no previous hit
        int num = pageSize * (pageIndex - 1); // number of hits on the previous pages
        TopDocs tds = searcher.search(query, num);
        // If there are fewer hits than requested, continue after the last one that exists
        int last = Math.min(num, tds.scoreDocs.length) - 1;
        return last < 0 ? null : tds.scoreDocs[last];
    }

    public static void main(String[] args) throws ParseException, IOException
    {
        LuceneDemo03 ld = new LuceneDemo03();
        ld.createIndex();
        ld.searchByTerm("content", "play", 500);
        System.out.println("==============1======================");

        QueryParser parser = new QueryParser(version, "content", new StandardAnalyzer(version));
        Query q = parser.parse("play"); // study the query syntax accepted by parse
        ld.searchByQueryParse(q, 500);
        System.out.println("===============2=====================");

        ld.searchForPage(q, 1, 2, 500); // starting from the first page
        System.out.println("================3====================");

        ld.searchForPageByAfter(q, 1, 2); // starting from the first page
        System.out.println("================4====================");
    }

}
