Lucene Basics (II): Index Operations

Source: Internet
Author: User
Tags createindex

Index operations

We build an index for one purpose: fast retrieval, so that the data is easy to search. Like a database, an index has its own create, read, update, and delete operations.
Of these, creating, deleting, and updating are write operations, handled mainly by the methods of IndexWriter. Querying is a read operation and is performed with the methods provided by IndexSearcher. The class org.apache.lucene.index.IndexWriter in Lucene's official documentation lists many of these methods.

Create an index

As in the code from the previous chapter, creating an index means creating a Document, adding fields to it, and then passing it to IndexWriter's addDocument() method. The core code is as follows:

iwriter = new IndexWriter(directory, new IndexWriterConfig(version, new StandardAnalyzer(version)));

for (String text : content) {
    doc = new Document();
    // There are many Field types; learn how they differ, e.g. TextField
    // (analyzed into tokens) and StringField (indexed as a single token).
    doc.add(new TextField("content", text, Field.Store.YES));
    iwriter.addDocument(doc);
}

Index Delete

Index deletion covers both deleting documents in the index and deleting index files.
IndexWriter provides the following methods:

    1. deleteAll(): deletes all documents in the index
    2. deleteDocuments(Query... queries): deletes the documents that match the given queries
    3. deleteDocuments(Term... terms): deletes the documents that contain the given terms
    4. deleteUnusedFiles(): deletes all index files that are no longer in use
    5. forceMergeDeletes(): purges the documents that are in the deleted state. In other words, the deletion methods above do not physically remove documents; they only mark them as deleted. I understand this as similar to a logical delete in a database.
    6. forceMergeDeletes(boolean doWait): the same, where doWait indicates whether to block until the operation is complete
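
The mark-then-purge behaviour described in items 5 and 6 can be pictured with a toy model in plain Java. This is only a sketch, not Lucene code: the class SoftDeleteSketch and its storage are hypothetical. The method names numDocs and maxDoc are borrowed from Lucene's reader API purely to show the difference between live documents and occupied slots.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

/** Toy model of mark-then-purge deletion; this is NOT Lucene code. */
public class SoftDeleteSketch {
    private final List<String> docs = new ArrayList<>();
    private final BitSet deleted = new BitSet(); // one bit per stored document

    public void addDocument(String text) {
        docs.add(text);
    }

    /** Marks every document containing the word as deleted; nothing is removed yet. */
    public void deleteDocuments(String word) {
        for (int i = 0; i < docs.size(); i++) {
            if (docs.get(i).contains(word)) {
                deleted.set(i);
            }
        }
    }

    /** Number of documents a search would still see. */
    public int numDocs() {
        return docs.size() - deleted.cardinality();
    }

    /** Number of occupied slots, including documents only marked as deleted. */
    public int maxDoc() {
        return docs.size();
    }

    /** Physically drops the marked documents, as a merge would. */
    public void forceMergeDeletes() {
        List<String> live = new ArrayList<>();
        for (int i = 0; i < docs.size(); i++) {
            if (!deleted.get(i)) {
                live.add(docs.get(i));
            }
        }
        docs.clear();
        docs.addAll(live);
        deleted.clear();
    }
}
```

After deleteDocuments the live count drops at once, but the slot count shrinks only when forceMergeDeletes runs. That is the sense in which the delete is logical first and physical later.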
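
Index Update

Updates work the same way: the available methods are listed in the IndexWriter documentation. For example, updateDocument(Term, ...) first deletes the documents that contain the given term and then adds the new document.

[Screenshot: update methods in the IndexWriter documentation]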
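
Since an update in Lucene is a delete followed by an add, the idea can be sketched with a toy model in plain Java. This is not Lucene code. The class UpdateSketch, the map-based documents, and the "id" field used in the usage note are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of update-as-delete-then-add; this is NOT Lucene code. */
public class UpdateSketch {
    private final List<Map<String, String>> docs = new ArrayList<>();

    public void addDocument(Map<String, String> doc) {
        docs.add(new HashMap<>(doc)); // defensive copy
    }

    /** Removes every document whose field equals value, then appends the replacement. */
    public void updateDocument(String field, String value, Map<String, String> replacement) {
        docs.removeIf(d -> value.equals(d.get(field)));
        addDocument(replacement);
    }

    public int size() {
        return docs.size();
    }

    public String get(int index, String field) {
        return docs.get(index).get(field);
    }
}
```

Calling updateDocument("id", "1", newDoc) on this model replaces whatever carried id 1. If nothing matched, the call simply behaves as an add, which is why the field used for matching should identify a document uniquely.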

Index Query

Query
To query the index, build a query from one of the concrete subclasses of Query, or construct one with the QueryParser class, and then run it through one of IndexSearcher's search methods.
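
One practical difference between the two routes is that a TermQuery compares its term exactly as given, while QueryParser runs the query text through an analyzer first. The toy model below illustrates that difference in plain Java. It is not Lucene code: AnalysisSketch and its analyze method are hypothetical stand-ins that only lower-case the text and split it on non-letters.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Toy model of exact-term lookup versus analyzed lookup; this is NOT Lucene code. */
public class AnalysisSketch {
    private final Set<String> terms = new HashSet<>();

    /** Hypothetical stand-in for an analyzer: lower-cases and splits on non-letters. */
    public static String[] analyze(String text) {
        return text.toLowerCase(Locale.ROOT).split("[^a-z]+");
    }

    /** Indexes the analyzed tokens of the text. */
    public void index(String text) {
        for (String token : analyze(text)) {
            if (!token.isEmpty()) {
                terms.add(token);
            }
        }
    }

    /** Like a term query: the term is compared as-is, with no analysis. */
    public boolean matchesTerm(String term) {
        return terms.contains(term);
    }

    /** Like a parsed query: the text is analyzed first; any matching token counts. */
    public boolean matchesParsed(String text) {
        for (String token : analyze(text)) {
            if (terms.contains(token)) {
                return true;
            }
        }
        return false;
    }
}
```

After indexing "I can play basketball", matchesTerm("Play") fails because only the lower-cased token was stored, while matchesParsed("Play") succeeds. When a term query unexpectedly returns nothing, checking the case and form of the term is a good first step.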

Paging

    • Mode 1: page within the ScoreDoc array. All hits are fetched in a single search and the result set is then sliced into pages. If the result set is large, this can easily exhaust memory.
    • Mode 2: use searchAfter. This amounts to running one query per page, but the query results never overflow. This mode is recommended and is similar to a paged query in a database.

The comparison with a database is this: paginating an already-fetched result set for display corresponds to mode 1, while paging directly in the query corresponds to mode 2.
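
The two modes can be compared on a plain ranked list. The sketch below is a toy model in Java, not Lucene code: PagingSketch and both method names are hypothetical, hits are represented as strings, and pageIndex is assumed to start at 1.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the two paging modes over a ranked hit list; this is NOT Lucene code. */
public class PagingSketch {

    /** Mode 1: take the whole ranked list and cut out one page (pageIndex starts at 1). */
    public static List<String> pageBySlice(List<String> allHits, int pageIndex, int pageSize) {
        // Clamp both bounds so a page past the end yields an empty list, not an exception.
        int start = Math.min((pageIndex - 1) * pageSize, allHits.size());
        int end = Math.min(start + pageSize, allHits.size());
        return new ArrayList<>(allHits.subList(start, end));
    }

    /** Mode 2: continue after the last hit of the previous page (null means first page). */
    public static List<String> pageAfter(List<String> rankedHits, String lastOfPreviousPage, int pageSize) {
        List<String> page = new ArrayList<>();
        boolean started = (lastOfPreviousPage == null);
        for (String hit : rankedHits) {
            if (!started) {
                // Skip everything up to and including the cursor.
                started = hit.equals(lastOfPreviousPage);
                continue;
            }
            if (page.size() == pageSize) {
                break;
            }
            page.add(hit);
        }
        return page;
    }
}
```

In mode 1 the caller must hold the full hit list before slicing. In mode 2 only the cursor (the last hit of the previous page) has to be remembered between requests, which is why it scales better for deep pages.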

Index Operation Example

package lucene_demo03;

import java.io.IOException;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.CorruptIndexException;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.queryparser.classic.ParseException;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

/**
 * Index queries with paging.
 * Mode 1: page within the ScoreDoc array. All hits are fetched at once and the
 * result set is sliced into pages; a large result set can easily exhaust memory.
 * Mode 2: use searchAfter. This amounts to one query per page, but the results
 * never overflow. Recommended; similar to a paged database query.
 *
 * @author Yipfun
 */
public class LuceneDemo03
{

    private static final Version version = Version.LUCENE_4_9;
    private Directory directory = null;
    private DirectoryReader ireader = null;
    private IndexWriter iwriter = null;

    // Test data
    private String[] content = { "Hello Lucene", "I love coding", "I can play basketball", "I can play football", "I can play DotA" };
    /**
     * Constructor
     */
    public LuceneDemo03()
    {
        directory = new RAMDirectory();
    }

    /**
     * Create the index
     */
    public void createIndex()
    {
        Document doc = null;
        try
        {
            iwriter = new IndexWriter(directory, new IndexWriterConfig(version, new StandardAnalyzer(version)));
            for (String text : content)
            {
                doc = new Document();
                // There are many Field types; learn how they differ, e.g. TextField and StringField
                doc.add(new TextField("content", text, Field.Store.YES));
                iwriter.addDocument(doc);
            }

        } catch (IOException e)
        {
            e.printStackTrace();
        } finally
        {
            // Closing the writer commits the added documents
            try
            {
                if (iwriter != null)
                    iwriter.close();
            } catch (IOException e)
            {
                e.printStackTrace();
            }
        }

    }

    public IndexSearcher getSearcher()
    {
        try
        {
            if (ireader == null)
            {
                ireader = DirectoryReader.open(directory);
            } else
            {
                // openIfChanged returns null when the index has not changed
                DirectoryReader tr = DirectoryReader.openIfChanged(ireader);
                if (tr != null)
                {
                    ireader.close();
                    ireader = tr;
                }
            }
            return new IndexSearcher(ireader);
        } catch (CorruptIndexException e)
        {
            e.printStackTrace();
        } catch (IOException e)
        {
            e.printStackTrace();
        }
        return null;
    }

    /**
     * Search with a TermQuery.
     *
     * @param field
     *            name of the field to search
     * @param term
     *            exact term to look for
     * @param num
     *            search top N hits
     */
    public void searchByTerm(String field, String term, int num)
    {
        IndexSearcher isearcher = getSearcher();
        // Note the difference between using a Query subclass directly and using QueryParser
        TermQuery query = new TermQuery(new Term(field, term));
        ScoreDoc[] hits;
        try
        {
            // Note the several search methods of IndexSearcher
            hits = isearcher.search(query, null, num).scoreDocs;
            // Iterate through the results:
            for (int i = 0; i < hits.length; i++)
            {
                Document hitDoc = isearcher.doc(hits[i].doc);
                System.out.println("The text to be indexed=" + hitDoc.get("content"));
            }
        } catch (IOException e)
        {
            e.printStackTrace();
        }
    }

    /**
     * Unlike the previous method, this one takes a Query that the caller
     * has built with QueryParser's parse method.
     *
     * @param query
     * @param num
     */
    public void searchByQueryParse(Query query, int num)
    {
        try
        {
            IndexSearcher searcher = getSearcher();
            TopDocs tds = searcher.search(query, num);
            System.out.println("Total hits: " + tds.totalHits);
            for (ScoreDoc sd : tds.scoreDocs)
            {
                Document doc = searcher.doc(sd.doc);
                System.out.println("This is the text to be indexed=" + doc.get("content"));
            }
        } catch (CorruptIndexException e)
        {
            e.printStackTrace();
        } catch (IOException e)
        {
            e.printStackTrace();
        }
    }

    /**
     * The first paging method: page within the ScoreDoc array
     *
     * @param query
     * @param pageIndex
     *            starts from 1, i.e. the first page
     * @param pageSize
     *            page size
     * @param num
     *            search top N hits
     */
    public void searchForPage(Query query, int pageIndex, int pageSize, int num)
    {
        try
        {
            IndexSearcher searcher = getSearcher();
            TopDocs tds = searcher.search(query, num);
            System.out.println("Total hits: " + tds.totalHits);
            // Page within the ScoreDoc array
            ScoreDoc[] scoreDocs = tds.scoreDocs;
            int start = (pageIndex - 1) * pageSize;
            // Clamp the end so the last page cannot run past the array
            int end = Math.min(pageIndex * pageSize, scoreDocs.length);
            for (int i = start; i < end; i++)
            {
                Document doc = searcher.doc(scoreDocs[i].doc);
                System.out.println("This is the text to be indexed=" + doc.get("content"));
            }
        } catch (CorruptIndexException e)
        {
            e.printStackTrace();
        } catch (IOException e)
        {
            e.printStackTrace();
        }
    }

    /**
     * Use searchAfter to page at query time
     *
     * @param query
     * @param pageIndex
     * @param pageSize
     * @throws IOException
     */
    public void searchForPageByAfter(Query query, int pageIndex, int pageSize) throws IOException
    {
        IndexSearcher searcher = getSearcher();
        // Get the last element of the previous page first
        ScoreDoc lastSd = getLastScoreDoc(pageIndex, pageSize, query, searcher);
        TopDocs tds = searcher.searchAfter(lastSd, query, pageSize);
        for (ScoreDoc sd : tds.scoreDocs)
        {
            Document doc = searcher.doc(sd.doc);
            System.out.println("This is the text to be indexed=" + doc.get("content"));
        }

    }

    /**
     * Returns the last hit of the page before the requested one
     *
     * @param pageIndex
     * @param pageSize
     * @param query
     * @param searcher
     * @return the last ScoreDoc of the previous page, or null for the first page
     * @throws IOException
     */
    private ScoreDoc getLastScoreDoc(int pageIndex, int pageSize, Query query, IndexSearcher searcher) throws IOException
    {
        if (pageIndex == 1)
            return null; // the first page has no previous hit
        int num = pageSize * (pageIndex - 1); // number of hits on the previous pages
        TopDocs tds = searcher.search(query, num);
        // Fewer hits than num means the requested page lies beyond the result set
        if (tds.scoreDocs.length < num)
            throw new IllegalArgumentException("pageIndex is beyond the result set");
        return tds.scoreDocs[num - 1];
    }

    public static void main(String[] args) throws ParseException, IOException
    {
        LuceneDemo03 ld = new LuceneDemo03();
        ld.createIndex();
        ld.searchByTerm("content", "play", 500);
        System.out.println("==============1======================");

        QueryParser parser = new QueryParser(version, "content", new StandardAnalyzer(version));
        Query q = parser.parse("play"); // study the QueryParser query syntax
        ld.searchByQueryParse(q, 500);
        System.out.println("===============2=====================");

        ld.searchForPage(q, 1, 2, 500); // starting from the first page
        System.out.println("================3====================");

        ld.searchForPageByAfter(q, 1, 2); // starting from the first page
        System.out.println("================4====================");
    }

}
