Full-text search (2): adding, deleting, updating, and querying indexes with Lucene 4.10


Today we use Lucene to build a simple web application, reusing a test class written earlier. First, a quick introduction to the most common Lucene packages.

Structure of the Lucene packages: for external applications, index and search are the main entry points.

org.apache.lucene.search — search entry point
org.apache.lucene.index — index entry point
org.apache.lucene.analysis — language analyzers
org.apache.lucene.queryparser — query parser
org.apache.lucene.document — storage structure
org.apache.lucene.store — low-level IO / storage
org.apache.lucene.util — common data structures
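
Below is a minimal sketch of how these packages fit together in one index-then-search round trip. It is an illustration only: the "/tmp/demo-index" path is hypothetical, and it uses the core StandardAnalyzer (lucene-analyzers-common jar) instead of IKAnalyzer so the snippet is self-contained.

import java.io.File;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.TextField;
import org.apache.lucene.document.Field.Store;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

public class MinimalLuceneFlow {
    public static void main(String[] args) throws Exception {
        Directory dir = FSDirectory.open(new File("/tmp/demo-index")); // org.apache.lucene.store
        StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_4_10_1); // analysis

        // org.apache.lucene.index: write one document
        IndexWriter writer = new IndexWriter(dir,
                new IndexWriterConfig(Version.LUCENE_4_10_1, analyzer));
        Document doc = new Document(); // org.apache.lucene.document
        doc.add(new TextField("content", "hello lucene", Store.YES));
        writer.addDocument(doc);
        writer.close();

        // org.apache.lucene.search + queryparser: read it back
        IndexReader reader = DirectoryReader.open(dir);
        IndexSearcher searcher = new IndexSearcher(reader);
        Query q = new QueryParser(Version.LUCENE_4_10_1, "content", analyzer).parse("lucene");
        TopDocs hits = searcher.search(q, 10);
        System.out.println("hits: " + hits.totalHits);
        reader.close();
    }
}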


Without further ado, here is the code (a test class I encapsulated earlier; the packaging is fairly complete, and you can build on it if you are interested):

package com.lucene.util;

import java.io.File;
import java.io.IOException;
import java.io.StringReader;
import java.util.List;

import org.apache.log4j.Logger;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.Field.Store;
import org.apache.lucene.document.StringField;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.IndexWriterConfig.OpenMode;
import org.apache.lucene.index.LogDocMergePolicy;
import org.apache.lucene.index.LogMergePolicy;
import org.apache.lucene.index.Term;
import org.apache.lucene.queryparser.classic.MultiFieldQueryParser;
import org.apache.lucene.queryparser.classic.ParseException;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.MatchAllDocsQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.search.highlight.Highlighter;
import org.apache.lucene.search.highlight.InvalidTokenOffsetsException;
import org.apache.lucene.search.highlight.QueryScorer;
import org.apache.lucene.search.highlight.SimpleFragmenter;
import org.apache.lucene.search.highlight.SimpleHTMLFormatter;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;
import org.wltea.analyzer.lucene.IKAnalyzer;

import com.message.base.search.SearchBean;

/**
 * lucene 4.10.1
 *
 * @createTime 2014-10-28
 * @author Hu huichao
 */
public class HhcIndexTools {

    private final static Logger logger = Logger.getLogger(HhcIndexTools.class);
    private static String indexPath = "E://lucene//index";

    public static void main(String[] args) {
        try {
            // createIndex();
            // searchIndex("");
            // query();
            // deleteIndex(null);
            forceDeleteIndex();
            query();
            highlighterSearch();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    /**
     * Create the index.
     */
    public static void createIndex() {
        // false = finest-grained segmentation, true = smart segmentation
        Analyzer analyzer = new IKAnalyzer(false);
        Document doc = null;
        IndexWriter indexWriter = null;
        try {
            indexWriter = getIndexWriter(analyzer);
            // Add documents to the index
            doc = new Document();
            doc.add(new StringField("id", "1", Store.YES));
            doc.add(new TextField("title", "title: Start", Store.YES));
            doc.add(new TextField("content", "content: I am a programmer now", Store.YES));
            indexWriter.addDocument(doc);

            doc = new Document();
            doc.add(new StringField("id", "2", Store.YES));
            doc.add(new TextField("title", "title: End", Store.YES));
            doc.add(new TextField("content", "content: I am now an expert lucene development engineer", Store.YES));
            indexWriter.addDocument(doc);

            indexWriter.commit();
        } catch (IOException e) {
            e.printStackTrace();
            logger.info("indexer threw an exception");
        } finally {
            try {
                destroyWriter(indexWriter);
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }

    /**
     * Search documents.
     *
     * @param keyword
     */
    @SuppressWarnings("deprecation")
    public static void searchIndex(String keyword) {
        IndexReader indexReader = null;
        IndexSearcher indexSearcher = null;
        try {
            // 1. open the index directory on disk
            Directory dir = FSDirectory.open(new File(indexPath));
            // 2. create the IndexReader (DirectoryReader.open replaces IndexReader.open in 4.x)
            indexReader = DirectoryReader.open(dir);
            // 3. instantiate the searcher
            indexSearcher = new IndexSearcher(indexReader);
            // 4. use QueryParser to construct the Query object
            QueryParser parse = new QueryParser(Version.LUCENE_4_10_1, "content", new IKAnalyzer(false));
            // search for documents containing the keyword
            Query query = parse.parse(keyword.trim());
            // To query several fields at once, use MultiFieldQueryParser; its
            // advantage is that each field can be boosted individually.
            // Here the search covers these four fields:
            String[] fields = { "phoneType", "name", "category", "price" };
            Query querys = new MultiFieldQueryParser(Version.LATEST, fields, new IKAnalyzer(false))
                    .parse(keyword.trim());
            TopDocs results = indexSearcher.search(query, 1000);
            // 5. get the ScoreDoc objects from TopDocs
            ScoreDoc[] score = results.scoreDocs;
            if (score.length > 0) {
                logger.info("Number of query results: " + score.length);
                System.out.println("Number of query results: " + score.length);
                for (int i = 0; i < score.length; i++) {
                    // 6. get the Document from the searcher and the ScoreDoc
                    Document doc = indexSearcher.doc(score[i].doc);
                    // 7. read the required values from the Document
                    System.out.println(doc.toString());
                    System.out.println(doc.get("title") + " [" + doc.get("content") + "]");
                }
            } else {
            }
        } catch (Exception e) {
            logger.info("the query result is empty!");
        } finally {
            if (indexReader != null) {
                try {
                    indexReader.close();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
        }
    }

    /**
     * Display the first n search results by page.
     *
     * @param keyWord     query keyword
     * @param pageSize    number of records per page
     * @param currentPage current page
     * @throws ParseException
     */
    @SuppressWarnings("deprecation")
    public void paginationQuery(String keyWord, int pageSize, int currentPage)
            throws IOException, ParseException {
        String[] fields = { "title", "content" };
        QueryParser queryParser = new MultiFieldQueryParser(Version.LATEST, fields, new IKAnalyzer());
        Query query = queryParser.parse(keyWord.trim());
        IndexReader indexReader = DirectoryReader.open(FSDirectory.open(new File(indexPath)));
        IndexSearcher indexSearcher = new IndexSearcher(indexReader);
        // TopDocs holds the search results; only the first 100 are returned
        TopDocs topDocs = indexSearcher.search(query, 100);
        TopDocs all = indexSearcher.search(new MatchAllDocsQuery(), 100);
        // int totalCount = topDocs.totalHits; // total number of hits
        ScoreDoc[] scoreDocs = topDocs.scoreDocs; // result set returned by the search
        // position of the first record on the page
        int begin = pageSize * (currentPage - 1);
        // position of the last record on the page
        int end = Math.min(begin + pageSize, scoreDocs.length);
        // paginate
        for (int i = begin; i < end; i++) {
            int docID = scoreDocs[i].doc;
            System.out.println("docID = " + docID);
            Document doc = indexSearcher.doc(docID);
            String title = doc.get("title");
            System.out.println("title is: " + title);
        }
        indexReader.close();
    }

    @SuppressWarnings("deprecation")
    public static void highlighterSearch()
            throws IOException, ParseException, InvalidTokenOffsetsException {
        IndexReader reader = DirectoryReader.open(FSDirectory.open(new File(indexPath)));
        IndexSearcher searcher = new IndexSearcher(reader);
        // String[] fields = { "title", "content" };
        // QueryParser parser = new MultiFieldQueryParser(Version.LATEST, fields, new IKAnalyzer());
        // Query query = parser.parse("");
        Term term = new Term("content", "lucene");
        TermQuery query = new TermQuery(term);
        TopDocs topdocs = searcher.search(query, Integer.MAX_VALUE);
        ScoreDoc[] scoreDoc = topdocs.scoreDocs;
        System.out.println("Total number of query results: " + topdocs.totalHits);
        System.out.println("Maximum score: " + topdocs.getMaxScore());
        for (int i = 0; i < scoreDoc.length; i++) {
            int docid = scoreDoc[i].doc;
            Document document = searcher.doc(docid);
            System.out.println("============= document [" + (i + 1) + "] =============");
            System.out.println("Search keyword: " + term.toString());
            String content = document.get("content");
            // highlight the keyword with SimpleHTMLFormatter
            SimpleHTMLFormatter formatter = new SimpleHTMLFormatter("<font color='red'>", "</font>");
            Highlighter highlighter = new Highlighter(formatter, new QueryScorer(query));
            highlighter.setTextFragmenter(new SimpleFragmenter(content.length()));
            if (!"".equals(content)) {
                TokenStream tokenstream = new IKAnalyzer().tokenStream("content", new StringReader(content));
                String highLightText = highlighter.getBestFragment(tokenstream, content);
                System.out.println("Highlighted result " + (i + 1) + ":");
                System.out.println(highLightText);
                /* end of keyword highlighting */
                System.out.println("Document content: " + content);
                System.out.println("Relevance score: " + scoreDoc[i].score);
            }
        }
    }

    /**
     * Get the IndexWriter.
     *
     * @param analyzer
     * @return
     * @throws IOException
     */
    private static IndexWriter getIndexWriter(Analyzer analyzer) throws IOException {
        File indexFile = new File(indexPath);
        if (!indexFile.exists())
            indexFile.mkdir(); // create the index directory if it does not exist
        Directory directory = FSDirectory.open(indexFile);
        // Directory directory = new RAMDirectory(); // create the index in memory
        IndexWriterConfig conf = new IndexWriterConfig(Version.LUCENE_4_10_1, analyzer);
        // basic index configuration
        LogMergePolicy mergePolicy = new LogDocMergePolicy();
        // Merge frequency when documents are added to a segment:
        // a small value slows indexing, a large value speeds it up;
        // > 10 is suitable for batch index creation.
        mergePolicy.setMergeFactor(30);
        // Maximum number of documents merged into one segment:
        // a small value helps appending to an index; a large value suits
        // batch indexing and faster searching.
        mergePolicy.setMaxMergeDocs(5000);
        conf.setMaxBufferedDocs(10000);
        conf.setMergePolicy(mergePolicy);
        conf.setRAMBufferSizeMB(64);
        conf.setOpenMode(OpenMode.CREATE_OR_APPEND);
        if (IndexWriter.isLocked(directory)) {
            IndexWriter.unlock(directory);
        }
        IndexWriter indexWriter = new IndexWriter(directory, conf);
        return indexWriter;
    }

    /**
     * Destroy the writer.
     *
     * @param indexWriter
     * @throws IOException
     */
    private static void destroyWriter(IndexWriter indexWriter) throws IOException {
        if (indexWriter != null) {
            indexWriter.close();
        }
    }

    /**
     * Batch delete.
     *
     * @param list
     * @throws IOException
     */
    public static void deleteIndexs(List<SearchBean> list) throws IOException {
        if (list == null || list.isEmpty()) {
            logger.debug("beans is null");
            return;
        }
        for (SearchBean bean : list) {
            deleteIndex(bean);
        }
    }

    /**
     * Delete a single index entry -- the document is not removed immediately;
     * a .del file is generated instead.
     *
     * @param bean
     * @throws IOException
     */
    private static void deleteIndex(SearchBean bean) throws IOException {
        // if (bean == null) {
        //     logger.debug("search bean is empty!");
        //     return;
        // }
        IndexWriter indexWriter = getIndexWriter(new IKAnalyzer());
        // The parameter is either a Query or a Term; a Term matches an exact value.
        // The document with id = 1 is deleted here; it stays in the "recycle bin"
        // (an xxx.del file) until the deletes are merged away.
        indexWriter.deleteDocuments(new Term("id", "1"));
        destroyWriter(indexWriter);
    }

    /**
     * Query index statistics.
     */
    public static void query() {
        try {
            IndexReader indexReader = DirectoryReader.open(FSDirectory.open(new File(indexPath)));
            System.out.println("Number of stored documents: " + indexReader.numDocs());
            System.out.println("Total storage capacity: " + indexReader.maxDoc());
            System.out.println("Deleted documents: " + indexReader.numDeletedDocs());
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    /**
     * Roll back the "recycle bin": discards uncommitted deletes.
     *
     * @throws IOException
     */
    public void recoveryIndexByIsDelete() throws IOException {
        IndexWriter indexWriter = getIndexWriter(new IKAnalyzer());
        indexWriter.rollback();
        destroyWriter(indexWriter);
    }

    /**
     * Empty the recycle bin. After version 3.6 there is no unDeleteAll() method;
     * forceMergeDeletes() physically removes the deleted documents.
     *
     * @throws IOException
     */
    public static void forceDeleteIndex() throws IOException {
        IndexWriter indexWriter = getIndexWriter(new IKAnalyzer());
        indexWriter.forceMergeDeletes();
        destroyWriter(indexWriter);
    }

    /**
     * Update the index.
     *
     * @throws IOException
     */
    @SuppressWarnings("deprecation")
    public void update() throws IOException {
        IndexWriter indexWriter = new IndexWriter(FSDirectory.open(new File(indexPath)),
                new IndexWriterConfig(Version.LATEST, new IKAnalyzer(true)));
        Document document = new Document();
        document.add(new Field("id", "10", Field.Store.YES, Field.Index.NOT_ANALYZED_NO_NORMS));
        document.add(new Field("email", "9481629991", Field.Store.YES, Field.Index.NOT_ANALYZED));
        document.add(new Field("name", "Xiaomi", Field.Store.YES, Field.Index.NOT_ANALYZED_NO_NORMS));
        document.add(new Field("content", "Xiaomi Hao", Field.Store.NO, Field.Index.ANALYZED));
        // As the method name suggests, updateDocument actually deletes the old
        // documents matching the term and then adds the new document.
        indexWriter.updateDocument(new Term("id", "1"), document);
        indexWriter.close();
    }
}




For beginners, the code above is a reference for adding, deleting, updating, and querying indexes; note that some interfaces from versions earlier than 3.0 have been removed. The core IndexWriter methods are listed below, followed by a short sketch that puts them together.

addDocument(Document doc)
deleteDocuments(Query query)
updateDocument(Term term, Document doc)
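
A minimal sketch of the three calls together. The "/tmp/demo-index" path and the "id"/"title" fields are illustrative only, and the core StandardAnalyzer (lucene-analyzers-common jar) stands in for IKAnalyzer to keep the snippet self-contained:

import java.io.File;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field.Store;
import org.apache.lucene.document.StringField;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

public class CrudSketch {
    public static void main(String[] args) throws Exception {
        IndexWriter writer = new IndexWriter(
                FSDirectory.open(new File("/tmp/demo-index")),
                new IndexWriterConfig(Version.LUCENE_4_10_1,
                        new StandardAnalyzer(Version.LUCENE_4_10_1)));

        // addDocument: insert a new document
        Document doc = new Document();
        doc.add(new StringField("id", "1", Store.YES));
        doc.add(new TextField("title", "hello", Store.YES));
        writer.addDocument(doc);

        // updateDocument: delete every document matching the term, then add the new one
        Document newDoc = new Document();
        newDoc.add(new StringField("id", "1", Store.YES));
        newDoc.add(new TextField("title", "hello again", Store.YES));
        writer.updateDocument(new Term("id", "1"), newDoc);

        // deleteDocuments: mark everything matching the query as deleted (.del file)
        writer.deleteDocuments(new TermQuery(new Term("id", "1")));

        writer.commit();
        writer.close();
    }
}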
 
The corresponding SQL insert, delete, update, and select operations:

INSERT INTO table_name (column1, column2) VALUES (value1, value2)

DELETE FROM table_name WHERE columnN = conditionN

UPDATE table_name SET column1 = value1 WHERE columnN = conditionN

SELECT column1, column2 FROM table_name WHERE columnN = conditionN
